International Journal of Fuzzy System Applications

Bangla User Adaptive Word Speech Recognition: Approaches and Comparisons

Abstract

This paper presents two novel approaches to Bangla word speech recognition, together with a comprehensive analysis. The first approach is based on spectral analysis and fuzzy logic; the second uses Mel-Frequency Cepstral Coefficient (MFCC) analysis and feed-forward back-propagation neural networks. Because human speech is imprecise and ambiguous, fuzzy logic, which is itself founded on linguistic ambiguity, could serve as a precise tool for analyzing and recognizing it. The authors' systems revolve around two visual representations of voiced signals: the Fourier energy spectrum and the MFCC. In essence, both are matrices that capture the properties of a sound by storing energy and frequency over discrete time. Decision making in the two systems is based on fuzzy logic and neural networks, respectively. Experimental results show that the fuzzy logic based system is 86% accurate and the Artificial Neural Network (ANN) based system is 90% accurate, compared with an average of 73% for a commercial Hidden Markov Model (HMM) based speech recognizer. Moreover, the authors' research finds that, although the ANN gives better overall recognition accuracy, the fuzzy logic based system is more accurate on "more difficult" or "polysyllabic" words. In terms of runtime performance, the fuzzy logic based system outperforms the ANN based Bangla speech recognition system.
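
To make the "energy and frequency in discrete time" matrix concrete, here is a minimal sketch of a short-time Fourier energy spectrum in plain Python. It is not the authors' implementation: the abstract gives no frame length, window, or sample rate, so `frame_len`, `hop`, the Hann window, the 8 kHz rate, and the synthetic tone are all assumptions chosen for illustration.

```python
import cmath
import math


def energy_spectrogram(signal, frame_len=64, hop=32):
    """Return a matrix whose rows are time frames and whose columns are
    frequency bins; each entry is the energy |X[k]|^2 of a Hann-windowed frame.
    frame_len and hop are illustrative values, not taken from the paper."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep non-negative frequencies only
    matrix = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT for readability; a practical system would use an FFT.
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            row.append(abs(coeff) ** 2)
        matrix.append(row)
    return matrix


# A synthetic 1 kHz tone stands in for a recorded word (assumption).
sample_rate = 8000
tone_hz = 1000
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(256)]

spec = energy_spectrogram(signal)
# Locate the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

Each row of `spec` describes one short slice of time and each column one frequency band, which is the kind of matrix a fuzzy rule base or a neural network could then take as input. An MFCC matrix has the same time-by-feature layout but adds mel-scale filtering, a logarithm, and a cosine transform on top of this spectrum.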
