Bangla User Adaptive Word Speech Recognition: Approaches and Comparisons

Adnan Firoze; Shamsul Arifin; Rashedur M. Rahman

Abstract

The paper presents Bangla word speech recognition using two novel approaches, together with a comprehensive analysis. The first approach is based on spectral analysis and fuzzy logic; the second uses Mel-Frequency Cepstral Coefficient (MFCC) analysis and feed-forward back-propagation neural networks. Because human speech is imprecise and ambiguous, fuzzy logic, which is itself grounded in linguistic ambiguity, could serve as a precise tool for analyzing and recognizing it. The authors' systems revolve around visual representations of voiced signals: the Fourier energy spectrum and the MFCC. In essence, both are matrices that capture properties of a sound by storing energy and frequency in discrete time. The systems' decision making is based on fuzzy logic and neural networks, respectively. Experimental results show that the fuzzy logic based system is 86% accurate and the Artificial Neural Network (ANN) based system is 90% accurate, whereas a commercial Hidden Markov Model (HMM) based speech recognizer shows 73% accuracy on average. Moreover, the authors' research finds that, even though the ANN gives better overall recognition accuracy than the fuzzy logic based system, the fuzzy logic based system is more accurate on "more difficult" or "polysyllabic" words. In terms of runtime performance, the fuzzy logic based system outperforms the ANN based Bangla speech recognition system.
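
To make the idea of an energy spectrum stored as a matrix more concrete, the following is a minimal sketch of a short-time Fourier energy spectrum in plain Python. It is not the authors' implementation: the frame length, hop size, Hann window, and the test tone are all assumptions chosen for illustration, and the function name `energy_spectrum` is hypothetical.

```python
import cmath
import math


def energy_spectrum(signal, frame_len=64, hop=32):
    """Return a matrix (list of rows): one row per time frame,
    one column per frequency bin, each cell holding |X[k]|^2."""
    # Hann window tapers each frame to reduce spectral leakage (assumed choice).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep non-negative frequencies only
    matrix = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT of the windowed frame at bin k.
            x_k = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(x_k) ** 2)
        matrix.append(row)
    return matrix


# Hypothetical test tone: a 1 kHz sine sampled at 8 kHz (not data from the paper).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
spec = energy_spectrum(tone)

# Locate the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

Each row of `spec` describes one short slice of time and each column one frequency band, which is the "energy and frequency in discrete time" layout the abstract refers to. A practical system would use an FFT rather than this naive DFT, which is kept here only for readability.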

Bibliographic details

Source: International Journal of Fuzzy System Applications | 2013, No. 3 | 36 pages
Authors: Adnan Firoze; Shamsul Arifin; Rashedur M. Rahman
Original format: PDF
Language: English
Chinese Library Classification: Automatic control; automatic control systems
Keywords: Artificial Neural Networks (ANN); Backpropagation; Cepstrum; Fuzzy Logic; Melody (MEL) Scale; Mel-Frequency Cepstral Coefficients (MFCC); Segmentation; Spectrogram; Speech Recognition; Short-Time Fourier Transform (STFT)

Similar literature

1. Bangla User Adaptive Word Speech Recognition: Approaches and Comparisons [J]. Adnan Firoze, Shamsul Arifin, Rashedur M. Rahman. International Journal of Fuzzy System Applications. 2013, No. 3
2. A comparison of two word-recognition tasks in multitalker babble: Speech Recognition in Noise Test (SPRINT) and Words-in-Noise Test (WIN) [J]. Wilson RH, Cates WB. Journal of the American Academy of Audiology. 2008, No. 7
3. Prediction of Speech Recognition in Cochlear Implant Users by Adapting Auditory Models to Psychophysical Data [J]. Svante Stadler, Arne Leijon. EURASIP Journal on Advances in Signal Processing. 2009, No. 14
4. A new word separation algorithm for continuous Bangla Speech Recognition [C]. Chowdhury N., Sattar M.A. Computers and Information Technology, 2009. ICCIT '09. 2009
5. Language competence and information processing strategy: A comparison of first and second language word recognition in connected speech [D]. Hayashi, Takuo. 1987
6. Comparison of Two Music Training Approaches on Music and Speech Perception in Cochlear Implant Users [O]. Christina D. Fuller, John J. Galvin III, Bert Maat, 2018
7. Utilizing hearing assistive technology (HAT) to assess speech recognition: Comparison of word recognition scores obtained by hearing instrument users [O]. Schutzenhofer Stephanie. 2009
8. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment [R]. Hansen, J. H. 2015
