Conference: Annual Conference of the International Speech Communication Association (INTERSPEECH 2010)

Recognizing Cochlear Implant-like Spectrally Reduced Speech with HMM-based ASR: Experiments with MFCCs and PLP Coefficients

Abstract

In this paper, we investigate the recognition of cochlear implant-like spectrally reduced speech (SRS) using conventional speech features (MFCCs and PLP coefficients) and HMM-based ASR. The SRS used for testing was synthesized from subband temporal envelopes extracted from original clean speech, whereas the acoustic models were trained on a different set of original clean speech signals from the same speech database. The results show that changing the bandwidth of the subband temporal envelopes had no significant effect on ASR word accuracy. In addition, increasing the number of frequency subbands of the SRS from 4 to 16 significantly improved system performance. Furthermore, the ASR word accuracy attained with the original clean speech, with both MFCC-based and PLP-based speech features, can also be achieved with the 16-, 24-, or 32-subband SRS. The experiments were carried out using the TI-digits speech database and the HTK speech recognition toolkit.
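The abstract does not say how the SRS was synthesized beyond "subband temporal envelopes", so the following is only a minimal sketch of one common approach, a tone vocoder, and not the authors' implementation. The function names (`band_edges`, `fft_filter`, `synthesize_srs`), the log-spaced band layout, the 100 to 3800 Hz analysis range, the brick-wall FFT filters, and the sinusoidal carriers are all assumptions made for illustration. The two parameters `n_bands` and `env_bw` correspond to the two factors the abstract varies: the number of frequency subbands and the bandwidth of the subband temporal envelopes.

```python
import numpy as np


def band_edges(n_bands, f_lo=100.0, f_hi=3800.0):
    """Log-spaced band edges in Hz (assumed layout; not given in the abstract)."""
    return np.geomspace(f_lo, f_hi, n_bands + 1)


def fft_filter(x, fs, f_min, f_max):
    """Zero-phase brick-wall filter that keeps f_min <= f <= f_max (assumed design)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < f_min) | (freqs > f_max)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


def synthesize_srs(x, fs, n_bands=16, env_bw=50.0):
    """Hypothetical tone-vocoder sketch of spectrally reduced speech.

    x       : 1-D clean speech signal
    fs      : sampling rate in Hz (should exceed twice the top band edge)
    n_bands : number of frequency subbands
    env_bw  : bandwidth in Hz of each subband temporal envelope
    """
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x)) / fs
    edges = band_edges(n_bands)
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        # 1. Isolate one analysis subband.
        band = fft_filter(x, fs, lo, hi)
        # 2. Temporal envelope: full-wave rectification, then low-pass at env_bw.
        env = np.maximum(fft_filter(np.abs(band), fs, 0.0, env_bw), 0.0)
        # 3. Modulate a sinusoid at the band's geometric centre frequency.
        fc = np.sqrt(lo * hi)
        out += env * np.sin(2.0 * np.pi * fc * t)
    return out
```

Under these assumptions, calling `synthesize_srs(x, fs, n_bands=4)` and `synthesize_srs(x, fs, n_bands=16)` on the same utterance, or holding `n_bands` fixed while changing `env_bw`, would produce test signals of the kind the abstract compares.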