
Isolated word recognition from in-ear microphone data using Hidden Markov Models (HMM)


Abstract

This thesis is part of a larger ongoing research study, started in 2004 at the Naval Postgraduate School (NPS), that aims to develop a speech-driven human-machine interface for operating semi-autonomous military robots in noisy operational environments. Earlier work collected a small database of isolated utterances of seven words from 20 adult subjects using an in-ear microphone. The research conducted here develops a speaker-independent isolated word recognizer for these acoustic signals, based on a discrete-observation Hidden Markov Model (HMM). The recognizer is implemented in three steps. The first step performs endpoint detection and speech segmentation using short-term temporal analysis. The second step extracts speech features using static and dynamic MFCC parameters and applies vector quantization to the continuous-valued features. The last step applies the discrete-observation HMM-based classifier for isolated word recognition. Experimental results show an average classification performance of around 92.77%. The most significant result of this study is that acoustic signals originating from the speech organs and collected within the external ear canal via the in-ear microphone can be used for isolated word recognition. A second dataset, collected under low signal-to-noise ratio conditions with additive noise, yields 79% recognition accuracy with the HMM-based classifier. We also compared the classification results for data collected within the ear canal and outside the mouth via the same microphone. Average classification rates obtained for the data collected outside the mouth show significant performance degradation (down to 63%) compared with the data collected from within the ear canal (down to 86%). The ear canal dampens high frequencies. As a result, the HMM derived from data with dampened higher frequencies does not accurately fit the data collected outside the mouth, resulting in degraded recognition performance.
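The three steps described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the energy-threshold endpoint rule, the function names, the toy codebook, and the two-state models labeled `word_a` and `word_b` are all assumptions made for illustration, and MFCC extraction is omitted (the feature vectors are taken as given).

```python
import math

def short_term_energy(signal, frame_len, hop):
    """Sum of squared samples in each analysis frame."""
    return [sum(x * x for x in signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, hop)]

def detect_endpoints(signal, frame_len=4, hop=2, ratio=0.1):
    """Return (start, end) sample indices spanning all frames whose energy
    exceeds ratio * max energy, or None for a silent signal.
    The relative threshold is an assumed rule, not the thesis's."""
    energy = short_term_energy(signal, frame_len, hop)
    if not energy or max(energy) == 0:
        return None
    threshold = ratio * max(energy)
    active = [k for k, e in enumerate(energy) if e > threshold]
    return active[0] * hop, active[-1] * hop + frame_len

def quantize(vectors, codebook):
    """Vector quantization: index of the nearest codeword per vector."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return [min(range(len(codebook)), key=lambda k: dist2(vec, codebook[k]))
            for vec in vectors]

def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM
    with initial probabilities pi, transitions A, and emissions B."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_prob += math.log(scale)
    return log_prob

def classify(obs, models):
    """Pick the word whose HMM gives the observation the highest likelihood."""
    return max(models, key=lambda word: log_likelihood(obs, *models[word]))

# Hypothetical left-to-right models: (pi, A, B) per word.
A = [[0.7, 0.3], [0.0, 1.0]]
models = {
    "word_a": ([1.0, 0.0], A, [[0.9, 0.1], [0.1, 0.9]]),
    "word_b": ([1.0, 0.0], A, [[0.1, 0.9], [0.9, 0.1]]),
}
codebook = [[0.0, 0.0], [1.0, 1.0]]  # toy 2-entry codebook
features = [[0.1, -0.1], [0.2, 0.0], [0.9, 1.2], [1.1, 0.8]]
symbols = quantize(features, codebook)
print(classify(symbols, models))
```

In a real recognizer the codebook would be trained on the MFCC vectors (for example with k-means) and each word model would be estimated with Baum-Welch re-estimation; both are left out here to keep the sketch short.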

Bibliographic details

  • Author

    Kurcan Remzi Serdar

  • Year: 2006
  • Original format: PDF
