Published in: Proceedings of the speech recognition workshop

Acoustic Modeling for the SRI Hub4 Partitioned Evaluation Continuous Speech Recognition System

Abstract

We describe the development of the SRI system evaluated in the 1996 DARPA continuous speech recognition (CSR) Hub4 partitioned evaluation (PE). The task for the Hub4 evaluation was to recognize speech from broadcast television and radio shows. Recognizing such speech by machine poses many challenges. First, the segments to be recognized can be very long, which causes problems in both training and recognition because of the consequent increase in system memory requirements. A simple segmentation technique is used to break long segments into shorter, more manageable lengths. Second, speech from broadcast news sources exhibits a variety of difficult acoustic conditions, such as spontaneous speech, band-limited speech, and speech in the presence of noise, music, or background speakers. Such conditions lead to significant degradation in performance. We describe techniques, based on acoustic adaptation, that adapt the recognition models to the different acoustic background conditions so as to improve recognition performance. We also present a novel algorithm that clusters the test data segments so that the resulting clusters are homogeneous with respect to speaker. This is followed by acoustic adaptation to the individual clusters, resulting in a significant performance improvement. Finally, we briefly describe our studies in language modeling for the Hub4 evaluation, which are detailed further in another paper in these proceedings.