Acoustic Modeling for the SRI Hub4 Partitioned Evaluation Continuous Speech Recognition System


Abstract

We describe the development of the SRI system evaluated in the 1996 DARPA continuous speech recognition (CSR) Hub4 partitioned evaluation (PE). The task for the Hub4 evaluation was to recognize speech from broadcast television and radio shows. Machine recognition of such speech poses many challenges. First, the segments to be recognized can be very long. This introduces a problem in training and recognition because of the consequent increase in system memory requirements. A simple segmentation technique is used to break long segments into shorter, more manageable lengths. Second, speech from broadcast news sources exhibits a variety of difficult acoustic conditions, such as spontaneous speech, band-limited speech, and speech in the presence of noise, music, or background speakers. Such background conditions lead to significant degradation in performance. We describe techniques, based on acoustic adaptation, that adapt recognition models to the different acoustic background conditions so as to improve recognition performance. We also present a novel algorithm that clusters the test data segments so that the resulting clusters are homogeneous with respect to speakers. This is followed by acoustic adaptation to the individual clusters, resulting in a significant performance improvement. Finally, we briefly describe our studies in language modeling for the Hub4 evaluation, which are detailed further in another paper in these proceedings.
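
The abstract does not say how long segments are broken up, so the following is only a generic illustration of the idea and not the SRI method. It assumes per-frame energy values are available and that a low-energy frame marks a plausible pause; the function name `split_long_segment` and the parameters `max_frames` and `search_window` are hypothetical.

```python
from typing import List, Tuple


def split_long_segment(
    energies: List[float], max_frames: int, search_window: int
) -> List[Tuple[int, int]]:
    """Split frames [0, len(energies)) into pieces of at most max_frames.

    Illustrative assumption (not taken from the paper): each cut is placed
    at the lowest-energy frame among the last `search_window` candidate
    positions before the length limit, treating low energy as a pause.
    Returns half-open (start, end) frame index pairs.
    """
    if max_frames < 1 or not 1 <= search_window <= max_frames:
        raise ValueError("need max_frames >= 1 and 1 <= search_window <= max_frames")

    pieces: List[Tuple[int, int]] = []
    n = len(energies)
    start = 0
    while n - start > max_frames:
        latest = start + max_frames          # last allowed cut position
        earliest = latest - search_window + 1  # first candidate cut position
        # Pick the candidate whose frame has the lowest energy;
        # ties resolve to the earliest candidate.
        cut = min(range(earliest, latest + 1), key=lambda i: energies[i])
        pieces.append((start, cut))
        start = cut
    if start < n:
        pieces.append((start, n))  # remainder already fits within the limit
    return pieces


if __name__ == "__main__":
    toy_energies = [5, 5, 5, 1, 5, 5, 5, 5, 0, 5, 5, 5]
    print(split_long_segment(toy_energies, max_frames=5, search_window=3))
```

Bounding the piece length bounds the memory needed per piece during training and recognition, which is the motivation the abstract gives; cutting at a low-energy frame is one common way to avoid splitting in the middle of a word.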


Bibliographic details

Source
Proceedings of the speech recognition workshop | 1997 | pp. 127-132 | 6 pages
Conference location: Chantilly, VA (US)
Authors
Ananth Sankar; Larry Heck; Andreas Stolcke
Author affiliation

Speech Technology and Research Laboratory, SRI International, Menlo Park, California (listed for all three authors)

Original format: PDF
Language: English
Chinese Library Classification: Theory and methodology of natural science; automatic simulation theory

