Air traffic control speech recognition system cross-task & speaker adaptation

de Cordoba R.; Ferreiros J.; San-Segundo R.; Macias-Guarasa J.; Montero J.M.; Fernandez F.; D'Haro L.F.; Pardo J.M.

Abstract

We present an overview of the most common techniques used in automatic speech recognition to adapt a general system to a different environment (known as cross-task adaptation), such as an air traffic control (ATC) system. The conditions in ATC are very specific: highly spontaneous speech, the presence of noise, and a high speaking rate. As a result, a typical speech recognizer gives unsatisfactory results. We therefore have to decide on the best modeling option: develop acoustic models specific to those conditions from scratch, using the data available for the new environment, or carry out cross-task adaptation starting from reliable HMMs, which usually requires less data in the target domain. We begin with a description of the main techniques considered for cross-task adaptation, namely maximum a posteriori (MAP) adaptation, maximum likelihood linear regression (MLLR), and the two combined. We applied each of them in two speech recognizers for air traffic control tasks, one for spontaneous speech and the other for a command interface. We show the performance of these techniques and compare them with the development of a new system from scratch. We also show the results obtained for speaker adaptation using a variable amount of adaptation data. The main conclusion is that MLLR can outperform MAP when a large number of transforms is used, and that MLLR followed by MAP is the best option. All of these techniques perform better than developing a new system from scratch, which shows the effectiveness of mean and variance adaptation.
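
As a rough illustration of the two adaptation techniques named in the abstract, the sketch below uses the standard textbook forms of the mean updates, not necessarily the configuration used in the paper. MAP interpolates a prior Gaussian mean with the adaptation data, and MLLR applies an affine transform shared by a group of Gaussians. The function names, the prior weight `tau`, and the assumption that the transform `(A, b)` has already been estimated are all hypothetical; variance adaptation and transform estimation are omitted.

```python
from typing import List, Sequence

Vector = List[float]


def map_adapt_mean(prior_mean: Sequence[float],
                   frames: Sequence[Sequence[float]],
                   posteriors: Sequence[float],
                   tau: float = 10.0) -> Vector:
    """MAP update of a single Gaussian mean.

    mu_hat = (tau * mu_prior + sum_t gamma_t * x_t) / (tau + sum_t gamma_t)

    tau (assumed value) controls how strongly the prior mean is trusted;
    gamma_t is the occupation probability of this Gaussian at frame t.
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)
    weighted_sum = [0.0] * dim
    for x, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * x[d]
    return [(tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
            for d in range(dim)]


def mllr_transform_mean(mean: Sequence[float],
                        A: Sequence[Sequence[float]],
                        b: Sequence[float]) -> Vector:
    """Apply an already-estimated MLLR mean transform: mu_hat = A @ mu + b."""
    return [sum(A[i][j] * mean[j] for j in range(len(mean))) + b[i]
            for i in range(len(b))]


def mllr_then_map(prior_mean: Sequence[float],
                  A: Sequence[Sequence[float]],
                  b: Sequence[float],
                  frames: Sequence[Sequence[float]],
                  posteriors: Sequence[float],
                  tau: float = 10.0) -> Vector:
    """MLLR followed by MAP: the transformed mean serves as the MAP prior."""
    transformed = mllr_transform_mean(prior_mean, A, b)
    return map_adapt_mean(transformed, frames, posteriors, tau)
```

With little adaptation data the MAP estimate stays close to the prior mean, while a shared MLLR transform moves every Gaussian in its group, including those with no observed frames. This is one common explanation for why chaining the two can help.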

Bibliographic details

Source
IEEE Aerospace and Electronic Systems Magazine | 2006, No. 9 | pp. 12-17 | 6 pages
Authors
de Cordoba R.; Ferreiros J.; San-Segundo R.; Macias-Guarasa J.; Montero J.M.; Fernandez F.; D'Haro L.F.; Pardo J.M.
Original format: PDF
Language: English
Chinese Library Classification: Aviation
Keywords
air traffic control; hidden Markov models; maximum likelihood estimation; regression analysis; speech recognition; HMM models; acoustic models; air traffic control system; automatic speech recognition; command interface; cross-task adaptation; maximum a posteriori
