LANDMARK BASED LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION

Abstract

Large vocabulary automatic speech recognition usually relies on Hidden Markov Models (HMMs), which make little use of phonetic or extra-linguistic knowledge. As an alternative, landmark-based speech recognition relies on precise signal landmarks and exploits distinctive features. Different types of landmarks can be used: phonetic, speaker, speech type, video, etc. In this paper we focus on two kinds of landmarks: speaker and phonetic. We propose a theoretical framework that combines both approaches by introducing prior knowledge into a non-stationary HMM-based decoder. As a case study, we investigate how speaker landmarks derived from speaker segmentation can be used for speech recognition, and how broad phonetic landmarks can be integrated into an HMM-based decoder to focus the search on the best path. We show that in this case every phonetic class brings a small improvement, the largest being obtained with glides. Using all broad phonetic classes brings a significant improvement, reducing the error rate from 23% to 14% on a broadcast news transcription task. We also demonstrate experimentally that landmarks do not need to be detected with precise boundaries and that they can be used to speed up the beam search algorithm.
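
The idea of injecting landmark knowledge into an HMM decoder can be illustrated with a toy example. The sketch below is not the authors' decoder: it is a minimal Viterbi search, written under the assumption that each HMM state carries a broad phonetic class label and that a landmark simply forbids, at a given frame, every state whose class differs from the detected one. The function name, the class labels ("vowel", "glide"), the per-frame landmark dictionary and all probabilities are hypothetical.

```python
import math

NEG_INF = float("-inf")


def viterbi_with_landmarks(log_init, log_trans, log_emit, state_class, landmarks):
    """Viterbi search in which landmarks prune states of the wrong class.

    log_init[s]     : log initial probability of state s
    log_trans[r][s] : log transition probability from state r to state s
    log_emit[t][s]  : log emission score of frame t in state s
    state_class[s]  : broad phonetic class of state s (hypothetical labels)
    landmarks       : dict {frame index: required class}; other frames are free
    """
    n_frames, n_states = len(log_emit), len(log_init)

    def allowed(t, s):
        required = landmarks.get(t)
        return required is None or state_class[s] == required

    # Frame 0: states that contradict a landmark start at -inf (pruned).
    score = [log_init[s] + log_emit[0][s] if allowed(0, s) else NEG_INF
             for s in range(n_states)]
    backptr = []
    for t in range(1, n_frames):
        new_score, ptr = [], []
        for s in range(n_states):
            if not allowed(t, s):
                # Pruned hypothesis: never extended, which also saves work.
                new_score.append(NEG_INF)
                ptr.append(0)
                continue
            best_r = max(range(n_states), key=lambda r: score[r] + log_trans[r][s])
            new_score.append(score[best_r] + log_trans[best_r][s] + log_emit[t][s])
            ptr.append(best_r)
        score = new_score
        backptr.append(ptr)

    # Backtrack from the best final state.
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]


# Toy model (assumed values): state 0 is a vowel, state 1 is a glide.
log = math.log
log_init = [log(0.5), log(0.5)]
log_trans = [[log(0.5), log(0.5)], [log(0.5), log(0.5)]]
log_emit = [[log(0.6), log(0.4)] for _ in range(3)]  # acoustics always prefer state 0
state_class = ["vowel", "glide"]

free_path, free_score = viterbi_with_landmarks(
    log_init, log_trans, log_emit, state_class, {})
lm_path, lm_score = viterbi_with_landmarks(
    log_init, log_trans, log_emit, state_class, {1: "glide"})
print(free_path)  # [0, 0, 0]
print(lm_path)    # [0, 1, 0]
```

In this toy setting the landmark at frame 1 overrides the acoustic preference for state 0 and removes competing hypotheses from the search, which is the mechanism by which landmarks could also shrink the beam. A softer variant, replacing the hard `-inf` by a finite penalty or applying the constraint over a tolerance window around the landmark, would be one way to cope with imprecise landmark boundaries; that variant is an assumption of this sketch, not a description of the paper's method.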

Bibliographic record

Source: Conference on Speech Technology and Human-Computer Dialogue, 2007, 8 pages
Authors: Daniel MORARU; Guillaume GRAVIER
Original format: PDF
Chinese Library Classification: TN912-53
Keywords: Speech recognition; Speaker segmentation; Phonetic landmarks; Hidden Markov models

Similar documents

1. Acoustic-Phonetic Approaches for Improving Segment-Based Speech Recognition for Large Vocabulary Continuous Speech [J]. Krerksak Likitsupin, Proadpran Punyabukkana, Chai Wutiwiwatchai. Engineering Journal, 2016, no. 2
2. Korean large vocabulary continuous speech recognition with morpheme-based recognition units [J]. Oh-Wook Kwon, Jun Park. Speech Communication, 2003, no. 3-4
3. An improved two-stage mixed language model approach for handling out-of-vocabulary words in large vocabulary continuous speech recognition [J]. Bert Reveil, Kris Demuynck, Jean-Pierre Martens. Computer Speech and Language, 2014, no. 1
4. LANDMARK BASED LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION [C]. Daniel MORARU, Guillaume GRAVIER. Conference on Speech Technology and Human-Computer Dialogue, 2007
5. An Error Detection and Correction Framework to Improve Large Vocabulary Continuous Speech Recognition [D]. Zhou, Zhengyu. 2009
6. Toward clinical application of landmark-based speech analysis: Landmark expression in normal adult speech [O]. Keiko Ishikawa, Joel MacAuslan, Suzanne Boyce
7. Acoustic-Phonetic Approaches for Improving Segment-Based Speech Recognition for Large Vocabulary Continuous Speech [O]. Krerksak Likitsupin, Proadpran Punyabukkana, Chai Wutiwiwatchai. 2016
