Leveraging topical and positional cues for language modeling in speech recognition

Hsuan-Sheng Chiu; Kuan-Yu Chen; Berlin Chen

Abstract

This paper investigates language modeling with topical and positional information for large-vocabulary continuous speech recognition. We first compare several topic models, both theoretically and empirically, including document topic models and word topic models. In addition, for some spoken documents such as broadcast news stories, documents of the same style tend to share a similar composition and word usage. Such documents can therefore be separated, according to their literary structure, into partitions of identical rhetorical or topical style, such as introductory remarks, elucidations of methodology or affairs, conclusions, and reporters' references or footnotes. We hence present two position-dependent language models for speech recognition that integrate word positional information into the existing n-gram and topic models. Experiments on broadcast news transcription seem to indicate that these position-dependent models obtain results comparable to those of the existing n-gram and topic models.


Bibliographic details

Source: Multimedia Tools and Applications | 2014, Issue 2 | pp. 1465-1481 | 17 pages
Authors: Hsuan-Sheng Chiu; Kuan-Yu Chen; Berlin Chen
Affiliation: Department of Computer Science & Information Engineering, National Taiwan Normal University, Taipei, Taiwan (listed for each of the three authors)
Original format: PDF
Language: English
Keywords: Speech recognition; Language model; Positional information; Topical information; Language model adaptation

Similar literature

1. Leveraging relevance cues for language modeling in speech recognition [J]. Berlin Chen, Kuan-Yu Chen. Information Processing & Management, 2013, Issue 4.
2. Syllable language models for Mandarin speech recognition: Exploiting character language models [J]. Liu X., Hieronymus J.L., Gales M.J.F. The Journal of the Acoustical Society of America, 2013, Issue 1.
3. Comparison of Performance of Enhanced Morpheme-based Language Model with Different Word-based Language Models for Improving the Performance of Tamil Speech Recognition System [J]. S. Saraswathi, T.V. Geetha. ACM Transactions on Asian Language Information Processing, 2007, Issue 3.
4. Distribution-Based Feature Normalization for Robust Speech Recognition Leveraging Context and Dynamics Cues [C]. Yu-Chen Kao, Berlin Chen. Conference of the International Speech Communication Association, 2013.
5. Arabic language modeling with stem-derived morphemes for automatic speech recognition [D]. Heintz, Ilana. 2010.
6. Using Morphological Data in Language Modeling for Serbian Large Vocabulary Speech Recognition [O]. Edvin Pakoci, Branislav Popović, Darko Pekar. 2019.
7. Multimodal Fusion for Cued Speech Language Recognition [O]. Argyropoulos Savvas, Tzovaras Dimitrios, Strintzis Michael G. 2007.
