European Conference on Speech Communication and Technology, v.2; 2001-09-03 to 2001-09-07; Aalborg, DK
Improved Data-Driven Generation of Pronunciation Dictionaries Using an Adapted Word List

Abstract

Data-driven approaches to learning pronunciation variants for phonetic dictionaries must cope with the problem of acquiring a sufficient amount of training data. The cause is not the size of the databases but the unfavorable distribution of word frequencies in natural speech, known as Zipf's law. In this paper we suggest a method that reorganizes a phonetic dictionary according to a given speech database in order to maximize the number of word models for which pronunciation variants can be learned from this corpus. The reorganization takes place automatically by analyzing the orthographic and phonetic transcriptions of the corpus. The method produces an alternative word list consisting of units ranging from partial words to multi-words. The efficiency and the limits of the approach are discussed on the basis of experiments carried out on the German VERBMOBIL corpus.
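
The data-sparsity problem described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names, the `min_count` threshold, and the toy utterances are all hypothetical, chosen only to show how few word types occur often enough to train on, and how frequent adjacent word pairs might be counted as candidate multi-word units.

```python
from collections import Counter

def type_counts(utterances):
    """Count how often each word type occurs across all utterances."""
    counts = Counter()
    for utt in utterances:
        counts.update(utt.split())
    return counts

def pair_counts(utterances):
    """Count adjacent word pairs (candidate multi-word units)."""
    counts = Counter()
    for utt in utterances:
        words = utt.split()
        counts.update(zip(words, words[1:]))
    return counts

def trainable(counts, min_count):
    """Units seen at least `min_count` times (hypothetical threshold)."""
    return {unit for unit, c in counts.items() if c >= min_count}

# Toy orthographic transcriptions; invented, not taken from VERBMOBIL.
utterances = [
    "guten tag",
    "guten tag herr meier",
    "guten morgen",
    "ja das geht",
    "ja das passt",
]

words = type_counts(utterances)
pairs = pair_counts(utterances)

print(len(words), "word types,", len(trainable(words, 2)), "seen at least twice")
print("frequent pairs:", sorted(trainable(pairs, 2)))
```

In this toy data most word types occur only once, which mirrors the Zipf-like distribution the abstract refers to. The paper's method additionally uses phonetic transcriptions and partial-word units, which this sketch does not attempt to model.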