European Conference on Speech Communication and Technology, v.3; 3-7 September 2001; Aalborg, DK

Representation of Large Lexica Using Finite-State Transducers for the Multilingual Text-to-Speech Synthesis Systems


Abstract

Large external language resources used for multilingual text processing in TTS systems pose a serious problem because of the storage space they require and their slow look-up time. Representing large lexica as finite-state transducers is motivated mainly by considerations of space and time efficiency. In this paper we present a method for compiling large German phonetic and morphology lexica (CISLEX), each with about 300,000 words, into corresponding finite-state transducers (FSTs), together with the results. For both lexica, a large reduction in size and an optimal access time were achieved. The original size was 12.526 MB for the German phonetic lexicon and 18.49 MB for the morphology lexicon. The corresponding FSTs occupy only 2.78 MB for the phonetic lexicon and 6.33 MB for the morphology lexicon. At the same time, the look-up time is optimal, since it depends only on the length of the input word and not on the size of the lexicon.
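The claim that look-up time depends only on the length of the input word can be illustrated with a small sketch. This is not the compilation method of the paper: it assumes a trie-shaped deterministic machine that stores the output string at the final state, without the output pushing and minimization a real FST compiler would apply to shrink the machine. The names `State`, `build_transducer`, and `lookup`, as well as the toy entries and their transcriptions, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """One state of a deterministic, trie-shaped machine (illustrative only)."""
    arcs: dict = field(default_factory=dict)  # input symbol -> next State
    final: bool = False                       # True if a lexicon word ends here
    output: str = ""                          # output emitted when a word ends here


def build_transducer(entries):
    """Build the machine from (word, output) pairs.

    Assumption: outputs are stored whole at the final state. A real compiler
    would distribute outputs over arcs and minimize the result to share
    common suffixes, which is where most of the size reduction comes from.
    """
    start = State()
    for word, output in entries:
        state = start
        for symbol in word:
            # Reuse the existing arc for a shared prefix, or create a new state.
            state = state.arcs.setdefault(symbol, State())
        state.final = True
        state.output = output
    return start


def lookup(start, word):
    """Return (output or None, number of arcs followed).

    At most len(word) dictionary accesses are made, regardless of how many
    entries the lexicon holds.
    """
    state = start
    steps = 0
    for symbol in word:
        nxt = state.arcs.get(symbol)
        if nxt is None:
            return None, steps  # no arc for this symbol: word not in lexicon
        state = nxt
        steps += 1
    return (state.output if state.final else None), steps


if __name__ == "__main__":
    # Hypothetical toy entries; the transcriptions are placeholders.
    toy = [("haus", "h aU s"), ("hand", "h a n t"), ("hund", "h U n t")]
    fst = build_transducer(toy)
    print(lookup(fst, "haus"))
    print(lookup(fst, "hau"))
```

Each look-up follows one arc per input symbol, so the step count is bounded by the word length whether the lexicon holds three entries or 300,000. The trie alone only shares prefixes; the size figures reported in the abstract would additionally rely on sharing suffixes and outputs, which this sketch does not attempt.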
