European Conference on Speech Communication and Technology, v.4; 2001-09-03 to 2001-09-07; Aalborg, DK

Crosslingual Speech Recognition with Multilingual Acoustic Models Based on Agglomerative and Tree-Based Triphone Clustering

Abstract

This paper describes our ongoing work on crosslingual speech recognition based on multilingual triphone hidden Markov models. Multilingual acoustic models were built using two different clustering procedures: agglomerative triphone clustering and tree-based triphone clustering. The agglomerative clustering procedure measures the similarity of triphones at the phoneme level, where monophone similarity is estimated by the Houtgast algorithm. The tree-based clustering procedure is based on common broad classes. The Slovenian, German and Spanish 1000 FDB SpeechDat(Ⅱ) databases were used for training. Crosslingual speech recognition was performed on the Norwegian 1000 FDB SpeechDat(Ⅱ) database, with no adaptation or training on the Norwegian data. Norwegian phonemes were mapped using the IPA scheme. Five different Norwegian recognition vocabularies were generated. The best crosslingual system achieved a recognition rate of 45.03%, while the reference Norwegian system achieved 78.32%.
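
The abstract does not say how triphone similarity is derived from monophone similarity, nor which linkage or stopping rule is used, so the following is only a minimal sketch of the general idea. It assumes that triphone similarity is the mean of the three positional monophone similarities, that clusters are merged by average linkage, and that merging stops below a fixed threshold. The similarity table `MONO_SIM`, the threshold value, and all function names are hypothetical; in the paper the monophone similarities come from the Houtgast algorithm, which is not reproduced here.

```python
from itertools import combinations

# Hypothetical monophone similarities in [0, 1]. The values are invented
# for illustration and are NOT taken from the paper.
MONO_SIM = {
    frozenset(("t", "d")): 0.8,
    frozenset(("a", "e")): 0.6,
}


def mono_sim(p, q):
    """Similarity of two monophones; identical phonemes score 1.0."""
    if p == q:
        return 1.0
    return MONO_SIM.get(frozenset((p, q)), 0.0)


def tri_sim(a, b):
    """Assumed triphone similarity: mean over left, centre, right positions."""
    return sum(mono_sim(x, y) for x, y in zip(a, b)) / 3.0


def cluster_sim(c1, c2):
    """Average-linkage similarity between two clusters of triphones."""
    total = sum(tri_sim(a, b) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def agglomerate(triphones, threshold):
    """Merge the most similar pair of clusters until none reaches threshold."""
    clusters = [[t] for t in triphones]
    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), cluster_sim(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        # i < j always holds, so deleting index j leaves index i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Triphones as (left context, centre phoneme, right context) tuples.
triphones = [("s", "a", "t"), ("s", "a", "d"), ("m", "i", "n")]
clusters = agglomerate(triphones, threshold=0.9)
```

With this toy table, the two triphones that differ only in a similar right context end up sharing a cluster, while the unrelated one stays separate. In a multilingual setting the triphones would come from several languages, so a shared cluster corresponds to one acoustic model reused across them.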