Foreign-language conference: International Conference on Advances in Big Data, Computing and Data Communication Systems

HMM-based Speech Synthesis System incorporated with Language Identification for Low-resourced Languages


Abstract

Text-to-speech (TTS) synthesis systems are beneficial for learning new or foreign languages. These systems are currently available for various major languages but not for low-resourced languages. This scarcity may make it harder to learn new languages, specifically low-resourced ones. The development of language-specific systems such as TTS and language identification (LID) has an important role in mitigating the historical linguistic effects of discrimination and domination imposed on low-resourced indigenous languages. This paper presents the development of a multi-language LID+TTS synthesis system that generates audio from input text using the predicted language, for four South African languages: Tshivenda, Sepedi, Xitsonga and IsiNdebele. On the front end, the LID module detects the language of the input text before the TTS synthesis module produces the output audio. The LID module is trained on a dataset of 4 million words and achieves 99% accuracy, outperforming state-of-the-art systems. The hidden Markov model (HMM) method, a robust method for building TTS voices, is used to build new voices in the selected languages. The quality of the voices is measured using the mean opinion score and word error rate metrics, which yielded positive results for the understandability, naturalness, pleasantness, intelligibility and overall impression of the newly created TTS voices. The system is available as a web-based service.
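
The abstract names two evaluation metrics, mean opinion score (MOS) and word error rate (WER), without saying how they were computed. The sketch below shows the standard textbook definitions only and is not the authors' evaluation code. It assumes WER is the word-level edit distance divided by the reference length, and that MOS is the arithmetic mean of listener ratings on a 1 to 5 scale. The function names `word_error_rate` and `mean_opinion_score` are illustrative.

```python
from statistics import mean


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count.

    Assumption: whitespace tokenisation and exact string match per word.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def mean_opinion_score(ratings):
    """Arithmetic mean of listener ratings, assumed to lie on a 1-5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings are assumed to lie between 1 and 5")
    return mean(ratings)
```

In a listening test, a listener would transcribe each synthesized utterance, the transcript would be passed as `hypothesis` against the original input text as `reference`, and the per-criterion ratings (for example naturalness or intelligibility) would each be averaged with `mean_opinion_score`.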
