
Prosody generation in TTS system for Azeri



Abstract

Naturalness is essential to high-quality output in Text-to-Speech (TTS) systems. The naturalness of the synthesized waveform correlates strongly with phonetic coverage and with prosodic features such as loudness, duration, and pitch. This paper describes the implementation of a prosodic TTS system for Azeri. The system to which the prosodic information is added is a diphone-based concatenative synthesizer. To add prosody and increase naturalness, we derive a primary pitch curve for each word from the location of its stressed syllable, and then modify the final pitch contour according to sentence type. To our knowledge, the speech produced by this system is the first prosodic synthetic speech created for Azeri. Subjective listening tests confirm that the synthesized speech is highly intelligible and acceptably natural.
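The two-stage pitch assignment described in the abstract (a per-word curve anchored on the stressed syllable, followed by a sentence-type adjustment) could look roughly like the sketch below. This is an illustrative assumption, not the authors' implementation: the function names `word_pitch_curve` and `apply_sentence_type`, the linear fall-off shape, the base and peak frequencies, and the 20% rise or fall across the sentence are all hypothetical values chosen only to make the idea concrete.

```python
from typing import List


def word_pitch_curve(n_syllables: int, stressed: int,
                     base_hz: float = 120.0,
                     peak_hz: float = 150.0) -> List[float]:
    """Return one pitch target (Hz) per syllable, peaking on the stressed one.

    base_hz and peak_hz are made-up defaults, not values from the paper.
    """
    if not 0 <= stressed < n_syllables:
        raise ValueError("stressed index out of range")
    # Largest distance from the stressed syllable to either word edge.
    span = max(stressed, n_syllables - 1 - stressed, 1)
    curve = []
    for i in range(n_syllables):
        # Assumed shape: linear fall-off with distance from the stress.
        weight = 1.0 - abs(i - stressed) / (span + 1)
        curve.append(base_hz + (peak_hz - base_hz) * weight)
    return curve


def apply_sentence_type(word_curves: List[List[float]],
                        sentence_type: str) -> List[float]:
    """Flatten per-word curves and scale them by position in the sentence.

    Assumed rule: questions rise toward the end, everything else declines.
    """
    flat = [f0 for curve in word_curves for f0 in curve]
    n = len(flat)
    contour = []
    for i, f0 in enumerate(flat):
        pos = i / (n - 1) if n > 1 else 0.0  # 0.0 at start, 1.0 at end
        if sentence_type == "question":
            scale = 1.0 + 0.2 * pos  # hypothetical final rise
        else:
            scale = 1.0 - 0.2 * pos  # hypothetical declination
        contour.append(f0 * scale)
    return contour


if __name__ == "__main__":
    # Two hypothetical words: 3 syllables (stress on the last)
    # and 2 syllables (stress on the last).
    words = [word_pitch_curve(3, 2), word_pitch_curve(2, 1)]
    print(apply_sentence_type(words, "statement"))
    print(apply_sentence_type(words, "question"))
```

In a diphone-based concatenative synthesizer, per-syllable targets like these would typically be interpolated and imposed on the concatenated units by a pitch-modification stage. The abstract does not say which technique the authors use for that step.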
