Conference: 2018 First Asian Conference on Affective Computing and Intelligent Interaction

Emphatic Speech Synthesis and Control Based on Characteristic Transferring in End-to-End Speech Synthesis

Abstract

End-to-end text-to-speech (E2E TTS) synthesis has achieved great success. This work investigates emphatic speech synthesis and control mechanisms in the E2E framework and proposes an E2E-based method for transferring emphasis characteristics between speakers. Characteristic differences between emphatic and neutral speech are learned from a small-scale corpus of parallel neutral and emphatic utterances recorded by one speaker, and are then transferred to another speaker so that emphatic speech can be generated in the latter speaker's voice. An emphasis embedding is injected into the encoder of the extended E2E TTS model to capture these differences, while the decoder and attention module decode them into synthetic neutral or emphatic speech. Speaker codes linked to the decoder and attention module give the E2E model the ability to transfer characteristics between speakers. To control the emphatic strength, an encoder memory manipulation mechanism is proposed. Experimental results indicate the effectiveness of the proposed model.
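
The abstract does not say how the emphasis embedding is combined with the text representation or how the encoder memory is manipulated, so the following is only a toy sketch under stated assumptions, not the authors' implementation. It assumes the emphasis embedding is added to each token embedding before a single-layer encoder, and that emphatic strength is controlled by moving the encoder memory along the difference between its emphatic and neutral versions. All names, sizes, and the scaling rule (`token_table`, `emphasis_table`, `W_enc`, `encode`, `scale_emphasis`) are hypothetical, the weights are random rather than trained, and the speaker codes, attention module, and decoder are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

EMB_DIM = 8  # hypothetical embedding size

# Toy stand-ins for learned parameters (random here, trained in a real model).
token_table = rng.normal(size=(20, EMB_DIM))    # text-token embeddings
emphasis_table = rng.normal(size=(2, EMB_DIM))  # row 0: neutral, row 1: emphatic
W_enc = rng.normal(size=(EMB_DIM, EMB_DIM))     # one-layer stand-in "encoder"


def encode(token_ids, emphasis_flags):
    """Return encoder memory of shape (len(token_ids), EMB_DIM).

    Assumption: the emphasis embedding is added to each token embedding
    before the encoder layer, so the memory depends on the per-token flag.
    """
    x = token_table[token_ids] + emphasis_table[emphasis_flags]
    return np.tanh(x @ W_enc)


def scale_emphasis(token_ids, emphasis_flags, strength):
    """Assumed control rule: shift the memory along (emphatic - neutral).

    strength = 0 gives the neutral memory and 1 the emphatic memory;
    other values interpolate or extrapolate between the two.
    """
    neutral = encode(token_ids, np.zeros_like(emphasis_flags))
    emphatic = encode(token_ids, emphasis_flags)
    return neutral + strength * (emphatic - neutral)


tokens = np.array([3, 7, 1, 12, 5])
flags = np.array([0, 0, 1, 0, 0])  # emphasize the third token only

memory_half = scale_emphasis(tokens, flags, strength=0.5)
```

Because this stand-in encoder processes each token independently, only the memory row of the flagged token changes with `strength`. A real E2E encoder with convolutional or recurrent layers would spread the effect to neighboring positions.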