
Concept-to-Speech generation with knowledge sharing for acoustic modelling and utterance filtering


Abstract

A Concept-to-Speech (CTS) system converts the conceptual representation of a sentence-to-be-spoken into speech. While some CTS systems consist of independently built text generation and Text-to-Speech (TTS) modules, the majority of the existing CTS systems enhance the connection between these two modules with a prosodic prediction module that utilizes linguistic knowledge from the text generator to predict prosodic features for TTS generation. However, knowledge embodied within the individual modules has the potential to be shared in more ways. This paper describes knowledge sharing for acoustic modelling and utterance filtering in a Mandarin CTS system. First, syntactic information generated by the text generator is propagated to a hidden Markov model (HMM) based acoustic model within the TTS module and replaces the symbolic prosodic phrasing features therein. Our experimental results show that this approach alleviates the local hard-decision problem in automatic prosodic phrasing for Mandarin CTS systems and achieves a comparable performance to the traditional approach without explicit prosodic phrasing. Second, the acoustic features of multiple synthetic utterances expressing the same input concept are utilized to evaluate the utterance candidates. With this 'post-processing' mechanism, our CTS system is able to filter out inferior synthetic utterances and find an acceptable candidate to express the input concept.

Bibliographic details

  • Source
    Computer Speech and Language | 2016, Issue 7 | pp. 46-67 | 22 pages
  • Author affiliations

    National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei, Anhui 230027, PR China;

    National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei, Anhui 230027, PR China; University of Science and Technology of China, No. 96, JinZhai Road, Hefei, Anhui, China;

    National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei, Anhui 230027, PR China;

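  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification:
  • Keywords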

    Concept-to-Speech; Speech synthesis; Hidden Markov model; Natural language generation;

