Journal: IEEE Transactions on Audio, Speech, and Language Processing

Statistical Text-to-Speech Synthesis Based on Segment-Wise Representation With a Norm Constraint


Abstract

In statistical HMM-based text-to-speech systems (STTS), speech feature dynamics are modeled by first- and second-order feature frame differences, which typically do not satisfactorily represent the frame-to-frame feature dynamics present in natural speech. The reduced dynamics result in over-smoothing of speech features, which often makes synthesized speech sound muffled. In this correspondence, we propose a method to enhance a baseline STTS system by introducing a segment-wise model representation with a norm constraint. The segment-wise representation provides additional degrees of freedom in speech feature determination. We exploit these degrees of freedom to increase the speech feature vector norm so that it matches a norm constraint. As a result, statistically generated speech features are less over-smoothed, resulting in more natural-sounding speech, as judged by listening tests.
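The two ingredients named in the abstract, frame differences and a norm target, can be illustrated with a small sketch. This is not the paper's algorithm: the difference windows (`0.5 * (c[t+1] - c[t-1])` and `c[t+1] - 2*c[t] + c[t-1]`), the edge handling, and the plain rescaling in `rescale_to_norm` are assumptions chosen for illustration, and all function names are hypothetical. The abstract does not specify how the norm constraint is actually enforced within the segment-wise model.

```python
import math


def frame_differences(track):
    """Return (delta, delta2) for a 1-D feature track.

    Assumed windows (not taken from the paper):
      delta[t]  = 0.5 * (c[t+1] - c[t-1])
      delta2[t] = c[t+1] - 2*c[t] + c[t-1]
    Edge frames are repeated so the outputs have the same length as the input.
    """
    n = len(track)
    padded = [track[0]] + list(track) + [track[-1]]
    delta = [0.5 * (padded[t + 2] - padded[t]) for t in range(n)]
    delta2 = [padded[t + 2] - 2.0 * padded[t + 1] + padded[t] for t in range(n)]
    return delta, delta2


def l2_norm(values):
    """Euclidean norm of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values))


def rescale_to_norm(segment, target_norm):
    """Scale a segment so its L2 norm equals target_norm.

    Illustrative stand-in only: a uniform gain, not the paper's
    constrained estimation.
    """
    current = l2_norm(segment)
    if current == 0.0:
        return list(segment)  # an all-zero segment cannot be rescaled
    gain = target_norm / current
    return [gain * v for v in segment]


if __name__ == "__main__":
    # Hypothetical over-smoothed segment and a hypothetical norm target.
    smoothed = [0.0, 0.5, -0.5, 1.0, -1.0]
    target = 2.0 * l2_norm(smoothed)
    enhanced = rescale_to_norm(smoothed, target)
    print("norm before:", l2_norm(smoothed))
    print("norm after: ", l2_norm(enhanced))
    print("deltas:", frame_differences(enhanced))
```

A uniform gain is the simplest way to hit a norm target. The abstract suggests that the paper instead uses the extra degrees of freedom of the segment-wise representation, so treat the rescaling here as a placeholder for that step.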
