
Listeners’ weighting of acoustic cues to synthetic speech naturalness: A multidimensional scaling analysis

Abstract

The quality of current commercial speech synthesis systems is now so high that system improvements are being made at subtle sub- and supra-segmental levels. Human perceptual evaluation of such subtle improvements requires a highly sophisticated level of perceptual attention to specific acoustic characteristics or cues. However, it is not well understood what acoustic cues listeners attend to by default when asked to evaluate synthetic speech. It may therefore be quite difficult to design an evaluation method that allows listeners to concentrate on only one dimension of the signal while ignoring others that are perceptually more important to them.

The aim of the current study was to determine which acoustic characteristics of unit-selection synthetic speech are most salient to listeners when evaluating the naturalness of such speech. This study made use of multidimensional scaling techniques to analyse listeners’ pairwise comparisons of synthetic speech sentences. Results indicate that listeners place a great deal of perceptual importance on the presence of artifacts and discontinuities in the speech, somewhat less importance on aspects of segmental quality, and very little importance on stress/intonation appropriateness. These relative differences in importance will affect listeners’ ability to attend to these different acoustic characteristics of synthetic speech, and should therefore be taken into account when designing appropriate methods of synthetic speech evaluation.
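As a rough illustration of how multidimensional scaling turns pairwise comparisons into a perceptual space, the sketch below implements classical (Torgerson) MDS in plain Python. It is not the study's own procedure: the abstract does not say which MDS variant was used, and the dissimilarity values here are hypothetical numbers made up for the example, not data from the paper. The function names (`double_centre`, `top_eigenpair`, `classical_mds`) are likewise illustrative. The sketch also assumes the dissimilarities are close to Euclidean, so that the leading eigenvalues are positive.

```python
import math

def double_centre(dissim):
    """Convert a symmetric dissimilarity matrix into a Gram matrix (-1/2 * J * D^2 * J)."""
    n = len(dissim)
    sq = [[dissim[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # The matrix is symmetric, so row means double as column means.
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]

def top_eigenpair(matrix, iterations=500):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    n = len(matrix)
    vec = [1.0 + 0.1 * i for i in range(n)]  # fixed, non-constant start vector
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return 0.0, vec
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the unit-length vector gives the eigenvalue.
    value = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(n))
                for i in range(n))
    return value, vec

def classical_mds(dissim, n_dims=2):
    """Embed items in n_dims dimensions so distances approximate dissim."""
    gram = double_centre(dissim)
    n = len(gram)
    coords = [[0.0] * n_dims for _ in range(n)]
    for k in range(n_dims):
        value, vec = top_eigenpair(gram)
        scale = math.sqrt(max(value, 0.0))  # guard against negative eigenvalues
        for i in range(n):
            coords[i][k] = scale * vec[i]
        # Deflate: remove the component just extracted before the next pass.
        gram = [[gram[i][j] - value * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return coords

# Hypothetical averaged dissimilarity judgements for four synthetic sentences.
r10 = math.sqrt(10.0)
ratings = [
    [0.0, 3.0, 1.0, r10],
    [3.0, 0.0, r10, 1.0],
    [1.0, r10, 0.0, 3.0],
    [r10, 1.0, 3.0, 0.0],
]

coords = classical_mds(ratings, n_dims=2)
for label, point in zip("ABCD", coords):
    print(label, [round(x, 3) for x in point])
```

In this toy configuration the first recovered dimension separates the stimuli far more than the second. A result of that shape in real data would be read as listeners weighting one perceptual attribute more heavily than another; which acoustic attribute each axis corresponds to has to be established separately, as the study does.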
