
Speech synthesis algorithms for voice conversion.


Abstract

The first goal of this research was to create a software-based voice conversion system to independently and automatically modify the characteristics of the human voice. The system was intended to generate high-quality test tokens for speech science and psychoacoustic studies. The second goal was to develop algorithms to convert the voice of one speaker to that of another. The results of this study will be of interest to researchers in speech analysis, speech synthesis, and speaker identification.

The key ideas of our voice conversion system are based on the source-tract production model, a highly parametric representation for speech analysis and synthesis. The software system consists of three subsystems, a speech analyzer, a parameter modifier, and a speech synthesizer, which extract, modify, and synthesize five types of acoustic features, respectively. The features are the formant frequencies and bandwidths, the shape of the glottal pulse, the voice-type classification, the pitch contour, and the gain contour. The first two types of parameters are frame-based, and they represent the characteristics of the speaker's vocal tract and glottal folds, respectively. The final three parameters form the controlling parameters of our system. One major feature of our acoustic model is that the controlling parameters are independent of the other parameters, so they govern how the frame-based information is concatenated, for example to change the speaking rate or increase the voice volume. This makes it possible to mimic the characteristics of another speaker's voice, including the prosodic features.

The voice conversion algorithms are based on a speaker adaptation model that treats speaker differences as arising from a parametric transformation. The voice conversion task is then realized as a mapping between two sets of parameters. Several experiments were conducted to test the performance of our voice conversion algorithms. The affine transformation method proved effective for converting single-syllable words, but less so for sentences, perhaps because a sentence contains more locally dynamic changes than our linear mapping methods can capture. One possible improvement is to include a phoneme detector in the system and estimate piecewise mapping functions instead of a single linear function for the entire utterance.
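The abstract itself contains no code; the following is a minimal, hypothetical Python sketch of the two ideas it describes: a frame-based parameter set for the source-tract model (with the controlling parameters kept separate from the frame-based ones) and an affine map y ≈ A x + b between a source and a target speaker's parameters, fitted by least squares over time-aligned frames. All names here (FrameParams, fit_affine_map, convert_frame) are illustrative assumptions, not the dissertation's actual implementation.

    # Hypothetical sketch, assuming time-aligned analysis frames from two speakers.
    import numpy as np
    from dataclasses import dataclass


    @dataclass
    class FrameParams:
        """Parameters extracted by the speech analyzer for one analysis frame."""
        formant_freqs: np.ndarray   # formant frequencies (Hz), vocal-tract feature
        formant_bws: np.ndarray     # formant bandwidths (Hz), vocal-tract feature
        glottal_shape: np.ndarray   # glottal-pulse shape parameters, glottal-fold feature
        voice_type: int             # voice-type classification (controlling parameter)
        pitch: float                # pitch-contour value in Hz (controlling parameter)
        gain: float                 # gain-contour value (controlling parameter)

        def spectral_vector(self) -> np.ndarray:
            # Only the frame-based, speaker-dependent parameters enter the mapping;
            # the controlling parameters stay independent so prosody (speaking rate,
            # volume, pitch contour) can be manipulated separately.
            return np.concatenate(
                [self.formant_freqs, self.formant_bws, self.glottal_shape]
            )


    def fit_affine_map(X_src: np.ndarray, Y_tgt: np.ndarray):
        """Fit y ~ A x + b from time-aligned frames of a source and a target speaker.

        X_src and Y_tgt have shape (n_frames, dim) with matching rows.
        Returns (A, b) minimizing the squared error over all frames.
        """
        n, d = X_src.shape
        X_aug = np.hstack([X_src, np.ones((n, 1))])        # constant column for the bias b
        W, *_ = np.linalg.lstsq(X_aug, Y_tgt, rcond=None)  # W has shape (d + 1, dim)
        return W[:d].T, W[d]                               # A: (dim, d), b: (dim,)


    def convert_frame(x: np.ndarray, A: np.ndarray, b: np.ndarray) -> np.ndarray:
        """Map one source frame's spectral vector toward the target speaker."""
        return A @ x + b

Under these assumptions, the piecewise extension proposed at the end of the abstract would amount to fitting one (A, b) pair per phoneme class selected by the phoneme detector, rather than a single global map for the whole utterance.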
