
Speech synthesis algorithms for voice conversion.


Abstract

The first goal of this research was to create a software-based voice conversion system to independently and automatically modify the characteristics of the human voice. The system was intended to generate high-quality test tokens for speech science and psychoacoustic studies. The second goal was to develop algorithms to convert the voice of one speaker to that of another. The results of this study will be of interest to researchers in speech analysis, speech synthesis, and speaker identification.

The key ideas of our voice conversion system are based on the source-tract production model, a highly parametric representation for speech analysis and synthesis. The software system consists of three subsystems, a speech analyzer, a parameter modifier, and a speech synthesizer, which extract, modify, and synthesize five types of acoustic features, respectively. The features are the formant frequencies and bandwidths, the shape of the glottal pulse, the voice-type classification, the pitch contour, and the gain contour. The first two types of parameters are frame-based, and they represent the characteristics of the speaker's vocal tract and glottal folds, respectively. The final three parameters form the controlling parameters of our system. One major feature of our acoustic model is that the controlling parameters are independent of the other parameters, so they govern how the frame-based information is concatenated, for example to change the speaking rate or increase the voice volume. This makes it possible to mimic the characteristics of another speaker's voice, including the prosodic features.

The voice conversion algorithms are based on a speaker adaptation model that treats speaker differences as arising from a parametric transformation. The voice conversion task is then realized as a mapping between two sets of parameters. Several experiments were conducted to test the performance of our voice conversion algorithms. The affine transformation method proved effective for converting single-syllable words, but less so for sentences, perhaps because a sentence contains more locally dynamic changes than our linear mapping methods can capture. One possible improvement is to include a phoneme detector in the system and estimate piecewise mapping functions instead of a single linear function for the entire utterance.
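The abstract itself contains no code; the following is a minimal, hypothetical Python sketch of the two ideas it describes: a frame-based parameter set for the source-tract model (with the controlling parameters kept separate from the frame-based ones) and an affine map y ≈ A x + b between a source and a target speaker's parameters, fitted by least squares over time-aligned frames. All names here (FrameParams, fit_affine_map, convert_frame) are illustrative assumptions, not the dissertation's actual implementation.

    # Hypothetical sketch, assuming time-aligned analysis frames from two speakers.
    import numpy as np
    from dataclasses import dataclass


    @dataclass
    class FrameParams:
        """Parameters extracted by the speech analyzer for one analysis frame."""
        formant_freqs: np.ndarray   # formant frequencies (Hz), vocal-tract feature
        formant_bws: np.ndarray     # formant bandwidths (Hz), vocal-tract feature
        glottal_shape: np.ndarray   # glottal-pulse shape parameters, glottal-fold feature
        voice_type: int             # voice-type classification (controlling parameter)
        pitch: float                # pitch-contour value in Hz (controlling parameter)
        gain: float                 # gain-contour value (controlling parameter)

        def spectral_vector(self) -> np.ndarray:
            # Only the frame-based, speaker-dependent parameters enter the mapping;
            # the controlling parameters stay independent so prosody (speaking rate,
            # volume, pitch contour) can be manipulated separately.
            return np.concatenate(
                [self.formant_freqs, self.formant_bws, self.glottal_shape]
            )


    def fit_affine_map(X_src: np.ndarray, Y_tgt: np.ndarray):
        """Fit y ~ A x + b from time-aligned frames of a source and a target speaker.

        X_src and Y_tgt have shape (n_frames, dim) with matching rows.
        Returns (A, b) minimizing the squared error over all frames.
        """
        n, d = X_src.shape
        X_aug = np.hstack([X_src, np.ones((n, 1))])        # constant column for the bias b
        W, *_ = np.linalg.lstsq(X_aug, Y_tgt, rcond=None)  # W has shape (d + 1, dim)
        return W[:d].T, W[d]                               # A: (dim, d), b: (dim,)


    def convert_frame(x: np.ndarray, A: np.ndarray, b: np.ndarray) -> np.ndarray:
        """Map one source frame's spectral vector toward the target speaker."""
        return A @ x + b

Under these assumptions, the piecewise extension proposed at the end of the abstract would amount to fitting one (A, b) pair per phoneme class selected by the phoneme detector, rather than a single global map for the whole utterance.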
