Unit Selection Using k-Nearest Neighbor Search for Concatenative Speech Synthesis


Abstract

We propose a new approach to rapidly identifying adequate synthesis units in extremely large speech corpora. Our aim is to develop a concatenative speech synthesis system with high performance (both speech quality and throughput) for various practical applications. Utilizing very large speech corpora allows more natural-sounding synthesized speech to be created; the downside is an increase in the time taken to locate the synthesis units needed. The key to overcoming this problem is introducing state-of-the-art database retrieval technologies. The first selection step, based on simple hash search, tabulates all synthesis unit candidates. The second step selects the N best candidates using nearest neighbor search, a typical database retrieval technique. Finally, the best sequence of synthesis units is determined by Viterbi search. A runtime measurement test and a subjective experiment are carried out. For a 15-hour corpus, the results confirm that the proposed approach reduces the runtime by about 40% compared to using only hash search, with no degradation in the quality of the synthesized speech.
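
The three selection steps named in the abstract (hash lookup, N-best nearest neighbor pruning, Viterbi search) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Unit` record, the two-dimensional feature vectors, the Euclidean target cost, and the `concat_cost` rule are all hypothetical, and a brute-force `heapq.nsmallest` scan stands in for whatever nearest neighbor index the paper actually uses.

```python
import heapq
import math
from collections import defaultdict
from typing import NamedTuple


class Unit(NamedTuple):
    uid: int      # position of the unit in the corpus (hypothetical field)
    label: str    # phoneme-like label used as the hash key
    feats: tuple  # toy feature vector, e.g. (pitch, duration)


def build_index(corpus):
    """Hash table from label to every corpus unit carrying that label."""
    index = defaultdict(list)
    for unit in corpus:
        index[unit.label].append(unit)
    return index


def target_cost(target_feats, unit):
    # Assumed target cost: Euclidean distance in feature space.
    return math.dist(target_feats, unit.feats)


def concat_cost(prev, cur):
    # Assumed join cost: zero for units adjacent in the corpus,
    # otherwise the mismatch in the first feature.
    if cur.uid == prev.uid + 1:
        return 0.0
    return abs(prev.feats[0] - cur.feats[0])


def select_units(targets, index, n_best=2):
    """targets is a list of (label, feats); returns (total_cost, units)."""
    # Steps 1 and 2: hash lookup, then keep the n_best nearest candidates.
    lattice = []
    for label, feats in targets:
        candidates = index.get(label, [])
        if not candidates:
            raise KeyError(f"no unit with label {label!r}")
        nearest = heapq.nsmallest(
            n_best, candidates, key=lambda u: target_cost(feats, u)
        )
        lattice.append(nearest)

    # Step 3: Viterbi search over the pruned candidate lattice.
    best = [(target_cost(targets[0][1], u), [u]) for u in lattice[0]]
    for (_, feats), candidates in zip(targets[1:], lattice[1:]):
        new_best = []
        for u in candidates:
            cost, path = min(
                ((c + concat_cost(p[-1], u), p) for c, p in best),
                key=lambda cp: cp[0],
            )
            new_best.append((cost + target_cost(feats, u), path + [u]))
        best = new_best
    return min(best, key=lambda cp: cp[0])


corpus = [
    Unit(0, "k", (100.0, 0.05)),
    Unit(1, "a", (120.0, 0.10)),
    Unit(2, "k", (180.0, 0.06)),
    Unit(3, "a", (185.0, 0.12)),
    Unit(4, "a", (300.0, 0.30)),
]
targets = [("k", (110.0, 0.05)), ("a", (115.0, 0.10))]
cost, path = select_units(targets, build_index(corpus), n_best=2)
print(cost, [u.uid for u in path])
```

In this sketch the Viterbi step only scores the `n_best` candidates per position rather than every unit returned by the hash lookup, which is the kind of pruning the abstract credits for the runtime reduction; the actual saving depends on corpus size and on the index structure used.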


Bibliographic details

Source
3rd International Universal Communications Symposium 2009 | 2009 | pp. 379-382 | 4 pages
Conference location: Tokyo (JP)
Authors
Hideyuki Mizuno; Satoshi Takahashi
Author affiliations

NTT Cyber Space Laboratories 1-1 Hikari-no-oka, Yokosuka-Shi, Kanagawa, 239-0847, Japan;

Original format: PDF
Language of text: English
Chinese Library Classification: Communications
Keywords
text to speech; concatenative speech synthesis; synthesis unit selection; nearest neighbor search;


Similar literature


1. Fast Concatenative Speech Synthesis Using Pre-Fused Speech Units Based on the Plural Unit Selection and Fusion Method [J]. Masatsune TAMURA, Tatsuya MIZUTANI, Takehiko KAGOSHIMA. IEICE Transactions on Information and Systems. 2007, No. 2
2. A Concatenative Speech Synthesis Method Using Context Dependent Phoneme Sequences with Variable Length as Search Units [J]. Hiroyuki SEGI, Tohru TAKAGI. 電子情報通信学会技術研究報告. 音声. Speech. 2003, No. 264
3. Unit Selection Using k-Nearest Neighbor Search for Concatenative Speech Synthesis [C]. Hideyuki Mizuno, Satoshi Takahashi. Proceedings of the 3rd International Universal Communication Symposium. 2009
4. Voting Nearest Neighbors: SVM Constraints Selection Algorithm Based on K-Nearest Neighbors [D]. Moreira da Costa, Leandro. 2019
5. Privacy-Enhancing k-Nearest Neighbors Search over Mobile Social Networks [O]. Yuxi Li, Fucai Zhou, Yue Ge. 2021
6. Admissible Stopping in Viterbi Beam Search for Unit Selection in Concatenative Speech Synthesis [O]. Shinsuke Sakai, Tatsuya Kawahara, Satoshi Nakamura. 2009
