PFHTS-IDSS: A Hybrid HTS-based Framework for Indonesian Speech Synthesis via Phoneme and Full-context Lab

Lei Zhenfeng; Zhai Junjun; Chen Juntao; Liu Wenhao; Yang Shuangyuan; ul Haq Anwar




Abstract

In recent years, globalization has highlighted the importance of machines that can provide customized communication in different languages. The majority of research in the field focuses on developing technologies for widely used languages such as English. In this study, we apply HMM-based speech synthesis (HTS) technology to the Indonesian language. The proposed hybrid HTS-based framework, PFHTS-IDSS, uses phoneme and full-context lab to synthesize Indonesian with higher accuracy. First, we identify a list of Indonesian phonemes according to the initial-final structure of the Chinese language. Based on this, we add zero-initials that match the Indonesian acoustic performance and HTS, which can make the synthesized speech natural and smooth. Second, we treat Indonesian phonemes as synthetic units and synthesize speech through the triphone and full-context lab. In addition, we design the context properties of the full-context lab and the corresponding question set to train the acoustic model, which can eliminate machine sounds. Experimental results suggest that the accuracy of phoneme segmentation (PSA) and the naturalness of speech synthesis (SSN) are significantly improved by PFHTS-IDSS. In particular, the PSA obtained when phonemes are selected as synthetic units reaches 88.3%, and the corresponding SSN based on full-context lab is 4.1. The results presented in this paper may be used in multilingual free interactive systems to promote better communication in voice navigation, intelligent speakers and question-answering systems.
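
The abstract mentions synthesizing speech from phoneme units through triphone labels. As a rough illustration only, the sketch below expands a phoneme sequence into triphone labels in the common HTK-style `left-center+right` notation. The function name `to_triphones`, the `sil` boundary symbol, and the example phoneme sequence are assumptions made for this illustration; they are not taken from the paper, whose phoneme inventory and label format are not given in this record.

```python
from typing import List


def to_triphones(phonemes: List[str], boundary: str = "sil") -> List[str]:
    """Expand a phoneme sequence into HTK-style triphone labels.

    Each label has the form left-center+right. The sequence is padded
    with a boundary symbol (assumed here to be "sil") on both sides so
    that the first and last phonemes also get a full context.
    """
    padded = [boundary] + list(phonemes) + [boundary]
    labels = []
    for i in range(1, len(padded) - 1):
        left, center, right = padded[i - 1], padded[i], padded[i + 1]
        labels.append(f"{left}-{center}+{right}")
    return labels


if __name__ == "__main__":
    # Hypothetical phoneme symbols for the Indonesian word "saya";
    # the paper's own phoneme list may use different symbols.
    for label in to_triphones(["s", "a", "y", "a"]):
        print(label)
```

A full-context label would extend each such triphone with further context properties (for example position and prosodic features); which properties PFHTS-IDSS uses is not stated in the abstract, so they are left out here.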


Bibliographic record

Source
International Journal of Pattern Recognition and Artificial Intelligence | 2021, Issue 4 | 2158004.1-2158004.25 | 25 pages
Authors
Lei Zhenfeng; Zhai Junjun; Chen Juntao; Liu Wenhao; Yang Shuangyuan; ul Haq Anwar;
Affiliations

Xiamen Univ Sch Informat Xiamen 361005 Peoples R China;

Yunnan Univ Sch Informat Sci & Engn Kunming 650504 Peoples R China;

Guangzhou City Polytech Dept Informat Technol Guangzhou 510405 Peoples R China;

Chinese Peoples Armed Police Force Dezhou Branch Dezhou 253000 Peoples R China;

Xiamen Univ Sch Informat Xiamen 361005 Peoples R China;

Xiamen Univ Sch Informat Xiamen 361005 Peoples R China;

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language: English
Keywords
Indonesian; phoneme segmentation; speech synthesis; machine learning;


Similar literature

1. Segmentation and Classification of Vowel Phonemes of Assamese Speech Using a Hybrid Neural Framework [J]. Mousmita Sarma, Kandarpa Kumar Sarma. Applied computational intelligence and soft computing. 2012, Issue 8

2. Segmentation and Classification of Vowel Phonemes of Assamese Speech Using a Hybrid Neural Framework [J]. Mousmita Sarma, Kandarpa Kumar Sarma. Applied computational intelligence and soft computing. 2012

3. Framework for Choosing a Set of Syllables and Phonemes for Lithuanian Speech Recognition [J]. Sigita Laurinciukaite, Antanas Lipeika. Informatica. 2007, Issue 3

4. Automatic Acquisition of Phoneme Models and Its Application to Phoneme Labeling of a Large Size of Speech Corpus [C]. Motoyuki Suzuki, Teruhiko Maeda, Hiroki Mori, Discovery science. 1998

5. Optimization frameworks for the design, synthesis, supply chain, and strategic planning of novel hybrid energy processes [D]. Elia, Josephine Anastasia. 2013

6. Distinct representations of phonemes, syllables and supra-syllabic sequences in the speech production network [O]. Maya G. Peeva, Frank H. Guenther, Jason A. Tourville

7. Error Detection of Grapheme-to-Phoneme Conversion in Text-to-Speech Synthesis Using Speech Signal and Lexical Context [O]. Vythelingum, Kévin; Estève, Yannick; Rosec, Olivier. 2017
