METHOD FOR GENERATING SPEAKER-ADAPTED SPEECH SYNTHESIZER MODEL WITH A FEW SAMPLES USING A FINE-TUNING BASED ON DEEP CONVOLUTIONAL NEURAL NETWORK AI

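Machine translation: Method for generating a speaker-adapted speech synthesis model from a small number of samples by fine-tuning based on a deep convolutional neural network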

Abstract

The present invention relates to artificial intelligence for speech synthesis, and in particular to a method for generating a speaker-adapted speech synthesis model from a small number of samples by fine-tuning a deep convolutional neural network. The method comprises: converting text into numbers representing the text information (character embedding) using a text encoder; converting a target voice file into a speaker embedding using a speaker encoder; converting the text embedding and the speaker embedding into a context vector using personalized attention, which draws on linguistic and phoneme knowledge; transforming the context vector into a predicted mel-spectrogram using an audio decoder; and generating a waveform voice file from the predicted mel-spectrogram and SR using a vocoder. With this technique, the data required to build a speaker-adapted speech synthesis model is reduced from about 5 hours to about 10 minutes, which saves the time and cost of creating a speech synthesis system.
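
The five stages named in the abstract can be traced as a data flow. The following is a minimal sketch under stated assumptions, not the patented implementation: all function names, dimensions, the toy character set, and the random (untrained) weights are hypothetical, the attention is plain dot-product attention standing in for the "personalized attention", and the vocoder is a placeholder sine-envelope generator rather than a real vocoder. It only shows which tensor shapes pass between stages.

```python
import numpy as np

VOCAB = "abcdefghijklmnopqrstuvwxyz '"   # toy character set (assumption)
EMB_DIM, SPK_DIM, N_MELS = 16, 8, 80      # illustrative sizes (assumption)
CTX_DIM = EMB_DIM + SPK_DIM


def init_params(seed=0):
    """Random, untrained weights; fine-tuning would update tensors like these."""
    rng = np.random.default_rng(seed)
    return {
        "char_table": rng.normal(size=(len(VOCAB), EMB_DIM)),
        "spk_proj": rng.normal(size=(N_MELS, SPK_DIM)),
        "query_freq": rng.uniform(1.0, 20.0, size=CTX_DIM),
        "dec_proj": rng.normal(size=(CTX_DIM, N_MELS)),
    }


def text_encoder(text, char_table):
    """Character embedding: characters -> integer ids -> vectors."""
    ids = np.array([VOCAB.index(c) for c in text.lower() if c in VOCAB], dtype=int)
    return char_table[ids]                              # (T_text, EMB_DIM)


def speaker_encoder(ref_mel, spk_proj):
    """Summarise a reference mel-spectrogram as one unit-norm speaker embedding."""
    emb = ref_mel.mean(axis=0) @ spk_proj               # (SPK_DIM,)
    return emb / np.linalg.norm(emb)


def attention_context(text_emb, spk_emb, query_freq, n_frames):
    """Dot-product attention over text embeddings tagged with the speaker embedding."""
    memory = np.concatenate(
        [text_emb, np.tile(spk_emb, (len(text_emb), 1))], axis=1
    )                                                   # (T_text, CTX_DIM)
    pos = np.linspace(0.0, 1.0, n_frames)[:, None]
    queries = np.sin(pos * query_freq[None, :])         # (n_frames, CTX_DIM)
    scores = queries @ memory.T                         # (n_frames, T_text)
    scores -= scores.max(axis=1, keepdims=True)         # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)       # softmax over text positions
    return weights @ memory                             # (n_frames, CTX_DIM)


def audio_decoder(context, dec_proj):
    """Map each context vector to one predicted mel-spectrogram frame."""
    return context @ dec_proj                           # (n_frames, N_MELS)


def vocoder(mel, sr=22050, hop=256):
    """Placeholder vocoder: a 220 Hz sine shaped by per-frame mel energy."""
    envelope = np.repeat(np.abs(mel).mean(axis=1), hop)  # (n_frames * hop,)
    t = np.arange(envelope.size) / sr
    wave = envelope * np.sin(2.0 * np.pi * 220.0 * t)
    return wave / (np.max(np.abs(wave)) + 1e-9)


def synthesize(text, ref_mel, params, n_frames=50, sr=22050):
    """Run the five stages in the order the abstract lists them."""
    text_emb = text_encoder(text, params["char_table"])
    spk_emb = speaker_encoder(ref_mel, params["spk_proj"])
    context = attention_context(text_emb, spk_emb, params["query_freq"], n_frames)
    mel = audio_decoder(context, params["dec_proj"])
    return mel, vocoder(mel, sr)


if __name__ == "__main__":
    params = init_params()
    # Stand-in for the mel-spectrogram of a short target-speaker recording.
    ref_mel = np.abs(np.random.default_rng(1).normal(size=(120, N_MELS)))
    mel, wave = synthesize("hello world", ref_mel, params)
    print(mel.shape, wave.shape)
```

In a fine-tuning setup of the kind the abstract describes, the weights would start from a model trained on many speakers and be updated on the roughly 10 minutes of target-speaker audio; the sketch leaves training out and treats SR as an audio sampling rate, which is an interpretation rather than something the abstract states.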

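Bibliographic data

  • Publication number: KR20200092505A
  • Patent type:
  • Publication date: 2020-08-04
  • Original format: PDF
  • Applicant/patentee: 네오데우스 주식회사
  • Application number: KR20190004350
  • Inventor: 박세찬
  • Filing date: 2019-01-13
  • Classification: G10L13/08; G06N3/08; G10L13/033; G10L19/02; G10L25/30
  • Country: KR
  • Database entry time: 2022-08-21 11:06:14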
