Speaker feature extraction apparatus, speaker feature extraction method, speech recognition device, speech synthesis device, and program recording medium

Abstract

PROBLEM TO BE SOLVED: To extract speaker characteristics accurately from a small quantity of utterance data.

SOLUTION: The acoustic model storage section 7 holds one acoustic model for each of n speaker clusters, in a first acoustic model storage section 7a through an n-th acoustic model storage section 7n. The learning (training) speakers are clustered using the vocal tract length normalization coefficient α as the distance between speakers. For every learning speaker, α is determined by estimating the likelihood with equation (a) under the criterion of maximizing the likelihood with respect to the acoustic models of the learning speakers, using a nonlinear frequency warping obtained by applying a correction factor β to α. The distances between learning speakers therefore reflect both the vocal tract length, which is a source of physiological variation, and correction information about each speaker's manner and habits of utterance. As a result, the learning speakers are clustered on speaker characteristics that are extracted accurately, with speaking habits taken into account, from a small quantity of utterance data.
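
As an illustration only, the clustering step described in the abstract could be sketched as follows. The abstract does not give equation (a), the form of the nonlinear warping, or the clustering algorithm, so this sketch assumes that each learning speaker's α has already been estimated, treats |α_i - α_j| as the distance between speakers i and j, and uses a simple one-dimensional k-means as a stand-in. The function name, speaker IDs, and α values are hypothetical.

```python
def cluster_speakers_by_alpha(alphas, n_clusters, max_iters=100):
    """Group speakers into n_clusters using |alpha_i - alpha_j| as distance.

    alphas: dict mapping a speaker ID to its already estimated alpha
            (the estimation itself is NOT implemented here).
    Returns a dict mapping each speaker ID to a cluster index.
    This is a plain 1-D k-means, an assumed stand-in for the patent's method.
    """
    if not 1 <= n_clusters <= len(alphas):
        raise ValueError("n_clusters must be between 1 and the number of speakers")

    values = sorted(alphas.values())

    # Deterministic initialisation: spread the centres over the sorted alphas.
    if n_clusters == 1:
        centres = [sum(values) / len(values)]
    else:
        step = (len(values) - 1) / (n_clusters - 1)
        centres = [values[round(k * step)] for k in range(n_clusters)]

    def nearest(alpha):
        # Index of the centre with the smallest absolute distance to alpha.
        return min(range(n_clusters), key=lambda k: abs(alpha - centres[k]))

    for _ in range(max_iters):
        members = [[] for _ in range(n_clusters)]
        for alpha in values:
            members[nearest(alpha)].append(alpha)
        # An empty cluster keeps its previous centre.
        new_centres = [
            sum(m) / len(m) if m else c for m, c in zip(members, centres)
        ]
        if new_centres == centres:
            break
        centres = new_centres

    return {speaker: nearest(alpha) for speaker, alpha in alphas.items()}


# Hypothetical alphas for six learning speakers, grouped into n = 3 clusters.
example_alphas = {
    "spk1": 0.88, "spk2": 0.90,
    "spk3": 1.00, "spk4": 1.02,
    "spk5": 1.10, "spk6": 1.12,
}
example_labels = cluster_speakers_by_alpha(example_alphas, n_clusters=3)
```

In a system like the one described, one acoustic model would then be trained per resulting cluster and stored in sections 7a through 7n; that training step is outside the scope of this sketch.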

Bibliographic details

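  • Publication number: JP3646060B2

  • Patent type:

  • Publication date: 2005-05-11

  • Original format: PDF

  • Applicant/patentee: シャープ株式会社 (Sharp Corporation)

  • Application number: JP20000382371

  • Inventors: 山口 耕市; 八幡 洋一郎

  • Filing date: 2000-12-15

  • Classification: G10L15/06; G10L13/08; G10L15/14; G10L21/04

  • Country: JP

  • Indexed: 2022-08-21 22:29:07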
