Journal: EURASIP Journal on Audio, Speech, and Music Processing

Employing Second-Order Circular Suprasegmental Hidden Markov Models to Enhance Speaker Identification Performance in Shouted Talking Environments

Abstract

Speaker identification performance is almost perfect in neutral talking environments, but it deteriorates significantly in shouted talking environments. This work proposes, implements, and evaluates new models, called Second-Order Circular Suprasegmental Hidden Markov Models (CSPHMM2s), to alleviate this deterioration. The proposed models combine the characteristics of Circular Suprasegmental Hidden Markov Models (CSPHMMs) and Second-Order Suprasegmental Hidden Markov Models (SPHMM2s). The results show that, in shouted talking environments, CSPHMM2s outperform each of First-Order Left-to-Right Suprasegmental Hidden Markov Models (LTRSPHMM1s), Second-Order Left-to-Right Suprasegmental Hidden Markov Models (LTRSPHMM2s), and First-Order Circular Suprasegmental Hidden Markov Models (CSPHMM1s). In these environments, and using our collected speech database, average speaker identification performance based on LTRSPHMM1s, LTRSPHMM2s, CSPHMM1s, and CSPHMM2s is 74.6%, 78.4%, 78.7%, and 83.4%, respectively. Speaker identification performance based on CSPHMM2s is close to that obtained from subjective assessment by human listeners.
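
To make the two ideas in the model name concrete, the following minimal sketch builds a transition table that is both second-order (the next state depends on the two previous states) and circular (the last state wraps around to the first). It is an illustration under stated assumptions, not the authors' implementation: the function name, the stay-or-advance topology, and the random placeholder probabilities are all hypothetical, and the suprasegmental (prosodic) layer is not modeled here.

```python
import random


def circular_second_order_transitions(n_states, seed=0):
    """Return a[i][j][k], read as P(q_t = k | q_{t-2} = i, q_{t-1} = j).

    Assumed topology (illustrative, not taken from the paper): from state j
    the chain may stay in j or advance to (j + 1) % n_states, so the last
    state wraps around to the first.
    """
    if n_states < 2:
        raise ValueError("a circular topology needs at least two states")
    rng = random.Random(seed)
    a = [[[0.0] * n_states for _ in range(n_states)] for _ in range(n_states)]
    for i in range(n_states):          # state at time t-2
        for j in range(n_states):      # state at time t-1
            stay = rng.uniform(0.1, 0.9)  # arbitrary placeholder value
            a[i][j][j] = stay
            a[i][j][(j + 1) % n_states] = 1.0 - stay
    return a


if __name__ == "__main__":
    table = circular_second_order_transitions(4)
    # From the last state (index 3) the only forward move is back to state 0.
    print(table[0][3])
```

In a first-order model the table would be indexed by the previous state only; the extra index i is what lets the transition out of state j differ depending on how the chain arrived there.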
