Conference: 2018 IEEE 4th Information Technology and Mechatronics Engineering Conference

Threshold Re-weighting Attention Mechanism for Speaker Verification



Abstract

Average pooling struggles to produce optimal utterance-level features in end-to-end speaker verification because it treats every frame as equally important. This paper proposes a novel end-to-end ResCNN architecture based on a threshold re-weighting attention mechanism. First, an attention mechanism is introduced into the conversion of frame-level features into utterance-level features, so that important frames are identified and given larger weights through training. Second, weights below the average of all weights are set to zero, because the corresponding frames contain less speaker information, and the remaining weights are re-weighted to obtain new coefficients. Experimental results show that the proposed method achieves an equal error rate (EER) of 10.88% on the VoxCeleb1 dataset, which is 1.41% lower than that of the average pooling method. This indicates that the proposed method selects frames containing more speaker information more effectively, thereby improving the performance of the speaker verification system. Furthermore, an extended experiment shows that the proposed method is also applicable to noisy scenes.
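The pooling step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how attention scores are computed or how the surviving weights are re-weighted, so the softmax over raw scores and the renormalization by the sum of the kept weights are both assumptions, and the function name `threshold_reweight_pool` is hypothetical.

```python
import numpy as np

def threshold_reweight_pool(frames, scores):
    """Pool frame-level features into one utterance-level vector.

    frames: array of shape (T, D), one feature vector per frame.
    scores: array of shape (T,), raw attention scores (assumed to come
            from a trained attention layer, which is not modeled here).
    Returns (pooled, new_weights).
    """
    frames = np.asarray(frames, dtype=float)
    scores = np.asarray(scores, dtype=float)

    # Assumption: attention weights are a softmax over the frame scores.
    exp = np.exp(scores - scores.max())
    weights = exp / exp.sum()

    # Threshold step from the abstract: weights below the average of
    # all weights are set to zero.
    kept = np.where(weights >= weights.mean(), weights, 0.0)

    # Assumption: "re-weighting" means renormalizing the kept weights
    # so they sum to one. The largest weight is never below the mean,
    # so kept.sum() is always positive.
    new_weights = kept / kept.sum()

    # Weighted sum of frames gives the utterance-level feature.
    return new_weights @ frames, new_weights
```

With uniform weights and no threshold, the same weighted sum reduces to average pooling, which is the baseline the abstract compares against.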
