Published in: International Conference on Latent Variable Analysis and Signal Separation

Speaker Verification Using Adaptive Dictionaries in Non-negative Spectrogram Deconvolution


Abstract

This article presents a new method for speaker verification based on non-negative matrix deconvolution (NMD) of the magnitude spectrogram of an observed utterance. In contrast to typical methods from the literature, which assume that the desired signal dominates (for example GMM-UBM, joint factor analysis, and i-vectors), compositional models such as NMD describe a recording as a non-negative combination of latent components. The proposed model represents the spectrogram of a signal as a sum of spectro-temporal patterns spanning durations on the order of 150 ms, whereas many state-of-the-art automatic speaker recognition systems model the probability distribution of features extracted from much shorter excerpts of the speech signal (about 50 ms). Longer patterns carry information about the dynamic aspects of the modeled signal, such as accent and articulation. We use a parametric dictionary in the NMD, and the parameters of the dictionary carry information about the speakers' identity. Experiments performed on the CHiME corpus show that the proposed approach achieves an equal error rate comparable to that of an i-vector based system.
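To make the compositional model concrete, the following is a minimal sketch of plain NMD with multiplicative updates for the generalized Kullback-Leibler divergence. It is not the authors' implementation: the abstract's parametric, speaker-adaptive dictionary is replaced here by an unconstrained non-negative dictionary, and all names (`shift`, `reconstruct`, `kl_div`, `nmd`) and parameter values are hypothetical. The default pattern length `T=15` assumes a 10 ms frame hop, so that a pattern spans roughly 150 ms; the abstract does not state the hop size.

```python
import numpy as np

def shift(X, t):
    """Shift columns of X right by t frames (left if t < 0), zero-filling."""
    out = np.zeros_like(X)
    if t == 0:
        out[:] = X
    elif t > 0:
        out[:, t:] = X[:, :-t]
    else:
        out[:, :t] = X[:, -t:]
    return out

def reconstruct(W, H):
    """Model spectrogram: sum over t of W[t] @ (H shifted right by t).

    W has shape (T, F, K): K spectro-temporal patterns of T frames each.
    H has shape (K, N): non-negative activations over N frames.
    """
    return sum(W[t] @ shift(H, t) for t in range(W.shape[0]))

def kl_div(V, L, eps=1e-12):
    """Generalized KL divergence between spectrogram V and model L."""
    return float(np.sum(V * np.log((V + eps) / (L + eps)) - V + L))

def nmd(V, K=4, T=15, n_iter=50, seed=0, eps=1e-12):
    """Plain NMD (assumption: generic dictionary, not the parametric one)."""
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((T, F, K)) + 0.1
    H = rng.random((K, N)) + 0.1
    ones = np.ones_like(V)
    for _ in range(n_iter):
        # Update every pattern slice W[t] from the same ratio V / model.
        R = V / (reconstruct(W, H) + eps)
        for t in range(T):
            Ht = shift(H, t)
            W[t] *= (R @ Ht.T) / (ones @ Ht.T + eps)
        # Update activations, pooling evidence from all T slices.
        R = V / (reconstruct(W, H) + eps)
        num = sum(W[t].T @ shift(R, -t) for t in range(T))
        den = sum(W[t].T @ shift(ones, -t) for t in range(T))
        H *= num / (den + eps)
    return W, H
```

Called as `W, H = nmd(V)` on a non-negative magnitude spectrogram `V` of shape (frequency bins, frames), the sketch returns the learned patterns and their activations. In the approach the abstract describes, speaker information would instead be read from the parameters of the dictionary, which this generic version does not model.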