Speaker Verification Using Adaptive Dictionaries in Non-negative Spectrogram Deconvolution



Abstract

This article presents a new method for speaker verification based on non-negative matrix deconvolution (NMD) of the magnitude spectrogram of an observed utterance. Typical methods known from the literature (for example GMM-UBM, joint factor analysis, and i-vectors) assume that the desired signal dominates. In contrast, compositional models such as NMD describe a recording as a non-negative combination of latent components. The proposed model represents the spectrogram of a signal as a sum of spectro-temporal patterns spanning durations on the order of 150 ms, whereas many state-of-the-art automatic speaker recognition systems model a probability distribution of features extracted from much shorter excerpts of the speech signal (about 50 ms). Longer patterns carry information about dynamic aspects of the modeled signal, for example accent and articulation. We use a parametric dictionary in the NMD, and the parameters of the dictionary carry information about the speaker's identity. Experiments performed on the CHiME corpus show that the proposed approach achieves an equal error rate comparable to that of an i-vector based system.
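The compositional model described in the abstract can be illustrated with a minimal sketch of the generic NMD reconstruction, in which a spectrogram is approximated as a sum of non-negative spectro-temporal patterns, each placed in time by a non-negative activation. This is not the authors' implementation: the names `shift_right` and `nmd_reconstruct`, the array sizes, and the 10 ms frame hop (which makes 15 frames span roughly 150 ms) are assumptions chosen for illustration, and the parametric, speaker-adaptive dictionary and its estimation are not shown.

```python
import numpy as np

def shift_right(H, tau):
    """Shift the columns of H right by tau frames, zero-filling on the left."""
    if tau == 0:
        return H.copy()
    out = np.zeros_like(H)
    out[:, tau:] = H[:, :-tau]
    return out

def nmd_reconstruct(W, H):
    """Generic NMD model: V_hat = sum over tau of W[tau] @ shift_right(H, tau).

    W: (T, F, K) non-negative dictionary of K patterns, each spanning T frames.
    H: (K, N) non-negative activations over N frames.
    """
    T = W.shape[0]
    return sum(W[tau] @ shift_right(H, tau) for tau in range(T))

# Illustrative sizes (assumptions): 64 frequency bins, 4 patterns, 100 frames,
# and patterns 15 frames long (about 150 ms at an assumed 10 ms hop).
rng = np.random.default_rng(0)
F, K, N, T = 64, 4, 100, 15
W = rng.random((T, F, K))
H = rng.random((K, N))
V_hat = nmd_reconstruct(W, H)
```

With a single unit activation for pattern `k` at frame `n`, the reconstruction contains a copy of that pattern starting at frame `n`, which is what lets patterns longer than one frame capture temporal dynamics.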


Bibliographic details

Source: International Conference on Latent Variable Analysis and Signal Separation | 2016 | pp. 462-469 | 8 pages
Authors: Szymon Drgas; Tuomas Virtanen
Original format: PDF

Similar literature


1. Joint speaker separation and recognition using non-negative matrix deconvolution with adaptive dictionary [J]. Szymon Drgas, Tuomas Virtanen. Computer Speech and Language, 2021, Issue Nova.

2. Adaptive Deconvolution on the Non-negative Real Line [J]. Mabon Gwennaelle. Scandinavian Journal of Statistics, 2017, Issue 3.

3. Robust Speaker Verification With Joint Sparse Coding Over Learned Dictionaries [J]. Haris B.C., Sinha R. IEEE Transactions on Information Forensics and Security, 2015, Issue 10.

4. Speaker Verification Using Adaptive Dictionaries in Non-negative Spectrogram Deconvolution [C]. Szymon Drgas, Tuomas Virtanen. International Conference on Latent Variable Analysis and Signal Separation, 2015.

5. A fast learning algorithm for adaptive wavelets with application to fuzzy neural-based speaker verification [D]. Lim, Chang-Gyoon. 1997.

6. Bird sound spectrogram decomposition through Non-Negative Matrix Factorization for the acoustic classification of bird species [O]. Jimmy Ludeña-Choez, Raisa Quispe-Soncco, Ascensión Gallardo-Antolín.

7. Rapid speaker adaptation with speaker adaptive training and non-negative matrix factorization [O]. Zhang Xueru, Demuynck Kris, Van hamme Hugo. 2011.
