IEEE Transactions on Audio, Speech, and Language Processing

Large-Scale Training of Pairwise Support Vector Machines for Speaker Recognition


Abstract

State-of-the-art systems for text-independent speaker recognition use as their features a compact representation of a speaker utterance, known as an "i-vector." We recently presented an efficient approach for training a Pairwise Support Vector Machine (PSVM) with a suitable kernel for i-vector pairs on a quite large speaker recognition task. Rather than estimating one SVM model per speaker, according to the "one versus all" discriminative paradigm, the PSVM approach classifies a trial, consisting of a pair of i-vectors, as belonging or not belonging to the same speaker class. Training a PSVM with a large amount of data, however, is a memory- and computation-expensive task, because the number of training pairs grows quadratically with the number of training i-vectors. This paper demonstrates that only a very small subset of the training pairs is necessary to train the original PSVM model, and proposes two approaches for discarding most of the non-essential training pairs without harming the accuracy of the model. This dramatically reduces the memory and computational resources needed for training, which becomes feasible with large datasets including many speakers. We have assessed these approaches on the extended core conditions of the NIST 2012 Speaker Recognition Evaluation. Our results show that the accuracy of the PSVM trained with a sufficient number of speakers is 10%-30% better than that obtained by a PLDA model, depending on the testing conditions. Since PSVM accuracy increases with the training set size, but PSVM training does not scale well to large numbers of speakers, our selection techniques become relevant for training accurate discriminative classifiers.
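As a rough illustration of why PSVM training scales poorly, the sketch below enumerates all pairwise trials from a toy set of i-vectors and labels each pair as same-speaker or different-speaker, the target a pairwise classifier is trained on. The data, dimensions, and helper names are hypothetical stand-ins; this is not the authors' actual kernel, pair-selection method, or solver.

```python
import itertools
import random

def num_training_pairs(n):
    # Number of unordered i-vector pairs grows quadratically: n(n-1)/2.
    return n * (n - 1) // 2

def make_trials(ivectors, labels):
    """Form every pairwise trial from a training set.

    Each trial is ((x_i, x_j), target), where target is True when both
    i-vectors come from the same speaker. Materializing all pairs like
    this is what makes naive PSVM training memory-expensive.
    """
    trials = []
    for (i, xi), (j, xj) in itertools.combinations(enumerate(ivectors), 2):
        trials.append(((xi, xj), labels[i] == labels[j]))
    return trials

# Toy data: 3 speakers, 4 i-vectors each (dimension 5, random values),
# standing in for i-vectors from a real front-end extractor.
random.seed(0)
labels = [spk for spk in range(3) for _ in range(4)]
ivectors = [[random.gauss(0, 1) for _ in range(5)] for _ in labels]

trials = make_trials(ivectors, labels)
assert len(trials) == num_training_pairs(len(ivectors))  # 12*11/2 = 66 pairs
same = sum(target for _, target in trials)  # 3 speakers * C(4,2) = 18
```

Even this toy set of 12 i-vectors yields 66 trials, only 18 of which are same-speaker pairs; with tens of thousands of training i-vectors the pair count reaches the hundreds of millions, which is what motivates discarding the non-essential pairs.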
