Multi-task Knowledge Distillation with Rhythm Features for Speaker Verification

Ruyun Li; Peng Ouyang; Dandan Song; Shaojun Wei



Abstract

Recently, speaker embeddings extracted by deep neural networks (DNNs) have performed well in speaker verification (SV). However, they are sensitive to changes in scenario, and they are too computationally intensive to deploy on portable devices. In this paper, we first combine rhythm and MFCC features to improve the robustness of speaker verification. The rhythm feature reflects the distribution of phonemes and helps reduce the equal error rate (EER) in speaker verification, especially in intra-speaker verification. In addition, we propose a multi-task knowledge distillation architecture that transfers the embedding-level and label-level knowledge of a well-trained large teacher to a highly compact student network. The results show that rhythm features and multi-task knowledge distillation significantly improve the performance of the student network. In the ultra-short duration scenario, using only 14.9% of the teacher network's parameters, the student network can even achieve a relative EER reduction of 32%.
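The abstract does not state the exact training objective, so the following is only a minimal sketch of how an embedding-level term and a label-level term might be combined into one student loss. It assumes cosine distance between student and teacher embeddings of equal dimension, a temperature-softened KL divergence between teacher and student posteriors, and a plain cross-entropy term on the true speaker label. The function name `multitask_kd_loss`, the weights `alpha` and `beta`, and the `temperature` default are hypothetical and not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero and equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def multitask_kd_loss(student_emb, teacher_emb, student_logits, teacher_logits,
                      label, alpha=1.0, beta=1.0, temperature=2.0):
    """Hypothetical combined loss: hard-label CE + embedding-level + label-level KD."""
    # Supervised term: cross-entropy of the student on the true speaker label.
    hard = -math.log(softmax(student_logits)[label])
    # Embedding-level term: pull the student embedding toward the teacher's.
    emb = cosine_distance(student_emb, teacher_emb)
    # Label-level term: match temperature-softened teacher posteriors.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = (temperature ** 2) * kl_divergence(p_teacher, p_student)
    return hard + alpha * emb + beta * soft
```

When the student reproduces the teacher's embedding and logits exactly, both distillation terms vanish and only the cross-entropy term remains. Any mismatch in either the embedding or the posterior adds a penalty scaled by `alpha` or `beta`.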


Bibliographic details

Source
Computer Science & Information Technology, 2020, Issue 5, 14 pages
Authors
Ruyun Li; Peng Ouyang; Dandan Song; Shaojun Wei
Original format
PDF
Chinese Library Classification
Computing technology, computer technology
Keywords
Multi-task learning; Knowledge distillation; Rhythm variation; Angular softmax; Speaker verification
