A bilevel framework for joint optimization of session compensation and classification for speaker identification

Chen Chen; Wang Wei; He Yongjun; Han Jiqing


Abstract

The i-vector framework is one of the most popular approaches to speaker identification (SID). In such a system, session compensation is usually applied first, followed by a classifier. Every session-compensated representation of an i-vector yields a corresponding identification result, so the two stages are related. In current SID systems, however, session compensation and the classifier are usually optimized independently. Incomplete knowledge of how session compensation relates to the identification task may introduce uncertainties. In this paper, we propose a bilevel framework that jointly optimizes session compensation and the classifier to strengthen the relationship between the two stages. In this framework, we use sparse coding (SC) to obtain the session-compensated feature by learning an overcomplete dictionary, and employ a softmax classifier and a support vector machine (SVM), respectively, for classification. Moreover, we present a joint optimization of the dictionary and classifier parameters under a discriminative criterion for the classifier, subject to the SC conditions. The proposed methods are evaluated on the King-ASR-010, VoxCeleb and RSR2015 databases. Compared with typical session compensation techniques, such as linear discriminant analysis (LDA) and nonparametric discriminant analysis (NDA), our methods can be more robust to complex session variability. Moreover, compared with the typical classifiers in the i-vector framework, i.e., cosine distance scoring (CDS) and probabilistic linear discriminant analysis (PLDA), our methods can be more suitable for SID (a multiclass task). (C) 2019 Elsevier Inc. All rights reserved.
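To make the two stages named in the abstract concrete, the following is a minimal sketch of the forward pass only: a sparse code is computed over an overcomplete dictionary and then scored by a softmax classifier. It is not the authors' implementation. The abstract does not say which solver is used for sparse coding, so ISTA (iterative soft-thresholding) is swapped in here as an assumption, and the joint bilevel update of the dictionary and classifier parameters is not shown. All function names, the toy dictionary, and the toy weights are hypothetical.

```python
import math

def soft_threshold(v, t):
    """Proximal operator of t*|.|: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def sparse_code(x, atoms, lam=0.1, n_iter=200):
    """ISTA for  min_a 0.5*||x - sum_k a[k]*atoms[k]||^2 + lam*||a||_1.

    atoms is a list of dictionary atoms, each a list with the same length as x.
    ISTA is an assumed solver; the source does not name one.
    """
    dim, n_atoms = len(x), len(atoms)
    # Safe step size: the Lipschitz constant of the quadratic term is at most
    # the sum of squared atom norms (the trace of the Gram matrix).
    step = 1.0 / sum(v * v for atom in atoms for v in atom)
    a = [0.0] * n_atoms
    for _ in range(n_iter):
        # Residual of the current reconstruction.
        r = [sum(a[k] * atoms[k][j] for k in range(n_atoms)) - x[j]
             for j in range(dim)]
        # Gradient step on the quadratic term, then soft-thresholding.
        a = [soft_threshold(
                 a[k] - step * sum(atoms[k][j] * r[j] for j in range(dim)),
                 step * lam)
             for k in range(n_atoms)]
    return a

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(x, atoms, weights, lam=0.1):
    """Sparse-code x, then apply a linear softmax classifier to the code."""
    a = sparse_code(x, atoms, lam)
    scores = [sum(w * c for w, c in zip(row, a)) for row in weights]
    return softmax(scores)

# Toy usage with made-up numbers (not data from the paper).
c = math.sqrt(0.5)
toy_atoms = [[1.0, 0.0], [0.0, 1.0], [c, c]]      # 3 atoms in 2-D: overcomplete
toy_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # 2 hypothetical speaker classes
toy_probs = classify([1.0, 0.0], toy_atoms, toy_weights)
```

In a bilevel setting of the kind the abstract describes, the sparse code would act as the lower-level solution and the classifier loss as the upper-level objective, so the dictionary and the classifier weights would be updated together rather than fixed as they are in this sketch.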


Bibliographic details

Source
Digital Signal Processing | 2019, issue 2019 | 12 pages
Authors
Chen Chen; Wang Wei; He Yongjun; Han Jiqing
Author affiliations
Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Heilongjiang, Peoples R China;
Harbin Inst Technol, Sch Comp Sci & Technol, Weihai 264209, Shandong, Peoples R China;
Harbin Univ Sci & Technol, Sch Comp Sci & Technol, Harbin 150080, Heilongjiang, Peoples R China;
Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Heilongjiang, Peoples R China;

Indexing information
Original format: PDF
Language of text: eng
Chinese Library Classification: digital signal processing;
Keywords
Speaker identification; Session compensation; Joint optimization; Bilevel framework;


