Journal: Computer Speech and Language

A speaker verification backend with robust performance across conditions

Abstract

In this paper, we address the problem of speaker verification in conditions unseen or unknown during development. A standard method for speaker verification consists of extracting speaker embeddings with a deep neural network and processing them through a backend composed of probabilistic linear discriminant analysis (PLDA) and global logistic regression score calibration. This method is known to result in systems that work poorly on conditions different from those used to train the calibration model. We propose to modify the standard backend, introducing an adaptive calibrator that uses duration and other automatically extracted side-information to adapt to the conditions of the inputs. The backend is trained discriminatively to optimize binary cross-entropy. When trained on a number of diverse datasets that are labeled only with respect to speaker, the proposed backend consistently and, in some cases, dramatically improves calibration, compared to the standard PLDA approach, on a number of held-out datasets, some of which are markedly different from the training data. Discrimination performance is also consistently improved. We show that joint training of the PLDA and the adaptive calibrator is essential - the same benefits cannot be achieved when freezing PLDA and fine-tuning the calibrator. To our knowledge, the results in this paper are the first evidence in the literature that it is possible to develop a speaker verification system with robust out-of-the-box performance on a large variety of conditions.
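
The difference between the global calibrator of the standard backend and an adaptive one can be illustrated with a minimal sketch. The abstract does not give the functional form of the adaptive calibrator, so everything here is an assumption made for illustration: the scale and offset are taken to be linear in a side-information vector (for example, log duration of the enrollment and test sides), and the names `global_calibration`, `adaptive_calibration`, `w_alpha`, `w_beta`, `alpha0`, and `beta0` are hypothetical. The loss is plain mean binary cross-entropy; any class weighting the authors may use is not reproduced.

```python
import numpy as np


def global_calibration(scores, alpha, beta):
    """Global affine calibration: one scale and one offset shared by all trials."""
    return alpha * scores + beta


def adaptive_calibration(scores, side_enroll, side_test,
                         w_alpha, w_beta, alpha0, beta0):
    """Condition-dependent affine calibration (hypothetical form, not the paper's).

    scores:      (n_trials,) raw PLDA-style scores
    side_enroll: (n_trials, d) side-information for the enrollment side
    side_test:   (n_trials, d) side-information for the test side
    The scale and offset of each trial are linear in the concatenated
    side-information, so they change with the conditions of the inputs.
    """
    side = np.concatenate([side_enroll, side_test], axis=1)  # (n_trials, 2d)
    alpha = alpha0 + side @ w_alpha                          # per-trial scale
    beta = beta0 + side @ w_beta                             # per-trial offset
    return alpha * scores + beta


def binary_cross_entropy(llrs, labels, prior=0.5):
    """Mean binary cross-entropy of calibrated log-likelihood ratios.

    labels are 1 for same-speaker trials and 0 for different-speaker trials.
    """
    logits = llrs + np.log(prior / (1.0 - prior))  # posterior log-odds
    # logaddexp(0, -z) = -log(sigmoid(z)), evaluated in a numerically stable way
    loss_target = np.logaddexp(0.0, -logits)
    loss_nontarget = np.logaddexp(0.0, logits)
    return np.mean(labels * loss_target + (1 - labels) * loss_nontarget)
```

With `w_alpha` and `w_beta` set to zero the adaptive form reduces to the global calibrator, which makes it a strict generalization in this sketch. Joint training, as described in the abstract, would minimize `binary_cross_entropy` with respect to both the calibrator parameters and the PLDA-stage parameters that produce `scores`, rather than fitting the calibrator on top of frozen scores.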
