International Symposium on Chinese Spoken Language Processing

Channel Interdependence Enhanced Speaker Embeddings for Far-Field Speaker Verification



Abstract

Recognizing speakers from a distance using far-field microphones is difficult because of environmental noise and reverberation distortion. In this work, we tackle these problems by strengthening the frame-level processing and feature aggregation of x-vector networks. Specifically, we restructure the dilated convolutional layers into Res2Net blocks to generate multi-scale frame-level features. To exploit the relationship between the channels, we introduce squeeze-and-excitation (SE) units to rescale the channels’ activations and investigate the best places to put these SE units in the Res2Net blocks. Based on the hypothesis that layers at different depths contain speaker information at different granularity levels, we introduce multi-block feature aggregation to propagate and aggregate the features from various depths. To optimally weight the channels and frames during feature aggregation, we propose a channel-dependent attention mechanism. Combining all of these enhancements leads to a network architecture called channel-interdependence enhanced Res2Net (CE-Res2Net). Results show that the proposed network achieves a relative improvement of about 16% in EER and 17% in minDCF on the VOiCES 2019 Challenge’s evaluation set.
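
As an illustration of the channel rescaling that an SE unit performs, the following is a minimal pure-Python sketch of a generic squeeze-and-excitation step applied to frame-level features. It is not the authors' implementation: the function name `se_rescale`, the bottleneck size, and the weight values are assumptions chosen for illustration, and the abstract does not specify where the unit sits inside a Res2Net block.

```python
import math


def se_rescale(features, w1, b1, w2, b2):
    """Generic squeeze-and-excitation rescaling (illustrative sketch).

    features: list of C channels, each a list of T frame activations.
    w1, b1:   bottleneck layer, shapes R x C and R (hypothetical weights).
    w2, b2:   expansion layer, shapes C x R and C (hypothetical weights).
    Returns the rescaled features and the per-channel gates.
    """
    # Squeeze: average each channel over time to get one descriptor per channel.
    z = [sum(channel) / len(channel) for channel in features]

    # Excitation, step 1: bottleneck projection followed by ReLU.
    h = [max(0.0, sum(w * x for w, x in zip(row, z)) + b)
         for row, b in zip(w1, b1)]

    # Excitation, step 2: expand back to C channels and squash with a sigmoid,
    # giving a gate in (0, 1) for every channel.
    gates = [1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, h)) + b)))
             for row, b in zip(w2, b2)]

    # Rescale: multiply every frame of a channel by that channel's gate.
    rescaled = [[g * v for v in channel] for g, channel in zip(gates, features)]
    return rescaled, gates


# Usage: two channels, three frames, a bottleneck of size one.
demo_features = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
demo_out, demo_gates = se_rescale(
    demo_features,
    w1=[[0.5, -0.25]], b1=[0.1],
    w2=[[1.0], [-1.0]], b2=[0.0, 0.0],
)
```

Because each gate depends on the time-averaged activations of all channels, the rescaling captures interdependence between channels rather than treating each channel in isolation.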
