Journal: Computer Speech and Language

On the use of blind channel response estimation and a residual neural network to detect physical access attacks to speaker verification systems


Abstract

Spoofing attacks have been acknowledged as a serious threat to automatic speaker verification (ASV) systems. In this paper, we are specifically concerned with replay attack scenarios. As a countermeasure, we propose a front-end based on the blind estimation of the channel response magnitude and a residual neural network as the back-end. Our hypothesis is that the magnitude response of the channel, obtained by subtracting the log-magnitude spectrum of the observed signal from the prediction of the log-magnitude spectrum average of the observed signal's clean counterpart, will capture the nuances of room ambiences and of recording and playback devices. The performance of these features is investigated with a benchmark back-end based on a Gaussian mixture model and with a deep neural network classifier. Our experiments are performed on the 2017 and 2019 Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof) datasets. The benchmark systems are the same as those used in the challenges and are based on constant-Q cepstral coefficient (CQCC) and linear-frequency cepstral coefficient (LFCC) features. Experimental results on the 2017 dataset show that the proposed method outperforms the two benchmarks, providing equal error rates (EER) as low as 7.57% and 11.64% for the development and evaluation sets, respectively. On the ASVspoof 2019 dataset, in turn, the proposed method outperformed the benchmark when using a residual neural network as the back-end, yielding a tandem detection cost function (t-DCF) and EER as low as 0.1086 and 4.26%, respectively, on the evaluation set. Lastly, an instrumental (objective) quality assessment is performed on the two datasets, and the impact of quality variability on spoofing detection accuracy is discussed.
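
The front-end idea described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation. It assumes the convolutive model log|Y| = log|X| + log|H|, so the channel log-magnitude is taken as the time-averaged log-magnitude spectrum of the observed signal minus the corresponding average for its clean counterpart (the abstract words the subtraction in the opposite order, which only flips the sign). The function names, the frame length, the hop size and the `clean_log_mag_avg` argument are assumptions made for illustration; in the paper that average is predicted blindly, whereas here it is simply passed in.

```python
import numpy as np


def log_magnitude_frames(signal, frame_len=512, hop=256, eps=1e-10):
    """Return the log-magnitude spectrum of each Hann-windowed frame (one row per frame)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # eps avoids log(0) for empty bins
    return np.log(np.abs(np.fft.rfft(frames, axis=1)) + eps)


def channel_log_magnitude(observed, clean_log_mag_avg, frame_len=512, hop=256):
    """Estimate log|H(f)| as mean log|Y(f)| minus a supplied clean-speech average.

    clean_log_mag_avg is a hypothetical input: an estimate of the long-term
    average log-magnitude spectrum of the clean counterpart of `observed`.
    """
    observed_avg = log_magnitude_frames(observed, frame_len, hop).mean(axis=0)
    return observed_avg - clean_log_mag_avg


# Toy check: a "channel" that is a flat gain of 2 applied to white noise.
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
clean_avg = log_magnitude_frames(clean).mean(axis=0)  # oracle average, for the demo only
estimate = channel_log_magnitude(2.0 * clean, clean_avg)
print(estimate.shape, float(estimate.mean()))
```

In this toy case the estimate is flat and close to log 2 ≈ 0.693 in every frequency bin, because the only difference between the observed and clean signals is a gain of 2. With a real replayed recording, the resulting vector would instead reflect the combined room and device response and could serve as the input feature for a classifier back-end.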
