Journal: Computer Speech and Language

Spoofing voice verification systems with statistical speech synthesis using limited adaptation data



Abstract

State-of-the-art speaker verification systems are vulnerable to spoofing attacks that use speech synthesis. To address this vulnerability, high-performance synthetic speech detectors (SSDs) that target such attacks have recently been proposed. Here, instead of developing new detectors, we investigate new attack strategies. Studying techniques that are tailored to spoof the voice verification system while remaining difficult to detect is expected to improve the security of voice verification systems by enabling the development of better detectors. First, we investigate the vulnerability of an i-vector based verification system to attacks using statistical speech synthesis (SSS), with a particular focus on the case where the attacker has only a very limited amount of data from the target speaker. Even with a single adaptation utterance, the false alarm rate was found to be 23%. Still, SSS-generated speech is easy to detect (Wu et al., 2015a, 2015b), which dramatically reduces its effectiveness as an attack. For more effective attacks with limited data, we propose a hybrid statistical/concatenative synthesis approach and show that it significantly increases the false alarm rate of the verification system compared to the baseline SSS method. Moreover, the proposed hybrid synthesis makes synthetic speech harder to detect than SSS, even when only a very limited amount of original speech is available to the attacker. To further increase the effectiveness of the attacks, we propose a linear regression method that transforms synthetic features into more natural features. Although the regression approach is more effective at spoofing the detectors, it is less effective than the hybrid synthesis approach at spoofing the verification system. Finally, we propose an interpolation approach that combines the linear regression and hybrid synthesis methods, and show that it provides the best spoofing performance in most cases.
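The abstract does not specify the feature type, the form of the regression, or how the interpolation is weighted, so the following is only a toy sketch of the general idea under stated assumptions. It assumes time-aligned frames of synthetic and natural features, fits an independent affine map per feature dimension by least squares, and blends the regressed frame with a hybrid-synthesis frame using a single hypothetical weight `alpha`. It also treats the false alarm rate as the fraction of spoofed trials whose verification score reaches the acceptance threshold. All function names and the per-dimension simplification are illustrative and are not taken from the paper.

```python
from statistics import mean


def fit_affine_per_dim(synthetic, natural):
    """Fit natural[d] ~ a[d] * synthetic[d] + b[d] by least squares.

    Assumption: frames are time-aligned lists of equal-length feature
    vectors, and each dimension is modeled independently.
    """
    dims = len(synthetic[0])
    params = []
    for d in range(dims):
        xs = [frame[d] for frame in synthetic]
        ys = [frame[d] for frame in natural]
        mx, my = mean(xs), mean(ys)
        var = sum((x - mx) ** 2 for x in xs)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        a = cov / var if var > 0 else 1.0  # fall back to a pure offset
        params.append((a, my - a * mx))
    return params


def apply_affine(params, frame):
    """Map one synthetic frame toward the natural feature space."""
    return [a * x + b for (a, b), x in zip(params, frame)]


def interpolate(regressed, hybrid, alpha):
    """Blend two frames; alpha is a hypothetical weight in [0, 1]."""
    return [alpha * r + (1.0 - alpha) * h for r, h in zip(regressed, hybrid)]


def false_alarm_rate(spoof_scores, threshold):
    """Fraction of spoofed trials accepted (score >= threshold)."""
    accepted = sum(1 for s in spoof_scores if s >= threshold)
    return accepted / len(spoof_scores)
```

A typical use would be to fit the map on a small set of aligned frames, apply it to new synthetic frames, blend the result with hybrid-synthesis frames, and then sweep `alpha` while tracking `false_alarm_rate` on the verification scores. Whether the authors interpolate in the feature domain or elsewhere is not stated in the abstract.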
