Spoofing voice verification systems with statistical speech synthesis using limited adaptation data

Ali Khodabakhsh; Amir Mohammadi; Cenk Demiroglu

Abstract

State-of-the-art speaker verification systems are vulnerable to spoofing attacks that use speech synthesis. To address this, high-performance synthetic speech detectors (SSDs) for such attack methods have recently been proposed. Here, rather than developing new detectors, we investigate new attack strategies. Studying techniques that are specifically tailored to spoof the voice verification system while remaining difficult to detect is expected to increase the security of voice verification systems by enabling the development of better detectors. First, we investigate the vulnerability of an i-vector based verification system to attacks using statistical speech synthesis (SSS), with a particular focus on the case where the attacker has only a very limited amount of data from the target speaker. Even with a single adaptation utterance, the false alarm rate was found to be 23%. Still, SSS-generated speech is easy to detect (Wu et al., 2015a, 2015b), which dramatically reduces its effectiveness. For more effective attacks with limited data, we propose a hybrid statistical/concatenative synthesis approach and show that it significantly increases the false alarm rate of the verification system compared to the baseline SSS method. Moreover, the proposed hybrid synthesis makes synthetic speech more difficult to detect than SSS, even when only a very limited amount of original speech is available to the attacker. To further increase the effectiveness of the attacks, we propose a linear regression method that transforms synthetic features into more natural features. Although the regression approach is more effective at spoofing the detectors, it is less effective than the hybrid synthesis approach at spoofing the verification system. We therefore propose an interpolation approach that combines the linear regression and hybrid synthesis methods, and show that it provides the best spoofing performance in most cases.
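
The abstract names two feature-level steps, a linear regression from synthetic to more natural features and an interpolation between the regression and hybrid outputs, without giving their exact form. The following is only a minimal sketch of what such steps might look like, not the authors' implementation. It assumes time-aligned feature frames stored as NumPy arrays, an ordinary least-squares affine map, and a simple convex combination with a weight `alpha`; the function names and the weight are hypothetical.

```python
import numpy as np


def fit_affine_map(synthetic, natural):
    """Fit a least-squares affine map from synthetic to natural frames.

    Both inputs are (n_frames, n_dims) arrays that are assumed to be
    time-aligned frame by frame (an assumption, not stated in the abstract).
    """
    # Append a constant column so the bias is learned together with the weights.
    design = np.hstack([synthetic, np.ones((synthetic.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, natural, rcond=None)
    return weights  # shape: (n_dims + 1, n_dims)


def apply_affine_map(weights, synthetic):
    """Transform synthetic frames with a map returned by fit_affine_map."""
    design = np.hstack([synthetic, np.ones((synthetic.shape[0], 1))])
    return design @ weights


def interpolate(regressed, hybrid, alpha):
    """Convex combination of two feature streams of equal shape.

    alpha = 1 keeps only the regression output, alpha = 0 only the
    hybrid-synthesis features (hypothetical weighting scheme).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * regressed + (1.0 - alpha) * hybrid


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    synthetic = rng.normal(size=(100, 3))            # stand-in synthetic frames
    natural = synthetic @ rng.normal(size=(3, 3))    # stand-in natural frames
    hybrid = rng.normal(size=(100, 3))               # stand-in hybrid frames

    weights = fit_affine_map(synthetic, natural)
    regressed = apply_affine_map(weights, synthetic)
    combined = interpolate(regressed, hybrid, alpha=0.5)
    print(combined.shape)
```

In such a setup, `alpha` would be tuned on held-out data, trading off how well the features spoof the detector against how well they spoof the verification system, which is the tension the abstract describes.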

Bibliographic details

Source: Computer Speech and Language, 2017, Issue 3, pp. 20-37 (18 pages)
Authors: Ali Khodabakhsh; Amir Mohammadi; Cenk Demiroglu
Affiliation (all three authors): Electrical and Computer Engineering Department, Ozyegin University, Istanbul, Turkey
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords: Statistical speech synthesis; Hybrid speech synthesis; Spoofing verification systems; Speaker adaptation; Synthetic speech detection

Similar documents

1. Prosodically Rich Speech Synthesis Interface Using Limited Data of Celebrity Voice [J]. Takashi Nose, Taiki Kamei. Journal of Computer and Communications, 2016, Issue 16.
2. Deep domain adaptation for anti-spoofing in speaker verification systems [J]. Himawan Ivan, Villavicencio Fernando, Sridharan Sridha. Computer speech and language, 2019, Nov. issue.
3. Spoofing attacks to i-vector based voice verification systems using statistical speech synthesis with additive noise and countermeasure [C]. Mustafa Caner Özbay, Ali Khodabakhsh, Amir Mohammadi. European Signal Processing Conference, 2016.
4. Hidden Markov models for visual speech synthesis in limited data environments [D]. Arb, Harold Allan. 2001.
5. A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion [O]. Othman Lachhab, Joseph Di Martino, Elhassane Ibn Elhaj.
6. Vulnerability of Speaker Verification Systems Against Voice Conversion Spoofing Attacks: the Case of Telephone Speech [O]. Tomi Kinnunen, Zhi-zheng Wu, Kong Aik Lee. 2012.
