
Feature and model transformation techniques for robust speaker verification.



Abstract

Speaker verification verifies the identity of a speaker based on his or her own voice. It has potential applications in securing remote-access services such as phone banking and mobile commerce. This dissertation addresses the robustness of speaker verification systems from three angles: speaker modeling, feature transformation, and model transformation.

The dissertation begins with an investigation into the effectiveness of three kernel-based neural networks for speaker modeling: probabilistic decision-based neural networks (PDBNNs), Gaussian mixture models (GMMs), and elliptical basis function networks (EBFNs).

The effect of handset variation can be suppressed by transforming clean speech models to fit the handset-distorted speech. To this end, the dissertation proposes a model-based transformation technique that combines handset-dependent model transformation with reinforced learning. Specifically, the approach transforms the clean speaker model and the clean background model to fit the distorted speech using maximum-likelihood linear regression (MLLR), and then adapts the transformed models via PDBNN's reinforced learning.

In addition to model-based approaches, handset variation can also be suppressed by feature-based approaches. Current feature-based approaches typically identify the handset in use as one of the known handsets in a handset database and exploit a priori knowledge about the identified handset to modify the features. It would, however, be far more practical and cost-effective to adopt systems that require no handset detector. To this end, the dissertation proposes a blind compensation algorithm for the situation in which no a priori knowledge about the handset is available (i.e., when a handset that is not in the handset database is used). Specifically, a composite statistical model, formed by fusing a speaker model and a background model, represents the characteristics of the enrollment speech. Based on the difference between the claimant's speech and the composite model, a stochastic-matching type of approach transforms the claimant's speech to a region close to the enrollment speech. The algorithm can therefore estimate the transformation online, without the need to detect the handset type. (Abstract shortened by UMI.)
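The sketch below is a minimal illustration, not code from the dissertation, of two standard building blocks the abstract refers to: GMM/background-model verification scoring by log-likelihood ratio, and an MLLR-style affine transformation of Gaussian means (mu -> A*mu + b) that maps a clean model toward handset-distorted speech. The DiagGMM container, the transform parameters A and b, and the decision threshold are assumptions introduced for illustration; in MLLR, A and b would be estimated by maximising the likelihood of adaptation data.

import numpy as np
from dataclasses import dataclass


@dataclass
class DiagGMM:
    weights: np.ndarray    # mixture weights, shape (M,)
    means: np.ndarray      # component means, shape (M, D)
    variances: np.ndarray  # diagonal covariances, shape (M, D)

    def avg_log_likelihood(self, frames: np.ndarray) -> float:
        """Average per-frame log-likelihood of feature frames with shape (T, D)."""
        diff = frames[:, None, :] - self.means[None, :, :]                  # (T, M, D)
        log_comp = (np.log(self.weights)
                    - 0.5 * np.sum(np.log(2 * np.pi * self.variances), axis=-1)
                    - 0.5 * np.sum(diff ** 2 / self.variances, axis=-1))    # (T, M)
        max_lc = log_comp.max(axis=1, keepdims=True)                        # log-sum-exp
        per_frame = max_lc[:, 0] + np.log(np.exp(log_comp - max_lc).sum(axis=1))
        return float(per_frame.mean())


def mllr_transform_means(gmm: DiagGMM, A: np.ndarray, b: np.ndarray) -> DiagGMM:
    """Apply one global affine transform to every Gaussian mean (mu -> A mu + b)."""
    return DiagGMM(gmm.weights, gmm.means @ A.T + b, gmm.variances)


def verify(frames: np.ndarray, speaker: DiagGMM, background: DiagGMM,
           threshold: float = 0.0) -> bool:
    """Accept the claimed identity if the log-likelihood ratio exceeds the threshold."""
    score = speaker.avg_log_likelihood(frames) - background.avg_log_likelihood(frames)
    return score > threshold

In a verification trial, both the speaker model and the background model would first be transformed with mllr_transform_means to match the claimant's acoustic conditions, and the transformed pair would then be used by verify.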
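As a companion to the sketch above, the following is one common instantiation of the stochastic-matching idea, again illustrative and not necessarily the specific transformation proposed in the dissertation: a global cepstral bias b is estimated online by EM so that the shifted claimant features (x - b) become most likely under a composite diagonal-covariance GMM representing the enrollment speech. The weights, means, and variances passed in are placeholders for the fused speaker/background model.

import numpy as np


def gmm_posteriors(frames, weights, means, variances):
    """Component posteriors gamma[t, m] of frames (T, D) under a diagonal-covariance GMM."""
    diff = frames[:, None, :] - means[None, :, :]
    log_comp = (np.log(weights)
                - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=-1)
                - 0.5 * np.sum(diff ** 2 / variances, axis=-1))
    log_comp -= log_comp.max(axis=1, keepdims=True)
    gamma = np.exp(log_comp)
    return gamma / gamma.sum(axis=1, keepdims=True)


def estimate_bias(frames, weights, means, variances, n_iter=5):
    """EM estimate of a global bias b so that (frames - b) best fits the composite GMM."""
    b = np.zeros(frames.shape[1])
    for _ in range(n_iter):
        gamma = gmm_posteriors(frames - b, weights, means, variances)   # E-step
        num = np.einsum('tm,tmd->d', gamma,
                        (frames[:, None, :] - means[None, :, :]) / variances)
        den = np.einsum('tm,md->d', gamma, 1.0 / variances)
        b = num / den                                                   # M-step, per dimension
    return b


# The compensated features (frames - b) would then be scored against the speaker
# and background models as in the previous sketch, with no handset detector involved.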

Bibliographic details

  • Author

    Yiu, Kwok Kwong

  • Affiliation

    Hong Kong Polytechnic University (People's Republic of China)

  • Degree grantor: Hong Kong Polytechnic University (People's Republic of China)
  • Subject: Engineering, Electronics and Electrical
  • Degree: Ph.D.
  • Year: 2005
  • Pages: 158 p.
  • Total pages: 158
  • Original format: PDF
  • Language: eng
  • Chinese Library Classification: Radio electronics and telecommunications
  • Keywords
