Journal: Computer Speech and Language

Voice activity detection based on statistical models and machine learning approaches

Abstract

Voice activity detectors (VADs) based on statistical models have shown impressive performance, especially when fairly precise statistical models are employed. Moreover, the accuracy of a VAD utilizing statistical models can be significantly improved when machine learning techniques are adopted to provide prior knowledge of speech characteristics. In the first part of this paper, we introduce a more accurate and flexible statistical model, the generalized gamma distribution (GΓD), as a new model for the VAD based on the likelihood ratio test. A parameter estimation algorithm based on the maximum likelihood principle is also presented. Experimental results show that the VAD algorithm based on the GΓD outperforms those adopting the conventional Laplacian and Gamma distributions. In the second part of this paper, we introduce machine learning techniques such as minimum classification error (MCE) training and the support vector machine (SVM) to automatically exploit prior knowledge obtained from a speech database, which can enhance the performance of the VAD. First, we present a discriminative weight training method based on the MCE criterion. In this approach, the VAD decision rule becomes the geometric mean of optimally weighted likelihood ratios. Second, an SVM-based approach is introduced to assist the VAD based on statistical models. In this algorithm, the SVM efficiently classifies the input signal into two classes, voice-active and voice-inactive regions, separated by a nonlinear boundary. Experimental results show that these training-based approaches can effectively enhance the performance of the VAD.
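
The decision rule described in the abstract, a weighted geometric mean of per-bin likelihood ratios compared against a threshold, can be illustrated with a minimal Python sketch. Everything specific here is an assumption rather than the paper's implementation: the abstract does not give the GΓD parameterization, so the sketch uses the common two-sided form f(x) = γ β^η / (2Γ(η)) |x|^(ηγ-1) exp(-β|x|^γ); the function names, parameter values, and weights are hypothetical; and the weights are supplied by hand, whereas the paper trains them with the MCE criterion.

```python
import math


def gengamma_logpdf(x, gamma, eta, beta):
    """Log-density of a two-sided generalized gamma distribution (assumed form).

    f(x) = gamma * beta**eta / (2 * Gamma(eta))
           * |x|**(eta * gamma - 1) * exp(-beta * |x|**gamma)
    gamma=2, eta=0.5 gives a Gaussian; gamma=1, eta=1 gives a Laplacian.
    """
    ax = max(abs(x), 1e-12)  # guard against log(0)
    return (math.log(gamma) + eta * math.log(beta) - math.log(2.0)
            - math.lgamma(eta)
            + (eta * gamma - 1.0) * math.log(ax)
            - beta * ax ** gamma)


def log_likelihood_ratio(x, speech_params, noise_params):
    """log p(x | speech present) - log p(x | noise only) for one coefficient."""
    return gengamma_logpdf(x, *speech_params) - gengamma_logpdf(x, *noise_params)


def weighted_log_lr(log_lrs, weights):
    """Log of the weighted geometric mean of the likelihood ratios.

    The weights are normalized to sum to one. In the paper they would come
    from MCE training; here they are plain inputs.
    """
    if len(log_lrs) != len(weights):
        raise ValueError("need one weight per likelihood ratio")
    total = sum(weights)
    return sum(w * l for w, l in zip(weights, log_lrs)) / total


def is_speech(coeffs, speech_params, noise_params, weights, threshold=0.0):
    """Declare speech when the weighted log likelihood ratio exceeds the threshold."""
    log_lrs = [log_likelihood_ratio(x, speech_params, noise_params)
               for x in coeffs]
    return weighted_log_lr(log_lrs, weights) > threshold


# Illustrative (hypothetical) parameters as (gamma, eta, beta) tuples:
# Gaussian shape, noise variance 1, speech-plus-noise variance 4.
NOISE = (2.0, 0.5, 0.5)     # beta = 1 / (2 * 1)
SPEECH = (2.0, 0.5, 0.125)  # beta = 1 / (2 * 4)
UNIFORM = [1.0, 1.0, 1.0, 1.0]

loud_frame = [3.0, -2.5, 4.0, 3.5]
quiet_frame = [0.1, -0.2, 0.3, 0.0]
print(is_speech(loud_frame, SPEECH, NOISE, UNIFORM))
print(is_speech(quiet_frame, SPEECH, NOISE, UNIFORM))
```

Working in the log domain turns the weighted geometric mean into a weighted sum, which avoids numerical underflow when many small likelihood ratios are combined. The SVM-based variant mentioned in the abstract would replace the fixed threshold with a trained nonlinear classifier and is not shown here.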
