Voice activity detection based on statistical models and machine learning approaches

Jong Won Shin; Joon-Hyuk Chang; Nam Soo Kim

Abstract

Voice activity detectors (VADs) based on statistical models have shown impressive performance, especially when fairly precise statistical models are employed. Moreover, the accuracy of a VAD that uses statistical models can be significantly improved when machine learning techniques are adopted to provide prior knowledge of speech characteristics. In the first part of this paper, we introduce a more accurate and flexible statistical model, the generalized gamma distribution (GΓD), as a new model for the VAD based on the likelihood ratio test. A practical parameter estimation algorithm based on the maximum likelihood principle is also presented. Experimental results show that the VAD algorithm based on the GΓD outperforms those adopting the conventional Laplacian and Gamma distributions. In the second part of this paper, we introduce machine learning techniques, namely minimum classification error (MCE) training and the support vector machine (SVM), to automatically exploit prior knowledge obtained from a speech database and thereby enhance the performance of the VAD. First, we present a discriminative weight training method based on the MCE criterion. In this approach, the VAD decision rule becomes the geometric mean of optimally weighted likelihood ratios. Second, the SVM-based approach is introduced to assist the VAD based on statistical models. In this algorithm, the SVM classifies the input signal into two classes, voice-active and voice-inactive regions, separated by a nonlinear boundary. Experimental results show that these training-based approaches can effectively enhance the performance of the VAD.
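The weighted likelihood ratio decision rule mentioned in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the per-bin likelihood ratio uses the conventional complex Gaussian model rather than the GΓD, the weights are supplied by hand rather than trained with the MCE criterion, and the function names, SNR inputs, and threshold are hypothetical. With uniform weights the rule reduces to the ordinary geometric mean of the likelihood ratios.

```python
import math


def log_likelihood_ratios(post_snr, prio_snr):
    """Per-bin log likelihood ratios under a complex Gaussian model.

    Assumption: Gaussian model, swapped in for the paper's GΓD.
    log LR_k = gamma_k * xi_k / (1 + xi_k) - log(1 + xi_k), where gamma_k is
    the a posteriori SNR and xi_k is the a priori SNR of frequency bin k.
    """
    return [g * x / (1.0 + x) - math.log1p(x)
            for g, x in zip(post_snr, prio_snr)]


def weighted_lrt_decision(post_snr, prio_snr, weights=None, threshold=1.0):
    """Compare the weighted geometric mean of likelihood ratios to a threshold.

    The comparison is done in the log domain: sum_k w_k * log LR_k is compared
    with log(threshold). The weights are illustrative placeholders, not
    MCE-trained values. Returns (is_speech, log_score).
    """
    llr = log_likelihood_ratios(post_snr, prio_snr)
    if weights is None:
        # Uniform weights give the conventional geometric mean.
        weights = [1.0 / len(llr)] * len(llr)
    if len(weights) != len(llr):
        raise ValueError("weights must have one entry per frequency bin")
    score = sum(w * l for w, l in zip(weights, llr))
    return score > math.log(threshold), score


# Hypothetical SNR values for a speech-like frame and a noise-like frame.
speech_frame = weighted_lrt_decision([10.0, 8.0, 12.0], [9.0, 7.0, 11.0])
noise_frame = weighted_lrt_decision([0.5, 0.5, 0.5], [0.1, 0.1, 0.1])
```

Working in the log domain turns the weighted geometric mean into a weighted sum, which is the form that a discriminative training procedure would typically adjust.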

Bibliographic details

Source
Computer Speech and Language | 2010, Issue 3 | pp. 515-530 | 16 pages
Authors
Jong Won Shin; Joon-Hyuk Chang; Nam Soo Kim
Affiliations

School of Electrical Engineering and INMC, Seoul National University, Seoul 151-742, Republic of Korea;

School of Electronic Engineering, Inha University, 253 Yonghyeon-dong, Nam-gu, Incheon 401-751, Republic of Korea;

School of Electrical Engineering and INMC, Seoul National University, Seoul 151-742, Republic of Korea

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
voice activity detection; statistical modeling; machine learning; prior knowledge; likelihood ratio test; generalized gamma; minimum classification error; support vector machine; a posteriori SNR; a priori SNR; predicted SNR
