International Conference on Natural Language Processing

Voice Activity Detection using Temporal Characteristics of Autocorrelation Lag and Maximum Spectral Amplitude in Sub-bands

Abstract

Robust voice activity detection (VAD) is a prerequisite for many speech-based applications such as speech recognition. We investigated two VAD techniques that use time-domain and frequency-domain characteristics of the speech signal. In the time domain, the temporal characteristic of the autocorrelation lag discriminates between speech and nonspeech regions. In the frequency domain, the peak value of the magnitude spectrum in different sub-bands is used for VAD. The performance of the proposed methods is evaluated on the TIMIT database with noises from the NOISEX-92 database at various signal-to-noise ratio (SNR) levels. The experimental results show that VAD based on the autocorrelation lag performs consistently better than the method based on the maximum peak value of the autocorrelation function. However, it performs worse than our second approach and AMR-VAD2. Our second approach, VAD based on the maximum spectral amplitude in sub-bands, outperforms AMR-VAD2 and Sohn VAD for some noise conditions. Moreover, it is shown that a threshold independent of the noise types and their levels can be selected in the proposed method.
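
The two per-frame quantities named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the lag search range, and the equal-width sub-band split are all assumptions made for illustration, and the abstract does not specify how the features are tracked over time or thresholded, so no decision rule is shown. The sketch only computes, for one frame, the lag of the largest normalized autocorrelation value and the maximum DFT magnitude in each sub-band.

```python
import cmath
import math


def autocorr_peak_lag(frame, min_lag, max_lag):
    """Return (lag, value) of the largest normalized autocorrelation
    within [min_lag, max_lag]. The lag range is a hypothetical parameter."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        # Silent frame: no meaningful peak, report the lower bound.
        return min_lag, 0.0
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        r /= energy
        if r > best_val:
            best_lag, best_val = lag, r
    return best_lag, best_val


def subband_peak_amplitudes(frame, num_bands):
    """Return the maximum DFT magnitude in each of num_bands equal-width
    sub-bands covering bins 0 .. N/2 - 1 (equal widths are an assumption)."""
    n = len(frame)
    half = n // 2
    mags = []
    for k in range(half):
        # Naive DFT bin; adequate for short frames in a sketch.
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    width = half // num_bands
    return [max(mags[b * width:(b + 1) * width]) for b in range(num_bands)]


# Synthetic periodic frame: a sinusoid with a period of 20 samples.
frame = [math.sin(2 * math.pi * n / 20) for n in range(160)]
lag, value = autocorr_peak_lag(frame, 10, 30)
bands = subband_peak_amplitudes(frame, 4)
```

In a full detector these values would be computed frame by frame, and the abstract suggests that it is their behavior over time, compared against a threshold, that separates speech from nonspeech.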
