Voice Activity Detection using Temporal Characteristics of Autocorrelation Lag and Maximum Spectral Amplitude in Sub-bands

Abstract

Robust voice activity detection (VAD) is a prerequisite for many speech-based applications such as speech recognition. We investigated two VAD techniques that use time-domain and frequency-domain characteristics of the speech signal. In the time domain, the temporal characteristic of the autocorrelation lag is used to discriminate speech from nonspeech regions. In the frequency domain, the peak value of the magnitude spectrum in different sub-bands is used for VAD. The performance of the proposed methods is evaluated on the TIMIT database with noises from the NOISEX-92 database at various signal-to-noise ratio (SNR) levels. The experimental results show that VAD based on autocorrelation lag performs consistently better than the method based on the maximum peak value of the autocorrelation function. However, it performs worse than our second approach and AMR-VAD2. Our second approach, i.e., VAD based on maximum spectral amplitude in sub-bands, outperforms AMR-VAD2 and Sohn VAD for some noise conditions. Moreover, it is shown that a threshold independent of the noise types and their levels can be selected in the proposed method.
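
The abstract names two per-frame features but gives no parameters or decision rule. The following is a minimal sketch of how such features might be computed, not the authors' implementation. The function names, frame length, hop size, lag range, and the use of equal-width sub-bands with a Hann window are all assumptions made for illustration.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (the ragged tail is dropped)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def autocorr_peak_lag(frame, min_lag, max_lag):
    """Lag (in samples) of the largest autocorrelation value in [min_lag, max_lag]."""
    frame = frame - frame.mean()
    # Keep only non-negative lags of the full autocorrelation sequence.
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    return min_lag + int(np.argmax(r[min_lag:max_lag + 1]))

def subband_peak_amplitudes(frame, n_bands):
    """Maximum magnitude-spectrum value in each of n_bands equal-width sub-bands."""
    mag = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    return np.array([band.max() for band in np.array_split(mag, n_bands)])

if __name__ == "__main__":
    fs = 8000                                # assumed sampling rate in Hz
    t = np.arange(fs) / fs
    tone = np.sin(2 * np.pi * 200 * t)       # synthetic stand-in for a voiced segment
    frames = frame_signal(tone, frame_len=240, hop=80)
    lags = [autocorr_peak_lag(f, 20, 160) for f in frames]
    print("peak lags of first frames:", lags[:5])
    print("sub-band peaks of first frame:", subband_peak_amplitudes(frames[0], 4))
```

For a periodic input the peak lag stays at the period from frame to frame, whereas for noise it tends to jump around. That frame-to-frame behaviour is one plausible reading of the "temporal characteristic" mentioned in the abstract. How the features are thresholded to make the final speech/nonspeech decision is not stated there and is not reproduced here.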

Bibliographic details

Source: International conference on natural language processing, 2014, pp. 48-55 (8 pages)
Authors: Sivanand Achanta; Nivedita Chennupati; Vishala Pannala; Mansi Rankawat; Kishore Prahallad
Original format: PDF

Similar literature

1. Voice-Activity Detection Using Long-Term Sub-Band Entropy Measure [J]. Kun-Ching WANG. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences. 2012, No. 9
2. Discrimination algorithm using voiced detection method and time-delay neural network system by 3 FFT sub-bands [J]. Jae Seung Choi. International journal of computational vision and robotics. 2015, No. 2
3. SPECTRAL ENERGY BASED VOICE ACTIVITY DETECTION FOR REAL-TIME VOICE INTERFACE [J]. JEONG-SIK PARK, JUNG-SEOK YOON, YONG-HO SEO. Journal of Theoretical and Applied Information Technology. 2017, No. 17
4. Efficient voice activity detection algorithm based on sub-band temporal envelope and sub-band long-term signal variability [C]. Liu Bin, Tao Jianhua, Mo Fuyuan. International Symposium on Chinese Spoken Language Processing. 2014
5. Cortical Temporal Processing in Cochlear Implant Users: Amplitude Modulation and Voice Onset Time [D]. Han, Ji-Hye. 2014
6. Discrimination of interaural temporal disparities conveyed by high-frequency sinusoidally amplitude-modulated tones and high-frequency transposed tones: Effects of spectrally flanking noises [O]. Leslie R. Bernstein, Constantine Trahiotis
7. A New Voice Activity Detection Method Using Maximized Sub-band SNR [O]. Weiwu Jiang, Wai Kit Lo, Helen Meng. 2013
