Enhancing Distributed Speech Recognition with Back-End Speech Reconstruction

Abstract

In this paper, we present a method to enhance the usefulness of a Distributed Speech Recognition (DSR) system by giving it the capability to reconstruct speech at the back-end. Speech reconstruction is achieved using the standard DSR parameters, viz., Mel-Frequency Cepstral Coefficients (MFCC) and log-energy, and some additional parameters, viz., voicing class, pitch period, and (optionally) higher-resolution energy information. From the MFCC parameters and energy information, the spectral magnitudes at the harmonics of the pitch frequency are estimated. Based on the voicing class information, the harmonic phases are appropriately modeled. The harmonic magnitudes and phases are used to reconstruct speech according to the well-known sinusoidal model for speech synthesis. Transmission of the additional parameters for speech reconstruction increases the DSR bit rate by less than 20%. Evaluations by a Mean Opinion Score (MOS) test and a Diagnostic Rhyme Test (DRT) show that speech reconstructed in this way is of reasonable quality and quite intelligible.
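The final step of the abstract, sinusoidal synthesis from harmonic magnitudes and phases, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `synthesize_frame`, the 8 kHz sampling rate, the 160-sample frame length, and the example magnitudes and phases are all assumptions chosen for illustration. The paper's estimation of magnitudes from MFCCs and its phase modeling are not reproduced here.

```python
import math


def synthesize_frame(pitch_hz, magnitudes, phases,
                     sample_rate=8000, num_samples=160):
    """Sum-of-sinusoids synthesis for one frame (illustrative only).

    magnitudes[k] and phases[k] describe harmonic k+1 of pitch_hz.
    sample_rate and num_samples are assumed values, not from the paper.
    """
    if len(magnitudes) != len(phases):
        raise ValueError("magnitudes and phases must have the same length")

    nyquist = sample_rate / 2
    frame = []
    for n in range(num_samples):
        t = n / sample_rate
        sample = 0.0
        for k, (amp, phi) in enumerate(zip(magnitudes, phases), start=1):
            freq = k * pitch_hz
            if freq >= nyquist:
                break  # skip harmonics at or above the Nyquist frequency
            sample += amp * math.cos(2 * math.pi * freq * t + phi)
        frame.append(sample)
    return frame


# Hypothetical input: 100 Hz pitch, two harmonics, zero phase.
frame = synthesize_frame(100.0, [1.0, 0.5], [0.0, 0.0])
```

A practical synthesizer would also interpolate or overlap-add parameters across frame boundaries so that consecutive frames join smoothly; this sketch treats each frame in isolation.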

Bibliographic details

Source
European Conference on Speech Communication and Technology, vol. 3; 3-7 September 2001; Aalborg, DK | 2001 | pp. 1859-1862 | 4 pages
Conference location: Aalborg (DK)
Authors
Tenkasi Ramabadran; Jeff Meunier; Mark Jasiuk; Bill Kushner
Affiliation

Speech Processing Research Laboratory, Motorola Labs, Schaumburg, IL 60196, USA

Original format: PDF
Language: English
Chinese Library Classification: Communication theory

Similar literature

1. Robust distributed speech recognition using speech enhancement [J]. Flynn R., Jones E. IEEE Transactions on Consumer Electronics. 2008, No. 3
2. A STATISTICAL ANALYSIS ON THE IMPACT OF SPEECH ENHANCEMENT TECHNIQUES ON THE FEATURE VECTORS OF NOISY SPEECH SIGNALS FOR SPEECH RECOGNITION [J]. SWAPNANIL GOGOI, UTPAL BHATTACHARJEE. Journal of computer science engineering and information technology research. 2016, No. 3
3. Enhancing Distributed Speech Recognition with Back-End Speech Reconstruction [C]. Tenkasi Ramabadran, Jeff Meunier, Mark Jasiuk. European conference on speech communication and technology. 2001
4. Phase-sensitive speech enhancement for automated speech recognition [D]. Seyed Jafari Olya, Seyed Poorya. 2010
5. The Self-Advantage in Visual Speech Processing Enhances Audiovisual Speech Recognition in Noise [O]. Nancy Tye-Murray, Brent P. Spehar, Joel Myerson
6. Combined speech enhancement and auditory modelling for robust distributed speech recognition [O]. Ronan Flynn, Edward Jones. 2008
