A comparison of several acoustic representations for speech recognition with degraded and undegraded speech



Abstract

Several acoustic representations have been compared in speaker-dependent and speaker-independent, connected-word and isolated-word recognition tests with undegraded speech and with speech degraded by adding white noise and by applying a 6-dB/octave spectral tilt. The representations comprised the output of an auditory model; cepstrum coefficients derived from an FFT-based mel-scale filter bank, with various weighting schemes applied to the coefficients; cepstrum coefficients augmented with measures of their rates of change with time; and sets of linear discriminant functions derived from the filter-bank output, called IMELDA. The auditory model outperformed the cepstrum representations except in noise-free connected-word tests, where it had a high insertion rate. The best cepstrum weighting scheme was derived from within-class variances; its behavior may explain the empirical adjustments found necessary with other schemes. IMELDA outperformed all other representations in all conditions and is computationally simple.
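The two degradations named in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract gives neither the noise level nor the direction or implementation of the tilt, so the target SNR, the first-order difference filter, and the function names `add_white_noise`, `apply_tilt`, and `tilt_gain_db` are all assumptions made for illustration.

```python
import math
import random


def add_white_noise(signal, snr_db, rng):
    """Add Gaussian white noise scaled to a requested SNR (assumed parameter)."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def apply_tilt(signal, coeff=1.0):
    """First-order difference y[n] = x[n] - coeff * x[n-1].

    Assumption: with coeff = 1 the gain is 2*|sin(w/2)|, which rises by
    roughly 6 dB per octave at frequencies well below the Nyquist limit.
    The first sample is passed through unchanged (x[-1] is taken as 0).
    """
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - coeff * signal[n - 1])
    return out


def tilt_gain_db(freq_hz, sample_rate_hz, coeff=1.0):
    """Gain in dB of the difference filter at a single frequency."""
    w = 2 * math.pi * freq_hz / sample_rate_hz
    magnitude = math.sqrt(1 - 2 * coeff * math.cos(w) + coeff * coeff)
    return 20 * math.log10(magnitude)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    sample_rate = 8000      # hypothetical sampling rate in Hz
    clean = [math.sin(2 * math.pi * 440 * n / sample_rate)
             for n in range(sample_rate)]
    noisy = add_white_noise(clean, snr_db=10.0, rng=rng)
    tilted = apply_tilt(clean)
    # Gain difference across one octave; close to 6 dB at low frequencies.
    print(tilt_gain_db(200, sample_rate) - tilt_gain_db(100, sample_rate))
```

A tilt in the opposite direction (falling by 6 dB/octave) would need an integrator-style filter instead; the abstract does not say which was used.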


Bibliographic details

Source
International Conference on Acoustics, Speech, and Signal Processing, 1989, 4 pages
Authors
Hunt M.J.; Lefebvre C.; Institute of Electrical and Electronics Engineers
Original format: PDF
Classification: Automation technology, computer technology

