Computer speech and language
Glimpse-based estimation of speech intelligibility from speech-in-noise using artificial neural networks

Abstract

While human listeners can, to some extent, understand the information conveyed by a speech signal when it is mixed with noise, traditional objective intelligibility measures usually fail to operate without a priori knowledge of the clean speech signal. This limits the usability of those measures in situations where the clean speech signal is inaccessible. In this paper, a glimpse-based method is extended to make speech intelligibility predictions directly from speech-plus-noise mixtures. Using a neural network, the proposed method estimates the time-frequency regions with a local speech-to-noise ratio above a given threshold, known as glimpses, from the mixture signal, instead of separately comparing the speech signal against the noise signal. The number and locations of the glimpses can then be used to produce an intelligibility score. In Experiment I, where listener intelligibility was measured in one stationary and nine fluctuating noise maskers, the predictions produced by the proposed method were highly correlated with the subjective data, with correlation coefficients above 0.90. In Experiment II, with the same neural network trained on normal natural speech as in Experiment I, the proposed method was used to predict the intelligibility of speech signals modified by intelligibility-enhancement algorithms and of synthetic speech. The method maintains its predictive power, achieving performance similar to that of its intrusive counterpart with an overall correlation coefficient of 0.81, which is superior to many traditional measures evaluated under the same conditions. Therefore, the proposed method can be used to estimate speech intelligibility in place of traditional measures in conditions where their capacity falls short.
