Monaural speech separation using GA-DNN integration scheme

Sivapatham Shoba; Ramadoss Rajavel; Kar Asutosh; Majhi Banshidhar

Abstract

In this research work, we propose a model based on the Genetic Algorithm (GA) and a Deep Neural Network (DNN) to enhance the quality and intelligibility of noisy speech. In the proposed model, the Voiced Speech (VS) T-F mask is computed using the correlogram, frame energy and cross-channel correlogram, and the Unvoiced Speech (UVS) T-F mask is computed using speech onset/offset. The T-F mask obtained from speech onset and offset represents both the voiced and unvoiced segments of the noisy speech signal, so the UVS T-F mask is obtained by subtracting the VS T-F mask from it. Next, the GA is used to find the optimum weight for combining the VS and UVS T-F masks to improve speech quality and intelligibility. The weight obtained using the GA may not be optimum for all sets of speech and noise. This work addresses that issue by proposing a DNN model that estimates the optimum weight for all sets of speech and noise. The DNN model is trained using features and the optimum weights obtained using the GA. The trained DNN model is then used to estimate the optimum weight for the test speech and noise samples. The performance of the proposed GA-DNN based model is evaluated using objective and subjective quality and intelligibility measures. The results show that the proposed model improves speech quality and intelligibility, with average improvements of 0.73, 4.07, 0.17, 0.26 and 0.22 for PESQ, SNR, STOI, CSII and NCM, respectively, when compared with existing speech separation systems. (C) 2019 Elsevier Ltd. All rights reserved.
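
The mask combination step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the combination rule, the GA operators or the fitness function, so the convex blend in `combine_masks`, the stand-in fitness (negative mean squared error against a hypothetical reference mask) and all function names and parameters below are assumptions made for illustration only.

```python
import numpy as np

def uvs_mask(onset_offset_mask, vs_mask):
    # Unvoiced mask: onset/offset mask minus the voiced mask, clipped to [0, 1].
    return np.clip(onset_offset_mask - vs_mask, 0.0, 1.0)

def combine_masks(vs_mask, uvs, w):
    # ASSUMPTION: a convex blend with a single scalar weight w in [0, 1].
    return w * vs_mask + (1.0 - w) * uvs

def fitness(w, vs_mask, uvs, target_mask):
    # ASSUMPTION: stand-in fitness, negative MSE against a reference mask.
    # A quality or intelligibility score could be substituted here.
    return -np.mean((combine_masks(vs_mask, uvs, w) - target_mask) ** 2)

def ga_search_weight(vs_mask, uvs, target_mask, pop_size=20,
                     generations=30, mutation_std=0.1, seed=0):
    # Toy genetic algorithm over one real-valued gene (the weight w).
    rng = np.random.default_rng(seed)
    pop = rng.uniform(0.0, 1.0, size=pop_size)
    n_parents = pop_size // 2
    for _ in range(generations):
        scores = np.array([fitness(w, vs_mask, uvs, target_mask) for w in pop])
        # Selection: keep the better half unchanged (elitism).
        parents = pop[np.argsort(scores)[::-1][:n_parents]]
        # Crossover: each child is the average of two random parents.
        n_children = pop_size - n_parents
        a = rng.choice(parents, size=n_children)
        b = rng.choice(parents, size=n_children)
        children = 0.5 * (a + b)
        # Mutation: Gaussian perturbation, clipped back to [0, 1].
        children = np.clip(
            children + rng.normal(0.0, mutation_std, size=n_children), 0.0, 1.0
        )
        pop = np.concatenate([parents, children])
    scores = np.array([fitness(w, vs_mask, uvs, target_mask) for w in pop])
    return float(pop[np.argmax(scores)])
```

In the scheme the abstract describes, weights found this way for training mixtures would then serve as regression targets for the DNN, which predicts a weight from features of unseen speech and noise.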

Bibliographic details

Source
Applied Acoustics, 2020, No. 3, pp. 107140.1-107140.11 (11 pages)
Authors
Sivapatham Shoba; Ramadoss Rajavel; Kar Asutosh; Majhi Banshidhar
Author affiliations

SSN Coll Engn Dept Elect & Commun Engn Kalavakkam India;

SSN Coll Engn Dept Elect & Commun Engn Kalavakkam India;

Indian Inst Informat Technol Design & Mfg Dept Elect & Commun Engn Chennai Tamil Nadu India;

Indian Inst Informat Technol Design & Mfg Dept Comp Sci & Engn Chennai Tamil Nadu India;

Indexed in
Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format
PDF
Language
English
Keywords
Genetic Algorithm; Deep Neural Network; Monaural Speech Separation; Segmentation; Voiced Speech; Unvoiced Speech

Similar literature

1. Monaural speech separation based on MAXVQ and CASA for robust speech recognition [J]. Peng Li, Yong Guan, Shijin Wang. Computer Speech and Language, 2010, No. 1.
2. Monaural Speech Separation Based on Computational Auditory Scene Analysis and Objective Quality Assessment of Speech [J]. Li P., Guan Y., Xu B. IEEE Transactions on Audio, Speech and Language Processing, 2006, No. 6.
3. Monaural speech/music source separation using discrete energy separation algorithm [J]. Yevgeni Litvin, Israel Cohen, Dan Chazan. Signal Processing, 2010, No. 12.
4. Monaural Speech Separation Based on Computational Auditory Scene Analysis and Objective Quality Assessment of Speech* [C]. Peng Li, Yong Guan, Bo Xu. First International Conference on Innovative Computing, Information and Control, vol. II, 2006.
5. Monaural speech segregation in reverberant environments [D]. Jin, Zhaozhang. 2010.
6. Complex Ratio Masking for Monaural Speech Separation [O]. Donald S. Williamson, Yuxuan Wang, DeLiang Wang.
7. NMF based speech and music separation in monaural speech recordings with sparseness and temporal continuity constraints [O]. Tu Ming, Xie Xiang, Jiao Yishan. 2013.
8. Deep Ensemble Learning for Monaural Speech Separation [R]. Wang, D. 2015.
