
Cochannel Speech Segregation with Sparse Coding


Abstract

Most computational auditory scene analysis (CASA) systems rely on pitch-based features. Cochannel speech segregation involves two speakers, and because the pitch ranges of male and female speech overlap to a large extent, multi-pitch tracking is a nontrivial task. Pitch tracking becomes harder still for same-gender mixtures. More reliable features are therefore needed. Here we propose a cochannel speech segregation system with sparsity-based features. Sparse coding is applied to the cochleagram of the signal, using pre-trained speaker-specific dictionaries, to obtain sparse approximation coefficients. We treat these coefficients as features because they are selected from the speaker-specific dictionaries to represent the input signal, which makes them a good choice for estimating binary masks. The speech waveform is resynthesized from the masked cochleagram of the mixture. Experimental results show that the proposed method produces better objective intelligibility scores than the baseline system.
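The pipeline described in the abstract (sparse-code each cochleagram frame over speaker-specific dictionaries, then derive binary masks from the coefficients) might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name a sparse solver or a mask rule, so orthogonal matching pursuit and a "larger per-speaker reconstruction wins" rule are substituted here, and the names `omp` and `binary_masks` are hypothetical. Dictionary training and waveform resynthesis are not shown.

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Orthogonal matching pursuit (assumed solver, not confirmed by the source).

    Approximates x using at most n_nonzero columns (atoms) of D.
    """
    residual = x.astype(float).copy()
    support = []
    sol = np.zeros(0)
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:
            break
    if support:
        coeffs[support] = sol
    return coeffs

def binary_masks(cochleagram, D1, D2, n_nonzero=3):
    """Estimate one binary mask per speaker.

    cochleagram: (channels, frames) array of the mixture.
    D1, D2: (channels, atoms) pre-trained dictionaries for speakers 1 and 2.
    Mask rule (an assumption): a time-frequency unit goes to speaker 1
    when speaker 1's partial reconstruction is the larger one.
    """
    D = np.hstack([D1, D2])          # joint dictionary
    n1 = D1.shape[1]
    mask1 = np.zeros(cochleagram.shape, dtype=bool)
    for t in range(cochleagram.shape[1]):
        a = omp(D, cochleagram[:, t], n_nonzero)
        est1 = np.abs(D1 @ a[:n1])   # part explained by speaker 1 atoms
        est2 = np.abs(D2 @ a[n1:])   # part explained by speaker 2 atoms
        mask1[:, t] = est1 > est2
    return mask1, ~mask1
```

Multiplying the mixture cochleagram element-wise by one of the returned masks gives the masked cochleagram from which that speaker's waveform would be resynthesized.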
