Deep ad-hoc beamforming

Abstract

Far-field speech processing is an important and challenging problem. In this paper, we propose deep ad-hoc beamforming, a deep-learning-based multichannel speech enhancement framework built on ad-hoc microphone arrays, to address it. The framework contains three novel components. First, it combines ad-hoc microphone arrays with deep-learning-based multichannel speech enhancement, which significantly reduces the probability that far-field acoustic conditions occur. Second, it groups the microphones around the speech source into a local microphone array using a supervised channel selection framework based on deep neural networks. Third, it develops a simple time synchronization framework to align channels that have different time delays. Beyond these novelties, the proposed model is trained in a single-channel fashion, so it can readily adopt new developments in speech processing techniques. Its test stage is also flexible: it can incorporate any number of microphones without retraining or modifying the framework. We have developed many implementations of the proposed framework and conducted extensive experiments in scenarios where the locations of the speech sources are far-field, random, and unknown to the microphones. Results on speech enhancement tasks show that our method outperforms its counterpart that works with linear microphone arrays by a considerable margin in reverberant environments with diffuse noise as well as in reverberant environments with point-source noise. We have also tested the framework with different handcrafted features. Results show that although good features lead to high performance, they do not affect the conclusion on the effectiveness of the proposed framework.
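The channel selection and time synchronization steps can be illustrated with a small sketch. The abstract does not say how either step is implemented, so everything below is an assumption made for illustration: the per-channel `scores` stand in for quality estimates that a single-channel DNN might produce, the relative threshold `ratio` is a hypothetical selection rule, and the alignment uses plain cross-correlation against the best-scoring channel rather than the authors' synchronization method.

```python
from typing import Dict, List, Sequence, Tuple


def select_channels(scores: Sequence[float], ratio: float = 0.5) -> List[int]:
    """Keep channels whose score is at least `ratio` times the best score.

    ASSUMPTION: `scores` are hypothetical per-channel quality estimates;
    in the paper they would come from a trained DNN, here they are given.
    """
    best = max(scores)
    return [i for i, s in enumerate(scores) if s >= ratio * best]


def estimate_delay(ref: Sequence[float], sig: Sequence[float], max_lag: int) -> int:
    """Return the lag (in samples) of `sig` relative to `ref` that maximizes
    their cross-correlation, searched over [-max_lag, max_lag]."""
    best_lag, best_corr = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        corr = 0.0
        for n, r in enumerate(ref):
            m = n + lag
            if 0 <= m < len(sig):
                corr += r * sig[m]
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag


def shift(sig: Sequence[float], lag: int) -> List[float]:
    """Advance `sig` by `lag` samples (delay it if negative), zero-padding."""
    n = len(sig)
    return [sig[i + lag] if 0 <= i + lag < n else 0.0 for i in range(n)]


def synchronize(
    channels: Sequence[Sequence[float]],
    scores: Sequence[float],
    ratio: float = 0.5,
    max_lag: int = 8,
) -> Tuple[int, Dict[int, List[float]]]:
    """Select channels, then align each one to the best-scoring channel."""
    selected = select_channels(scores, ratio)
    ref_idx = max(selected, key=lambda i: scores[i])
    ref = channels[ref_idx]
    aligned = {}
    for i in selected:
        lag = estimate_delay(ref, channels[i], max_lag)
        aligned[i] = shift(channels[i], lag)
    return ref_idx, aligned


# Toy example: channel 1 is channel 0 delayed by two samples,
# channel 2 has a low score and is dropped.
ref = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
ch1 = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
ch2 = [0.3, -0.2, 0.1, 0.0, -0.1, 0.2, 0.0, 0.1, -0.3, 0.2]
ref_idx, aligned = synchronize([ref, ch1, ch2], scores=[0.9, 0.6, 0.1])
```

Because selection and alignment operate on each channel independently, a sketch like this accepts any number of microphones at test time, which mirrors the flexibility the abstract claims for the framework.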
