Venue: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Data-Efficient Framework for Real-World Multiple Sound Source 2D Localization



Abstract

Deep neural networks have recently led to promising results for the task of multiple sound source localization. Yet, they require a lot of training data to cover a variety of acoustic conditions and microphone array layouts. One can leverage acoustic simulators to inexpensively generate labeled training data. However, models trained on synthetic data tend to perform poorly on real-world recordings due to the domain mismatch. Moreover, learning for different microphone array layouts makes the task more complicated due to the infinite number of possible layouts. We propose to use adversarial learning methods to close the gap between the synthetic and real domains. Our novel ensemble-discrimination method significantly improves the localization performance without requiring any label from the real data. Furthermore, we propose a novel explicit transformation layer to be embedded in the localization architecture. It enables the model to be trained with data from specific microphone array layouts while generalizing well to unseen layouts during inference.
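The abstract mentions using adversarial learning to close the gap between the synthetic and real domains. One common realization of this idea (not necessarily the authors' exact method) is a gradient reversal layer, as in domain-adversarial training: the layer is the identity in the forward pass, but flips the sign of the gradient in the backward pass, so the feature extractor learns to *confuse* a domain classifier. A minimal pure-Python sketch of the sign flip, with all names and numeric values hypothetical:

```python
# Gradient reversal layer (GRL): identity forward, negated gradient backward.
# This illustrates the general adversarial domain-adaptation mechanism; the
# paper's actual architecture and loss may differ.

def grl_forward(x):
    """Forward pass: identity."""
    return x

def grl_backward(grad_out, lam=1.0):
    """Backward pass: flip (and optionally scale) the incoming gradient."""
    return -lam * grad_out

# Toy scalar example: feature f = w * x, domain logit d = v * f,
# domain loss L = 0.5 * (d - y)^2, all values chosen for illustration.
x, y = 2.0, 1.0          # input and domain label
w, v = 0.5, 0.3          # feature-extractor and domain-classifier weights
f = grl_forward(w * x)   # feature passes through the GRL
d = v * f
dL_dd = d - y                     # gradient of squared error w.r.t. logit
dL_df = grl_backward(v * dL_dd)   # gradient is reversed before the extractor
dL_dw = dL_df * x                 # extractor receives the *reversed* gradient
print(dL_dw)                      # ≈ 0.42 (positive, i.e. ascends domain loss)
```

Because the sign is flipped, gradient descent on `w` *increases* the domain loss, pushing the feature extractor toward domain-invariant features while the domain classifier (updated without the flip) still descends on its own loss.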
