Journal: IEEE Transactions on Neural Networks

A unified neural-network-based speaker localization technique

Abstract

Locating and tracking a speaker in real time using microphone arrays is important in many applications such as hands-free video conferencing, speech processing in large rooms, and acoustic echo cancellation. A speaker can be moving from the far field to the near field of the array, or vice versa. Many neural-network-based localization techniques exist, but they are applicable to either far-field or near-field sources, and are computationally intensive for real-time speaker localization applications because of the wide-band nature of the speech. We propose a unified neural-network-based source localization technique, which is simultaneously applicable to wide-band and narrow-band signal sources that are in the far field or near field of a microphone array. The technique exploits a multilayer perceptron feedforward neural network structure and forms the feature vectors by computing the normalized instantaneous cross-power spectrum samples between adjacent pairs of sensors. Simulation results indicate that our technique is able to locate a source with an absolute error of less than 3.5° at a signal-to-noise ratio of 20 dB and a sampling rate of 8000 Hz at each sensor.
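
The feature-extraction step mentioned in the abstract might be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' implementation. The function name `cross_power_features`, the `n_bins` parameter, the choice to normalize each cross-spectrum sample to unit magnitude, and the stacking of real and imaginary parts are all assumptions; the abstract does not specify these details.

```python
import numpy as np

def cross_power_features(frames, n_bins):
    """Build a feature vector from one snapshot of a microphone array.

    frames : array of shape (n_sensors, n_samples), one time frame per sensor.
    n_bins : number of frequency bins to keep (bin 0, the DC term, is skipped).

    Assumption: "normalized" is taken to mean dividing each cross-spectrum
    sample by its magnitude, so only the inter-sensor phase remains.
    """
    frames = np.asarray(frames, dtype=float)
    # One-sided spectrum of each sensor's frame.
    spectra = np.fft.rfft(frames, axis=1)
    # Instantaneous cross-power spectrum between adjacent sensors m and m+1.
    cross = spectra[:-1] * np.conj(spectra[1:])
    # Keep bins 1..n_bins.
    cross = cross[:, 1:n_bins + 1]
    # Unit-magnitude normalization (guarded against division by zero).
    cross = cross / np.maximum(np.abs(cross), 1e-12)
    # Real-valued vector suitable as input to a multilayer perceptron.
    return np.concatenate([cross.real.ravel(), cross.imag.ravel()])
```

For an array of M sensors this yields 2 × (M − 1) × `n_bins` real values per frame. Under the unit-magnitude assumption the features depend only on the phase differences between adjacent sensors, which is the kind of input a multilayer perceptron could map to a source position.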