Multichannel blind separation of speech signals in a reverberant environment.


Abstract

The need to separate independent speech signals using multiple microphones in a reverberant environment arises in a variety of applications, e.g., speech enhancement, speech recognition, and hands-free telephony. In most of these applications, very little or nothing is known about the source signals or the way they are mixed together, making the separation methods "blind." Existing blind source separation (BSS) methods tend to break down in a realistic reverberant environment. In this thesis, we show that this limited performance is due to random permutations of the unmixing filters over frequency. We refer to this problem as permutation inconsistency, which becomes worse as the length of the room impulse response increases. By developing diagnostic tools, we reveal that if the unmixing filter matrix permutations are properly aligned at all frequency bins, the performance of the BSS method is greatly improved. We derive ideal separation performance benchmarks and examine the effect of microphone separation and room reverberation on the separation performance.

We study the performance of an ideal null-steering beamformer in the context of speech separation when the source locations are assumed to be known. This leads us to explore interesting connections between BSS and ideal beamforming, where we show the feasibility of using beamformer concepts to resolve the permutation inconsistency problem. We propose a permutation alignment scheme based on information gathered from microphone array directivity patterns. This technique is novel in the sense that it works satisfactorily even when the directivity patterns exhibit grating lobes. We also illustrate the remarkable performance of BSS, which outperforms the ideal beamformer in highly reverberant environments even though the latter assumes prior knowledge of the speech source locations.

Finally, we discover a loss of spectral resolution when one tries to align the unmixing filter permutations. We refer to this conflict between the two requirements as the permutation-inconsistency/spectral-resolution tradeoff. To ease this tradeoff, we propose a multiresolution approach, which significantly reduces the permutation misalignment while keeping the valuable spectral resolution intact. We carry out our experiments under varying acoustic conditions and, for all methods, compare the performance to an ideal benchmark.
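As a rough illustration of permutation inconsistency and of alignment by directivity pattern, the following is a minimal sketch and not the scheme developed in the thesis. It assumes an anechoic far-field model, two microphones, two sources, and frequencies below the spatial-aliasing limit, so that each unmixing row has a single null and no grating lobes. The spacing `D`, the source angles, and all function names are hypothetical choices made for this example.

```python
import cmath
import math
import random

C = 343.0  # speed of sound in m/s
D = 0.04   # assumed microphone spacing in m (hypothetical value)

def steering(freq, theta):
    """Far-field response of a 2-mic array to a plane wave from angle theta (rad)."""
    delay = D * math.sin(theta) / C
    return [1.0 + 0j, cmath.exp(-2j * math.pi * freq * delay)]

def null_steering_rows(freq, thetas):
    """Idealized unmixing rows: row i places a spatial null on source i."""
    rows = []
    for th in thetas:
        a = steering(freq, th)
        rows.append([a[1], -a[0]])  # row[0]*a[0] + row[1]*a[1] == 0
    return rows

def null_direction(row, freq, grid):
    """Grid angle at which the row's directivity pattern is smallest."""
    def gain(th):
        a = steering(freq, th)
        return abs(row[0] * a[0] + row[1] * a[1])
    return min(grid, key=gain)

def align(rows, freq, grid):
    """Order the rows of one frequency bin by the direction of their null."""
    return sorted(rows, key=lambda r: null_direction(r, freq, grid))

grid = [math.radians(d) for d in range(-90, 91)]
sources = [math.radians(-30), math.radians(40)]
freqs = [500.0 * k for k in range(1, 9)]  # 500 to 4000 Hz, below C / (2 * D)

rng = random.Random(0)
for f in freqs:
    rows = null_steering_rows(f, sources)
    # Emulate the BSS ambiguity: each bin may return its rows in either order.
    scrambled = rows[::-1] if rng.random() < 0.5 else rows
    aligned = align(scrambled, f, grid)
    nulls = [round(math.degrees(null_direction(r, f, grid))) for r in aligned]
    print(f, nulls)
```

Sorting by null direction gives every bin the same row order regardless of how it was scrambled. Above `C / (2 * D)` the pattern of a row can show several nulls, which is the grating-lobe case the abstract mentions and which this sketch does not handle.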

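Bibliographic record

  • Author

    Ikram, Muhammad Zubair

  • Author affiliation

    Georgia Institute of Technology

  • Degree-granting institution: Georgia Institute of Technology
  • Discipline: Engineering, Electronics and Electrical
  • Degree: Ph.D.
  • Year: 2001
  • Pages: 130 p.
  • Format of original: PDF
  • Language of text: English
  • Chinese Library Classification: Radio electronics, telecommunications technology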
