2016 IEEE International Workshop on Acoustic Signal Enhancement

Modeling audio directional statistics using a probabilistic spatial dictionary for speaker diarization in real meetings

Abstract

Speaker diarization is the task of estimating “who spoke when” in a meeting. Accurate diarization of real meetings has to cope with noise, speaker overlap, reverberation, and similar degradations. In this work, we propose to model the directional statistics of spatial clusters via a dictionary of probabilistic models. The dictionary is trained on spatial features of possible source locations. An observed mixture of multiple source signals is represented statistically as a weighted sum of the trained models, where each weight describes the activity of a source associated with a spatial location or cluster. To detect the active clusters and perform speaker diarization, the weights are estimated by applying Bayes' rule. Furthermore, a Laplace distribution is proposed to model the background noise. The proposed method was evaluated on real meetings and performed well compared with a baseline method.
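The weighted-sum idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the abstract does not specify the spatial feature, the form of the dictionary models, or the exact estimation procedure. The sketch assumes a one-dimensional direction-of-arrival angle as the feature, von Mises densities with a shared concentration `kappa` as the dictionary entries, and an iterative update that applies Bayes' rule to obtain per-observation posteriors and averages them into weights. The background component is a uniform density on the circle, used here only as a placeholder; the paper proposes a Laplace distribution for noise, whose exact form is not given in the abstract. The names `von_mises_pdf`, `estimate_weights`, and `demo_weights` are hypothetical.

```python
import numpy as np


def von_mises_pdf(theta, mu, kappa):
    """Von Mises density on the circle (angles in radians)."""
    return np.exp(kappa * np.cos(theta - mu)) / (2.0 * np.pi * np.i0(kappa))


def estimate_weights(theta, mus, kappa=8.0, n_iter=50):
    """Estimate activity weights of fixed dictionary entries plus a noise term.

    theta : (N,) observed direction features for one time segment
    mus   : (K,) mean directions of the pre-trained dictionary entries
    Returns (K + 1,) weights; the last entry is the background component.
    """
    theta = np.asarray(theta, dtype=float)
    mus = np.asarray(mus, dtype=float)
    # Likelihood of every observation under every dictionary entry: (N, K)
    lik = von_mises_pdf(theta[:, None], mus[None, :], kappa)
    # Placeholder background model (assumption): uniform on the circle
    noise = np.full((theta.size, 1), 1.0 / (2.0 * np.pi))
    lik = np.hstack([lik, noise])
    # Start from equal weights and refine them iteratively
    w = np.full(lik.shape[1], 1.0 / lik.shape[1])
    for _ in range(n_iter):
        post = lik * w                            # Bayes' rule numerator
        post /= post.sum(axis=1, keepdims=True)   # posterior per observation
        w = post.mean(axis=0)                     # averaged posterior = new weights
    return w


def demo_weights(seed=0):
    """Synthetic segment with two active directions out of four candidates."""
    rng = np.random.default_rng(seed)
    mus = np.deg2rad([0.0, 90.0, 180.0, 270.0])
    theta = np.concatenate([
        rng.vonmises(mus[0], 8.0, 300),
        rng.vonmises(mus[2], 8.0, 300),
    ])
    return estimate_weights(theta, mus)
```

In this toy setting, the entries at 0° and 180° should receive most of the weight, and a cluster could be declared active when its weight exceeds a chosen threshold. Both the threshold and the per-segment windowing are design choices that the abstract leaves open.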
