Unsupervised speaker segmentation of multi-speaker speech data

Abstract

Systems and methods for unsupervised, speaker-based segmentation of multi-speaker speech or audio data. A front-end analysis is applied to the input speech data to obtain feature vectors. The speech data is first segmented, and the segments are then clustered into groups that correspond to different speakers. The clusters are iteratively modeled and the data resegmented until the speaker segmentations are stable. The overlap between segmentation sets is checked to ensure successful speaker segmentation; overlapping segments are combined, remodeled, and resegmented. Optionally, the speech data is processed to produce a segmentation lattice to maximize the overall segmentation likelihood.
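
The abstract does not say which models, clustering criterion, or stopping rule are used, so the following is only a minimal sketch of the segment, cluster, model, and resegment loop it describes. It assumes fixed-length initial segments, k-means clustering of per-segment mean vectors, and one diagonal-covariance Gaussian per speaker; these choices, and every function and parameter name below, are illustrative assumptions rather than details taken from the patent. The overlap check and the optional segmentation lattice are not covered.

```python
import numpy as np


def initial_segments(n_frames, seg_len):
    """Split the frame range into fixed-length segments (assumed initial segmentation)."""
    return [(s, min(s + seg_len, n_frames)) for s in range(0, n_frames, seg_len)]


def cluster_segments(features, segments, n_speakers, n_iter=20):
    """Cluster segments by their mean feature vector with plain k-means (an assumption)."""
    means = np.array([features[a:b].mean(axis=0) for a, b in segments])
    # Deterministic farthest-point initialisation of the centroids.
    centroids = [means[0]]
    for _ in range(1, n_speakers):
        d = np.min([np.sum((means - c) ** 2, axis=1) for c in centroids], axis=0)
        centroids.append(means[int(np.argmax(d))])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        dists = ((means[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        seg_labels = dists.argmin(axis=1)
        for k in range(n_speakers):
            if np.any(seg_labels == k):
                centroids[k] = means[seg_labels == k].mean(axis=0)
    # Expand segment-level labels to one label per frame.
    frame_labels = np.empty(len(features), dtype=int)
    for (a, b), k in zip(segments, seg_labels):
        frame_labels[a:b] = k
    return frame_labels


def resegment(features, labels, n_speakers, max_iter=20, smooth=5):
    """Alternate between modelling each cluster and relabelling frames until stable."""
    kernel = np.ones(smooth) / smooth
    for _ in range(max_iter):
        loglik = np.full((len(features), n_speakers), -1e30)
        for k in range(n_speakers):
            x = features[labels == k]
            if len(x) < 2:
                continue  # too little data to model this cluster
            mu, var = x.mean(axis=0), x.var(axis=0) + 1e-6
            # Per-frame log-likelihood under a diagonal-covariance Gaussian.
            loglik[:, k] = -0.5 * (
                (features - mu) ** 2 / var + np.log(2 * np.pi * var)
            ).sum(axis=1)
        # Average scores over a short window so single frames do not flip speaker.
        smoothed = np.column_stack(
            [np.convolve(loglik[:, k], kernel, mode="same") for k in range(n_speakers)]
        )
        new_labels = smoothed.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # segmentation is stable
        labels = new_labels
    return labels


def labels_to_segments(labels):
    """Convert per-frame labels to (start, end, speaker) triples, end exclusive."""
    change = np.flatnonzero(np.diff(labels)) + 1
    starts = np.concatenate(([0], change))
    ends = np.concatenate((change, [len(labels)]))
    return [(int(a), int(b), int(labels[a])) for a, b in zip(starts, ends)]
```

Given a `(n_frames, n_dims)` array of feature vectors `feats`, one possible call sequence is `segs = initial_segments(len(feats), 50)`, then `labels = resegment(feats, cluster_segments(feats, segs, 2), 2)`, then `labels_to_segments(labels)`. The number of speakers is passed in here for simplicity; an unsupervised system would normally have to estimate it.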
