Unsupervised speaker segmentation of multi-speaker speech data

Abstract

Systems and methods for unsupervised segmentation of multi-speaker speech or audio data by speaker. A front-end analysis is applied to input speech data to obtain feature vectors. The speech data is initially segmented and then clustered into groups of segments that correspond to different speakers. The clusters are iteratively modeled and resegmented to obtain stable speaker segmentations. The overlap between segmentation sets is checked to ensure successful speaker segmentation. Overlapping segments are combined, then remodeled and resegmented. Optionally, the speech data is processed to produce a segmentation lattice to maximize the overall segmentation likelihood.
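The segment, cluster, model, and resegment loop described in the abstract can be illustrated with a minimal Python sketch. This is not the patented method: the abstract does not specify the front end, the clustering procedure, or the speaker models, so the sketch assumes feature vectors are already available as a NumPy array and substitutes fixed-length chunks for the initial segmentation, k-means on per-segment means for the clustering, and a single diagonal Gaussian per speaker for the models. The overlap check and the optional segmentation lattice are left out. All function and parameter names (`segment_speakers`, `seg_len`, `win`, and so on) are hypothetical.

```python
import numpy as np

def initial_segments(n_frames, seg_len):
    """Fixed-length chunks, standing in for the initial segmentation."""
    return [(s, min(s + seg_len, n_frames)) for s in range(0, n_frames, seg_len)]

def cluster_segments(feats, segments, n_speakers, n_iter=20):
    """Group segments with k-means on per-segment mean vectors (assumed method)."""
    means = np.array([feats[a:b].mean(axis=0) for a, b in segments])
    # Deterministic farthest-point initialisation of the cluster centres.
    centers = [means[0]]
    for _ in range(1, n_speakers):
        d = np.min([np.sum((means - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(means[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((means[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dist, axis=1)
        for k in range(n_speakers):
            if np.any(labels == k):
                centers[k] = means[labels == k].mean(axis=0)
    return labels

def fit_gaussians(feats, frame_labels, n_speakers):
    """One diagonal Gaussian per cluster (a simplification of speaker modeling)."""
    models = []
    for k in range(n_speakers):
        x = feats[frame_labels == k]
        if len(x) == 0:  # empty cluster: fall back to global statistics
            x = feats
        models.append((x.mean(axis=0), x.var(axis=0) + 1e-3))
    return models

def log_likelihood(feats, mean, var):
    """Per-frame log-likelihood under a diagonal Gaussian."""
    return -0.5 * (np.log(2 * np.pi * var) + (feats - mean) ** 2 / var).sum(axis=1)

def resegment(feats, models, win):
    """Relabel each frame by the model with the highest window-averaged likelihood."""
    kernel = np.ones(win) / win
    smooth = [np.convolve(log_likelihood(feats, m, v), kernel, mode="same")
              for m, v in models]
    return np.argmax(np.stack(smooth, axis=1), axis=1)

def segment_speakers(feats, n_speakers=2, seg_len=50, win=25, max_iter=20):
    """Initial segmentation, clustering, then model/resegment until labels are stable."""
    segments = initial_segments(len(feats), seg_len)
    seg_labels = cluster_segments(feats, segments, n_speakers)
    frame_labels = np.empty(len(feats), dtype=int)
    for (a, b), k in zip(segments, seg_labels):
        frame_labels[a:b] = k
    for _ in range(max_iter):
        models = fit_gaussians(feats, frame_labels, n_speakers)
        new_labels = resegment(feats, models, win)
        if np.array_equal(new_labels, frame_labels):
            break  # stable segmentation reached
        frame_labels = new_labels
    return frame_labels
```

Calling `segment_speakers(feats)` on an array of shape `(frames, dims)` returns one integer cluster label per frame; runs of equal labels form the speaker segments. The cluster indices are arbitrary, so they identify "same speaker" rather than any particular identity, and the number of speakers is a fixed input here rather than something the sketch discovers.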

Bibliographic data

Publication number: US7295970B1

Patent type:
Publication date: 2007-11-13

Original format: PDF
Applicant/Patentee: ALLEN LOUIS GORIN; ZHU LIU; SARANGARAJAN PARTHASARATHY; AARON EDWARD ROSENBERG

Application number: US20030350727
Inventors: ALLEN LOUIS GORIN; ZHU LIU; SARANGARAJAN PARTHASARATHY; AARON EDWARD ROSENBERG

Filing date: 2003-01-24
Classification: G10L19/12
Country: US
Database entry time: 2022-08-21 20:10:09
