Unsupervised speaker identification for TV news


Abstract

Cable, satellite, and broadcast television (TV) networks produce a tremendous amount of information every day. Knowing who is speaking at each point in a video would be useful. Previous research has identified speakers in TV shows and movies by recognizing faces the system was trained on in advance. News videos are more challenging because new faces appear often. Using an unsupervised clustering algorithm, this work proposes to label speakers with only the information available in the news video itself, without external information. The proposed framework segments the audio by speaker, parses closed captions to identify possible speaker names, identifies which persons are talking, performs optical character recognition on text that appears while a person speaks, and checks whether a name appears on screen during a speaker's audio segments. The framework uses face detection, face recognition, face clustering, face landmarking, natural language processing tools, parsing rules, and speaker diarization. Our results indicate 63.6% accuracy in identifying speakers in CNN news.
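The last step the abstract describes, checking whether a candidate name appears on screen during a speaker's audio segments, can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Segment` and `OcrHit` records, the `label_speakers` function, the case-insensitive substring match, and the majority-vote rule are all assumptions made for illustration, and the names in the example are placeholders. The diarization segments, OCR text, and caption-derived names are assumed to have been produced by earlier stages.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One diarized audio segment (hypothetical record, times in seconds)."""
    speaker: str  # anonymous cluster label such as "spk0"
    start: float
    end: float


@dataclass(frozen=True)
class OcrHit:
    """Text read from the screen at a given time (hypothetical record)."""
    text: str
    time: float


def label_speakers(segments, ocr_hits, candidate_names):
    """Assign each speaker cluster the candidate name that appears on
    screen most often during that cluster's segments, or None if no
    candidate name is ever seen. The voting rule is an assumption."""
    votes = {}
    for seg in segments:
        counter = votes.setdefault(seg.speaker, Counter())
        for hit in ocr_hits:
            if seg.start <= hit.time <= seg.end:
                lowered = hit.text.lower()
                for name in candidate_names:
                    # Case-insensitive substring match (illustrative only).
                    if name.lower() in lowered:
                        counter[name] += 1
    return {
        speaker: (counter.most_common(1)[0][0] if counter else None)
        for speaker, counter in votes.items()
    }


# Placeholder data standing in for diarization, OCR, and caption parsing.
segments = [
    Segment("spk0", 0.0, 10.0),
    Segment("spk1", 10.5, 20.0),
    Segment("spk0", 20.5, 30.0),
    Segment("spk2", 30.5, 35.0),
]
ocr_hits = [
    OcrHit("JANE DOE | CORRESPONDENT", 3.0),
    OcrHit("JOHN ROE, ANALYST", 12.0),
    OcrHit("JANE DOE", 25.0),
]
candidate_names = ["Jane Doe", "John Roe"]

print(label_speakers(segments, ocr_hits, candidate_names))
```

In this sketch a speaker cluster with no on-screen name during any of its segments stays unlabeled (`None`) rather than being guessed, which matches the abstract's constraint of using only information present in the video.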

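Bibliographic record

  • Author

    Woo, Daniel N.

  • Author affiliation

    The University of Alabama in Huntsville

  • Degree-granting institution: The University of Alabama in Huntsville
  • Subjects: Computer science; Computer engineering; Mass communication
  • Degree: M.S.
  • Year: 2014
  • Pagination: 69 p.
  • Total pages: 69
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: TS97-4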