Effective temporal video segmentation and content-based audio-visual video clustering.

Abstract

Tools that efficiently index, browse, and retrieve video data are needed so that extremely diverse video collections can be accessed without exhaustive searching. To achieve this goal, the first step is temporal video segmentation, and the second step is clustering the segmented video sequence according to its content. For temporal video segmentation, a novel spatial-domain approach to detecting shot changes and sub-shot changes is proposed. Using a new pixel-wise difference measurement and an inconsistency measurement of the motion vectors, the proposed spatial-domain method for shot change detection performs well even in the presence of fast camera or object movement or sudden variations in luminance. The proposed spatial-domain method for sub-shot change detection accurately and efficiently estimates camera movement by using information from extracted background images. To reduce computational complexity, a compressed-domain approach is derived by modifying the spatial-domain approach.

For video clustering, audio-visual clustering methods are proposed to classify video sequences into three categories using both audio and visual information. These categories are action scenes, dialogue scenes, and miscellaneous scenes, all of which are high-level semantic entities. First, to cluster a video sequence into action and non-action scenes, motion activity and average shot length are used for the visual classification, and the average energy of the audio sequence is used for the audio classification. Then, to cluster non-action scenes into dialogue and non-dialogue scenes, the time-constrained video clustering method proposed by Yeung and Yeo is modified and applied to the visual information, and a speaker identification and tracking (SDT) method is applied to the audio information. To improve the performance of both the clustering and the SDT system, a face recognition method is combined with the modified time-constrained video clustering method and the SDT method. As a result, the proposed video clustering method can also identify the actors and actresses in dialogue scenes by applying SDT.
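
The abstract does not specify the new pixel-wise difference measurement, so the following is only a minimal sketch of the baseline idea behind spatial-domain shot change detection: compare consecutive frames pixel by pixel and flag a shot change when the difference exceeds a threshold. The function names, the mean-absolute-difference measure, and the `threshold` value are all assumptions made for illustration and are not taken from the dissertation. A plain difference like this is sensitive to fast motion and sudden luminance changes, which are the cases the proposed method is reported to handle.

```python
from typing import List, Sequence

# A grayscale frame as rows of 0-255 luminance values (illustrative type).
Frame = Sequence[Sequence[int]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute luminance difference between two equally sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def detect_shot_changes(frames: List[Frame], threshold: float = 40.0) -> List[int]:
    """Return each index i at which frame i is judged to start a new shot.

    `threshold` is a hypothetical tuning value, not one from the dissertation.
    """
    return [
        i
        for i in range(1, len(frames))
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold
    ]


# Tiny 2x2 example: two similar dark frames followed by two bright frames.
dark = [[10, 12], [11, 10]]
dark2 = [[11, 12], [10, 10]]
bright = [[200, 210], [205, 198]]
cuts = detect_shot_changes([dark, dark2, bright, bright])
print(cuts)
```

In this toy sequence the only large frame-to-frame difference is between the second dark frame and the first bright frame, so index 2 is the only boundary flagged.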

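Record details

  • Author

    Kang, Jung Won.

  • Author affiliation

    Georgia Institute of Technology.

  • Degree-granting institution: Georgia Institute of Technology.
  • Subject: Engineering Electronics and Electrical.
  • Degree: Ph.D.
  • Year: 2003
  • Pages: 144 p.
  • Total pages: 144
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Radio electronics, telecommunications technology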