Character-Oriented Video Summarization With Visual and Textual Cues


Abstract

With the boom in content "re-creation" on social media platforms, character-oriented video summaries have become a crucial form of user-generated video content. However, manual extraction can be time-consuming and prone to a high miss rate, while traditional person search techniques may impose a heavy computational burden. At the same time, videos on social media platforms are usually accompanied by rich textual information, e.g., subtitles or bullet-screen comments, which provide a multi-view description of the videos. Thus, there is potential to leverage textual information to enhance character-oriented video summarization. To that end, in this paper, we propose a novel framework for jointly modeling visual and textual information. Specifically, we first locate characters indiscriminately through detection methods, and then identify these characters via re-identification to extract potential key-frames, in which an appropriate source of textual information is automatically selected and integrated based on the features of the specific frame. Finally, the key-frames are aggregated into the character-oriented summary. Experiments on real-world data sets validate that our solution outperforms several state-of-the-art baselines on both person search and summarization tasks, which demonstrates the effectiveness of our solution for the character-oriented video summarization problem.
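The stages named in the abstract (detect, re-identify, select a textual cue, aggregate key-frames) can be illustrated with a toy sketch. This is not the authors' model: the `Frame` structure, the cosine-similarity matching, the "subtitle first, otherwise bullet-screen comments" rule, the additive text bonus, and all thresholds are assumptions made up for illustration only.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Frame:
    """Hypothetical per-frame record; a real system would fill this from a
    person detector and a re-identification network."""
    index: int
    detections: list                 # one embedding (list of floats) per detected person
    subtitle: str = ""
    comments: list = field(default_factory=list)  # bullet-screen comments


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def text_bonus(frame, name, weight=0.2):
    """Assumed selection rule: use the subtitle when the frame has one,
    otherwise fall back to bullet-screen comments."""
    source = [frame.subtitle] if frame.subtitle else frame.comments
    return weight if any(name.lower() in t.lower() for t in source) else 0.0


def summarize(frames, query, name, threshold=0.8, max_gap=1):
    """Return (start, end) frame-index segments where the target character appears."""
    keyframes = []
    for f in frames:
        # Best visual match among all people detected in this frame.
        visual = max((cosine(d, query) for d in f.detections), default=0.0)
        if visual + text_bonus(f, name) >= threshold:
            keyframes.append(f.index)

    # Aggregate key-frames that are at most max_gap apart into segments.
    segments = []
    for i in keyframes:
        if segments and i - segments[-1][1] <= max_gap:
            segments[-1][1] = i
        else:
            segments.append([i, i])
    return [tuple(s) for s in segments]
```

In this sketch a frame with a weak visual match can still be kept when the chosen text source mentions the character, which is one simple way to read "leveraging textual information"; the paper's actual integration mechanism is not described in the abstract.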

Bibliographic record

  • Source
    IEEE Transactions on Multimedia | 2020, Issue 10 | pp. 2684-2697 | 14 pages
  • Author affiliations

    Univ Sci & Technol China Sch Comp Sci Anhui Prov Key Lab Big Data Anal & Applicat Hefei 230026 Peoples R China;

    Univ Sci & Technol China Sch Comp Sci Anhui Prov Key Lab Big Data Anal & Applicat Hefei 230026 Peoples R China;

    Univ Sci & Technol China Sch Comp Sci Anhui Prov Key Lab Big Data Anal & Applicat Hefei 230026 Peoples R China;

    Univ Sci & Technol China Sch Comp Sci Anhui Prov Key Lab Big Data Anal & Applicat Hefei 230026 Peoples R China;

    Univ Sci & Technol China Sch Comp Sci Anhui Prov Key Lab Big Data Anal & Applicat Hefei 230026 Peoples R China;

    Univ Sci & Technol China Sch Comp Sci Anhui Prov Key Lab Big Data Anal & Applicat Hefei 230026 Peoples R China;

    Kingsoft AI Lab Beijing 100085 Peoples R China;

  • Indexing information
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Character-oriented video summarization; person search; natural language processing;

