Character-Oriented Video Summarization With Visual and Textual Cues

Zhou Peilun; Xu Tong; Yin Zhizhuo; Liu Dong; Chen Enhong; Lv Guangyi; Li Changliang

Abstract

With the boom in content "re-creation" on social media platforms, the character-oriented video summary has become a crucial form of user-generated video content. However, manual extraction can be time-consuming and suffer from a high miss rate, while traditional person-search techniques may impose a heavy computational burden. At the same time, videos on social media platforms are usually accompanied by rich textual information, e.g., subtitles or bullet-screen comments, which provides a multi-view description of the videos. Thus, there is potential to leverage textual information to enhance character-oriented video summarization. To that end, in this paper, we propose a novel framework for jointly modeling visual and textual information. Specifically, we first locate characters indiscriminately through detection methods, and then identify these characters via re-identification to extract potential key-frames, in which an appropriate source of textual information is automatically selected and integrated based on the features of the specific frame. Finally, the key-frames are aggregated into the character-oriented summarization. Experiments on real-world data sets validate that our solution outperforms several state-of-the-art baselines on both person search and summarization tasks, which demonstrates the effectiveness of our solution for the character-oriented video summarization problem.
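The detect, re-identify, and aggregate steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Detection` structure, the cosine-similarity matching rule, the `threshold` and `max_gap` parameters, and all function names are assumptions introduced here for illustration. The textual-fusion step is omitted because the abstract does not specify how the text source is selected or integrated.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence, Tuple


@dataclass
class Detection:
    """One detected person in one frame (hypothetical structure)."""
    frame_index: int
    embedding: Sequence[float]  # appearance feature, assumed to come from a re-identification model


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_keyframes(detections: List[Detection],
                     query: Sequence[float],
                     threshold: float = 0.8) -> List[int]:
    """Return sorted frame indices containing a detection that matches the query character.

    Assumption: a detection "matches" when its similarity to the query
    embedding reaches the threshold.
    """
    frames = {d.frame_index for d in detections
              if cosine_similarity(d.embedding, query) >= threshold}
    return sorted(frames)


def aggregate_segments(keyframes: List[int], max_gap: int = 1) -> List[Tuple[int, int]]:
    """Merge sorted key-frame indices into (start, end) segments.

    Frames whose indices differ by at most max_gap join the same segment.
    """
    segments: List[Tuple[int, int]] = []
    for frame in keyframes:
        if segments and frame - segments[-1][1] <= max_gap:
            segments[-1] = (segments[-1][0], frame)
        else:
            segments.append((frame, frame))
    return segments


if __name__ == "__main__":
    detections = [
        Detection(0, [1.0, 0.0]),
        Detection(1, [0.9, 0.1]),
        Detection(2, [0.0, 1.0]),   # a different character
        Detection(5, [1.0, 0.05]),
        Detection(6, [0.95, 0.0]),
    ]
    keyframes = select_keyframes(detections, query=[1.0, 0.0])
    print(aggregate_segments(keyframes))
```

In this toy example the frame at index 2 is rejected as a different character, so the remaining key-frames fall into two separate segments. A real system would replace the hand-written embeddings with features from trained detection and re-identification models.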

Bibliographic details

Source: IEEE Transactions on Multimedia, 2020, Issue 10, pp. 2684-2697 (14 pages)
Authors: Zhou Peilun; Xu Tong; Yin Zhizhuo; Liu Dong; Chen Enhong; Lv Guangyi; Li Changliang
Author affiliations:

Univ Sci & Technol China Sch Comp Sci Anhui Prov Key Lab Big Data Anal & Applicat Hefei 230026 Peoples R China;

Univ Sci & Technol China Sch Comp Sci Anhui Prov Key Lab Big Data Anal & Applicat Hefei 230026 Peoples R China;

Univ Sci & Technol China Sch Comp Sci Anhui Prov Key Lab Big Data Anal & Applicat Hefei 230026 Peoples R China;

Univ Sci & Technol China Sch Comp Sci Anhui Prov Key Lab Big Data Anal & Applicat Hefei 230026 Peoples R China;

Univ Sci & Technol China Sch Comp Sci Anhui Prov Key Lab Big Data Anal & Applicat Hefei 230026 Peoples R China;

Univ Sci & Technol China Sch Comp Sci Anhui Prov Key Lab Big Data Anal & Applicat Hefei 230026 Peoples R China;

Kingsoft AI Lab Beijing 100085 Peoples R China;

Original format: PDF
Language: English
Keywords: Character-oriented video summarization; person search; natural language processing

Similar literature

1. Video summarization using textual descriptions for authoring video blogs [J]. Otani Mayu, Nakashima Yuta, Sato Tomokazu. Multimedia Tools and Applications, 2017, Issue 9
2. Multimodal Saliency and Fusion for Movie Summarization Based on Aural, Visual, and Textual Attention [J]. Evangelopoulos G., Zlatintsi A., Potamianos A. IEEE Transactions on Multimedia, 2013, Issue 7
3. EmotionCues: Emotion-Oriented Visual Summarization of Classroom Videos [J]. Zeng Haipeng, Shu Xinhuan, Wang Yanbang. IEEE Transactions on Visualization and Computer Graphics, 2021, Issue 7
4. Matching Faces With Textual Cues in Soccer Videos [C]. Marco Bertini, Alberto Del Bimbo, Walter Nunziati. International Conference on Multimedia and Expo, 2006
5. Indexing and browsing unstructured videos using visual, audio, textual, and facial cues [D]. Haubold, Alexander. 2008
6. Visual saliency models for summarization of diagnostic hysteroscopy videos in healthcare systems [O]. Khan Muhammad, Jamil Ahmad, Muhammad Sajjad
7. Video Co-summarization: Video Summarization by Visual Co-occurrence [O]. Wen-sheng Chu, Yale Song, Alejandro Jaimes. 2015
