...
Machine translation: Video summarization guided by visual and textual cues
Univ Sci & Technol China Sch Comp Sci Anhui Prov Key Lab Big Data Anal & Applicat Hefei 230026 Peoples R China;
Kingsoft AI Lab Beijing 100085 Peoples R China;
Character-oriented video summarization; person search; natural language processing;
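Machine translation: Video summarization for authoring video blogs using textual descriptions
Machine translation: Multimodal saliency and fusion for movie summarization based on aural, visual, and textual attention
Machine translation: Emotion: emotion-oriented visual summarization of classrooms
Machine translation: Matching faces with textual cues in soccer videos
Machine translation: Indexing and browsing unstructured videos using visual, audio, textual, and facial cues
Machine translation: Visual saliency models for summarizing diagnostic hysteroscopy videos in healthcare systems
Machine translation: Video co-summarization: video summarization by visual co-occurrence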