Journal: IEEE Transactions on Image Processing

Ranking Highlights in Personal Videos by Analyzing Edited Videos

Abstract

We present a fully automatic system for ranking domain-specific highlights in unconstrained personal videos by analyzing online edited videos. A novel latent linear ranking model is proposed to handle noisy training data harvested online. Specifically, given a targeted domain such as “surfing,” our system mines the YouTube database to find pairs of raw videos and their corresponding edited versions. Leveraging the assumption that an edited video is more likely to contain highlights than the trimmed parts of the raw video, we obtain pair-wise ranking constraints to train our model. The learning task is challenging due to the amount of noise and variation in the mined data. Hence, a latent loss function is incorporated to mitigate the issues caused by the noise. We efficiently learn the latent model on a large number of videos (about 870 min in total) using a novel EM-like procedure. Our latent ranking model outperforms its classification counterpart and is fairly competitive compared with a fully supervised ranking system that requires labels from Amazon Mechanical Turk. We further show that a state-of-the-art audio feature, mel-frequency cepstral coefficients (MFCCs), is inferior to a state-of-the-art visual feature. By combining audio and visual features, we obtain the best performance in the dog activity, surfing, skating, and viral video domains. Finally, we show that impressive highlights can be detected in unconstrained personal videos without additional human supervision for seven domains (i.e., skating, surfing, skiing, gymnastics, parkour, dog activity, and viral video).
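The abstract gives no formulas, so the following is only a rough sketch of how pair-wise ranking constraints with a latent selection step might be trained. It is not the authors' model. It assumes a pairwise hinge loss, a fixed margin, plain subgradient updates, and a latent step that picks the highest-scoring kept segment in each edited video. The function names (`score`, `train_latent_ranker`, `rank_segments`), all hyperparameters, and the two-dimensional toy features are hypothetical.

```python
import random

def score(w, x):
    """Linear highlight score w . x for one segment feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train_latent_ranker(pairs, dim, epochs=50, lr=0.1, margin=1.0, reg=1e-3, seed=0):
    """Alternate a latent selection step with pairwise hinge-loss updates.

    pairs: list of (kept_segments, trimmed_segments); each is a list of
    feature vectors. Kept segments come from the edited video and may be noisy.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    for _ in range(epochs):
        # Latent step (assumption): trust only the kept segment that the
        # current model scores highest, so noisy kept segments are ignored.
        chosen = [max(kept, key=lambda x: score(w, x)) for kept, _ in pairs]
        order = list(range(len(pairs)))
        rng.shuffle(order)
        for i in order:
            pos = chosen[i]
            # L2 shrinkage, applied once per video pair.
            w = [wi * (1.0 - lr * reg) for wi in w]
            for neg in pairs[i][1]:
                # Constraint: score(pos) should exceed score(neg) by the margin.
                if score(w, pos) - score(w, neg) < margin:
                    w = [wi + lr * (p - n) for wi, p, n in zip(w, pos, neg)]
    return w

def rank_segments(w, segments):
    """Return segment indices ordered from highest to lowest score."""
    return sorted(range(len(segments)), key=lambda i: score(w, segments[i]), reverse=True)

# Toy data (made up): feature 0 is high for highlight-like segments.
pairs = [
    ([[0.9, 0.2], [0.1, 0.3]], [[0.2, 0.4], [0.1, 0.1]]),  # second kept segment is noise
    ([[0.8, 0.5]], [[0.3, 0.6]]),
]
w = train_latent_ranker(pairs, dim=2)
ranking = rank_segments(w, [[0.1, 0.1], [0.9, 0.2], [0.2, 0.4]])
```

In this sketch the latent step is what lets a noisy kept segment, such as the second one in the first toy pair, drop out of the constraints once the model starts scoring it low.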
