
Refocused Attention: Long Short-Term Rewards Guided Video Captioning


Abstract

The adaptive cooperation of the visual model and the language model is essential for video captioning. However, due to the lack of proper guidance at each time step in end-to-end training, over-dependence on the language model often invalidates the attention-based visual model, a problem we call 'Attention Defocus' in this paper. Based on the important observation that the recognition precision of entity words can reflect the effectiveness of the visual model, we propose a novel strategy called refocused attention to optimize the training and cooperation of the visual model and the language model, using targeted guidance at the appropriate time steps. The strategy consists of a short-term-reward-guided local entity recognition and a long-term-reward-guided global relation understanding, neither of which requires any external training data. Moreover, a framework with hierarchical visual representations and hierarchical attention is established to fully exploit the potential of the proposed learning strategy. Extensive experiments demonstrate that the guidance strategy, together with the optimized structure, outperforms state-of-the-art video captioning methods, with relative improvements of 7.7% in BLEU-4 and 5.0% in CIDEr-D on the MSVD dataset, even without multi-modal features.
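The abstract describes combining a per-step (short-term) entity-recognition reward with a sequence-level (long-term) relation-understanding reward to guide caption training. The sketch below is a minimal illustration of one way such rewards could be blended into per-step returns for a policy-gradient caption trainer; the function name, the discounting rule, and all parameters are illustrative assumptions, not the paper's actual formulation.

```python
from typing import List

def refocus_rewards(step_entity_rewards: List[float],
                    sequence_reward: float,
                    gamma: float = 0.9) -> List[float]:
    """Blend short-term per-step entity rewards with a discounted
    long-term sequence-level reward (e.g. a CIDEr-style score).
    Illustrative sketch only; the blending rule is an assumption."""
    T = len(step_entity_rewards)
    returns = []
    for t in range(T):
        # Long-term reward, discounted back from the end of the caption,
        # so later steps receive more of the sequence-level credit.
        long_term = (gamma ** (T - 1 - t)) * sequence_reward
        # Short-term entity reward is credited directly at its own step.
        returns.append(step_entity_rewards[t] + long_term)
    return returns

# Example: a 3-word caption where only the second word is an entity word.
print(refocus_rewards([0.0, 1.0, 0.0], sequence_reward=0.5))
```

In a REINFORCE-style setup, each blended return would weight the log-probability of the word sampled at that step, so entity words receive immediate credit while the whole caption is still pulled toward a high sequence-level score.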

Bibliographic Details

  • Source
    Neural Processing Letters | 2020, Issue 2 | pp. 935-948 | 14 pages
  • Author Affiliations

    Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China;

    Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China;

    Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China;

    Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China;

  • Indexing Information
  • Format: PDF
  • Language: English
  • CLC Classification
  • Keywords

    Video captioning; Hierarchical attention; Reinforcement learning; Reward;


