Venue: Annual Meeting of the Association for Computational Linguistics

Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog


Abstract

This paper presents the Recurrent Dual Attention Network (ReDAN), a new model for visual dialog that uses multi-step reasoning to answer a series of questions about an image. In each question-answering turn of a dialog, ReDAN infers the answer progressively through multiple reasoning steps. In each step, the semantic representation of the question is updated based on the image and the previous dialog history, and the recurrently refined representation is used for further reasoning in the subsequent step. On the VisDial v1.0 dataset, the proposed ReDAN model achieves a new state-of-the-art NDCG score of 64.47%. Visualization of the reasoning process further demonstrates that ReDAN can locate context-relevant visual and textual clues via iterative refinement, leading step by step to the correct answer.
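The iterative refinement described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the attention form or the update rule, so the dot-product attention, the averaging update, and every name below (`attend`, `multi_step_reasoning`, `num_steps`) are assumptions chosen only to show the general shape of the idea, in which a question vector is repeatedly refined by attending to image regions and dialog history.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, items):
    """Dot-product attention (an assumption, not the paper's form).

    Returns the attention weights and the weighted sum of the items.
    """
    weights = softmax([dot(query, item) for item in items])
    dim = len(items[0])
    context = [
        sum(w * item[i] for w, item in zip(weights, items))
        for i in range(dim)
    ]
    return weights, context


def multi_step_reasoning(question, image_regions, history_turns, num_steps=3):
    """Refine a question vector over several reasoning steps.

    Each step attends to the image, then to the dialog history, and
    folds both contexts back into the query. The averaging update is a
    hypothetical stand-in for whatever learned update the model uses.
    """
    query = list(question)
    trace = []
    for _ in range(num_steps):
        img_weights, img_ctx = attend(query, image_regions)
        # Condition the history attention on the query plus the attended image.
        hist_query = [q + v for q, v in zip(query, img_ctx)]
        hist_weights, hist_ctx = attend(hist_query, history_turns)
        # Hypothetical update: average the query with both contexts.
        query = [(q + v + h) / 3.0 for q, v, h in zip(query, img_ctx, hist_ctx)]
        trace.append((img_weights, hist_weights))
    return query, trace


# Toy 2-dimensional features, invented purely for illustration.
question = [1.0, 0.0]
image_regions = [[1.0, 0.0], [0.0, 1.0]]
history_turns = [[0.5, 0.5], [1.0, 0.0]]
final_query, trace = multi_step_reasoning(
    question, image_regions, history_turns, num_steps=2
)
```

The `trace` list holds the per-step attention weights over image regions and history turns, which is the kind of information a step-by-step visualization of the reasoning process would plot.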
