Memory Augmented Deep Recurrent Neural Network for Video Question Answering

Yin Chengxiang; Tang Jian; Xu Zhiyuan; Wang Yanzhi



Abstract

Video question answering (VideoQA) is an important but challenging multimedia task in which a system automatically analyzes questions and videos and generates accurate answers. However, research on VideoQA is still in its infancy. In this article, we propose a novel memory augmented deep recurrent neural network (MA-DRNN) model for VideoQA, which features a new method for encoding videos and questions, and memory augmentation using the emerging differentiable neural computer (DNC). Specifically, we encode textual (question) information before visual (video) information, which leads to better visual-textual representations. Moreover, we leverage the DNC (with an external memory) for storing and retrieving useful information in questions and videos, and for modeling the long-term visual-textual dependence. To evaluate the proposed model, we conducted extensive experiments using the VTW data set and the MSVD-QA data set, both of which are widely used large-scale video data sets for language-level understanding. The experimental results validate the proposed model and show that it outperforms the state of the art in terms of various accuracy-related metrics.
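
The two ideas named in the abstract, feeding question information through the recurrent encoder before video information and reading from an external memory by content, can be illustrated with a toy sketch. This is not the authors' MA-DRNN or a full DNC: the plain tanh recurrence, the shared input dimension for words and frames, the write-every-state memory, and all function names below are assumptions made only for illustration.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rnn_step(W_x, W_h, x, h):
    """One Elman-style recurrent update: h' = tanh(W_x x + W_h h)."""
    a = matvec(W_x, x)
    b = matvec(W_h, h)
    return [math.tanh(ai + bi) for ai, bi in zip(a, b)]

def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)

def content_read(memory, key, strength=5.0):
    """Content-based read: softmax over cosine similarities, then a weighted sum of rows."""
    scores = [strength * cosine(row, key) for row in memory]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    width = len(memory[0])
    read = [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]
    return read, weights

def encode_question_then_video(question, frames, W_x, W_h, hidden):
    """Feed question vectors first, then frame vectors, through one recurrent state.

    Assumption: every hidden state is stored as one memory row. A real DNC
    learns where and what to write.
    """
    h = [0.0] * hidden
    memory = []
    for x in list(question) + list(frames):  # text before video
        h = rnn_step(W_x, W_h, x, h)
        memory.append(h)
    return h, memory

def random_matrix(rows, cols, rng):
    """Small random weight matrix (untrained, for illustration only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
dim, hidden = 4, 3
W_x = random_matrix(hidden, dim, rng)
W_h = random_matrix(hidden, hidden, rng)
question = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(2)]  # toy word vectors
frames = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(3)]    # toy frame features

final_state, memory = encode_question_then_video(question, frames, W_x, W_h, hidden)
read_vector, read_weights = content_read(memory, key=final_state)
```

Because the question is consumed first, every state computed while reading the frames already depends on the question, which is one way to read the abstract's claim about better visual-textual representations. The content-based read then returns a similarity-weighted mixture of the stored states, a simplified stand-in for how an external memory can bring back information from earlier steps.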

Bibliographic details

Source
Neural Networks and Learning Systems, IEEE Transactions on | 2020, Issue 9 | pp. 3159-3167 | 9 pages
Authors
Yin Chengxiang; Tang Jian; Xu Zhiyuan; Wang Yanzhi
Author affiliations

Syracuse Univ Dept Elect Engn & Comp Sci Syracuse NY 13244 USA;

Syracuse Univ Dept Elect Engn & Comp Sci Syracuse NY 13244 USA|DiDi AI Labs Beijing 100193 Peoples R China;

Syracuse Univ Dept Elect Engn & Comp Sci Syracuse NY 13244 USA;

Northeastern Univ Dept Elect & Engn Boston MA 02115 USA;

Original format: PDF
Language: English
Keywords
Task analysis; Knowledge discovery; Computational modeling; Recurrent neural networks; Data models; Semantics; Deep learning; differentiable neural computer (DNC); memory augmented neural network; recurrent neural network (RNN); video question answering (VideoQA);

Similar literature

1. Memory-Augmented Neural Networks on FPGA for Real-Time and Energy-Efficient Question Answering [J]. Seongsik Park, Jaehee Jang, Seijoon Kim. Very Large Scale Integration (VLSI) Systems, IEEE Transactions on. 2021, Issue 1
2. Long-Term Video Question Answering via Multimodal Hierarchical Memory Attentive Networks [J]. Yu Ting, Yu Jun, Yu Zhou. IEEE Transactions on Circuits and Systems for Video Technology. 2021, Issue 3
3. Frame Augmented Alternating Attention Network for Video Question Answering [J]. Zhang Wenqiao, Tang Siliang, Cao Yanpeng. IEEE Transactions on Multimedia. 2020, Issue 4
4. Deep Neural Network-Based Models for Ranking Question-Answering Pairs in Community Question Answering Systems [C]. Van-Tu Nguyen, Anh-Cuong Le. International Symposium on Integrated Uncertainty in Knowledge Modelling and Decision Making. 2018
5. Neural Question Answering Models with Broader Knowledge Scope and Deeper Reasoning Power [D]. Xiong, Wenhan. 2021
6. Optogenetics inspired transition metal dichalcogenide neuristors for in-memory deep recurrent neural networks [O]. Rohit Abraham John, Jyotibdha Acharya, Chao Zhu
7. Visual Question Answering with Memory-Augmented Networks [O]. Chao Ma, Chunhua Shen, Anthony Dick. 2018
8. Natural Language Video Description using Deep Recurrent Neural Networks [R]. Venugopalan, S. 2015
