IEEE Transactions on Neural Networks and Learning Systems

Global and Local Knowledge-Aware Attention Network for Action Recognition



Abstract

Convolutional neural networks (CNNs) have proven effective for learning spatiotemporal representations for action recognition in videos. However, most traditional action recognition algorithms do not employ an attention mechanism to focus on the parts of video frames that are relevant to the action. In this article, we propose a novel global and local knowledge-aware attention network to address this challenge for action recognition. The proposed network incorporates two types of attention mechanism, statistic-based attention (SA) and learning-based attention (LA), to attach higher importance to the crucial elements in each video frame. Since global pooling (GP) models capture global information while attention models focus on significant details, our network adopts a three-stream architecture, comprising two attention streams and a GP stream, to make full use of their implicit complementary advantages. Each attention stream employs a fusion layer to combine global and local information and produce composite features. Furthermore, global-attention (GA) regularization is proposed to guide the two attention streams to better model the dynamics of composite features with reference to the global information. Fusion at the softmax layer is adopted to better exploit the implicit complementary advantages among the SA, LA, and GP streams and to obtain the final comprehensive predictions. The proposed network is trained in an end-to-end fashion and learns efficient video-level features both spatially and temporally. Extensive experiments are conducted on three challenging benchmarks, Kinetics, HMDB51, and UCF101, and the results demonstrate that the proposed network outperforms most state-of-the-art methods.
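To make the architecture described above concrete, the following is a minimal NumPy sketch of the three-stream idea: one stream pools frame features globally (GP), one weights frames by a simple statistic (standing in for SA), and one weights frames with a learned projection (standing in for LA); the streams' class probabilities are then averaged at the softmax level. All dimensions, the choice of statistic, and the random weights are illustrative placeholders, not the authors' actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical frame-level features: T frames, D-dim each, C classes
rng = np.random.default_rng(0)
T, D, C = 8, 16, 5
frames = rng.normal(size=(T, D))

# Global pooling (GP) stream: plain average over frames
gp_feat = frames.mean(axis=0)

# Statistic-based attention (SA) stand-in: weight frames by the L2 norm
# of each frame feature (an illustrative statistic, not the paper's)
sa_w = softmax(np.linalg.norm(frames, axis=1))
sa_feat = sa_w @ frames

# Learning-based attention (LA) stand-in: weights from a projection
# vector (a random vector here substitutes for trained parameters)
la_proj = rng.normal(size=D)
la_w = softmax(frames @ la_proj)
la_feat = la_w @ frames

# Each attention stream fuses local (attended) and global information
sa_fused = np.concatenate([sa_feat, gp_feat])
la_fused = np.concatenate([la_feat, gp_feat])

# Per-stream linear classifiers (random weights as placeholders)
w_sa = rng.normal(size=(2 * D, C))
w_la = rng.normal(size=(2 * D, C))
w_gp = rng.normal(size=(D, C))

# Softmax-level fusion: average the three streams' class probabilities
probs = (softmax(sa_fused @ w_sa) + softmax(la_fused @ w_la)
         + softmax(gp_feat @ w_gp)) / 3.0
```

The fused `probs` vector is a valid distribution over classes, illustrating why softmax-level fusion lets each stream contribute its own confident predictions without any extra normalization step.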
