International Conference on Information Fusion

Multimodal Fusion with Co-attention Mechanism


Abstract

Because information from different modalities complements each other when describing the same content, multimodal information can be used to obtain better feature representations. How to represent and fuse this relevant information has therefore become a current research topic. Most existing feature fusion methods consider different levels of feature representation but ignore the significant relevance between local regions, especially in high-level semantic representations. In this paper, a general multimodal fusion method based on the co-attention mechanism is proposed, with a structure similar to the transformer. We address two main issues: (1) improving the applicability and generality of the transformer to data from different modalities; and (2) making the method more robust by capturing and transmitting the relevant information between local features before fusion. We evaluate our model on a multimodal classification task, and the experiments demonstrate that it learns fused feature representations effectively.
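To illustrate the idea of co-attention between two modalities, the following is a minimal NumPy sketch, not the paper's actual implementation: each modality's local features attend over the other modality's local features via scaled dot-product cross-attention, and the attended results are combined before fusion. All function names, dimensions, and the residual-plus-concatenation fusion step are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys/values."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)   # (n_q, n_k) relevance scores
    weights = softmax(scores, axis=-1)       # each row is a distribution over keys
    return weights @ values                  # (n_q, d) attended summaries

def co_attention(feats_a, feats_b):
    """Co-attention sketch: each modality attends over the other's local
    features, then the attended features are fused by concatenation."""
    a_attended = cross_attention(feats_a, feats_b, feats_b)  # A guided by B
    b_attended = cross_attention(feats_b, feats_a, feats_a)  # B guided by A
    # Residual connection per modality, then stack local features for fusion.
    return np.concatenate([feats_a + a_attended,
                           feats_b + b_attended], axis=0)

rng = np.random.default_rng(0)
img_feats = rng.normal(size=(5, 16))   # e.g. 5 image-region features
txt_feats = rng.normal(size=(7, 16))   # e.g. 7 text-token features
fused = co_attention(img_feats, txt_feats)
print(fused.shape)  # (12, 16)
```

The key point the abstract makes is visible here: relevance between local regions of the two modalities is captured (via the cross-attention weights) and transmitted into the features *before* any final fusion step, rather than fusing pooled global representations directly.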
