Multi-level similarity learning for image-text retrieval

Wen-Hui Li; Song Yang; Yan Wang; Dan Song; Xuan-Ya Li


Abstract

Image-text retrieval has been a popular research topic and attracts growing interest because it bridges the computer vision and natural language processing communities and involves two different modalities. Although many methods have made great progress on the image-text task, it remains challenging because of the difficulty of learning the correspondence between two heterogeneous modalities. In this paper, we propose a multi-level representation learning method for the image-text retrieval task, which utilizes semantic-level, structural-level and contextual-level information to improve the quality of the visual and textual representations. To utilize semantic-level information, we first extract the high-frequency nouns, adjectives and numbers as semantic labels and adopt a multi-label convolutional neural network framework to encode the semantic-level information. To explore the structural-level information of an image-text pair, we first construct two graphs that encode the visual and textual information of the corresponding modality, and then apply graph matching with a triplet loss to reduce the cross-modality discrepancy. To further improve the retrieval results, we utilize contextual-level information from the two modalities to refine the rank list and enhance the retrieval quality. Extensive experiments on Flickr30k and MSCOCO, two commonly used datasets for image-text retrieval, demonstrate the superiority of our proposed method.
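The abstract mentions a triplet loss for reducing the cross-modality discrepancy but gives no formula on this page. The following is a minimal sketch of one common formulation, a bidirectional hinge-based triplet ranking loss over cosine similarities, and is not necessarily the one the authors use. The function names, the margin of 0.2, and the assumption that `image_embs[i]` is paired with `text_embs[i]` are illustrative choices, not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_ranking_loss(image_embs, text_embs, margin=0.2):
    """Bidirectional sum-of-hinges triplet loss (hypothetical helper).

    Assumes image_embs[i] and text_embs[i] form a matching pair and that
    every other pairing in the batch is a negative. The margin is an
    assumed hyperparameter.
    """
    n = len(image_embs)
    # sim[i][j] = similarity between image i and text j
    sim = [[cosine(image_embs[i], text_embs[j]) for j in range(n)]
           for i in range(n)]
    loss = 0.0
    for i in range(n):
        positive = sim[i][i]
        for j in range(n):
            if j == i:
                continue
            # image i should score higher with its own text than with text j
            loss += max(0.0, margin - positive + sim[i][j])
            # text i should score higher with its own image than with image j
            loss += max(0.0, margin - positive + sim[j][i])
    return loss / n


# Matching pairs that are already well separated incur no loss.
aligned = triplet_ranking_loss([[1.0, 0.0], [0.0, 1.0]],
                               [[1.0, 0.0], [0.0, 1.0]])
# Swapping the texts makes every negative outrank its positive.
swapped = triplet_ranking_loss([[1.0, 0.0], [0.0, 1.0]],
                               [[0.0, 1.0], [1.0, 0.0]])
```

The loss is zero once each matching pair beats every mismatched pair by at least the margin, so minimizing it pulls matched image and text embeddings together and pushes mismatched ones apart.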


Bibliographic details

Source
Information Processing & Management | 2021, Issue 1 | pp. 102432.1-102432.12 | 12 pages
Authors
Wen-Hui Li; Song Yang; Yan Wang; Dan Song; Xuan-Ya Li
Author affiliations

School of Electrical and Information Engineering Tianjin University Tianjin 300072 China;

School of Microelectronics Tianjin University Tianjin 300072 China;

School of Electrical and Information Engineering Tianjin University Tianjin 300072 China;

School of Electrical and Information Engineering Tianjin University Tianjin 300072 China;

Baidu Inc. Beijing China;

Original format: PDF
Language: English
Keywords
Cross modal retrieval; Semantic extraction; Graph matching;


Similar literature

1. Joint Image-Text Hashing for Fast Large-Scale Cross-Media Retrieval Using Self-Supervised Deep Learning [J]. Gengshen Wu, Jungong Han, Zijia Lin. IEEE Transactions on Industrial Electronics. 2019, Issue 12
2. A multi-level similarity measure for the retrieval of the common CT imaging signs of lung diseases [J]. Medical and Biological Engineering and Computing: Journal of the International Federation for Medical and Biological Engineering. 2020, Issue 5
3. A multi-level matching method with hybrid similarity for document retrieval [J]. Haijun Zhang, Tommy W.S. Chow. Expert Systems with Applications. 2012, Issue 3
4. Compositional Learning of Image-Text Query for Image Retrieval [C]. Muhammad Umer Anwaar, Egor Labintcev, Martin Kleinsteuber. IEEE Winter Conference on Applications of Computer Vision. 2021
5. Adaptive image retrieval system: Similarity modeling, learning, fusion, and visualization [D]. Doloc-Mihu, Anca. 2007
6. A Novel Similarity Learning Method via Relative Comparison for Content-Based Medical Image Retrieval [O]. Wei Huang, Peng Zhang, Min Wan. 2013
7. Review of Recent Deep Learning Based Methods for Image-Text Retrieval [O]. Jianan Chen, Lu Zhang, Cong Bai. 2020
