MSRC: multimodal spatial regression with semantic context for phrase grounding

Kan Chen; Rama Kovvuri; Jiyang Gao; Ram Nevatia

Abstract

Given a textual description of an image, phrase grounding localizes the objects in the image that are referred to by query phrases in the description. State-of-the-art methods treat phrase grounding as a ranking problem and address it by retrieving a set of proposals according to the query's semantics. These methods are limited by the performance of independent proposal generation systems and ignore useful cues from the context in the description. In this paper, we propose a novel multimodal spatial regression with semantic context (MSRC) system, which not only predicts the location of the ground truth based on proposal bounding boxes, but also refines the prediction results by penalizing similarities between different queries from the same sentence. MSRC has two advantages. First, it sidesteps the performance upper bound imposed by independent proposal generation systems by adopting a regression mechanism. Second, MSRC not only encodes the semantics of a query phrase, but also considers its relation to context (i.e., other queries from the same sentence) via a context refinement network. Experiments show that the MSRC system achieves a significant improvement in accuracy on two popular datasets, Flickr30K Entities and Refer-it Game, with increases of 6.64% and 5.28% over the state of the art, respectively.
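
The two ideas in the abstract, regressing a box from a proposal and penalizing similar predictions for different queries of one sentence, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the parameterization or the loss, so the center/size offset encoding and the use of pairwise IoU as the "similarity" are assumptions made for illustration, and the names apply_offsets, iou, and context_penalty are hypothetical.

```python
import math
from itertools import combinations


def apply_offsets(proposal, offsets):
    """Refine a proposal box (x1, y1, x2, y2) with offsets (dx, dy, dw, dh).

    ASSUMPTION: center shifts are relative to box size and size changes are
    log-scale; the abstract does not specify the encoding MSRC uses.
    """
    x1, y1, x2, y2 = proposal
    dx, dy, dw, dh = offsets
    w, h = x2 - x1, y2 - y1
    cx = x1 + 0.5 * w + dx * w
    cy = y1 + 0.5 * h + dy * h
    w, h = w * math.exp(dw), h * math.exp(dh)
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def context_penalty(boxes):
    """Sum of pairwise IoU over boxes predicted for queries of one sentence.

    ASSUMPTION: overlap stands in for the "similarity" that is penalized;
    the paper's context refinement network may define it differently.
    """
    return sum(iou(a, b) for a, b in combinations(boxes, 2))


# Two queries from the same sentence, each refined from its own proposal.
box_a = apply_offsets((0, 0, 10, 10), (0.1, 0.0, 0.0, 0.0))
box_b = apply_offsets((5, 0, 15, 10), (0.0, 0.0, 0.0, 0.0))
penalty = context_penalty([box_a, box_b])
```

Because the output box is computed from continuous offsets, it is not restricted to the set of proposal boxes, which is the sense in which regression avoids the proposal generator's upper bound. The penalty grows when two different phrases are grounded to heavily overlapping regions.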

Bibliographic record

Source
International Journal of Multimedia Information Retrieval | 2018, Issue 1 | 12 pages
Authors
Kan Chen; Rama Kovvuri; Jiyang Gao; Ram Nevatia
Author affiliations

Institute for Robotics and Intelligent Systems, University of Southern California (all four authors)

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Library science; librarianship
Keywords
Phrase grounding; Spatial regression; Multimodal; Context

