Journal: International Journal of Multimedia Information Retrieval

MSRC: multimodal spatial regression with semantic context for phrase grounding

Abstract

Given a textual description of an image, phrase grounding localizes the objects in the image that are referred to by query phrases in the description. State-of-the-art methods treat phrase grounding as a ranking problem and address it by retrieving a set of proposals according to the query's semantics. These methods are limited by the performance of independent proposal generation systems and ignore useful cues from the context in the description. In this paper, we propose a novel multimodal spatial regression with semantic context (MSRC) system, which not only predicts the location of the ground truth based on proposal bounding boxes, but also refines the prediction results by penalizing similarities between different queries coming from the same sentence. MSRC has two advantages. First, it sidesteps the performance upper bound imposed by independent proposal generation systems by adopting a regression mechanism. Second, MSRC not only encodes the semantics of a query phrase, but also considers its relation to context (i.e., other queries from the same sentence) via a context refinement network. Experiments show that the MSRC system achieves a significant improvement in accuracy on two popular datasets, Flickr30K Entities and Refer-it Game, with increases of 6.64% and 5.28% over the state of the art, respectively.
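To make the first advantage concrete, the following is a minimal sketch of how a regression step can move a proposal box onto an object that no proposal covers well. It assumes the common R-CNN-style offset parameterization (center shift scaled by box size, log-scale width and height); the abstract does not specify MSRC's exact parameterization, and the function names `apply_box_deltas` and `iou` are hypothetical.

```python
import math


def apply_box_deltas(proposal, deltas):
    """Refine a proposal box (x1, y1, x2, y2) with offsets (dx, dy, dw, dh).

    Assumed parameterization (not taken from the paper): dx, dy shift the
    box center in units of the box width/height; dw, dh rescale the width
    and height on a log scale.
    """
    x1, y1, x2, y2 = proposal
    dx, dy, dw, dh = deltas
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    new_cx, new_cy = cx + dx * w, cy + dy * h
    new_w, new_h = w * math.exp(dw), h * math.exp(dh)
    return (new_cx - 0.5 * new_w, new_cy - 0.5 * new_h,
            new_cx + 0.5 * new_w, new_cy + 0.5 * new_h)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


# Illustrative numbers only: the best available proposal overlaps the
# ground truth too little to count as a correct retrieval.
proposal = (10.0, 10.0, 50.0, 50.0)
ground_truth = (20.0, 20.0, 60.0, 60.0)
refined = apply_box_deltas(proposal, (0.25, 0.25, 0.0, 0.0))

print(iou(proposal, ground_truth) < 0.5)  # ranking alone would miss
print(iou(refined, ground_truth))         # regression recovers the box
```

A pure ranking method can only return one of the given proposals, so its accuracy is capped by proposal quality. A regressed box is not tied to that set, which is the sense in which the upper bound is sidestepped.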