Finding Objects in Complex Scenes

Abstract

Object detection is one of the fundamental problems in computer vision and has great practical impact. Current object detectors work well under certain conditions, but challenges arise when scenes become more complex. Scenes are often cluttered, and object detectors trained on Internet-collected data fail when there are large variations in objects' appearance.

We believe the key to tackling these challenges is to understand the rich context of objects in scenes, which includes the appearance variations of an object due to changes in viewpoint and lighting conditions; the relationships between objects and their typical environment; and the composition of multiple objects in the same scene. This dissertation studies the complexity of scenes from these aspects.

To facilitate collecting training data with large variations, we design a novel user interface, ARLabeler, that uses the power of Augmented Reality (AR) devices. Instead of passively labeling images from the Internet, we put an observer in the real world with full control over the scene complexities. Users walk around freely and observe objects from multiple angles. Lighting can be adjusted. Objects can be added to or removed from the scene to create rich compositions. Our tool opens new possibilities for preparing data for complex scenes.

We also study the challenges of deploying object detectors in real-world scenes, specifically detecting curb ramps in street view images. We propose a system, Tohme, that combines results from automatic detectors with crowdsourced human verification. One core component is a meta-classifier that estimates the complexity of a scene and assigns it accordingly to a human (accurate but costly) or a computer (low cost but error-prone).

One of the insights from Tohme is that context is crucial in detecting objects. To understand the complex relationship between objects and their environment, we propose a standalone context model that predicts where an object can occur in an image. Combined with object detection, this model can find regions where an object is missing. It can also be used to find out-of-context objects.

To take a step beyond single-object detection, we explicitly model the geometrical relationships between groups of objects and use the layout information to represent scenes as a whole. We show that this strategy is useful for retrieving indoor furniture scenes from natural language inputs.
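The idea of combining a context model with an object detector, as described above, can be illustrated with a minimal sketch. This is not the dissertation's method: the grid-cell regions, the score dictionaries, the function name `compare_context_and_detection`, and the thresholds `high` and `low` are all assumptions made for illustration. A region counts as "missing" when context says an object is likely there but the detector finds none, and as "out of context" in the opposite case.

```python
from typing import Dict, List, Tuple

# Hypothetical region identifier: a (row, col) cell in a coarse image grid.
Region = Tuple[int, int]


def compare_context_and_detection(
    context: Dict[Region, float],
    detection: Dict[Region, float],
    high: float = 0.7,
    low: float = 0.3,
) -> Tuple[List[Region], List[Region]]:
    """Return (missing, out_of_context) regions.

    context[r]   : assumed probability that an object could occur in region r
    detection[r] : assumed detector confidence that an object is in region r
    """
    missing: List[Region] = []
    out_of_context: List[Region] = []
    for region in sorted(context):
        c = context[region]
        d = detection.get(region, 0.0)  # no detection means zero confidence
        if c >= high and d <= low:
            # Context expects an object here, but the detector sees none.
            missing.append(region)
        elif d >= high and c <= low:
            # The detector fires where context says an object is unlikely.
            out_of_context.append(region)
    return missing, out_of_context


# Toy scores for a 2x2 grid (made-up numbers, for illustration only).
context = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.8, (1, 1): 0.2}
detection = {(0, 0): 0.95, (0, 1): 0.85, (1, 0): 0.05, (1, 1): 0.1}

missing, out_of_context = compare_context_and_detection(context, detection)
print(missing, out_of_context)
```

Simple thresholds keep the comparison easy to follow; a real system would more likely learn how to combine the two scores.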

Bibliographic record

  • Author: Sun, Jin
  • Author affiliation: University of Maryland, College Park
  • Degree-granting institution: University of Maryland, College Park
  • Subject: Computer science
  • Degree: Ph.D.
  • Year: 2018
  • Pages: 175
  • Original format: PDF
  • Language: English
