Source: IEEE Transactions on Neural Networks and Learning Systems
Rethinking RGB-D Salient Object Detection: Models, Data Sets, and Large-Scale Benchmarks



Abstract

The use of RGB-D information for salient object detection (SOD) has been extensively explored in recent years. However, relatively little effort has been put toward modeling SOD in real-world human activity scenes with RGB-D. In this article, we fill the gap by making the following contributions to RGB-D SOD: 1) we carefully collect a new Salient Person (SIP) data set that consists of ~1K high-resolution images covering diverse real-world scenes with various viewpoints, poses, occlusions, illuminations, and backgrounds; 2) we conduct a large-scale (and, so far, the most comprehensive) benchmark comparing contemporary methods, which has long been missing in the field and can serve as a baseline for future research; we systematically summarize 32 popular models and evaluate 18 of the 32 models on seven data sets containing a total of about 97K images; and 3) we propose a simple general architecture, called the deep depth-depurator network (D³Net). It consists of a depth depurator unit (DDU) and a three-stream feature learning module (FLM), which perform low-quality depth-map filtering and cross-modal feature learning, respectively. These components form a nested structure and are elaborately designed to be learned jointly. D³Net exceeds the performance of all prior contenders across all five metrics under consideration, thus serving as a strong model to advance research in this field. We also demonstrate that D³Net can be used to efficiently extract salient object masks from real scenes, enabling an effective background-changing application at a speed of 65 frames/s on a single GPU. All the saliency maps, our new SIP data set, the D³Net model, and the evaluation tools are publicly available at https://github.com/DengPingFan/D3NetBenchmark.
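The gating idea behind the depth depurator unit (discarding low-quality depth maps so fusion falls back to RGB-only features) can be sketched as follows. This is a minimal, illustrative numpy sketch, not the paper's implementation: in D³Net the filtering is learned jointly with the network, whereas here a hand-crafted histogram-entropy score stands in as an assumed quality proxy, and `depurate`, `depth_quality_score`, and the `threshold` value are hypothetical names chosen for this example.

```python
import numpy as np

def depth_quality_score(depth: np.ndarray) -> float:
    """Crude proxy for depth-map quality, normalized to [0, 1].

    Assumption (not from the paper): a usable depth map tends to have a
    wide, non-degenerate value distribution, so we score it by the
    normalized entropy of a 32-bin histogram.
    """
    d = depth.astype(np.float64)
    if d.max() == d.min():          # constant map carries no depth cue
        return 0.0
    hist, _ = np.histogram(d, bins=32, range=(d.min(), d.max()))
    p = hist / hist.sum()
    p = p[p > 0]
    entropy = -(p * np.log2(p)).sum()
    return float(entropy / np.log2(32))

def depurate(rgb_feat: np.ndarray, depth: np.ndarray,
             threshold: float = 0.3) -> np.ndarray:
    """Gate the depth stream: if the depth map looks degenerate, drop it
    so the downstream branch sees RGB features only; otherwise stack the
    depth map as an extra channel for cross-modal fusion."""
    if depth_quality_score(depth) < threshold:
        return rgb_feat                      # depth discarded
    return np.concatenate([rgb_feat, depth[None]], axis=0)
```

For example, a flat (all-zero) depth map scores 0 and is dropped, while a well-spread depth map is kept and stacked as a fourth channel alongside the three RGB feature channels.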
