IEEE Transactions on Industrial Informatics

Efficient Outdoor Video Semantic Segmentation Using Feedback-Based Fully Convolution Neural Network



Abstract

In this article, we focus on the problem of efficient semantic segmentation from sequential two-dimensional images, in which every pixel is classified into one of a set of classes for scene understanding. The problem is challenging because it involves constraints on both spatial and temporal consistency, which are difficult to determine explicitly as structural constraints. Traditionally, such a problem is tackled with a structured prediction method such as the conditional random field (CRF). However, pure CRF methods suffer from very high complexity in computing high-order potentials and from slow inference, which makes them unsuitable for efficient video segmentation in real scenarios. In this article, a novel feedback-based deep fully convolutional neural network (CNN) is proposed that inherently incorporates spatial context by appending an output feedback mechanism. The proposed method has the following contributions: 1) spatial context in images is easily captured through iterative feedback refinement, without an expensive postprocessing step such as CRF refinement; 2) it is easily integrated with generic deep CNN structures; and 3) the inference time is greatly reduced for efficient image segmentation. Compared to current state-of-the-art methods, our proposed method provides up to 14% better accuracy on the semantic segmentation task on the challenging CamVid and Cityscapes datasets, while taking up to a relative 980% shorter inference time. The proposed method also shows its effectiveness for the real-time road detection task in autonomous driving.
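To make the output-feedback idea described in the abstract concrete, the following is a minimal PyTorch sketch of one way such a network could be organized: the soft segmentation predicted in the previous pass is concatenated with the input image and the network is run again for a few refinement iterations. This is an illustrative assumption, not the authors' implementation; the class FeedbackFCN, its layer sizes, and the number of feedback iterations are all hypothetical.

```python
# Minimal sketch of a feedback-based fully convolutional segmentation network.
# Illustrative assumption of the general idea (previous output fed back and
# refined iteratively), not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeedbackFCN(nn.Module):
    def __init__(self, in_channels=3, num_classes=12, feedback_iters=3):
        super().__init__()
        self.num_classes = num_classes
        self.feedback_iters = feedback_iters
        # Encoder takes the image plus the previous soft segmentation as input.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels + num_classes, 64, 3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # Decoder upsamples back to the input resolution and predicts class scores.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, num_classes, 4, stride=2, padding=1),
        )

    def forward(self, x):
        n, _, h, w = x.shape
        # Start with a uniform (uninformative) feedback map.
        feedback = torch.full((n, self.num_classes, h, w),
                              1.0 / self.num_classes, device=x.device)
        logits = None
        for _ in range(self.feedback_iters):
            # Concatenate the image with the previous soft prediction.
            inp = torch.cat([x, feedback], dim=1)
            logits = self.decoder(self.encoder(inp))
            # Feed the softened prediction back in for the next refinement pass.
            feedback = F.softmax(logits, dim=1)
        return logits


if __name__ == "__main__":
    model = FeedbackFCN(num_classes=12, feedback_iters=3)
    dummy = torch.randn(1, 3, 64, 64)   # e.g., a small CamVid-like crop
    out = model(dummy)
    print(out.shape)                    # torch.Size([1, 12, 64, 64])
```

In this sketch the feedback map starts as a uniform distribution over classes and is replaced by the softmax of the previous prediction at each pass, which is the mechanism through which earlier outputs inject spatial context into later refinements.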
