Motion-Based Occlusion-Aware Pixel Graph Network for Video Object Segmentation

Abstract

This paper proposes a dual-channel Graph Convolutional Network (GCN) for the Video Object Segmentation (VOS) task. The main contribution lies in formulating two pixel graphs based on raw RGB and optical-flow features. Spatial and temporal features are learned independently, making the network robust to various challenging scenarios in real-world videos. Additionally, a motion orientation-based aggregator scheme efficiently captures long-range dependencies among objects. This not only addresses the complex issue of modelling velocity differences among multiple objects moving in various directions, but also adapts to changes in object appearance caused by pose and scale deformations. An occlusion-aware attention mechanism is further employed to enable accurate segmentation when multiple objects exhibit temporal discontinuity in appearance due to occlusion. Performance analysis on the DAVIS-2016 and DAVIS-2017 datasets shows that the proposed method outperforms existing state-of-the-art techniques in foreground segmentation of objects in videos. Control experiments on the CamVid dataset demonstrate the model's generalising capability for scene segmentation.
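The dual-channel pixel-graph idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the graph construction (k-nearest-neighbour adjacency in feature space), feature dimensions, and random weights below are all illustrative assumptions; only the overall pattern — one graph convolution over an appearance (RGB) graph, one over a motion (optical-flow) graph, then fusion into per-pixel scores — follows the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting (hypothetical): a 4x4 frame flattened into 16 pixel nodes.
N = 16
rgb_feats = rng.standard_normal((N, 3))    # per-pixel RGB features
flow_feats = rng.standard_normal((N, 2))   # per-pixel optical flow (u, v)

def knn_adjacency(X, k=4):
    """Connect each pixel node to its k nearest neighbours in feature space."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    A = np.zeros((len(X), len(X)))
    for i in range(len(X)):
        A[i, np.argsort(d[i])[:k]] = 1.0
    return np.maximum(A, A.T)  # symmetrize

def normalize(A):
    """Symmetric GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(len(A))
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(X, A_norm, W):
    """One graph-convolution layer with ReLU."""
    return np.maximum(A_norm @ X @ W, 0.0)

# Channel 1: spatial (appearance) pixel graph from RGB features.
A_rgb = normalize(knn_adjacency(rgb_feats))
H_rgb = gcn_layer(rgb_feats, A_rgb, rng.standard_normal((3, 8)))

# Channel 2: temporal (motion) pixel graph from optical-flow features.
A_flow = normalize(knn_adjacency(flow_feats))
H_flow = gcn_layer(flow_feats, A_flow, rng.standard_normal((2, 8)))

# Fuse the two channels and score each pixel as foreground/background.
H = np.concatenate([H_rgb, H_flow], axis=1)    # (16, 16) fused embedding
logits = H @ rng.standard_normal((16, 1))
mask = (logits > 0).astype(int).reshape(4, 4)  # toy binary segmentation mask
```

Keeping the two graphs separate until the fusion step mirrors the abstract's point that spatial and temporal features are learned independently; the paper's motion orientation-based aggregator and occlusion-aware attention would replace the simple concatenation used here.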
