Sensors (Basel, Switzerland)

Video-Based Person Re-Identification by an End-To-End Learning Architecture with Hybrid Deep Appearance-Temporal Feature



Abstract

Video-based person re-identification is an important task in multi-camera visual sensor networks, challenged by lighting variation, low-resolution images, background clutter, occlusion, and similarity of human appearance. In this paper, we propose a video-based person re-identification method: an end-to-end learning architecture with a hybrid deep appearance-temporal feature. It learns the appearance features of pivotal frames, the temporal features, and an independent distance metric for each type of feature. The architecture consists of a two-stream deep feature structure and two Siamese networks. For the first stream, we propose the Two-branch Appearance Feature (TAF) sub-structure to obtain the appearance information of persons, and use one of the two Siamese networks to learn the similarity of the appearance features of a person pair. To exploit temporal information, we design the second stream, consisting of the Optical flow Temporal Feature (OTF) sub-structure and the other Siamese network, to learn a person's temporal features and the distances between pairwise features. In addition, we select the pivotal frames of a video as inputs to the Inception-V3 network in the Two-branch Appearance Feature sub-structure, and employ a salience-learning fusion layer to fuse the learned global and local appearance features. Extensive experiments on the PRID2011, iLIDS-VID, and Motion Analysis and Re-identification Set (MARS) datasets showed that the proposed architecture reached 79%, 59%, and 72% Rank-1 accuracy, respectively, and compared favorably with state-of-the-art algorithms. It also improves the feature representation ability for persons.
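To make the matching stage concrete, the following is a minimal sketch (not from the paper) of how per-stream distance metrics over appearance and temporal features might be fused into a single re-identification score. The Euclidean metric on L2-normalized features and the fixed fusion weight `alpha` are assumptions for illustration; in the proposed architecture each stream's metric is learned by its own Siamese network.

```python
import numpy as np

def l2_normalize(x):
    # Normalize a feature vector to unit length before comparison.
    return x / np.linalg.norm(x)

def stream_distance(feat_a, feat_b):
    # One stream's distance metric: Euclidean distance between
    # L2-normalized feature vectors (a stand-in for a learned metric).
    return np.linalg.norm(l2_normalize(feat_a) - l2_normalize(feat_b))

def hybrid_distance(app_a, app_b, tmp_a, tmp_b, alpha=0.5):
    # Fuse the appearance-stream and temporal-stream distances into one
    # score; alpha is a hypothetical fusion weight.
    return alpha * stream_distance(app_a, app_b) + \
        (1 - alpha) * stream_distance(tmp_a, tmp_b)
```

In use, a probe sequence's appearance and temporal features would be compared against every gallery sequence with `hybrid_distance`, and gallery entries ranked by ascending score to produce the Rank-1 match.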

