
Hierarchy-Dependent Cross-Platform Multi-View Feature Learning for Venue Category Prediction



Abstract

In this paper, we focus on visual venue category prediction, which can facilitate various applications for location-based services and personalization. Considering the complementarity of different media platforms, it is reasonable to leverage venue-relevant media data from multiple platforms to boost prediction performance. Intuitively, recognizing a venue category involves multiple semantic cues, especially objects and scenes, and thus both should contribute to venue category prediction. In addition, venues can be organized in a natural hierarchical structure, which provides prior knowledge to guide venue category estimation. Taking these aspects into account, we propose a Hierarchy-dependent Cross-platform Multi-view Feature Learning (HCM-FL) framework for venue category prediction from videos that leverages images from other platforms. HCM-FL comprises two major components: Cross-Platform Transfer Deep Learning (CPTDL) and Multi-View Feature Learning with the Hierarchical Venue Structure (MVFL-HVS). CPTDL reinforces the deep network learned from videos using images from other platforms. Specifically, CPTDL first trains a deep network on videos; images from other platforms are then filtered by this learned network, and the selected images are fed back into the network to enhance it. Two kinds of networks, pre-trained on the ImageNet and Places datasets, are employed, so both object-oriented and scene-oriented deep features can be harnessed through the enhanced networks. MVFL-HVS is then developed to enable multi-view feature fusion; it embeds the hierarchical venue ontology to support more discriminative joint feature learning. We conduct experiments on videos from Vine and images from Foursquare.
The experimental results demonstrate the advantage of our proposed framework in jointly utilizing multi-platform data, multi-view deep features, and hierarchical venue structure knowledge.
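The CPTDL step described above (train on videos, filter cross-platform images by the learned network's confidence, then re-train on the enlarged set) can be sketched with a toy softmax classifier standing in for the deep networks. All data, the confidence threshold, and the classifier itself are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def train_softmax(X, y, n_classes, epochs=200, lr=0.5):
    """Gradient-descent training of a linear softmax classifier."""
    W = np.zeros((X.shape[1], n_classes))
    Y = np.eye(n_classes)[y]  # one-hot labels
    for _ in range(epochs):
        P = softmax(X @ W)
        W -= lr * X.T @ (P - Y) / len(X)
    return W

# Toy "video" features for two venue categories (hypothetical data).
X_video = np.vstack([rng.normal(0, 1, (50, 8)) + 2,
                     rng.normal(0, 1, (50, 8)) - 2])
y_video = np.array([0] * 50 + [1] * 50)

# Step 1: train the network on video data only.
W = train_softmax(X_video, y_video, n_classes=2)

# Toy weakly-labeled "image" features from another platform.
X_img = np.vstack([rng.normal(0, 1, (30, 8)) + 2,
                   rng.normal(0, 1, (30, 8)) - 2])
y_img = np.array([0] * 30 + [1] * 30)

# Step 2: keep only images the learned network labels confidently
# (0.9 is an assumed threshold).
conf = softmax(X_img @ W)
keep = conf[np.arange(len(y_img)), y_img] > 0.9
X_sel, y_sel = X_img[keep], y_img[keep]

# Step 3: re-train ("enhance") the network on videos + selected images.
W_enh = train_softmax(np.vstack([X_video, X_sel]),
                      np.concatenate([y_video, y_sel]), n_classes=2)
```

In the paper this loop is run twice, once with an ImageNet-pretrained network (object view) and once with a Places-pretrained network (scene view), yielding the two feature views that MVFL-HVS later fuses.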

Bibliographic Information

  • Source
    IEEE Transactions on Multimedia, 2019, Issue 6, pp. 1609-1619 (11 pages)
  • Author Affiliations

    Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China|Univ Chinese Acad Sci, Beijing 100049, Peoples R China;

    Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China|Chinese Acad Sci, Shenyang Inst Automat, State key Lab Robot, Shenyang 110016, Liaoning, Peoples R China;

    Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China|Shandong Univ Sci & Technol, Qingdao 266590, Shandong, Peoples R China;

  • Indexing
  • Original format: PDF
  • Language: English
  • CLC classification
  • Keywords

    Feature extraction; knowledge transfer; supervised learning; video signal processing; Web 2.0;


