Published in: 《中南大学学报(自然科学版)》 (Journal of Central South University, Science and Technology)

A heterogeneous multimodal object recognition method based on deep learning

         

Abstract

A heterogeneous multimodal object recognition method based on deep learning is proposed. First, because media streams carry audio and video information at the same time, a heterogeneous multimodal deep learning structure is constructed that combines the strengths of a convolutional neural network (CNN) and a restricted Boltzmann machine (RBM). The audio and video information are processed separately and in parallel, and a shared feature representation is generated using canonical correlation analysis (CCA). The temporal coherence of video frames is then exploited to optimize the parameters and further improve recognition accuracy. Experiments were carried out on a standard audio and face library and on clips taken from actual movies. The results show that, for both kinds of video source, the proposed method significantly improves the accuracy of object recognition.
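The CCA fusion step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes the two branches have already produced paired feature matrices `X` and `Y` (one row per sample), and it uses plain linear CCA with a small ridge term `reg` for numerical stability. The function names `fit_cca` and `shared_representation`, the ridge term, and the choice to concatenate the two projected views are all assumptions made for illustration.

```python
import numpy as np


def fit_cca(X, Y, k, reg=1e-6):
    """Fit linear CCA on paired samples X (n, dx) and Y (n, dy).

    Returns the two means, the two projection matrices, and the top-k
    canonical correlations. `reg` is a small ridge term (an assumption of
    this sketch) that keeps the covariance matrices positive definite.
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]

    # Within-view and cross-view covariance estimates.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    # Whiten each view with its Cholesky factor: T = Lx^{-1} Sxy Ly^{-T}.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T

    # Singular vectors of T give the canonical directions in whitened space.
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :k])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :k])
    return mx, my, Wx, Wy, s[:k]


def shared_representation(X, Y, model):
    """Project both views into the canonical space and concatenate them."""
    mx, my, Wx, Wy, _ = model
    return np.hstack([(X - mx) @ Wx, (Y - my) @ Wy])


# Synthetic stand-ins for the two branches' features (hypothetical data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
video_feats = latent @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(500, 5))
audio_feats = latent @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(500, 4))

cca_model = fit_cca(video_feats, audio_feats, k=2)
fused = shared_representation(video_feats, audio_feats, cca_model)
print(fused.shape)
```

Concatenating the two projections is only one way to form a joint feature. Averaging the projected views, or keeping a single view's projection, would be equally valid choices under the same CCA fit.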
