Image Processing, IET

Dynamic gesture recognition based on feature fusion network and variant ConvLSTM



Abstract

Gesture is a natural form of human communication, and it is of great significance in human-computer interaction. In dynamic gesture recognition methods based on deep learning, the key is to obtain comprehensive gesture feature information. Aiming at the problems of inadequate extraction of spatiotemporal features and loss of feature information in current dynamic gesture recognition, a new gesture recognition architecture is proposed, which combines a feature fusion network with a variant convolutional long short-term memory (ConvLSTM). The architecture extracts spatiotemporal feature information at the local, global and deep levels, and uses feature fusion to alleviate the loss of feature information. First, local spatiotemporal feature information is extracted from the video sequence by a 3D residual network based on channel feature fusion. Then the authors use the variant ConvLSTM to learn the global spatiotemporal information of the dynamic gesture, introducing an attention mechanism that modifies the gate structure of the ConvLSTM. Finally, a multi-feature fusion depthwise separable network is used to learn higher-level features, including depth feature information. The proposed approach obtains very competitive performance on the Jester dataset, with a classification accuracy of 95.59%, and achieves state-of-the-art performance with 99.65% accuracy on the SKIG (Sheffield Kinect Gesture) dataset.
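For reference, the standard ConvLSTM cell on which the variant builds is well established; a sketch of its update equations (without peephole connections), where $*$ denotes convolution, $\circ$ the Hadamard product, $X_t$ the input frame features, and $H_t$, $C_t$ the hidden and cell states:

```latex
\begin{aligned}
i_t &= \sigma\!\left(W_{xi} * X_t + W_{hi} * H_{t-1} + b_i\right) \\
f_t &= \sigma\!\left(W_{xf} * X_t + W_{hf} * H_{t-1} + b_f\right) \\
o_t &= \sigma\!\left(W_{xo} * X_t + W_{ho} * H_{t-1} + b_o\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\!\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
H_t &= o_t \circ \tanh\!\left(C_t\right)
\end{aligned}
```

The abstract does not specify the exact form of the authors' attention-modified gates; one common pattern, stated here only as an assumption, is to reweight $X_t$ and $H_{t-1}$ with a learned attention map before the gate convolutions, so the gates focus on the spatial regions containing the hand.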
