Unique Faces Recognition in Videos

Abstract

This paper tackles face recognition in videos using metric learning methods and similarity ranking models. It compares a Siamese network trained with contrastive loss against a Triplet network trained with triplet loss, each implemented with the following architectures: the Google/Inception architecture, a 3D Convolutional Network (C3D), and a 2-D Long Short-Term Memory (LSTM) recurrent neural network. Still images and sequences from videos are used to train the networks, and the performance of the architectures is compared. The dataset is the YouTube Face Database, which was designed for investigating face recognition in videos. The contribution of this paper is two-fold. First, the experiments establish that 3-D convolutional networks and 2-D LSTMs trained with contrastive loss on image sequences do not outperform the Google/Inception architecture with contrastive loss on still images in top-n rank face retrieval. However, the 3-D convolutional networks and 2-D LSTMs trained with triplet loss outperform Google/Inception with triplet loss in top-n rank face retrieval on the dataset. Second, a Support Vector Machine (SVM) was used in conjunction with the CNNs' learned feature representations for facial identification. The results show that feature representations learned with triplet loss are significantly better for n-shot facial identification than those learned with contrastive loss. The most useful feature representations for facial identification come from the 2-D LSTM with triplet loss. Overall, the experiments show that learning spatio-temporal features from video sequences is beneficial for facial recognition in videos.
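
The two losses and the top-n retrieval metric named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the code assumes the common textbook forms (a margin-based contrastive loss on Euclidean distance and a triplet loss on squared distances). The function names, the margin values, and the toy two-dimensional embeddings are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_identity, margin=1.0):
    """Pairwise loss (assumed standard form): matching pairs are pulled
    together, non-matching pairs are pushed apart until they exceed `margin`."""
    d = euclidean(emb_a, emb_b)
    if same_identity:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss (assumed squared-distance form): the anchor should be
    closer to the positive than to the negative by at least `margin`."""
    d_pos = euclidean(anchor, positive) ** 2
    d_neg = euclidean(anchor, negative) ** 2
    return max(0.0, d_pos - d_neg + margin)


def top_n_hit(query, query_label, gallery, n):
    """True if any of the n gallery embeddings nearest to `query` carries
    the query's identity label. `gallery` is a list of (embedding, label)."""
    ranked = sorted(gallery, key=lambda item: euclidean(query, item[0]))
    return any(label == query_label for _, label in ranked[:n])


# Toy gallery of (embedding, identity) pairs; values are illustrative only.
gallery = [([0.0, 0.0], "A"), ([1.0, 0.0], "B"), ([0.0, 2.0], "A")]
query = [0.9, 0.1]  # a face of identity "A" that lands nearest to "B"

rank1 = top_n_hit(query, "A", gallery, n=1)  # False: nearest neighbour is "B"
rank2 = top_n_hit(query, "A", gallery, n=2)  # True: an "A" is within the top 2
```

Averaging `top_n_hit` over many queries gives a top-n retrieval rate of the kind the abstract compares across architectures. In the paper's setting the embeddings would come from the trained networks rather than being written by hand.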

Bibliographic details

Source
International Conference on Information Fusion | 2020 | pp. 1-7 | 7 pages
Authors
Jiahao Huo; Terence L van Zyl
Original format: PDF
Keywords
Videos; Faces; Face recognition; Support vector machines; YouTube
