Training a deep convolutional neural network with multiple face sizes and positions, but not resolutions, is necessary for generating invariant face recognition across these transformations

Megha Srivastava; Kalanit Grill-Spector



Abstract

Convolutional neural networks have demonstrated human-like ability in face recognition, with recent networks achieving accuracy as high as 97% (Taigman, 2014). Non-linear operations (e.g. max-pooling) are thought to be key for developing position and size invariance (Riesenhuber & Poggio, 1999). However, it is unknown how training contributes to invariant face recognition. Here, we tested how training affects invariant face recognition across position, size, and resolution. We used a convolutional neural network architecture from TensorFlow (tensorflow.org). We trained the network to recognize 101 faces that varied in age, gender, and ethnicity across views (15 views/face, spanning 0 to ±105°). The network was trained on a randomly selected 80% of views and tested on the remaining 20%. During training, faces were shown centrally, at a single size and resolution. We then tested face recognition across views for new positions, sizes, and resolutions not shown during training. Results show that face recognition performance progressively declined for faces shown in positions (Figure 1A) or sizes (Figure 1B) different from those shown during training. However, face recognition performance generalized across resolutions (Figure 1C). Further experiments using a constant number of training examples, but different training regimes, revealed that training with random positions (Figure 1D) or random sizes (Figure 1E) generated more robust performance than training with faces in 5 positions (Figure 1D) or 5 sizes (Figure 1E). Additionally, the network performed better on faces shown in new sizes than on faces shown in new positions. Overall, our results indicate that the architecture of the neural network is (1) sufficient for invariant face recognition across resolutions, but (2) insufficient for invariant face recognition across size and position unless trained with many faces varying in size and position. By understanding the limits of convolutional neural networks, we can gain insight into the factors that enable successful face recognition.
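The protocol in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code. The function names, the canvas size, the nearest-neighbour resizing, the block-averaging used to lower resolution, and the layout of the "fixed positions" regime (evenly spaced along the horizontal midline) are all assumptions made for illustration only; the abstract does not specify any of them.

```python
import numpy as np


def split_views(n_views, train_fraction, rng):
    """Randomly split view indices into train and test sets (e.g. 80% / 20%)."""
    order = rng.permutation(n_views)
    n_train = int(round(n_views * train_fraction))
    return np.sort(order[:n_train]), np.sort(order[n_train:])


def place_face(face, canvas_size, center, scale=1.0):
    """Paste a square face image onto a blank canvas at `center` with `scale`.

    Resizing is nearest-neighbour (an assumption chosen for simplicity).
    """
    src = face.shape[0]
    side = max(1, int(round(src * scale)))
    idx = (np.arange(side) * src) // side  # map target pixels to source pixels
    resized = face[np.ix_(idx, idx)]
    top = center[0] - side // 2
    left = center[1] - side // 2
    if top < 0 or left < 0 or top + side > canvas_size or left + side > canvas_size:
        raise ValueError("face does not fit on the canvas at this position/scale")
    canvas = np.zeros((canvas_size, canvas_size), dtype=face.dtype)
    canvas[top:top + side, left:left + side] = resized
    return canvas


def reduce_resolution(image, factor):
    """Lower resolution by block-averaging, then upsample back to the same shape.

    Assumes both image dimensions are divisible by `factor`.
    """
    h, w = image.shape
    coarse = image.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)


def sample_centers(n_examples, canvas_size, margin, rng, n_fixed=None):
    """Return n_examples (row, col) centers under one of two training regimes.

    n_fixed=None : random positions, uniform over the valid area.
    n_fixed=k    : k fixed positions (hypothetical layout: evenly spaced along
                   the horizontal midline), cycled so the number of training
                   examples stays constant across regimes.
    """
    low, high = margin, canvas_size - margin
    if n_fixed is None:
        return rng.integers(low, high + 1, size=(n_examples, 2))
    cols = np.linspace(low, high, n_fixed).round().astype(int)
    rows = np.full(n_fixed, canvas_size // 2)
    grid = np.stack([rows, cols], axis=1)
    return grid[np.arange(n_examples) % n_fixed]
```

With 15 views per face and a training fraction of 0.8, `split_views` yields 12 training views and 3 test views. Calling `sample_centers` with and without `n_fixed=5` produces the same number of examples in both regimes, which mirrors the abstract's comparison of 5 positions against random positions at a constant training-set size; the same idea applies to sizes by varying `scale` instead of `center`.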


Bibliographic details

Source: Journal of Vision, 2017, Issue 10, 1 page
Authors: Megha Srivastava; Kalanit Grill-Spector
Original format: PDF
Chinese Library Classification: Ophthalmology

Similar literature

1. Spatio–Temporal Image Representation of 3D Skeletal Movements for View-Invariant Action Recognition with Deep Convolutional Neural Networks [J]. Huy Hieu Pham, Houssam Salmane, Louahdi Khoudour. Sensors, 2019, Issue 8.
2. Handwritten Devanagari Character Recognition Using Layer-Wise Training of Deep Convolutional Neural Networks and Adaptive Gradient Methods [J]. Mahesh Jangid, Sumit Srivastava. Journal of Imaging, 2018, Issue 2.
3. DropSample: A new training method to enhance deep convolutional neural networks for large-scale unconstrained handwritten Chinese character recognition [J]. Yang Weixin, Jin Lianwen, Tao Dacheng. Pattern Recognition: The Journal of the Pattern Recognition Society, 2016.
4. Reducing Overfitting and Improving Generalization in Training Convolutional Neural Network (CNN) under Limited Sample Sizes in Image Recognition [C]. Panissara Thanapol, Kittichai Lavangnananda, Pascal Bouvry. International Conference on Information Technology, 2020.
5. Multiple Label Recognition of Deep Learning Using Convolutional Neural Network [D]. He, Mingju. 2018.
6. Spatio–Temporal Image Representation of 3D Skeletal Movements for View-Invariant Action Recognition with Deep Convolutional Neural Networks [O]. Huy Hieu Pham, Houssam Salmane, Louahdi Khoudour. 2019.
7. Handwritten Hindi Character Recognition Using Layer-Wise Training of Deep Convolutional Neural Networks [O]. Abhishek Mehta, Subhashchandra Desai, Ashish Chaturvedi. 2020.
