Journal of Vision

Training a deep convolutional neural network with multiple face sizes and positions, but not resolutions, is necessary for generating invariant face recognition across these transformations


Abstract

Convolutional neural networks have demonstrated human-like ability in face recognition, with recent networks achieving as high as 97% accuracy (Taigman, 2014). It is thought that non-linear operations (e.g., max-pooling) are key for developing position and size invariance (Riesenhuber & Poggio, 1999). However, it is unknown how training contributes to invariant face recognition. Here, we tested how training affects invariant face recognition across position, size, and resolution. We used a convolutional neural network architecture from TensorFlow (tensorflow.org). We trained the network to recognize 101 faces that varied in age, gender, and ethnicity across views (15 views per face, spanning 0° to ±105°). The network was trained on 80% of the views, randomly selected, and tested on the remaining 20%. During training, faces were shown centrally and presented in one size and resolution. We then tested face recognition across views for new positions, sizes, and resolutions not shown during training. Results show that face recognition performance progressively declined for faces shown in positions (Figure 1A) or sizes (Figure 1B) different from those shown during training. However, face recognition performance generalized across resolutions (Figure 1C). Further experiments using a constant number of training examples, but different training regimes, revealed that training with random positions (Figure 1D) or random sizes (Figure 1E) generated more robust performance than training with faces in 5 positions (Figure 1D) or 5 sizes (Figure 1E). Additionally, the network performed better on faces shown in new sizes than on faces shown in new positions. Overall, our results indicate that the architecture of the neural network is (1) sufficient for invariant face recognition across resolutions but (2) insufficient for invariant face recognition across size and position unless trained with many faces varying in size and position. By understanding the limits of convolutional neural networks, we can gain insight into the factors that enable successful face recognition.
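The stimulus manipulations described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' code: the function names, the 128-pixel canvas, the nearest-neighbour resizing, and the choice of which five fixed positions to use are all assumptions made for illustration. It shows an 80/20 split of the 15 views, placement of a face at a given size and position, a simple resolution reduction, and two position-sampling regimes that yield the same number of training examples.

```python
import numpy as np


def split_views(n_views=15, train_frac=0.8, rng=None):
    """Randomly split view indices into training and test sets."""
    rng = np.random.default_rng(rng)
    order = rng.permutation(n_views)
    n_train = int(round(train_frac * n_views))
    return np.sort(order[:n_train]), np.sort(order[n_train:])


def resize_nearest(img, size):
    """Nearest-neighbour resize of a square 2-D image to size x size."""
    idx = np.arange(size) * img.shape[0] // size
    return img[np.ix_(idx, idx)]


def place_face(face, canvas=128, size=64, offset=(0, 0)):
    """Paste a resized face on a blank canvas, shifted (dy, dx) from the centre."""
    scaled = resize_nearest(face, size)
    top = (canvas - size) // 2 + offset[0]
    left = (canvas - size) // 2 + offset[1]
    if top < 0 or left < 0 or top + size > canvas or left + size > canvas:
        raise ValueError("face would fall outside the canvas")
    out = np.zeros((canvas, canvas), dtype=face.dtype)
    out[top:top + size, left:left + size] = scaled
    return out


def reduce_resolution(img, factor):
    """Downsample by `factor`, then upsample back to the original size."""
    n = img.shape[0]
    return resize_nearest(resize_nearest(img, n // factor), n)


def sample_offsets(n, max_shift, regime, rng=None):
    """Draw n (dy, dx) offsets under one of two hypothetical training regimes."""
    rng = np.random.default_rng(rng)
    if regime == "random":
        # Any integer shift within +/- max_shift on each axis.
        return rng.integers(-max_shift, max_shift + 1, size=(n, 2))
    if regime == "five":
        # Assumed layout: centre plus four cardinal shifts.
        fixed = np.array([[0, 0], [-max_shift, 0], [max_shift, 0],
                          [0, -max_shift], [0, max_shift]])
        return fixed[rng.integers(0, len(fixed), size=n)]
    raise ValueError(f"unknown regime: {regime}")
```

Because both regimes return exactly `n` offsets, a comparison between them holds the number of training examples constant, which mirrors the control described in the abstract. The same pattern could be applied to size by sampling the `size` argument instead of the offset.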
