IEEE Conference on Computer Vision and Pattern Recognition

All You Need is Beyond a Good Init: Exploring Better Solution for Training Extremely Deep Convolutional Neural Networks with Orthonormality and Modulation



Abstract

Deep neural networks are difficult to train, and this predicament worsens as depth increases. The essence of the problem lies in the magnitude of the backpropagated errors, which leads to the gradient vanishing or exploding phenomenon. We show that a variant of regularizer that promotes orthonormality among different filter banks can alleviate this problem. Moreover, we design a backward error modulation mechanism based on a quasi-isometry assumption between two consecutive parametric layers. Equipped with these two ingredients, we propose several novel optimization solutions for training a specifically structured (repeated triples of Conv-BN-ReLU modules) extremely deep convolutional neural network (CNN) from scratch, WITHOUT any shortcuts/identity mappings. Experiments show that our proposed solutions achieve distinct improvements for 44-layer and 110-layer plain networks on both the CIFAR-10 and ImageNet datasets. Moreover, we can successfully train plain CNNs to match the performance of their residual counterparts. Besides, we propose new principles for designing network structures from the insights evoked by orthonormality. Combined with a residual structure, we achieve comparable performance on the ImageNet dataset.
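The orthonormality regularizer mentioned in the abstract penalizes deviation of a layer's filter bank from mutual orthonormality. A minimal sketch of one common form of such a penalty (the Frobenius norm of the filter Gram matrix minus the identity) is shown below; the function name and the exact formulation are illustrative assumptions, not the paper's precise regularizer.

```python
import numpy as np

def orthonormality_penalty(weight: np.ndarray) -> float:
    """Penalty ||W W^T - I||_F^2 on a flattened conv filter bank.

    `weight` has shape (out_channels, in_channels, kh, kw); each filter is
    flattened into a row of W, so the penalty is zero exactly when the
    filters form an orthonormal set. Illustrative sketch only.
    """
    out_channels = weight.shape[0]
    w = weight.reshape(out_channels, -1)        # (out_channels, fan_in)
    gram = w @ w.T                              # pairwise filter inner products
    return float(np.sum((gram - np.eye(out_channels)) ** 2))

# An orthonormal filter bank yields a (near-)zero penalty:
q, _ = np.linalg.qr(np.random.randn(27, 8))    # 8 orthonormal columns
bank = q.T.reshape(8, 3, 3, 3)                 # 8 filters of shape 3x3x3
print(orthonormality_penalty(bank))            # ~0.0
```

In training, such a term would be added to the task loss with a small weight, nudging each filter bank toward an isometry so backpropagated error magnitudes are preserved across layers.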


