Physical Review X

Statistical Mechanics of Deep Linear Neural Networks: The Backpropagating Kernel Renormalization


Abstract

The groundbreaking success of deep learning in many real-world tasks has triggered an intense effort to theoretically understand the power and limitations of deep learning in the training and generalization of complex tasks, so far with limited progress. In this work, we study the statistical mechanics of learning in deep linear neural networks (DLNNs) in which the input-output function of an individual unit is linear. Despite the linearity of the units, learning in DLNNs is highly nonlinear; hence, studying its properties reveals some of the essential features of nonlinear deep neural networks (DNNs). Importantly, we exactly solve the network properties following supervised learning using an equilibrium Gibbs distribution in the weight space. To do this, we introduce the backpropagating kernel renormalization (BPKR), which allows for the incremental integration of the network weights layer by layer starting from the network output layer and progressing backward until the first layer’s weights are integrated out. This procedure allows us to evaluate important network properties, such as its generalization error, the role of network width and depth, the impact of the size of the training set, and the effects of weight regularization and learning stochasticity. BPKR does not assume specific statistics of the input or the task’s output. Furthermore, by performing partial integration of the layers, the BPKR allows us to compute the emergent properties of the neural representations across the different hidden layers. We propose a heuristic extension of the BPKR to nonlinear DNNs with rectified linear units (ReLU). Surprisingly, our numerical simulations reveal that despite the nonlinearity, the predictions of our theory are largely shared by ReLU networks of modest depth, in a wide regime of parameters. Our work is the first exact statistical mechanical study of learning in a family of deep neural networks, and the first successful theory of learning through the successive integration of degrees of freedom in the learned weight space.
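The abstract studies supervised learning in deep linear networks through an equilibrium Gibbs distribution over the weights, which the BPKR then integrates out layer by layer. As a rough illustration of that setting only (not code from the paper), the sketch below samples such a Gibbs measure with discretized Langevin dynamics on a small deep linear network with L2 weight regularization; all sizes, the temperature, and the learning rate are placeholder values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (not from the paper): input dimension N0, hidden width N,
# L weight matrices, P training examples, scalar output.
N0, N, L, P = 20, 50, 3, 100
X = rng.standard_normal((P, N0))                     # training inputs, one example per row
y = (X @ rng.standard_normal(N0)).reshape(1, P)      # labels from a random linear teacher

# Weight shapes: W[0]: N x N0, hidden N x N matrices, final 1 x N readout.
shapes = [(N, N0)] + [(N, N)] * (L - 2) + [(1, N)]
W = [rng.standard_normal(s) / np.sqrt(s[1]) for s in shapes]

def forward(W, X):
    """Deep linear network: the output is the product of all weight matrices applied to x."""
    h = X.T
    for Wl in W:
        h = Wl @ h                                    # linear units, no pointwise nonlinearity
    return h                                          # shape (1, P)

def grad_layer(W, X, err, l):
    """Gradient of the squared error w.r.t. W[l]: nonlinear in the weights despite linear units."""
    A = np.eye(W[l].shape[0])                         # product of the matrices above layer l
    for Wk in W[l + 1:]:
        A = Wk @ A
    B = X.T                                           # activations feeding layer l
    for Wk in W[:l]:
        B = Wk @ B
    return A.T @ err @ B.T

# Langevin dynamics on E(W) = 0.5 * ||f(X) - y||^2 + 0.5 * lam * sum_l ||W_l||^2.
# Its stationary law is the equilibrium Gibbs measure exp(-E / T) over the weights,
# the object that the backpropagating kernel renormalization integrates layer by layer.
T, lam, lr, steps = 1e-2, 1e-3, 1e-4, 5000
for _ in range(steps):
    err = forward(W, X) - y
    for l in range(L):
        noise = rng.standard_normal(W[l].shape)
        W[l] -= lr * (grad_layer(W, X, err, l) + lam * W[l]) - np.sqrt(2 * lr * T) * noise

print("training MSE:", float(np.mean((forward(W, X) - y) ** 2)))
```

Averages of observables (e.g., the generalization error or layer-wise kernels) over many such weight samples are what the theory computes analytically by integrating the layers out backward from the output.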


