SECOND-ORDER OPTIMIZATION METHODS FOR AVOIDING SADDLE POINTS DURING THE TRAINING OF DEEP NEURAL NETWORKS

Abstract

A computer-implemented method for training a deep neural network includes defining a loss function corresponding to the deep neural network, receiving a training dataset comprising training samples, and setting current parameter values to initial parameter values. An optimization method is then performed that iteratively minimizes the loss function. During each iteration, a steepest direction of the loss function is calculated by determining the gradient of the loss function at the current parameter values, and a batch of samples is selected from the training samples. A matrix-free conjugate gradient (CG) solver is applied to obtain an inexact solution to a linear system defined by the steepest direction of the loss function and a stochastic Hessian matrix computed with respect to the batch of samples. A descent direction is determined, and the parameter values are updated based on the descent direction. Following the optimization method, the parameter values are stored in association with the deep neural network.
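
The iteration described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: a toy linear least-squares model stands in for a deep neural network, and every name (`batch_hvp`, `cg_solve`, `train`, `make_data`) is hypothetical. The damping term, the early exit on non-positive curvature, the fallback to the steepest direction, and the backtracking line search are all assumptions added to make the sketch well behaved; the abstract does not specify them.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def axpy(alpha, x, y):
    """Return alpha * x + y for plain Python lists."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def loss(w, data):
    """Mean squared error of the toy linear model (stand-in for a network loss)."""
    return sum((dot(x, w) - y) ** 2 for x, y in data) / (2 * len(data))

def gradient(w, data):
    g = [0.0] * len(w)
    for x, y in data:
        g = axpy((dot(x, w) - y) / len(data), x, g)
    return g

def batch_hvp(v, batch, damping=1e-3):
    """Stochastic Hessian-vector product on a batch; the Hessian is never formed.
    The damping term is an assumption, not something the abstract states."""
    hv = [damping * vi for vi in v]
    for x, _ in batch:
        hv = axpy(dot(x, v) / len(batch), x, hv)
    return hv

def cg_solve(hvp, b, max_iters=10, tol=1e-10):
    """Inexactly solve H p = b using only products H v (truncated CG)."""
    p = [0.0] * len(b)
    r = list(b)                 # residual b - H p for p = 0
    d = list(r)
    rr = dot(r, r)
    for _ in range(max_iters):
        if rr <= tol:
            break
        hd = hvp(d)
        curv = dot(d, hd)
        if curv <= 0.0:         # assumed safeguard: stop on non-positive curvature
            break
        alpha = rr / curv
        p = axpy(alpha, d, p)
        r = axpy(-alpha, hd, r)
        rr_new = dot(r, r)
        d = axpy(rr_new / rr, d, r)
        rr = rr_new
    # Assumed fallback: if CG made no progress, use the right-hand side itself.
    return p if any(p) else list(b)

def train(data, w0, iters=20, batch_size=8, seed=0):
    rng = random.Random(seed)
    w = list(w0)
    for _ in range(iters):
        g = gradient(w, data)
        steepest = [-gi for gi in g]            # steepest direction
        batch = rng.sample(data, batch_size)    # batch for the stochastic Hessian
        p = cg_solve(lambda v: batch_hvp(v, batch), steepest)
        # Assumed backtracking (Armijo) line search along the direction p.
        f0, slope, t = loss(w, data), dot(g, p), 1.0
        for _ in range(20):
            trial = axpy(t, p, w)
            if loss(trial, data) <= f0 + 1e-4 * t * slope:
                w = trial
                break
            t *= 0.5
    return w

def make_data(n=64, seed=1):
    """Synthetic noise-free regression data, used only for this illustration."""
    rng = random.Random(seed)
    w_true = [2.0, -1.0, 0.5]
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in w_true]
        data.append((x, dot(x, w_true)))
    return data

if __name__ == "__main__":
    data = make_data()
    w0 = [0.0, 0.0, 0.0]
    w = train(data, w0)
    print("initial loss:", loss(w0, data))
    print("final loss:  ", loss(w, data))
```

In this toy problem the batch Hessian is positive semidefinite, so the curvature check never fires. It is included because the Hessian of a deep network's loss can be indefinite near saddle points, which is the situation the title refers to. For a real network, the Hessian-vector product would typically come from automatic differentiation rather than a closed-form expression.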
