SECOND-ORDER OPTIMIZATION METHODS FOR AVOIDING SADDLE POINTS DURING THE TRAINING OF DEEP NEURAL NETWORKS
Abstract
A computer-implemented method for training a deep neural network includes defining a loss function for the network, receiving a training dataset of training samples, and setting the current parameter values to initial values. An optimization method is then performed that iteratively minimizes the loss function. During each iteration, the steepest-descent direction is calculated by evaluating the gradient of the loss function at the current parameter values, and a batch of the training samples is selected. A matrix-free conjugate gradient (CG) solver is applied to obtain an inexact solution to the linear system defined by the steepest-descent direction and a stochastic Hessian matrix computed with respect to the batch. From this solution a descent direction is determined, and the parameter values are updated along it. After the optimization method completes, the parameter values are stored in association with the deep neural network.