International Conference on Machine Learning

On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings

Abstract

We study Nesterov's accelerated gradient method with constant step-size and momentum parameters in the stochastic approximation setting (unbiased gradients with bounded variance) and the finite-sum setting (where randomness is due to sampling mini-batches). To build better insight into the behavior of Nesterov's method in stochastic settings, we focus throughout on objectives that are smooth, strongly-convex, and twice continuously differentiable. In the stochastic approximation setting, Nesterov's method converges to a neighborhood of the optimal point at the same accelerated rate as in the deterministic setting. Perhaps surprisingly, in the finite-sum setting, we prove that Nesterov's method may diverge with the usual choice of step-size and momentum, unless additional conditions on the problem related to conditioning and data coherence are satisfied. Our results shed light on why Nesterov's method may fail to converge or achieve acceleration in the finite-sum setting.
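To make the method concrete, here is a minimal sketch of Nesterov's accelerated gradient method with constant step-size and momentum, run with unbiased, bounded-variance gradient noise as in the stochastic approximation setting the abstract describes. The quadratic objective, the noise model, and the constants (step-size 1/L and momentum (√κ−1)/(√κ+1), the usual deterministic tuning) are illustrative assumptions, not the paper's construction or experiments.

```python
import numpy as np

# Sketch of Nesterov's method with constant step-size alpha and momentum beta:
#     y_k     = x_k + beta * (x_k - x_{k-1})
#     x_{k+1} = y_k - alpha * g(y_k)
# where g is an unbiased gradient estimate with bounded variance
# (the stochastic approximation setting). The objective, noise model,
# and constants below are illustrative assumptions only.

rng = np.random.default_rng(0)
d = 10
hess = np.diag(np.linspace(1.0, 100.0, d))  # f(x) = 0.5 x^T H x, so mu = 1, L = 100

mu, L = 1.0, 100.0
alpha = 1.0 / L                                          # usual constant step-size
kappa = L / mu
beta = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)   # usual momentum parameter
sigma = 0.1                                              # noise std (bounded variance)

def stochastic_grad(y):
    """Unbiased gradient of f at y plus additive bounded-variance noise."""
    return hess @ y + sigma * rng.standard_normal(d)

x_prev = x = rng.standard_normal(d)
for _ in range(2000):
    y = x + beta * (x - x_prev)          # extrapolation (momentum) step
    x_prev, x = x, y - alpha * stochastic_grad(y)

# With sigma > 0 the iterates settle into a noise ball around the
# optimum (here x* = 0) rather than converging exactly, consistent
# with convergence to a neighborhood at the accelerated rate.
print("distance to optimum:", np.linalg.norm(x))
```

In the finite-sum setting, the only change is that `stochastic_grad` would return the gradient of a sampled mini-batch of component functions; per the abstract, with this same usual (alpha, beta) tuning the iterates may then diverge unless additional conditioning and data-coherence conditions on the problem hold.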
