
Non-convergence and Limit Cycles in the Adam Optimizer



Abstract

One of the most popular training algorithms for deep neural networks is Adaptive Moment Estimation (Adam), introduced by Kingma and Ba. Despite its success in many applications, there is no satisfactory convergence analysis: only local convergence can be shown for batch mode under some restrictions on the hyperparameters, and counterexamples exist for incremental mode. Recent results show that for simple quadratic objective functions, limit cycles of period 2 exist in batch mode, but only for atypical hyperparameters and only for the algorithm without bias correction. We extend the convergence analysis to all choices of the hyperparameters for quadratic functions. This finally answers the question of convergence for Adam in batch mode in the negative. We analyze the stability of these limit cycles and relate our analysis to other results where approximate convergence was shown, but under the additional assumption of bounded gradients, which does not hold for quadratic functions. The investigation relies heavily on computer algebra due to the complexity of the equations.
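To make the batch-mode setting concrete, the following minimal Python sketch runs Adam without bias correction on the quadratic f(x) = 0.5*x^2, whose gradient is g = x, and prints the tail of the iterates. The step size, the beta values, and the objective are illustrative assumptions, not values taken from the paper. If the iterates had converged, the last printed values would all agree; a period-2 limit cycle would show up as two alternating values.

import numpy as np

# Minimal sketch of batch-mode Adam without bias correction on the
# quadratic f(x) = 0.5 * x**2 (gradient g = x). Hyperparameter values
# and the objective are illustrative assumptions, not from the paper.
def adam_batch(x0, alpha=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=20000):
    x, m, v = x0, 0.0, 0.0
    trajectory = []
    for _ in range(steps):
        g = x                                   # gradient of 0.5 * x**2
        m = beta1 * m + (1 - beta1) * g         # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g     # second-moment estimate
        x = x - alpha * m / (np.sqrt(v) + eps)  # update, no bias correction
        trajectory.append(x)
    return np.array(trajectory)

traj = adam_batch(x0=1.0)
print(traj[-6:])  # inspect the tail of the iterates for oscillation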
