The resurgence of deep learning, as a highly effective machine learning paradigm, has brought back to life the old optimization question of non-convexity. Indeed, the challenges related to the large-scale nature of many modern machine learning applications are severely exacerbated by the inherent non-convexity of the underlying models. In this light, efficient optimization algorithms that can be effectively applied to such large-scale, non-convex learning problems are highly desired. However, the bulk of research in this direction has been almost completely restricted to the class of 1st-order algorithms. This is despite the fact that employing curvature information, e.g., in the form of the Hessian, can indeed yield effective methods with desirable convergence properties for non-convex problems, e.g., avoiding saddle points and converging to local minima. The conventional wisdom in the machine learning community is that the application of 2nd-order methods, i.e., those that employ the Hessian as well as gradient information, can be highly inefficient. Consequently, 1st-order algorithms, such as stochastic gradient descent (SGD), have taken center stage for solving such machine learning problems. Here, we aim to address this misconception by considering efficient and stochastic variants of Newton's method, namely sub-sampled trust-region and cubic regularization, whose theoretical convergence properties have recently been established in [Xu 2017]. Using a variety of experiments, we empirically evaluate the performance of these methods for solving non-convex machine learning applications. In doing so, we highlight the shortcomings of 1st-order methods, e.g., high sensitivity to hyper-parameters such as the step-size and undesirable behavior near saddle points, and showcase the advantages of employing curvature information as an effective remedy.
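To make the cubic-regularization variant of Newton's method concrete, the following is a minimal Python/NumPy sketch of a cubic-regularized Newton step applied to a small synthetic saddle-point problem. The toy objective, the gradient-descent inner solver, and all parameter values are illustrative assumptions rather than the setup of [Xu 2017]; in practice the cubic subproblem is solved with Lanczos/Krylov-type methods and the Hessian is replaced by a sub-sampled estimate.

```python
import numpy as np

def cubic_newton_step(g, H, sigma=1.0, inner_iters=200, lr=0.1):
    """Approximately minimize the cubic-regularized local model
        m(s) = g^T s + 0.5 * s^T H s + (sigma / 3) * ||s||^3
    by plain gradient descent on s -- a simple stand-in for the
    Lanczos/Krylov subproblem solvers used in practice."""
    s = np.zeros_like(g)
    for _ in range(inner_iters):
        # gradient of m(s): g + H s + sigma * ||s|| * s
        s = s - lr * (g + H @ s + sigma * np.linalg.norm(s) * s)
    return s

# Toy non-convex objective with a saddle point at the origin (illustrative):
#   f(x, y) = 0.5 * (x**2 - y**2) + 0.25 * y**4,  local minima at (0, +/-1)
def grad_f(z):
    x, y = z
    return np.array([x, -y + y**3])

def hess_f(z):
    _, y = z
    return np.diag([1.0, -1.0 + 3.0 * y**2])

z = np.array([0.5, 0.01])          # start close to the saddle point
for _ in range(20):
    z = z + cubic_newton_step(grad_f(z), hess_f(z), sigma=1.0)
print(z)                           # should approach a local minimum near (0, 1)
```

The negative curvature of the Hessian at the saddle point enters the cubic model directly, so the computed step moves along the escape direction even though the gradient component in that direction is tiny; a plain gradient step from the same starting point would make almost no progress along it.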