2011 49th Annual Allerton Conference on Communication, Control, and Computing

Early stopping for non-parametric regression: An optimal data-dependent stopping rule


Abstract

The goal of non-parametric regression is to estimate an unknown function $f$ based on $n$ i.i.d. observations of the form $y_i = f(x_i) + w_i$, where $\{w_i\}_{i=1}^n$ are additive noise variables. Simply choosing a function to minimize the least-squares loss $\frac{1}{2n} \sum_{i=1}^n (y_i - f(x_i))^2$ will lead to “overfitting”, so that various estimators are based on different types of regularization. The early stopping strategy is to run an iterative algorithm such as gradient descent for a fixed but finite number of iterations. Early stopping is known to yield estimates with better prediction accuracy than those obtained by running the algorithm for an infinite number of iterations. Although bounds on this prediction error are known for certain function classes and step size choices, the bias-variance tradeoffs for arbitrary reproducing kernel Hilbert spaces (RKHSs) and arbitrary choices of step-sizes have not been well-understood to date. In this paper, we derive upper bounds on both the $L^2(P_n)$ and $L^2(P)$ error for arbitrary RKHSs, and provide an explicit and easily computable data-dependent stopping rule. In particular, it depends only on the sum of step-sizes and the eigenvalues of the empirical kernel matrix for the RKHS. For Sobolev spaces and finite-rank kernel classes, we show that our stopping rule yields estimates that achieve the statistically optimal rates in a minimax sense.
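To make the ingredients above concrete, here is a minimal sketch of early-stopped gradient descent on the least-squares loss, together with a stopping rule that uses only the running sum of step sizes and the eigenvalues of the empirical kernel matrix. The abstract does not state the rule itself, so the threshold used in `stopping_time` (a first-crossing test on a kernel-complexity quantity, with constant $2e$) is an assumption made for illustration, as is the known noise level `sigma`. All function names are hypothetical.

```python
import math


def kernel_complexity(eigvals, eps):
    """Return sqrt((1/n) * sum_i min(lambda_i, eps^2)) for eigenvalues lambda_i."""
    n = len(eigvals)
    return math.sqrt(sum(min(lam, eps * eps) for lam in eigvals) / n)


def stopping_time(eigvals, step_sizes, sigma):
    """Illustrative first-crossing rule (assumed form, not quoted from the abstract).

    With eta_t the sum of the first t step sizes, return t - 1 for the first t
    at which kernel_complexity(eigvals, 1/sqrt(eta_t)) > 1 / (2 e sigma eta_t).
    Step sizes and sigma are assumed to be strictly positive.
    """
    eta = 0.0
    for t, alpha in enumerate(step_sizes, start=1):
        eta += alpha
        threshold = 1.0 / (2.0 * math.e * sigma * eta)
        if kernel_complexity(eigvals, 1.0 / math.sqrt(eta)) > threshold:
            return t - 1
    return len(step_sizes)  # no crossing within the supplied schedule


def gradient_descent_fitted_values(K, y, step_sizes):
    """Run one gradient step per entry of step_sizes, starting from f = 0.

    K is the empirical kernel matrix with entries k(x_i, x_j) / n, so each step
    is f <- f - alpha * K (f - y) on the fitted values f(x_1), ..., f(x_n).
    """
    n = len(y)
    f = [0.0] * n
    for alpha in step_sizes:
        resid = [f[i] - y[i] for i in range(n)]
        grad = [sum(K[i][j] * resid[j] for j in range(n)) for i in range(n)]
        f = [f[i] - alpha * grad[i] for i in range(n)]
    return f
```

In use, one would compute the eigenvalues of `K` (for instance with `numpy.linalg.eigvalsh`), call `T = stopping_time(eigvals, step_sizes, sigma)`, and then run `gradient_descent_fitted_values(K, y, step_sizes[:T])`. Under this assumed rule a larger noise level triggers the crossing sooner, which matches the intuition that noisier data calls for earlier stopping.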
