Journal: Statistics and Computing

A tutorial on rank-based coefficient estimation for censored data in small- and large-scale problems


Abstract

The analysis of survival endpoints subject to right-censoring is an important research area in statistics, particularly among econometricians and biostatisticians. The two most popular semiparametric models are the proportional hazards model and the accelerated failure time (AFT) model. Rank-based estimation in the AFT model is computationally challenging because it requires optimizing a non-smooth loss function. Previous work has shown that rank-based estimators may be written as solutions to linear programming (LP) problems. However, the size of the LP problem is O(n^2 + p), subject to n^2 linear constraints, where n denotes the sample size and p denotes the dimension of the parameter vector. As n and/or p increases, the feasibility of such a solution in practice becomes questionable. Among data mining and statistical learning enthusiasts, there is interest in extending ordinary low-dimensional regression coefficient estimators into high-dimensional data mining tools through regularization. Applying this recipe to rank-based coefficient estimators leads to formidable optimization problems, which may be avoided through smooth approximations to non-smooth functions. We review smooth approximations and quasi-Newton methods for rank-based estimation in AFT models. The computational cost of our method is substantially smaller than that of the corresponding LP problem, and the method can be applied to small- and large-scale problems alike. The algorithm described here allows one to couple rank-based estimation for censored data with virtually any regularization and is exemplified through four case studies.
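To make the idea of pairing a smooth approximation with a quasi-Newton solver concrete, here is a minimal Python sketch. It is not the authors' algorithm. It assumes the Gehan form of the rank-based loss, the sum over pairs (i, j) of delta_i * max(0, -(e_i - e_j)) with residuals e_i = log(y_i) - x_i'beta, and it replaces max(0, -d) with the softplus surrogate s * log(1 + exp(-d / s)). Both choices are assumptions made for illustration; the abstract does not specify the smoother. The function names (`smoothed_gehan_loss`, `fit_gehan`, `simulate_aft`), the smoothing width `s`, and the simulated data are all hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit


def smoothed_gehan_loss(beta, X, log_y, delta, s=0.05):
    """Softplus-smoothed Gehan loss and its gradient (illustrative choice of smoother)."""
    n = X.shape[0]
    e = log_y - X @ beta                 # AFT residuals e_i
    d = e[:, None] - e[None, :]          # d[i, j] = e_i - e_j
    w = delta[:, None]                   # only uncensored i contribute
    # Smooth surrogate for max(0, -d): s * log(1 + exp(-d / s))
    loss = np.sum(w * s * np.logaddexp(0.0, -d / s))
    # Derivative of the surrogate with respect to d is -sigmoid(-d / s)
    g = -w * expit(-d / s)
    # Chain rule: d(d_ij)/d(beta) = -(x_i - x_j)
    grad = -(g.sum(axis=1) - g.sum(axis=0)) @ X
    return loss / n**2, grad / n**2


def fit_gehan(X, log_y, delta, s=0.05):
    """Minimize the smoothed loss with L-BFGS, a quasi-Newton method."""
    beta0 = np.zeros(X.shape[1])
    result = minimize(smoothed_gehan_loss, beta0, args=(X, log_y, delta, s),
                      jac=True, method="L-BFGS-B")
    return result.x


def simulate_aft(n=200, seed=0):
    """Hypothetical right-censored AFT data, used only to exercise the sketch."""
    rng = np.random.default_rng(seed)
    beta_true = np.array([1.0, -0.5, 0.0])
    X = rng.normal(size=(n, beta_true.size))
    log_t = X @ beta_true + 0.5 * rng.normal(size=n)   # log failure times
    log_c = rng.normal(loc=1.0, scale=1.0, size=n)     # independent log censoring times
    log_y = np.minimum(log_t, log_c)                   # observed log times
    delta = (log_t <= log_c).astype(float)             # 1 = event observed
    return X, log_y, delta, beta_true


if __name__ == "__main__":
    X, log_y, delta, beta_true = simulate_aft()
    print("true beta:     ", beta_true)
    print("estimated beta:", fit_gehan(X, log_y, delta))
```

The pairwise matrix `d` still costs O(n^2) memory in this sketch, so it only illustrates the smoothing and solver steps, not the scalability claims. Because the smoothed loss is differentiable, a differentiable penalty such as a ridge term could be added to the returned loss and gradient without changing the solver call.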
