A tutorial on rank-based coefficient estimation for censored data in small- and large-scale problems

Matthias Chung; Qi Long; Brent A. Johnson

Abstract

The analysis of survival endpoints subject to right-censoring is an important research area in statistics, particularly among econometricians and biostatisticians. The two most popular semiparametric models are the proportional hazards model and the accelerated failure time (AFT) model. Rank-based estimation in the AFT model is computationally challenging due to optimization of a non-smooth loss function. Previous work has shown that rank-based estimators may be written as solutions to linear programming (LP) problems. However, the size of the LP problem is O(n^2 + p) subject to n^2 linear constraints, where n denotes sample size and p denotes the dimension of parameters. As n and/or p increases, the feasibility of such a solution in practice becomes questionable. Among data mining and statistical learning enthusiasts, there is interest in extending ordinary regression coefficient estimators for low-dimensional settings into high-dimensional data mining tools through regularization. Applying this recipe to rank-based coefficient estimators leads to formidable optimization problems, which may be avoided through smooth approximations to non-smooth functions. We review smooth approximations and quasi-Newton methods for rank-based estimation in AFT models. The computational cost of our method is substantially smaller than that of the corresponding LP problem, and the method can be applied to small- or large-scale problems similarly. The algorithm described here allows one to couple rank-based estimation for censored data with virtually any regularization and is exemplified through four case studies.
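
The recipe in the abstract (replace the non-smooth rank loss by a smooth approximation, then apply a quasi-Newton method) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the Gehan form of the rank-based loss, a softplus smoothing of max(0, u) with a hypothetical smoothing parameter `eps`, simulated data, and SciPy's L-BFGS-B as the quasi-Newton solver. The function name `smoothed_gehan` and all data-generating settings are made up for the example.

```python
import numpy as np
from scipy.optimize import minimize


def smoothed_gehan(beta, X, log_t, delta, eps=0.05):
    """Smoothed Gehan loss and its gradient.

    Exact loss: (1/n^2) * sum_{i,j} delta_i * max(0, e_j - e_i),
    with residuals e_i = log_t_i - x_i' beta.
    Assumption (not from the source): max(0, u) is replaced by the
    softplus function eps * log(1 + exp(u / eps)).
    """
    n = len(log_t)
    e = log_t - X @ beta
    u = e[None, :] - e[:, None]        # u[i, j] = e_j - e_i
    w = delta[:, None] / n**2          # weight delta_i on row i
    loss = np.sum(w * eps * np.logaddexp(0.0, u / eps))
    # d softplus / du = sigmoid(u / eps), written with tanh for stability;
    # d u[i, j] / d beta = x_i - x_j.
    s = w * 0.5 * (1.0 + np.tanh(0.5 * u / eps))
    grad = X.T @ (s.sum(axis=1) - s.sum(axis=0))
    return loss, grad


# Simulated right-censored AFT data (all settings are illustrative).
rng = np.random.default_rng(0)
n, p = 200, 3
beta_true = np.array([1.0, -0.5, 0.0])
X = rng.normal(size=(n, p))
log_event = X @ beta_true + 0.5 * rng.normal(size=n)   # log failure time
log_cens = rng.normal(loc=1.5, scale=1.0, size=n)      # log censoring time
log_t = np.minimum(log_event, log_cens)                # observed log time
delta = (log_event <= log_cens).astype(float)          # 1 = event observed

# Quasi-Newton minimization; jac=True means the function returns (loss, grad).
res = minimize(smoothed_gehan, np.zeros(p), args=(X, log_t, delta),
               jac=True, method="L-BFGS-B")
beta_hat = res.x
print("estimated coefficients:", beta_hat)
```

Because the smoothed loss is differentiable, a smooth penalty (for example a ridge term) could be added to both the returned loss and gradient without changing the solver call. This sketch still forms all n^2 residual pairs in memory, so it avoids the LP formulation but not the pairwise evaluation.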

Bibliographic details

Source
Statistics and Computing, 2013, Issue 5, pp. 601-614 (14 pages)
Authors
Matthias Chung; Qi Long; Brent A. Johnson
Affiliations

Department of Mathematics, Virginia Tech, Blacksburg, VA 24061-0123, USA;

Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA;

Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA;

Original format: PDF
Text language: English
Keywords
Accelerated failure time model; Ill-posed problems; Regularization; Survival analysis

