Journal: Annals of the Institute of Statistical Mathematics

Robust distributed estimation and variable selection for massive datasets via rank regression



Abstract

Rank regression is a robust modeling tool, but it is challenging to implement for distributed massive data owing to memory constraints. In practice, massive data may be distributed heterogeneously from machine to machine, and how to account for this heterogeneity is also of interest. This paper proposes a distributed rank regression (DR2), which can be implemented on the master machine by solving a weighted least-squares problem and which is adaptive when the data are heterogeneous. Theoretically, we prove that the resulting estimator is statistically as efficient as the global rank regression estimator. Furthermore, based on the adaptive LASSO and a newly defined distributed BIC-type tuning parameter selector, we propose a distributed regularized rank regression (DR3), which achieves consistent variable selection and can be easily implemented using the LARS algorithm on the master machine. Simulation results and a real data analysis are included to validate the method.
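
As a rough illustration of the pattern the abstract describes (rank-based fits on each machine, followed by a weighted least-squares step on the master), the Python sketch below handles only a single-covariate model y = a + b*x + e. It is not the paper's DR2 estimator, whose details are not given here. The function names `rank_slope`, `local_summary` and `combine_on_master`, the pairwise-slope solver, and the choice of weights (each machine's sum of squared covariate deviations) are all assumptions made for illustration.

```python
def rank_slope(x, y):
    """Rank-based (Wilcoxon-type) slope estimate for y = a + b*x + e.

    Minimises sum_{i<j} |(y_i - y_j) - b * (x_i - x_j)|. The minimiser is the
    weighted median of the pairwise slopes, with weights |x_i - x_j|.
    """
    pairs = []
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            if dx != 0:
                pairs.append(((y[i] - y[j]) / dx, abs(dx)))
    if not pairs:
        raise ValueError("need at least two distinct x values")
    pairs.sort()
    half = sum(w for _, w in pairs) / 2.0
    acc = 0.0
    for slope, w in pairs:
        acc += w
        if acc >= half:
            return slope
    return pairs[-1][0]  # guard against floating-point round-off


def local_summary(x, y):
    """Computed on one worker: local slope plus an illustrative weight.

    Assumption: the weight is the local sum of squared deviations of x,
    which is proportional to the slope's precision if all machines share
    the same error distribution.
    """
    mean_x = sum(x) / len(x)
    sxx = sum((v - mean_x) ** 2 for v in x)
    return rank_slope(x, y), sxx


def combine_on_master(local_slopes, weights):
    """Weighted least squares on the master.

    Minimises sum_k w_k * (b - b_k)^2 over b, which gives the weighted mean
    of the local slopes.
    """
    total = sum(weights)
    return sum(w * b for w, b in zip(weights, local_slopes)) / total
```

Only the pair (local slope, weight) leaves each machine, so the master never needs the raw data. A full multi-covariate version would replace the scalar weights with matrices, and handling heterogeneity across machines would need more than this sketch provides.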
