Online Censoring for Large-Scale Regressions with Application to Streaming Big Data

Abstract

On par with data-intensive applications, the sheer size of modern linear regression problems creates an ever-growing demand for efficient solvers. Fortunately, a significant percentage of the data accrued can be omitted while maintaining a certain quality of statistical inference with an affordable computational budget. This work introduces means of identifying and omitting less informative observations in an online and data-adaptive fashion. Given streaming data, the related maximum-likelihood estimator is sequentially found using first- and second-order stochastic approximation algorithms. These schemes are well suited when data are inherently censored or when the aim is to save communication overhead in decentralized learning setups. In a different operational scenario, the task of joint censoring and estimation is put forth to solve large-scale linear regressions in a centralized setup. Novel online algorithms are developed enjoying simple closed-form updates and provable (non)asymptotic convergence guarantees. To attain desired censoring patterns and levels of dimensionality reduction, thresholding rules are investigated too. Numerical tests on real and synthetic datasets corroborate the efficacy of the proposed data-adaptive methods compared to data-agnostic random projection-based alternatives.
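
As a rough illustration of the first-order, data-adaptive censoring idea described in the abstract, the sketch below runs a least-mean-squares (LMS) style pass over streaming data and skips any observation whose prediction residual is small. The abstract does not state the exact rule or updates, so the function name `censored_lms`, the threshold rule `|residual| <= tau * sigma`, and all parameter values are assumptions made for illustration, not the paper's algorithm.

```python
import numpy as np

def censored_lms(X, y, step=0.05, tau=1.5, sigma=0.1):
    """LMS-style pass that censors (skips) low-residual observations.

    Assumed rule (hypothetical, for illustration): an observation is
    treated as uninformative when |y_n - x_n^T theta| <= tau * sigma.
    """
    n, p = X.shape
    theta = np.zeros(p)
    used = 0  # number of observations that triggered an update
    for x_n, y_n in zip(X, y):
        err = y_n - x_n @ theta
        if abs(err) <= tau * sigma:
            continue  # censored: no update, no computation spent
        theta += step * err * x_n  # stochastic-gradient (LMS) step
        used += 1
    return theta, used

# Synthetic streaming data: y = X theta_true + noise
rng = np.random.default_rng(0)
n, p = 5000, 5
theta_true = rng.normal(size=p)
X = rng.normal(size=(n, p))
y = X @ theta_true + 0.1 * rng.normal(size=n)

theta_hat, used = censored_lms(X, y, step=0.05, tau=1.5, sigma=0.1)
print("observations used:", used, "of", n)
print("estimation error:", np.linalg.norm(theta_hat - theta_true))
```

Raising `tau` censors more observations and lowers the per-sample cost, at the price of a coarser estimate; this is the kind of trade-off the thresholding rules mentioned in the abstract are meant to control.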