Statistics and Its Interface

REC: fast sparse regression-based multicategory classification

Abstract

Recent advances in technology enable researchers to gather and store enormous data sets with ultra-high dimensionality. In bioinformatics, microarray and next-generation sequencing technologies can produce data with tens of thousands of predictors of biomarkers. On the other hand, the corresponding sample sizes are often limited. For classification problems, to predict new observations with high accuracy and to better understand the effect of predictors on classification, it is desirable, and often necessary, to train the classifier with variable selection. In the literature, sparse regularized classification techniques have been popular because they perform classification and variable selection simultaneously. Despite their success, such sparse penalized methods can be computationally slow when the dimension of the problem is ultra high. To overcome this challenge, we propose a new sparse REgression based multicategory Classifier (REC). Our method uses a simplex to represent the different categories of the classification problem. A major advantage of REC is that the optimization can be decoupled into smaller independent sparse penalized regression problems, which can then be solved using parallel computing. Consequently, REC is computationally very fast. Moreover, REC can provide class conditional probability estimates. Simulated examples and applications to microarray and next-generation sequencing data suggest that REC is very competitive with several existing methods.
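The decoupling described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a squared-error loss on simplex-coded class labels with a lasso penalty, solved by plain coordinate descent, and it assumes prediction by the largest inner product with the class vertices. The abstract specifies none of these details, and class probability estimation is not covered here. All function names (`simplex_vertices`, `lasso_cd`, `fit_rec_sketch`, `predict_rec_sketch`) and the toy data are hypothetical.

```python
import math
import random

def simplex_vertices(k):
    """Return k unit vectors in R^(k-1) with equal pairwise angles, one per class."""
    a = -(1 + math.sqrt(k)) / (k - 1) ** 1.5
    b = math.sqrt(k / (k - 1))
    W = [[(k - 1) ** -0.5] * (k - 1)]
    for j in range(1, k):
        W.append([a + (b if q == j - 1 else 0.0) for q in range(k - 1)])
    return W

def soft_threshold(z, t):
    """Shrink z toward zero by t (the lasso proximal step)."""
    return math.copysign(max(abs(z) - t, 0.0), z)

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    cols = [[row[j] - x_mean[j] for row in X] for j in range(p)]  # centered columns
    col_sq = [sum(v * v for v in c) / n for c in cols]
    beta = [0.0] * p
    resid = [v - y_mean for v in y]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(c * r for c, r in zip(cols[j], resid)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != beta[j]:
                delta = beta[j] - new
                resid = [r + c * delta for c, r in zip(cols[j], resid)]
                beta[j] = new
    intercept = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return intercept, beta

def fit_rec_sketch(X, labels, k, lam):
    """Fit one sparse regression per simplex coordinate (k-1 fits in total)."""
    W = simplex_vertices(k)
    # The k-1 fits share no parameters, so each could run on a separate worker.
    models = [lasso_cd(X, [W[c][q] for c in labels], lam) for q in range(k - 1)]
    return W, models

def predict_rec_sketch(x, W, models):
    """Assumed rule: pick the class whose vertex best aligns with the fitted vector."""
    f = [b0 + sum(b * v for b, v in zip(beta, x)) for b0, beta in models]
    scores = [sum(fq * wq for fq, wq in zip(f, w)) for w in W]
    return max(range(len(W)), key=scores.__getitem__)

# Toy data (made up): 3 classes separated on the first 2 of 10 features.
random.seed(0)
k, p = 3, 10
centers = [(3.0, 0.0), (-1.5, 2.6), (-1.5, -2.6)]
labels = [i % k for i in range(90)]
X = [[random.gauss(centers[c][0], 1), random.gauss(centers[c][1], 1)]
     + [random.gauss(0, 1) for _ in range(p - 2)] for c in labels]
W, models = fit_rec_sketch(X, labels, k, lam=0.1)
accuracy = sum(predict_rec_sketch(x, W, models) == c for x, c in zip(X, labels)) / len(X)
n_nonzero = sum(b != 0 for _, beta in models for b in beta)
print(accuracy, n_nonzero)
```

Because each column of the simplex-coded response is fitted separately, the list comprehension in `fit_rec_sketch` is the natural place to parallelize, for example by mapping `lasso_cd` over the `k - 1` target columns with a process pool. The pure-Python solver is chosen only to keep the sketch dependency-free; a numerical library would be the practical choice at the dimensions the abstract mentions.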
