Classification of large microarray datasets using fast random forest construction.

Manilich Elena A; Ozsoyoglu Z Meral; Trubachev Valeriy; Radivoyevitch Tomas

Abstract

Random forest is an ensemble classification algorithm. It performs well when most predictive variables are noisy and can be used when the number of variables is much larger than the number of observations. The use of bootstrap samples and restricted subsets of attributes makes it more powerful than simple ensembles of trees. The main advantage of a random forest classifier is its explanatory power: it measures variable importance, that is, the impact of each factor on a predicted class label. These characteristics make the algorithm ideal for microarray data. It has been shown to build models with high accuracy when tested on high-dimensional microarray datasets. Current implementations of random forest in the machine learning and statistics community, however, limit its usability for mining large datasets, as they require that the entire dataset remain permanently in memory. We propose a new framework, an optimized implementation of a random forest classifier, which addresses specific properties of microarray data, takes the computational complexity of a decision tree algorithm into consideration, and shows excellent computing performance while preserving predictive accuracy. The implementation is based on reducing overlapping computations and eliminating dependency on the size of main memory. The implementation's excellent computational performance makes the algorithm useful for interactive data analyses and data mining.
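
As a rough illustration of the ingredients the abstract names (bootstrap samples, restricted subsets of attributes at each split, and a variable importance measure), the following minimal sketch builds a small random forest in plain Python. It is not the authors' implementation: it keeps all data in memory and makes no attempt at the speed or memory optimizations the paper proposes, which the abstract does not describe in detail. The function names, the toy data, the Gini-based importance score, and the square-root default for the number of candidate attributes are all assumptions made for the example.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, features):
    """Return (impurity decrease, feature, threshold) of the best split, or None."""
    n, parent = len(y), gini(y)
    best = None
    for f in features:
        # Skip the largest value: splitting there would leave the right side empty.
        for t in sorted({row[f] for row in X})[:-1]:
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, f, t)
    return best

def grow_tree(X, y, mtry, rng, importance):
    """Grow an unpruned tree, drawing mtry candidate features at every node."""
    if len(set(y)) == 1:
        return y[0]
    split = best_split(X, y, rng.sample(range(len(X[0])), mtry))
    if split is None:
        return Counter(y).most_common(1)[0][0]
    gain, f, t = split
    importance[f] += gain * len(y)  # impurity decrease weighted by node size
    left = [i for i in range(len(y)) if X[i][f] <= t]
    right = [i for i in range(len(y)) if X[i][f] > t]
    return (f, t,
            grow_tree([X[i] for i in left], [y[i] for i in left], mtry, rng, importance),
            grow_tree([X[i] for i in right], [y[i] for i in right], mtry, rng, importance))

def predict_tree(node, row):
    """Follow splits until a leaf (a plain class label) is reached."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def random_forest(X, y, n_trees=50, mtry=None, seed=0):
    """Train n_trees trees, each on a bootstrap sample; return trees and importances."""
    rng = random.Random(seed)
    p = len(X[0])
    mtry = mtry or max(1, int(p ** 0.5))  # assumed default: sqrt of feature count
    importance = [0.0] * p
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in range(len(y))]  # sample with replacement
        trees.append(grow_tree([X[i] for i in idx], [y[i] for i in idx],
                               mtry, rng, importance))
    return trees, importance

def predict(trees, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in trees).most_common(1)[0][0]

def make_toy_data(n=20, p=30, seed=1):
    """Hypothetical data: more variables than observations, one informative variable."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(p)]
        row[0] = label * 4.0 + rng.gauss(0, 0.5)  # only feature 0 carries signal
        X.append(row)
        y.append(label)
    return X, y

X, y = make_toy_data()
trees, importance = random_forest(X, y)
top_feature = max(range(len(importance)), key=importance.__getitem__)
train_accuracy = sum(predict(trees, r) == c for r, c in zip(X, y)) / len(y)
print("most important feature:", top_feature)
print("training accuracy:", train_accuracy)
```

The toy data mimic the setting described above in miniature: 30 variables, 20 observations, and a single variable that separates the classes. Summing the impurity decrease per variable across all trees is one common way to score importance; the abstract does not say which measure the paper uses.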


Bibliographic details

Source: Journal of Bioinformatics and Computational Biology, 2011, Issue 2, 17 pages
Authors: Manilich Elena A; Ozsoyoglu Z Meral; Trubachev Valeriy; Radivoyevitch Tomas
Original format: PDF
Language: English
Chinese Library Classification: Cell biology
