Conference: Pattern Recognition in Bioinformatics
Feature Selection and Classification for Small Gene Sets

Abstract

Random Forests, Support Vector Machines and k-Nearest Neighbors are proven classification techniques that are widely used across many kinds of classification problems. One such problem is the classification of genomic and proteomic data, which is known for its extremely high dimensionality and therefore demands suitable classification techniques. In this domain, these classifiers are usually combined with gene selection techniques to achieve the best classification accuracy. Another reason for reducing the dimensionality of such datasets is interpretability: it is much easier to interpret a small set of ranked genes than 20 or 30 thousand unordered genes. In this paper we present a classification ensemble of decision trees called Rotation Forest and evaluate its classification performance on small subsets of ranked genes for 14 genomic and proteomic classification problems. The results demonstrate an important feature of Rotation Forest: robustness and high classification accuracy when using small sets of genes.
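The general workflow the abstract describes (rank the genes, keep a small top-ranked subset, then classify on that subset) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the abstract does not name a ranking criterion, so a simple signal-to-noise score is assumed here; k-Nearest Neighbors stands in for Rotation Forest to keep the example dependency-free; and the data is synthetic. The names `rank_genes`, `knn_predict`, and `make_toy_data` are hypothetical helpers introduced only for this sketch.

```python
import math
import random
from collections import Counter


def rank_genes(X, y):
    """Rank gene indices (best first) by a two-class signal-to-noise score.

    Assumption: the score |mean0 - mean1| / (std0 + std1) is illustrative only;
    the abstract does not state which ranking method was used.
    """
    labels = sorted(set(y))
    assert len(labels) == 2, "this sketch handles two classes only"
    scores = []
    for g in range(len(X[0])):
        stats = []
        for lab in labels:
            vals = [row[g] for row, t in zip(X, y) if t == lab]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            stats.append((mean, math.sqrt(var)))
        (m0, s0), (m1, s1) = stats
        scores.append(abs(m0 - m1) / (s0 + s1 + 1e-12))
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)


def knn_predict(X_train, y_train, x, genes, k=3):
    """Majority vote among the k nearest training samples, with Euclidean
    distance computed on the selected genes only."""
    dists = sorted(
        (math.dist([row[g] for g in genes], [x[g] for g in genes]), label)
        for row, label in zip(X_train, y_train)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def make_toy_data(n_samples=40, n_genes=200, n_informative=5, seed=0):
    """Synthetic stand-in for an expression matrix (not real data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in range(n_informative):
            row[g] += 4.0 * label  # shift the informative genes in class 1
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
X_train, y_train = X[:30], y[:30]
X_test, y_test = X[30:], y[30:]

ranking = rank_genes(X_train, y_train)  # rank using training samples only
top_genes = ranking[:5]                 # keep a small subset of ranked genes
predictions = [knn_predict(X_train, y_train, x, top_genes) for x in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print("selected genes:", sorted(top_genes))
print("test accuracy:", accuracy)
```

Ranking the genes on the training samples only, as done here, avoids leaking test information into the selection step. Any other classifier could be substituted for `knn_predict` without changing the selection logic.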
