Journal: Genomics

Unsupervised feature selection algorithm for multiclass cancer classification of gene expression RNA-Seq data



Abstract

This paper presents a Grouping Genetic Algorithm (GGA) for solving a maximally diverse grouping problem. The algorithm is applied to the classification of an unbalanced database of 801 gene expression RNA-Seq samples covering 5 types of cancer. Each sample comprises 20,531 genes. The GGA extracts several groups of genes that achieve high accuracy in multiclass classification. Accuracy was evaluated with an Extreme Learning Machine algorithm and was found to be slightly higher on balanced databases than on unbalanced ones. The final classification decision is made through a weighted majority vote among the groups of features. The proposed algorithm ultimately selects 49 genes and classifies samples with an average accuracy of 98.81% and a standard deviation of 0.0174.
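
The weighted majority vote among feature groups can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the weights are chosen, so the sketch assumes each gene group's weight is its validation accuracy, and the function name `weighted_majority_vote`, the example predictions, and the example weights are all hypothetical.

```python
import numpy as np

def weighted_majority_vote(group_predictions, group_weights, n_classes):
    """Combine per-group class predictions into one label per sample.

    group_predictions: (n_groups, n_samples) integer class labels, one row
        per gene group (e.g. produced by one classifier per group).
    group_weights: (n_groups,) non-negative weights. Assumption: each
        group's validation accuracy; the abstract does not specify this.
    """
    preds = np.asarray(group_predictions, dtype=int)
    weights = np.asarray(group_weights, dtype=float)
    n_groups, n_samples = preds.shape
    scores = np.zeros((n_samples, n_classes))
    for g in range(n_groups):
        # Add this group's weight to the class it voted for, per sample.
        scores[np.arange(n_samples), preds[g]] += weights[g]
    # Pick the class with the largest total weight; ties go to the
    # lowest class index.
    return scores.argmax(axis=1)

# Hypothetical example: 3 gene groups, 4 samples, 5 cancer classes.
predictions = [
    [0, 1, 2, 4],
    [0, 3, 2, 1],
    [1, 3, 0, 1],
]
weights = [0.99, 0.97, 0.95]
final_labels = weighted_majority_vote(predictions, weights, n_classes=5)
print(final_labels)
```

In this sketch a single high-weight group can be outvoted by two lower-weight groups that agree, which is the usual motivation for weighting votes rather than trusting the single best group.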
