Journal of Bioinformatics and Computational Biology

AN INTEGRATED FEATURE SELECTION AND CLASSIFICATION METHOD TO SELECT MINIMUM NUMBER OF VARIABLES ON THE CASE STUDY OF GENE EXPRESSION DATA



Abstract

This paper introduces a novel generic approach for classification problems, with the objective of achieving maximum classification accuracy with the minimum number of selected features. The method is illustrated with several case studies of gene expression data. Our approach integrates filter and wrapper gene selection methods, with the added objective of selecting a small set of non-redundant genes that are most relevant for classification. It also provides bins from which genes can be swapped when searching for biological relevance. The approach is capable of selecting relatively few marker genes while giving comparable or better leave-one-out cross-validation accuracy than gene ranking selection approaches. Additionally, gene profiles can be extracted from the evolving connectionist system, which provides a set of rules that can be further developed into expert systems. The approach integrates the Pearson correlation coefficient and signal-to-noise ratio methods with an adaptive evolving classifier, validated through the leave-one-out method. Gene expression datasets from four case studies are used to illustrate the method. The results show that the proposed approach improves the feature selection process, both reducing the number of variables required and increasing classification accuracy.
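The abstract names the ingredients (signal-to-noise ratio ranking, Pearson correlation to avoid redundant genes, leave-one-out validation) without giving details, so the following is only a minimal sketch of how such a pipeline might fit together, not the authors' method. Several things here are assumptions: the two-class SNR formula, the greedy redundancy filter, the `max_genes` and `corr_threshold` parameters, and all function names. A simple nearest-centroid classifier is swapped in for the evolving connectionist system, and the gene-swapping bins are not modelled.

```python
import numpy as np

def snr_scores(X, y):
    """Per-gene signal-to-noise ratio for two classes (assumed form):
    |mean1 - mean0| / (std1 + std0). X is samples x genes, y holds 0/1 labels."""
    X0, X1 = X[y == 0], X[y == 1]
    diff = np.abs(X1.mean(axis=0) - X0.mean(axis=0))
    return diff / (X1.std(axis=0) + X0.std(axis=0) + 1e-12)

def select_genes(X, y, max_genes=3, corr_threshold=0.8):
    """Greedy filter (illustrative): walk genes in descending SNR order and skip
    any gene whose absolute Pearson correlation with an already selected gene
    exceeds corr_threshold."""
    order = np.argsort(-snr_scores(X, y))
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = []
    for g in order:
        if all(corr[g, s] <= corr_threshold for s in selected):
            selected.append(int(g))
        if len(selected) == max_genes:
            break
    return selected

def loocv_accuracy(X, y, max_genes=3, corr_threshold=0.8):
    """Leave-one-out cross-validation. Gene selection is repeated inside every
    fold so the held-out sample never influences which genes are chosen.
    The classifier is a nearest-centroid stand-in, not the paper's classifier."""
    n = len(y)
    correct = 0
    for i in range(n):
        mask = np.arange(n) != i
        X_train, y_train = X[mask], y[mask]
        genes = select_genes(X_train, y_train, max_genes, corr_threshold)
        c0 = X_train[y_train == 0][:, genes].mean(axis=0)
        c1 = X_train[y_train == 1][:, genes].mean(axis=0)
        x = X[i, genes]
        pred = int(np.linalg.norm(x - c1) < np.linalg.norm(x - c0))
        correct += int(pred == y[i])
    return correct / n
```

Calling `select_genes(X, y)` on an expression matrix `X` (samples by genes) with binary labels `y` returns the column indices of the chosen genes, and `loocv_accuracy(X, y)` gives the fraction of held-out samples classified correctly. Repeating the selection inside each fold is a deliberate choice in this sketch: selecting genes once on the full dataset would leak information from the held-out sample into the accuracy estimate.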
