Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Prediction of Essential Genes in Comparison States Using Machine Learning


Abstract

Identifying essential genes in comparison states (EGS) is vital to understanding cell differentiation, performing drug discovery, and identifying disease causes. Here, we present a machine learning method termed Prediction of Essential Genes in Comparison States (PreEGS). To capture how the network changes between comparison states, PreEGS extracts the topological and gene expression features of each gene into a five-dimensional vector. PreEGS also uses a positive sample expansion method to address the imbalance between positive and negative samples that is often encountered in practical applications. Different classifiers were applied to simulated datasets, and the PreEGS variant based on the random forest model (PreEGSRF) was chosen for its optimal performance. PreEGSRF was then compared with six other methods, including three machine learning methods, for predicting EGS in a specific state. On real datasets with four gene regulatory networks, PreEGSRF predicted five essential genes related to leukemia and five enriched KEGG pathways. Four of the predicted essential genes and all predicted pathways were consistent with previous studies and highly correlated with leukemia. With high prediction accuracy and generalization ability, PreEGSRF is broadly applicable to the discovery of disease-causing genes, driver genes for cell fate decisions, and complex biomarkers of biological systems.
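The abstract does not say how the positive sample expansion works, so the following is only a minimal sketch of the general idea of rebalancing classes before training. It substitutes plain random oversampling for the authors' method. The function name `expand_positives`, the feature values, and the labels are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def expand_positives(samples, labels, seed=0):
    """Duplicate randomly chosen positive samples until classes are balanced.

    This is ordinary random oversampling, used as a stand-in. It is NOT the
    expansion method of PreEGS, which the abstract does not describe.
    """
    rng = random.Random(seed)
    positives = [s for s, y in zip(samples, labels) if y == 1]
    if not positives:
        raise ValueError("need at least one positive sample")
    n_negative = sum(1 for y in labels if y == 0)
    n_extra = max(0, n_negative - len(positives))
    extra = [rng.choice(positives) for _ in range(n_extra)]
    # Return new lists so the caller's inputs are left unmodified.
    return samples + extra, labels + [1] * len(extra)


# Hypothetical five-dimensional feature vectors, one per gene.
# The meaning of each dimension is an assumption made for illustration.
features = [
    [0.9, 0.8, 0.7, 0.6, 0.9],  # labeled essential (positive)
    [0.1, 0.2, 0.1, 0.0, 0.3],
    [0.2, 0.1, 0.3, 0.1, 0.2],
    [0.0, 0.3, 0.2, 0.2, 0.1],
    [0.3, 0.0, 0.1, 0.3, 0.0],
    [0.1, 0.1, 0.0, 0.1, 0.2],
]
labels = [1, 0, 0, 0, 0, 0]

X, y = expand_positives(features, labels)
class_counts = Counter(y)
```

The balanced `X` and `y` could then be passed to any classifier, for example a random forest, which is the model family the abstract reports as performing best.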
