IEEE International Conference on Computational Advances in Bio and Medical Sciences

Differential gene expression analysis using coexpression and RNA-Seq data

Abstract

RNA-Seq is increasingly used for differential gene expression analysis, a field dominated by microarray technology over the past decade. However, inferring differential gene expression from observed differences in RNA-Seq read counts poses unique challenges that were not present in microarray-based analysis. The differential expression estimate may be biased against low read counts, so that differential expression is more easily detected for genes with high read counts. If left uncorrected, this estimation bias may propagate into downstream analyses at the systems biology level. To obtain a better inference of differential gene expression, we propose MRFSeq, a new efficient algorithm based on a Markov random field (MRF) model that uses additional gene coexpression data to enhance prediction power. Our main technical contribution is the careful selection of the clique potential functions in the MRF so that its maximum a posteriori (MAP) estimation can be reduced to the well-known maximum flow problem and thus solved in polynomial time. Our extensive experiments on simulated and real RNA-Seq datasets demonstrate that MRFSeq is more accurate and less biased against genes with low read counts than existing methods based on RNA-Seq data alone. For example, on the well-studied MAQC dataset, MRFSeq improved the sensitivity from 11.6% to 38.8% for genes with low read counts. MRFSeq is implemented in C++ and available at http://www.cs.ucr.edu/~yyang027/mrfseq.htm
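
The reduction from MAP estimation to maximum flow can be illustrated with a minimal Python sketch. It is not MRFSeq's implementation, and the abstract does not give the actual clique potentials. The sketch assumes a binary label per gene (1 = differentially expressed, 0 = not), a hypothetical pair of per-gene costs standing in for the read-count evidence, and a hypothetical nonnegative penalty for each pair of coexpressed genes that receive different labels. Under those assumptions the minimum-energy labeling corresponds to a minimum s-t cut, which a maximum flow computation finds in polynomial time. All function names and numbers below are made up for illustration.

```python
from collections import deque
from itertools import product


def energy(labels, unary, pairwise):
    """Energy of a labeling: per-gene costs plus a penalty for each
    coexpressed pair whose two genes get different labels."""
    e = sum(unary[i][x] for i, x in enumerate(labels))
    e += sum(w for (i, j), w in pairwise.items() if labels[i] != labels[j])
    return e


def map_by_min_cut(unary, pairwise):
    """Minimize the energy above via max flow (Edmonds-Karp).

    unary:    list of (cost_if_label_0, cost_if_label_1), one per gene
    pairwise: dict {(i, j): w} with w >= 0 (assumed coexpression weight)
    Genes left on the source side of the minimum cut get label 1.
    """
    n = len(unary)
    s, t = n, n + 1
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0.0) + c
        cap[v].setdefault(u, 0.0)  # reverse residual edge

    for i, (c0, c1) in enumerate(unary):
        base = min(c0, c1)          # shift so capacities are nonnegative
        add_edge(s, i, c0 - base)   # cut if gene i ends on the sink side (label 0)
        add_edge(i, t, c1 - base)   # cut if gene i ends on the source side (label 1)
    for (i, j), w in pairwise.items():
        if w < 0:
            raise ValueError("pairwise weights must be nonnegative")
        add_edge(i, j, w)
        add_edge(j, i, w)

    def bfs():
        """Return the BFS tree (node -> parent) over edges with residual capacity."""
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = bfs()
        if t not in parent:
            break  # no augmenting path left: the flow is maximum
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] += flow

    # Nodes still reachable from the source form the source side of a min cut.
    return [1 if i in parent else 0 for i in range(n)]


def brute_force_map(unary, pairwise):
    """Exhaustive search, exponential in the number of genes; for checking only."""
    return min(product([0, 1], repeat=len(unary)),
               key=lambda x: energy(x, unary, pairwise))


if __name__ == "__main__":
    # Hypothetical toy data: genes 1 and 3 have weak evidence (as a gene with
    # low read counts might); gene 1 is coexpressed with gene 0, gene 3 with gene 2.
    unary = [(5, 0), (1, 2), (0, 5), (1, 2)]
    pairwise = {(0, 1): 3, (2, 3): 3}
    print(map_by_min_cut(unary, pairwise))  # [1, 1, 0, 0]
```

In this toy input, gene 1 would be labeled 0 on its own evidence, but its coexpression link to the strongly supported gene 0 flips it to 1. Gene 3 stays at 0 because its neighbor, gene 2, is strongly supported as not differentially expressed.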
