Journal: Computational Biology and Bioinformatics, IEEE/ACM Transactions on

Probabilistic Inference on Multiple Normalized Signal Profiles from Next Generation Sequencing: Transcription Factor Binding Sites


Abstract

With the prevalence of chromatin immunoprecipitation (ChIP) with sequencing (ChIP-Seq) technology, a massive amount of ChIP-Seq data has accumulated. ChIP-Seq measures the genome-wide occupancy of DNA-binding proteins. It is well known that different DNA-binding protein occupancies may result in a gene being regulated differently under different conditions (e.g., different cell types). To fully understand a gene's function, it is essential to develop probabilistic models on multiple ChIP-Seq profiles for deciphering gene transcription causalities. In this work, we propose and describe two probabilistic models. The first method (SignalRanker) assumes that the occupancies of different DNA-binding proteins are conditionally independent, and is developed as an intuitive method for ChIP-Seq genome-wide signal profile inference. Unfortunately, this assumption may not hold in some gene regulation cases. Thus, we propose and describe a second method (FullSignalRanker), which does not make the conditional independence assumption. The proposed methods are compared with existing methods on ENCODE ChIP-Seq datasets, demonstrating their regression and classification ability. The results suggest that FullSignalRanker is the best-performing method for recovering the signal ranks on promoter and enhancer regions. In addition, FullSignalRanker is also the best-performing method for peak sequence classification. We envision that SignalRanker and FullSignalRanker will become important in the era of next generation sequencing. The FullSignalRanker program is available on the following website: http://www.cs.toronto.edu/wkc/FullSignalRanker/
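
The conditional independence assumption mentioned in the abstract can be illustrated with a small generic sketch. This is not the SignalRanker or FullSignalRanker model, whose details the abstract does not give. The toy data, the binarized occupancy values, and the function names below are all hypothetical. The sketch only shows how a factorized estimate can diverge from the full joint estimate when two proteins tend to bind together.

```python
# Hypothetical toy data, invented for illustration only:
# (protein_a_bound, protein_b_bound, gene_active) for eight genomic regions.
regions = [
    (1, 1, 1), (1, 1, 1), (0, 0, 1), (0, 0, 1),
    (1, 0, 0), (0, 1, 0), (0, 0, 0), (0, 0, 0),
]


def joint_given_label(data, a, b, label):
    """Empirical P(A=a, B=b | label), estimated from the full joint counts."""
    rows = [r for r in data if r[2] == label]
    return sum(1 for r in rows if r[0] == a and r[1] == b) / len(rows)


def factorized_given_label(data, a, b, label):
    """P(A=a | label) * P(B=b | label), i.e. assuming conditional independence."""
    rows = [r for r in data if r[2] == label]
    p_a = sum(1 for r in rows if r[0] == a) / len(rows)
    p_b = sum(1 for r in rows if r[1] == b) / len(rows)
    return p_a * p_b


# Probability that both proteins are bound, given an active gene.
full = joint_given_label(regions, 1, 1, 1)
naive = factorized_given_label(regions, 1, 1, 1)
print(full, naive)
```

In this toy data the two proteins always co-occur in active regions, so the factorized estimate understates the joint probability. A model that drops the independence assumption can capture that kind of dependence.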
