Journal: BMC Bioinformatics

groHMM: a computational tool for identifying unannotated and cell type-specific transcription units from global run-on sequencing data

Abstract

Background: Global run-on coupled with deep sequencing (GRO-seq) provides extensive information on the location and function of coding and non-coding transcripts, including primary microRNAs (miRNAs), long non-coding RNAs (lncRNAs), and enhancer RNAs (eRNAs), as well as classes of transcripts that have yet to be discovered. However, few computational tools tailored to this new type of sequencing data are available, which limits the applicability of GRO-seq data for identifying novel transcription units.

Results: Here, we present groHMM, a computational tool in R that defines the boundaries of transcription units de novo using a two-state hidden Markov model (HMM). A systematic comparison of groHMM with two existing peak-calling methods tuned to identify broad regions (SICER and HOMER), run on existing GRO-seq data from MCF-7 breast cancer cells, favorably supports our approach. To demonstrate the broader utility of our approach, we have used groHMM to annotate a diverse array of transcription units (i.e., primary transcripts) from four GRO-seq data sets derived from cells representing a variety of human tissue types, including non-transformed cells (cardiomyocytes and lung fibroblasts) and transformed cells (LNCaP and MCF-7 cancer cells), as well as from non-mammalian cells (from flies and worms). As an example of the utility of groHMM and its application to questions about the transcriptome, we show how groHMM can be used to analyze cell type-specific enhancers as defined by newly annotated enhancer transcripts.

Conclusions: Our results show that groHMM can reveal new insights into cell type-specific transcription by identifying novel transcription units, and can serve as a complete and useful tool for evaluating functional genomic elements in cells.
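To make the idea of a two-state HMM segmentation concrete, the following is a minimal generic sketch in Python. It is not groHMM's implementation (groHMM is an R package, and the abstract does not specify its emission model or parameters). The Poisson emissions, the rates `(0.5, 8.0)`, the self-transition probability `stay=0.95`, and the function names are all assumptions chosen for illustration. The sketch labels fixed-size windows of read counts as non-transcribed (0) or transcribed (1) with the Viterbi algorithm, then reports contiguous runs of state 1 as candidate transcription units.

```python
import math


def viterbi_two_state(counts, rates=(0.5, 8.0), stay=0.95):
    """Label each window 0 (non-transcribed) or 1 (transcribed).

    Assumed model (illustrative only): Poisson emissions with the given
    mean read counts per state and a symmetric self-transition probability.
    """
    if not counts:
        return []
    log_stay, log_switch = math.log(stay), math.log(1.0 - stay)

    def log_poisson(k, lam):
        return k * math.log(lam) - lam - math.lgamma(k + 1)

    # Equal prior probability for both states at the first window.
    score = [math.log(0.5) + log_poisson(counts[0], r) for r in rates]
    back = []
    for k in counts[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            cands = [score[p] + (log_stay if p == s else log_switch)
                     for p in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            pointers.append(best)
            new_score.append(cands[best] + log_poisson(k, rates[s]))
        score = new_score
        back.append(pointers)

    # Trace back the most likely state path.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def transcription_units(path):
    """Return (start, end) window indices, end exclusive, for runs of state 1."""
    units, start = [], None
    for i, s in enumerate(path):
        if s == 1 and start is None:
            start = i
        elif s == 0 and start is not None:
            units.append((start, i))
            start = None
    if start is not None:
        units.append((start, len(path)))
    return units


# Hypothetical read counts per window; the dip to 2 sits inside one unit.
counts = [0, 0, 0, 9, 7, 2, 8, 10, 0, 0, 0]
print(transcription_units(viterbi_two_state(counts)))  # [(3, 8)]
```

With these assumed parameters, the low-count window inside the covered region is bridged rather than split, because two state switches cost more than one poorly fitting window. This is the kind of behavior that distinguishes a broad-region HMM segmentation from a per-window threshold.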