Conference: 4th International Conference on Biomedical Engineering in Vietnam

Advances in Computational Identification and Modeling of DNA Regulatory Elements in the Human Genome


Abstract

Identification of DNA regulatory elements in the human genome remains a significant challenge. Variation in these regulatory elements can contribute to disease in many ways by altering protein levels. Enhancers constitute an important class of these DNA regulatory elements, and a major component of current research is focused on a more complete understanding of enhancer function and on improved techniques for enhancer detection. We recently developed a computational approach to identify enhancers from primary DNA sequence using a support vector machine (kmer-SVM) framework. Here we show that the kmer-SVM model can accurately predict tissue-specific enhancer activity without any prior knowledge of transcription factor (TF) binding sites. We adapt this approach to predict genomic TF binding data generated by the ENCODE project, showing that genomic MYC binding can be accurately predicted from local DNA sequence with the kmer-SVM. We find similar accuracy with an SVM using position weight matrices (PWMs) representing known TF binding specificities. By integrating ChIP-seq and expression data, we show that while much of MYC binding is shared between ENCODE cell types and is promoter-proximal, cell-type-specific MYC binding is distal and is correlated with enhanced cell-specific expression of nearby (~50 kb) genes. The distinction between shared and cell-specific MYC binding is determined by DNA sequence variation around the canonical MYC binding site, which by itself cannot distinguish cell-specific binding events. These results suggest that tissue-specific enhancer activity is specified by primary DNA sequence, that local sequence context controls tissue-specific activity through cooperative TF interactions, and that local context sequence features can be identified from genomic binding data.
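To illustrate how a sequence-only model of this kind can work without prior TF binding site knowledge, the following is a minimal sketch of a k-mer (spectrum) similarity between two DNA sequences, the sort of kernel an SVM could be trained on. The abstract does not give the feature details, so the choices here are assumptions for illustration only: the function names, the default `k=6`, the merging of each k-mer with its reverse complement, and the cosine normalization.

```python
from collections import Counter
from math import sqrt

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_counts(seq, k):
    """Count k-mers in seq, merging each k-mer with its reverse complement.

    Assumption: strand-independent features; windows containing
    characters other than A, C, G, T (for example N) are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            canonical = min(kmer, reverse_complement(kmer))
            counts[canonical] += 1
    return counts


def spectrum_kernel(seq_a, seq_b, k=6):
    """Cosine-normalized dot product of the two k-mer count vectors."""
    ca, cb = kmer_counts(seq_a, k), kmer_counts(seq_b, k)
    dot = sum(ca[m] * cb[m] for m in ca.keys() & cb.keys())
    norm = sqrt(sum(v * v for v in ca.values()) *
                sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0
```

A kernel matrix built from such pairwise values over positive (bound or enhancer) and negative (background) sequences could then be passed to any SVM solver that accepts precomputed kernels. Because no motif is supplied up front, any informative sequence context around a site such as the canonical MYC binding site would have to be learned from the training data.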
