PLoS Computational Biology

MrTADFinder: A network modularity based approach to identify topologically associating domains in multiple resolutions

Abstract

Genome-wide proximity ligation based assays such as Hi-C have revealed that eukaryotic genomes are organized into structural units called topologically associating domains (TADs). From a visual examination of the chromosomal contact map, however, it is clear that the organization of the domains is not simple or obvious. Instead, TADs exhibit various length scales and, in many cases, a nested arrangement. Here, by exploiting the resemblance between TADs in a chromosomal contact map and densely connected modules in a network, we formulate TAD identification as a network optimization problem and propose an algorithm, MrTADFinder, to identify TADs from intra-chromosomal contact maps. MrTADFinder is based on the network-science concept of modularity. A key component of it is deriving an appropriate background model for contacts in a random chain, by numerically solving a set of matrix equations. The background model preserves the observed coverage of each genomic bin as well as the distance dependence of the contact frequency for any pair of bins exhibited by the empirical map. Also, by introducing a tunable resolution parameter, MrTADFinder provides a self-consistent approach for identifying TADs at different length scales, hence the acronym "Mr" standing for Multiple Resolutions. We then apply MrTADFinder to various Hi-C datasets. The identified domain boundaries are marked by characteristic signatures in chromatin marks and transcription factors (TF) that are consistent with earlier work. Moreover, by calling TADs at different length scales, we observe that boundary signatures change with resolution, with different chromatin features having different characteristic length scales. Furthermore, we report an enrichment of HOT (high-occupancy target) regions near TAD boundaries and investigate the role of different TFs in determining boundaries at various resolutions. 
To further explore the interplay between TADs and epigenetic marks, as tumor mutational burden is known to be coupled to chromatin structure, we examine how somatic mutations are distributed across boundaries and find a clear stepwise pattern. Overall, MrTADFinder provides a novel computational framework to explore the multi-scale structures in Hi-C contact maps.
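
The modularity idea in the abstract can be illustrated with a minimal Python sketch. This is not the MrTADFinder implementation. It assumes, for illustration only, a background model of the form E[i, j] = kappa[i] * kappa[j] * f(|i - j|), where f is the empirical mean contact frequency at each bin separation, and it substitutes a simple damped fixed-point iteration for the matrix equations mentioned above. In this sketch only the coverage (row-sum) constraint is enforced exactly; the distance dependence enters through f and is preserved only approximately. All function names, the toy contact map, and the `gamma` parameter name are hypothetical.

```python
import numpy as np


def distance_profile(W):
    """Mean contact frequency f(d) at each bin separation d = |i - j|."""
    n = W.shape[0]
    return np.array([np.diagonal(W, offset=d).mean() for d in range(n)])


def background_model(W, n_iter=500):
    """Expected contacts E[i, j] = kappa[i] * kappa[j] * f(|i - j|).

    kappa is found by a damped fixed-point iteration so that each row sum
    of E matches the observed coverage of that bin. Assumes W is symmetric
    with strictly positive entries (an assumption of this sketch).
    """
    n = W.shape[0]
    idx = np.arange(n)
    F = distance_profile(W)[np.abs(idx[:, None] - idx[None, :])]
    coverage = W.sum(axis=1)
    kappa = np.ones(n)
    for _ in range(n_iter):
        # The undamped update would be coverage / (F @ kappa); taking the
        # geometric mean with the previous value avoids oscillation.
        kappa = np.sqrt(kappa * coverage / (F @ kappa))
    return kappa[:, None] * F * kappa[None, :]


def modularity(W, E, starts, gamma=1.0):
    """Modularity-style score of a partition into contiguous domains.

    starts: sorted first-bin indices of the domains, beginning with 0.
    gamma:  resolution parameter; larger values favor smaller domains.
    """
    B = W - gamma * E
    edges = list(starts) + [W.shape[0]]
    within = sum(B[s:e, s:e].sum() for s, e in zip(edges[:-1], edges[1:]))
    return within / W.sum()


def toy_contact_map(n=8, block=4):
    """Toy symmetric map: distance decay plus a bonus inside each block."""
    idx = np.arange(n)
    d = np.abs(idx[:, None] - idx[None, :])
    same_block = (idx[:, None] // block) == (idx[None, :] // block)
    return 1.0 / (1.0 + d) + 2.0 * same_block
```

As a usage example, `W = toy_contact_map()` followed by `E = background_model(W)` gives a background whose row sums match those of `W`, and `modularity(W, E, [0, 4])` scores the partition that splits the toy map at its built-in block boundary. Rescoring candidate partitions with different `gamma` values is one way to probe domains at different length scales.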