Proceedings of the National Academy of Sciences of the United States of America

Systematic prediction and validation of breakpoints associated with copy-number variants in the human genome



Abstract

Copy-number variants (CNVs) are an abundant form of genetic variation in humans. However, approaches for determining exact CNV breakpoint sequences (physical deletion or duplication boundaries) across individuals, crucial for associating genotype to phenotype, have been lacking so far, and the vast majority of CNVs have been reported with approximate genomic coordinates only. Here, we report an approach, called BreakPtr, for fine-mapping CNVs (available from http://breakptr.gersteinlab.org). We statistically integrate both sequence characteristics and data from high-resolution comparative genome hybridization experiments in a discrete-valued, bivariate hidden Markov model. Incorporation of nucleotide-sequence information allows us to take into account the fact that recently duplicated sequences (e.g., segmental duplications) often coincide with breakpoints. In anticipation of an upcoming increase in CNV data, we developed an iterative, "active" approach of initially scoring with a preliminary model, performing targeted validations, retraining the model, and then rescoring, and a flexible parameterization system that intuitively collapses from a full model of 2,503 parameters to a core one of only 10. Using our approach, we accurately mapped >400 breakpoints on chromosome 22 and a region of chromosome 11, refining the boundaries of many previously approximately mapped CNVs. Four predicted breakpoints flanked known disease-associated deletions. We validated an additional four predicted CNV breakpoints by sequencing. Overall, our results suggest a predictive resolution of ≈300 bp. This level of resolution enables more precise correlations between CNVs and across individuals than previously possible, allowing the study of CNV population frequencies. Further, it enabled us to demonstrate a clear Mendelian pattern of inheritance for one of the CNVs.
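
The idea of a discrete-valued, bivariate hidden Markov model can be illustrated with a minimal sketch. This is not the BreakPtr implementation: the three states, the probability values, the binning of the array signal into "low"/"mid"/"high", and the assumption that the two observation channels are independent given the state are all hypothetical choices made for illustration. Each probe contributes a pair (discretized hybridization ratio, flag for recently duplicated sequence), Viterbi decoding assigns a copy-number state to each probe, and predicted breakpoints are the positions where the decoded state changes.

```python
import math

STATES = ("normal", "deletion", "duplication")

# All parameters below are hypothetical and chosen only for illustration.
START = {"normal": 0.98, "deletion": 0.01, "duplication": 0.01}
STAY = 0.95  # probability of remaining in the same state between probes

# Channel 1: discretized array-CGH signal per probe.
EMIT_RATIO = {
    "normal":      {"low": 0.10, "mid": 0.80, "high": 0.10},
    "deletion":    {"low": 0.80, "mid": 0.15, "high": 0.05},
    "duplication": {"low": 0.05, "mid": 0.15, "high": 0.80},
}
# Channel 2: sequence feature (1 = probe lies in recently duplicated sequence).
EMIT_SEQ = {
    "normal":      {0: 0.9, 1: 0.1},
    "deletion":    {0: 0.7, 1: 0.3},
    "duplication": {0: 0.7, 1: 0.3},
}


def transition(a, b):
    """Probability of moving from state a to state b."""
    return STAY if a == b else (1.0 - STAY) / (len(STATES) - 1)


def emit(state, obs):
    """Log-probability of a (ratio_bin, seq_flag) pair, assuming the two
    channels are independent given the state (a simplifying assumption)."""
    ratio_bin, seq_flag = obs
    return math.log(EMIT_RATIO[state][ratio_bin]) + math.log(EMIT_SEQ[state][seq_flag])


def viterbi(observations):
    """Return the most probable state path for a list of observation pairs."""
    score = {s: math.log(START[s]) + emit(s, observations[0]) for s in STATES}
    back = []
    for obs in observations[1:]:
        prev, score, pointers = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(transition(p, s)))
            pointers[s] = best
            score[s] = prev[best] + math.log(transition(best, s)) + emit(s, obs)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def breakpoints(path):
    """Probe indices at which the decoded state changes."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]
```

Calling `viterbi` on a list of `(ratio_bin, seq_flag)` pairs and passing the result to `breakpoints` gives candidate boundary positions in probe coordinates; mapping these to genomic coordinates would depend on probe spacing, which this sketch does not model.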
