Resolving the complexity of the human genome using single-molecule sequencing

Abstract

The human genome is arguably the most complete mammalian reference assembly, yet more than 160 euchromatic gaps remain and aspects of its structural variation remain poorly understood ten years after its completion. To identify missing sequence and genetic variation, here we sequence and analyse a haploid human genome (CHM1) using single-molecule, real-time DNA sequencing. We close or extend 55% of the remaining interstitial gaps in the human GRCh37 reference genome, 78% of which carried long runs of degenerate short tandem repeats, often several kilobases in length, embedded within (G+C)-rich genomic regions. We resolve the complete sequence of 26,079 euchromatic structural variants at the base-pair level, including inversions, complex insertions and long tracts of tandem repeats. Most have not been previously reported, with the greatest increases in sensitivity occurring for events less than 5 kilobases in size. Compared to the human reference, we find a significant insertional bias (3:1) in regions corresponding to complex insertions and long short tandem repeats. Our results suggest a greater complexity of the human genome in the form of variation of longer and more complex repetitive DNA that can now be largely resolved with the application of this longer-read sequencing technology.

Bibliographic record

  • Source
    Nature | 2015, Issue 7536 | pp. 608-611 | 4 pages
  • Author affiliations

    Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA;

    Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA, Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA;

    Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA;

    Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA;

    Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA;

    Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA;

    Dipartimento di Biologia, Università degli Studi di Bari 'Aldo Moro', Bari 70125, Italy;

    Department of Pathology, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, USA;

    Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA;

    Pacific Biosciences of California, Inc., Menlo Park, California 94025, USA;

    Pacific Biosciences of California, Inc., Menlo Park, California 94025, USA;

    Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA;

    Pacific Biosciences of California, Inc., Menlo Park, California 94025, USA;

    Pacific Biosciences of California, Inc., Menlo Park, California 94025, USA;

    Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA, Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI); MEDLINE; Chemical Abstracts (CA)
  • Original format: PDF
  • Language of text: English (eng)