Establishment of an eHAP1 human haploid cell line hybrid reference genome assembled from short and long reads

Abstract

Haploid cell lines are a valuable research tool with broad applicability for genetic assays. As such, the fully haploid human cell line eHAP1 has been used in a wide array of studies. However, the absence of a corresponding reference genome sequence for this cell line has limited its use in experiments that depend on an available sequence, such as capture-clone methodologies. We generated ~15x coverage Nanopore long reads from ten GridION flowcells and used these data to assemble a de novo draft genome with minimap and miniasm, which was subsequently polished using Racon. This assembly was further polished with Pilon and ntEdit using previously generated, low-coverage Illumina short reads. This resulted in a hybrid eHAP1 assembly with > 90% complete BUSCO scores. We further assessed the eHAP1 long-read data for structural variants using Sniffles and identified a variety of rearrangements, including a previously established Philadelphia translocation. Finally, we demonstrate how some of these variants overlap open chromatin regions, potentially impacting regulatory regions. By integrating both long and short reads, we generated a high-quality reference assembly for eHAP1 cells. The union of long and short reads demonstrates the utility of combining sequencing platforms to generate a high-quality reference genome de novo solely from low-coverage data. We expect the resulting eHAP1 genome assembly to provide a useful resource to enable novel experimental applications in this important model cell line.
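
The coverage figure quoted in the abstract follows the usual definition of sequencing depth: total sequenced bases divided by genome size. The sketch below is only an illustration of that arithmetic and is not taken from the paper; the function name `estimate_depth` and the toy read lengths and genome size are hypothetical.

```python
from typing import Iterable


def estimate_depth(read_lengths: Iterable[int], genome_size: int) -> float:
    """Return mean sequencing depth: total read bases / genome size.

    Both arguments are in base pairs. This is the standard definition of
    coverage, not a method described in the abstract.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


# Hypothetical toy data: three reads totalling 30,000 bases over a
# 2,000 bp "genome" give a depth of 15.0.
toy_reads = [12_000, 8_000, 10_000]
print(estimate_depth(toy_reads, 2_000))  # 15.0
```

For a real run, the read lengths would come from the sequencing output and the genome size would be the expected size of the haploid human genome, both of which are inputs this sketch leaves to the reader.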

Record details

  • Source
    Genomics | 2020, Issue 3 | 6 pages
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Medical genetics
