CUSHAW3: Sensitive and Accurate Base-Space and Color-Space Short-Read Alignment with Hybrid Seeding


Abstract

The majority of next-generation sequencing short-reads can be properly aligned by leading aligners at high speed. However, the alignment quality can still be further improved, since usually not all reads can be correctly aligned to large genomes, such as the human genome, even for simulated data. Moreover, even slight improvements in this area are important but challenging, and usually require significantly more computational endeavor. In this paper, we present CUSHAW3, an open-source parallelized, sensitive and accurate short-read aligner for both base-space and color-space sequences. In this aligner, we have investigated a hybrid seeding approach to improve alignment quality, which incorporates three different seed types, i.e. maximal exact match seeds, exact-match k-mer seeds and variable-length seeds, into the alignment pipeline. Furthermore, three techniques: weighted seed-pairing heuristic, paired-end alignment pair ranking and read mate rescuing have been conceived to facilitate accurate paired-end alignment. For base-space alignment, we have compared CUSHAW3 to Novoalign, CUSHAW2, BWA-MEM, Bowtie2 and GEM, by aligning both simulated and real reads to the human genome. The results show that CUSHAW3 consistently outperforms CUSHAW2, BWA-MEM, Bowtie2 and GEM in terms of single-end and paired-end alignment. Furthermore, our aligner has demonstrated better paired-end alignment performance than Novoalign for short-reads with high error rates. For color-space alignment, CUSHAW3 is consistently one of the best aligners compared to SHRiMP2 and BFAST. The source code of CUSHAW3 and all simulated data are available at .
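To make the seeding idea concrete, here is a minimal toy sketch of two of the three seed types named in the abstract: exact-match k-mer seeds and maximal exact match (MEM) seeds. It is not CUSHAW3's implementation. The dictionary-based index, the function names, the parameters `k` and `min_mem_len`, and the "prefer long MEMs, otherwise fall back to k-mer seeds" rule are all assumptions made for illustration, and variable-length seeds are not covered.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map each k-mer of the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def kmer_seeds(read, index, k):
    """Return (read_pos, ref_pos, length) for every exact k-mer hit."""
    seeds = []
    for i in range(len(read) - k + 1):
        for ref_pos in index.get(read[i:i + k], ()):
            seeds.append((i, ref_pos, k))
    return seeds


def extend_to_maximal(read, reference, seed):
    """Extend an exact seed in both directions until a mismatch or a sequence end."""
    r, g, length = seed
    # Extend to the left.
    while r > 0 and g > 0 and read[r - 1] == reference[g - 1]:
        r -= 1
        g -= 1
        length += 1
    # Extend to the right.
    while (r + length < len(read) and g + length < len(reference)
           and read[r + length] == reference[g + length]):
        length += 1
    return (r, g, length)


def hybrid_seeds(read, reference, k=4, min_mem_len=8):
    """Hypothetical fallback rule: use long MEMs if any exist, else k-mer seeds."""
    index = build_kmer_index(reference, k)
    kmers = kmer_seeds(read, index, k)
    # Several k-mer hits can extend to the same MEM, so deduplicate with a set.
    mems = sorted({extend_to_maximal(read, reference, s) for s in kmers})
    long_mems = [m for m in mems if m[2] >= min_mem_len]
    return long_mems if long_mems else kmers


if __name__ == "__main__":
    reference = "TTGACCGTAGGCTAACGTTAGCCATG"
    read = reference[5:17]  # an error-free read taken from the reference
    print(hybrid_seeds(read, reference))
```

A production aligner would replace the dictionary index with a compressed full-text index and would score and chain seeds before running dynamic programming. The sketch only shows how short fixed-length seeds can back up long exact matches when a read contains errors.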