Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

MEC: Misassembly Error Correction in Contigs based on Distribution of Paired-End Reads and Statistics of GC-contents



Abstract

De novo assembly tools aim to reconstruct genomes from next-generation sequencing (NGS) data. However, these tools usually generate a large number of contigs containing many misassemblies, which are caused by repetitive regions, chimeric reads, and sequencing errors. Detecting and correcting misassemblies in contigs is appealing because it can improve the accuracy of assembly results, yet it remains challenging. In this study, a novel method called MEC is proposed to identify and correct misassemblies in contigs. Based on the insert size distribution of paired-end reads and a statistical analysis of GC content, MEC can accurately identify more misassemblies. We evaluate MEC with the NA50 and NGA50 metrics on four datasets, compare it with most of the available misassembly correction tools, and carry out experiments to analyze the influence of MEC on scaffolding results. The results show that MEC can reduce misassemblies effectively and yield quantitative improvements in scaffolding quality. MEC is publicly available at https://github.com/bioinfomaticsCSU/MEC.
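The abstract names two signals, the insert size distribution of paired-end reads and GC-content statistics, without giving the algorithm. The sketch below is not MEC's implementation. It only illustrates the general idea of treating each signal as a distribution and flagging values that fall far from its mean. The function names, the non-overlapping windowing scheme, and the z-score cutoff of 3 are all assumptions made for illustration.

```python
from __future__ import annotations

from statistics import mean, stdev


def gc_content(seq: str) -> float:
    """Fraction of G/C bases in a sequence (0.0 for an empty string)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(contig: str, window: int) -> list[float]:
    """GC content of consecutive non-overlapping windows along a contig.

    The window size is a hypothetical parameter, not a value taken from MEC.
    """
    return [gc_content(contig[i:i + window]) for i in range(0, len(contig), window)]


def outlier_indices(values: list[float], z_cutoff: float = 3.0) -> list[int]:
    """Indices of values more than z_cutoff sample standard deviations from the mean.

    The cutoff is an illustrative assumption; the abstract states no threshold.
    """
    if len(values) < 2:
        return []
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) > z_cutoff * sigma]


if __name__ == "__main__":
    # Toy insert sizes: twenty typical pairs and one that spans far too much sequence.
    insert_sizes = [300.0] * 20 + [5000.0]
    print(outlier_indices(insert_sizes))
    print(windowed_gc("ATATGCGC", 4))
```

With real data, insert sizes would typically come from read alignments (for example, the TLEN field of SAM/BAM records), and a region would be a misassembly candidate only when the evidence around it is consistent, which this toy example does not model.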
