
BLAST++: A Tool for BLASTing Queries in Batches


Abstract

BLAST is the standard tool that molecular biologists use to search for sequence similarity in genomic (and protein) databases. It employs a brute-force approach of comparing a query sequence against every database sequence: for each pair of sequences to be matched, BLAST searches for short fixed-length word pairs (seeds) in the sequences and then extends them to higher-scoring regions. To search multiple queries, the basic approach is to run BLAST on each query, one at a time. This is clearly inefficient and fails to exploit common subsequences that the collection of queries may share. In this paper, we propose a new genome search tool, BLAST++, that allows multiple (say, K) queries to be searched against a database concurrently. The design of BLAST++ is based on our observation that the seed-searching step of BLAST is a bottleneck that consumes more than 80% of the total response time. BLAST++ essentially treats a collection of queries as a single virtual query, so that the seed-searching step needs to be performed only once for common subsequences. We implemented BLAST++ as an extension of NCBI BLAST and evaluated its performance. Our study shows that the results obtained by BLAST++ are identical to those obtained by executing BLAST on each of the K queries, but the single-process version of BLAST++ completes the processing in a much shorter time: only about 25% of the time taken by the original single-process version of NCBI BLAST.
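The "single virtual query" idea can be illustrated with a small sketch. This is not the BLAST++ implementation, whose internals the abstract does not describe. It is a toy model that assumes exact-match seeds of a fixed word length `w`, and the function names (`build_seed_index`, `find_seeds_batched`, `find_seeds_one_by_one`) are hypothetical. It indexes the words of all K queries together, so a word shared by several queries is looked up once per database position instead of once per query.

```python
from collections import defaultdict


def build_seed_index(queries, w):
    """Map each length-w word to every (query_id, query_offset) where it occurs.

    Assumption: seeds are exact word matches; real BLAST seeding is more involved.
    """
    index = defaultdict(list)
    for qid, seq in enumerate(queries):
        for i in range(len(seq) - w + 1):
            index[seq[i:i + w]].append((qid, i))
    return index


def find_seeds_batched(queries, subject, w=3):
    """Scan the subject sequence once against a combined index of all queries.

    Returns {query_id: [(query_offset, subject_offset), ...]}.
    """
    index = build_seed_index(queries, w)
    hits = defaultdict(list)
    for j in range(len(subject) - w + 1):
        # One lookup serves every query that contains this word.
        for qid, i in index.get(subject[j:j + w], ()):
            hits[qid].append((i, j))
    return dict(hits)


def find_seeds_one_by_one(queries, subject, w=3):
    """Baseline: rescan the subject separately for each query."""
    result = {}
    for qid, query in enumerate(queries):
        per_query = find_seeds_batched([query], subject, w).get(0)
        if per_query:
            result[qid] = per_query
    return result


if __name__ == "__main__":
    queries = ["ACGTAC", "CGTACG", "TTTT"]
    subject = "GGACGTACGG"
    batched = find_seeds_batched(queries, subject)
    # The batched scan finds the same seeds as running each query alone.
    assert batched == find_seeds_one_by_one(queries, subject)
    print(batched)
```

In this toy model the batched scan visits each subject position once, while the baseline visits it K times. The seed sets are the same in both cases, which mirrors the abstract's claim that BLAST++ produces results identical to per-query BLAST runs.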
