Journal: IEEE Transactions on Nanobioscience

Distributed Sequence Alignment Applications for the Public Computing Architecture


Abstract

The public computing architecture shows promise as a platform for solving fundamental problems in bioinformatics, such as global gene sequence alignment and data mining with tools such as the Basic Local Alignment Search Tool (BLAST). Our implementations of these two applications on the Berkeley Open Infrastructure for Network Computing (BOINC) platform demonstrate runtime reduction factors of 1.15 for sequence alignment and 16.76 for BLAST. The factor for global gene sequence alignment is modest, but it is measured against a theoretical sequential runtime extrapolated from the calculation of a smaller problem. Because that extrapolation assumes the calculation runs entirely in memory, the theoretical sequential run would require 37.3 GB of memory on a single system. The BOINC implementation therefore offers not only a reduced runtime but also the aggregated memory of all participant nodes. Compared against an actual sequential run, the reduction in runtime would be more drastic, because a practical single system would incur additional secondary-storage I/O overhead. Despite the limitations of the public computing architecture, most notably communication overhead, it is a practical platform for grid- and cluster-scale bioinformatics computations today and shows great potential for future implementations.
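The two quantities the abstract relies on, the memory footprint of an in-memory global alignment and the runtime reduction factor, can be illustrated with a minimal sketch. The abstract does not name the alignment algorithm, so the sketch assumes a standard Needleman-Wunsch dynamic program with a full score table. The scoring values, the 4-byte cell size, and all function names are hypothetical choices for illustration and are not taken from the paper.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Global alignment score using a full (len(a)+1) x (len(b)+1) table.

    Assumption: linear gap penalty and simple match/mismatch scoring;
    the paper's actual scoring scheme is not stated in the abstract.
    """
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    # First column and first row: aligning a prefix against only gaps.
    for i in range(1, rows):
        table[i][0] = i * gap
    for j in range(1, cols):
        table[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = table[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            table[i][j] = max(
                diagonal,                # align a[i-1] with b[j-1]
                table[i - 1][j] + gap,   # gap in b
                table[i][j - 1] + gap,   # gap in a
            )
    return table[-1][-1]


def full_table_bytes(len_a: int, len_b: int, bytes_per_cell: int = 4) -> int:
    """Memory needed to hold the whole score table at once.

    Assumption: one fixed-width integer per cell (4 bytes by default).
    """
    return (len_a + 1) * (len_b + 1) * bytes_per_cell


def runtime_reduction_factor(sequential_seconds: float,
                             distributed_seconds: float) -> float:
    """Ratio of sequential runtime to distributed runtime."""
    return sequential_seconds / distributed_seconds


if __name__ == "__main__":
    print(needleman_wunsch_score("ACGT", "AGT"))
    print(full_table_bytes(9, 9))
    print(runtime_reduction_factor(10.0, 4.0))
```

Because the table grows with the product of the two sequence lengths, memory rather than arithmetic tends to become the limiting resource on a single machine, which is consistent with the abstract's point that pooling the memory of participant nodes matters as much as the runtime reduction itself.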