Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Fast Parallel Markov Clustering in Bioinformatics Using Massively Parallel Computing on GPU with CUDA and ELLPACK-R Sparse Format

Abstract

Markov clustering (MCL) is becoming a key algorithm in bioinformatics for finding clusters in networks. However, as the volume of biological network data grows, performance and scalability are becoming critical limiting factors in applications. Meanwhile, GPU computing, which uses CUDA to provide a massively parallel computing environment on the GPU card, is becoming a powerful, efficient, and low-cost way to achieve substantial performance gains over CPU approaches. Using the GPU's on-chip memory lowers latency and so circumvents a major issue in other parallel computing environments, such as MPI. We introduce a very fast Markov clustering algorithm using CUDA (CUDA-MCL) that performs parallel sparse matrix-matrix computations and parallel sparse Markov matrix normalizations, the two operations at the heart of MCL. We use the ELLPACK-R sparse format to enable effective, fine-grained, massively parallel processing that copes with the sparse nature of interaction network data sets in bioinformatics applications. As the results show, CUDA-MCL is significantly faster than the original MCL running on a CPU. Thus, large-scale parallel computation on off-the-shelf desktop machines, which was previously possible only on supercomputing architectures, can significantly change the way bioinformaticians and biologists deal with their data.
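
To make the two ingredients named in the abstract more concrete, here is a minimal sketch in plain Python (not CUDA, and not the authors' implementation). It assumes the usual description of ELLPACK-R: padded value and column-index arrays stored column-major, plus a per-row length array. It also assumes, for illustration only, that the Markov matrix is stored so that each row is normalized to sum to 1. All function names (`to_ellpack_r`, `spmv`, `inflate_rows`) are hypothetical. Each iteration of an outer `for i` loop corresponds to the work that one GPU thread would do for one row.

```python
def to_ellpack_r(dense):
    """Convert a dense matrix (list of rows) into ELLPACK-R style arrays."""
    n_rows = len(dense)
    rows = [[(j, v) for j, v in enumerate(row) if v != 0.0] for row in dense]
    row_len = [len(r) for r in rows]      # the "R" part: nonzeros per row
    width = max(row_len, default=0)       # every row is padded to this length
    # Column-major layout: entry k of row i is stored at index k * n_rows + i,
    # so neighbouring rows (threads) touch neighbouring memory locations.
    values = [0.0] * (width * n_rows)
    cols = [0] * (width * n_rows)
    for i, r in enumerate(rows):
        for k, (j, v) in enumerate(r):
            values[k * n_rows + i] = v
            cols[k * n_rows + i] = j
    return values, cols, row_len, n_rows


def spmv(values, cols, row_len, n_rows, x):
    """Sparse matrix-vector product y = A x, one row per (conceptual) thread."""
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # row_len[i] lets the loop stop early instead of reading the padding.
        for k in range(row_len[i]):
            idx = k * n_rows + i
            acc += values[idx] * x[cols[idx]]
        y[i] = acc
    return y


def inflate_rows(values, row_len, n_rows, r=2.0):
    """Raise each stored entry to the power r, then rescale each row to sum to 1."""
    out = list(values)
    for i in range(n_rows):
        idxs = [k * n_rows + i for k in range(row_len[i])]
        powered = [values[t] ** r for t in idxs]
        total = sum(powered)
        if total > 0.0:
            for t, p in zip(idxs, powered):
                out[t] = p / total
    return out


if __name__ == "__main__":
    A = [[0.5, 0.5, 0.0],
         [0.0, 1.0, 0.0],
         [0.25, 0.0, 0.75]]
    values, cols, row_len, n = to_ellpack_r(A)
    print(row_len)
    print(spmv(values, cols, row_len, n, [1.0, 2.0, 3.0]))
    print(inflate_rows(values, row_len, n, r=2.0))
```

The sketch covers only a matrix-vector product and the normalization step. A full MCL expansion needs a sparse matrix-matrix product, which can be built from the same per-row loop but is omitted here for brevity.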
