A Paradigm for Parallel Matrix Algorithms: Scalable Cholesky

Abstract

A style for programming problems from matrix algebra is developed with a familiar example and new tools, yielding high performance with a couple of surprising exceptions. The underlying philosophy is to use block recursion as the exclusive control structure, down to a 2^p x 2^p base case anyway, where hardware favors iterative style to fill its pipe. Use of Morton-ordered matrices yields excellent locality within the memory hierarchy, including block sharing among distributed computers. The recursion generalizes nicely to an SPMD program where such sharing is the only communication. Cholesky factorization of an n x n SPD matrix is used as a simple non-trivial example to expose the paradigm. The program amounts to four functions, two of which are finalizers for the other two. This insight allows final blocks to be shared with inter-node communication ∈ Θ(n^2) for this algorithm ∈ Θ(n^3) FLOPS.
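The two ideas named in the abstract, Morton (Z-order) indexing and quadrant recursion for Cholesky, can be illustrated with a minimal Python sketch. This is not the authors' program: the function names (morton_index, to_morton, solve_lower_transposed, cholesky) are hypothetical, the recursion here bottoms out at 1 x 1 rather than a 2^p x 2^p block, the triangular solve is iterative for brevity, the factorization works on ordinary row-major lists rather than Morton-ordered storage, and nothing is distributed. The sketch assumes a symmetric positive definite input; to_morton additionally assumes the order is a power of two.

```python
import math

def morton_index(i, j, bits=16):
    """Interleave the bits of row i and column j (Z-order: NW, NE, SW, SE)."""
    z = 0
    for b in range(bits):
        z |= ((i >> b) & 1) << (2 * b + 1)  # row bits go to odd positions
        z |= ((j >> b) & 1) << (2 * b)      # column bits go to even positions
    return z

def to_morton(a):
    """Flatten a 2^k x 2^k list-of-lists so each quadrant is one contiguous run."""
    n = len(a)
    flat = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            flat[morton_index(i, j)] = a[i][j]
    return flat

def solve_lower_transposed(b, l):
    """Return X with X * L^T == B, for lower-triangular L (row-wise substitution)."""
    x = []
    for row in b:
        xr = []
        for j in range(len(l)):
            acc = row[j] - sum(l[j][k] * xr[k] for k in range(j))
            xr.append(acc / l[j][j])
        x.append(xr)
    return x

def cholesky(a):
    """Return lower-triangular L with L * L^T == A, by recursion on quadrants."""
    n = len(a)
    if n == 1:
        return [[math.sqrt(a[0][0])]]
    h = n // 2
    a11 = [row[:h] for row in a[:h]]
    a21 = [row[:h] for row in a[h:]]
    a22 = [row[h:] for row in a[h:]]
    l11 = cholesky(a11)                       # factor the NW quadrant
    l21 = solve_lower_transposed(a21, l11)    # L21 = A21 * L11^{-T}
    # Schur complement of the SE quadrant: A22 - L21 * L21^T
    schur = [[a22[i][j] - sum(l21[i][k] * l21[j][k] for k in range(h))
              for j in range(n - h)] for i in range(n - h)]
    l22 = cholesky(schur)                     # factor the updated SE quadrant
    top = [l11[i] + [0.0] * (n - h) for i in range(h)]
    bottom = [l21[i] + l22[i] for i in range(n - h)]
    return top + bottom

# Small SPD example built as L * L^T from a known lower-triangular L.
a = [[4.0, 2.0, 2.0, 0.0],
     [2.0, 5.0, 3.0, 2.0],
     [2.0, 3.0, 6.0, 3.0],
     [0.0, 2.0, 3.0, 6.0]]
l = cholesky(a)
```

For this input the factor has rows [2, 0, 0, 0], [1, 2, 0, 0], [1, 1, 2, 0], [0, 1, 1, 2]. In the Morton layout the first quarter of the flat array holds the whole NW quadrant, which is the locality property the abstract relies on when blocks are shared between nodes.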

Bibliographic details

Source
International Euro-Par Parallel Processing Conference; August 30 to September 2, 2005; Lisbon (PT) | 2005 | pp. 687-698 | 12 pages
Conference location: Lisbon (PT)
Authors
David S. Wise; Craig Citro; Joshua Hursey; Fang Liu; Michael Rainey
Author affiliation

Indiana University, Bloomington

Original format: PDF
Language: English
Chinese Library Classification: Computer software; theory and methods

Similar documents

1. A novel parallel algorithm for large-scale Fock matrix construction with small locally distributed memory architectures: RT parallel algorithm [J]. Takashima H., Yamada S., Obara S. Journal of Computational Chemistry: Organic, Inorganic, Physical, Biological. 2002, No. 14
2. A novel parallel algorithm for large-scale Fock matrix construction with small locally distributed memory architectures: RT parallel algorithm [J]. Hajime Takashima, Kunihiro Kitamura, So Yamada. Journal of Computational Chemistry: Organic, Inorganic, Physical, Biological. 2002, No. 14a15
3. Fast and Scalable Parallel Algorithms for Matrix Chain Product and Matrix Powers on Reconfigurable Pipelined Optical Buses [J]. Keqin Li. Journal of Information Science and Engineering. 2002, No. 5
4. A Paradigm for Parallel Matrix Algorithms: Scalable Cholesky [C]. David S. Wise, Craig Citro, Joshua Hursey. International Euro-Par Parallel Processing Conference. 2005
5. Efficient and portable parallel algorithms for Cholesky decomposition [D]. Chu, Pei Yue Liu. 2003
6. Large-Scale Modeling of Epileptic Seizures: Scaling Properties of Two Parallel Neuronal Network Simulation Algorithms [O]. Lorenzo L. Pesce, Hyong C. Lee, Mark Hereld. 2013
7. A high performance sparse Cholesky factorization algorithm for scalable parallel computers [O]. George Karypis, Vipin Kumar. 1994
