Source: IEEE Transactions on Parallel and Distributed Systems

Analysis of Parallel Algorithms for Matrix Chain Product and Matrix Powers on Distributed Memory Systems

Abstract

Given $N$ matrices $A_{1}, A_{2}, \ldots, A_{N}$ of size $N \times N$, the matrix chain product problem is to compute $A_{1} \times A_{2} \times \cdots \times A_{N}$. Given an $N \times N$ matrix $A$, the matrix powers problem is to calculate the first $N$ powers of $A$, that is, $A, A^{2}, A^{3}, \ldots, A^{N}$. We solve the two problems on distributed memory systems (DMSs) with $p$ processors that can support one-to-one communications in $T(p)$ time. Assume that the fastest sequential matrix multiplication algorithm has time complexity $O(N^{\alpha})$, where the currently best value of $\alpha$ is less than 2.3755. Let $p$ be arbitrarily chosen in the range $1 \leq p \leq N^{\alpha + 1}/(\log N)^{2}$. We show that the two problems can be solved by a DMS with $p$ processors in

$$T_{\rm chain}(N,p) = O\left(\frac{N^{\alpha + 1}}{p} + T(p)\left(\frac{N^{2(1 + 1/\alpha)}}{p^{2/\alpha}}\left(\log^{+}\frac{p}{N}\right)^{1 - 2/\alpha} + \log^{+}\left(\frac{p\log N}{N^{\alpha}}\right)\log N\right)\right)$$

and

$$T_{\rm power}(N,p) = O\left(\frac{N^{\alpha + 1}}{p} + T(p)\left(\frac{N^{2(1 + 1/\alpha)}}{p^{2/\alpha}}\left(\log^{+}\frac{p}{2\log N}\right)^{1 - 2/\alpha} + (\log N)^{2}\right)\right)$$

time, respectively, where the function $\log^{+}$ is defined as follows: $\log^{+}x = \log x$ if $x \geq 1$ and $\log^{+}x = 1$ if $0 < x < 1$. We also give instantiations of the above results on several typical DMSs and show that computing matrix chain product and matrix powers are fully scalable on distributed memory parallel computers (DMPCs), highly scalable on DMSs with hypercubic networks, and not highly scalable on DMSs with mesh and torus networks.
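
To make the two problem statements and the $\log^{+}$ function concrete, here is a minimal sequential sketch in Python. It is not the paper's parallel algorithm: it uses schoolbook $O(N^{3})$ multiplication on a single processor, and the helper names (`mat_mul`, `chain_product`, `matrix_powers`, `log_plus`) are hypothetical. The abstract does not state the logarithm base, so base 2 is an assumption here.

```python
import math


def mat_mul(a, b):
    """Schoolbook product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def chain_product(mats):
    """Matrix chain product A_1 x A_2 x ... x A_N, evaluated left to right."""
    result = mats[0]
    for m in mats[1:]:
        result = mat_mul(result, m)
    return result


def matrix_powers(a, count):
    """Matrix powers problem: return the list [A, A^2, ..., A^count]."""
    powers = [a]
    for _ in range(count - 1):
        powers.append(mat_mul(powers[-1], a))
    return powers


def log_plus(x):
    """log^+ as defined in the abstract: log x if x >= 1, and 1 if 0 < x < 1.

    Base 2 is an assumption; the abstract leaves the base unspecified.
    """
    if x <= 0:
        raise ValueError("log^+ is only defined for x > 0")
    return math.log2(x) if x >= 1 else 1.0


if __name__ == "__main__":
    a = [[1, 1], [0, 1]]
    print(chain_product([a, a]))   # same matrix as the second power of a
    print(matrix_powers(a, 2))
    print(log_plus(0.5), log_plus(8))
```

Both routines perform $N - 1$ matrix multiplications when given $N$ matrices (or asked for $N$ powers), so with an $O(N^{\alpha})$ multiplication routine the sequential cost is $O(N^{\alpha + 1})$, which is consistent with the $N^{\alpha + 1}/p$ computation term in the bounds above.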
