Site-Based Partitioning and Repartitioning Techniques for Parallel PageRank Computation

Cevahir Ali; Aykanat Cevdet; Turk Ata; Cambazoglu B. Barla

Abstract

The PageRank algorithm is an important component in effective web search. At the core of this algorithm are repeated sparse matrix-vector multiplications where the involved web matrices grow in parallel with the growth of the web and are stored in a distributed manner due to space limitations. Hence, the PageRank computation, which is frequently repeated, must be performed in parallel with high efficiency and low preprocessing overhead while considering the initial distributed nature of the web matrices. Our contributions in this work are twofold. We first investigate the application of state-of-the-art sparse matrix partitioning models in order to attain high efficiency in parallel PageRank computations, with a particular focus on reducing the preprocessing overhead they introduce. For this purpose, we evaluate two different compression schemes on the web matrix using the site information inherently available in links. Second, we consider the more realistic scenario of starting with initially distributed data and extend our algorithms to cover the repartitioning of such data for efficient PageRank computation. We report performance results using our parallelization of a state-of-the-art PageRank algorithm on two different PC clusters with 40 and 64 processors. Experiments show that the proposed techniques achieve considerably high speedups while incurring a preprocessing overhead of several iterations (for some instances even less than a single iteration) of the underlying sequential PageRank algorithm.
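
To make the two central ideas of the abstract concrete, here is a minimal sequential sketch in Python. It is not the authors' parallel implementation, and the abstract does not describe the two compression schemes in detail. The function names (pagerank, site_graph), the damping value of 0.85, and the choice of the URL host as the "site" of a page are all assumptions made for illustration. The first function shows PageRank as a repeated sparse matrix-vector multiplication; the second shows one plausible way to shrink a page-level graph to a much smaller site-level graph before handing it to a partitioner.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration; each pass is one sparse matrix-vector multiply.

    out_links maps a page URL to the list of page URLs it links to.
    Pages without out-links (dangling pages) spread their rank uniformly.
    The damping value 0.85 is a conventional choice, assumed here.
    """
    pages = sorted(set(out_links) | {q for qs in out_links.values() for q in qs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        dangling = sum(rank[p] for p in pages if not out_links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        # Sparse product: only the nonzeros (the links) are visited.
        for p, targets in out_links.items():
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


def site_graph(out_links):
    """Collapse page-level links into weighted site-to-site edges.

    Assumption: the "site" of a page is the host part of its URL.
    Returns {(src_site, dst_site): number_of_page_links}. Links that
    stay inside one site are dropped, since they would not cross a
    partition boundary if each site is kept on a single processor.
    """
    edges = defaultdict(int)
    for p, targets in out_links.items():
        for q in targets:
            s, t = urlsplit(p).netloc, urlsplit(q).netloc
            if s != t:
                edges[(s, t)] += 1
    return dict(edges)
```

Calling pagerank on a dictionary of page links returns a rank per page whose values sum to 1, and site_graph on the same dictionary returns the coarser site-level edge weights. The site-level graph typically has far fewer vertices than the page-level one, which is the intuition behind using site information to lower partitioning cost.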


Bibliographic details

Source: IEEE Transactions on Parallel and Distributed Systems, 2011, no. 5, pp. 786-802 (17 pages)
Authors: Cevahir Ali; Aykanat Cevdet; Turk Ata; Cambazoglu B. Barla
Affiliation: Tokyo Institute of Technology, Tokyo
Original format: PDF
Language: English
Keywords: PageRank; graph partitioning; hypergraph partitioning; parallelization; repartitioning; sparse matrix partitioning; sparse matrix-vector multiplication; web search


