Conference: International Conference on World Wide Web

High Quality, Scalable and Parallel Community Detection for Large Real Graphs

Abstract

Community detection has become one of the most relevant topics in graph mining, principally for its applications in domains such as social and biological network analysis. Various community detection algorithms have been proposed during the last decade, approaching the problem from different perspectives. However, existing algorithms generally rely on complex and expensive computations, making them unsuitable for large graphs with millions of vertices and edges such as those usually found in the real world. In this paper, we propose a novel disjoint community detection algorithm called Scalable Community Detection (SCD). By combining different strategies, SCD partitions the graph by maximizing the Weighted Community Clustering (WCC), a recently proposed community detection metric based on triangle analysis. Using real graphs with ground-truth overlapping communities, we show that SCD outperforms current state-of-the-art proposals (even those aimed at finding overlapping communities) in terms of quality and performance. SCD provides the speed of the fastest algorithms and the quality, in terms of NMI and F1Score, of the most accurate state-of-the-art proposals. We show that SCD is able to run up to two orders of magnitude faster than practical existing solutions by exploiting the parallelism of current multi-core processors, enabling us to process graphs of unprecedented size in short execution times.
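The abstract describes WCC only as a metric "based on triangle analysis" and does not define it. As a rough illustration of the kind of quantity such a metric works from, the sketch below counts the triangles a vertex closes in the whole graph and the triangles it closes using only members of a candidate community. It is not the WCC formula and not the SCD algorithm; the helper names `build_adjacency` and `triangles_at` and the toy edge list are assumptions made for this example.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops; they cannot form triangles
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangles_at(adj, x, members=None):
    """Count triangles that contain vertex x.

    If `members` is given, only triangles whose other two vertices
    both belong to `members` are counted (hypothetical helper, for
    illustration only).
    """
    neighbors = adj.get(x, set())
    if members is not None:
        neighbors = neighbors & set(members)
    # A triangle through x is a pair of x's neighbors that are themselves adjacent.
    return sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])


# Toy graph: two triangles, {1, 2, 3} and {3, 4, 5}, sharing vertex 3.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (3, 5)]
adj = build_adjacency(edges)
community = {1, 2, 3}

total = triangles_at(adj, 3)              # triangles through 3 in the whole graph
inside = triangles_at(adj, 3, community)  # triangles through 3 within the community
print(total, inside)
```

In this toy graph, vertex 3 closes one triangle with its candidate community and one outside it, which is the sort of local evidence a triangle-based metric can weigh when deciding where a vertex belongs.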