Conference: 2017 Evolving and Adaptive Intelligent Systems

Scalable implementation of dependence clustering in Apache Spark

Abstract

This article proposes a scalable version of the Dependence Clustering algorithm, which belongs to the class of spectral clustering methods. The method is implemented in Apache Spark using GraphX API primitives. The article also introduces a fast approximate diffusion procedure that enables spectral-clustering-type algorithms in the Spark environment. The proposed algorithm is benchmarked against spectral clustering. Results on real-life data indicate that the implementation scales well while still demonstrating good performance on densely connected graphs.
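
The abstract does not describe the diffusion procedure itself, so the following is only a minimal sketch of the general idea behind approximate diffusion on a graph, not the paper's method. It assumes a random-walk formulation: mass is propagated from a start vertex along row-normalised edge weights, and small entries are dropped after each step to keep the state sparse. The function names `transition_rows` and `diffuse`, the `threshold` parameter, and the toy graph are all hypothetical, and plain Python is used instead of GraphX to keep the example self-contained.

```python
from collections import defaultdict


def transition_rows(edges):
    """Build row-normalised random-walk weights from undirected weighted edges.

    `edges` is an iterable of (u, v, weight) triples (an assumed input format).
    """
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w
    rows = {}
    for u, nbrs in adj.items():
        total = sum(nbrs.values())
        rows[u] = {v: w / total for v, w in nbrs.items()}
    return rows


def diffuse(rows, start, steps, threshold=0.0):
    """Propagate unit mass from `start` for `steps` random-walk steps.

    After each step, entries below `threshold` are dropped and the remaining
    mass is renormalised. This truncation is the approximation assumed here.
    """
    mass = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for u, m in mass.items():
            for v, p in rows[u].items():
                nxt[v] += m * p
        # Fall back to the untruncated step if the threshold removes everything.
        kept = {v: m for v, m in nxt.items() if m >= threshold} or dict(nxt)
        total = sum(kept.values())
        mass = {v: m / total for v, m in kept.items()}
    return mass


# Toy graph (hypothetical): two triangles joined by the bridge edge 2-3.
EDGES = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (2, 3, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
]

if __name__ == "__main__":
    rows = transition_rows(EDGES)
    print(diffuse(rows, start=0, steps=2))
    print(diffuse(rows, start=0, steps=2, threshold=0.2))
```

After a few steps most of the mass remains inside the triangle containing the start vertex, which is the kind of signal a diffusion-based clustering method can exploit. In a Spark setting the inner loop would correspond to a message-passing step over edges, but that mapping is an assumption and is not taken from the paper.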
