Journal: Journal of Residuals Science & Technology

Research on K-medoids clustering algorithm based on data density and its parallel processing based on MapReduce



Abstract

The k-medoids algorithm selects its k initial cluster centers at random, so its clustering results vary from run to run. To address this, we combine k-medoids with a density-based clustering algorithm: the density-based algorithm automatically generates the k most appropriate cluster centers, which then serve as the initial seeds for k-medoids. In addition, because k-medoids does not scale well to large data sets, we design a parallel version of the improved algorithm based on the MapReduce computing model and implement it on the Hadoop platform. We test the parallel implementation on several data sets. The experimental results show that the improved k-medoids algorithm clusters more effectively and that the parallel implementation scales well to large data sets.
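
The abstract does not give the details of the density-based seeding or of the MapReduce job, so the following Python sketch only illustrates the general idea and is not the authors' implementation. It assumes that density is a neighbor count within a radius `eps`, that seeds are dense points lying more than `2 * eps` apart, and that one k-medoids iteration splits into a map step (assign each point to its nearest medoid) and a reduce step (pick a new medoid per cluster). The names `density_seeds`, `map_assign`, `reduce_update`, and `k_medoids`, and the parameters `eps` and `min_pts`, are all hypothetical. Everything runs in a single process here; on Hadoop the map and reduce functions would be distributed across workers.

```python
import math
from collections import defaultdict


def density_seeds(points, eps, min_pts):
    """Pick initial medoids from dense regions (assumed heuristic, not the paper's).

    A point's density is the number of points within distance eps of it
    (itself included). Points are visited from densest to sparsest; a point
    becomes a seed if its density is at least min_pts and it lies more than
    2 * eps from every seed chosen so far. The number of seeds k is whatever
    this procedure yields.
    """
    density = [sum(1 for q in points if math.dist(p, q) <= eps) for p in points]
    order = sorted(range(len(points)), key=lambda i: -density[i])
    seeds = []
    for i in order:
        if density[i] < min_pts:
            break  # all remaining points are sparser still
        if all(math.dist(points[i], s) > 2 * eps for s in seeds):
            seeds.append(points[i])
    return seeds


def map_assign(point, medoids):
    """Map step: emit (index of the nearest medoid, point)."""
    idx = min(range(len(medoids)), key=lambda j: math.dist(point, medoids[j]))
    return idx, point


def reduce_update(members):
    """Reduce step: return the member with the smallest total distance to the rest."""
    return min(members, key=lambda c: sum(math.dist(c, m) for m in members))


def k_medoids(points, medoids, max_iter=50):
    """Iterate map and reduce steps until the medoids stop changing."""
    medoids = list(medoids)
    for _ in range(max_iter):
        groups = defaultdict(list)
        for p in points:  # in a real job, mappers would process points in parallel
            idx, pt = map_assign(p, medoids)
            groups[idx].append(pt)
        # Keep the old medoid if a cluster received no points.
        new_medoids = [
            reduce_update(groups[j]) if groups[j] else medoids[j]
            for j in range(len(medoids))
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids
```

A call such as `k_medoids(points, density_seeds(points, eps=1.0, min_pts=3))`, with `points` given as a list of coordinate tuples, runs the seeding and the iteration end to end. Because the seeds are computed deterministically from the data rather than drawn at random, repeated runs on the same input start from the same medoids, which is the source of run-to-run variation the abstract sets out to remove.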
