
Clustering stability-based Evolutionary K-Means



Abstract

Evolutionary K-Means (EKM), which combines K-Means with a genetic algorithm, solves K-Means' initialization problem by selecting parameters automatically through the evolution of partitions. Current EKM algorithms usually choose the silhouette index as the cluster validity index, and they are effective at finding well-separated clusters. However, their performance on noisy data is often disappointing. Clustering stability-based approaches, on the other hand, are more robust to noise, yet they need an intelligent starting point to find some challenging clusters. It is therefore necessary to combine EKM with clustering stability-based analysis. In this paper, we present a novel EKM algorithm that uses clustering stability to evaluate partitions. We first introduce two weighted aggregated consensus matrices, the positive aggregated consensus matrix (PA) and the negative aggregated consensus matrix (NA), to store the clustering tendency of each pair of instances. Specifically, PA stores the tendency of a pair to share the same label, and NA stores the tendency of a pair to have different labels. Based on these matrices, clusters and partitions can be evaluated from the perspective of clustering stability. We then propose a clustering stability-based EKM algorithm, CSEKM, that evolves the partitions and the aggregated matrices simultaneously. To evaluate the algorithm's performance, we compare it with an EKM algorithm, two consensus clustering algorithms, a clustering stability-based algorithm and a multi-index-based clustering approach. Experimental results on a series of artificial datasets, two simulated datasets and eight UCI datasets suggest that CSEKM is more robust to noise.
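The abstract does not give the exact definitions of PA and NA, so the following is only a minimal sketch of the general idea of pairwise consensus accumulation, not the paper's method. It assumes that each partition adds its weight to PA for every pair of instances sharing a label and to NA for every pair with different labels. The function names `aggregate_consensus` and `pair_stability`, the uniform default weights, and the ratio PA / (PA + NA) used as a stability score are all hypothetical choices made for illustration.

```python
import numpy as np


def aggregate_consensus(partitions, weights=None):
    """Accumulate weighted pairwise agreement (PA) and disagreement (NA).

    partitions: list of 1-D label arrays, all of the same length n.
    weights: per-partition weights (assumption: uniform when omitted).
    """
    partitions = [np.asarray(p) for p in partitions]
    n = partitions[0].shape[0]
    if weights is None:
        weights = np.ones(len(partitions))
    pa = np.zeros((n, n))
    na = np.zeros((n, n))
    for labels, w in zip(partitions, weights):
        # same[i, j] is True when instances i and j share a label.
        same = labels[:, None] == labels[None, :]
        pa += w * same
        na += w * ~same
    return pa, na


def pair_stability(pa, na):
    """Illustrative score: fraction of total weight that puts a pair together."""
    total = pa + na
    return np.divide(pa, total, out=np.zeros_like(pa), where=total > 0)


# Three toy partitions of four instances.
partitions = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]]
pa, na = aggregate_consensus(partitions)
stability = pair_stability(pa, na)
```

In this toy example instances 0 and 1 are grouped together in all three partitions, so their stability score is 1, while instances 0 and 2 are grouped together in only one of the three. In an evolutionary setting, matrices of this kind could be updated as new partitions are generated, but how CSEKM weights and updates them is not stated in the abstract.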
