Journal: Computational and Mathematical Methods in Medicine

Does Determination of Initial Cluster Centroids Improve the Performance of K-Means Clustering Algorithm? Comparison of Three Hybrid Methods by Genetic Algorithm, Minimum Spanning Tree, and Hierarchical Clustering in an Applied Study

Abstract

Random selection of the initial cluster centroids (centers) is a fundamental defect of the K-means clustering algorithm: its performance depends on those initial centroids, and it may converge to a local optimum. Various hybrid methods have been introduced to resolve this defect. Because no comparative studies have examined these methods across multiple aspects, the present paper compares the ordinary K-means clustering algorithm with three hybrid methods that determine the initial centroids using a genetic algorithm, a minimum spanning tree, and hierarchical clustering. Although these three hybrid methods have received considerable attention in previous research, few studies have compared their results. Hence, the present study uses seven quantitative datasets that differ in sample size, number of features, and number of classes. Eleven external and internal evaluation indices were also considered for comparing the methods. The results indicated that the hybrid methods converged to the final solution faster than the ordinary K-means method. Furthermore, the hybrid method with hierarchical clustering converged to the optimal solution in fewer iterations than the other two hybrid methods. However, the hybrid methods with minimum spanning trees and genetic algorithms were not always, or even often, more effective than the ordinary K-means method. Therefore, despite their added computational complexity, these three hybrid methods did not lead to much improvement over the K-means method. A simulation study is still required to compare the methods and complete this conclusion.
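To make the idea of seeding K-means concrete, the following is a minimal sketch, not the paper's implementation. It assumes centroid-linkage agglomerative merging as the hierarchical step and uses a small hand-built dataset; the function names (`hierarchical_init`, `kmeans`, `sse`) and the toy points are hypothetical and chosen only for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def hierarchical_init(points, k):
    """Merge the two groups with the closest means until k groups remain
    (assumed centroid linkage); the group means become the initial centroids."""
    groups = [[p] for p in points]
    while len(groups) > k:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                d = sq_dist(mean(groups[i]), mean(groups[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = groups.pop(j)  # j > i, so index i is still valid
        groups[i].extend(merged)
    return [mean(g) for g in groups]

def kmeans(points, init, max_iter=100):
    """Lloyd's algorithm from the given initial centroids.
    Returns (centroids, labels, number of assignment passes)."""
    centroids = list(init)
    labels = None
    for passes in range(1, max_iter + 1):
        new_labels = [min(range(len(centroids)),
                          key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            return centroids, labels, passes
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = mean(members)
    return centroids, labels, max_iter

def sse(points, centroids, labels):
    """Within-cluster sum of squared distances."""
    return sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))

# Hypothetical toy data: three well-separated groups of five points each.
offsets = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]
centers = [(0, 0), (10, 0), (5, 9)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

h_cent, h_lab, h_iter = kmeans(points, hierarchical_init(points, 3))
print("hierarchical seeding:", h_iter, "passes, SSE", sse(points, h_cent, h_lab))

# Random seeding, as in ordinary K-means, for a few arbitrary seeds.
for seed in range(5):
    start = random.Random(seed).sample(points, 3)
    r_cent, r_lab, r_iter = kmeans(points, start)
    print("random seeding, seed", seed, ":", r_iter, "passes, SSE",
          sse(points, r_cent, r_lab))
```

On this toy data the hierarchical seeds already sit at the three group means, so K-means only confirms the assignment. Random seeding may need more passes or stop at a worse partition, depending on the seed. The sketch illustrates the mechanism only and says nothing about the datasets or evaluation indices used in the study.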