A Hybrid Genetic K-means Algorithm for Clustering High Dimensional Data


Abstract

In this paper we propose a hybrid genetic K-means algorithm that combines the K-means algorithm with the genetic algorithm for clustering high dimensional data. The genetic algorithm is an effective global optimization technique that has been shown to be good at finding optimal or near-optimal solutions. Because high dimensional data give rise to many local minima, the clustering results depend crucially on the convergence speed of the clustering algorithm when only a finite number of iterations is performed. The K-means algorithm is a commonly used distance-based clustering algorithm that is good at local search. Hybridizing the genetic algorithm with the K-means algorithm yields a global stochastic search guided by a heuristic local search, which can improve the search capacity of the genetic algorithm. The traditional crossover and mutation operators of the genetic algorithm generate invalid offspring during reproduction and thus slow the convergence of the genetic clustering algorithm. To circumvent this problem, we propose novel crossover and mutation operators. The crossover operator rearranges the cluster centers of paired chromosomes according to the minimum distance between their cluster centers. The mutation operator is based on reassigning the cluster centers within chromosomes. Comparative experiments on several publicly available data sets demonstrate the effectiveness of the proposed algorithm.
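The abstract does not give the operators in detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a chromosome is a list of K cluster centers, that centers of paired chromosomes are matched greedily by smallest Euclidean distance, and that crossover is a one-point cut at a center boundary. The names `align_centers`, `crossover`, and `kmeans_step` are hypothetical, as is the choice of greedy matching.

```python
import math
import random
from typing import List, Tuple

Center = List[float]
Chromosome = List[Center]  # assumption: one chromosome = K cluster centers


def align_centers(parent_a: Chromosome, parent_b: Chromosome) -> Chromosome:
    """Reorder parent_b so its i-th center is matched to parent_a's i-th center.

    Greedy matching (an assumption): repeatedly take the closest pair of
    centers that are both still unmatched.
    """
    k = len(parent_a)
    pairs = sorted(
        (math.dist(parent_a[i], parent_b[j]), i, j)
        for i in range(k)
        for j in range(k)
    )
    aligned: List = [None] * k
    used_a, used_b = set(), set()
    for _, i, j in pairs:
        if i not in used_a and j not in used_b:
            aligned[i] = parent_b[j]
            used_a.add(i)
            used_b.add(j)
    return aligned


def crossover(
    parent_a: Chromosome, parent_b: Chromosome, rng: random.Random
) -> Tuple[Chromosome, Chromosome]:
    """One-point crossover at a center boundary, applied after alignment.

    Requires at least two centers per chromosome.
    """
    b = align_centers(parent_a, parent_b)
    cut = rng.randint(1, len(parent_a) - 1)
    return parent_a[:cut] + b[cut:], b[:cut] + parent_a[cut:]


def kmeans_step(data: List[Center], centers: Chromosome) -> Chromosome:
    """One K-means iteration used as local search: assign, then recompute means."""
    groups: List[List[Center]] = [[] for _ in centers]
    for x in data:
        nearest = min(range(len(centers)), key=lambda c: math.dist(x, centers[c]))
        groups[nearest].append(x)
    new_centers = []
    for old, members in zip(centers, groups):
        if members:
            new_centers.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_centers.append(list(old))  # empty cluster: keep the old center
    return new_centers
```

Without the alignment step, two parents that encode the same partition with the centers listed in a different order could produce a child containing two nearly identical centers, which is one way the "invalid offspring" mentioned in the abstract can arise. In a hybrid loop, `kmeans_step` would be applied to each offspring before fitness evaluation.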


Bibliographic details

Source
International Symposium on Knowledge and Systems Sciences (KSS2004); November 10-12, 2004; Ishikawa (JP) | 2004 | pp. 232-237 | 6 pages
Conference location: Ishikawa (JP)
Authors
Hui Zhang; Tu Bao Ho;
Author affiliation

School of Knowledge Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Tatsunokuchi, Ishikawa 923-1292, Japan

Original format: PDF
Text language: eng
Chinese Library Classification: Systems science
Keywords
data mining; clustering; genetic algorithm; K-means;


Similar literature


1. Genetic Algorithm Based Dimensionality Reduction for Improving Performance of K-Means Clustering: A Case Study for Categorization of Medical Dataset [J]. Asha Gowda Karegowda, Vidya T. Shama, M.A. Jayaram. International Journal of Soft Computing, 2012, No. 5

2. Does Determination of Initial Cluster Centroids Improve the Performance of K-Means Clustering Algorithm? Comparison of Three Hybrid Methods by Genetic Algorithm, Minimum Spanning Tree, and Hierarchical Clustering in an Applied Study [J]. Saeedeh Pourahmad, Atefeh Basirat, Amir Rahimi. Computational and Mathematical Methods in Medicine, 2020, No. 1

3. A Hybrid Genetic K-means Algorithm for Clustering High Dimensional Data [C]. Hui Zhang, Tu Bao Ho. International Society for Knowledge and Systems Sciences (ISKSS), International Symposium on Knowledge and Systems Sciences, 2004

4. Efficient genetic k-means clustering algorithm and its application to data mining on different domains [D]. Alsayat, Ahmed Mosa. 2016

5. Does Determination of Initial Cluster Centroids Improve the Performance of K-Means Clustering Algorithm? Comparison of Three Hybrid Methods by Genetic Algorithm, Minimum Spanning Tree, and Hierarchical Clustering in an Applied Study [O]. Saeedeh Pourahmad, Atefeh Basirat, Amir Rahimi. 2020
