An Improved Clustering Algorithm Based On K-Means Algorithm


Abstract

This paper proposes a new document clustering algorithm that improves on the existing K-means and Neural Gas algorithms. Unlike K-means, where each point belongs to exactly one cluster and affects only that cluster's centroid, in our algorithm each point affects the values of multiple cluster centroids, as in Neural Gas. Unlike Neural Gas, the degree to which a point affects a cluster centroid depends on the distances between that point and the other, nearer cluster centroids. Experiments on five metrics (entropy, purity, F1 value, Rand index, and normalized mutual information) show that our algorithm produces better clustering results than other clustering algorithms on a number of different text data sets; that it is more stable and performs better than other algorithms when clustering one text data set, WAP, under many different initial conditions; and that it is faster than other algorithms on data sets of different sizes, with linear scalability.
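The abstract does not give the weighting formula of the proposed algorithm, so the sketch below only illustrates the two baselines it is contrasted with: a standard K-means update, where each point moves only its nearest centroid, and a batch Neural Gas update, where each point pulls every centroid with a weight that decays with the centroid's distance rank. It is not the authors' method. The function names, the toy data, and the decay parameter `lam` are assumptions made for illustration.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_step(points, centroids):
    """One K-means update: each point contributes only to its nearest centroid."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in points:
        j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    # An empty cluster keeps its previous centroid.
    return [
        [s / counts[j] for s in sums[j]] if counts[j] else list(centroids[j])
        for j in range(k)
    ]


def neural_gas_step(points, centroids, lam=1.0):
    """One batch Neural Gas update: each point contributes to every centroid,
    weighted by exp(-rank / lam), where rank 0 is the nearest centroid."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    weights = [0.0] * k
    for p in points:
        order = sorted(range(k), key=lambda c: sq_dist(p, centroids[c]))
        for rank, j in enumerate(order):
            h = math.exp(-rank / lam)  # assumed rank-based decay
            weights[j] += h
            for d in range(dim):
                sums[j][d] += h * p[d]
    return [[s / weights[j] for s in sums[j]] for j in range(k)]


# Toy data (hypothetical): two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
centroids = [(1.0, 0.5), (9.0, 0.5)]

hard = kmeans_step(points, centroids)
soft = neural_gas_step(points, centroids, lam=1.0)
```

On this toy data the K-means step places each centroid at the mean of its own group, while the Neural Gas step leaves each centroid pulled part of the way toward the other group, because every point carries some weight for every centroid. According to the abstract, the proposed algorithm keeps this multi-centroid influence but derives the weight from the point's distances to the nearer centroids.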


Bibliographic details

Source
International Conference on Computer and Network Technology, 2011, 5 pages
Authors
Yehang Zhu; Yanling Li
Original format: PDF
Chinese Library Classification: Computer networks
Keywords
clustering; document clustering; text clustering; K-means; Neural Gas; algorithm
