Conference: International Conference on Pattern Recognition Applications and Methods

Kernel Hierarchical Agglomerative Clustering Comparison of Different Gap Statistics to Estimate the Number of Clusters

Abstract

Clustering algorithms, as unsupervised analysis tools, are useful for exploring data structure and have enjoyed great success in many disciplines. For most clustering algorithms, such as k-means, determining the number of clusters is a crucial step and one of the most difficult problems. Hierarchical Agglomerative Clustering (HAC) has the advantage of representing the data as a dendrogram, which yields a clustering when cut at some optimal level. In recent years, within the context of HAC, efficient statistics have been proposed to estimate the number of clusters, and the Gap Statistic by Tibshirani has shown interesting performance. In this paper, we propose some new Gap Statistics to further improve the determination of the number of clusters. Our work focuses on the kernelized version of the widely used Hierarchical Clustering Algorithm.
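
To make the ideas in the abstract concrete, here is a minimal sketch of how a gap-style criterion could be combined with a kernelized HAC. It is not the authors' method: the new Gap Statistics proposed in the paper are not described on this page, so the sketch uses the original Gap Statistic rule (choose the smallest k with Gap(k) >= Gap(k+1) - s_(k+1)). Everything else is an assumption made for illustration: the function names (`rbf_kernel`, `kernel_sq_dist`, `hac_clusters`, `log_within_dispersion`, `gap_statistic`), the RBF kernel with `gamma=0.5`, average linkage, and a reference distribution that is uniform over the bounding box of the data.

```python
import math
import random


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an arbitrary illustrative choice."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def kernel_sq_dist(x, y, kernel=rbf_kernel):
    """Squared distance between x and y in the kernel-induced feature space."""
    return kernel(x, x) + kernel(y, y) - 2.0 * kernel(x, y)


def hac_clusters(points, k, kernel=rbf_kernel):
    """Average-linkage HAC on kernel distances, stopped at k clusters.

    Returns the clusters (lists of point indices) and the distance matrix.
    """
    n = len(points)
    d = [[kernel_sq_dist(points[i], points[j], kernel) for j in range(n)]
         for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        best = None  # (linkage value, index a, index b) of the closest pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca, cb = clusters[a], clusters[b]
                link = sum(d[i][j] for i in ca for j in cb) / (len(ca) * len(cb))
                if best is None or link < best[0]:
                    best = (link, a, b)
        _, a, b = best
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    return clusters, d


def log_within_dispersion(clusters, d):
    """log W_k, with W_k = sum over clusters r of D_r / (2 * n_r).

    Assumes at least one cluster holds two distinct points, so W_k > 0.
    """
    w = sum(sum(d[i][j] for i in c for j in c) / (2.0 * len(c)) for c in clusters)
    return math.log(w)


def gap_statistic(points, k_max, kernel=rbf_kernel, n_refs=5, seed=0):
    """Return (estimated k, list of Gap(1), ..., Gap(k_max))."""
    rng = random.Random(seed)
    lo = [min(col) for col in zip(*points)]
    hi = [max(col) for col in zip(*points)]
    # Reference samples: uniform over the bounding box of the data (assumption).
    refs = [[tuple(rng.uniform(l, h) for l, h in zip(lo, hi)) for _ in points]
            for _ in range(n_refs)]
    gaps, sds = [], []
    for k in range(1, k_max + 1):
        observed = log_within_dispersion(*hac_clusters(points, k, kernel))
        ref_logs = [log_within_dispersion(*hac_clusters(r, k, kernel)) for r in refs]
        mean = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - mean) ** 2 for v in ref_logs) / n_refs)
        gaps.append(mean - observed)
        sds.append(sd * math.sqrt(1.0 + 1.0 / n_refs))
    # Smallest k such that Gap(k) >= Gap(k+1) - s_(k+1).
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - sds[k]:
            return k, gaps
    return k_max, gaps


if __name__ == "__main__":
    rng = random.Random(1)
    data = ([(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(10)]
            + [(rng.gauss(4, 0.3), rng.gauss(4, 0.3)) for _ in range(10)])
    k_hat, gaps = gap_statistic(data, k_max=4)
    print("estimated number of clusters:", k_hat)
    print("gap values:", gaps)
```

The sketch reruns the naive clustering for every k to stay short. A practical implementation would more likely build the dendrogram once and cut it at each level, and the estimate can be sensitive to the kernel bandwidth and the number of reference samples.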
