Journal: IEEE Transactions on Information Theory

Clustering Based on Pairwise Distances When the Data is of Mixed Dimensions



Abstract

In the context of clustering, we consider a generative model in a Euclidean ambient space with clusters of different shapes, dimensions, sizes, and densities. In an asymptotic setting where the number of points becomes large, we obtain theoretical guarantees for some emblematic methods based on pairwise distances: a simple algorithm based on the extraction of connected components in a neighborhood graph; hierarchical clustering with single linkage; and the spectral clustering method of Ng, Jordan, and Weiss. The methods are shown to enjoy some near-optimal properties in terms of separation between clusters and robustness to outliers. The local scaling method of Zelnik-Manor and Perona is shown to lead to a near-optimal choice for the scale in the first and third methods. We also provide a lower bound on the spectral gap to consistently choose the correct number of clusters in the spectral method.
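
As a rough illustration of the first method named in the abstract, the following is a minimal sketch of clustering by connected components of a neighborhood graph. It is not the paper's implementation: the function name `neighborhood_graph_clusters`, the fixed scale `eps`, and the toy data are assumptions made here for illustration, and the abstract does not specify how the graph is built or how the scale is chosen.

```python
import math


def neighborhood_graph_clusters(points, eps):
    """Label points by connected component of the eps-neighborhood graph.

    Hypothetical helper: two points share an edge when their Euclidean
    distance is at most eps; each connected component is one cluster.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per component

    def find(i):
        # Walk up to the root of i, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the components of every pair of points within distance eps.
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= eps:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    # Relabel roots as consecutive integers in order of first appearance.
    labels, root_to_label = [], {}
    for i in range(n):
        root = find(i)
        labels.append(root_to_label.setdefault(root, len(root_to_label)))
    return labels


# Toy data (made up): a 1-D segment and a small 2-D blob in the plane,
# i.e. two clusters of different intrinsic dimension.
segment = [(0.1 * k, 0.0) for k in range(10)]
blob = [(5.0 + 0.1 * a, 5.0 + 0.1 * b) for a in range(3) for b in range(3)]
labels = neighborhood_graph_clusters(segment + blob, eps=0.15)
```

This only uses pairwise distances, so it does not need to know the dimension of each cluster. Cutting a single-linkage dendrogram at height `eps` produces the same partition, which is one way to see how the first two methods in the abstract relate. The all-pairs loop is quadratic in the number of points and is meant for readability, not scale.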
