Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence

Coordination of Cluster Ensembles via Exact Methods

Abstract

We present a novel optimization-based method for combining cluster ensembles for the class of problems with intracluster criteria, such as Minimum-Sum-of-Squares-Clustering (MSSC). We propose a simple and efficient algorithm, called EXAMCE, for this class of problems, inspired by a Set-Partitioning formulation of the original clustering problem. We prove some theoretical properties of the solutions produced by our algorithm. In particular, under general assumptions, although the algorithm recombines solution fragments so as to solve a Set-Covering relaxation of the original formulation, it is guaranteed to find better solutions than the ones in the ensemble. For the MSSC problem in particular, a prototype implementation of our algorithm found a solution better than the previously best known one for 21 of the 40 test instances of the TSPLIB benchmark data sets used in [CHECK END OF SENTENCE], [CHECK END OF SENTENCE], and [CHECK END OF SENTENCE], and found a worse solution than the best known only five times. For other published benchmark data sets where the optimal MSSC solution is known, the algorithm matches it. The algorithm is particularly effective when the number of clusters is large, in which case it escapes the local minima found by K-means type algorithms by recombining the solutions in a Set-Covering context. We also establish the stability of the algorithm through extensive computational experiments, showing that multiple runs of EXAMCE on the same clustering problem instance produce high-quality solutions whose Adjusted Rand Index is consistently above 0.95. Finally, in experiments that use external criteria to assess clustering validity, EXAMCE produces results comparable in quality to those of the best known clustering algorithms.
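The recombination idea in the abstract can be illustrated with a toy sketch. This is not the EXAMCE algorithm itself: the abstract states that EXAMCE solves a Set-Covering relaxation, whereas the hypothetical `recombine` function below simply enumerates every choice of k pooled clusters and keeps the cheapest exact partition under the MSSC objective. The function names, the data points, and the three-member ensemble are all assumptions made for illustration.

```python
from itertools import combinations


def sse(points, cluster):
    """Sum of squared distances from each cluster member to the cluster centroid."""
    members = [points[i] for i in cluster]
    dim = len(members[0])
    centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
    return sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))


def recombine(points, ensemble, k):
    """Brute-force set partitioning over the clusters pooled from all ensemble members.

    Illustrative only: it enumerates all k-subsets of the pool, so it is
    usable on tiny inputs and is not the exact method used by the paper.
    """
    pool = sorted({frozenset(c) for partition in ensemble for c in partition}, key=sorted)
    cost = {c: sse(points, c) for c in pool}
    everything = frozenset(range(len(points)))
    best_cost, best = float("inf"), None
    for subset in combinations(pool, k):
        # A valid partition covers every point exactly once.
        if sum(len(c) for c in subset) != len(points):
            continue
        if frozenset().union(*subset) != everything:
            continue
        total = sum(cost[c] for c in subset)
        if total < best_cost:
            best_cost, best = total, subset
    return best_cost, best


# Hypothetical data: three well-separated pairs of points.
points = [(0, 0), (0, 1), (5, 0), (5, 1), (10, 0), (10, 1)]
# Hypothetical ensemble: each member gets exactly one pair right.
ensemble = [
    [{0, 1}, {2, 4}, {3, 5}],
    [{2, 3}, {0, 4}, {1, 5}],
    [{4, 5}, {0, 2}, {1, 3}],
]
best_cost, best = recombine(points, ensemble, 3)
print(best_cost, [sorted(c) for c in best])
```

In this toy instance every ensemble member has an MSSC cost of at least 25.5, while the recombined partition, built from one good cluster of each member, costs 1.5. That is the kind of improvement over the ensemble that the abstract describes, here obtained by plain enumeration.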
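The stability claim in the abstract is stated in terms of the Adjusted Rand Index (ARI) between clusterings from different runs. As a minimal sketch, the standard pair-counting definition of the ARI can be computed as follows; the function name `adjusted_rand_index` and the label vectors are illustrative assumptions, and at least two data points are assumed.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Pair-counting Adjusted Rand Index between two labelings of the same points."""
    n = len(labels_a)  # assumed to be at least 2
    # Pairs of points placed together in both clusterings (contingency table cells).
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together within each clustering separately.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Both clusterings are trivial (all singletons or one cluster).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical labelings from two runs on the same four points.
same_up_to_renaming = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
fully_crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same_up_to_renaming, fully_crossed)
```

The index equals 1 when two clusterings agree up to a renaming of the labels and is close to 0 for independent random labelings, so values consistently above 0.95 indicate that repeated runs return nearly identical partitions.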
