Journal: Statistics and Computing

Robust clustering tools based on optimal transportation

Abstract

A robust clustering method for probabilities in Wasserstein space is introduced. This new "trimmed k-barycenters" approach relies on recent results on barycenters in Wasserstein space that make feasible the intensive computation required by clustering algorithms. The possibility of trimming the most discrepant distributions results in a gain in stability and robustness, which is highly convenient in this setting. As a remarkable application, we consider a parallelized clustering setup in which each of m units processes a portion of the data, producing a clustering report encoded as k probabilities. We prove that the trimmed k-barycenter of the m × k reports produces a consistent aggregation, which we consider the result of a "wide consensus". We also prove that a weighted version of trimmed k-means algorithms based on k-barycenters in Wasserstein space keeps the descending character of the concentration step, guaranteeing convergence to local minima. We illustrate the methodology with simulated and real data examples. These include clustering populations by age distributions and analysis of cytometric data.
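To make the idea of trimming plus barycenter updates concrete, the following is a minimal sketch and not the authors' implementation. It assumes one-dimensional distributions, where the squared 2-Wasserstein distance equals the integrated squared difference of quantile functions and the barycenter is the average of the quantile functions. The function names, the quantile grid, the random-restart strategy, and the toy data are all illustrative assumptions.

```python
import numpy as np

def quantile_embedding(samples, grid):
    # Represent each 1-D sample by its empirical quantile function on `grid`.
    return np.array([np.quantile(s, grid) for s in samples])

def _assign(Q, centers, n_keep):
    # Approximate squared W2 distances by the mean squared quantile difference.
    d2 = ((Q[:, None, :] - centers[None, :, :]) ** 2).mean(axis=2)
    nearest = d2.argmin(axis=1)
    best = d2[np.arange(len(Q)), nearest]
    # Concentration step: keep only the n_keep distributions closest to a center.
    keep = np.argsort(best)[:n_keep]
    labels = np.full(len(Q), -1)  # -1 marks trimmed distributions
    labels[keep] = nearest[keep]
    return labels, best[keep].sum()

def trimmed_k_barycenters(Q, k, alpha, n_init=20, n_iter=100, seed=0):
    # Hypothetical helper: trims a fraction alpha, uses n_init random restarts.
    rng = np.random.default_rng(seed)
    n = len(Q)
    n_keep = n - int(np.floor(alpha * n))
    best_result = None
    for start in range(n_init):
        centers = Q[rng.choice(n, size=k, replace=False)].copy()
        for step in range(n_iter):
            labels, cost = _assign(Q, centers, n_keep)
            new_centers = centers.copy()
            for j in range(k):
                members = Q[labels == j]
                if len(members) > 0:
                    # 1-D Wasserstein barycenter: average of quantile functions.
                    new_centers[j] = members.mean(axis=0)
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        labels, cost = _assign(Q, centers, n_keep)
        if best_result is None or cost < best_result[2]:
            best_result = (centers, labels, cost)
    return best_result

# Toy data (an assumption for illustration): two groups and one discrepant distribution.
data_rng = np.random.default_rng(1)
samples = [data_rng.normal(0.0, 1.0, 200) for _ in range(5)]
samples += [data_rng.normal(10.0, 1.0, 200) for _ in range(5)]
samples.append(data_rng.normal(100.0, 1.0, 200))

grid = np.linspace(0.01, 0.99, 99)
Q = quantile_embedding(samples, grid)
centers, labels, cost = trimmed_k_barycenters(Q, k=2, alpha=0.1)
```

In this toy setting the discrepant distribution should receive the label -1 (trimmed), and the two remaining groups should each gather around their own barycenter. Picking the best of several random starts is one common way to reduce the risk of stopping at a poor local minimum.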