Conference: International Conference on Computer and Information Sciences

Statistical Analysis of Clustering Performances of NMF, Spectral Clustering, and K-means

Abstract

Nonnegative matrix factorization (NMF), spectral clustering, and k-means are among the most widely used clustering methods in machine learning research. They have been applied in many domains, including text, image, and cancer clustering. However, only a limited number of works discuss the statistical significance of performance differences between these methods. This issue is especially important for NMF: the method is still very actively researched, with many new algorithms published every year, and it is desirable to be able to demonstrate that newly proposed algorithms statistically outperform previous ones. In this paper, we present a statistical analysis of clustering performance differences between NMF, spectral clustering, and k-means. As benchmarks, we use ten NMF algorithms, six spectral clustering algorithms, and one standard k-means algorithm. For data, we use eleven publicly available microarray gene expression datasets whose numbers of classes range from two to ten. The experimental results show that the performance differences between the NMF algorithms and the standard k-means algorithm are not statistically significant, and that the spectral methods, surprisingly, perform less well than NMF and k-means.
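The abstract does not say which statistical test the authors used. As a rough illustration of what "statistically significant performance difference" can mean when two clustering methods are scored on the same set of datasets, the sketch below runs an exact paired sign-flip permutation test. The choice of test, the function name paired_sign_flip_test, and all score values are assumptions made for illustration; the scores are made-up placeholders, not results from the paper.

```python
from itertools import product


def paired_sign_flip_test(scores_a, scores_b):
    """Exact two-sided paired permutation (sign-flip) test.

    scores_a[i] and scores_b[i] are the scores of two methods on dataset i.
    Returns the p-value for the null hypothesis that the per-dataset
    differences are symmetric around zero (no systematic winner).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both methods need exactly one score per dataset")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))

    # Under the null hypothesis the sign of each difference is arbitrary,
    # so enumerate all 2^n sign assignments (2048 for eleven datasets).
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / total


# Hypothetical placeholder scores (e.g. clustering accuracy) for two methods
# on eleven datasets. These numbers are invented, NOT taken from the paper.
method_a_scores = [0.71, 0.64, 0.80, 0.55, 0.62, 0.90, 0.47, 0.68, 0.73, 0.59, 0.66]
method_b_scores = [0.69, 0.66, 0.78, 0.57, 0.60, 0.91, 0.45, 0.70, 0.72, 0.58, 0.67]

p_value = paired_sign_flip_test(method_a_scores, method_b_scores)
print(f"p-value: {p_value:.4f}")
```

A large p-value here would mean the paired scores give no evidence that one method systematically beats the other, which is the kind of conclusion the abstract reports for NMF versus k-means. Comparing more than two methods at once would need a different procedure, such as a rank-based test over all methods.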