What Can Fuzzy Cluster Analysis Contribute to Clustering of High-Dimensional Data?

Abstract

Cluster analysis of high-dimensional data has attracted special interest in recent years. The term high-dimensional data can refer to a large number of attributes (20 or more), as often occurs in database tables. But it can also mean that we have to deal with thousands of attributes, as in genomics or proteomics data, where thousands of genes or proteins are measured and are treated as attributes in some analysis tasks. A main reason why cluster analysis of high-dimensional data differs from clustering low-dimensional data is the concentration of norm phenomenon, which states, roughly, that the distances between randomly distributed points tend to become more and more similar in higher dimensions, so the relative differences between them shrink. On the one hand, fuzzy cluster analysis has been shown to be less sensitive to initialisation than, for instance, the classical k-means algorithm. On the other hand, standard fuzzy clustering is more strongly affected by the concentration of norm phenomenon and tends to fail easily in high dimensions. Here we review why fuzzy clustering has special problems with high-dimensional data and how these can be mitigated by modifying the fuzzifier concept. We also describe a recently introduced approach based on correlation, and an attribute selection fuzzy clustering technique that can be applied when clusters can only be found in lower dimensions.
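The two effects named in the abstract can be illustrated with a small numerical sketch. It is not taken from the paper: the function names `relative_contrast` and `fcm_memberships`, the uniform random data, and the example distances are assumptions chosen for illustration. The first function measures how much the distances from one query point to random points differ, relative to the smallest distance. The second uses the standard fuzzy c-means membership formula with fuzzifier `m` to show that nearly equal distances give nearly equal memberships. Lowering `m` toward 1 is shown only as one simple way to sharpen the memberships; it is not necessarily the fuzzifier modification the paper reviews.

```python
import numpy as np


def relative_contrast(dim, n_points=500, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query point to n_points uniform random points in [0, 1]^dim.
    Hypothetical helper; small values mean the distances have concentrated."""
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, dim))
    query = rng.random(dim)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()


def fcm_memberships(dists, m=2.0):
    """Standard fuzzy c-means membership degrees of a single data point,
    given its (strictly positive) distances to the c cluster prototypes:
    u_i = d_i^(-2/(m-1)) / sum_k d_k^(-2/(m-1)), with fuzzifier m > 1."""
    weights = np.asarray(dists, dtype=float) ** (-2.0 / (m - 1.0))
    return weights / weights.sum()


if __name__ == "__main__":
    # Distances become relatively more similar as the dimension grows.
    for dim in (2, 20, 2000):
        print(f"dim={dim:5d}  relative contrast={relative_contrast(dim):.3f}")

    # Illustrative distances to three prototypes that differ by only 10-20 %,
    # as is typical once distances have concentrated.
    concentrated = np.array([1.0, 1.1, 1.2])
    print("m=2.0:", fcm_memberships(concentrated, m=2.0))
    print("m=1.1:", fcm_memberships(concentrated, m=1.1))
```

With `m = 2`, the memberships for the nearly equal distances stay close to 1/3 each, so the point is assigned almost equally to all clusters; with the smaller fuzzifier, the closest prototype receives most of the membership.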

Bibliographic details

  • Source
    Fuzzy logic and applications | 2013 | pp. 1-14 | 14 pages in total
  • Conference location Genoa (IT)
  • Author

    Frank Klawonn

  • Author affiliations

    Bioinformatics Statistics, Helmholtz-Centre for Infection Research, Inhoffenstr. 7, D-38124 Braunschweig, Germany
    Department of Computer Science, Ostfalia University of Applied Sciences, Salzdahlumer Str. 46/48, D-38302 Wolfenbuettel, Germany

  • Original format PDF
  • Language of text English (eng)