
A fuzzy k-prototype clustering algorithm for mixed numeric and categorical data

Abstract

In many applications, data objects are described by both numeric and categorical features. The k-prototype algorithm is one of the most important algorithms for clustering this type of data. However, this method performs a hard partition, which may lead to misclassification of data objects near the boundaries of regions, and its dissimilarity measure uses only a user-given parameter to adjust the significance of attributes. In this paper, we first combine the mean and the fuzzy centroid to represent the prototype of a cluster, and employ a new measure based on the co-occurrence of values to evaluate the dissimilarity between data objects and cluster prototypes. This measure also takes into account the significance of different attributes to the clustering process. We then present our algorithm for clustering mixed data. Finally, the performance of the proposed method is demonstrated by a series of experiments on four real-world datasets, in comparison with traditional clustering algorithms.
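
To make the contrast in the abstract concrete, the sketch below shows the classical k-prototype dissimilarity (squared Euclidean distance on numeric features plus a user-given weight times the number of categorical mismatches) together with the standard fuzzy c-means membership update. It is an illustration of the baseline ideas only, not the paper's co-occurrence-based measure or its attribute weighting, which the abstract does not specify. The function names, the parameter `gamma`, the fuzzifier `m`, and the sample values are all assumptions chosen for illustration.

```python
def mixed_dissimilarity(x_num, x_cat, proto_num, proto_cat, gamma):
    """Classical k-prototype dissimilarity (assumed baseline form).

    Squared Euclidean distance on numeric features plus gamma times the
    number of mismatching categorical features. gamma is the user-given
    weight that the abstract refers to.
    """
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    categorical = sum(1 for a, b in zip(x_cat, proto_cat) if a != b)
    return numeric + gamma * categorical


def fuzzy_memberships(dissimilarities, m=2.0):
    """Standard fuzzy c-means membership update for a single object.

    The dissimilarities are treated as squared-distance-like values, so the
    exponent is 1 / (m - 1). m > 1 is the fuzzifier (hypothetical default).
    """
    # An object that coincides with one or more prototypes belongs fully to them.
    zeros = [i for i, d in enumerate(dissimilarities) if d == 0]
    if zeros:
        share = 1.0 / len(zeros)
        return [share if i in zeros else 0.0 for i in range(len(dissimilarities))]
    exponent = 1.0 / (m - 1.0)
    inverse = [(1.0 / d) ** exponent for d in dissimilarities]
    total = sum(inverse)
    return [v / total for v in inverse]


# Hypothetical object and two hypothetical cluster prototypes.
x_num, x_cat = [1.0, 2.0], ["red", "small"]
prototypes = [
    ([1.0, 2.0], ["red", "large"]),
    ([3.0, 2.0], ["blue", "large"]),
]
d = [mixed_dissimilarity(x_num, x_cat, p_num, p_cat, gamma=0.5)
     for p_num, p_cat in prototypes]
u = fuzzy_memberships(d, m=2.0)
print(d, u)
```

A hard partition would assign the object to the first prototype only; the fuzzy memberships instead give it a graded degree of belonging to each cluster, which is the behaviour the abstract motivates for objects near cluster boundaries.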

Bibliographic record

  • Source
    Knowledge-Based Systems | 2012, issue 2012 | pp. 129-135 | 7 pages
  • Author affiliations

    College of Computer Science and Technology, Jilin University, Changchun 130012, China;

    College of Computer Science and Technology, Jilin University, Changchun 130012, China, School of Natural and Computing Sciences, University of Aberdeen, Aberdeen, AB24 3UE, UK;

    College of Computer Science and Technology, Jilin University, Changchun 130012, China;

    College of Mathematics, Jilin University, Changchun 130012, China;

    College of Computer Science and Technology, Jilin University, Changchun 130012, China;

  • Indexing information
  • Original format: PDF
  • Text language: English
  • CLC classification
  • Keywords

    fuzzy clustering; data mining; mixed data; dissimilarity measure; attribute significance;
