
Extended Gaussian kernel version of fuzzy c-means in the problem of data analyzing


Abstract

Fuzzy c-means (FCM) clustering with spatial constraints is considered a suitable algorithm for data clustering and data analysis. However, FCM still lacks the robustness needed to handle noisy data, because its objective function relies on the Euclidean distance to measure the relationship between objects. It is effective only for "spherical" clusters and may not give reasonable results for "non-compactly filled" spherical data such as "annular-shaped" data. This paper addresses these drawbacks of the general fuzzy c-means algorithm by introducing an extended Gaussian version of fuzzy c-means that replaces the Euclidean distance in the original FCM objective function. The paper first proposes an initial kernel version of fuzzy c-means aimed at simplifying its computation, and then extends it to an extended Gaussian kernel version of fuzzy c-means. From this extended version it derives an effective method for constructing the membership matrix of the objects and a robust method for updating the cluster centers. Furthermore, the paper proposes a new prototype learning method and obtains initial cluster centers through a new mathematical initialization for the new objective function, with the aim of reducing the number of iterations the algorithm needs to reach a more accurate result. An initial experiment on artificially generated data shows how effectively the proposed Gaussian version of fuzzy c-means obtains clusters; the proposed methods are then applied to cluster the Wisconsin breast cancer database into two clusters for the benign and malignant classes. To show the performance of the proposed fuzzy c-means with the new initialization of cluster centers, this work compares its results with those of a recent fuzzy c-means algorithm; in addition, it uses the Silhouette method to validate the clusters obtained from the breast cancer dataset.
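The abstract does not give the paper's equations, so the following is only a minimal sketch of the commonly used Gaussian-kernel fuzzy c-means, in which the kernel-induced distance 1 - K(x, v) with K(x, v) = exp(-||x - v||^2 / sigma^2) stands in for the Euclidean distance. It is not the paper's extended variant. The function names, the parameters `m` and `sigma`, and the farthest-first initialization are assumptions made for illustration; the paper proposes its own initialization, which is not described here.

```python
import numpy as np


def farthest_first_centers(X, n_clusters):
    """Assumed stand-in initialization (NOT the paper's method): start from
    the first sample, then repeatedly add the sample farthest from all
    centers chosen so far."""
    centers = [X[0]]
    for _ in range(n_clusters - 1):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    return np.array(centers, dtype=float)


def gaussian_kernel_fcm(X, n_clusters, m=2.0, sigma=1.0, n_iter=100, tol=1e-6):
    """Generic Gaussian-kernel fuzzy c-means.

    Returns (centers, U), where U[k, i] is the membership of sample k
    in cluster i and each row of U sums to 1.
    """
    centers = farthest_first_centers(X, n_clusters)
    for _ in range(n_iter):
        # Squared Euclidean distances, shape (n_samples, n_clusters).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        K = np.exp(-d2 / sigma**2)
        # Kernel-induced distance; clipped to avoid division by zero
        # when a sample coincides with a center.
        dist = np.clip(1.0 - K, 1e-12, None)
        # Membership update: u_ki proportional to dist_ki^(-1/(m-1)).
        U = dist ** (-1.0 / (m - 1.0))
        U /= U.sum(axis=1, keepdims=True)
        # Center update: mean of the samples weighted by u^m * K.
        W = (U**m) * K
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return centers, U
```

Because the weight u^m * K decays quickly with distance, samples far from a center barely influence its update, which is the usual argument for the robustness of kernel variants to noise. The same decay makes the result sensitive to `sigma` and to the initial centers, so both should be treated as tuning choices rather than fixed values.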

Bibliographic record

  • Source
    Expert Systems with Applications | 2011, Issue 4 | pp. 3793-3805 | 13 pages
  • Author affiliations

    Department of Engineering Science, National Cheng Kung University, No. 1, Ta-Hsueh Road, Tainan 701, Taiwan, ROC;

    Department of Engineering Science, National Cheng Kung University, No. 1, Ta-Hsueh Road, Tainan 701, Taiwan, ROC; Department of Applied Geoinformatics, Chia Nan University of Pharmacy & Science, No. 60, Erh-Jen Rd., Sec. 1, Jen-Te, Tainan 717, Taiwan, ROC;

  • Original format: PDF
  • Language: English
  • Keywords

    cluster centers; fuzzy c-means; memberships; objective function; data grouping;
