Conference proceedings: Advances in intelligent web mastering-3

Evaluation of Categorical Data Clustering


Abstract

Cluster analysis methods are well-known multivariate techniques that have been in use for many years. Their main applications concern the clustering of objects characterized by quantitative variables. For this case, various coefficients for evaluating a clustering and determining the number of clusters have been proposed. However, in some areas, e.g., the segmentation of Internet users, the variables are often nominal or ordinal because they originate from questionnaire responses. We therefore address evaluation criteria for the case of categorical variables. The proposed criteria are based on variability measures. Instead of variance, which serves as the measure for quantitative variables, three measures for nominal variables are considered: a variability measure based on the modal frequency, Gini's coefficient of mutability, and entropy. The proposed evaluation criteria are applied to a real dataset.
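
The three variability measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the common textbook forms for a nominal variable with category relative frequencies p_u: 1 - max(p_u) for the modal-frequency measure, 1 - sum(p_u^2) for Gini's coefficient of mutability, and -sum(p_u ln p_u) for entropy. The abstract does not state the exact definitions or normalizations used in the paper, so these may differ. The function names, and the size-weighted within-cluster average in `within_cluster_variability`, are hypothetical and serve only to show how such measures could feed an evaluation criterion.

```python
from collections import Counter
from math import log


def relative_frequencies(values):
    """Relative frequency of each category observed in `values`."""
    counts = Counter(values)
    n = len(values)
    return [c / n for c in counts.values()]


def modal_variability(values):
    """Variability based on the modal frequency: 1 - max(p_u) (assumed form)."""
    return 1.0 - max(relative_frequencies(values))


def gini_mutability(values):
    """Gini's coefficient of mutability: 1 - sum(p_u^2) (assumed form)."""
    return 1.0 - sum(p * p for p in relative_frequencies(values))


def entropy(values):
    """Entropy with natural logarithm: -sum(p_u * ln p_u) (assumed form)."""
    return -sum(p * log(p) for p in relative_frequencies(values))


def within_cluster_variability(values, labels, measure):
    """Hypothetical criterion: size-weighted mean of `measure` over clusters.

    `values` holds one nominal variable, `labels` the cluster assignment
    of each object. Lower results mean more homogeneous clusters.
    """
    clusters = {}
    for value, label in zip(values, labels):
        clusters.setdefault(label, []).append(value)
    n = len(values)
    return sum(len(m) / n * measure(m) for m in clusters.values())
```

For a variable with values a, a, b, b, a clustering that separates the two categories gives a within-cluster variability of zero under all three measures, while a clustering that mixes them evenly leaves the variability unchanged from the unclustered data.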
