Conference proceedings: Databases and information systems VII

The Influence of Data Quality on Clustering Outcomes



Abstract

The relationship between clustering and data quality has not been thoroughly established. It is usually assumed that the input dataset either contains no errors or contains some "noise", and this notion of "noise" is not tied to any data quality concept. In this paper we focus on the four most commonly used data quality dimensions: accuracy, completeness, consistency, and timeliness. We evaluate the impact of these quality dimensions on clustering outcomes to determine which of them has the most negative effect. Four clustering algorithms and five real datasets were selected to show the interaction between data quality and cluster validity.
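
The kind of experiment the abstract describes can be illustrated with a minimal sketch. The abstract does not name the algorithms, datasets, or validity measures used, so everything here is an assumption chosen for illustration: plain k-means as the clustering algorithm, a small synthetic two-cluster dataset, the Rand index as the validity measure, and two hypothetical helpers (`degrade_accuracy`, `degrade_completeness`) that simulate lower accuracy and completeness. Consistency and timeliness are not modelled.

```python
import random

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; deterministic start from evenly spaced input points."""
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same vs. different cluster)."""
    n = len(a)
    agree = sum(
        (a[i] == a[j]) == (b[i] == b[j])
        for i in range(n) for j in range(i + 1, n)
    )
    return agree / (n * (n - 1) / 2)

def degrade_accuracy(points, fraction, rng, scale=5.0):
    """Hypothetical helper: add Gaussian error to a random fraction of records."""
    bad = set(rng.sample(range(len(points)), int(fraction * len(points))))
    return [
        tuple(v + rng.gauss(0, scale) for v in p) if i in bad else p
        for i, p in enumerate(points)
    ]

def degrade_completeness(points, fraction, rng):
    """Hypothetical helper: blank the first attribute of a random fraction of
    records (fraction must be below 1), then fill it with the observed mean."""
    missing = set(rng.sample(range(len(points)), int(fraction * len(points))))
    observed = [p[0] for i, p in enumerate(points) if i not in missing]
    mean0 = sum(observed) / len(observed)
    return [(mean0,) + p[1:] if i in missing else p for i, p in enumerate(points)]

# Synthetic data (an assumption, not one of the paper's datasets):
# two well-separated 4x5 grids of points with known cluster membership.
blob = [(0.5 * x, 0.5 * y) for x in range(4) for y in range(5)]
clean = blob + [(x + 10.0, y + 10.0) for x, y in blob]
truth = [0] * len(blob) + [1] * len(blob)

def score(points):
    """Agreement between the known grouping and k-means on the given points."""
    return rand_index(truth, kmeans(points, k=2))

rng = random.Random(0)
results = {"clean": score(clean)}
for fraction in (0.1, 0.3, 0.5):
    results[("accuracy", fraction)] = score(degrade_accuracy(clean, fraction, rng))
    results[("completeness", fraction)] = score(degrade_completeness(clean, fraction, rng))

for key, value in results.items():
    print(key, round(value, 3))
```

Comparing the scores across dimensions and degradation levels gives a rough sense of which kind of quality problem hurts the clustering most on this toy data. The numbers say nothing about the paper's own findings.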
