Conference: IEEE International Conference on Data Engineering

CrowdFusion: A Crowdsourced Approach on Data Fusion Refinement

Abstract

Data fusion plays an important role in data mining because many applications require high-quality data. Because online data may be out of date, and errors may propagate as sources copy from and refer to one another, it is hard to achieve satisfactory results by merely applying existing data fusion methods to Web data. In this paper, we use the crowd to achieve high-quality data fusion results. We design a framework that selects a set of tasks to ask the crowd in order to improve the confidence of the data. Since data are correlated and the crowd may provide incorrect answers, selecting a proper set of tasks is a very challenging problem. We prove that the problem is NP-hard and design an approximation solution to address it. To further improve efficiency, we design a pruning strategy and a preprocessing method, which effectively improve the performance of the proposed approximation solution. We verify the solutions with extensive experiments on a real crowdsourcing platform.
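The abstract does not spell out how tasks are selected, so the following is only a generic sketch of budgeted task selection, not the CrowdFusion algorithm itself. It assumes, for illustration, that each fact has an independent confidence value, that crowd answers are correct with a fixed probability `acc`, and that a task's usefulness is its expected entropy reduction. The names `expected_gain` and `select_tasks` are hypothetical. The paper's setting is harder because facts are correlated, which is what makes the selection problem NP-hard there.

```python
import math


def entropy(p):
    """Binary entropy (in bits) of a fact believed true with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def posterior(p, answer_true, acc):
    """Bayes update of P(fact is true) after one crowd answer.

    acc is the assumed probability that the crowd answers correctly.
    """
    like_true = acc if answer_true else 1 - acc
    like_false = 1 - acc if answer_true else acc
    return like_true * p / (like_true * p + like_false * (1 - p))


def expected_gain(p, acc):
    """Expected entropy reduction from asking the crowd about one fact."""
    p_yes = acc * p + (1 - acc) * (1 - p)  # chance the crowd says "true"
    after = (p_yes * entropy(posterior(p, True, acc))
             + (1 - p_yes) * entropy(posterior(p, False, acc)))
    return entropy(p) - after


def select_tasks(confidences, budget, acc=0.8):
    """Pick the `budget` facts with the largest expected gain.

    Assumption: facts are independent, so ranking by individual gain is
    enough. With correlated facts this simple ranking no longer applies.
    """
    if not 0.5 < acc < 1.0:
        raise ValueError("acc must lie strictly between 0.5 and 1.0")
    ranked = sorted(confidences,
                    key=lambda fact: expected_gain(confidences[fact], acc),
                    reverse=True)
    return ranked[:budget]


# Illustrative confidences only, not data from the paper.
facts = {"a": 0.5, "b": 0.95, "c": 0.6}
chosen = select_tasks(facts, budget=2)
```

Under these assumptions, the facts whose confidence is closest to 0.5 are asked first, since a crowd answer tells us the most about them.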