Journal: IEEE Transactions on Knowledge and Data Engineering

DataOps-4G: On Supporting Generalists in Data Quality Discovery

Abstract

Data preparation has become a necessary but labor- and resource-intensive step in data analytics. To date, such activities still require considerable manual effort from experts. In this paper, we focus on a specific data preparation activity, namely data quality discovery. We explore different settings in which data workers undertake data quality discovery tasks and the implications of those settings for the efficiency and effectiveness of data workers. To this end, we propose DataOps-4G, a data quality discovery platform for generalists that allows users to interact with data without the need to write code. We wrap pre-defined code snippets that implement useful functionalities for exploring data quality and bundle them into so-called DataOps. We then conduct a lab-based user study to evaluate the DataOps-4G platform from two perspectives: (i) effectiveness, the accuracy of the outcomes achieved by participants; and (ii) efficiency, their effort and strategies in task completion. Our experimental results uncover how effectiveness and efficiency can be affected by participants' task completion patterns and strategies. This opens up the possibility of popularizing data quality discovery processes by employing non-experts (e.g., from crowdsourcing platforms), consequently allowing experts to focus on more complex activities (e.g., building machine learning models).
