Journal of Data and Information Science

A Tailor-made Data Quality Approach for Higher Educational Data



Abstract

Purpose: This paper addresses the definition of data quality procedures for knowledge organizations such as Higher Education Institutions. Its main purpose is to present the flexible approach developed for monitoring the data quality of the European Tertiary Education Register (ETER) database, illustrating how it works and highlighting the main challenges that remain in this domain.

Design/methodology/approach: The proposed data quality methodology is based on two kinds of checks: one assesses the consistency of cross-sectional data, and the other evaluates the stability of multiannual data. The methodology has an operational and empirical orientation: the proposed checks do not assume any theoretical distribution when determining the threshold parameters that identify potential outliers, inconsistencies, and errors in the data.

Findings: We show that the proposed cross-sectional and multiannual checks help to identify outliers and extreme observations and to detect ontological inconsistencies not described in the available metadata. For this reason, they may be a useful complement to the processing of the available information.

Research limitations: The coverage of the study is limited to European Higher Education Institutions. The cross-sectional and multiannual checks are not yet completely integrated.

Practical implications: Considering the quality of the available data and information is important for data quality-aware empirical investigations, as it highlights problems and the areas in which to invest to improve the coverage and interoperability of data in future data collection initiatives.

Originality/value: The data-driven quality checks proposed in this paper may serve as a reference for building and monitoring the data quality of new databases, or of existing databases available for other countries or systems characterized by high heterogeneity and complexity of the units of analysis, without relying on pre-specified theoretical distributions.
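The two kinds of checks described in the abstract can be illustrated with a minimal Python sketch. It is not the ETER procedure, whose actual rules and thresholds are not given here. It assumes that a cross-sectional check compares a ratio of two variables across institutions within one year using fences derived from the empirical quartiles (so no theoretical distribution is assumed), and that a multiannual check flags large year-on-year relative changes. The field names (`id`, `students`, `staff`), the fence multiplier `k`, and the cutoff `max_rel_change` are all hypothetical.

```python
from statistics import quantiles


def empirical_bounds(values, k=3.0):
    """Return (low, high) fences computed from the empirical quartiles.

    The fences depend only on the observed data, not on a theoretical
    distribution. The multiplier k is an illustrative assumption.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def cross_sectional_flags(records, numerator, denominator, k=3.0):
    """Flag units whose numerator/denominator ratio lies outside the fences.

    `records` is a list of dicts for a single reference year. Units with a
    zero or missing denominator are skipped rather than flagged.
    """
    ratios = {
        r["id"]: r[numerator] / r[denominator]
        for r in records
        if r.get(denominator)
    }
    low, high = empirical_bounds(list(ratios.values()), k)
    return sorted(uid for uid, v in ratios.items() if v < low or v > high)


def multiannual_flags(series, max_rel_change=0.5):
    """Flag years whose value differs from the previous year's value by more
    than max_rel_change in relative terms.

    `series` maps year -> value for one unit and one variable. The fixed
    cutoff is a hypothetical placeholder for a data-driven threshold.
    """
    flagged = []
    years = sorted(series)
    for prev, curr in zip(years, years[1:]):
        base = series[prev]
        if base and abs(series[curr] - base) / abs(base) > max_rel_change:
            flagged.append(curr)
    return flagged
```

A flagged unit or year in this sketch is only a candidate for manual review. As the abstract notes, such cases may reflect genuine extreme observations or undocumented definitional differences rather than errors.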
