Lineage Tracing for General Data Warehouse Transformations

Abstract

Data warehousing systems integrate information from operational data sources into a central repository to enable analysis and mining of the integrated information. During the integration process, source data typically undergoes a series of transformations, which may vary from simple algebraic operations or aggregations to complex "data cleansing" procedures. In a warehousing environment, the data lineage problem is that of tracing warehouse data items back to the original source items from which they were derived. We formally define the lineage tracing problem in the presence of general data warehouse transformations, and we present algorithms for lineage tracing in this environment. Our tracing procedures take advantage of known structure or properties of transformations when present, but also work in the absence of such information. Our results can be used as the basis for a lineage tracing tool in a general warehousing setting, and also can guide the design of data warehouses that enable efficient lineage tracing.
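To make the lineage problem concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the algorithms the paper presents: the `SOURCE` rows, the `total_by_store` transformation, and both tracing functions are hypothetical names introduced for this example. It contrasts tracing through a transformation whose structure is known (an aggregation with a known grouping key) with a conservative fallback that assumes nothing about the transformation.

```python
from collections import defaultdict

# Hypothetical source rows: (store, product, amount).
SOURCE = [
    ("north", "pen", 3),
    ("north", "ink", 5),
    ("south", "pen", 2),
]


def total_by_store(rows):
    """Aggregation transformation: sum the amount per store."""
    totals = defaultdict(int)
    for store, _product, amount in rows:
        totals[store] += amount
    return sorted(totals.items())


def trace_aggregate(rows, output_item):
    """Known structure: lineage is the set of source rows sharing the output's group key."""
    store, _total = output_item
    return [row for row in rows if row[0] == store]


def trace_black_box(rows, output_item):
    """No known structure (assumed fallback): any input row may have contributed."""
    return list(rows)


# The "warehouse" holds the transformed data.
warehouse = total_by_store(SOURCE)

# Trace the warehouse item for the north store back to its source rows.
north_lineage = trace_aggregate(SOURCE, ("north", 8))
```

In this sketch, knowing the grouping key narrows the lineage of `("north", 8)` to the two north rows, while the fallback can only return the whole input. That difference is one way to read the abstract's point that known properties of a transformation help tracing.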
