QETL: An approach to on-demand ETL from non-owned data sources

Baldacci Lorenzo; Golfarelli Matteo; Graziani Simone; Rizzi Stefano


Abstract

In traditional OLAP systems, the ETL process loads all available data into the data warehouse before users start querying it. In some cases, this may be either inconvenient (because the data are supplied by a provider for a fee) or infeasible (because of their size); on the other hand, directly launching each analysis query on the source data would not enable data reuse, leading to poor performance and high costs. The alternative investigated in this paper is to fetch and store data on demand, i.e., as they are needed during the analysis process. In this direction we propose the Query-Extract-Transform-Load (QETL) paradigm to feed a multidimensional cube; the idea is to fetch facts from the source data provider, load them into the cube only when they are needed to answer some OLAP query, and drop them when free space is needed to load other facts. Remarkably, QETL includes an optimization step to cheaply extract the required data based on the specific features of the data provider. Experimental tests, conducted on a real case study in the genomics area, show that QETL effectively reuses data to cut extraction costs, thus leading to significant performance improvements.
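The fetch-on-demand and drop-when-full idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the class name OnDemandCube, the fetch_fact callback, the cell keys, and the least-recently-used eviction policy are all assumptions made for illustration, and the provider-specific optimization step mentioned in the abstract is not modeled.

```python
from collections import OrderedDict


class OnDemandCube:
    """Toy fact store: fetch missing facts from a provider, evict when full.

    Hypothetical illustration only; LRU eviction is an assumption, not
    something the abstract specifies.
    """

    def __init__(self, fetch_fact, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.fetch_fact = fetch_fact   # hypothetical provider call: cell key -> fact
        self.capacity = capacity       # maximum number of facts kept locally
        self.facts = OrderedDict()     # cell key -> fact, ordered by recency of use
        self.fetch_count = 0           # provider calls made (proxy for extraction cost)

    def query(self, cells):
        """Return facts for the requested cells, fetching only the missing ones."""
        result = {}
        for cell in cells:
            if cell in self.facts:
                self.facts.move_to_end(cell)        # reuse: mark as recently used
            else:
                if len(self.facts) >= self.capacity:
                    self.facts.popitem(last=False)  # drop the least recently used fact
                self.facts[cell] = self.fetch_fact(cell)
                self.fetch_count += 1
            result[cell] = self.facts[cell]
        return result


def demo_provider(cell):
    # Stand-in for a remote, possibly paid, data source (made-up values).
    dimension_a, dimension_b = cell
    return len(dimension_a) + len(dimension_b)


cube = OnDemandCube(demo_provider, capacity=3)
cube.query([("geneA", "sample1"), ("geneB", "sample1")])
cube.query([("geneA", "sample1"), ("geneB", "sample2")])  # one reused, one fetched
print(cube.fetch_count)  # 3 provider calls instead of 4
```

In this sketch the second query reuses one stored fact, so only three provider calls are made for four requested cells; that saving is the kind of data reuse the abstract credits for lower extraction costs.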


Bibliographic details

Source: Data & Knowledge Engineering, 2017, No. 11, pp. 17-37 (21 pages)
Authors: Baldacci Lorenzo; Golfarelli Matteo; Graziani Simone; Rizzi Stefano
Author affiliations:

Univ Bologna, DISI, Viale Risorgimento 2, I-40136 Bologna, Italy;

Univ Bologna, DISI, Viale Risorgimento 2, I-40136 Bologna, Italy;

Univ Bologna, DISI, Viale Risorgimento 2, I-40136 Bologna, Italy;

Univ Bologna, DISI, Viale Risorgimento 2, I-40136 Bologna, Italy;

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language: English
Keywords:
On-demand ETL; Incremental loading; OLAP;


Similar literature

1. On-demand big data integration: A hybrid ETL approach for reproducible scientific research [J]. Kathiravelu Pradeeban, Sharma Ashish, Galhardas Helena. Distributed and Parallel Databases, 2019, No. 2.
2. Analyzing data and data sources towards a unified approach for ensuring end-to-end data and data sources quality in healthcare 4.0 [J]. Mavrogiorgou Argyro, Kiourtis Athanasios, Perakis Konstantinos. Computer Methods and Programs in Biomedicine: An International Journal Devoted to the Development, Implementation and Exchange of Computing Methodology and Software Systems in Biomedical Research and Medical Practice, 2019.
3. Lenses: An On-Demand Approach to ETL [C]. Ying Yang, Niccolo Meneghetti, Ronny Fehling. International conference on very large data bases, 2015.
4. Domain-concept mining: An efficient on-demand data mining approach [D]. Mahamaneerat, Wannapa Kay. 2008.
5. Multi-Sensor-Fusion Approach for a Data-Science-Oriented Preventive Health Management System: Concept and Development of a Decentralized Data Collection Approach for Heterogeneous Data Sources [O]. Sebastian Neubert, André Geißler, Thomas Roddelkopf. 2019.
6. Theory and Limits of On-Demand Single-Photon Sources Using Plasmonic Resonators: A Quantized Quasinormal Mode Approach [O]. Stephen Hughes, Sebastian Franke, Chris Gustin. 2019.
