A parallel query processing system based on graph-based database partitioning

Nam Yoon-Min; Han Donghyoung; Kim Min-Soo


Abstract

Because parallel database systems must process large amounts of data, it is important to use a scalable and efficient horizontal database partitioning method. Existing partitioning methods have major drawbacks: they not only cause large amounts of data redundancy but also still require expensive shuffle operations for join queries in many cases, despite their high data redundancy. We elucidate the drawbacks originating from the tree-based partitioning schemes and propose a novel graph-based database partitioning method called GPT that both improves query performance and reduces data redundancy. We integrate the proposed GPT method into a parallel query processing system, Spark SQL, across all the relevant layers and modules, including the query plan generator and the scan operator. Through extensive experiments using three benchmarks, TPC-DS, IMDB and BioWarehouse, we show that GPT significantly outperforms the state-of-the-art method in terms of both storage overhead and query performance. © 2018 Elsevier Inc. All rights reserved.
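The shuffle cost mentioned in the abstract can be illustrated with a minimal sketch. It uses plain hash partitioning, not the GPT method, and the tables (`orders`, `customers`), column names, and helper functions are hypothetical. When both tables are horizontally partitioned on the join key, each partition can be joined locally; when one table is partitioned on a different column, matching rows land in different partitions and a local join misses them, which is why a system would have to shuffle (repartition) first.

```python
from collections import defaultdict

def hash_partition(rows, key, num_partitions):
    """Horizontally partition rows (dicts) by the integer value of one column."""
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[row[key] % num_partitions].append(row)
    return parts

def local_join(left_parts, right_parts, key):
    """Join partition i of the left table only with partition i of the right table."""
    joined = []
    for left_part, right_part in zip(left_parts, right_parts):
        # Build a hash index on the right partition, then probe it.
        index = defaultdict(list)
        for r in right_part:
            index[r[key]].append(r)
        for l in left_part:
            for r in index[l[key]]:
                joined.append({**l, **r})
    return joined

# Hypothetical toy tables: 10 orders, each referencing one of 5 customers.
orders = [{"order_id": i, "cust_id": i % 5} for i in range(10)]
customers = [{"cust_id": c, "name": f"customer-{c}"} for c in range(5)]
P = 3  # number of partitions (assumed)

customer_parts = hash_partition(customers, "cust_id", P)

# Case 1: orders partitioned on the join key, so matching rows are co-located.
co_located = local_join(hash_partition(orders, "cust_id", P),
                        customer_parts, "cust_id")

# Case 2: orders partitioned on order_id, so matching rows may sit in
# different partitions and a purely local join loses results.
misaligned = local_join(hash_partition(orders, "order_id", P),
                        customer_parts, "cust_id")

print(len(co_located), len(misaligned))
```

In the second case a real system would repartition `orders` on `cust_id` before joining, and that data movement is the shuffle a good partitioning scheme tries to avoid.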


Bibliographic details

Source
Information Sciences: An International Journal | 2019, Issue 2019 | 24 pages
Authors
Nam Yoon-Min; Han Donghyoung; Kim Min-Soo
Author affiliations

DGIST, Daegu, South Korea

DGIST, Daegu, South Korea

DGIST, Daegu, South Korea

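Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Automatic information theory; Computer applications; Information and knowledge dissemination; Automation technology, computer technology
Keywords
Horizontal database partitioning; Graph-based partitioning; Parallel query processing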


