Journal: IEEE Transactions on Knowledge and Data Engineering

Understanding and Optimizing Conjunctive Predicates Under Memory-Efficient Storage Layouts



Abstract

Database queries often contain multiple predicates, and optimizing conjunctive predicates remains vital to the overall performance of analytic data processing. Prior work proposes several memory-efficient storage layouts, e.g., BitWeaving and ByteSlice, that significantly accelerate predicate evaluation: they exploit the circuit-level intra-cycle parallelism available in modern CPUs, which dramatically reduces the total number of instructions. However, the performance potential of conjunctive predicates under such storage layouts has not yet been realized, because no accurate cost model exists to provide the insights needed to guide optimization. In this paper, we propose a hybrid empirical/analytical cost model (Understanding) that unveils the performance characteristics of these storage layouts when applied to predicate evaluation. Our cost model accounts for the effects of non-linear factors, e.g., cache misses and branch mispredictions, and applies easily to different CPUs. Its main finding is a distinction between high-cost instructions (which suffer from cache misses and/or branch mispredictions) and low-cost instructions (which enjoy cache hits and correct branch predictions) in the context of predicate evaluation under these storage layouts. Guided by this finding, we propose a simple execution scheme, Hebe (Optimizing), which is order-oblivious while maintaining high performance. Hebe is attractive to the query optimizer (QO), as the QO does not need to go through a sampling process to decide the optimal evaluation order in advance. The intuition behind Hebe is to significantly reduce the number of high-cost instructions while keeping the number of low-cost instructions unchanged. Our findings from Hebe shed light on the importance of an accurate cost model in guiding the derivation of efficient execution schemes for query processing on modern CPUs.
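To make the abstract's terms concrete, the following is a minimal Python sketch of two generic ways to evaluate a conjunction of predicates over columns, plus a toy two-class cost estimate. It is not the paper's Hebe scheme, and it does not model the BitWeaving or ByteSlice layouts; the function names (`eval_short_circuit`, `eval_bitmap_and`, `estimated_cost`) and the per-instruction weights are hypothetical placeholders chosen only for illustration.

```python
from typing import Callable, List, Sequence

Predicate = Callable[[int], bool]


def eval_short_circuit(columns: Sequence[Sequence[int]],
                       predicates: Sequence[Predicate]) -> List[bool]:
    """Row-at-a-time evaluation that stops at the first failing predicate.

    Each test is a data-dependent branch, so the work done depends on
    the order in which the predicates are listed.
    """
    n_rows = len(columns[0])
    result = []
    for row in range(n_rows):
        ok = True
        for col, pred in zip(columns, predicates):
            if not pred(col[row]):
                ok = False
                break
        result.append(ok)
    return result


def eval_bitmap_and(columns: Sequence[Sequence[int]],
                    predicates: Sequence[Predicate]) -> List[bool]:
    """Column-at-a-time evaluation: one bitmap per predicate, then AND.

    A Python int stands in for the result bit vector. Every predicate
    scans its whole column, so the result does not depend on order.
    """
    n_rows = len(columns[0])
    mask = (1 << n_rows) - 1  # start with every row selected
    for col, pred in zip(columns, predicates):
        bits = 0
        for row, value in enumerate(col):
            bits |= int(pred(value)) << row
        mask &= bits
    return [bool((mask >> row) & 1) for row in range(n_rows)]


def estimated_cost(n_low: int, n_high: int,
                   c_low: float = 1.0, c_high: float = 20.0) -> float:
    """Toy two-class estimate: low-cost and high-cost instruction counts
    weighted separately. The default weights are made-up placeholders,
    not measurements from the paper.
    """
    return n_low * c_low + n_high * c_high


if __name__ == "__main__":
    cols = [[1, 5, 9, 3], [10, 20, 30, 40]]
    preds = [lambda v: v > 2, lambda v: v < 35]
    print(eval_short_circuit(cols, preds))
    print(eval_bitmap_and(cols, preds))
    print(estimated_cost(n_low=100, n_high=5))
```

Both functions return the same selection; they differ only in how much work is done per row and how predictable that work is. The toy estimate mirrors the abstract's stated intuition: lowering the count of high-cost instructions changes the total far more than the same change in low-cost instructions.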
