...
首页> 外文期刊>International Journal of Innovative Computing Information and Control >CETLH: A NEW HISTOGRAM-BASED CARDINALITY ESTIMATE APPROACH
【24h】

CETLH: A NEW HISTOGRAM-BASED CARDINALITY ESTIMATE APPROACH

机译:CETLH:一种新的基于直方图的基数估计方法

获取原文
获取原文并翻译 | 示例
           

摘要

For the mainstream relational database management systems, histograms play important roles in cardinality estimate. The main histogram-based cardinality estimate approaches can be classified into two categories: the proactive approach and the reactive approach. For the former, histograms are constructed and updated by periodical data scans. Data updates can not be incorporated into a histogram in real time, so between two data scans, large errors of cardinality estimate will occur. For the latter, data scans are avoided, as an alternative, query feedback records (QFRs) are collected to construct and update histograms. Although data updates can be incorporated into a histogram by replacing stale QFRs in real time, the cost of time is very expensive. For each histogram reconstruction, all buckets in the histogram will be recalculated and the large amount of computation leads to the inefficiency of the reactive approach. In this paper, we propose a novel QFR-based cardinality estimate approach which can balance the efficiency issue and the data update issue: on the one hand, it can improve the efficiency of QFR-based cardinality estimate to a practical level; on the other hand, it can incorporate data updates into a histogram in real time to fully ensure the accuracy of cardinality estimate. In our approach, a serial of small second-level histograms covering different parts of the whole value range of an attribute will be constructed. These second-level histograms can be updated independently over time to ensure the performance of the approach. As the update of each second-level histogram, QFRs still play their important roles in incorporating data updates into the histogram in real time. Extensive comparison experiments fully demonstrate the advantages of our approach in performance and accuracy.
机译:对于主流的关系数据库管理系统,直方图在基数估计中起着重要作用。基于直方图的主要基数估计方法可以分为两类:主动方法和被动方法。对于前者,通过定期数据扫描来构建和更新直方图。数据更新无法实时并入直方图中,因此在两次数据扫描之间,将发生基数估计的较大误差。对于后者,避免了数据扫描,或者,可以收集查询反馈记录(QFR)来构建和更新直方图。尽管可以通过实时替换陈旧的QFR将数据更新合并到直方图中,但是时间成本非常昂贵。对于每个直方图重构,将重新计算直方图中的所有存储桶,并且大量的计算导致反应式方法效率低下。在本文中,我们提出了一种新颖的基于QFR的基数估计方法,该方法可以在效率问题和数据更新问题之间取得平衡:一方面,它可以将基于QFR的基数估计的效率提高到实际水平。另一方面,它可以将数据更新实时合并到直方图中,以完全确保基数估计的准确性。在我们的方法中,将构造一系列小的二级直方图,它们覆盖属性整个值范围的不同部分。这些第二级直方图可以随时间独立更新,以确保方法的性能。随着每个第二级直方图的更新,QFR在实时将数据更新并入直方图中仍发挥着重要作用。大量的比较实验充分证明了我们的方法在性能和准确性方面的优势。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号