Journal: Information Systems

Multi-query Optimization For Sketch-based Estimation



Abstract

Randomized techniques, based on computing small "sketch" synopses for each stream, have recently been shown to be a very effective tool for approximating the result of a single SQL query over streaming data tuples. In this paper, we investigate the problems arising when data-stream sketches are used to process multiple such queries concurrently. We demonstrate that, in the presence of multiple query expressions, intelligently sharing sketches among concurrent query evaluations can result in substantial improvements in the utilization of the available sketching space and the quality of the resulting approximation error guarantees. We provide necessary and sufficient conditions for multi-query sketch sharing that guarantee the correctness of the result-estimation process. We also investigate the difficult optimization problem of determining sketch-sharing configurations that are optimal (e.g., under a certain error metric for a given amount of space). We prove that optimal sketch sharing typically gives rise to NP-hard questions, and we propose novel heuristic algorithms for finding good sketch-sharing configurations in practice. Results from our experimental study with queries from the TPC-H benchmark verify the effectiveness of our approach, clearly demonstrating the benefits of our sketch-sharing methodology.
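To make the idea of sketch sharing concrete, here is a minimal illustration in Python. It is not the paper's algorithm. It assumes a classic AMS ("tug-of-war") sketch for join-size estimation, and two hypothetical queries, R JOIN S and S JOIN T, that join on the same attribute of S. Under that assumption one sketch of S can serve both queries, so three sketches are kept instead of four. The class name, the function names, the seed, and the use of a fully random sign table (in place of a compact four-wise independent hash family) are simplifications chosen for this example.

```python
import random
from collections import Counter
from statistics import median


class AMSSketch:
    """Tug-of-war sketch: each counter is X_i = sum over v of f(v) * xi_i(v)."""

    def __init__(self, domain_size, copies, seed):
        rng = random.Random(seed)
        # Assumption: one explicit table of random +/-1 signs per copy.
        # A real streaming system would use a small four-wise independent hash.
        self.signs = [[rng.choice((-1, 1)) for _ in range(domain_size)]
                      for _ in range(copies)]
        self.counters = [0] * copies

    def update(self, value, weight=1):
        """Process one stream tuple; weight=-1 models a deletion."""
        for i, row in enumerate(self.signs):
            self.counters[i] += weight * row[value]


def estimate_join_size(a, b, groups=5):
    """Median of means of per-copy products.

    Both sketches must be built from the same sign tables (same seed),
    and the number of copies is assumed to be a multiple of `groups`.
    """
    products = [x * y for x, y in zip(a.counters, b.counters)]
    size = len(products) // groups
    means = [sum(products[g * size:(g + 1) * size]) / size
             for g in range(groups)]
    return median(means)


def exact_join_size(left, right):
    """Ground truth for comparison: sum over v of f_left(v) * f_right(v)."""
    fl, fr = Counter(left), Counter(right)
    return sum(fl[v] * fr[v] for v in fl)


if __name__ == "__main__":
    DOMAIN, COPIES, SEED = 8, 50, 7
    r_stream = [0, 1, 1, 3, 3, 3, 5]
    s_stream = [1, 3, 3, 4, 5, 5]
    t_stream = [3, 5, 5, 7]

    # Three sketches in total: the sketch of S is shared by both queries.
    r, s, t = (AMSSketch(DOMAIN, COPIES, SEED) for _ in range(3))
    for sketch, stream in ((r, r_stream), (s, s_stream), (t, t_stream)):
        for value in stream:
            sketch.update(value)

    print("R join S: estimate", estimate_join_size(r, s),
          "exact", exact_join_size(r_stream, s_stream))
    print("S join T: estimate", estimate_join_size(s, t),
          "exact", exact_join_size(s_stream, t_stream))
```

With a fixed space budget, the counters saved by not keeping a second sketch of S could go toward more copies per sketch, which is the intuition behind the tighter error guarantees the abstract mentions. When sharing is actually valid in general, and how to choose a good sharing configuration, are the questions the paper itself addresses.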
