International Conference on Very Large Data Bases

Read-Once Functions and Query Evaluation in Probabilistic Databases


Abstract

Probabilistic databases hold the promise of being a viable means for large-scale uncertainty management, which is increasingly needed in a number of real-world application domains. However, query evaluation in probabilistic databases remains a computational challenge. Prior work on efficient exact query evaluation in probabilistic databases has largely concentrated on query-centric formulations (e.g., safe plans, hierarchical queries), which consider only characteristics of the query and not the data in the database. It is easy to construct examples where a supposedly hard query run on an appropriate database gives rise to a tractable query evaluation problem. In this paper, we develop efficient query evaluation techniques that leverage characteristics of both the query and the data in the database. We focus on tuple-independent databases, where the query evaluation problem is equivalent to computing marginal probabilities of the Boolean formulas associated with the result tuples. This latter task is easy if the Boolean formulas can be factorized into a form in which every variable appears at most once (called read-once). However, a naive approach that directly uses previously developed Boolean formula factorization algorithms is inefficient, because those algorithms require the input formulas to be in disjunctive normal form (DNF). We instead develop novel, more efficient factorization algorithms that directly construct the read-once expression for a result tuple's Boolean formula (if one exists), for a large subclass of queries (specifically, conjunctive queries without self-joins). We empirically demonstrate that (1) our proposed techniques are orders of magnitude faster than generic inference algorithms for queries where the result Boolean formulas can be factorized into read-once expressions, and (2) for the special case of hierarchical queries, they rival the efficiency of prior techniques specifically designed to handle such queries.
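To illustrate why a read-once form makes the marginal probability easy to compute, here is a minimal Python sketch. It is not the paper's factorization algorithm: it only evaluates an expression that is already factorized, assuming independent tuple variables. The class names (`Var`, `And`, `Or`), the function names, and the example formula x1·y1 ∨ x1·y2 with probability 0.5 per variable are all illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from itertools import product
from math import prod


# Hypothetical expression-tree types for Boolean formulas over tuple variables.
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class And:
    children: tuple


@dataclass(frozen=True)
class Or:
    children: tuple


def read_once_probability(expr, p):
    """Bottom-up probability rule. It is exact only when every variable
    occurs at most once, so that sibling subtrees are independent."""
    if isinstance(expr, Var):
        return p[expr.name]
    parts = [read_once_probability(c, p) for c in expr.children]
    if isinstance(expr, And):
        return prod(parts)                      # P(A and B) = P(A) * P(B)
    return 1.0 - prod(1.0 - q for q in parts)   # P(A or B) = 1 - (1-P(A))(1-P(B))


def evaluate(expr, world):
    """Truth value of the formula in one possible world."""
    if isinstance(expr, Var):
        return world[expr.name]
    vals = [evaluate(c, world) for c in expr.children]
    return all(vals) if isinstance(expr, And) else any(vals)


def brute_force_probability(expr, p):
    """Reference answer: sum the probabilities of all satisfying worlds
    (exponential in the number of variables)."""
    names = sorted(p)
    total = 0.0
    for bits in product([False, True], repeat=len(names)):
        world = dict(zip(names, bits))
        if evaluate(expr, world):
            total += prod(p[n] if world[n] else 1.0 - p[n] for n in names)
    return total


PROBS = {"x1": 0.5, "y1": 0.5, "y2": 0.5}
# DNF form: x1 appears twice, so the formula as written is not read-once.
DNF = Or((And((Var("x1"), Var("y1"))), And((Var("x1"), Var("y2")))))
# Equivalent read-once factorization: x1 and (y1 or y2).
FACTORED = And((Var("x1"), Or((Var("y1"), Var("y2")))))
```

On this example, `read_once_probability(FACTORED, PROBS)` agrees with `brute_force_probability(DNF, PROBS)`, while applying the same bottom-up rule directly to `DNF` gives a different (incorrect) value because the two conjuncts share `x1`. That gap is why the factorized form matters.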
