Read-Once Functions and Query Evaluation in Probabilistic Databases

Abstract

Probabilistic databases hold promise as a viable means for large-scale uncertainty management, which is increasingly needed in a number of real-world application domains. However, query evaluation in probabilistic databases remains a computational challenge. Prior work on efficient exact query evaluation in probabilistic databases has largely concentrated on query-centric formulations (e.g., safe plans, hierarchical queries), in that they consider only characteristics of the query and not the data in the database. It is easy to construct examples where a supposedly hard query run on an appropriate database gives rise to a tractable query evaluation problem. In this paper, we develop efficient query evaluation techniques that leverage characteristics of both the query and the data in the database. We focus on tuple-independent databases, where the query evaluation problem is equivalent to computing marginal probabilities of Boolean formulas associated with the result tuples. This latter task is easy if the Boolean formulas can be factorized into a form in which every variable appears at most once (called read-once). However, a naive approach that directly uses previously developed Boolean formula factorization algorithms is inefficient, because those algorithms require the input formulas to be in disjunctive normal form (DNF). We instead develop novel, more efficient factorization algorithms that directly construct the read-once expression for a result tuple's Boolean formula (if one exists), for a large subclass of queries (specifically, conjunctive queries without self-joins). We empirically demonstrate that (1) our proposed techniques are orders of magnitude faster than generic inference algorithms for queries where the result Boolean formulas can be factorized into read-once expressions, and (2) for the special case of hierarchical queries, they rival the efficiency of prior techniques specifically designed to handle such queries.
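To illustrate why the marginal probability is easy to compute once a formula is in read-once form, the following is a minimal sketch and not the paper's algorithm: it assumes the read-once expression is already given and only evaluates it. The class names `Var`, `And`, `Or`, the function `probability`, and the example lineage are hypothetical, chosen for illustration. Because every variable appears at most once and tuples are independent, the children of each node are independent events, so a conjunction multiplies probabilities and a disjunction multiplies complements.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str  # identifier of a tuple's Boolean variable (hypothetical naming)


@dataclass(frozen=True)
class And:
    children: Tuple["Expr", ...]


@dataclass(frozen=True)
class Or:
    children: Tuple["Expr", ...]


Expr = Union[Var, And, Or]


def leaves(expr: Expr) -> Iterator[str]:
    """Yield every variable occurrence in the expression tree."""
    if isinstance(expr, Var):
        yield expr.name
    else:
        for child in expr.children:
            yield from leaves(child)


def is_read_once_form(expr: Expr) -> bool:
    """True if no variable occurs more than once in the expression."""
    names = list(leaves(expr))
    return len(names) == len(set(names))


def _prob(expr: Expr, p: Dict[str, float]) -> float:
    if isinstance(expr, Var):
        return p[expr.name]
    child_probs = [_prob(child, p) for child in expr.children]
    if isinstance(expr, And):
        # Independent children: P(A and B) = P(A) * P(B).
        result = 1.0
        for q in child_probs:
            result *= q
        return result
    # Independent children: P(A or B) = 1 - (1 - P(A)) * (1 - P(B)).
    miss = 1.0
    for q in child_probs:
        miss *= 1.0 - q
    return 1.0 - miss


def probability(expr: Expr, p: Dict[str, float]) -> float:
    """Marginal probability of an expression already in read-once form."""
    if not is_read_once_form(expr):
        raise ValueError("expression repeats a variable; not in read-once form")
    return _prob(expr, p)


# Hypothetical lineage in DNF: x1 y1 OR x1 y2 OR x2 y3 (x1 occurs twice).
# An equivalent read-once factorization is: x1 (y1 OR y2) OR x2 y3.
expr = Or((
    And((Var("x1"), Or((Var("y1"), Var("y2"))))),
    And((Var("x2"), Var("y3"))),
))
probs = {name: 0.5 for name in ("x1", "x2", "y1", "y2", "y3")}
result = probability(expr, probs)
print(result)
```

The evaluation visits each node once, so it runs in time linear in the size of the expression. The hard part, which the abstract attributes to the paper's factorization algorithms, is finding the read-once expression in the first place; this sketch does not attempt that step.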

Bibliographic details

Source: International Conference on Very Large Data Bases, 2010, 12 pages
Authors: Prithviraj Sen; Amol Deshpande; Lise Getoor
Original format: PDF
Chinese Library Classification: TP311.13

