Bayesian networks for supporting query processing over incomplete autonomous databases

Rohit Raghunathan; Sushovan De; Subbarao Kambhampati

Abstract

As the information available to naive users through autonomous data sources continues to increase, mediators become important to ensure that the wealth of information available is tapped effectively. A key challenge that these information mediators need to handle is the varying levels of incompleteness in the underlying databases in terms of missing attribute values. Existing approaches such as QPIAD aim to mine and use Approximate Functional Dependencies (AFDs) to predict and retrieve relevant incomplete tuples. These approaches make independence assumptions about missing values, which critically hobbles their performance when there are tuples containing missing values for multiple correlated attributes. In this paper, we present a principled probabilistic alternative that views an incomplete tuple as defining a distribution over the complete tuples that it stands for. We learn this distribution in terms of Bayesian networks. Our approach involves mining/"learning" Bayesian networks from a sample of the database, and using them to do both imputation (predict a missing value) and query rewriting (retrieve relevant results with incompleteness on the query-constrained attributes, when the data sources are autonomous). We present empirical studies to demonstrate that (i) at higher levels of incompleteness, when multiple attribute values are missing, Bayesian networks do provide a significantly higher classification accuracy and (ii) the relevant possible answers retrieved by the queries reformulated using Bayesian networks provide higher precision and recall than AFDs while keeping query processing costs manageable.
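The idea that an incomplete tuple defines a distribution over the complete tuples it stands for can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy car sample, the fixed chain structure make -> model -> body, and the names `learn_cpts`, `joint`, and `completions` are all assumptions made for illustration (the paper learns the network from a database sample rather than fixing its structure by hand). The sketch estimates conditional probability tables by counting, then scores every joint completion of a tuple with several missing attributes, so that correlated missing values are predicted together rather than independently.

```python
from collections import Counter, defaultdict
from itertools import product

# Hypothetical toy sample of complete tuples: (make, model, body).
SAMPLE = [
    ("Honda", "Civic", "Sedan"),
    ("Honda", "Civic", "Sedan"),
    ("Honda", "Civic", "Coupe"),
    ("Honda", "Accord", "Sedan"),
    ("BMW", "Z4", "Convertible"),
    ("BMW", "Z4", "Convertible"),
    ("BMW", "330i", "Sedan"),
]

ATTRS = ("make", "model", "body")
# Assumed, hand-fixed structure (a chain): make -> model -> body.
PARENTS = {"make": None, "model": "make", "body": "model"}


def learn_cpts(rows):
    """Estimate P(attr | parent) by relative frequency in the sample."""
    counts = {a: defaultdict(Counter) for a in ATTRS}
    for row in rows:
        t = dict(zip(ATTRS, row))
        for a in ATTRS:
            parent_value = t[PARENTS[a]] if PARENTS[a] else None
            counts[a][parent_value][t[a]] += 1
    return {
        a: {
            pv: {v: n / sum(c.values()) for v, n in c.items()}
            for pv, c in counts[a].items()
        }
        for a in ATTRS
    }


def joint(cpts, t):
    """Joint probability of a complete tuple under the chain network."""
    p = 1.0
    for a in ATTRS:
        parent_value = t[PARENTS[a]] if PARENTS[a] else None
        p *= cpts[a].get(parent_value, {}).get(t[a], 0.0)
    return p


def completions(cpts, domains, incomplete):
    """Posterior over complete tuples consistent with the observed values."""
    missing = [a for a in ATTRS if incomplete.get(a) is None]
    scored = {}
    for values in product(*(domains[a] for a in missing)):
        t = dict(incomplete, **dict(zip(missing, values)))
        p = joint(cpts, t)
        if p > 0:
            scored[tuple(t[a] for a in ATTRS)] = p
    total = sum(scored.values())
    return {k: p / total for k, p in scored.items()} if total else {}


if __name__ == "__main__":
    cpts = learn_cpts(SAMPLE)
    domains = {a: sorted({r[i] for r in SAMPLE}) for i, a in enumerate(ATTRS)}
    # A tuple with two correlated attributes missing.
    posterior = completions(
        cpts, domains, {"make": "Honda", "model": None, "body": None}
    )
    for tup, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
        print(tup, round(p, 3))
```

Enumerating every completion is only practical for a handful of small-domain attributes; it is used here to keep the sketch short, not as a statement about how inference is performed in the paper.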

Bibliographic details

Source
Journal of Intelligent Information Systems | 2014, Issue 3 | pp. 595-618 | 24 pages
Authors
Rohit Raghunathan; Sushovan De; Subbarao Kambhampati
Author affiliations

Amazon, Seattle WA, USA

Computer Science and Engineering, Arizona State University, Tempe AZ, USA

Computer Science and Engineering, Arizona State University, Tempe AZ, USA

Original format: PDF
Language: English
Keywords
Data cleaning; Bayesian networks; Query rewriting; Autonomous database

Similar literature

1. Query processing over incomplete autonomous databases: query rewriting using learned data dependencies [J]. Garrett Wolf, Aravind Kalavagattu, Hemal Khatri. VLDB Journal. 2009, Issue 5
2. Query Processing Over Incomplete Databases [J]. Hathairat Ketmaneechairat. Journal of Digital Information Management. 2020, Issue 1
3. Query Processing Over Incomplete Databases [J]. Hathairat Ketmaneechairat. Journal of Information Security Research. 2020, Issue 1
4. Query Processing over Incomplete Autonomous Databases [C]. Garrett Wolf, Hemal Khatri, Bhaumik Chokshi. 33rd International Conference on Very Large Data Bases (VLDB 2007). 2007
5. Atomic commitment and query processing in database systems over wide-area active networks [D]. Zhang, Zhili. 1999
6. In-Network Processing of an Iceberg Join Query in Wireless Sensor Networks Based on 2-Way Fragment Semijoins [O]. Hyunchul Kang. 2015
7. Bayes Networks for Supporting Query Processing Over Incomplete Autonomous Databases [O]. Raghunathan, Rohit; De, Sushovan; Kambhampati, Subbarao. 2012
