Conference: International Conference on Very Large Data Bases

Effective and Complete Discovery of Order Dependencies via Set-based Axiomatization


Abstract

Integrity constraints (ICs) are useful for query optimization and for expressing and enforcing application semantics. However, formulating constraints manually requires domain expertise, is prone to human error, and may be excessively time-consuming, especially on large datasets. Hence, proposals for automatic discovery have been made for some classes of ICs, such as functional dependencies (FDs) and, recently, order dependencies (ODs). ODs properly subsume FDs, as they can additionally express business rules involving order; e.g., an employee never has a higher salary while paying lower taxes than another employee. We present a new OD discovery algorithm enabled by a novel polynomial mapping to a canonical form of ODs, and a sound and complete set of axioms (inference rules) for canonical ODs. Our algorithm has exponential worst-case time complexity, O(2^{|R|}), in the number of attributes |R| and linear complexity in the number of tuples. We prove that it produces a complete and minimal set of ODs. Using real and synthetic datasets, we experimentally show orders-of-magnitude performance improvements over the prior state-of-the-art.
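To make the salary and tax example concrete, the following is a minimal sketch of how one could validate a single candidate OD between two attributes on a small table. It is not the discovery algorithm from the paper, which searches over attribute sets using the canonical form and axioms; it only checks one dependency by sorting. The function name `holds_od`, the row representation as dictionaries, and the sample data are assumptions made for illustration.

```python
from typing import Any, Mapping, Sequence


def holds_od(rows: Sequence[Mapping[str, Any]], lhs: str, rhs: str) -> bool:
    """Check whether `lhs` orders `rhs` on the given rows (illustrative only).

    The dependency holds when, for every pair of rows s and t,
    s[lhs] <= t[lhs] implies s[rhs] <= t[rhs]. A violation is either a
    "split" (equal lhs values, different rhs values, i.e. the implied FD
    fails) or a "swap" (lhs increases while rhs decreases).
    """
    # Sort by lhs, breaking ties by rhs, so violations show up between neighbors.
    ordered = sorted(rows, key=lambda r: (r[lhs], r[rhs]))
    for prev, cur in zip(ordered, ordered[1:]):
        if prev[lhs] == cur[lhs] and prev[rhs] != cur[rhs]:
            return False  # split: same lhs value, different rhs values
        if prev[rhs] > cur[rhs]:
            return False  # swap: higher lhs value paired with lower rhs value
    return True


# Hypothetical sample data, not taken from the paper's datasets.
employees = [
    {"name": "a", "salary": 50, "tax": 5},
    {"name": "b", "salary": 60, "tax": 7},
    {"name": "c", "salary": 70, "tax": 7},
]
with_swap = employees + [{"name": "d", "salary": 80, "tax": 6}]
with_split = employees + [{"name": "e", "salary": 50, "tax": 6}]

ok_result = holds_od(employees, "salary", "tax")
swap_result = holds_od(with_swap, "salary", "tax")
split_result = holds_od(with_split, "salary", "tax")
```

In this sketch, `ok_result` is true, while the row with salary 80 and tax 6 introduces a swap and the second row with salary 50 introduces a split, so both other checks fail. The check costs one sort per candidate; the difficulty the abstract addresses is the exponential number of candidate dependencies over the attribute set, which this sketch does not touch.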
