Handling probabilistic integrity constraints in pay-as-you-go reconciliation of data models

Nguyen Quoc Viet Hung; Weidlich Matthias; Nguyen Thanh Tam; Miklos Zoltan; Aberer Karl; Gal Avigdor; Stantic Bela

Abstract

Data models capture the structure and characteristic properties of data entities, e.g., in terms of a database schema or an ontology. They are the backbone of diverse applications, reaching from information integration, through peer-to-peer systems and electronic commerce to social networking. Many of these applications involve models of diverse data sources. Effective utilisation and evolution of data models, therefore, calls for matching techniques that generate correspondences between their elements. Various such matching tools have been developed in the past. Yet, their results are often incomplete or erroneous, and thus need to be reconciled, i.e., validated by an expert. This paper analyses the reconciliation process in the presence of large collections of data models, where the network induced by generated correspondences shall meet consistency expectations in terms of integrity constraints. We specifically focus on how to handle data models that show some internal structure and potentially differ in terms of their assumed level of abstraction. We argue that such a setting calls for a probabilistic model of integrity constraints, for which satisfaction is preferred, but not required. In this work, we present a model for probabilistic constraints that enables reasoning on the correctness of individual correspondences within a network of data models, in order to guide an expert in the validation process. To support pay-as-you-go reconciliation, we also show how to construct a set of high-quality correspondences, even if an expert validates only a subset of all generated correspondences. We demonstrate the efficiency of our techniques for real-world datasets comprising database schemas and ontologies from various application domains. (C) 2019 Elsevier Ltd. All rights reserved.
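
The abstract does not spell out the probabilistic model, so the following is only a toy sketch of the general idea, not the authors' formulation. It assumes a handful of hypothetical candidate correspondences with matcher confidences, a single soft one-to-one constraint whose violation scales a possible world's weight by an assumed `PENALTY` instead of ruling it out, brute-force enumeration of all worlds, and an entropy-based choice of which correspondence to show the expert next. All names and numbers are illustrative assumptions.

```python
from itertools import product
from math import log2

# Hypothetical candidates: (element of model A, element of model B) -> matcher confidence.
CANDIDATES = {
    ("A.name", "B.fullName"): 0.8,
    ("A.name", "B.nickname"): 0.6,
    ("A.zip", "B.postcode"): 0.9,
}
PENALTY = 0.1  # assumed factor applied once per violated soft constraint


def violations(selected):
    """Count soft one-to-one violations: selected pairs that share an element."""
    count = 0
    for i, (a1, b1) in enumerate(selected):
        for a2, b2 in selected[i + 1:]:
            if a1 == a2 or b1 == b2:
                count += 1
    return count


def marginals(candidates, validated=None, penalty=PENALTY):
    """Probability that each correspondence is correct, by enumerating all worlds.

    `validated` maps a correspondence to the expert's True/False verdict;
    worlds that contradict a verdict are discarded.
    """
    validated = validated or {}
    keys = list(candidates)
    total = 0.0
    mass = {k: 0.0 for k in keys}
    for world in product([True, False], repeat=len(keys)):
        if any(validated.get(k, w) != w for k, w in zip(keys, world)):
            continue
        weight = 1.0
        for k, w in zip(keys, world):
            weight *= candidates[k] if w else 1.0 - candidates[k]
        selected = [k for k, w in zip(keys, world) if w]
        weight *= penalty ** violations(selected)  # violation is penalised, not forbidden
        total += weight
        for k in selected:
            mass[k] += weight
    return {k: mass[k] / total for k in keys}


def entropy(p):
    """Binary entropy in bits; zero once a correspondence is certain."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def next_to_validate(probs, validated):
    """Pick the unvalidated correspondence the toy model is least sure about."""
    open_keys = [k for k in probs if k not in validated]
    return max(open_keys, key=lambda k: entropy(probs[k]))


def instantiate(probs, threshold=0.5):
    """Pay-as-you-go output: keep correspondences currently believed correct."""
    return sorted(k for k, p in probs.items() if p > threshold)


if __name__ == "__main__":
    probs = marginals(CANDIDATES)
    print(probs)
    print("ask expert about:", next_to_validate(probs, {}))
    print("current selection:", instantiate(probs))
```

In this toy setting the two correspondences that share `A.name` pull each other's probability down, while the unrelated one keeps its matcher confidence. Passing an expert verdict through `validated` updates the remaining probabilities, and `instantiate` can be called at any point to obtain a selection from partial validation. Enumeration is exponential in the number of candidates and is used here only to keep the sketch short.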

Bibliographic details

Source
Information Systems | 2019, Issue 7 | pp. 166-180 | 15 pages
Authors
Nguyen Quoc Viet Hung; Weidlich Matthias; Nguyen Thanh Tam; Miklos Zoltan; Aberer Karl; Gal Avigdor; Stantic Bela
Author affiliations

Griffith Univ, Gold Coast, Australia;

Humboldt Univ, Berlin, Germany;

Ecole Polytech Fed Lausanne, Lausanne, Switzerland;

Univ Rennes 1, Rennes, France;

Ecole Polytech Fed Lausanne, Lausanne, Switzerland;

Israel Inst Technol, Haifa, Israel;

Griffith Univ, Gold Coast, Australia;

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language: English
Keywords
Data integration; Probabilistic constraints; Model reconciliation
