
Large scale instance matching via multiple indexes and candidate selection



Abstract

Instance matching aims to discover links between different descriptions of the same real-world objects across heterogeneous data sources. With the rapid development of the Semantic Web, especially of linked data, automatic instance matching has become a fundamental issue for ontological data sharing and integration. The instances in ontologies are often large in scale, numbering in the millions or even hundreds of millions of objects, so directly applying earlier schema-level ontology matching methods is infeasible. In this paper, we systematically investigate the characteristics of instance matching and propose a scalable and efficient instance matching approach named VMI. VMI generates multiple vectors for the different kinds of information contained in the ontology instances, and uses a set of rules based on inverted indexes to obtain the primary matching candidates. It then employs user-customized property values to further eliminate incorrect matches. Finally, the similarities of the matching candidates are computed as integrated vector distances, and the matching results are extracted. Experiments on the instance matching track of OAEI 2009 and OAEI 2010 show that the proposed method achieves better effectiveness and efficiency than the top performer RiMOM on most of the datasets: a speedup of more than 100 times and a slightly better F1-score (+3.0% to 5.0%). Experiments on LinkedMDB and DBpedia show that VMI obtains results comparable to those of the SILK system (about 26,000 results of good quality).
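The pipeline described in the abstract (multiple vectors per instance, candidate selection through inverted indexes, filtering on a user-chosen property, and a combined vector similarity) can be illustrated with a minimal sketch. This is not the authors' VMI implementation: the split into a "name" vector and a "desc" vector, the single candidate rule (share at least one name term), the cosine measure, the weights, the threshold, and all function and property names are assumptions made for illustration only.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a value and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", str(text).lower())


def build_vectors(instance):
    """Assumed split: one term vector for the name, one for all other properties."""
    name_vec = Counter(tokenize(instance.get("name", "")))
    desc_vec = Counter()
    for prop, value in instance.items():
        if prop not in ("id", "name"):
            desc_vec.update(tokenize(value))
    return {"name": name_vec, "desc": desc_vec}


def build_inverted_index(vectors_by_id, kind):
    """Map each term of the chosen vector kind to the ids of instances containing it."""
    index = defaultdict(set)
    for inst_id, vecs in vectors_by_id.items():
        for term in vecs[kind]:
            index[term].add(inst_id)
    return index


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match(source, target, filter_prop=None, weights=(0.7, 0.3), threshold=0.5):
    """Return (source_id, target_id, score) triples, best score first."""
    src_vecs = {inst["id"]: build_vectors(inst) for inst in source}
    tgt_vecs = {inst["id"]: build_vectors(inst) for inst in target}
    tgt_by_id = {inst["id"]: inst for inst in target}
    name_index = build_inverted_index(tgt_vecs, "name")

    results = []
    for src in source:
        sv = src_vecs[src["id"]]
        # Candidate rule (assumed): the target shares at least one name term.
        candidates = set()
        for term in sv["name"]:
            candidates |= name_index.get(term, set())
        for tgt_id in sorted(candidates):
            tgt = tgt_by_id[tgt_id]
            # Filter (assumed): drop pairs whose chosen property values disagree.
            if filter_prop is not None:
                a, b = src.get(filter_prop), tgt.get(filter_prop)
                if a is not None and b is not None and a != b:
                    continue
            tv = tgt_vecs[tgt_id]
            # Combined score (assumed): weighted sum of per-vector cosines.
            score = (weights[0] * cosine(sv["name"], tv["name"])
                     + weights[1] * cosine(sv["desc"], tv["desc"]))
            if score >= threshold:
                results.append((src["id"], tgt_id, score))
    return sorted(results, key=lambda r: -r[2])


# Made-up toy data, not taken from the paper's datasets.
source = [{"id": "s1", "name": "The Matrix", "year": 1999, "director": "Wachowski"}]
target = [
    {"id": "t1", "name": "Matrix, The", "year": 1999, "director": "Wachowski"},
    {"id": "t2", "name": "The Matrix Reloaded", "year": 2003, "director": "Wachowski"},
    {"id": "t3", "name": "Inception", "year": 2010, "director": "Nolan"},
]

for pair in match(source, target, filter_prop="year"):
    print(pair)
```

In this toy run the inverted index limits scoring to the two targets that share a name term with the source, and the `year` filter then discards the sequel before any vector distance is computed. Avoiding a full pairwise comparison in this way is the kind of saving the abstract attributes to candidate selection.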

Bibliographic details

  • Source
    Knowledge-Based Systems | 2013, Issue 9 | pp. 112-120 | 9 pages
  • Author affiliations

    Department of Computer Science and Technology, Tsinghua University, Beijing, China;

    Department of Computer Science and Technology, Tsinghua University, Beijing, China; College of Information Science and Technology, Beijing Normal University, Beijing, China;

    Department of Computer Science and Technology, Tsinghua University, Beijing, China;

    Department of Computer Science and Technology, Tsinghua University, Beijing, China;

  • Indexing information
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    Heterogeneous data; Semantic web; Instance matching; Ontology matching; Linked data;

