Journal: Knowledge and Information Systems

An efficient path computing model for measuring semantic similarity using edge and density

Abstract

The shortest path between two concepts in a taxonomic ontology is commonly used to represent the semantic distance between them in edge-based semantic similarity measures. Edge counting, which is simple, intuitive, and computationally cheap, has long been the default method for path computation. However, a large lexical taxonomy such as WordNet covers a broad domain and therefore has irregular link densities between concepts, and edge counting-based path computation cannot account for this non-uniformity. In this paper, we argue that path computation can be separated from edge-based similarity measures and can form various general computing models. To address the non-uniformity of concept density in a large taxonomic ontology, we propose a new path computing model based on compensating for the local area density of concepts, which is defined as the number of direct hyponyms of the subsumers of the concepts on the shortest path. Following information theory, the model treats the local area density of concepts as an extension of the edge counting-based path. It is a general path computing model and can be applied in various edge-based similarity approaches. Experimental results show that the proposed path model improves the average optimal correlation between edge-based measures and human judgments from less than 0.79 to more than 0.86 on the Miller and Charles benchmark for WordNet, and from less than 0.75 to more than 0.82 on the Pedersen et al. benchmark (average of both Physician and Coder) for SNOMED-CT. It is also considerably more efficient than information content computation in a dynamic ontology, making the edge-based similarity measure both accurate and efficient.
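To make the contrast between plain edge counting and a density-aware path concrete, the following is a minimal Python sketch on a toy taxonomy. The abstract does not give the model's formula, so everything here is an assumption for illustration: the taxonomy data, the function names, the `alpha` parameter, and in particular the weighting `1 + alpha * log2(direct hyponyms of the subsumer)`, including the choice that denser regions lengthen the path. It should not be read as the authors' model.

```python
import math

# Toy is-a taxonomy (child -> parent). Hypothetical data, not WordNet.
PARENT = {
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "bird": "animal",
    "fish": "animal",
    "tree": "plant",
}


def ancestors(concept):
    """Return the chain [concept, parent, ..., root]."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def direct_hyponyms(concept):
    """Count the direct children of a concept (its local density here)."""
    return sum(1 for parent in PARENT.values() if parent == concept)


def path_edges(a, b):
    """Edges (child, subsumer) on the shortest path between a and b.

    In a single-rooted tree the shortest path runs through the
    lowest common subsumer of the two concepts.
    """
    up_a, up_b = ancestors(a), ancestors(b)
    in_b = set(up_b)
    lcs = next(c for c in up_a if c in in_b)
    edges = []
    for chain in (up_a, up_b):
        for child in chain[:chain.index(lcs)]:
            edges.append((child, PARENT[child]))
    return edges


def edge_count_path(a, b):
    """Classic path length: every edge costs exactly 1."""
    return len(path_edges(a, b))


def density_path(a, b, alpha=1.0):
    """Assumed density-compensated length (illustrative only).

    Each edge costs 1 plus alpha * log2(number of direct hyponyms
    of the subsumer at the upper end of that edge).
    """
    return sum(
        1.0 + alpha * math.log2(direct_hyponyms(subsumer))
        for _, subsumer in path_edges(a, b)
    )


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "tree")]:
        print(pair, edge_count_path(*pair), density_path(*pair))
```

With `alpha=0.0` the density term vanishes and `density_path` reduces to edge counting, which mirrors the abstract's description of the model as an extension of the edge counting-based path. Because the path function is separate from any similarity formula, either length could be passed to an edge-based similarity measure unchanged.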
