Conference: World Conference on Information Systems and Technologies

An Approach for Deriving Semantically Related Category Hierarchies from Wikipedia Category Graphs

Abstract

Wikipedia is the largest online encyclopedia known to date. Its rich content and semi-structured nature have made it a very valuable research tool for classification, information extraction, and semantic annotation, among other tasks. Many applications can benefit from the presence of a topic hierarchy in Wikipedia. However, what Wikipedia currently offers is a category graph built through hierarchical category links whose semantics are undefined. Because of this lack of semantics, a sub-category in Wikipedia does not necessarily comply with the concept of a sub-category in a hierarchy. Instead, all it signifies is that there is some sort of relationship between the parent category and its sub-category. As a result, traversing the category links of any given category can often yield surprising results. For example, following the category "Computing" down its sub-category links, one reaches the totally unrelated category "Theology". In this paper, we introduce a novel algorithm that, by measuring the semantic relatedness between any given Wikipedia category and the nodes in its sub-graph, is capable of extracting a category hierarchy containing only nodes that are relevant to the parent category. The algorithm has been evaluated by comparing its output with a gold standard data set. The experimental setup and results are presented.
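
The abstract does not state which relatedness measure or traversal strategy the algorithm uses, so the following is only a minimal sketch of the general idea: walk the sub-graph of a root category and keep a node only if its relatedness to the root clears a threshold. The Jaccard overlap of term sets, the threshold value, the function names (`jaccard`, `extract_hierarchy`), and the toy graph are all assumptions made for illustration; they are not taken from the paper and do not reflect actual Wikipedia links.

```python
from collections import deque


def jaccard(a, b):
    """Stand-in relatedness score (an assumption): Jaccard overlap of term sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def extract_hierarchy(root, subcats, terms, threshold=0.15):
    """Breadth-first walk from `root`, keeping only sufficiently related nodes.

    subcats: dict mapping a category to its list of sub-categories.
    terms:   dict mapping a category to a set of descriptive terms.
    Returns a dict mapping each kept category to its kept children.
    """
    hierarchy = {root: []}
    queue = deque([root])
    while queue:
        parent = queue.popleft()
        for child in subcats.get(parent, []):
            if child in hierarchy:
                continue  # already placed; this also guards against cycles
            if jaccard(terms[root], terms.get(child, set())) < threshold:
                continue  # prune the unrelated node and everything below it
            hierarchy[child] = []
            hierarchy[parent].append(child)
            queue.append(child)
    return hierarchy


# Hypothetical toy graph, invented for this example.
subcats = {
    "Computing": ["Software", "Philosophy of computing"],
    "Software": ["Operating systems"],
    "Philosophy of computing": ["Theology"],
}
terms = {
    "Computing": {"computer", "software", "hardware", "algorithm"},
    "Software": {"software", "program", "computer"},
    "Operating systems": {"software", "kernel", "computer", "hardware"},
    "Philosophy of computing": {"philosophy", "mind", "computer"},
    "Theology": {"religion", "god", "doctrine"},
}

tree = extract_hierarchy("Computing", subcats, terms)
```

In this toy run, "Theology" shares no terms with "Computing" and is dropped, while the other categories are kept. Comparing every node against the root, rather than against its immediate parent, is one possible design choice: it prevents relatedness from drifting gradually along a long chain of links.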