International Conference on Computational Semantics

Unsupervised Learning of Meaningful Semantic Classes for Entity Aggregates



Abstract

This paper addresses the task of semantic class learning by introducing a new methodology for identifying the set of semantic classes underlying an aggregate of instances (i.e., a set of nominal phrases observed in a particular semantic role in a collection of text documents). The aim is to identify a set of classes that are semantically coherent (i.e., interpretable) and sufficiently general to accurately describe the full extension that the set of instances is intended to represent. The learned classes are then used to devise a generative model for entity categorization tasks such as semantic class induction. The proposed methods are completely unsupervised and rely on an unlabeled, open-domain collection of text documents as the source of background knowledge. We demonstrate our proposal on a collection of news stories. Specifically, we model the set of classes underlying the predicate arguments in a Proposition Store built from the news. Our experiments show significant improvements over a baseline generative model of entities based on latent classes, defined by means of Hierarchical Dirichlet Processes.
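
As a rough illustration of what an "aggregate of instances" could look like in practice, the sketch below groups nominal phrases by the predicate and semantic role they fill. The tuple layout, the toy propositions, and the name `build_aggregates` are assumptions made for this example; the abstract does not describe the actual format of the Proposition Store, and this is not the paper's class-learning method.

```python
from collections import Counter, defaultdict

# Hypothetical toy propositions: (predicate, semantic role, nominal phrase).
# The real Proposition Store format is not given in the abstract.
PROPOSITIONS = [
    ("win", "subject", "team"),
    ("win", "subject", "player"),
    ("win", "subject", "team"),
    ("win", "object", "game"),
    ("win", "object", "award"),
    ("sign", "subject", "player"),
]


def build_aggregates(propositions):
    """Group nominal phrases by the (predicate, role) slot they fill.

    Each slot maps to a Counter of phrases, i.e. one aggregate of instances.
    """
    aggregates = defaultdict(Counter)
    for predicate, role, phrase in propositions:
        aggregates[(predicate, role)][phrase] += 1
    return aggregates


aggregates = build_aggregates(PROPOSITIONS)
# Instances observed as the subject of "win", most frequent first.
print(aggregates[("win", "subject")].most_common())
```

Each resulting counter is one aggregate; a class-learning step such as the one the abstract proposes would take aggregates like these as input.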
