Conference: International Conference on Computational Semantics

Unsupervised Learning of Meaningful Semantic Classes for Entity Aggregates

Abstract

This paper addresses the task of semantic class learning by introducing a new methodology to identify the set of semantic classes underlying an aggregate of instances (i.e., a set of nominal phrases observed in a particular semantic role in a collection of text documents). The aim is to identify a set of classes that are semantically coherent (i.e., interpretable) and general enough to accurately describe the full extension that the set of instances is intended to represent. The set of learned classes is then used to devise a generative model for entity categorization tasks such as semantic class induction. The proposed methods are completely unsupervised and rely on an unlabeled, open-domain collection of text documents as the source of background knowledge. We demonstrate our proposal on a collection of news stories. Specifically, we model the set of classes underlying the predicate arguments in a Proposition Store built from the news. The experiments show significant improvements over a baseline generative model of entities based on latent classes defined by means of Hierarchical Dirichlet Processes.
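To make the notion of an "aggregate of instances" concrete, the following is a minimal sketch of how nominal phrases might be grouped by the predicate slot they fill. The abstract does not describe the Proposition Store's format, so the `(predicate, role, phrase)` tuple layout, the sample data, and the `build_aggregates` helper are all assumptions made for illustration; this is not the authors' implementation and it does not perform any class learning.

```python
from collections import Counter, defaultdict

# Hypothetical propositions: (predicate, semantic role, nominal phrase).
# The tuple layout and the data are illustrative assumptions only.
propositions = [
    ("arrest", "object", "suspect"),
    ("arrest", "object", "man"),
    ("arrest", "object", "suspect"),
    ("arrest", "subject", "police"),
    ("sign", "object", "contract"),
]


def build_aggregates(props):
    """Group nominal phrases by the (predicate, role) slot they fill.

    Each aggregate maps a phrase to the number of times it was observed
    in that slot.
    """
    aggregates = defaultdict(Counter)
    for predicate, role, phrase in props:
        aggregates[(predicate, role)][phrase] += 1
    return aggregates


aggregates = build_aggregates(propositions)
# The aggregate of instances observed as the object of "arrest".
print(aggregates[("arrest", "object")])
```

Under these assumptions, each value in `aggregates` is one entity aggregate: the input for which a method like the one summarized above would try to find a small set of interpretable semantic classes.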