Journal: 《计算机科学与探索》 (Journal of Frontiers of Computer Science and Technology)

A Chinese Entity Relation Extraction Strategy Based on Predicate Concept Connectivity

Abstract

Chinese entity relation extraction is an active research topic in open-domain text retrieval and knowledge discovery. Traditional extraction strategies typically require heavy manual annotation, rely on patterns of limited generality, and fix the granularity of the extracted relations, all of which limit their effectiveness in the open domain. Building on the hierarchical structure and relational connectivity of concepts, this paper constructs a predicate concept model (PCM) for Chinese entity relations. On top of PCM, it proposes PCIA, a predicate concept acquisition strategy based on incremental learning, and PCCS, a relation extraction strategy based on predicate concept connectivity, and uses them to extract loosely coupled, long-distance entity relations in the open domain. Because each predicate concept is constructed relatively independently and concepts can be combined flexibly, the resulting relation descriptions are more general and more interpretable, which provides an effective means of identifying and extracting previously unknown relations in the open domain. Experimental results show that PCCS improves the quality of Chinese entity recognition and of entity connectivity path selection, and achieves good relation extraction performance.
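The abstract gives no algorithmic detail for PCCS, so the following is only a rough illustration of the general idea of linking two entities through a chain of predicates, not the paper's method. It assumes relations are already available as directed (head, predicate, tail) triples and uses a plain breadth-first search to pick the shortest connecting path. The function name `connectivity_path`, the triple format, and the toy data are all hypothetical.

```python
from collections import deque


def connectivity_path(edges, source, target):
    """Return the shortest chain of (head, predicate, tail) hops that
    links source to target, or None if the two are not connected.

    Illustrative only: a generic breadth-first search over directed,
    predicate-labelled edges, not the PCCS strategy from the paper.
    """
    # Build an adjacency list: head -> [(predicate, tail), ...]
    graph = {}
    for head, predicate, tail in edges:
        graph.setdefault(head, []).append((predicate, tail))

    queue = deque([(source, [])])
    visited = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for predicate, neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, path + [(node, predicate, neighbour)]))
    return None


# Hypothetical toy triples, invented for this example.
EDGES = [
    ("Alice", "works_for", "Acme"),
    ("Acme", "located_in", "Beijing"),
    ("Alice", "born_in", "Shanghai"),
]

if __name__ == "__main__":
    print(connectivity_path(EDGES, "Alice", "Beijing"))
```

In this toy setting, a long-distance relation between two entities that never share a predicate directly is read off as the sequence of predicates along the path. Shortest-path selection is an assumption made here for simplicity; the paper's own path selection criterion is not stated in the abstract.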
