Journal: Application Research of Computers (《计算机应用研究》)

Research on a Context-Based Method for Chinese Named Entity Disambiguation

         

Abstract

In semantic annotation, ambiguity arises when named entities mentioned in a text are mapped to entities in a knowledge base. To resolve it, this paper proposes a named entity disambiguation method that ranks candidates by the similarity of their contextual information. The method has three parts: entity-representation preprocessing, candidate entity list construction, and a similarity-ranking algorithm. To handle the many ways a single entity can be referred to, entity-representation preprocessing extracts a standard entity. A semantic knowledge base is then built from Chinese online encyclopedias, which yields a list of senses for each standard entity. Similarity ranking resolves the referential ambiguity in mapping a standard entity to its sense list, and entities for which no sense is found in the knowledge base are disambiguated with the HAC (hierarchical agglomerative clustering) algorithm. Experimental results show that the proposed method effectively maps entities in text from a real-world data set of Chinese web pages to the corresponding unambiguous entities in the knowledge base.
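The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure the paper uses, so this example assumes bag-of-words cosine similarity between the mention's context and each candidate's description. The names `cosine`, `rank_candidates`, `disambiguate`, and `TOY_KB` are hypothetical, the knowledge base is a toy dictionary rather than one built from an online encyclopedia, and the input is assumed to be already segmented into tokens.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(context_tokens, candidates):
    """Return (score, entity_id) pairs sorted by descending similarity.

    `candidates` maps an entity id to the tokens of its description.
    Ties are broken by entity id so the order is deterministic.
    """
    ctx = Counter(context_tokens)
    scored = [(cosine(ctx, Counter(desc)), eid) for eid, desc in candidates.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


def disambiguate(mention, context_tokens, kb):
    """Pick the best-scoring knowledge-base sense for a standard entity.

    Returns None when the knowledge base has no sense for the mention;
    the abstract says such entities are handled by HAC clustering,
    which this sketch does not implement.
    """
    candidates = kb.get(mention)
    if not candidates:
        return None
    return rank_candidates(context_tokens, candidates)[0][1]


# Toy knowledge base (assumption): standard entity -> {sense id: description tokens}
TOY_KB = {
    "Apple": {
        "Apple_(company)": ["technology", "company", "iphone", "computer"],
        "Apple_(fruit)": ["fruit", "tree", "sweet", "orchard"],
    }
}

best = disambiguate("Apple", ["new", "iphone", "company", "released"], TOY_KB)
missing = disambiguate("Banana", ["yellow", "fruit"], TOY_KB)
```

Here `best` is the sense whose description shares the most weighted vocabulary with the context, and `missing` is `None` because the toy knowledge base has no entry for that mention. A real system would need Chinese word segmentation before this step and a term weighting such as TF-IDF would likely work better than raw counts; both are outside what the abstract specifies.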
