METHOD FOR DISAMBIGUATING BETWEEN AUTHORS WITH SAME NAME ON BASIS OF NETWORK REPRESENTATION AND SEMANTIC REPRESENTATION

Abstract

The present invention discloses a method for disambiguating authors who share the same name, based on network representation and semantic representation. The method comprises: 1) extracting the semantic features and discrete features of each publication in a target publication library; 2) calculating the similarity between publications based on the discrete features to obtain a relationship similarity matrix of the publications; any publication that shares no author or institution with the other publications is added to a discrete publication set; 3) calculating a semantic similarity matrix of the publications based on their semantic features, and adding publications in the target publication library that lack semantic features to the discrete publication set; 4) computing a weighted sum of the relationship similarity matrix and the semantic similarity matrix to obtain a publication similarity matrix, and clustering the publications according to it; publications that do not belong to any cluster are added to the discrete publication set; and 5) assigning the publications in the discrete publication set to the corresponding clusters by a method based on similarity-threshold matching. The present invention thereby disambiguates same-name authors of publications with high accuracy.
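
The five steps of the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the abstract does not say how the similarities or the clustering are computed, so Jaccard overlap of co-authors and institutions, cosine similarity of a precomputed semantic vector, connected-component clustering, and the parameters `w_rel`, `link` and `match` are all hypothetical choices, as are the field names `relations` and `vector`.

```python
from itertools import combinations
from math import sqrt

def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is missing or all zeros."""
    if u is None or v is None:
        return 0.0
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return sum(x * y for x, y in zip(u, v)) / norm if norm else 0.0

def disambiguate(pubs, w_rel=0.5, link=0.5, match=0.3):
    """Group publications of one ambiguous name into per-author clusters.

    pubs maps id -> {"relations": set of co-authors/institutions,
                     "vector": semantic vector or None}.
    All thresholds and weights are assumed values for illustration.
    """
    ids = list(pubs)
    rel, sem = {}, {}
    for a, b in combinations(ids, 2):
        key = frozenset((a, b))
        rel[key] = jaccard(pubs[a]["relations"], pubs[b]["relations"])  # step 2
        sem[key] = cosine(pubs[a]["vector"], pubs[b]["vector"])         # step 3
    # Step 4: weighted sum of the two similarity matrices.
    sim = {k: w_rel * rel[k] + (1 - w_rel) * sem[k] for k in rel}

    # Discrete set: no shared author/institution with anyone, or no semantics.
    discrete = {p for p in ids
                if pubs[p]["vector"] is None
                or all(rel[frozenset((p, q))] == 0.0 for q in ids if q != p)}
    core = [p for p in ids if p not in discrete]

    # Step 4 (cont.): cluster the rest as connected components (union-find).
    parent = {p: p for p in core}
    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p
    for a, b in combinations(core, 2):
        if sim[frozenset((a, b))] >= link:
            parent[find(a)] = find(b)
    groups = {}
    for p in core:
        groups.setdefault(find(p), []).append(p)
    clusters = []
    for members in groups.values():
        if len(members) > 1:
            clusters.append(set(members))
        else:
            discrete.update(members)  # unclustered publications join the discrete set

    # Step 5: similarity-threshold matching of discrete publications.
    for p in sorted(discrete):
        best, best_score = None, match
        for c in clusters:
            score = sum(sim[frozenset((p, q))] for q in c) / len(c)
            if score >= best_score:
                best, best_score = c, score
        if best is None:
            clusters.append({p})  # nothing similar enough: start a new cluster
        else:
            best.add(p)
    return clusters

# Toy data (invented for illustration only).
pubs = {
    "p1": {"relations": {"alice", "mit"}, "vector": [1.0, 0.0]},
    "p2": {"relations": {"alice", "mit"}, "vector": [0.9, 0.1]},
    "p3": {"relations": {"bob", "cmu"}, "vector": [0.0, 1.0]},
    "p4": {"relations": {"bob", "cmu"}, "vector": [0.1, 0.9]},
    "p5": {"relations": set(), "vector": [1.0, 0.05]},   # no shared relations
    "p6": {"relations": {"carol"}, "vector": None},      # no semantic features
}
clusters = disambiguate(pubs)
print(sorted(sorted(c) for c in clusters))
```

In this toy run, `p5` and `p6` both start in the discrete set: `p5` shares no author or institution with any other publication, and `p6` has no semantic vector. Step 5 then either attaches each of them to the cluster with the highest average similarity above `match`, or leaves it as a cluster of its own.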