METHOD AND SYSTEM FOR CONSTRUCTING NAMED ENTITY DICTIONARY USING UNSUPERVISED LEARNING

Abstract

The present invention relates to a method and system for constructing a named entity dictionary by unsupervised learning. The method includes: a categorized document data collection step of collecting document data from the internet and extracting the category of each document; a named entity usage pattern registration step of analyzing sentence structure in the document data through natural language processing, analyzing the usage patterns of named entities in each category based on the per-category meaning of each named entity registered in a named entity dictionary DB, and registering the extracted official usage patterns in a per-category usage pattern DB; and a named entity dictionary registration step of taking document data that contains an unregistered named entity, applying the official usage patterns registered in the usage pattern DB for the relevant category to the usage pattern in which the unregistered named entity appears, matching the unregistered named entity to a meaning, and registering it as a new named entity in the named entity dictionary DB. Because the per-category meaning of a named entity is determined by comprehensively analyzing categorized document data, newly introduced named entities are recognized more reliably, and changes in the meaning of a named entity can be recognized effectively.
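The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify how sentence structure is analyzed or how a usage pattern is represented, so this sketch assumes a fixed token window around a single-token entity as the "usage pattern" and a simple frequency threshold for deciding which patterns count as "official". All names (`learn_patterns`, `register_new_entities`, `SLOT`, `min_count`) and the sample data are hypothetical.

```python
import re
from collections import Counter, defaultdict

SLOT = "<ENT>"  # placeholder marking where the named entity sits in a pattern


def sentences(text):
    """Split text into lowercase token lists, one per sentence (naive splitter)."""
    return [re.findall(r"\w+", s.lower()) for s in re.split(r"[.!?]+", text) if s.strip()]


def pattern_at(tokens, i, window=1):
    """Assumed pattern form: `window` tokens on each side of position i, entity replaced by SLOT."""
    return tuple(tokens[max(0, i - window):i]) + (SLOT,) + tuple(tokens[i + 1:i + 1 + window])


def learn_patterns(docs, dictionary, min_count=2):
    """Step 2: collect contexts of registered entities per category and keep the frequent ones."""
    counts = defaultdict(Counter)  # (category, pattern) -> Counter of meanings
    for category, text in docs:
        for tokens in sentences(text):
            for i, tok in enumerate(tokens):
                meaning = dictionary.get(tok, {}).get(category)
                if meaning is not None:
                    counts[(category, pattern_at(tokens, i))][meaning] += 1
    official = {}
    for key, meanings in counts.items():
        meaning, n = meanings.most_common(1)[0]
        if n >= min_count:  # assumed threshold for an "official" pattern
            official[key] = meaning
    return official


def register_new_entities(docs, dictionary, official):
    """Step 3: register tokens unknown in a category whose context matches an official pattern."""
    added = []
    for category, text in docs:
        for tokens in sentences(text):
            for i, tok in enumerate(tokens):
                if category in dictionary.get(tok, {}):
                    continue  # already registered for this category
                meaning = official.get((category, pattern_at(tokens, i)))
                if meaning is not None:
                    dictionary.setdefault(tok, {})[category] = meaning
                    added.append((tok, category, meaning))
    return added


# Step 1 stand-in: documents already collected and labeled with a category.
docs = [
    ("movies", "I watched Avatar last night. We watched Titanic last night."),
    ("movies", "They watched Inception last night."),
]
# Dictionary DB stand-in: entity -> {category: meaning}.
dictionary = {"avatar": {"movies": "film_title"}, "titanic": {"movies": "film_title"}}

official = learn_patterns(docs, dictionary)
added = register_new_entities(docs, dictionary, official)
```

In this toy run, the context "watched <ENT> last" is seen twice with registered film titles, so it becomes an official pattern for the "movies" category, and "inception" is registered because it appears in the same context. Keying the dictionary by category is one way to let the same surface form carry different meanings in different categories, which is the property the abstract emphasizes.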

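Bibliographic data

  • Publication number: KR20150066160A

  • Patent type:

  • Publication date: 2015-06-16

  • Original format: PDF

  • Applicant/Assignee: KT CORPORATION

  • Application number: KR20130151365

  • Inventor: PARK JAE HAN

  • Filing date: 2013-12-06

  • Classification: G06F17/30; G06F17/40

  • Country: KR

  • Database entry time: 2022-08-21 14:59:53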
