
A Korean named entity recognition method using Bi-LSTM-CRF and masked self-attention



Abstract

Named entity recognition (NER) is a fundamental task in natural language processing. Existing Korean NER methods use Korean morphemes, syllable sequences, and part-of-speech tags as features, and tackle the problem with a sequence labeling model. In Korean, on the one hand, the morpheme itself carries strong indicative information about the named entity (especially for time and person). On the other hand, the context of the target morpheme plays an important role in recognizing its named entity (NE) tag. To make full use of these two features, we propose two auxiliary tasks. The first is a morpheme-level NE tagging task, which captures the NE feature of the syllable sequence composing a morpheme. The second is a context-based NE tagging task, which aims to capture the context feature of the target morpheme through a masked self-attention network. These two tasks are jointly trained with the Bi-LSTM-CRF NER tagger. Experimental results on the Klpexpo 2016 corpus and the Naver NLP Challenge 2018 corpus show that our model outperforms strong baseline systems and achieves state-of-the-art performance.
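
As a rough illustration of the context-based idea in the abstract, the sketch below computes a context-only representation for each position by letting it attend to every other position but not to itself. This is an assumption made for illustration: the abstract does not say how the mask is built, and the function name `masked_self_attention`, the diagonal mask, and the absence of learned query/key/value projections are all hypothetical simplifications, not the authors' network.

```python
import math


def masked_self_attention(vectors):
    """Context-only attention: position i attends to all positions except i.

    `vectors` is a list of equal-length float lists, one per morpheme.
    Assumption: the mask hides the target position itself (the diagonal);
    no learned projections are used in this toy version.
    """
    n = len(vectors)
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    outputs = []
    for i in range(n):
        # Collect every position other than the target position i.
        others = [vectors[j] for j in range(n) if j != i]
        if not others:
            # A single-element sequence has no context to attend to.
            outputs.append([0.0] * dim)
            continue
        # Scaled dot-product scores between the target and its context.
        scores = [
            sum(a * b for a, b in zip(vectors[i], other)) / scale
            for other in others
        ]
        # Numerically stable softmax over the context positions.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of context vectors gives the context feature.
        outputs.append([
            sum(w * other[d] for w, other in zip(weights, others))
            for d in range(dim)
        ])
    return outputs


if __name__ == "__main__":
    # Three toy "morpheme" embeddings of dimension 2.
    for row in masked_self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]):
        print(row)
```

Because the target position is excluded, each output row is a convex combination of the other positions only, so a tagger reading it would have to predict the NE tag from context alone.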

Bibliographic record

  • Source
    Computer Speech and Language | 2021, Issue 1 | 101134.1-101134.11 | 11 pages
  • Authors

    Guozhe Jin; Zhezhou Yu;

  • Author affiliations

    College of Computer Science and Technology, Jilin University, 2699 Qianjin Street, Jilin Province, China; Department of Computer Science and Technology, Yanbian University, 977 Gongyuan Road, Yanji 133002, PR China;

    College of Computer Science and Technology, Jilin University, 2699 Qianjin Street, Jilin Province, China;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Korean; named entity recognition; auxiliary tasks; Bi-LSTM-CRF;

