Journal: 《中国科学》

Hierarchical LSTM with char-subword-word tree-structure representation for Chinese named entity recognition

         

Abstract

Chinese named entity recognition (CNER) aims to identify entity names, such as person names and organization names, in raw Chinese text, so that entity information of interest can be extracted quickly from large-scale corpora. Recent studies attempt to improve performance by integrating lexicon words into char-based CNER models. These studies, however, usually rely on context-free words from a lexicon and do not consider the contextual information of words and subwords within the sentence. To address this issue, in addition to using lexicon words, we propose to represent each sentence for CNER as a hierarchical tree structure composed of characters, subwords, and context-aware words predicted by a segmentor. Based on this tree-structure representation, we propose a hierarchical long short-term memory (HiLSTM) framework, which consists of a hierarchical encoding layer, a fusion layer, and a CRF layer, to capture linguistic knowledge at different levels. On the one hand, the interactions within each level help to obtain contextual information. On the other hand, propagation from the lower levels to the upper levels provides additional semantic knowledge for CNER. Experimental results on three widely used CNER datasets show that the proposed HiLSTM model achieves significant improvements over several strong benchmark methods.
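To make the char-subword-word hierarchy concrete, the following is a minimal sketch of how such a tree might be assembled for one sentence. It is not the authors' construction: the abstract does not define how subwords are chosen, so the sketch assumes that a subword is any lexicon entry of at least two characters lying strictly inside a predicted word. The names `Node` and `build_tree`, the toy lexicon, and the example segmentation are all hypothetical, and the LSTM encoding, fusion, and CRF layers are out of scope here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    level: str            # "word", "subword", or "char"
    start: int            # character offset in the sentence (inclusive)
    end: int              # character offset in the sentence (exclusive)
    text: str
    children: list = field(default_factory=list)


def build_tree(words, lexicon):
    """Build one word-level node per predicted word.

    `words` is the output of some segmentor (assumed given), and
    `lexicon` is a set of known strings. Assumption: a subword is a
    lexicon entry of length >= 2 that is a proper substring of a word.
    """
    tree = []
    offset = 0
    for word in words:
        w_start = offset
        w_end = offset + len(word)
        word_node = Node("word", w_start, w_end, word)

        # Collect lexicon matches that lie strictly inside the word.
        for i in range(len(word)):
            for j in range(i + 2, len(word) + 1):
                piece = word[i:j]
                if piece != word and piece in lexicon:
                    sub = Node("subword", w_start + i, w_start + j, piece)
                    sub.children = [
                        Node("char", w_start + k, w_start + k + 1, word[k])
                        for k in range(i, j)
                    ]
                    word_node.children.append(sub)

        # Characters not covered by any subword attach to the word directly.
        covered = {
            k for sub in word_node.children for k in range(sub.start, sub.end)
        }
        for k in range(len(word)):
            if w_start + k not in covered:
                word_node.children.append(
                    Node("char", w_start + k, w_start + k + 1, word[k])
                )

        word_node.children.sort(key=lambda n: (n.start, n.end))
        tree.append(word_node)
        offset = w_end
    return tree


# Hypothetical segmentation and lexicon, for illustration only.
example_tree = build_tree(["南京市", "长江大桥"], {"南京", "长江", "大桥"})
for word_node in example_tree:
    print(word_node.text, [(c.level, c.text) for c in word_node.children])
```

In a model of the kind the abstract describes, each level of such a tree would presumably be encoded with its own sequence model, with lower-level representations passed upward; the sketch only covers the data structure.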
