Applied Sciences

An ERNIE-Based Joint Model for Chinese Named Entity Recognition

Abstract

Named Entity Recognition (NER) is a fundamental task in Natural Language Processing (NLP) and the initial step in building a Knowledge Graph (KG). Recently, BERT (Bidirectional Encoder Representations from Transformers), a pre-training model, has achieved state-of-the-art (SOTA) results on various NLP tasks, including NER. However, Chinese NER remains challenging for BERT because there are no physical separations between Chinese words, so BERT can only obtain representations of individual Chinese characters. Character-level representations alone cannot handle Chinese NER well, because the meaning of a Chinese word often differs substantially from the meanings of the characters that compose it. ERNIE (Enhanced Representation through kNowledge IntEgration), an improved pre-training model based on BERT, is better suited to Chinese NER because it is designed to learn language representations enhanced by a knowledge masking strategy. However, the potential of ERNIE has not been fully explored: when performing the NER task, ERNIE utilizes only token-level features and ignores the sentence-level feature. In this paper, we propose ERNIE-Joint, a joint model based on ERNIE. ERNIE-Joint can utilize both sentence-level and token-level features by jointly training the NER and text classification tasks. To use raw NER datasets for joint training without additional annotation, we derive the text classification labels from the number of entities in each sentence. Experiments are conducted on two datasets, MSRA-NER and Weibo, which contain Chinese news data and Chinese social media data, respectively. The results demonstrate that ERNIE-Joint not only outperforms BERT and ERNIE but also achieves SOTA results on both datasets.
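The two ideas in the abstract, deriving sentence-level labels from the raw NER annotations and jointly training a token-level head and a sentence-level head on a shared ERNIE encoder, can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch version of that setup, not the authors' released code: the checkpoint name (nghuyong/ernie-1.0-base-zh, a community port of ERNIE to HuggingFace transformers), the entity-count buckets, and the loss weight alpha are all assumptions for illustration, as the abstract does not specify them.

```python
# Minimal sketch of the joint idea from the abstract; NOT the paper's
# implementation. Assumes PyTorch, HuggingFace transformers, and a
# community ERNIE checkpoint (all assumptions, see lead-in above).
import torch
import torch.nn as nn
from transformers import AutoModel

def sentence_label_from_bio(bio_tags, max_bucket=2):
    """Derive a text-classification label from a raw NER tag sequence:
    count entities (number of B-* tags) and clip the count into buckets
    (0, 1, 2+ here; the bucketing is an illustrative assumption), so the
    joint task needs no additional annotation."""
    n_entities = sum(1 for t in bio_tags if t.startswith("B-"))
    return min(n_entities, max_bucket)

class ErnieJointSketch(nn.Module):
    """Shared ERNIE encoder with two heads: a token-level NER head and a
    sentence-level classification head over the [CLS] representation."""
    def __init__(self, n_ner_tags, n_cls_labels,
                 checkpoint="nghuyong/ernie-1.0-base-zh"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(checkpoint)
        hidden = self.encoder.config.hidden_size
        self.ner_head = nn.Linear(hidden, n_ner_tags)    # token-level
        self.cls_head = nn.Linear(hidden, n_cls_labels)  # sentence-level

    def forward(self, input_ids, attention_mask,
                ner_labels=None, cls_labels=None, alpha=0.5):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        ner_logits = self.ner_head(out.last_hidden_state)        # per token
        cls_logits = self.cls_head(out.last_hidden_state[:, 0])  # [CLS]
        loss = None
        if ner_labels is not None and cls_labels is not None:
            ce = nn.CrossEntropyLoss(ignore_index=-100)
            ner_loss = ce(ner_logits.view(-1, ner_logits.size(-1)),
                          ner_labels.view(-1))
            cls_loss = ce(cls_logits, cls_labels)
            # Joint objective: weighted sum of the two task losses
            # (alpha is an assumed hyperparameter, not from the paper).
            loss = alpha * ner_loss + (1 - alpha) * cls_loss
        return ner_logits, cls_logits, loss
```

In a sketch like this, backpropagating the combined loss updates the shared encoder from both tasks, which is how sentence-level signal can reach the token-level NER head; the weighting between the two losses would be a tuning choice.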