Computer Science and Application

A Chinese Named Entity Recognition Method Based on ALBERT (基于ALBERT的中文命名实体识别方法)



Abstract

The BERT pre-trained language model has been widely used in Chinese named entity recognition due to its strong performance, but its large parameter count and long training time limit its practical applications. To address this problem, we propose ALBERT-BiLSTM-CRF, a Chinese named entity recognition model based on ALBERT. Structurally, the model first trains character-level embeddings on large-scale text with the ALBERT pre-trained language model, then feeds these embeddings into a BiLSTM to capture additional inter-character dependencies, and finally decodes with a CRF to extract the corresponding entities. The model combines the strengths of ALBERT and the BiLSTM-CRF model to recognize Chinese entities, achieving an F1 score of 95.22% on the MSRA dataset. Experiments show that while the pre-training parameters are greatly reduced, the model retains relatively good performance and scales well.
