PhoNLP: A joint multi-task learning model for Vietnamese part-of-speech tagging, named entity recognition and dependency parsing

Abstract

We present the first multi-task learning model, named PhoNLP, for joint Vietnamese part-of-speech (POS) tagging, named entity recognition (NER) and dependency parsing. Experiments on Vietnamese benchmark datasets show that PhoNLP produces state-of-the-art results, outperforming a single-task learning approach that fine-tunes the pre-trained Vietnamese language model PhoBERT (Nguyen and Nguyen, 2020) for each task independently. We publicly release PhoNLP as an open-source toolkit under the Apache License 2.0. Although we specify PhoNLP for Vietnamese, our PhoNLP training and evaluation command scripts can in fact be used directly for other languages that have a pre-trained BERT-based language model and gold annotated corpora available for the three tasks of POS tagging, NER and dependency parsing. We hope that PhoNLP can serve as a strong baseline and useful toolkit for future NLP research and applications, not only for Vietnamese but also for other languages.

Bibliographic details

Source
Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies | 2021 | pp. 1-7 | 7 pages
Authors
Linh The Nguyen; Dat Quoc Nguyen
Original format: PDF

Similar literature

1. Combining Multi-task Learning with Transfer Learning for Biomedical Named Entity Recognition [J] . Tahir Mehmood, Alfonso E. Gerevini, Alberto Lavelli, Procedia Computer Science . 2020, Issue 5

2. Improving Named Entity Recognition in Vietnamese Texts by a Character-Level Deep Lifelong Learning Model [J] . Ngoc-Vu Nguyen, Thi-Lan Nguyen, Cam-Van Nguyen Thi, Vietnam Journal of Computer Science . 2019, Issue 4

3. Dataset-aware multi-task learning approaches for biomedical named entity recognition [J] . Zuo Mei, Zhang Yang, Bioinformatics . 2020, Issue 15

4. Joint Part-of-Speech Tagging and Named Entity Recognition Using Factor Graphs [C] . Gyorgy Mora, Veronika Vincze, International Conference on Text, Speech and Dialogue . 2012

5. Using a named entity tagger and a syntactic parser to improve Web-based answer extraction [D] . Kamel, Yasser. 2004

6. TaggerOne: joint named entity recognition and normalization with semi-Markov Models [O] . Robert Leaman, Zhiyong Lu

7. Semi-supervised deep learning based named entity recognition model to parse education section of resumes [O] . Bodhvi Gaur, Gurpreet Singh Saluja, Hamsa Bharathi Sivakumar, 2020

