Unsupervised Active Learning of CRF Model for Cross-Lingual Named Entity Recognition


Abstract

Manually annotating training data for information extraction models is time-consuming and expensive, yet necessary for building information extraction systems. Active learning has proven effective in reducing manual annotation effort for supervised learning tasks: a human judge is asked to annotate only the examples that are most informative with respect to a given model. However, in most cases reliable human judges are not available for all languages. In this paper, we propose a cross-lingual unsupervised active learning paradigm (XLADA) that generates high-quality automatically annotated training data from a word-aligned parallel corpus. To evaluate our paradigm, we applied XLADA to English-French and English-Chinese bilingual corpora and then trained French and Chinese information extraction models. The experimental results show that XLADA can produce effective models without manually annotated training data.
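The abstract does not spell out how the annotations are generated, so the following is only a minimal sketch of the two ideas it names, under stated assumptions: entity tags from the source-language side are copied to the target-language tokens through word-alignment pairs, and the sentences a target model is least confident about are treated as the most informative ones. The function names `project_labels` and `least_confident`, the flat tag set (`ORG`, `LOC`, `O`), and the confidence scores are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def project_labels(src_tags: List[str],
                   alignment: List[Tuple[int, int]],
                   tgt_len: int) -> List[str]:
    """Copy each source token's entity tag to the target tokens aligned with it.

    alignment holds (source_index, target_index) pairs (assumed input format).
    """
    tgt_tags = ["O"] * tgt_len  # unaligned target tokens stay outside any entity
    for src_i, tgt_j in alignment:
        if src_tags[src_i] != "O":
            tgt_tags[tgt_j] = src_tags[src_i]
    return tgt_tags


def least_confident(confidences: List[float], k: int) -> List[int]:
    """Return the indices of the k sentences with the lowest model confidence."""
    ranked = sorted(range(len(confidences)), key=lambda i: confidences[i])
    return ranked[:k]


# Toy pair: "the European Union" aligned with "l' Union européenne".
# The alignment crosses because the adjective follows the noun in French.
src_tags = ["O", "ORG", "ORG"]
alignment = [(0, 0), (1, 2), (2, 1)]
tgt_tags = project_labels(src_tags, alignment, tgt_len=3)

# Hypothetical per-sentence confidences from a target-language model.
selected = least_confident([0.9, 0.2, 0.6], k=2)
```

In this sketch the projected tags play the role that a human judge's annotations play in conventional active learning; how XLADA itself scores informativeness or handles alignment errors is not stated in the abstract.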


Bibliographic details

Source
IAPR TC3 International Workshop on Artificial Neural Networks in Pattern Recognition | 2014 | 12 pages
Authors
Mohamed Farouk Abdel Hady; Abubakrelsedik Karali; Eslam Kamal; Rania Ibrahim
Original format: PDF
Chinese Library Classification: TP183-53
Keywords
Information extraction; Named entity recognition; Cross-lingual domain adaptation; Unsupervised active learning

Date indexed: 2022-08-20 22:38:23

Similar literature


1. Unsupervised Active Learning of CRF Model for Cross-Lingual Information Extraction [J]. Mohamed Farouk Abdel Hady, Abubakrelsedik Karali, Eslam Kamal, International journal of computational linguistics and applications. 2014, No. 2
2. Chemical named entity recognition in patents by domain knowledge and unsupervised feature learning [J]. Hua Xu, Hui Chen, Jingqi Wang, Database. 2016, No. 2010
3. LSTM-CRF Models for Named Entity Recognition [J]. Changki Lee, IEICE transactions on information and systems. 2017, No. 4
4. Unsupervised Active Learning of CRF Model for Cross-Lingual Named Entity Recognition [C]. Mohamed Farouk Abdel Hady, Abubakrelsedik Karali, Eslam Kamal, Artificial neural networks in pattern recognition. 2014
5. From Preprocessing to Named Entity Recognition, Linking and Clustering in Multilingual, Cross-Lingual, High-Low Resources Settings [D]. Zirikly, Ayah. 2018
6. Wide-scope biomedical named entity recognition and normalization with CRFs fuzzy matching and character level modeling [O]. Suwisa Kaewphan, Kai Hakala, Niko Miekka, 2018
7. UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data [O]. Qianhui Wu, Zijia Lin, Börje F. Karlsson, 2020
