Conference: IAPR TC3 International Workshop on Artificial Neural Networks in Pattern Recognition

Unsupervised Active Learning of CRF Model for Cross-Lingual Named Entity Recognition

Abstract

Manual annotation of training data for information extraction models is time-consuming and expensive, yet it is necessary for building information extraction systems. Active learning has been proven effective at reducing manual annotation effort in supervised learning tasks: a human judge is asked to annotate only the examples that are most informative with respect to a given model. However, in most cases, reliable human judges are not available for every language. In this paper, we propose a cross-lingual unsupervised active learning paradigm (XLADA) that generates high-quality, automatically annotated training data from a word-aligned parallel corpus. To evaluate our paradigm, we applied XLADA to English-French and English-Chinese bilingual corpora and then trained French and Chinese information extraction models. The experimental results show that XLADA can produce effective models without manually annotated training data.
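
The abstract does not spell out how annotations are obtained from the word-aligned corpus. As a rough illustration of the general idea only, the sketch below projects entity labels from an annotated English sentence onto its translation through word-alignment links. The function name `project_labels`, the plain entity-type label scheme, and the example sentence pair are all assumptions made for illustration; this is not the XLADA procedure itself, and it leaves out the active learning step entirely.

```python
from typing import List, Tuple


def project_labels(
    source_labels: List[str],
    target_length: int,
    alignment: List[Tuple[int, int]],
) -> List[str]:
    """Copy entity types from source tokens to their aligned target tokens.

    Hypothetical helper for illustration. `alignment` holds
    (source_index, target_index) pairs; "O" marks a non-entity token.
    """
    target_labels = ["O"] * target_length
    for src_idx, tgt_idx in alignment:
        label = source_labels[src_idx]
        # Keep the first entity label that reaches a target token.
        if label != "O" and target_labels[tgt_idx] == "O":
            target_labels[tgt_idx] = label
    return target_labels


# Assumed example: an English sentence with labels, and its French translation.
english = ["Mary", "visited", "the", "United", "States"]
english_labels = ["PER", "O", "O", "LOC", "LOC"]
french = ["Mary", "a", "visité", "les", "États-Unis"]
# Assumed word alignment (English index, French index).
links = [(0, 0), (1, 1), (1, 2), (2, 3), (3, 4), (4, 4)]

projected = project_labels(english_labels, len(french), links)
print(projected)  # ['PER', 'O', 'O', 'O', 'LOC']
```

In a pipeline of this kind, the projected labels would serve as automatically annotated training data for a target-language model such as a CRF; how XLADA selects which projected sentences to keep is described in the paper, not here.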