Venue: Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Name Tagging for Low-resource Incident Languages based on Expectation-driven Learning

Abstract

In this paper we tackle a challenging name tagging problem in an emergent setting: the tagger must be completed within a few hours for a new incident language (IL) using very few resources. Inspired by observing how human annotators attack this challenge, we propose a new expectation-driven learning framework. In this framework we rapidly acquire, categorize, structure and zoom in on IL-specific expectations (rules, features, patterns, gazetteers, etc.) from various non-traditional sources: consulting and encoding linguistic knowledge from native speakers, mining and projecting patterns from both monolingual and cross-lingual corpora, and typing based on cross-lingual entity linking. We also propose a cost-aware combination approach to compose expectations. Experiments on seven low-resource languages demonstrate the effectiveness and generality of this framework: we are able to set up a name tagger for a new IL within two hours and achieve F-scores of 33.8% to 65.1%.
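
To make the terms in the abstract concrete, the following is a minimal sketch of how two kinds of expectations (a gazetteer and a context pattern) could tag names, and how an F-score over predicted name spans is computed. It is an illustration only and not the paper's implementation: the gazetteer entries, the title-trigger rule, the example sentence, and the function names `tag` and `f_score` are all assumptions made for this example.

```python
from typing import List, Set, Tuple

# A name span: (start token index, end token index exclusive, entity type).
Span = Tuple[int, int, str]

# Hypothetical expectations for illustration; none of these come from the paper.
GAZETTEER = {"ankara": "GPE", "istanbul": "GPE"}
TITLE_TRIGGERS = {"mr.", "dr."}  # assumed rule: a title word precedes a person name


def tag(tokens: List[str]) -> Set[Span]:
    """Apply the gazetteer and the title-trigger pattern to a token list."""
    spans: Set[Span] = set()
    for i, tok in enumerate(tokens):
        low = tok.lower()
        if low in GAZETTEER:
            spans.add((i, i + 1, GAZETTEER[low]))
        elif low in TITLE_TRIGGERS and i + 1 < len(tokens):
            spans.add((i + 1, i + 2, "PER"))
    return spans


def f_score(predicted: Set[Span], gold: Set[Span]) -> float:
    """Harmonic mean of precision and recall over exactly matching spans."""
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy sentence with hand-written gold annotations (assumed data).
tokens = ["Dr.", "Yilmaz", "visited", "Ankara", "and", "Izmir"]
gold = {(1, 2, "PER"), (3, 4, "GPE"), (5, 6, "GPE")}

predicted = tag(tokens)
score = f_score(predicted, gold)
print(sorted(predicted), round(score, 3))
```

In this toy case the tagger finds two of the three gold names and makes no false predictions, so precision is 1.0, recall is 2/3, and the F-score is 0.8. The missed name shows why coverage of the acquired expectations limits recall in a low-resource setting.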