Venue: Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Name Tagging for Low-resource Incident Languages based on Expectation-driven Learning


Abstract

In this paper we tackle a challenging name tagging problem in an emergent setting: the tagger must be completed within a few hours for a new incident language (IL) using very few resources. Inspired by observing how human annotators attack this challenge, we propose a new expectation-driven learning framework. In this framework we rapidly acquire, categorize, structure and zoom in on IL-specific expectations (rules, features, patterns, gazetteers, etc.) from various non-traditional sources: consulting and encoding linguistic knowledge from native speakers, mining and projecting patterns from both monolingual and cross-lingual corpora, and typing based on cross-lingual entity linking. We also propose a cost-aware combination approach to compose expectations. Experiments on seven low-resource languages demonstrate the effectiveness and generality of this framework: we are able to set up a name tagger for a new IL within two hours and achieve F-scores of 33.8% to 65.1%.
