Sentence-based active learning strategies for information extraction

Abstract

Given a classifier trained on relatively few training examples, active learning (AL) consists of ranking a set of unlabeled examples by how informative they would be, once manually labeled, for retraining a (hopefully) better classifier. An important text learning task in which AL is potentially useful is information extraction (IE), namely, the task of identifying within a text the expressions that instantiate a given concept. We contend that IE differs from other text learning tasks in that it does not make sense to rank individual items (i.e., word occurrences) for annotation; the minimal unit of text presented to the annotator should be an entire sentence. In this paper we propose a range of active learning strategies for IE that are based on ranking individual sentences, and experimentally compare them on a standard dataset for named entity extraction.
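
The abstract does not spell out the individual strategies, so the following is only a minimal sketch of the general idea of ranking whole sentences rather than single word occurrences. It assumes a classifier that returns a label probability distribution for every token, scores each token by least confidence, and combines the token scores into one sentence score. The function names, the least-confidence measure, the two combination rules (`max` and `mean`), and the toy probabilities are all illustrative assumptions, not the strategies evaluated in the paper.

```python
from typing import Callable, List, Sequence

# One token = a probability distribution over labels (assumed classifier output).
TokenProbs = Sequence[float]
Sentence = Sequence[TokenProbs]


def token_uncertainty(probs: TokenProbs) -> float:
    """Least-confidence score: 1 minus the probability of the top label."""
    return 1.0 - max(probs)


def mean(values: List[float]) -> float:
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def sentence_score(sentence: Sentence,
                   combine: Callable[[List[float]], float]) -> float:
    """Combine token-level uncertainties into a single sentence-level score."""
    scores = [token_uncertainty(p) for p in sentence]
    return combine(scores) if scores else 0.0


def rank_sentences(pool: Sequence[Sentence],
                   combine: Callable[[List[float]], float] = max) -> List[int]:
    """Return sentence indices, most informative (highest score) first."""
    return sorted(range(len(pool)),
                  key=lambda i: sentence_score(pool[i], combine),
                  reverse=True)


# Hypothetical unlabeled pool: three sentences, two labels per token.
pool = [
    [[0.9, 0.1], [0.8, 0.2]],                # confident on every token
    [[0.55, 0.45], [0.99, 0.01]],            # one very uncertain token
    [[0.7, 0.3], [0.7, 0.3], [0.7, 0.3]],    # moderately uncertain throughout
]

by_max = rank_sentences(pool, combine=max)
by_mean = rank_sentences(pool, combine=mean)
```

With `max`, a sentence is ranked by its single most uncertain token, so the second sentence comes first; with `mean`, uncertainty spread over the whole sentence counts for more, so the third sentence comes first. In both cases the annotator is handed an entire sentence, which is the unit of annotation the abstract argues for.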