Journal: Bioinformatics

Boosting automatic event extraction from the literature using domain adaptation and coreference resolution.

Abstract

MOTIVATION: In recent years, several biomedical event extraction (EE) systems have been developed. However, the nature of the annotated training corpora, as well as the training process itself, can limit the performance of the trained EE systems. In particular, most event-annotated corpora do not deal adequately with coreference. This impairs the trained systems' ability to recognize biomedical entities, and so reduces the accuracy of the events they extract. Additionally, most EE systems are trained on a single annotated corpus, which further restricts their coverage.

RESULTS: We have enhanced our existing EE system, EventMine, in two ways. First, we developed a new coreference resolution (CR) system and integrated it with EventMine. The standalone performance of our CR system in resolving anaphoric references to proteins is considerably higher than that of the best-ranked system in the COREF subtask of the BioNLP'11 Shared Task. Second, the improved EventMine incorporates domain adaptation (DA) methods, which extend EE coverage by allowing several different annotated corpora to be used during training. Combined with a novel set of methods to increase the generality and efficiency of EventMine, the integration of both CR and DA has resulted in significant improvements in EE, ranging between 0.5% and 3.4% in F-score. The enhanced EventMine outperforms the highest-ranked systems from the BioNLP'09 Shared Task, and from the GENIA and Infectious Diseases subtasks of the BioNLP'11 Shared Task.

AVAILABILITY: The improved version of EventMine, incorporating the CR system and DA methods, is available at: http://www.nactem.ac.uk/EventMine/.

CONTACT: makoto.miwa@manchester.ac.uk.