Journal: Language Resources and Evaluation

Iarg-AnCora: Spanish corpus annotated with implicit arguments

Abstract

This article presents the Spanish Iarg-AnCora corpus (400k words, 13,883 sentences), annotated with the implicit arguments of deverbal nominalizations (18,397 occurrences). We describe the methodology used to create it, focusing on the annotation scheme and the criteria adopted. The corpus was annotated manually, and an inter-annotator agreement test was conducted (81% observed agreement) to ensure the reliability of the final resource. Annotating implicit arguments yields an important gain in argument and thematic role coverage (128% on average). It is the first freely available, wide-coverage corpus annotated with implicit arguments for Spanish. The corpus can be used by machine learning-based semantic role labeling systems and for the linguistic analysis of implicit arguments grounded in real data. Semantic analyzers are essential components of current language technology applications, which need a deeper understanding of text in order to make higher-level inferences and achieve qualitative improvements in their results.
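The abstract reports inter-annotator reliability as observed agreement. As a minimal sketch of what that measure usually means, the share of items on which two annotators chose the same label, the snippet below computes it for two label sequences. The function name, the label values ("arg0", "arg1", "none"), and the toy data are illustrative assumptions; they are not taken from the Iarg-AnCora annotation scheme, and the article may have computed agreement over different units.

```python
def observed_agreement(labels_a, labels_b):
    """Return the fraction of items on which two annotators agree.

    Hypothetical helper: labels_a and labels_b are equal-length
    sequences holding one label per annotated item.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same items")
    if not labels_a:
        raise ValueError("at least one item is required")
    # Count the positions where both annotators assigned the same label.
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Toy data (invented for illustration, not from the corpus).
annotator_1 = ["arg0", "arg1", "none", "arg1"]
annotator_2 = ["arg0", "arg1", "arg1", "arg1"]
agreement = observed_agreement(annotator_1, annotator_2)
print(f"{agreement:.0%}")
```

Observed agreement does not correct for agreement expected by chance; chance-corrected coefficients such as Cohen's kappa are a common complement, though the abstract itself reports only the observed figure.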