Turkish verbal multiword expressions corpus

Abstract

In this study, a Turkish corpus annotated with verbal multiword expressions was built. The verbal multiword expressions in the corpus were annotated according to their subcategories. The Turkish training and test corpora published in PARSEME Shared Task 1.0 were updated to serve as the training and development corpora, following the PARSEME Annotation Guidelines. Additionally, a new Turkish test corpus was created by following the same guidelines. The corpus consists of newspaper articles on politics, world affairs, life, and art, as well as columns. The corpus will be released in PARSEME Shared Task 1.1. It will be an important resource for many Turkish natural language processing applications, such as syntactic parsing, machine translation, and n-gram language modeling.
