Automatic Generation of Parallel Treebanks


Abstract

The need for syntactically annotated data in natural language processing has increased dramatically in recent years. This is especially true of parallel treebanks, of which very few exist. Those that do exist are mainly hand-crafted and too small for reliable use in data-oriented applications. In this paper we introduce a novel platform for fast and robust automatic generation of parallel treebanks. The software we have developed on this platform has been shown to handle large data sets. We also present evaluation results demonstrating the quality of the derived treebanks and discuss possible modifications and improvements that could lead to even better results. We expect the platform to help boost research in data-oriented machine translation and to lead to advances in other fields where parallel treebanks can be employed.
