Journal: Machine Translation

A roadmap to neural automatic post-editing: an empirical approach

Abstract

In a translation workflow, machine translation (MT) is almost always followed by a human post-editing step, where the raw MT output is corrected to meet required quality standards. To reduce the number of errors human translators need to correct, automatic post-editing (APE) methods have been developed and deployed in such workflows. With the advances in deep learning, neural APE (NPE) systems have outranked more traditional statistical ones. However, the plethora of options, variables and settings, as well as the relation between NPE performance and train/test data, makes it difficult to select the most suitable approach for a given use case. In this article, we systematically analyse these different parameters with respect to NPE performance. We build an NPE "roadmap" to trace the different decision points and train a set of systems selecting different options through the roadmap. We also propose a novel approach for APE with data augmentation. We then analyse the performance of 15 of these systems and identify the best ones. In fact, the best systems are the ones that follow the newly proposed method. The work presented in this article follows from a collaborative project between Microsoft and the ADAPT Centre. The data provided by Microsoft originates from phrase-based statistical MT (PBSMT) systems employed in production. All tested NPE systems significantly increase the translation quality, proving the effectiveness of neural post-editing in the context of a commercial translation workflow that leverages PBSMT.