A single sentence does not always convey the information required to translate it into other languages: we sometimes need to add or specialize words that are omitted or ambiguous in the source language (e.g., zero pronouns in translating Japanese to English, or epicene pronouns in translating English to French). To translate such ambiguous sentences, we exploit the context around the source sentence, and have so far explored context-aware neural machine translation (NMT). However, large parallel corpora are not easily available for training accurate context-aware NMT models. In this study, we first obtain large-scale pseudo-parallel corpora by back-translating target-side monolingual corpora, and then investigate their impact on the translation performance of context-aware NMT models. We evaluate NMT models trained with small parallel corpora and the large-scale pseudo-parallel corpora on the IWSLT2017 English-Japanese and English-French datasets, and demonstrate the large impact of this data augmentation for context-aware NMT models in terms of BLEU score and specialized test sets on ja→en and fr→en.
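The back-translation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `translate_target_to_source` is a hypothetical placeholder standing in for a trained reverse-direction (target-to-source) NMT model, and the corpus-building loop simply pairs each genuine target-side sentence with its back-translated pseudo source.

```python
# Sketch of building a pseudo-parallel corpus by back-translation.
# Assumption: `translate_target_to_source` stands in for a trained
# target->source NMT model (e.g. English->Japanese); here it is a
# trivial placeholder so the sketch is runnable.

def translate_target_to_source(sentence: str) -> str:
    # Placeholder: a real system would decode with a reverse NMT model.
    return "<pseudo-src> " + sentence

def build_pseudo_parallel(target_monolingual: list[str]) -> list[tuple[str, str]]:
    """Pair each target-side monolingual sentence with its back-translation.

    The pseudo source side is machine-generated; the target side is
    genuine text, which is what makes the augmented data useful for
    training a source->target model.
    """
    corpus = []
    for tgt in target_monolingual:
        src = translate_target_to_source(tgt)
        corpus.append((src, tgt))  # (pseudo source, genuine target)
    return corpus

mono = ["He bought a book.", "She said yes."]
pseudo = build_pseudo_parallel(mono)
print(pseudo[0])
```

The resulting pseudo-parallel pairs are then mixed with the small genuine parallel corpus when training the context-aware model.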