Venue: Annual Meeting of the Association for Computational Linguistics

Unsupervised Pretraining for Neural Machine Translation Using Elastic Weight Consolidation



Abstract

This work presents our ongoing research on unsupervised pretraining in neural machine translation (NMT). In our method, we initialize the weights of the encoder and decoder with two language models trained on monolingual data, and then fine-tune the model on parallel data using Elastic Weight Consolidation (EWC) to avoid forgetting the original language modeling tasks. We compare regularization by EWC with previous work that regularizes via language modeling objectives. The positive result is that using EWC on the decoder achieves BLEU scores similar to the previous work, while the model converges 2-3 times faster and does not require the original unlabeled training data during the fine-tuning stage. On the other hand, regularization with EWC is less effective when the original and new tasks are not closely related. We show that initializing the bidirectional NMT encoder with a left-to-right language model and forcing the model to remember the original left-to-right language modeling task limits the encoder's capacity to learn from the whole bidirectional context.
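The fine-tuning described above adds an EWC term that keeps each parameter close to its pretrained language-model value in proportion to that parameter's estimated importance on the original task. The following is only a minimal illustrative sketch (not the authors' implementation), written in PyTorch style; the names fisher, pretrained_params, and lam are hypothetical placeholders for a diagonal Fisher estimate, the stored language-model weights, and the regularization strength.

    import torch

    def ewc_penalty(model, fisher, pretrained_params, lam=0.1):
        """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2.

        fisher            -- dict: parameter name -> diagonal Fisher estimate
                             of importance on the original LM task (hypothetical)
        pretrained_params -- dict: parameter name -> weights after LM pretraining
        lam               -- regularization strength (hypothetical value)
        """
        penalty = torch.zeros((), device=next(model.parameters()).device)
        for name, param in model.named_parameters():
            if name in fisher:
                penalty = penalty + (fisher[name] * (param - pretrained_params[name]) ** 2).sum()
        return 0.5 * lam * penalty

    # During fine-tuning on parallel data, the total loss would combine the usual
    # NMT cross-entropy with this term, e.g.:
    #   loss = nmt_cross_entropy + ewc_penalty(model, fisher, lm_params)

In this setup the penalty replaces the need to keep optimizing the original language modeling objective on unlabeled data, which is consistent with the abstract's observation that the original monolingual training data is not needed during fine-tuning.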

