Annual Meeting of the Association for Computational Linguistics

Learning Deep Transformer Models for Machine Translation

Abstract

Transformer is the state-of-the-art model in recent machine translation evaluations. Two strands of research are promising to improve models of this kind: the first uses wide networks (a.k.a. Transformer-Big) and has been the de facto standard for the development of the Transformer system, and the other uses deeper language representation but faces the difficulty arising from learning deep networks. Here, we continue the line of research on the latter. We claim that a truly deep Transformer model can surpass the Transformer-Big counterpart by 1) proper use of layer normalization and 2) a novel way of passing the combination of previous layers to the next. On WMT'16 English-German, NIST OpenMT'12 Chinese-English and larger WMT'18 Chinese-English tasks, our deep system (30/25-layer encoder) outperforms the shallow Transformer-Big/Base baseline (6-layer encoder) by 0.4~2.4 BLEU points. As another bonus, the deep model is 1.6X smaller in size and 3X faster in training than Transformer-Big.
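The two ingredients named in the abstract are commonly realized as pre-norm residual connections (layer normalization applied before each sub-layer) and a learned linear combination of the outputs of all earlier layers fed into the next layer. Below is a minimal PyTorch-style sketch of both ideas, not the authors' released code; the class names, the softmax weighting of earlier layers, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the paper's implementation):
# 1) pre-norm residual connections, widely used to make very deep Transformers trainable;
# 2) each layer reads a learned weighted sum of all previous layers' outputs
#    (the paper's combination is more general; softmax weights are a simplification here).
import torch
import torch.nn as nn


class PreNormLayer(nn.Module):
    """One encoder layer with pre-norm residual connections (sketch)."""

    def __init__(self, d_model=512, n_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.attn_norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout,
                                          batch_first=True)
        self.ffn_norm = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                 nn.Linear(d_ff, d_model))
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        # LayerNorm is applied *before* each sub-layer; the residual path stays identity.
        h = self.attn_norm(x)
        h, _ = self.attn(h, h, h)
        x = x + self.dropout(h)
        h = self.ffn_norm(x)
        x = x + self.dropout(self.ffn(h))
        return x


class DeepEncoder(nn.Module):
    """Stack of pre-norm layers where layer l consumes a learned combination
    of the outputs of layers 0..l-1 (sketch of 'passing the combination of
    previous layers to the next')."""

    def __init__(self, num_layers=30, d_model=512):
        super().__init__()
        self.layers = nn.ModuleList(PreNormLayer(d_model) for _ in range(num_layers))
        # One learnable weight vector per layer over all earlier outputs.
        self.combine_weights = nn.ParameterList(
            nn.Parameter(torch.zeros(l + 1)) for l in range(num_layers)
        )
        self.final_norm = nn.LayerNorm(d_model)

    def forward(self, x):
        outputs = [x]  # output of the embedding "layer 0"
        for l, layer in enumerate(self.layers):
            w = torch.softmax(self.combine_weights[l], dim=0)
            # Combine all previous layer outputs before feeding the next layer.
            combined = sum(w_i * o for w_i, o in zip(w, outputs))
            outputs.append(layer(combined))
        return self.final_norm(outputs[-1])


if __name__ == "__main__":
    enc = DeepEncoder(num_layers=6, d_model=64)
    y = enc(torch.randn(2, 10, 64))
    print(y.shape)  # torch.Size([2, 10, 64])
```

With pre-norm, the residual path is an identity map from the embeddings to the top of the stack, which is what lets the 30/25-layer encoders reported in the abstract train stably where post-norm stacks of the same depth tend to diverge.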
