Home > Foreign Journals > IEEE Transactions on Parallel and Distributed Systems > Efficient Methods for Mapping Neural Machine Translator on FPGAs

Efficient Methods for Mapping Neural Machine Translator on FPGAs



Abstract

Neural machine translation (NMT) is one of the most critical applications in natural language processing (NLP); its main idea is to convert text from one language to another using deep neural networks. In recent years, NMT has developed continuously by integrating emerging techniques such as bidirectional gated recurrent units (GRUs), attention mechanisms, and beam-search algorithms for improved translation quality. However, with increasing problem sizes, real-life NMT models have become much more complicated and difficult to implement on hardware for acceleration. In this article, we aim to exploit the capability of FPGAs to deliver highly efficient implementations of real-life NMT applications. We map the inference of a large-scale NMT model, with a total computation of 172 GFLOP, to a highly optimized high-level synthesis (HLS) IP and integrate the IP into the Xilinx VCU118 FPGA platform. The model contains the key features widely used in NMT, including a bidirectional GRU layer, an attention mechanism, and beam search. We quantize the model to a mixed-precision representation in which the parameters and portions of the calculations are in 16-bit half precision, while the rest remain in 32-bit floating point. Compared to the single-precision floating-point NMT implementation on the FPGA, we achieve a 13.1x speedup with an end-to-end performance of 22.0 GFLOPS and no accuracy degradation. To the best of our knowledge, this is the first work that successfully implements a real-life end-to-end NMT model on an FPGA board.
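The mixed-precision scheme described in the abstract (16-bit half-precision storage with 32-bit computation elsewhere) can be illustrated with a small NumPy sketch. This is an assumption-laden illustration, not the paper's HLS implementation: it shows a generic matrix-vector product with weights stored in float16 and the accumulation kept in float32, which is the general pattern such quantization follows.

```python
import numpy as np

# Illustrative sketch (not the paper's HLS code): a matrix-vector product,
# as found inside a GRU layer, with weights stored in 16-bit half precision
# while the arithmetic and accumulation stay in 32-bit float.
rng = np.random.default_rng(0)

W = rng.standard_normal((4, 8)).astype(np.float32)  # full-precision weights
x = rng.standard_normal(8).astype(np.float32)       # input activations

W_half = W.astype(np.float16)        # quantize parameters to half precision

# Upcast to float32 before the dot product so accumulation error stays small.
y_mixed = W_half.astype(np.float32) @ x
y_ref = W @ x

max_err = float(np.max(np.abs(y_mixed - y_ref)))
print(max_err)
```

At this scale the half-precision storage changes the result only on the order of float16's rounding step, which is why such quantization can leave translation accuracy unaffected while halving parameter bandwidth.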
