Neural machine translation inference acceleration method based on attention mechanism

A machine translation and attention-mechanism technology, applied in the field of neural machine translation inference acceleration, which addresses the problem that decoding speed is difficult to meet real-time response requirements.

Inactive Publication Date: 2019-12-06
沈阳雅译网络技术有限公司

AI Technical Summary

Problems solved by technology

At the same time, in neural machine translation systems based on the self-attention mechanism, the frequent attention alignment operations within a sentence and between sentences make the decoding speed difficult to meet real-time response requirements.




Embodiment Construction

[0036] The present invention is further elaborated below in conjunction with the accompanying drawings.

[0037] The present invention optimizes the decoding speed of an attention-based neural machine translation system from the perspective of attention sharing, aiming to greatly increase the decoding speed of the translation system at the cost of a small performance loss and to achieve a balance between performance and speed.

[0038] The attention-mechanism-based neural machine translation inference acceleration method of the present invention comprises the following steps:

[0039] 1) Construct a training parallel corpus and a multi-layer neural machine translation model based on the attention mechanism, use the parallel corpus to generate a machine translation vocabulary, and train the model until convergence to obtain the model parameters;

[0040] 2) Calculate the parameter similarity between any two layers of the decoder...
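
To make step 2) concrete, below is a minimal sketch of one plausible way to measure the parameter similarity between two layers' attention projections, assuming cosine similarity over the flattened Q/K/V weight matrices. The metric, the dictionary layout, and all names (attention_param_vector, shareable_pairs, the threshold value) are illustrative assumptions, not taken from the patent.

```python
import torch
import torch.nn.functional as F

def attention_param_vector(layer_params: dict) -> torch.Tensor:
    """Flatten and concatenate one layer's attention projection matrices (Q, K, V)."""
    return torch.cat([layer_params[name].flatten() for name in ("q", "k", "v")])

def layer_similarity(params_a: dict, params_b: dict) -> float:
    """Cosine similarity between the flattened attention parameters of two layers."""
    return F.cosine_similarity(attention_param_vector(params_a),
                               attention_param_vector(params_b), dim=0).item()

def shareable_pairs(all_layers: list, threshold: float = 0.9):
    """Return (upper, lower) layer-index pairs whose similarity exceeds the threshold."""
    return [(i, j)
            for i in range(len(all_layers))
            for j in range(i)
            if layer_similarity(all_layers[i], all_layers[j]) > threshold]

# Usage with random stand-in parameters for a 4-layer decoder
layers = [{name: torch.randn(8, 8) for name in ("q", "k", "v")} for _ in range(4)]
print(shareable_pairs(layers, threshold=0.5))
```

In practice the same comparison would be run separately for decoder self-attention, encoder self-attention, and encoder-decoder attention, as the later steps describe.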



Abstract

The invention discloses a neural machine translation inference acceleration method based on an attention mechanism. The method comprises the steps of: constructing a training parallel corpus and a multi-layer neural machine translation model based on the attention mechanism, and training the model to obtain model parameters after convergence; calculating the parameter similarity between any two layers of the decoder self-attention, the encoder self-attention, and the encoder-decoder attention in the model; if the similarity between an upper layer and a bottom layer of the encoder or decoder is higher than a threshold value, letting the upper layer directly use the attention weight parameters of the bottom layer for its calculation; if the similarity between the upper layer and the bottom layer is higher than the threshold value, letting the upper layer directly use the attention calculation result of the bottom layer; and inputting words into the model to obtain a probability distribution over the machine translation vocabulary, and selecting the word with the highest probability as the translation result. On top of the latest fast-inference implementation, the method achieves an average speed-up of 1.3 times while hardly reducing model performance.
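
The abstract describes two levels of sharing: an upper layer may reuse a bottom layer's attention weight parameters, or its full attention calculation result. The sketch below illustrates only the weight-reuse case, with single-head attention and no residual or feed-forward sublayers for brevity; the class SharedAttentionLayer, the share_map convention, and all other names are illustrative assumptions rather than the patent's reference implementation.

```python
import torch
import torch.nn.functional as F

class SharedAttentionLayer(torch.nn.Module):
    """Single-head self-attention layer that either computes its own alignment
    weights or reuses weights handed down from a lower, already computed layer."""
    def __init__(self, d_model: int):
        super().__init__()
        self.q = torch.nn.Linear(d_model, d_model)
        self.k = torch.nn.Linear(d_model, d_model)
        self.v = torch.nn.Linear(d_model, d_model)

    def forward(self, x, shared_weights=None):
        v = self.v(x)
        if shared_weights is None:
            scores = self.q(x) @ self.k(x).transpose(-2, -1) / x.size(-1) ** 0.5
            weights = F.softmax(scores, dim=-1)
        else:
            weights = shared_weights  # reuse the bottom layer's alignment, skipping QK^T
        return weights @ v, weights

def decode(layers, x, share_map):
    """share_map[i] = j means layer i reuses the attention weights computed by layer j."""
    cached = {}
    for i, layer in enumerate(layers):
        x, weights = layer(x, shared_weights=cached.get(share_map.get(i)))
        cached[i] = weights
    return x

# Usage: a 4-layer stack where layers 2 and 3 reuse the weights of layers 0 and 1
layers = torch.nn.ModuleList(SharedAttentionLayer(8) for _ in range(4))
out = decode(layers, torch.randn(1, 5, 8), share_map={2: 0, 3: 1})
```

Reusing the full attention calculation result would go one step further and skip the value projection and the weighted sum as well, at the cost of slightly more deviation from the unshared model.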

Description

Technical field

[0001] The invention relates to neural machine translation inference acceleration technology, and in particular to an attention-mechanism-based neural machine translation inference acceleration method.

Background technique

[0002] Machine translation (Machine Translation, MT) is an experimental discipline that uses computers to translate between natural languages. In layman's terms, it is the process of using a computer to convert one natural language (the source language) into another natural language (the target language). For a long time, machine translation has been regarded as one of the ultimate technical means of solving translation problems, and the application demand is very strong. For example, the Chinese government has included research on natural language understanding, including machine translation technology, in the national medium- and long-term scientific and technological development plan; it is reported that Google Translate provides services to more than 20...


Application Information

IPC(8): G06F17/28, G06K9/62, G06N3/04
CPC: G06N3/045, G06F18/22
Inventor: 杜权, 朱靖波, 肖桐, 张春良
Owner: 沈阳雅译网络技术有限公司