Encoder-decoder framework pre-training method for neural machine translation

A machine translation pre-training technology, applied in neural learning methods, natural language translation, biological neural network models, and related fields. It addresses problems such as the inability to initialize the entire model from a pre-trained model, a limitation that restricts the benefits of pre-training.

Active Publication Date: 2020-07-07
沈阳雅译网络技术有限公司

AI Technical Summary

Problems solved by technology

However, this integration method has an important shortcoming: only part of the information in the pre-trained model can be applied to the neural machine translation model, or it can only be applied to some modules of that model, so the entire model cannot be initialized from it. Some parameters still have to be learned from scratch, which limits the benefits of pre-trained models.



Examples


Embodiment Construction

[0037] The present invention is further elaborated below with reference to the accompanying drawings of the description.

[0038] In the field of natural language processing, models based on the encoder-decoder framework are generally used for conditional generation tasks such as machine translation, text generation, and intelligent dialogue, whereas pre-training models require massive amounts of data, which means they can only be trained on unlabeled monolingual data. Inspired by document-level machine translation tasks, encoding the context of a sentence helps its translation, because adjacent sentences generally share some semantic information. A neural machine translation model extracts the information in the source-language input through the encoder, and the decoder generates target-language text with the same semantics from the extracted information. Therefore, the present invention can use the preceding text of a sentence as...
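The following is a minimal sketch of how such document-level pre-training data could be assembled, assuming the preceding sentence serves as the pseudo-source for the current sentence and a language tag is prepended to every sentence as the abstract describes; the function name (build_pretraining_pairs) and the "<en>" tag are illustrative assumptions, not taken from the patent:

```python
# Sketch: build pre-training pairs from a document-level monolingual corpus.
# Assumption: each document is a list of consecutive sentences in one language,
# and a special language identifier (e.g. "<en>", "<zh>") is prepended to each
# sentence, following the abstract. All names here are illustrative.

from typing import Iterable, List, Tuple

def build_pretraining_pairs(
    documents: Iterable[List[str]], lang_tag: str
) -> List[Tuple[str, str]]:
    """For every sentence, use its preceding sentence as the (pseudo-)source
    and the sentence itself as the target, mimicking a translation pair."""
    pairs = []
    for doc in documents:
        for prev_sent, cur_sent in zip(doc, doc[1:]):
            source = f"{lang_tag} {prev_sent}"
            target = f"{lang_tag} {cur_sent}"
            pairs.append((source, target))
    return pairs

# Usage with a toy English document:
docs = [["The weather was cold .", "We stayed indoors all day .", "Tea helped a lot ."]]
for src, tgt in build_pretraining_pairs(docs, "<en>"):
    print(src, "=>", tgt)
```

Because adjacent sentences share semantic information, these pseudo-parallel pairs let the encoder and decoder be trained jointly on monolingual data in the same form a translation pair would take.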



Abstract

The invention discloses an encoder-decoder framework pre-training method for neural machine translation. The method comprises the steps of: constructing a large number of multi-language document-level monolingual corpora, and adding a special identifier in front of each sentence to indicate its language; processing sentence pairs to obtain training data; training on monolingual data of different languages until convergence to obtain pre-training model parameters; constructing parallel corpora, and initializing the parameters of a neural machine translation model with the pre-training model parameters; fine-tuning the initialized neural machine translation model on the parallel corpora to complete the training process; and, in the decoding stage, encoding a source-language sentence with the encoder of the trained neural machine translation model and decoding with the decoder to generate a target-language sentence. With this pre-training method, the model acquires both language modeling capability and language generation capability; applying the pre-trained model to the neural machine translation model increases the convergence rate of the model and improves its robustness.
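A minimal sketch of the initialization and fine-tuning steps described above, assuming a generic PyTorch Transformer stands in for the patent's encoder-decoder model; the simulated pre-trained checkpoint and the toy parallel batch are illustrative placeholders, not the patent's actual setup:

```python
# Sketch: initialize an encoder-decoder NMT model from converged pre-training
# parameters and fine-tune it on parallel data, per the abstract's steps.
import torch
import torch.nn as nn

class NMTModel(nn.Module):
    """Illustrative encoder-decoder stand-in, not the patent's exact architecture."""
    def __init__(self, vocab_size: int, d_model: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=4, num_encoder_layers=2,
            num_decoder_layers=2, batch_first=True,
        )
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids):
        hidden = self.transformer(self.embed(src_ids), self.embed(tgt_ids))
        return self.proj(hidden)

vocab_size = 1000
model = NMTModel(vocab_size)

# Initialize the *entire* encoder-decoder from converged pre-training parameters.
# The weights are simulated here; in practice they would come from the
# monolingual pre-training run (e.g. a saved checkpoint loaded with torch.load).
pretrained_params = NMTModel(vocab_size).state_dict()
model.load_state_dict(pretrained_params)

# Fine-tune on parallel corpora (one random toy batch stands in for the data).
src_ids = torch.randint(1, vocab_size, (2, 7))      # source-language token ids
tgt_in_ids = torch.randint(1, vocab_size, (2, 6))   # shifted target input ids
tgt_out_ids = torch.randint(1, vocab_size, (2, 6))  # target label ids

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

logits = model(src_ids, tgt_in_ids)                 # (batch, tgt_len, vocab)
loss = criterion(logits.reshape(-1, vocab_size), tgt_out_ids.reshape(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The point of the sketch is the initialization step: because the pre-trained model has the same encoder-decoder shape as the translation model, every parameter can be carried over rather than only the encoder or only the embeddings.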

Description

Technical Field
[0001] The invention relates to a pre-training method for the encoder-decoder framework, and in particular to an encoder-decoder framework pre-training method oriented to neural machine translation.
Background Technique
[0002] In neural networks, pre-training refers to obtaining a base model by training on massive amounts of general data; such general and sufficient data encourages the model to develop good generalization ability on downstream tasks in the same field. Afterwards, for a downstream task, the pre-trained model is fine-tuned with task-specific data, so that the model attends more to task-related features and performs better on that task. When task-specific data is scarce, pre-training can effectively improve model performance, and since the pre-trained model already has general feature-extraction capability, the fine-tuned model converges faster and is more robust. ...


Application Information

IPC(8): G06F40/58, G06F40/56, G06F40/279, G06F40/205, G06N3/08
CPC: G06N3/08, Y02D10/00
Inventors: 杜权, 朱靖波, 肖桐, 张春良
Owner: 沈阳雅译网络技术有限公司