
Training method and training system for non-autoregressive machine translation model based on task-level curriculum learning

A technology relating to machine translation and training methods, applied in natural language translation, computational models, machine learning, etc., solving the problem of the low accuracy of non-autoregressive machine translation models and achieving the effects of improved accuracy and accelerated inference.

Active Publication Date: 2020-08-25
ZHEJIANG UNIV


Problems solved by technology

[0005] To solve the problem of the low accuracy of existing non-autoregressive machine translation models, the present invention starts from the training method of the model and proposes a training method and training system for a non-autoregressive machine translation model based on task-level curriculum learning. The invention uses the task-level curriculum learning method to gradually transfer the model from autoregressive translation (AT) to non-autoregressive translation (NAT).
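To make the staged transfer concrete, below is a minimal, hypothetical sketch of such a curriculum schedule in Python. The parallelism degree k (the number of target tokens decoded in parallel; see the Abstract below) starts at 1 (pure AT), is then sampled from a sliding task window of intermediate degrees (SAT), and finally reaches the target length N (pure NAT). The stage lengths, window width, and sampling rule here are illustrative assumptions, not the patent's exact settings.

```python
import random

def parallelism_for_step(step: int, n: int,
                         at_steps: int = 30_000,
                         sat_steps: int = 40_000,
                         window: int = 3) -> int:
    """Hypothetical task-level curriculum schedule.

    Returns the parallelism degree k for a training step: k = 1 in the
    AT stage, a value drawn from a sliding window of intermediate
    degrees in the SAT stage, and k = n in the NAT stage. All constants
    here are illustrative, not the patent's settings.
    """
    if step < at_steps:
        return 1                                 # AT stage: fully sequential
    if step < at_steps + sat_steps:
        progress = (step - at_steps) / sat_steps
        center = 2 + round(progress * (n - 3))   # slides from 2 toward n - 1
        lo = max(2, center - window // 2)
        hi = min(n - 1, center + window // 2)
        return random.randint(lo, hi)            # SAT stage: task window
    return n                                     # NAT stage: fully parallel
```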



Examples


Embodiment

[0091] This embodiment uses four standard translation datasets: the IWSLT14 German-English (De-En) dataset, the IWSLT16 English-German (En-De) dataset, and the WMT14 English-German (En-De) dataset; the fourth, the WMT14 German-English (De-En) dataset, is obtained by reversing the WMT14 English-German dataset. The IWSLT14 De-En and IWSLT16 En-De datasets are from M. Cettolo, C. Girardi, and M. Federico. 2012. WIT3: Web Inventory of Transcribed and Translated Talks. In Proc. of EAMT, pp. 261-268, Trento, Italy. The WMT14 En-De dataset is from the ACL Ninth Workshop on Statistical Machine Translation (https://www.statmt.org/wmt14/translation-task.html).

[0092] In the IWSLT14 dataset, the numbers of bilingual sentence pairs used for training, development, and testing are 153k, 7k, and 7k, respectively. The number of bilingual sentence pairs used by IWSLT16 for training, development, and...


Abstract

The invention discloses a training method and a training system for a non-autoregressive machine translation model based on task-level curriculum learning, and belongs to the field of non-autoregressive machine translation. The method comprises the following steps: first, a Transformer-based machine translation model is established, and the multi-head self-attention mechanism in the Transformer decoder is replaced with a causal-k self-attention mechanism, yielding the TCL-NAT model; then the parameter k of the causal-k self-attention mechanism is adjusted so that the training process is divided, in sequence, into an AT training stage (k=1), an SAT training stage (1<k<N), and an NAT training stage (k=N). A task window concept is introduced in the SAT training stage, so that several tasks with different degrees of parallelism are trained simultaneously in the same stage. This lets the model transition smoothly from one training stage to the next and effectively improves the accuracy of the non-autoregressive machine translation model.
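As an illustration of the causal-k mechanism, below is a minimal sketch of the corresponding self-attention mask, assuming the common semi-autoregressive convention that decoder inputs are shifted right by k positions, so that positions within the same group of k tokens may attend to one another without seeing future target tokens. The function name and the exact mask convention are assumptions for illustration, not taken verbatim from the patent.

```python
import torch

def causal_k_mask(seq_len: int, k: int) -> torch.Tensor:
    """Boolean mask (True = may attend) for causal-k self-attention.

    Positions are grouped into blocks of k consecutive tokens; a query
    may attend to any key in its own block or an earlier one. Under a
    decoder input shifted right by k, k = 1 recovers the standard
    causal (AT) mask, 1 < k < seq_len gives a block-wise SAT mask, and
    k >= seq_len leaves all positions visible (NAT).
    """
    groups = torch.arange(seq_len) // k                # block index per position
    return groups.unsqueeze(0) <= groups.unsqueeze(1)  # key block <= query block
```

For example, causal_k_mask(4, 2).int() yields [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 1, 1], [1, 1, 1, 1]]: the first group of two tokens sees only itself, while the second group also sees the first.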

Description

Technical field

[0001] The invention relates to the field of non-autoregressive machine translation, and in particular to a training method and system for a non-autoregressive machine translation model based on task-level curriculum learning.

Background

[0002] In recent years, neural machine translation (NMT) has developed rapidly. NMT usually adopts the encoder-decoder framework, and the mainstream method for the decoder to generate the target sentence is currently the autoregressive method, whose characteristic is that the generation of the current word depends on the prediction results of the previous words and on the source context from the encoder. Although the accuracy of NMT using the autoregressive method has reached the human level, the autoregressive method must translate word by word; that is, during inference, each word must wait until all the preceding words have been inferred before its own inference can be performed...
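The latency difference described above can be seen in a short decoding sketch. This is a hedged illustration with a hypothetical model interface (a callable returning per-position vocabulary logits), not the patent's implementation: the autoregressive loop needs one forward pass per output token, while the non-autoregressive variant predicts every position in a single pass.

```python
import torch

def decode_autoregressive(model, src, max_len: int, bos_id: int):
    """Greedy AR decoding: one forward pass per target token,
    so latency grows linearly with the output length."""
    ys = torch.full((src.size(0), 1), bos_id, dtype=torch.long)
    for _ in range(max_len):
        logits = model(src, ys)                    # (batch, t, vocab)
        next_tok = logits[:, -1].argmax(-1, keepdim=True)
        ys = torch.cat([ys, next_tok], dim=1)      # extend the prefix
    return ys[:, 1:]

def decode_non_autoregressive(model, src, tgt_len: int):
    """NAT decoding: all target positions predicted in one forward
    pass, which is the inference speed-up the invention targets."""
    logits = model(src, tgt_len=tgt_len)           # (batch, tgt_len, vocab)
    return logits.argmax(-1)
```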


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06F40/58; G06N20/00
CPC: G06F40/58; G06N20/00
Inventors: 赵洲, 路伊琳, 刘静林
Owner: ZHEJIANG UNIV