
Incorporating sentence weights into a domain adaptation method for neural machine translation

A machine translation technique based on sentence weights, applied in natural language translation and related areas, that addresses the tendency of the fine-tuning method to overfit and to degrade translation quality.

Active Publication Date: 2021-08-03
IOL WUHAN INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0008] The fine-tuning method does not take into account that out-of-domain corpus data close to the in-domain corpus can help in-domain translation, whereas out-of-domain data far from the in-domain corpus may degrade it. Moreover, the fine-tuning method is prone to overfitting.


Examples


Embodiment Construction

[0023] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0024] First, the basis on which this application builds is introduced: the NMT model based on the attention mechanism.

[0025] In a neural machine translation system, translation is generally implemented with an encoder-decoder framework. A word vector is initialized for each word in the training corpus, and the word vectors of all words together constitute a word vector dictionary. A word vector is generally a multi-dimensional vector in which each dimension is a real number. The number of dimensions is generally determined from experimental results. For example, ...
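The word vector dictionary described in [0025] can be illustrated with a minimal sketch. The function name `build_embedding_table`, the dimension of 8, and the uniform initialization range are assumptions chosen for illustration; the source does not specify them.

```python
import random

def build_embedding_table(corpus, dim=8, seed=0):
    """Give every distinct word in the corpus a randomly initialized
    vector of `dim` real numbers (hypothetical helper, not from the patent)."""
    rng = random.Random(seed)
    table = {}
    for sentence in corpus:
        for word in sentence.split():
            if word not in table:
                # Small uniform values are an assumed initialization scheme.
                table[word] = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    return table

corpus = ["the cat sat", "the dog ran"]
table = build_embedding_table(corpus, dim=8)
print(len(table), len(table["the"]))
```

In a real system these vectors would be trained jointly with the rest of the model, and the dimension would be tuned experimentally, as the paragraph notes.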


Abstract

The invention relates to a domain adaptation method that integrates sentence weights into neural machine translation. It is applied to an NMT model based on an attention mechanism and an encoder-decoder framework, and includes: calculating the similarity between each out-of-domain sentence and the in-domain corpus to assign a sentence weight; and integrating the sentence weight information into NMT training. In this method, the NMT encoder's own information is used together with a domain similarity method to obtain the weights, and the weights are integrated into NMT training. This method can achieve better translation results than the method in the paper "Instance weighting for neural machine translation domain adaptation."
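The two steps named in the abstract, assigning each out-of-domain sentence a weight from its similarity to the in-domain corpus and then folding that weight into training, might be sketched as follows. This is not the patented procedure. It assumes that each sentence is already represented by a fixed-size vector taken from the NMT encoder, that similarity is cosine similarity to the in-domain centroid, and that the weight scales the per-sentence loss. The names `sentence_weights` and `weighted_loss` are hypothetical.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)

def sentence_weights(out_domain_vecs, in_domain_vecs):
    """Assumed weighting: similarity to the in-domain centroid,
    rescaled from [-1, 1] to [0, 1]."""
    centroid = mean_vector(in_domain_vecs)
    return [(cosine(v, centroid) + 1.0) / 2.0 for v in out_domain_vecs]

def weighted_loss(sentence_losses, weights):
    """Assumed integration: each sentence's loss is scaled by its weight
    before averaging over the batch."""
    total = sum(w * l for w, l in zip(weights, sentence_losses))
    return total / len(sentence_losses)

# Toy encoder representations (made-up numbers for illustration only).
in_vecs = [[1.0, 0.0], [1.0, 0.0]]
out_vecs = [[2.0, 0.0], [-1.0, 0.0], [0.0, 3.0]]
weights = sentence_weights(out_vecs, in_vecs)
print(weights)
print(weighted_loss([2.0, 2.0, 2.0], weights))
```

Under this assumed scheme, out-of-domain sentences that resemble the in-domain data contribute more to the gradient, and dissimilar ones contribute less, which matches the motivation stated in [0008].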

Description

Technical field

[0001] The invention relates to the technical field of machine translation, in particular to a domain adaptation method for integrating sentence weights into neural machine translation.

Background

[0002] With the improvement of computing power and the application of big data, deep learning has been applied ever more widely, and neural machine translation (NMT) based on deep learning has attracted increasing attention. In the NMT field, one of the most commonly used translation models is the attention-based encoder-decoder model. The main idea is to encode the sentence to be translated (hereinafter the 'source sentence') into a vector representation with an encoder, and then use a decoder to decode that vector representation and translate it into the corresponding translation (hereinafter the 'target sentence').

[0003] In many machine learning tasks, the distribution...
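The attention-based encoder-decoder idea from [0002] can be illustrated with a single attention step: the decoder scores each encoder state, normalizes the scores, and forms a weighted context vector. The source does not say which scoring function is used, so dot-product scoring is an assumption here, and the function name `attend` is hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """One attention step with assumed dot-product scoring.
    Returns the attention weights and the resulting context vector."""
    scores = [sum(d * h for d, h in zip(decoder_state, hs))
              for hs in encoder_states]
    attn = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * hs[i] for w, hs in zip(attn, encoder_states))
               for i in range(dim)]
    return attn, context

# Two encoder states (one per source word) and one decoder state; toy values.
encoder_states = [[1.0, 0.0], [0.0, 1.0]]
attn, context = attend([1.0, 0.0], encoder_states)
print(attn, context)
```

The decoder would combine `context` with its own state to predict the next target word. The encoder and decoder networks themselves are omitted from this sketch.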


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/58, G06F40/42
CPC: G06F40/42, G06F40/58
Inventor: 熊德意, 张诗奇
Owner: IOL WUHAN INFORMATION TECH CO LTD