Construction method of semi-supervised neural machine translation model based on word-to-word translation

A translation model and machine translation technology, applied in neural learning methods, biological neural network models, natural language translation, and related fields. It addresses the problem that an unsupervised translation model cannot translate normally between two widely differing languages, and achieves the effect of improved translation quality and translation performance.

Pending Publication Date: 2020-04-10
KUNMING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0003] The present invention provides a method for constructing a semi-supervised neural machine translation model based on word-to-word translation, which solves the problem that an unsupervised translation model cannot translate normally between two languages with a huge gap between them.


Image

Figures 1-3: Construction method of semi-supervised neural machine translation model based on word-to-word translation
Examples

Experimental program
Comparison scheme
Effect test

Embodiment 1

[0042] Embodiment 1: As shown in Figures 1-3, the specific steps of the construction method of the semi-supervised neural machine translation model based on word-to-word translation are as follows:

[0043] Step 1. Obtain the monolingual corpora of the source language and the target language, and the parallel corpus of the source language and the target language, and tokenize them;
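A minimal sketch of this step is given below, assuming plain-text corpus files with one sentence per line; the file names and the simple whitespace tokenization are illustrative assumptions of this sketch (real systems usually tokenize into subword units), not details from the patent.

def load_sentences(path):
    """Read a plain-text corpus file with one sentence per line."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

def tokenize(sentence):
    """Whitespace tokenization as a placeholder for a real (subword) tokenizer."""
    return sentence.split()

# Hypothetical file names: monolingual corpora and a sentence-aligned parallel corpus.
src_mono = [tokenize(s) for s in load_sentences("mono.src")]
tgt_mono = [tokenize(s) for s in load_sentences("mono.tgt")]
parallel = [(tokenize(s), tokenize(t))
            for s, t in zip(load_sentences("parallel.src"),
                            load_sentences("parallel.tgt"))]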

[0044] Step 2. Use the monolingual corpora of the source language and the target language to train a cross-language language model:

[0045] $L_{lm} = \mathbb{E}_{x \sim S}\big[-\log P_{s \to s}(x \mid C(x))\big] + \mathbb{E}_{y \sim T}\big[-\log P_{t \to t}(y \mid C(y))\big]$

[0046] Among them, S represents the monolingual corpus of the source language, T represents the monolingual corpus of the target language, and x and y represent single sentences from the source-language and target-language monolingual corpora respectively; C(x) and C(y) represent the noised versions of the sentences, obtained by deleting, replacing, and swapping words...
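The sketch below illustrates one way to realize the noise function C and this denoising language-model loss; the noise probabilities, the swap window, and the log_prob interface standing in for the language model being trained are assumptions of this sketch, not details given in the patent.

import random

def add_noise(tokens, p_drop=0.1, p_replace=0.1, swap_window=3, vocab=None):
    """C(x): corrupt a tokenized sentence by deleting, replacing and locally
    swapping words. The probabilities and window size are illustrative."""
    vocab = vocab or tokens
    out = []
    for tok in tokens:
        r = random.random()
        if r < p_drop:
            continue                              # delete the word
        if r < p_drop + p_replace:
            out.append(random.choice(vocab))      # replace with a random word
        else:
            out.append(tok)                       # keep the word
    # locally shuffle word order: each word drifts by at most ~swap_window positions
    keyed = [(i + random.uniform(0, swap_window), t) for i, t in enumerate(out)]
    return [t for _, t in sorted(keyed)]

def denoising_lm_loss(log_prob, source_sents, target_sents):
    """Monte-Carlo estimate of
    L_lm = E_{x~S}[-log P_{s->s}(x | C(x))] + E_{y~T}[-log P_{t->t}(y | C(y))],
    where log_prob(direction, sentence, noised) stands in for the cross-language
    language model being trained (an assumed interface for this sketch)."""
    loss_s = sum(-log_prob("s->s", x, add_noise(x)) for x in source_sents) / len(source_sents)
    loss_t = sum(-log_prob("t->t", y, add_noise(y)) for y in target_sents) / len(target_sents)
    return loss_s + loss_t

In practice the two expectations are estimated over mini-batches sampled from the source and target monolingual corpora during training.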



Abstract

The invention relates to a construction method of a semi-supervised neural machine translation model based on word-to-word translation, and belongs to the technical field of natural language processing. The method comprises the following steps: firstly, monolingual corpora of a source language and a target language, and a parallel corpus of the source language and the target language, are acquired; a cross-language language model is trained with the monolingual corpora; the encoder-decoder of the translation model is initialized with the trained language model; a bilingual dictionary of the two languages is acquired; a dictionary prefix tree is constructed in the translation model according to the bilingual dictionary; the auto-encoder in the translation model is trained; the translation model is trained with the parallel corpus of the source language and the target language, with the auto-encoder training and the parallel-corpus training carried out at the same time; and the translation models are fused to obtain the final translation model. The method is simple and effective, the model can translate normally, and the translation performance of the model is greatly improved.
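The abstract only states that a dictionary prefix tree is built from the bilingual dictionary to support word-to-word translation; the sketch below shows one plausible word-level trie with longest-match lookup, where the class name, the lookup interface, and the copy-through handling of unknown words are all assumptions of this sketch rather than details from the patent.

class DictTrie:
    """Prefix tree over source-language tokens storing dictionary translations.
    One plausible layout; the patent only states that a dictionary prefix tree
    is built, so this word-level design and the lookup API are assumptions."""

    def __init__(self):
        self.children = {}
        self.translation = None    # target-side entry if a dictionary item ends here

    def insert(self, source_tokens, target):
        node = self
        for tok in source_tokens:
            node = node.children.setdefault(tok, DictTrie())
        node.translation = target

    def longest_match(self, tokens, start):
        """Return (end_index, translation) for the longest dictionary entry
        starting at position `start`, or (start, None) if nothing matches."""
        node, best = self, (start, None)
        for i in range(start, len(tokens)):
            node = node.children.get(tokens[i])
            if node is None:
                break
            if node.translation is not None:
                best = (i + 1, node.translation)
        return best

def word_to_word_translate(trie, tokens):
    """Greedy word-to-word translation with the dictionary trie; unknown words
    are copied through unchanged (an assumption of this sketch)."""
    out, i = [], 0
    while i < len(tokens):
        end, translation = trie.longest_match(tokens, i)
        if translation is None:
            out.append(tokens[i])
            i += 1
        else:
            out.append(translation)
            i = end
    return out

A prefix tree of this kind lets multi-word dictionary entries be matched greedily in a single left-to-right pass over the source sentence.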

Description

Technical field

[0001] The invention relates to a construction method of a semi-supervised neural machine translation model based on word-to-word translation, and belongs to the technical field of natural language processing.

Background technique

[0002] In the field of natural language processing, machine translation brings together many natural language processing techniques and is one of its most practical research subfields. Since supervised neural machine translation requires a large amount of parallel corpus, it performs poorly on language pairs for which large parallel corpora are difficult to obtain, and unsupervised neural machine translation has therefore been developed. In Chinese-English experiments with unsupervised neural machine translation, we found that the unsupervised neural machine translation model could not work properly due to the huge gap between the Chinese and English languages. Therefore, two simple methods are proposed to improve it, so that the model can w...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F40/42, G06N3/08
CPC: G06N3/088
Inventor: 余正涛, 刘科材, 李磊, 王振晗, 吴霖
Owner: KUNMING UNIV OF SCI & TECH