Machine translation model training method and device and electronic equipment

A translation-model training technology, applied in the field of machine translation, that addresses noisy translation results, the inability to eliminate errors made by machine translation itself, and the resulting failure to reach the desired effect. It aims to eliminate noise interference, ensure accuracy, and avoid oversaturation problems.

Active Publication Date: 2019-10-08
TSINGHUA UNIV
11 Cites, 9 Cited by

AI Technical Summary

Problems solved by technology

[0005] However, a large parallel corpus cannot eliminate possible errors in machine translation itself. When errors occur in machine translation itself, the translation results obtained will be noisy, which will affect the accuracy of the translation results and fail to achieve the desired effect.

Embodiment Construction

[0024] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the protection scope of the embodiments of the present invention.

[0025] The embodiment of the present invention addresses the prior-art problem that a translation model trained under noise interference translates inaccurately. It processes an existing monolingual corpus to expand the parallel corpus used for training the model, and further uses Monte Carl...
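The corpus-expansion step can be illustrated with a minimal sketch. It assumes, as the abstract states, that a target-to-source model translates each monolingual sentence and that the result is paired with the original sentence. The names `build_second_parallel_corpus` and `toy_back_translate` are hypothetical, and the toy translator only stands in for a trained model.

```python
from typing import Callable, List, Tuple

SentencePair = Tuple[str, str]  # (source sentence, target sentence)


def build_second_parallel_corpus(
    monolingual_targets: List[str],
    back_translate: Callable[[str], str],
) -> List[SentencePair]:
    """Pair each target-side monolingual sentence with its synthetic source."""
    return [(back_translate(t), t) for t in monolingual_targets]


def toy_back_translate(sentence: str) -> str:
    """Hypothetical stand-in for a trained target-to-source model."""
    return "<synthetic> " + sentence


# First parallel corpus: genuine (source, target) pairs.
first_parallel: List[SentencePair] = [("ein Haus", "a house")]

# Second parallel corpus: synthetic sources spliced with real targets.
second_parallel = build_second_parallel_corpus(
    ["a cat", "a dog"], toy_back_translate
)

# Overall corpus used to train the source-to-target model.
overall_corpus = first_parallel + second_parallel
```

Only the source side of the second corpus is synthetic, so any noise introduced by the target-to-source model sits on the input side of the final source-to-target model.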

Abstract

The embodiment of the invention provides a training method, a device, and electronic equipment for a machine translation model. The training method comprises: using a first parallel corpus to preliminarily train a target-to-source translation model; using the trained target-to-source model to translate each sentence in a given monolingual corpus to obtain a synthetic corpus, and splicing the synthetic corpus with the monolingual corpus to obtain a second parallel corpus; using a Monte Carlo dropout (random inactivation) algorithm to evaluate the credibility of the translation results produced by the trained target-to-source model; and, based on the credibility, training a source-to-target translation model on the overall corpus formed by the first parallel corpus and the second parallel corpus. According to the embodiment of the invention, the translation model can still be trained accurately in the presence of noise interference, which ensures the accuracy of the translation model.
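The credibility evaluation can be sketched as follows. The source states only that Monte Carlo dropout is used to evaluate credibility and that training is based on it. Everything else here is an assumption for illustration: the toy scorer, the function names, and the use of credibility as a per-example loss weight.

```python
import math
import random
from statistics import mean, pvariance
from typing import List, Sequence, Tuple


def stochastic_prob(features: Sequence[float], weights: Sequence[float],
                    drop_rate: float, rng: random.Random) -> float:
    """One forward pass of a toy scorer with dropout left switched on."""
    keep = 1.0 - drop_rate
    score = 0.0
    for x, w in zip(features, weights):
        if rng.random() < keep:          # unit survives this pass
            score += (x * w) / keep      # inverted-dropout rescaling
    return 1.0 / (1.0 + math.exp(-score))  # squash to (0, 1)


def mc_dropout_estimate(features: Sequence[float], weights: Sequence[float],
                        passes: int = 20, drop_rate: float = 0.1,
                        seed: int = 0) -> Tuple[float, float]:
    """Mean and variance of the toy probability over repeated passes."""
    rng = random.Random(seed)
    probs = [stochastic_prob(features, weights, drop_rate, rng)
             for _ in range(passes)]
    return mean(probs), pvariance(probs)


def weighted_loss(losses: List[float], credibilities: List[float]) -> float:
    """Assumed use of credibility: a normalized per-example loss weight."""
    total = sum(credibilities)
    return sum(l * c for l, c in zip(losses, credibilities)) / total


avg, var = mc_dropout_estimate([1.0, 0.5, -0.2], [0.8, 1.2, 0.4])
```

A low variance across passes suggests that the model is consistent about a translation, while a high variance marks it as less credible. In this sketch a less credible synthetic pair would receive a smaller weight in `weighted_loss`.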

Description

Technical field

[0001] The present invention relates to the technical field of machine translation, and more specifically to a training method, device, and electronic equipment for a machine translation model.

Background

[0002] In the field of language translation, automatic machine translation is currently usually realized with neural-network-based methods, which require a large-scale, high-quality parallel corpus to train a reliable neural network model. However, high-quality parallel corpora often exist only for a small number of languages and are often limited to specific fields, such as government documents and news.

[0003] At present, with the development of key technologies such as databases and the Internet, electronic documents in various languages and fields are increasingly abundant, providing rich monolingual corpus for machine translation, which also provides great convenience for solving the above p...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/28
CPC: G06F40/58
Inventors: 刘洋, 王硕, 栾焕博, 孙茂松
Owner: TSINGHUA UNIV