Translation model training method and device and medium

A translation model training method, applied in natural language translation and related fields. It addresses problems such as poor accuracy of translation results and achieves reduced training costs, improved accuracy, and improved training effects.

Pending Publication Date: 2021-11-05
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0004] The second type of approach reduces the number of models needed for translation. However, during training the translation model still learns the association between pairs of languages, so it does not differ substantially from the first type of approach. As a result, the translation performance of the trained model in each language is closely tied to the data volume of the bilingual parallel corpus for that language. When the corpus for a given language is small, the trained translation model is less accurate when translating that language.



Examples


Example 1

[0135] Example 1: Randomly sample a fixed proportion of the words in the sentence to be translated as the words to be replaced.

[0136] The first server can pre-store a fixed ratio, that is, the ratio of the number of words to be replaced to the total number of words in the sentence to be translated. The first server can then randomly select, from the sentence to be translated, a number of words that satisfies the fixed ratio as the words to be replaced.
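The fixed-ratio sampling described above might be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `sample_by_ratio`, the round-down rule, and the optional `rng` parameter are assumptions introduced here.

```python
import random


def sample_by_ratio(words, ratio, rng=None):
    """Return sorted indices of the words chosen for replacement.

    `ratio` plays the role of the pre-stored fixed ratio:
    (number of words to replace) / (total words in the sentence).
    Rounding down is an assumption of this sketch.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    if rng is None:
        rng = random.Random()
    count = int(len(words) * ratio)
    # Sample positions without replacement so no word is chosen twice.
    return sorted(rng.sample(range(len(words)), count))


sentence = "the cat sat on the mat today".split()
indices = sample_by_ratio(sentence, 0.3, random.Random(0))
print([sentence[i] for i in indices])
```

Because the count is derived from the sentence length, longer sentences yield more replaced words under the same ratio.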

[0137] In the embodiment of the present application, the number of words to be replaced can therefore differ from one sentence to be translated to another, so that as many words as possible are replaced and the training effect of the first translation model is improved.

[0138] Furthermore, because the semantics expressed by nouns and verbs in sentences are generally more valuable, in the embodiment of the present application the first server can set the sampling probability corresponding to nouns and verbs in the sentence to be tran...

Example 2

[0139] Example 2: Randomly sample a fixed number of words from the sentence to be translated as the words to be replaced.

[0140] The first server may pre-store a fixed number, the value of the fixed number may be set according to actual needs, and the value of the fixed number is smaller than the total number of words included in the sentence to be translated. When the first server needs to replace the words in the sentence to be translated, it may randomly sample a fixed number of words from the sentence to be translated as the words to be replaced.
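A minimal sketch of the fixed-number variant, under the same caveats: `sample_fixed_number` and its `rng` parameter are hypothetical names used only for illustration. The check mirrors the stated constraint that the fixed number is smaller than the number of words in the sentence.

```python
import random


def sample_fixed_number(words, fixed_number, rng=None):
    """Return sorted indices of `fixed_number` randomly chosen words."""
    # The source requires the fixed number to be smaller than the word count.
    if not 0 < fixed_number < len(words):
        raise ValueError("fixed_number must be positive and less than len(words)")
    if rng is None:
        rng = random.Random()
    return sorted(rng.sample(range(len(words)), fixed_number))


sentence = "machine translation helps people communicate across languages".split()
indices = sample_fixed_number(sentence, 2, random.Random(0))
print([sentence[i] for i in indices])
```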

[0141] Similarly, in the embodiment of the present application, the first server may set the sampling probabilities corresponding to the nouns and verbs in the sentence to be translated to be higher than the sampling probabilities of words of other parts of speech.
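One way to give nouns and verbs a higher sampling probability is weighted sampling without replacement. The sketch below assumes the part-of-speech tags are already available as input; the tag names `NOUN` and `VERB`, the weight values, and the function name `sample_weighted` are all assumptions, since the source does not specify how the probabilities are set.

```python
import random

# Assumed weights: nouns and verbs are three times as likely to be drawn.
NOUN_VERB_WEIGHT = 3.0
OTHER_WEIGHT = 1.0


def sample_weighted(tagged_words, count, rng=None):
    """Pick `count` word indices, favouring nouns and verbs.

    `tagged_words` is a list of (word, pos_tag) pairs supplied by the caller.
    """
    if not 0 <= count <= len(tagged_words):
        raise ValueError("count must be between 0 and len(tagged_words)")
    if rng is None:
        rng = random.Random()
    candidates = list(range(len(tagged_words)))
    weights = [
        NOUN_VERB_WEIGHT if pos in ("NOUN", "VERB") else OTHER_WEIGHT
        for _, pos in tagged_words
    ]
    chosen = []
    for _ in range(count):
        # Draw one remaining position in proportion to its weight, then remove it.
        pick = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        chosen.append(candidates.pop(pick))
        weights.pop(pick)
    return sorted(chosen)


tagged = [("the", "DET"), ("cat", "NOUN"), ("sat", "VERB"), ("quietly", "ADV")]
print(sample_weighted(tagged, 2, random.Random(0)))
```

The same function serves both examples: the caller passes either a count derived from the fixed ratio or the fixed number itself.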



Abstract

The invention provides a translation model training method, device and medium. It relates to the technical field of artificial intelligence, in particular to natural language processing technology, and can improve the output accuracy of a trained translation model. In the translation model training method, some words in a sentence to be translated are replaced with synonyms in languages different from the language of that sentence, and the resulting multilingual sentence is input into the translation model in one pass. The translation model can thus learn the relations among multiple languages at once, which improves its training effect and therefore its output accuracy.
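The replacement step summarized in the abstract might look like the following sketch. The toy dictionary `SYNONYMS`, its language codes, and the function name `replace_with_synonyms` are hypothetical; the source does not say where the cross-language synonyms come from or how a target language is chosen.

```python
import random

# Hypothetical toy dictionary: source word -> {language code: synonym}.
SYNONYMS = {
    "cat": {"fr": "chat", "de": "Katze"},
    "house": {"fr": "maison", "es": "casa"},
}


def replace_with_synonyms(words, indices, synonyms, rng=None):
    """Replace the words at `indices` with a synonym in another language.

    Words without a dictionary entry are left unchanged (an assumption).
    """
    if rng is None:
        rng = random.Random()
    mixed = list(words)
    for i in indices:
        options = synonyms.get(words[i])
        if options:
            # Choosing the language uniformly at random is an assumption.
            language = rng.choice(sorted(options))
            mixed[i] = options[language]
    return mixed


sentence = "the cat is in the house".split()
print(replace_with_synonyms(sentence, [1, 5], SYNONYMS, random.Random(0)))
```

The output sentence mixes several languages, which is what lets a single training input expose the model to more than one language pair.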

Description

Technical field

[0001] This application relates to the field of artificial intelligence technology, in particular to natural language processing technology, and provides a translation model training method, device and medium.

Background technique

[0002] Machine translation can enable people to communicate with each other without being limited by language, and can promote economic and cultural exchanges in various countries and regions.

[0003] At present, there are usually two ways of machine translation. The first is to realize machine translation through a one-to-one translation model, that is, a separate translation model is trained for the translation process from one language to another. The second is to train a single translation model by merging bilingual parallel corpora in multiple languages, and to achieve translation in multiple languages through a translation model that shares parameters.

[0004] In the second type, the number of models used for translatio...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/58, G06F40/42
CPC: G06F40/58, G06F40/42
Inventor: 曾显峰, 孟凡东
Owner: TENCENT TECH (SHENZHEN) CO LTD