Translation model training method, device and equipment and storage medium

A translation model training method, applied to the field of translation model training methods, devices, equipment and storage media. It addresses the problem that limited parallel data restricts the quality of machine translation models, and achieves improved translation quality, easier transfer, and reduced difficulty of transfer learning.

Pending Publication Date: 2021-03-26
IFLYTEK CO LTD

AI Technical Summary

Problems solved by technology

When the number of parallel sentence pairs in the translation task is relatively rich, the quality of the translation model is high, b



Detailed Description of Embodiments

[0048] The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments, without creative effort, fall within the scope of protection of this application.

[0049] This application provides a translation model training scheme. For a translation model from a source language to a target language, training data can be synthesized by migrating the training corpus of a third-party language, thereby augmenting the training data and addressing the problem of limited training resources for the source and target languages.

[0050] The solution of this application can be implemented based on...


Abstract

The invention discloses a translation model training method, device, equipment and storage medium. For a source language and/or a target language, the method includes: obtaining a training corpus in the respective similar (approximate) language, together with a parallel corpus of that training corpus; if the language of the training corpus is similar to the source language, taking the parallel corpus as the target language, and if the language of the training corpus is similar to the target language, taking the parallel corpus as the source language; for at least one text unit in the training corpus, replacing the text unit with the parallel text unit in the source language or target language to which the training corpus is similar, to obtain a mixed-language training corpus; and forming a parallel corpus pair from the mixed-language training corpus and the parallel corpus, adding the pair to the training sample set, and training a translation model from the source language to the target language. In this way, similar-language resources of the source language and/or the target language are utilized, the model training data is enriched, and the training effect of the translation model is improved.
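The replacement step in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes text units are whitespace tokens, that a word-level lexicon mapping similar-language tokens to source- or target-language tokens is available, and that a random subset of tokens is replaced. The names `mix_sentence`, `build_mixed_pairs`, `lexicon`, `similar_to` and `replace_prob` are hypothetical, and the toy tokens (`a1`, `s1`, `t1`) are placeholders rather than real words.

```python
import random
from typing import Dict, List, Tuple

Sentence = List[str]


def mix_sentence(tokens: Sentence, lexicon: Dict[str, str],
                 replace_prob: float, rng: random.Random) -> Sentence:
    """Replace some tokens with their lexicon counterparts.

    At least one token is replaced whenever any token has a lexicon entry,
    mirroring the "at least one text unit" wording of the abstract.
    """
    candidates = [i for i, tok in enumerate(tokens) if tok in lexicon]
    if not candidates:
        return list(tokens)  # nothing can be replaced; keep the sentence as-is
    chosen = {i for i in candidates if rng.random() < replace_prob}
    if not chosen:
        chosen = {rng.choice(candidates)}  # force a single replacement
    return [lexicon[tok] if i in chosen else tok for i, tok in enumerate(tokens)]


def build_mixed_pairs(similar_corpus: List[Tuple[Sentence, Sentence]],
                      lexicon: Dict[str, str],
                      similar_to: str = "source",
                      replace_prob: float = 0.3,
                      seed: int = 0) -> List[Tuple[Sentence, Sentence]]:
    """Turn (similar-language sentence, parallel sentence) pairs into
    (source side, target side) training pairs with a mixed-language side.

    similar_to="source": the mixed sentence is the source side and the
    parallel sentence is the target side. similar_to="target": the parallel
    sentence is the source side and the mixed sentence is the target side.
    """
    if similar_to not in ("source", "target"):
        raise ValueError("similar_to must be 'source' or 'target'")
    rng = random.Random(seed)
    pairs = []
    for similar_tokens, parallel_tokens in similar_corpus:
        mixed = mix_sentence(similar_tokens, lexicon, replace_prob, rng)
        if similar_to == "source":
            pairs.append((mixed, list(parallel_tokens)))
        else:
            pairs.append((list(parallel_tokens), mixed))
    return pairs


# Toy usage with placeholder tokens: a1..a3 stand for similar-language words,
# s1/s3 for their source-language counterparts, t1/t2 for the target sentence.
corpus = [(["a1", "a2", "a3"], ["t1", "t2"])]
lexicon = {"a1": "s1", "a3": "s3"}
augmented = build_mixed_pairs(corpus, lexicon, similar_to="source", replace_prob=1.0)
```

The resulting pairs would be appended to the existing source-to-target training sample set before training. The replacement probability and the forced single replacement are design choices of this sketch, not details stated in the abstract.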

Description

Technical field

[0001] The present application relates to the technical field of machine learning, and more specifically to a translation model training method, device, equipment and storage medium.

Background

[0002] Modern machine translation systems use parallel corpora to learn the mapping between languages, so the quality of machine translation is positively correlated with the amount of parallel sentence-pair data. When the parallel sentence pairs for a translation task are abundant, the quality of the translation model is high; when the sentence-pair data is limited, the quality of the machine translation model is greatly restricted.

[0003] To address the shortage of translation training corpora, data augmentation can be used to synthesize training data. Among these methods, back-translation, a widely used machine translation data augmentation method, has become the standard ...


Application Information

IPC(8): G06F40/49, G06F40/58
CPC: G06F40/58, G06F40/49
Inventors: 叶忠义, 张为泰, 刘俊华
Owner: IFLYTEK CO LTD