
Multi-language translation model determination method and related device

A technology for determining multilingual translation models, applied in natural language translation, neural learning methods, biological neural network models, etc. It addresses the problem that oversampling affects the space occupation of multilingual translation models, and achieves the effect of improving the degree of generalization while ensuring translation accuracy.

Pending Publication Date: 2022-06-28
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

Oversampling may affect the space occupation of the multilingual translation model during training.

Embodiment Construction

[0033] The embodiments of the present application will be described below with reference to the accompanying drawings.

[0034] When training a multilingual translation model, the number of corpora may be unbalanced across translation directions. A model trained on such corpora may pay too much attention to translation directions with a large number of corpora and ignore those with a small number of corpora. This affects the translation ability of the multilingual translation model in different translation directions and, in turn, its translation accuracy.

[0035] In the related art, the usual way to address the effect of unbalanced corpus sizes on the translation ability of multilingual translation models in different translation directions is to oversample the training corpus of the translation dire...

Abstract

The invention discloses a method for determining a multilingual translation model and a related device, which can be applied to scenarios such as artificial intelligence, natural language processing and machine learning. The translation directions corresponding to n training tasks are determined through sampling parameters, and the training corpora are sampled according to those translation directions to obtain n sample sets. The n sample sets correspond one-to-one to the n training tasks, each contains the same number of corpora, and the training corpora within one sample set all belong to the same translation direction, which prevents translation directions with few training corpora from being ignored. The source-language corpora of the n sample sets are input into the initial multilingual translation model, and n loss functions, in one-to-one correspondence with the n sample sets, are obtained according to the corresponding target-language corpora. A total loss function is then determined from the n loss functions, and the initial multilingual translation model is trained according to the total loss function. This improves the model's degree of generalization and the efficiency of multilingual translation, and ensures translation accuracy in different translation directions.
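
The sampling and loss-combination steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the "sampling parameters" are assumed to be one probability per translation direction, sample sets are drawn with replacement, and the total loss is assumed to be a plain mean of the n per-set losses, since the abstract does not say how they are combined.

```python
import random


def sample_directions(direction_probs, n, rng):
    """Pick a translation direction for each of the n training tasks.

    direction_probs is an assumed form of the "sampling parameters":
    a mapping from direction name to sampling probability.
    """
    directions = list(direction_probs)
    weights = [direction_probs[d] for d in directions]
    return rng.choices(directions, weights=weights, k=n)


def build_sample_sets(corpora, task_directions, set_size, rng):
    """Draw one equally sized, single-direction sample set per task."""
    sample_sets = []
    for direction in task_directions:
        pairs = corpora[direction]
        # Sampling with replacement (an assumption) keeps every set the
        # same size even when a direction has fewer than set_size pairs.
        sample_sets.append([rng.choice(pairs) for _ in range(set_size)])
    return sample_sets


def total_loss(losses):
    """Combine the n per-set losses; the plain mean is an assumption."""
    return sum(losses) / len(losses)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy, deliberately unbalanced corpora of (source, target) pairs.
    corpora = {
        "en-zh": [("hello", "你好")] * 1000,
        "en-fr": [("hello", "bonjour")] * 10,
    }
    probs = {"en-zh": 0.5, "en-fr": 0.5}
    task_dirs = sample_directions(probs, n=4, rng=rng)
    sets = build_sample_sets(corpora, task_dirs, set_size=8, rng=rng)
    print(task_dirs, [len(s) for s in sets])
```

In an actual training loop, each sample set's source sentences would be passed through the initial model and scored against the target sentences to give one loss per set; `total_loss` would then combine those n values before the parameter update.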

Description

Technical field

[0001] The present application relates to the field of machine learning, and in particular, to a method and related apparatus for determining a multilingual translation model.

Background

[0002] With the wide application of machine translation technology, machine translation has gradually expanded to multilingual machine translation. Multilingual machine translation means that the same model can support translation between multiple languages, so as to meet the needs of users for multiple translation directions.

[0003] However, training of a multilingual machine translation model (multilingual translation model) requires training corpora in multiple translation directions. Usually, the number of training corpora for multiple translation directions is not balanced. In order to avoid over-focusing on the translation directions with a large number of training corpora and ignoring the translation directions with a small number of training corpora dur...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/49, G06F40/58, G06F40/44, G06N3/08
CPC: G06F40/49, G06F40/58, G06F40/44, G06N3/08, G06N3/084
Inventor: 季佰军, 胡博杰, 鞠奇
Owner: TENCENT TECH (SHENZHEN) CO LTD