Machine translation model training method, device, and storage medium

A model training technique for machine translation, applicable to machine learning, computing models, natural language translation, and related fields, that addresses problems such as inconsistent training difficulty and inconsistent translation quality across language data from different domains.

Pending Publication Date: 2020-04-10
BEIJING XIAOMI INTELLIGENT TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] For language data in different fields, the amount of available data differs, which leads to inconsistent training difficulty. In the related technologies above, a translation model obtained by mixing language data from multiple data fields together for training therefore produces translation results of inconsistent quality for language data in different fields.


Embodiment Construction

[0071] Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with aspects of the present disclosure as recited in the appended claims.

[0072] It should be understood that "several" mentioned herein refers to one or more, and "multiple" refers to two or more. "And/or" describes the association relationship of associated objects and indicates that three types of relationships are possible; for example, A and/or B may indicate: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" gen...


Abstract

The invention discloses a machine translation model training method, device, and storage medium, and belongs to the technical field of natural language processing. The method comprises the following steps: acquiring a multi-domain mixed training data set containing a plurality of training data pairs; performing data domain classification on the plurality of training data pairs to obtain at least two domain data subsets; determining at least two candidate optimization objectives for each domain data subset, and training, for each domain data subset, at least two candidate single-domain models based on the at least two candidate optimization objectives; testing the at least two candidate single-domain models corresponding to each domain data subset, and selecting the candidate optimization objective whose candidate single-domain model has the highest accuracy as the specified optimization objective of that domain data subset; and training a hybrid domain model based on each domain data subset in the training data set and its corresponding specified optimization objective. This can improve the quality of the hybrid domain model's translation results for language data in each domain.
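
The control flow of the steps in the abstract can be illustrated with a toy sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent: a "model" is reduced to a source-to-target lookup table, the two candidate objectives (`majority` and `latest`) are hypothetical stand-ins for real training objectives, the domain rule `domain_of` is a hard-coded keyword check, and the test pairs are supplied separately. Only the selection logic (train one candidate per objective per domain, test each, keep the most accurate, then train one model over all domains) mirrors the described method.

```python
from collections import Counter, defaultdict

# Hypothetical stand-ins: a "model" is a source->target lookup table, and the
# two "objectives" are two different ways of fitting that table.
def fit_majority(pairs):
    """Assumed objective A: keep the most frequent target for each source."""
    counts = defaultdict(Counter)
    for src, tgt in pairs:
        counts[src][tgt] += 1
    return {src: c.most_common(1)[0][0] for src, c in counts.items()}

def fit_latest(pairs):
    """Assumed objective B: keep the most recently seen target for each source."""
    return {src: tgt for src, tgt in pairs}

OBJECTIVES = {"majority": fit_majority, "latest": fit_latest}

def accuracy(model, pairs):
    """Fraction of test pairs whose target the model reproduces exactly."""
    return sum(model.get(s) == t for s, t in pairs) / len(pairs)

def classify_domains(pairs, domain_of):
    """Split a mixed data set into per-domain subsets, preserving order."""
    subsets = defaultdict(list)
    for pair in pairs:
        subsets[domain_of(pair)].append(pair)
    return dict(subsets)

def select_objectives(train_subsets, test_subsets):
    """Train one candidate per objective per domain; keep the most accurate."""
    chosen = {}
    for domain, train_pairs in train_subsets.items():
        scores = {name: accuracy(fit(train_pairs), test_subsets[domain])
                  for name, fit in OBJECTIVES.items()}
        chosen[domain] = max(scores, key=scores.get)
    return chosen

def train_hybrid(train_subsets, chosen):
    """Fit one table over all domains, each subset with its selected objective."""
    hybrid = {}
    for domain, train_pairs in train_subsets.items():
        hybrid.update(OBJECTIVES[chosen[domain]](train_pairs))
    return hybrid

# Toy data (invented for this sketch): English source words, German targets.
MEDICAL_WORDS = {"dose"}
def domain_of(pair):
    return "medical" if pair[0] in MEDICAL_WORDS else "news"

train_pairs = [("dose", "Dosis"), ("bank", "Ufer"), ("dose", "Dosis"),
               ("bank", "Ufer"), ("dose", "Gabe"), ("bank", "Bank")]
test_pairs = [("dose", "Dosis"), ("bank", "Bank")]

train_subsets = classify_domains(train_pairs, domain_of)
test_subsets = classify_domains(test_pairs, domain_of)
chosen = select_objectives(train_subsets, test_subsets)
hybrid = train_hybrid(train_subsets, chosen)

print(chosen)  # {'medical': 'majority', 'news': 'latest'}
print(hybrid)  # {'dose': 'Dosis', 'bank': 'Bank'}
```

In this toy data the two domains end up with different objectives, which is the situation the method targets: a single objective applied to the mixed set would serve one domain worse than the other.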

Description

Technical field

[0001] The present disclosure relates to the technical field of natural language processing, and in particular to a machine translation model training method, device, and storage medium.

Background

[0002] In the field of machine translation, people continue to improve machine translation training methods in pursuit of translation accuracy.

[0003] In related technologies, language data from multiple data fields is mixed together for training to obtain a general translation model that is applicable to multi-field translation and translates well in each field.

[0004] For language data in different fields, the amount of available data differs, which leads to inconsistent training difficulty. In the above related technologies, the translation model obtained by mixing language data from multiple data fields together for training will cause the problem of inconsi...


Application Information

IPC(8): G06F40/58, G06K9/62
CPC: G06F18/24, G06F18/214, G06F40/44, G06F40/30, G06N20/00, G06F40/58, G06F40/42
Inventor: 孙于惠, 李响, 李京蔚
Owner: BEIJING XIAOMI INTELLIGENT TECH CO LTD