Mongolian-Chinese machine translation method combining Meta-KD framework and fine-grained compression

A machine translation method using fine-grained compression, applied in the field of artificial intelligence. It addresses the problems of a small corpus and unsatisfactory trained models, and achieves accelerated inference and fast training.

Pending Publication Date: 2022-01-04
INNER MONGOLIA UNIV OF TECH
0 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

[0003] Mongolian is a low-resource language, so the Mongolian-Chinese parallel corpus is small and models trained on it perform poorly.

Examples

Detailed Description of the Embodiments

[0021] The implementation of the present invention will be described in detail below in conjunction with the drawings and examples.

[0022] As shown in Figure 1, a Mongolian-Chinese translation method that combines the Meta-KD framework model and fine-grained compression includes:

[0023] Step 1: Perform data preprocessing and data set division on the Chinese, English, and Mongolian corpora to obtain the Chinese-English training, validation, and test sets, as well as the Mongolian-Chinese training, validation, and test sets. The details are as follows:

[0024] 1) For the Chinese corpus, segment the text at word granularity, drawing on the byte pair encoding (BPE) data compression algorithm, insert spaces between the words to separate them, and write the result to a new text file;

[0025] 2) For the English corpus, use the English preprocessing too...
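The preprocessing in Step 1 can be illustrated with a minimal sketch. It is not the patent's implementation: the toy corpus, the number of merges, the split fractions, and all function names (`learn_bpe_merges`, `merge_pair`, `segment`, `split_dataset`) are assumptions made for illustration. It learns a few BPE-style merge rules from unsegmented Chinese text, joins the resulting pieces with spaces, and divides parallel sentence pairs into training, validation, and test sets.

```python
import random
from collections import Counter


def merge_pair(symbols, pair):
    """Merge every adjacent occurrence of `pair` in a tuple of symbols."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe_merges(texts, num_merges):
    """Learn up to `num_merges` merge rules, starting from single characters."""
    vocab = Counter(tuple(t) for t in texts)
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pair_counts[(a, b)] += freq
        if not pair_counts:
            break
        # Most frequent pair wins; ties are broken deterministically by the pair itself.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(sentence, merges):
    """Apply the learned merges in order and separate the pieces with spaces."""
    symbols = tuple(sentence)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return " ".join(symbols)


def split_dataset(pairs, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle parallel sentence pairs and split them into train/val/test."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n_val = int(len(pairs) * val_frac)
    n_test = int(len(pairs) * test_frac)
    val = pairs[:n_val]
    test = pairs[n_val:n_val + n_test]
    train = pairs[n_val + n_test:]
    return train, val, test


# Hypothetical toy corpus, for illustration only.
toy_corpus = ["机器翻译", "机器学习", "翻译模型"]
merges = learn_bpe_merges(toy_corpus, num_merges=2)
print(segment("机器翻译模型", merges))
```

In practice the merge rules would be learned on the full training corpus only and then applied unchanged to the validation and test sets, so that no test-set statistics leak into the vocabulary.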

Abstract

The invention discloses a Mongolian-Chinese machine translation method combining a Meta-KD framework and fine-grained compression. The method comprises the following steps: performing data preprocessing and data set division on the Chinese, English, and Mongolian corpora; learning Chinese-English translation with the Meta-KD framework by training a BERT language model; learning a student model under the guidance of a meta-teacher according to a meta-distillation algorithm, to obtain transferable knowledge for Mongolian-Chinese translation; and, in combination with a fine-grained compression method, training and validating the student model on Mongolian-Chinese translation. Training through the Meta-KD framework makes the method better suited to low-resource languages and yields more accurate translations. The fine-grained compression method compresses the information representation at a fine granularity using information entropy, which speeds up training and accelerates model inference.
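Two quantities the abstract relies on, a teacher-guided distillation loss and information entropy, can be sketched as follows. This is a generic soft-target distillation loss, not the patent's meta-distillation algorithm, and the excerpt does not say how entropy drives the compression. The function names, the temperature value, and the example logits are all assumptions for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy of a probability distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits over a three-token vocabulary, for illustration only.
teacher = [2.0, 1.0, 0.1]
student = [1.5, 1.2, 0.3]
print(distillation_loss(student, teacher))
print(entropy(softmax(teacher)))
```

The loss is zero when the student reproduces the teacher's softened distribution and grows as the two diverge. Entropy gives one plausible signal for deciding where a representation carries little information, though the patent's actual compression criterion may differ.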

Description

Technical field

[0001] The invention belongs to the technical field of artificial intelligence and relates to machine translation, in particular to a Mongolian-Chinese machine translation method combining a Meta-KD framework and fine-grained compression.

Background

[0002] Machine translation is the process of using a computer to automatically convert one natural language (the source language) into another natural language (the target language) with the same meaning. Its quality depends on the size and quality of the parallel corpora.

[0003] Mongolian is a low-resource language, so the Mongolian-Chinese parallel corpus is small and models trained on it perform poorly.

Summary of the invention

[0004] In order to overcome the shortcomings of the above-mentioned prior art, the object of the present invention is to provide a Mongolian-Chinese machine translation method combining Meta-KD framework and fine-grained compression to solve the level of Mongolian-Chinese machine tra...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/58, G06F40/30, G06F16/35, G06N3/08
CPC: G06F40/58, G06F40/30, G06F16/35, G06N3/084
Inventor: 苏依拉, 郭晨雨, 韩春辉, 仁庆道尔吉, 吉亚图
Owner: INNER MONGOLIA UNIV OF TECH