Semi-supervised Mongolian-Chinese neural machine translation method based on collaborative training

A machine translation technology based on collaborative training, applied in the field of artificial intelligence. It addresses the poor performance of neural machine translation, and the resulting low-quality translation results, caused by the lack of high-quality, large-scale, wide-coverage parallel corpora. Its effect is to alleviate corpus scarcity and improve translation accuracy.

Active Publication Date: 2020-07-14
INNER MONGOLIA UNIV OF TECH

AI Technical Summary

Problems solved by technology

However, due to the lack of high-quality, large-scale, wide-coverage bilingual parallel corpora for languages such as Mongolian, neural machine translation performs poorly on Mongolian-Chinese translation and cannot produce good-quality translation results.

Examples

Embodiment Construction

[0050] The invention provides a semi-supervised Mongolian-Chinese neural machine translation method based on collaborative training. Collaborative training is a method of gradually expanding the original corpus by making rational use of existing monolingual corpora when the original parallel corpus is scarce. The present invention first uses Mongolian-Chinese (mo-ch), English-Chinese (en-ch) and Korean-Chinese (ko-ch) parallel corpora to construct three initial translation models based on semi-supervised classification generative adversarial networks: the Mongolian-Chinese translation model M-mo-ch, the English-Chinese translation model M-en-ch and the Korean-Chinese translation model M-ko-ch. These three translation models are then used to label the multi-source parallel corpus Mongolian-English-Korean (mo-en-ko) with Chinese (ch) on the target side, and the best-quality labeled corpus is selected by using the language model LM-ch traine...
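
The selection step described above can be illustrated with a minimal sketch. The text does not say what kind of language model LM-ch is, so the add-one-smoothed bigram model below is only an assumed stand-in, and the names `BigramLM` and `select_best` are hypothetical. It scores each candidate Chinese labeling by length-normalised log-probability and keeps the most fluent one.

```python
import math
from collections import Counter


class BigramLM:
    """Toy add-one-smoothed bigram model (an assumed stand-in for LM-ch)."""

    def __init__(self, sentences):
        self.context_counts = Counter()
        self.bigram_counts = Counter()
        self.vocab = {"<s>", "</s>"}
        for tokens in sentences:
            padded = ["<s>"] + list(tokens) + ["</s>"]
            self.vocab.update(tokens)
            self.context_counts.update(padded[:-1])
            self.bigram_counts.update(zip(padded, padded[1:]))

    def score(self, tokens):
        """Length-normalised log-probability; higher means more fluent."""
        padded = ["<s>"] + list(tokens) + ["</s>"]
        vocab_size = len(self.vocab)
        log_prob = 0.0
        for prev, cur in zip(padded, padded[1:]):
            numerator = self.bigram_counts[(prev, cur)] + 1
            denominator = self.context_counts[prev] + vocab_size
            log_prob += math.log(numerator / denominator)
        return log_prob / (len(padded) - 1)


def select_best(candidates, lm):
    """Return the candidate labeling that the language model scores highest."""
    return max(candidates, key=lm.score)


# Toy Chinese monolingual data, already tokenised (illustrative only).
lm_ch = BigramLM([["我", "爱", "你"], ["我", "爱", "北京"]])
# Hypothetical outputs of the three translation models for one source item.
best = select_best([["你", "爱", "我"], ["我", "爱", "你"]], lm_ch)
```

Normalising by length keeps the model from simply preferring shorter candidates; a real system would likely use a much stronger language model.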

Abstract

At present, an encoder-decoder structure is commonly used in neural machine translation, and it achieves good results when parallel corpora are sufficient. However, Mongolian is a low-resource language: Mongolian-Chinese parallel corpus resources are limited and very difficult to obtain. The invention therefore provides a semi-supervised Mongolian-Chinese neural machine translation method based on collaborative training. Three translation models are constructed using a semi-supervised classification generative adversarial network: a Mongolian-Chinese translation model M-mo-ch, an English-Chinese translation model M-en-ch and a Korean-Chinese translation model M-ko-ch. The three translation models are used to label the multi-source parallel corpus Mongolian-English-Korean (mo-en-ko) with Chinese on the target side; the best-quality labeled corpus is selected using a language model LM-ch trained on Chinese monolingual data and is used to expand the original corpus; and a better translation model is then retrained on the expanded corpus. The method combines collaborative training with the semi-supervised classification generative adversarial network, applies them to Mongolian-Chinese neural machine translation, and improves the quality of the Mongolian-Chinese neural machine translation model.
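
The overall loop in the abstract (train three models, label the multi-source corpus, keep the best labels, retrain) might be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function `co_train`, its parameters, and the lookup-table "models" are hypothetical, the generative adversarial training is replaced by an arbitrary `train` callable, and adding each accepted label to all three corpora is an assumption about how the expansion is done.

```python
def co_train(parallel, multi_source, train, lm_score, rounds=1, top_k=1):
    """Expand per-language parallel corpora by collaborative labeling.

    parallel:     dict lang -> list of (source, chinese) pairs
    multi_source: list of dicts, e.g. {"mo": ..., "en": ..., "ko": ...}
    train:        callable(pairs) -> translate(source) -> chinese
    lm_score:     callable(chinese) -> float, higher is better
    """
    corpora = {lang: list(pairs) for lang, pairs in parallel.items()}
    pool = list(multi_source)
    for _round in range(rounds):
        if not pool:
            break
        models = {lang: train(pairs) for lang, pairs in corpora.items()}
        scored = []
        for item in pool:
            # Each model labels its own side of the multi-source item.
            candidates = [models[lang](item[lang]) for lang in models]
            best = max(candidates, key=lm_score)
            scored.append((lm_score(best), item, best))
        scored.sort(key=lambda entry: entry[0], reverse=True)
        # Assumption: accepted labels expand all three corpora.
        for _score, item, chinese in scored[:top_k]:
            for lang in corpora:
                corpora[lang].append((item[lang], chinese))
        pool = [item for _score, item, _chinese in scored[top_k:]]
    final_models = {lang: train(pairs) for lang, pairs in corpora.items()}
    return final_models, corpora


def train_lookup(pairs):
    """Toy 'model': memorises its training pairs (illustrative only)."""
    table = dict(pairs)
    return lambda source: table.get(source, "<unk>")


def toy_lm_score(chinese):
    """Toy 'language model': penalises unknown output (illustrative only)."""
    return -1.0 if chinese == "<unk>" else 0.0


seed = {
    "mo": [("mo-1", "你好")],
    "en": [("hello", "你好"), ("thanks", "谢谢")],
    "ko": [("ko-1", "你好")],
}
unlabelled = [{"mo": "mo-2", "en": "thanks", "ko": "ko-2"}]
models, corpora = co_train(seed, unlabelled, train_lookup, toy_lm_score)
```

In this toy run only the English-Chinese model can label the new item, and its output is then added to the Mongolian-Chinese corpus, which is the mechanism by which the better-resourced language pairs help the low-resource one.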

Description

Technical field

[0001] The invention belongs to the technical field of artificial intelligence and relates to machine translation, in particular to a semi-supervised Mongolian-Chinese neural machine translation method based on collaborative training.

Background

[0002] Machine translation (MT) refers to the process of using computers to automatically convert text from one natural language (the source language) into another natural language (the target language) with exactly the same meaning.

[0003] In recent years, although neural machine translation has gradually replaced traditional statistical machine translation, the performance of translation systems remains highly dependent on the quality, scale and domain coverage of parallel corpora. However, due to the lack of high-quality, large-scale, wide-coverage Mongolian-Chinese bilingual parallel corpora, neural machine translation does not perform well in the Mongolian-Chinese translation model, and it is ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/58, G06N3/08
CPC: G06N3/08
Inventor: 仁庆道尔吉, 文丽霞, 苏依拉, 刘永超, 庞蕊
Owner: INNER MONGOLIA UNIV OF TECH