Method for constructing Mongolian-Chinese parallel corpora by utilizing generative adversarial network to improve Mongolian-Chinese translation quality

A technology relating to parallel corpora and translation quality, applied in the field of machine translation. It addresses the severe shortage of Mongolian-Chinese parallel data sets and the inability of NMT to guarantee the naturalness, sufficiency and accuracy of translation results.

Active Publication Date: 2019-12-20
INNER MONGOLIA UNIV OF TECH
8 Cites, 24 Cited by

AI Technical Summary

Problems solved by technology

[0006] In order to overcome the shortcomings of the above-mentioned prior art, the purpose of the present invention is to provide a method for improving the quality of Mongolian-Chinese translation by using a generative con


Examples

Embodiment Construction

[0062] The implementation of the present invention will be described in detail below in conjunction with the drawings and examples.

[0063] The invention uses a generative adversarial network to construct a Mongolian-Chinese parallel corpus and thereby improve Mongolian-Chinese translation quality. It mainly comprises the construction of an encoder and a decoder, and the construction of a discriminator model.

[0064] Figure 1 shows the optimized Transformer architecture. On the basis of the original Transformer, a gated linear unit is first added to retain the important information in source language sentences and discard redundant information. Second, a branch structure is added to capture the diverse semantic information between source language sentences. Finally, a capsule network is added to the branch structure and after the third layer normalization, so that the encoder can capture the exact position of the word in the source la...
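The gating idea in [0064] can be illustrated with a minimal sketch of a gated linear unit. This is the generic formulation (a linear projection multiplied elementwise by a sigmoid gate), not necessarily the exact layer used in the patent; the function name `gated_linear_unit` and its parameter names are hypothetical and chosen only for this illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]  # one row of weights per output unit


def _affine(x: Vector, weights: Matrix, bias: Vector) -> List[float]:
    # Plain matrix-vector product plus bias.
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def _sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gated_linear_unit(x: Vector,
                      w_value: Matrix, b_value: Vector,
                      w_gate: Matrix, b_gate: Vector) -> List[float]:
    """Generic GLU: (x W_v + b_v) * sigmoid(x W_g + b_g), elementwise.

    Assumption: the patent's exact gating layer is not given in this text.
    """
    values = _affine(x, w_value, b_value)
    gates = [_sigmoid(z) for z in _affine(x, w_gate, b_gate)]
    # A gate near 1 passes the feature through; a gate near 0 discards it.
    return [v * g for v, g in zip(values, gates)]
```

With a gate pre-activation of zero every feature is scaled by 0.5, while a strongly negative gate bias suppresses the feature entirely, which is the sense in which the unit can drop redundant information.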

Abstract

The invention discloses a method for constructing Mongolian-Chinese parallel corpora by utilizing a generative adversarial network to improve Mongolian-Chinese translation quality. The generative adversarial network comprises a generator and a discriminator. The generator uses a hybrid encoder to encode a Mongolian source language sentence into a vector representation, and converts that representation into a Chinese target language sentence using a decoder based on a bidirectional Transformer combined with a sparse attention mechanism. In this way, Mongolian sentences and more Mongolian-Chinese parallel corpora which are closer to human translation are generated. The discriminator judges the difference between the Chinese sentence generated by the generator and the human translation. Adversarial training is performed on the generator and the discriminator until the discriminator considers the Chinese sentence generated by the generator to be very similar to the human translation. This yields a high-quality Mongolian-Chinese machine translation system and a large number of Mongolian-Chinese parallel data sets, and Mongolian-Chinese translation is then performed using that system. The method addresses the problems that Mongolian-Chinese parallel data sets are severely deficient and that NMT cannot guarantee the naturalness, sufficiency and accuracy of translation results.
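The alternating training described in the abstract can be outlined as a control-flow sketch. The generator and discriminator are passed in as plain callables, so nothing here reflects the patent's actual models, loss functions or update rules. All names (`adversarial_training`, `translate`, `human_likeness`, `stop_score`) are hypothetical, and the stopping rule (mean discriminator score reaching a threshold) is an assumed reading of "the discriminator considers the generated sentence very similar to the human translation".

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]  # (source sentence, translation)


def adversarial_training(
    human_pairs: Sequence[Pair],
    translate: Callable[[str], str],
    human_likeness: Callable[[str, str], float],
    update_discriminator: Callable[[Sequence[Pair], Sequence[Pair]], None],
    update_generator: Callable[[Sequence[Pair], Sequence[float]], None],
    stop_score: float = 0.5,
    max_rounds: int = 100,
) -> List[float]:
    """Alternate discriminator and generator updates; return mean scores per round."""
    if not human_pairs:
        raise ValueError("need at least one human-translated pair")
    history: List[float] = []
    for _ in range(max_rounds):
        # Generator step 1: translate every source sentence.
        generated = [(src, translate(src)) for src, _ in human_pairs]
        # Discriminator learns to tell human pairs from generated pairs.
        update_discriminator(human_pairs, generated)
        # Discriminator scores how human-like each generated pair looks.
        scores = [human_likeness(src, hyp) for src, hyp in generated]
        mean_score = sum(scores) / len(scores)
        history.append(mean_score)
        # Assumed stopping rule: generated output is judged human-like enough.
        if mean_score >= stop_score:
            break
        # Generator step 2: use the discriminator's scores as feedback.
        update_generator(generated, scores)
    return history
```

Once the loop stops, the `(source, translation)` pairs produced by `translate` are the candidates for the additional parallel corpus the abstract refers to.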

Description

Technical field

[0001] The invention belongs to the technical field of machine translation, and in particular relates to a method for improving Mongolian-Chinese translation quality by using a generative adversarial network to construct Mongolian-Chinese parallel corpora.

Background technique

[0002] Machine translation, which uses computers to automatically translate one language into another, is one of the most powerful means of overcoming language barriers. In recent years, many large search companies and service centers such as Google and Baidu have conducted large-scale research on machine translation and made important contributions to obtaining high-quality machine translations, so translation between major languages is now close to the level of human translation, and millions of people communicate across language barriers using online translation systems and mobile apps. In the wave of deep learning in recent years, machine translation ha...

Application Information

IPC(8): G06F17/28, G06F17/27, G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/044, G06N3/045
Inventor: 苏依拉, 孙晓骞, 王宇飞, 赵亚平, 张振, 高芬, 贺玉玺, 王昊
Owner INNER MONGOLIA UNIV OF TECH