Mongolian and Chinese neural machine translation method based on transfer learning strategy

A machine translation technique based on transfer learning, applied in the field of neural machine translation, addresses the shortage of parallel corpus data. It alleviates data sparseness in machine translation, offers wide coverage, and is simple and feasible to implement.

Inactive Publication Date: 2018-11-16
INNER MONGOLIA UNIV OF TECH
0 Cites, 42 Cited by

AI Technical Summary

Problems solved by technology

This solves the problem of insufficient Mongolian-Chinese parallel corpus and achi

Examples

Embodiment Construction

[0043] The implementation of the present invention will be described in detail below in conjunction with the drawings and examples.

[0044] The present invention is a Mongolian-Chinese neural machine translation method based on a transfer learning strategy. It is implemented as follows:

[0045] 1. Data preprocessing of the corpus

[0046] Data preprocessing includes Chinese word segmentation and English data preprocessing. The Chinese corpus is segmented using stanford-segmenter, the open-source word segmentation tool from Stanford University's natural language processing group; the English corpus is preprocessed using the English preprocessing tool stanford-ner. The underlying principle is the conditional random field (CRF), a conditional probability model derived mainly from the maximum entropy model. This model is an undirected graphical model that finds the conditional probability of the output node according to a given ...
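To make the CRF idea concrete, the following is a minimal sketch of a linear-chain CRF used for character-level segmentation. It is not the stanford-segmenter implementation: the two-tag scheme (`B`, `I`), the feature weights in `emit` and `trans`, and the example sentence are all hypothetical, and the normalizer is computed by brute-force enumeration, which is only practical for very short inputs.

```python
import itertools
import math

TAGS = ("B", "I")  # B = character begins a word, I = character continues a word


def score(chars, tags, emit, trans):
    """Unnormalized log-score: sum of emission and transition weights."""
    total = 0.0
    for i, (ch, tag) in enumerate(zip(chars, tags)):
        total += emit.get((ch, tag), 0.0)
        if i > 0:
            total += trans.get((tags[i - 1], tag), 0.0)
    return total


def conditional_probability(chars, tags, emit, trans):
    """p(tags | chars) = exp(score) / Z, with Z summed over all tag sequences."""
    z = sum(
        math.exp(score(chars, cand, emit, trans))
        for cand in itertools.product(TAGS, repeat=len(chars))
    )
    return math.exp(score(chars, tags, emit, trans)) / z


def best_tags(chars, emit, trans):
    """Most probable tag sequence, found by exhaustive search."""
    return max(
        itertools.product(TAGS, repeat=len(chars)),
        key=lambda cand: score(chars, cand, emit, trans),
    )


def segment(chars, tags):
    """Group characters into words according to their B/I tags."""
    words = []
    for ch, tag in zip(chars, tags):
        if tag == "B" or not words:
            words.append(ch)
        else:
            words[-1] += ch
    return words


# Hypothetical weights chosen by hand for illustration; a real model learns them.
chars = "我爱北京"
emit = {("我", "B"): 2.0, ("爱", "B"): 2.0, ("北", "B"): 2.0, ("京", "I"): 2.0}
trans = {("B", "I"): 0.5, ("I", "I"): -1.0}

best = best_tags(chars, emit, trans)
print(segment(chars, best))
print(round(conditional_probability(chars, best, emit, trans), 3))
```

A production segmenter would replace the exhaustive search with dynamic programming (forward algorithm for the normalizer, Viterbi for decoding) and use far richer features.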

Abstract

The invention addresses the low quality and poor results of current Mongolian-Chinese machine translation. Because Mongolian is a low-resource language, a large Mongolian-Chinese parallel bilingual corpus is difficult to collect. Transfer learning is a strategy that uses existing knowledge to solve problems in different but related fields. The method comprises the following steps: first, training on a large-scale English-Chinese parallel corpus within a neural machine translation framework; second, transferring the translation model parameters trained on the large-scale English-Chinese parallel corpus into the Mongolian-Chinese neural machine translation framework, and training a neural machine translation model on the existing Mongolian-Chinese parallel corpus; and finally, comparing and evaluating the translations produced by the neural machine translation model based on the transfer learning strategy against those produced by a statistical machine translation system, in terms of BLEU score and the fluency of the translated text. Controlled-variable experiments show that the transfer learning strategy effectively improves Mongolian-Chinese machine translation performance.
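The parameter-transfer step can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the model is a toy dictionary of weight matrices, the names `init_model` and `transfer_parameters` are hypothetical, and the choice to skip the source-side embedding (because the English and Mongolian vocabularies differ) while reusing everything else, including the Chinese target embedding, is an assumption the source does not spell out.

```python
import random


def init_model(src_vocab, tgt_vocab, dim, seed):
    """Build a toy parameter dictionary standing in for an encoder-decoder model."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "src_embedding": matrix(len(src_vocab), dim),  # source-language specific
        "encoder": matrix(dim, dim),
        "decoder": matrix(dim, dim),
        "tgt_embedding": matrix(len(tgt_vocab), dim),  # Chinese side in both pairs
    }


def transfer_parameters(parent, child, skip=("src_embedding",)):
    """Copy each parent parameter into the child, except the names in `skip`."""
    transferred = []
    for name, value in parent.items():
        if name in skip:
            continue
        child[name] = [row[:] for row in value]  # copy rows so models stay independent
        transferred.append(name)
    return transferred


# Hypothetical toy vocabularies; real systems use tens of thousands of entries.
en_vocab = ["the", "cat", "sat"]
mn_vocab = ["mn_tok_0", "mn_tok_1", "mn_tok_2", "mn_tok_3"]
zh_vocab = ["猫", "坐", "了"]

# Stand-in for a model already trained on the English-Chinese corpus.
parent = init_model(en_vocab, zh_vocab, dim=4, seed=0)
# Freshly initialized Mongolian-Chinese model.
child = init_model(mn_vocab, zh_vocab, dim=4, seed=1)

moved = transfer_parameters(parent, child)
print(moved)
```

After the copy, the child model would continue training on the low-resource parallel corpus; that fine-tuning loop is omitted here.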

Description

Technical field

[0001] The invention belongs to the technical field of neural machine translation, and in particular relates to a Mongolian-Chinese neural machine translation method based on a transfer learning strategy.

Background

[0002] Machine translation refers to the process of using a machine (computer) to automatically convert text in one natural language into another natural language with exactly the same meaning. In recent years, as international exchanges have become more frequent, machine translation has played an increasingly important role in people's work and daily life as a means of breaking through language barriers. As a data-driven approach to machine translation, neural machine translation depends heavily on the scale and quality of parallel corpus data. Due to the large scale of neural network parameters, neural machine translation will significantly exceed the translation quality of statistical machine translat...

Application Information

IPC(8): G06F17/28
CPC: G06F40/58
Inventor: 苏依拉, 赵亚平, 牛向华
Owner: INNER MONGOLIA UNIV OF TECH