Mongolian-Chinese translation method based on transfer learning

Transfer learning applied to Mongolian-Chinese neural machine translation can address problems such as an insufficient parallel corpus, improving translation quality and enhancing language representation.

Inactive Publication Date: 2020-01-14
INNER MONGOLIA UNIV OF TECH

AI Technical Summary

Problems solved by technology

The method addresses the shortage of Mongolian-Chinese parallel corpora and improves the performance of Mongolian-Chinese machine translation.

Embodiment Construction

[0048] The implementation of the present invention will be described in detail below in conjunction with the drawings and examples.

[0049] The Mongolian-Chinese neural machine translation prototype system of the present invention, based on a transfer learning strategy, is implemented as follows:

[0050] 1. Data preprocessing of the corpus

[0051] Data preprocessing includes Chinese word segmentation and English data preprocessing. The Chinese corpus is segmented with stanford-segmenter, the open-source word segmentation tool from the natural language processing laboratory of Stanford University; the English corpus is preprocessed with the English preprocessing tool stanford-ner. The underlying working principle is the conditional random field (CRF), a conditional probability model derived mainly from the maximum entropy model. This model is an undirected graph model that finds the conditional probability of the output node according to a...
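To make the CRF idea in [0051] concrete, the following is a minimal sketch of how a linear-chain CRF assigns a conditional probability to a tag sequence and how the best sequence yields a segmentation. It is not the stanford-segmenter implementation: the B/I tag set, the hand-set weights in EMISSION and TRANSITION, and all function names are hypothetical, and the normalization is done by brute-force enumeration, which is only feasible for very short inputs.

```python
import itertools
import math

TAGS = ("B", "I")  # B = character begins a word, I = character continues a word

# Hypothetical hand-set weights; a real CRF learns these from annotated text.
EMISSION = {("我", "B"): 2.0, ("们", "I"): 1.5, ("学", "B"): 1.0, ("习", "I"): 1.0}
TRANSITION = {("B", "I"): 0.5, ("I", "B"): 0.5, ("B", "B"): 0.0, ("I", "I"): -0.5}


def score(chars, tags):
    """Unnormalized log-score: sum of emission and transition weights."""
    total = sum(EMISSION.get((c, t), 0.0) for c, t in zip(chars, tags))
    total += sum(TRANSITION[(a, b)] for a, b in zip(tags, tags[1:]))
    return total


def conditional_probabilities(chars):
    """Return p(tags | chars) for every tag sequence, normalized by brute force."""
    sequences = list(itertools.product(TAGS, repeat=len(chars)))
    weights = [math.exp(score(chars, seq)) for seq in sequences]
    z = sum(weights)  # partition function Z(x)
    return {seq: w / z for seq, w in zip(sequences, weights)}


def segment(chars):
    """Split chars into words using the most probable tag sequence."""
    probs = conditional_probabilities(chars)
    best = max(probs, key=probs.get)
    words = []
    for c, t in zip(chars, best):
        if t == "B" or not words:
            words.append(c)  # start a new word
        else:
            words[-1] += c   # extend the current word
    return words


print(segment("我们学习"))
```

Practical CRF taggers replace the enumeration with dynamic programming (the forward algorithm for Z(x) and Viterbi for the best sequence); the enumeration here is kept only because it mirrors the definition of the conditional probability directly.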

Abstract

The method addresses the low translation quality and poor translation results of existing Mongolian-Chinese machine translation. Mongolian is a low-resource language, and collecting a large Mongolian-Chinese parallel bilingual corpus is extremely difficult; the method tackles this problem by combining transfer learning with prior knowledge. Transfer learning uses existing knowledge to solve problems in different but related fields. The method comprises the following steps: first, a model is trained on large-scale English-Chinese parallel corpora within a neural machine translation framework; second, the parameter weights of the translation model trained on the large-scale English-Chinese parallel corpora are transferred into a Mongolian-Chinese neural machine translation framework; third, the rich lexical, syntactic and other related knowledge representations obtained from large-scale corpus training are fused into the Mongolian-Chinese neural machine translation model; and finally, the neural machine translation model is trained on the existing Mongolian-Chinese parallel corpus.
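The parameter-transfer step described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: parameters are plain nested lists, the layer names and sizes are hypothetical, and the choice to leave the source-side embedding freshly initialized (because the Mongolian vocabulary differs from the English one) is an assumption of this sketch rather than something the source specifies. Fine-tuning on the Mongolian-Chinese corpus would follow the transfer and is not shown.

```python
import random


def init_params(shapes, seed):
    """Randomly initialize a parameter dict: name -> 2-D list of floats."""
    rng = random.Random(seed)
    return {
        name: [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
        for name, (rows, cols) in shapes.items()
    }


def shape_of(matrix):
    """Return (rows, cols) of a non-empty 2-D list."""
    return (len(matrix), len(matrix[0]))


def transfer(parent, child, skip=("src_embedding",)):
    """Copy parent weights into child where name and shape match.

    Names listed in `skip` keep their fresh initialization.
    Returns the list of parameter names that were copied.
    """
    copied = []
    for name, weights in parent.items():
        if name in skip or name not in child:
            continue
        if shape_of(weights) != shape_of(child[name]):
            continue
        child[name] = [row[:] for row in weights]  # deep copy, parent stays intact
        copied.append(name)
    return copied


# Hypothetical layer names and sizes, chosen only for illustration.
parent_shapes = {"src_embedding": (50, 8), "encoder": (8, 8),
                 "decoder": (8, 8), "tgt_embedding": (40, 8)}
child_shapes = {"src_embedding": (30, 8), "encoder": (8, 8),
                "decoder": (8, 8), "tgt_embedding": (40, 8)}

parent = init_params(parent_shapes, seed=0)  # stands in for the trained English-Chinese model
child = init_params(child_shapes, seed=1)    # freshly initialized Mongolian-Chinese model
copied = transfer(parent, child)
print(copied)
```

Matching on both name and shape keeps the sketch safe when the two models differ in vocabulary size; any layer that does not line up simply stays at its initial values and is learned during fine-tuning.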

Description

Technical field

[0001] The invention belongs to the technical field of neural machine translation, and in particular relates to a Mongolian-Chinese mutual translation method based on transfer learning.

Background

[0002] Machine translation refers to the process of using a machine (computer) to automatically convert one natural language into another natural language with the same meaning. In recent years, as international exchanges have become more frequent, machine translation has played an increasingly important role in people's production and life as an important means of breaking through language barriers. As one of the data-driven approaches to machine translation, neural machine translation depends heavily on the scale and quality of the parallel corpus data. Because neural networks have a large number of parameters, neural machine translation will significantly exceed the translation quality of statistical machine translation only when the tr...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/58, G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/044, G06N3/045
Inventor: 苏依拉, 赵亚平, 牛向华, 孙晓骞, 王宇飞, 高芬, 张振
Owner: INNER MONGOLIA UNIV OF TECH