Mongolian-Chinese machine translation method for enhancing semantic feature information based on Transformers

A machine translation technique based on semantic features, applied in the field of machine translation. It addresses problems such as information loss, and it improves translation quality, makes semantic features easier to capture, and reduces sparsity.

Inactive Publication Date: 2019-03-19
INNER MONGOLIA UNIV OF TECH

AI Technical Summary

Problems solved by technology

In CNN-based models, words that are far apart can only interact at higher-level CNN nodes, and more information may be lost in the process.


Embodiment Construction

[0044] The implementation of the present invention is described in detail below with reference to the drawings and examples.

[0045] The Transformer-based Mongolian-Chinese machine translation method of the present invention first preprocesses the Mongolian corpus. Taking word-vector similarity models generated with word2vec as the research background, it then combines the influence of depth, density, and semantic coincidence on concept semantic similarity. A similarity algorithm that integrates semantic distance and information content establishes a similarity matrix. Principal component analysis then converts the similarity matrix into a principal component transformation matrix; the contribution rate of each principal component is calculated and used as a weight to obtain the final conceptual semantic similarity. Finally, the Transformer model is adopted in the translation process, thereby completely r...
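The weighting step described in [0045] can be sketched roughly as follows. This is a minimal illustration, not the patented procedure: it assumes each row of the input holds one word pair and each column holds one similarity factor (for example depth, density, and semantic coincidence), and the function name `pca_weighted_similarity` is hypothetical. It uses NumPy to compute the principal components, takes each component's share of the total variance as its contribution rate, and uses those rates as weights.

```python
import numpy as np

def pca_weighted_similarity(scores):
    """Combine several similarity factors using PCA contribution rates.

    scores: array-like of shape (n_pairs, n_factors); the layout is an
    assumption made for this sketch, not something the source specifies.
    Returns (contribution_rates, combined_score_per_pair).
    """
    scores = np.asarray(scores, dtype=float)
    centered = scores - scores.mean(axis=0)

    # Covariance between the factors (columns).
    cov = np.cov(centered, rowvar=False)

    # eigh returns eigenvalues in ascending order; sort them descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Eigenvector signs are arbitrary; orient each so its loadings sum >= 0.
    signs = np.where(eigvecs.sum(axis=0) < 0, -1.0, 1.0)
    eigvecs = eigvecs * signs

    # Contribution rate = share of total variance explained by a component.
    eigvals = np.clip(eigvals, 0.0, None)
    contribution = eigvals / eigvals.sum()

    # Project onto the components, then weight by contribution rate.
    components = centered @ eigvecs
    combined = components @ contribution
    return contribution, combined
```

The combined score is centered around zero, so it ranks word pairs relative to each other rather than giving an absolute similarity in [0, 1]; a rescaling step would be needed for the latter.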


Abstract

The invention provides a Mongolian-Chinese machine translation method that enhances semantic feature information based on a Transformer model. The method comprises the following steps. First, starting from the linguistic characteristics of Mongolian, the features of its stems, affixes, and case suffixes are identified and incorporated into model training. Second, taking distributed representations that measure the similarity between two words as the research background, the influence of depth, density, and semantic coincidence on concept semantic similarity is comprehensively analyzed. In the translation process, a Transformer model is adopted. The Transformer model is a multi-layer encoder-decoder architecture that performs position encoding with trigonometric functions and is built on an enhanced multi-head attention mechanism. It relies entirely on the attention mechanism to draw global dependencies between input and output, eliminating recurrence and convolution.
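The two Transformer ingredients named in the abstract, trigonometric position encoding and multi-head attention, can be sketched as below. This is the standard formulation from the general Transformer literature, not the "enhanced" variant the patent refers to, whose details are not given here. The function names, the even `d_model`, and the explicit weight matrices `w_q`, `w_k`, `w_v`, `w_o` are assumptions made for illustration.

```python
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    """Sine/cosine position encoding; assumes d_model is even."""
    positions = np.arange(seq_len)[:, None]          # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]         # (1, d_model / 2)
    angles = positions / np.power(10000.0, dims / d_model)
    enc = np.zeros((seq_len, d_model))
    enc[:, 0::2] = np.sin(angles)   # even dimensions use sine
    enc[:, 1::2] = np.cos(angles)   # odd dimensions use cosine
    return enc

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Standard multi-head self-attention over one sequence.

    x: (seq_len, d_model); each weight matrix: (d_model, d_model).
    Returns (output, attention_weights).
    """
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    def split(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)

    # Every position attends to every other position in a single step.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)
    heads = weights @ v

    # Concatenate the heads and apply the output projection.
    merged = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return merged @ w_o, weights
```

Because the attention scores connect all pairs of positions directly, distant words interact in a single layer, which is the property the method relies on in place of recurrence or convolution.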

Description

Technical field

[0001] The invention belongs to the technical field of machine translation, and in particular relates to a Transformer-based Mongolian-Chinese machine translation method that enhances semantic feature information.

Background art

[0002] Mongolian is an agglutinative language belonging to the Altaic language family. Written Mongolian includes traditional Mongolian and Cyrillic Mongolian. In the Mongolian-Chinese translation system studied here, "Mongolian" refers to traditional Mongolian, which is translated into Chinese. Traditional Mongolian is also a phonetic script. A letter does not have a single shape: its form depends on its position in the word, which may be isolated, initial, medial, or final. Mongolian words are formed as root + suffix. The affixes are divided into two categories: one is used to attach to the root to give the original word a...
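The root-plus-suffix structure described in [0002] can be illustrated with a small longest-match suffix splitter. This is only a toy sketch: the suffix list is a tiny illustrative sample written in Cyrillic Mongolian for readability (the patent itself targets traditional Mongolian script), and the names `TOY_SUFFIXES` and `split_stem_suffix` are hypothetical. A real system would need a full suffix inventory and vowel-harmony handling.

```python
# Illustrative sample only: genitive "ын", accusative "ыг", dative "д".
TOY_SUFFIXES = ("ын", "ыг", "д")

def split_stem_suffix(word, suffixes=TOY_SUFFIXES, min_stem=2):
    """Split a word into (stem, suffix) by longest matching suffix.

    Returns (word, "") when no suffix matches or the remaining stem
    would be shorter than min_stem characters.
    """
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[:-len(suffix)], suffix
    return word, ""

# "номын" (of the book) splits into the stem "ном" (book) and "ын".
print(split_stem_suffix("номын"))
```

Splitting words this way lets inflected forms share one stem token, which is one common way to reduce vocabulary sparsity for an agglutinative language.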


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/28, G06F17/27, G06N3/04, G06N3/08
CPC: G06N3/08, G06F40/216, G06F40/30, G06F40/289, G06F40/58, G06N3/044, G06N3/045
Inventor: 苏依拉, 张振, 高芬, 王宇飞, 孙晓骞, 牛向华, 赵亚平, 卞乐乐
Owner: INNER MONGOLIA UNIV OF TECH