Neural machine translation method that introduces source language block information into encoding

A machine translation and information encoding technique, applied in the field of neural machine translation, that addresses problems such as imperfect external tools and error accumulation.

Active Publication Date: 2018-01-26
沈阳雅译网络技术有限公司

AI Technical Summary

Problems solved by technology

However, because these external tools are imperfect, obtaining block information with them can introduce new errors, which then propagate through subsequent processing and accumulate.



Examples


Embodiment Construction

[0067] The present invention is described in further detail below with reference to the accompanying drawings.

[0068] The present invention provides a neural machine translation method that introduces source language chunk information into encoding, comprising the following steps:

[0069] 1) Input bilingual sentence-level parallel data and perform word segmentation on the source language and the target language respectively, obtaining word-segmented bilingual parallel sentence pairs;

[0070] 2) Use the neural machine translation system to encode the source sentence of each word-segmented bilingual parallel sentence pair in time-step order, obtaining the state of the last hidden layer at each time step, that is, the encoding information at each time step;

[0071] 3) During the encoding process, segment the input source sentence into blocks;

[0072] 4) Obtain the block encoding information of the source sentence from the state at each time step of the source sent...
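
As a rough illustration of steps 2) to 4), the sketch below turns per-time-step hidden states into one vector per block. It rests on assumptions that are not taken from the patent text: the block boundaries (`spans`) are supplied by hand rather than learned during encoding, and each block is summarized by mean pooling. The name `chunk_encodings` is hypothetical.

```python
from typing import List, Tuple

Vector = List[float]


def chunk_encodings(states: List[Vector],
                    spans: List[Tuple[int, int]]) -> List[Vector]:
    """Average the hidden states inside each half-open span [start, end).

    Mean pooling is an illustrative assumption; the source does not
    state how the per-time-step states are combined into a block.
    """
    encodings = []
    for start, end in spans:
        if not 0 <= start < end <= len(states):
            raise ValueError(f"invalid span ({start}, {end})")
        block = states[start:end]
        dim = len(block[0])
        # Component-wise mean over the time steps in this block.
        encodings.append(
            [sum(vec[d] for vec in block) / len(block) for d in range(dim)]
        )
    return encodings


# Toy hidden states for a three-word source sentence (dimension 2).
states = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0]]
# Hypothetical segmentation: words 0-1 form one block, word 2 another.
spans = [(0, 2), (2, 3)]
chunks = chunk_encodings(states, spans)
```

Here `chunks` holds one vector per block, which is the kind of block-level information that the later steps combine with the per-time-step states.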


Abstract

The present invention relates to a neural machine translation method that introduces source language block information into encoding. The method comprises: inputting bilingual sentence-level parallel data and performing word segmentation on the source language and the target language respectively to obtain word-segmented bilingual parallel sentence pairs; encoding the source sentence of each word-segmented pair in time-step order to obtain the state of the last hidden layer at each time step, and segmenting the input source sentence into blocks; obtaining the block encoding information of the source sentence from the state at each time step and the segmentation information of the source sentence; combining the time-step encoding information with the block encoding information to obtain the final source sentence memory information; and, by dynamically querying the source sentence memory information, using an attention mechanism to generate a context vector at each moment through a decoder network and extracting feature vectors for word prediction. With this method, block segmentation of the source sentence is carried out automatically, without requiring any pre-segmented sentences in training, and the method can capture the latest and best block segmentation of the source sentence.
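
The final stage of the abstract, querying the source sentence memory with attention to produce a context vector, might be sketched as follows. Several choices here are assumptions rather than details from the source: the memory is formed by simply appending the block vectors to the per-time-step vectors as extra slots, the score is a plain dot product, and the decoder state is a fixed toy vector. The names `softmax` and `attend` are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: Vector, memory: List[Vector]) -> Tuple[List[float], Vector]:
    """Dot-product attention: return the weights and the context vector."""
    scores = [sum(q * m for q, m in zip(query, slot)) for slot in memory]
    weights = softmax(scores)
    dim = len(memory[0])
    # Context vector = attention-weighted sum of the memory slots.
    context = [
        sum(w * slot[d] for w, slot in zip(weights, memory))
        for d in range(dim)
    ]
    return weights, context


# Per-time-step encodings and block encodings (toy values).
word_states = [[1.0, 0.0], [0.0, 1.0]]
chunk_states = [[0.5, 0.5]]
# Assumed combination: both kinds of vector become slots of one memory.
memory = word_states + chunk_states

decoder_state = [0.0, 0.0]  # hypothetical decoder query at one moment
weights, context = attend(decoder_state, memory)
```

With this layout the decoder can place attention mass on a whole block as well as on individual words, which is one plausible reading of combining the two kinds of encoding information.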

Description

Technical field

[0001] The invention relates to the field of machine translation, in particular to a neural machine translation method that introduces source language chunk information into encoding.

Background

[0002] Neural machine translation usually uses a neural-network-based encoder-decoder framework to model the entire translation process end to end, and this approach has achieved the best translation performance on many different languages. The encoder network encodes the input source sentence into a fixed-dimensional vector carrying memory information, and the decoder generates the corresponding translation from the encoding vector produced by the encoder. For the encoder, the input source sentence is usually treated as a sequence of words in their order of appearance after word segmentation. As the encoder reads the source sentence, it constructs corresponding memory information for it. During ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/28, G06F17/27
Inventor: 王强, 吴开心, 肖桐, 朱靖波, 张春良
Owner: 沈阳雅译网络技术有限公司