Automatic editing-after-translating system and method for multisource neural network based on splicing-remixing mode

A post-translation editing and neural network technology, applied in the fields of natural language processing and machine translation, that addresses problems such as missing translations and improves translation fidelity and overall translation quality.

Active Publication Date: 2017-10-27
BEIJING INSTITUTE OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to propose a multi-source neural network post-translation editing system and method based on splicing

Examples

Example Embodiment

[0058] Example 1

[0059] With reference to Figure 1, this embodiment describes the detailed composition of the multi-source neural network post-editing system and method based on splicing and remixing of the present invention, together with its training and decoding processes.

[0060] As Figure 1 shows, the training module is connected to the decoding module.

[0061] The training process of the training module includes the following steps:

[0062] Step A: Collect all the corpus needed in the training process of this system;

[0063] The corpora mainly comprise the training source corpus and the reference translation corpus, which together form a parallel corpus. Assume N = 600,000, that is, the training source corpus has 600,000 sentences;

[0064] Training source corpus, denoted as: {source_1, source_2, …, source_600000},

[0065] Training target corpus, denoted as: {ref_1, ref_2, …, ref_600000},

[0066] The preliminary translation res...

Example Embodiment

[0093] Example 2

[0094] This embodiment uses specific sentences as examples to illustrate the effects of the system and method.

[0095] In a specific example, translation quality is reflected intuitively in fidelity and fluency; here, the improvement in fidelity shows up as more accurate word choice.

[0096] Assume that the source text, rendered here in English, reads "However, the challenges of the past are not limited to subsidizing public housing. Private housing is also full of major challenges."

[0097] The preliminary machine translation system is the Moses statistical machine translation system. Its translation result is "however, the pastchallenge, not in the funding of public housing, private housing is full of challenge." In this sentence, the keyword "subsidizing" in the source text is translated as "funding", which means "to provide funds for..."; it lacks the sense of assistance and is not accurate enough. At the same time, the sentence pattern of the original translat...

Example Embodiment

[0100] Example 3

[0101] This embodiment demonstrates, in a statistical sense, the advantage in overall translation quality of the present system and method over a single-source neural network post-editing system that is trained directly with the preliminary translation result as the source language, without adding the source text.

[0102] Assume that the training source and reference translation dataset used by the training module has 600,000 sentences, and that the source dataset used for testing has 1,597 sentences. The preliminary machine translation system is the Moses statistical machine translation system, and scoring uses the multi-bleu script. The BLEU value represents the overall translation quality, and the 1-gram to 4-gram scores are the quantitative indicators of fidelity and fluency. The specific scores are described in Table 1 below:

[0103] Table 1: The statistical compari...
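The BLEU value and the 1-gram to 4-gram scores mentioned in [0102] can be illustrated with a short sketch. This is a simplified, single-reference, whitespace-tokenized corpus-level BLEU written for illustration; it is not the multi-bleu script itself, and the function names `ngram_counts` and `corpus_bleu` are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hyps: List[str], refs: List[str], max_n: int = 4) -> Tuple[float, List[float]]:
    """Return (BLEU, [1-gram .. max_n-gram precisions]) for one reference per sentence.

    Assumption: sentences are already tokenized and separated by whitespace.
    """
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hyps, refs):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngram_counts(h, n), ngram_counts(r, n)
            # Clipped matches: an n-gram is credited at most as often as it occurs in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    precisions = [m / t if t else 0.0 for m, t in zip(matches, totals)]
    if min(precisions) == 0.0:
        return 0.0, precisions
    # Brevity penalty punishes hypotheses shorter than the references.
    bp = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    bleu = bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
    return bleu, precisions
```

Roughly speaking, the low-order precisions reward correct word choice (fidelity), while the higher-order ones reward correct word sequences (fluency), which is why the 1-gram to 4-gram scores are reported alongside the overall BLEU value.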

Abstract

The invention discloses an automatic post-editing system and method for a multi-source neural network based on a splicing-remixing mode, belonging to the technical fields of computer natural language processing and machine translation. The system comprises a training module and a decoding module, and the method is divided into a training process and a decoding process. The training process builds on a traditional neural machine translation model: the source corpus is replaced with a new corpus generated by simple sentence splicing and remixing of the translation source text and the preliminary translation result, and the target corpus is replaced with the reference translation, doubled. In this way the preliminary translation result and the translation source text assist each other during training and provide cross verification. In the decoding process, the trained system is used directly to decode the source corpus obtained by splicing each translation source sentence with its corresponding preliminary translation result. The resulting translation is better in fluency, accuracy and overall quality than the preliminary translation result without post-editing.
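The data preparation described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `splice_and_remix` and `splice_for_decoding`, the whitespace separator, and the reading of "remixing" as pairing each source sentence with its preliminary translation in both orders are all assumptions. The abstract itself only states that the two are spliced and remixed and that the reference translation is doubled.

```python
from typing import List, Tuple

SEP = " "  # assumption: the two sentences are joined by plain whitespace


def splice_and_remix(
    sources: List[str], prelim: List[str], refs: List[str]
) -> Tuple[List[str], List[str]]:
    """Build a 2N-line training corpus from N (source, preliminary translation, reference) triples.

    Assumed reading of "splicing and remixing": each pair is concatenated in
    both orders, so the reference side has to be repeated (doubled) to stay parallel.
    """
    if not (len(sources) == len(prelim) == len(refs)):
        raise ValueError("the three corpora must be line-aligned")
    new_source = [src + SEP + mt for src, mt in zip(sources, prelim)]
    new_source += [mt + SEP + src for src, mt in zip(sources, prelim)]
    new_target = list(refs) + list(refs)  # reference translation, doubled
    return new_source, new_target


def splice_for_decoding(sources: List[str], prelim: List[str]) -> List[str]:
    """At decoding time, splice each source sentence with its own preliminary translation."""
    if len(sources) != len(prelim):
        raise ValueError("source text and preliminary translation must be line-aligned")
    return [src + SEP + mt for src, mt in zip(sources, prelim)]
```

With N = 600,000 sentence pairs, this reading yields 1,200,000 training lines on each side, which matches the statement that the target corpus is the reference translation doubled.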

Description

Technical field

[0001] The invention relates to a multi-source neural network post-translation editing system and method based on splicing and remixing, and belongs to the technical fields of computer application, natural language processing and machine translation.

Technical background

[0002] In recent years, with the advancement of the wave of globalization, international exchanges have become increasingly frequent, and the demand for translation services in all walks of life has become more urgent. Although machine translation has the advantage of being more efficient and convenient, there is still a big gap between its translation and human translation. Therefore, automatic post-editing of machine translation results to improve translation quality has important practical value.

[0003] The neural network automatic post-translation editing system is an improvement on the traditional automatic post-translation editing. It is good at generating sentences with high fluen...

Application Information

IPC(8): G06F17/28
CPC: G06F40/58
Inventor: 郭宇航, 黄河燕, 曹倩雯
Owner: BEIJING INSTITUTE OF TECHNOLOGY