
Statistical machine translation method with error self-diagnosis and self-correction functions

A statistical machine translation method with error self-diagnosis and self-correction, applied in the fields of instruments, computing, and special data processing applications. It addresses translation errors, poor self-correction ability, and translation accuracy that falls short of practical translation requirements, with the aim of improving the practicality, work efficiency, and translation performance of machine translation.

Active Publication Date: 2012-11-28
天津华译语联科技股份有限公司

AI Technical Summary

Problems solved by technology

[0004] However, current statistical machine translation systems for industrial applications still cannot meet translation requirements because of limited translation accuracy. Most are therefore used as auxiliary tools for human translation and cannot independently provide translation results with high confidence. The shortfall shows in two respects:
[0005] 1. Insufficient ability to predict translation errors: it is difficult to accurately diagnose and predict potential errors in translation results, yet this function is important for helping post-editors find, judge, and correct errors efficiently and at low cost. The confidence estimation accuracy of current methods needs further improvement.
[0006] 2. Poor self-correction ability for translation errors: for translation errors automatically diagnosed by the system, there are currently two solutions that can provide self-correction functions. However, neither method has worked as expected.



Examples


Embodiment

[0069] The open-source statistical machine translation system Moses serves as the baseline system. Its phrase decoder and word graph decoder decode the input source-language string and the word graph network, respectively, to produce the output translation hypotheses. The experimental data are as follows: the translation sentence pair is Chinese-English, and the translation direction is English-Chinese translation. The statistical machine translation model is trained on the 200K FBIS sentence pairs provided by LDC, and the development set and test set are the NIST 2005 and NIST 2003 data sets, respectively. The paraphrase collection is the one provided by the open-source tool TER-plus, filtered and post-processed before use.

[0070] Table 1 compares the translation performance of the method of the present invention with that of the existing baseline system on the test set, based on the above data.

[0071] Table 1

[0072] ...



Abstract

The invention discloses a statistical machine translation method with error self-diagnosis and self-correction functions. The method comprises the following steps: first, defining the translation error types, training an error classifier, and classifying the translation errors on a test set; then mapping the translation errors from the target-language side to the source-language side and constructing a paraphrase word graph network; next, optimizing the source-language paraphrase word graph network; and finally, performing word graph decoding to obtain the self-corrected result. Compared with current statistical machine translation methods, this method effectively lowers the translation error rate and improves translation performance.
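As a rough illustration of the error-mapping and word-graph construction steps in the abstract, the following Python sketch projects target-side error flags onto source words through a word alignment and then expands each flagged source word with alternative wordings. Everything in it is an assumption for illustration: the function names, the toy alignment, the hand-set error flags (standing in for the output of the trained error classifier), the single-word alternatives table, and the simplified graph with one slot per source word are hypothetical and not taken from the patent.

```python
from itertools import product


def map_errors_to_source(error_flags, alignment):
    """Project target-side error flags onto source positions.

    error_flags[t] is True if target word t was classified as an error;
    alignment is a list of (source_index, target_index) links.
    """
    return sorted({s for s, t in alignment if error_flags[t]})


def build_paraphrase_lattice(source, flagged, paraphrases):
    """Return one list of alternatives per source position.

    Unflagged words keep a single option; flagged words also receive the
    alternatives listed for them in the (hypothetical) paraphrase table.
    """
    lattice = []
    for i, word in enumerate(source):
        alternatives = [word]
        if i in flagged:
            alternatives += [p for p in paraphrases.get(word, []) if p != word]
        lattice.append(alternatives)
    return lattice


def enumerate_paths(lattice):
    """List every source-side rewrite encoded by the lattice."""
    return [" ".join(path) for path in product(*lattice)]


# Toy data (all hypothetical).
source = ["he", "purchased", "a", "car"]
alignment = [(0, 0), (1, 1), (2, 2), (3, 3)]   # one-to-one for simplicity
error_flags = [False, True, False, False]      # classifier output stand-in
paraphrases = {"purchased": ["bought"]}

flagged = map_errors_to_source(error_flags, alignment)
lattice = build_paraphrase_lattice(source, flagged, paraphrases)
paths = enumerate_paths(lattice)
print(paths)
```

In this sketch a decoder would then score every path through the graph and keep the best translation; the optimization step mentioned in the abstract (pruning or reweighting the graph) is not modeled here.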

Description

Technical field

[0001] The invention belongs to the technical field of statistical machine translation methods, and in particular relates to a statistical machine translation method with error self-diagnosis and self-correction functions.

Background art

[0002] Software localization refers to the process of integrating information related to specific regional settings, together with translated content, when software is ported to regions and countries with different cultural and language backgrounds, so that it adapts to local culture and usage habits. Translation plays a vital role in the localization process, and how well the software adapts to the local culture and language directly affects its promotion in that region or country. In the software localization industry, the traditional method is to first use the translation memory (Translation Memory, TM) to search and output the translation examples of the software interface, terms, manuals or technical documents according to the fuz...


Application Information

IPC(8): G06F17/28
Inventor 杜金华, 王莎, 郭华, 张萌
Owner 天津华译语联科技股份有限公司