Machine translation method

A machine translation technology, applied in the fields of instruments, special data processing applications, and electronic digital data processing. It addresses problems such as sensitivity to syntactic parsing performance, high translation error rates, and wasted space, and achieves fast translation speed and high translation performance.

Inactive Publication Date: 2009-04-01
INST OF COMPUTING TECH CHINESE ACAD OF SCI
0 Cites, 18 Cited by

AI Technical Summary

Problems solved by technology

Compared with the string-based model, the tree-based model takes a syntax tree as input. Its advantages are fast decoding, a concise model, and no need for binarization. However, it has a flaw: only a single syntax tree is used to guide translation. Because syntax-based models are sensitive to parsing performance, parsing errors lead to translation errors.
A simple method is to use the N-best tree, decode each t



Examples


Embodiment Construction

[0026] Figure 1 shows the implementation flowchart of the overall technical solution of the machine translation decoding method based on the shared compressed forest provided by the present invention. The method includes the following steps:

[0027] Step 101): use a syntactic parser to analyze the source language string and output a shared compressed syntax forest.

[0028] The main task of syntactic analysis is to parse the input source language string into a corresponding syntax tree. Available phrase-structure parsers include the Charniak parser, Bikel Parser, Stanford parser, Collins Parser, and MuskCpars. The parser should output not only the 1-best tree but the entire shared compressed forest, that is, the forest composed of all the parse trees that can finally generate the root node. In the present embodiment, the MuskCpar parser is adopted. Refer to Deyi Xiong, Shuanglong Li, Qun Liu, Shouxun Lin, Yueliang Qian. 2005. Par...
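To make the idea of a shared compressed forest concrete (the parsing literature often calls this a packed forest), the following is a minimal sketch, not the parser's actual output format. It stores a forest as a hypergraph in which each node may have several alternative expansions, and counts how many distinct trees the forest encodes. The class names `Node` and `Forest`, the helper `build_example_forest`, and the toy English sentence are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """A forest node: a syntactic label over the source span [start, end)."""
    label: str
    start: int
    end: int


@dataclass
class Forest:
    """Hypothetical forest container: node -> list of alternative expansions."""
    edges: dict = field(default_factory=dict)

    def add_edge(self, head, tails):
        # One hyperedge = one way of building `head` from the child nodes `tails`.
        self.edges.setdefault(head, []).append(tuple(tails))

    def count_trees(self, root):
        """Count the distinct parse trees encoded below `root`."""
        memo = {}

        def count(node):
            if node not in self.edges:      # leaf: exactly one (trivial) tree
                return 1
            if node not in memo:
                total = 0
                for tails in self.edges[node]:
                    product = 1
                    for child in tails:
                        product *= count(child)
                    total += product         # alternatives add, children multiply
                memo[node] = total
            return memo[node]

        return count(root)


def build_example_forest():
    """Toy forest for 'I saw the-man with-a-telescope' (PP-attachment ambiguity)."""
    subj, verb = Node("NP", 0, 1), Node("V", 1, 2)
    obj, pp = Node("NP", 2, 3), Node("PP", 3, 4)
    big_np, small_vp = Node("NP", 2, 4), Node("VP", 1, 3)
    vp, root = Node("VP", 1, 4), Node("S", 0, 4)

    forest = Forest()
    forest.add_edge(big_np, [obj, pp])        # "the man with a telescope"
    forest.add_edge(small_vp, [verb, obj])    # "saw the man"
    forest.add_edge(vp, [verb, big_np])       # reading 1: PP attaches to the noun
    forest.add_edge(vp, [small_vp, pp])       # reading 2: PP attaches to the verb
    forest.add_edge(root, [subj, vp])
    return forest, root


forest, root = build_example_forest()
print(forest.count_trees(root))
```

Both readings share the nodes for the verb, the object, and the prepositional phrase, so the forest stores them once. With more ambiguity the number of encoded trees grows multiplicatively while the forest itself grows far more slowly, which is why a forest covers a larger space than a list of N-best trees.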


Abstract

The invention provides a machine translation method comprising the following steps: 1) the source language string is parsed to obtain its shared compressed syntax forest; 2) the syntax forest is matched against a known set of translation rules between the source language and the target language to obtain a shared compressed translation forest; 3) a search algorithm is run over the translation forest to generate the final translation result. Because the method uses the shared compressed forest to guide translation, it can search for translations across many trees, and its search space far exceeds that obtained by using N-best trees independently. On a parallel bilingual data set of 2.23 million entries, compared with a model decoded with 30-best trees, the method translates 1.4 times faster and scores 1.7 BLEU points higher.
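The three steps can be sketched in a few lines under strong simplifying assumptions. This is an illustration, not the patented decoder: step 1 is replaced by a hand-written forest, rules are restricted to one level (a parent label and its child labels, with a reordering of the children), no language model is used, and the search keeps only the single best derivation. The function names `match_rules` and `best_translation`, the toy sentence, the rule table, and all log-probabilities are invented for the example.

```python
def match_rules(forest, words, rules, lexicon):
    """Step 2: turn a source forest into a translation forest.

    forest:  node -> list of child tuples, where node = (label, start, end)
    rules:   (label, child_labels) -> list of (child_order, logprob)
    lexicon: (label, source_words) -> list of (target_text, logprob)
    """
    nodes = set(forest)
    for expansions in forest.values():
        for tails in expansions:
            nodes.update(tails)

    trans = {}
    for node in nodes:
        label, start, end = node
        options = []
        if node in forest:                       # internal node: match rules
            for tails in forest[node]:
                key = (label, tuple(t[0] for t in tails))
                for order, logp in rules.get(key, []):
                    options.append((tails, order, logp))
        else:                                    # leaf: look up the words
            src = " ".join(words[start:end])
            for target, logp in lexicon.get((label, src), []):
                options.append(((), target, logp))
        trans[node] = options
    return trans


def best_translation(trans, root):
    """Step 3: best-scoring derivation, found by memoized bottom-up search."""
    memo = {}

    def best(node):
        if node in memo:
            return memo[node]
        result = None
        for tails, target, logp in trans[node]:
            if not tails:
                cand = (logp, target)
            else:
                subs = [best(t) for t in tails]
                if any(s is None for s in subs):  # a child has no translation
                    continue
                text = " ".join(subs[i][1] for i in target)
                cand = (logp + sum(s[0] for s in subs), text)
            if result is None or cand[0] > result[0]:
                result = cand
        memo[node] = result
        return result

    return best(root)


def toy_example():
    words = ["Bushi", "yu", "Shalong", "juxing", "huitan"]   # toy sentence
    nr0, nr2 = ("NR", 0, 1), ("NR", 2, 3)
    vp = ("VP", 3, 5)
    # Step 1 stand-in: "yu" is read either as a conjunction or as a preposition.
    forest = {
        ("IP", 0, 5): [(("NP", 0, 3), vp), (nr0, ("VP", 1, 5))],
        ("NP", 0, 3): [(nr0, ("CC", 1, 2), nr2)],
        ("VP", 1, 5): [(("PP", 1, 3), vp)],
        ("PP", 1, 3): [(("P", 1, 2), nr2)],
        vp: [(("VV", 3, 4), ("NN", 4, 5))],
    }
    rules = {
        ("IP", ("NP", "VP")): [((0, 1), -0.1)],
        ("IP", ("NR", "VP")): [((0, 1), -0.1)],
        ("NP", ("NR", "CC", "NR")): [((0, 1, 2), -0.5)],
        ("VP", ("PP", "VP")): [((1, 0), -0.2)],   # swap: verb phrase first
        ("PP", ("P", "NR")): [((0, 1), -0.1)],
        ("VP", ("VV", "NN")): [((0, 1), -0.1)],
    }
    lexicon = {
        ("NR", "Bushi"): [("Bush", 0.0)], ("NR", "Shalong"): [("Sharon", 0.0)],
        ("CC", "yu"): [("and", -1.0)], ("P", "yu"): [("with", -0.3)],
        ("VV", "juxing"): [("held", -0.2)], ("NN", "huitan"): [("talks", -0.2)],
    }
    trans = match_rules(forest, words, rules, lexicon)
    return best_translation(trans, ("IP", 0, 5))


score, text = toy_example()
print(text)   # -> Bush held talks with Sharon
```

In this toy setup the search compares derivations from both parses at once. Had only the coordination parse been supplied as the 1-best tree, the decoder could only have produced "Bush and Sharon held talks".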

Description

Technical field

[0001] The invention belongs to the technical field of natural language processing; in particular, it relates to tree-based statistical machine translation.

Background art

[0002] Syntax-based statistical machine translation models have become the current mainstream translation methods. Depending on their input, they can be divided into string-based models and tree-based models (for tree-based models, see Yang Liu, Qun Liu, and Shouxun Lin. 2006. Tree-to-string alignment template for statistical machine translation. In Proceedings of COLING-ACL, pages 609-616, Sydney, Australia, July; and Liang Huang, Kevin Knight, and Aravind Joshi. 2006. Statistical syntax-directed translation with extended domain of locality. In Proceedings of AMTA). Compared with the string-based model, the tree-based model takes a syntax tree as input. Its advantages are fast decoding, a concise model, and no need for binariza...


Application Information

IPC(8): G06F17/28, G06F17/30
Inventor: 米海涛 (Mi Haitao), 黄亮 (Huang Liang), 刘群 (Liu Qun)
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI