
Method for extracting phrases of statistical machine translation

A phrase extraction technique for statistical machine translation, classified under instruments, computing, and special data processing applications. It addresses poor phrase representation, unsatisfactory alignment quality, and poor translation quality, and improves the quality of the extracted phrases.

Status: Inactive; Publication Date: 2011-03-23
INST OF COMPUTING TECH CHINESE ACAD OF SCI
Cites: 0; Cited by: 11

AI Technical Summary

Problems solved by technology

Because the bilingual corpus is limited in size and quality, the alignment quality represented by the alignment matrix is not ideal. This yields poor extracted phrase tables and, in turn, poor translation quality.



Embodiment Construction

[0026] To make the purpose, technical solution, and advantages of the present invention clearer, the method for extracting phrases for statistical machine translation according to an embodiment of the present invention is described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here only explain the present invention and do not limit it.

[0027] Figure 1 shows a flowchart of a method for extracting phrases for statistical machine translation according to a specific embodiment of the present invention. As shown in the figure, the method includes the following steps:

[0028] Step 1) Obtain multiple aligned sentence pair combinations from the bilingual corpus in two directions, and calculate the prior probability of the multiple aligned sentence pair combinations.

[0029] An example of performing this step is given below:

[0030] 11) Use GIZA...
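
Step 1 can be sketched roughly as follows. This is a minimal illustration, not the patented procedure: it assumes each translation direction supplies a list of candidate word alignments with scores, that the two directions are combined by set intersection, and that a combination's prior probability is the normalized product of the two scores. The function name `combine_nbest` and all of these choices are hypothetical.

```python
from itertools import product


def combine_nbest(src2tgt, tgt2src):
    """Combine candidate alignments from two directions and assign priors.

    Each argument is a list of (links, score) tuples, where links is a set of
    (source_index, target_index) pairs. Intersection is an assumed stand-in
    for the symmetrization step; the product of scores is an assumed weight.
    """
    combos = {}
    for (a, pa), (b, pb) in product(src2tgt, tgt2src):
        links = frozenset(a) & frozenset(b)
        # Identical combined alignments pool their weight.
        combos[links] = combos.get(links, 0.0) + pa * pb
    total = sum(combos.values())
    if total == 0:
        raise ValueError("all candidate alignments have zero score")
    # Normalize so the priors over all combinations sum to 1.
    return {links: weight / total for links, weight in combos.items()}


if __name__ == "__main__":
    forward = [({(0, 0), (1, 1)}, 0.6), ({(0, 0), (1, 0)}, 0.4)]
    backward = [({(0, 0), (1, 1)}, 0.5), ({(0, 0)}, 0.5)]
    for links, prior in combine_nbest(forward, backward).items():
        print(sorted(links), round(prior, 3))
```

Pooling identical combinations keeps the priors a proper distribution over distinct alignments, which is what the later summation over word pairs relies on.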


Abstract

The invention provides a method for extracting phrases for statistical machine translation, comprising the following steps: 1) obtaining a plurality of aligned sentence pair combinations from a bilingual corpus in two directions, and calculating the prior probability of each combination; 2) calculating the alignment probability of each word pair as the sum of the prior probabilities of the aligned sentence pair combinations that contain that word pair, and forming an alignment matrix from the word-pair alignment probabilities; 3) calculating the frequency of each phrase alignment from the alignment matrix; and 4) calculating the relative frequency and the lexicalization probability of each phrase alignment from its frequency. The method can effectively represent all probable aligned phrase combinations and improves the quality of phrase extraction, and can thereby improve the quality of translation performed with the extracted phrases.
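
Steps 2 and 4 lend themselves to a short sketch. The code below assumes the prior probabilities from step 1 are already available as a mapping from link sets to probabilities, and that step 3 has produced fractional phrase-pair counts by some means not shown here. The function names `alignment_matrix` and `relative_frequency` and the data layout are hypothetical, and the lexicalization probability is omitted because the abstract does not specify how it is computed.

```python
from collections import defaultdict


def alignment_matrix(priors, src_len, tgt_len):
    """Step 2: each cell is the sum of the priors of alignments containing that link."""
    matrix = [[0.0] * tgt_len for _ in range(src_len)]
    for links, prior in priors.items():
        for i, j in links:
            matrix[i][j] += prior
    return matrix


def relative_frequency(counts):
    """Step 4 (in part): count(src, tgt) divided by the total count of src."""
    totals = defaultdict(float)
    for (src, _tgt), count in counts.items():
        totals[src] += count
    return {
        (src, tgt): count / totals[src]
        for (src, tgt), count in counts.items()
        if totals[src] > 0
    }


if __name__ == "__main__":
    # Two candidate alignments of a two-word sentence pair (illustrative values).
    priors = {frozenset({(0, 0), (1, 1)}): 0.3, frozenset({(0, 0)}): 0.7}
    print(alignment_matrix(priors, 2, 2))

    # Fractional phrase-pair counts, assumed to come from step 3.
    counts = {("a", "x"): 1.5, ("a", "y"): 0.5, ("b", "z"): 0.25}
    print(relative_frequency(counts))
```

Because each cell holds a probability rather than a 0/1 decision, a link that appears in only some candidate alignments still contributes partial evidence to phrase extraction.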

Description

Technical Field

[0001] The present invention relates to natural language processing, and more particularly to statistical machine translation of text.

Background

[0002] With the rapid development of the world economy, cultural and economic exchanges between countries are becoming increasingly frequent. In their daily work and life, people must sometimes deal with materials and information in many languages from many countries. A major problem is language comprehension: people need to understand material written in a language other than their own within a relatively short time.

[0003] Machine translation technology arose to meet this need. Early machine translation research focused mainly on rule-based translation systems, but writing translation rules required the participation of language experts, and usually a large number of rules had to be rewritten every time a translation field wa...


Application Information

IPC(8): G06F17/28
Inventors: LIU Yang (刘洋), XIA Tian (夏天), XIAO Xinyan (肖欣延), LIU Qun (刘群)
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI