Method and device for generating sequence regulating model for machine translation

A method and device for generating a reordering model, applied in the field of machine translation. It addresses the inability of existing reordering models to reorder discontinuous phrases, their dependence on entire phrases, and their failure to meet machine translation requirements, and it improves the reordering ability of the model.

Active Publication Date: 2011-05-11
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
Cites: 3; Cited by: 9

AI Technical Summary

Problems solved by technology

Phrase-based reordering models have two defects. First, discontinuous phrases cannot be reordered; for example, "closely related to" and "their childhood development" cannot be reordered. Second, the reordering model relies on the entire phrase, so parameter estimation suffers from data sparsity, which leads to inaccurate estimates.
[0004] To improve the reordering ability of the reordering model, some researchers use function words or boundary words to address data sparsity, but the results of these solutions are still unsatisfactory and cannot meet the needs of machine translation.

Examples

Embodiment Construction

[0027] The present invention will be described in detail below in conjunction with the accompanying drawings and embodiments.

[0028] The present invention uses collocation information between words in source language sentences to improve the reordering ability of the reordering model. For example, in the above example sentences, if "and" and "closely related" can be identified as a collocation word pair, then the following reordering model can be used during reordering to constrain the reordering direction of the two:

[0029] $p(o \mid w_i, w_j), \quad o \in \{\text{straight}, \text{inverted}\}$

[0030] In the above reordering model, $w_i$ and $w_j$ denote two source language collocation words that have a collocation relationship in the source language sentence; together they form a source language collocation word pair. $o$ denotes the reordering direction, and "straight" indicates that the order of the source language collocation words $w_i$ and $w_j$ in the source language sentence with the source la...

Abstract

The invention discloses a method for generating a sequence regulating (reordering) model for machine translation, comprising the following steps: acquiring a bilingual corpus; performing collocation extraction on the source language example sentences in the bilingual corpus to obtain source language collocation word pairs; performing bilingual word alignment between the source language example sentences and the target language example sentences, and determining the translations of the source language collocation words from the alignment result; determining the reordering direction of each source language collocation word pair from the order of its words in the source language example sentence and the order of their translations in the target language example sentence; and counting the reordering directions to obtain the reordering probability of each direction, which together form the reordering model. In this way, the reordering model is built on collocation information between source language words, which further improves its reordering ability.
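
The counting step in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes that collocation word pairs and word alignments have already been produced by some external tools, that a word's translation position is the leftmost target position it aligns to, and that probabilities are plain relative frequencies without smoothing. The function names, the corpus layout, and the toy tokens are all hypothetical.

```python
from collections import Counter, defaultdict

STRAIGHT, INVERTED = "straight", "inverted"


def reordering_direction(i, j, alignment):
    """Return the direction for source positions i and j, or None if undecidable.

    `alignment` maps a source position to a list of aligned target positions.
    Assumption: the leftmost aligned target position represents the translation.
    """
    if i > j:
        i, j = j, i
    targets_i = alignment.get(i)
    targets_j = alignment.get(j)
    if not targets_i or not targets_j:
        return None  # at least one collocation word is unaligned
    pos_i, pos_j = min(targets_i), min(targets_j)
    if pos_i == pos_j:
        return None  # both words map to the same target position
    return STRAIGHT if pos_i < pos_j else INVERTED


def train_reordering_model(corpus):
    """Estimate p(o | w_i, w_j) by relative frequency.

    Each corpus item is (source_tokens, alignment, collocation_pairs), where
    collocation_pairs holds pairs of source positions (hypothetical layout).
    """
    counts = defaultdict(Counter)
    for source_tokens, alignment, collocation_pairs in corpus:
        for i, j in collocation_pairs:
            i, j = min(i, j), max(i, j)
            direction = reordering_direction(i, j, alignment)
            if direction is not None:
                counts[(source_tokens[i], source_tokens[j])][direction] += 1

    model = {}
    for pair, direction_counts in counts.items():
        total = sum(direction_counts.values())
        model[pair] = {o: direction_counts[o] / total for o in (STRAIGHT, INVERTED)}
    return model


# Toy data: the pair ("a", "c") is inverted twice and straight once.
toy_corpus = [
    (["a", "b", "c"], {0: [2], 1: [1], 2: [0]}, [(0, 2)]),
    (["a", "c"], {0: [0], 1: [1]}, [(0, 1)]),
    (["a", "x", "c"], {0: [1], 2: [0]}, [(0, 2)]),
]
model = train_reordering_model(toy_corpus)
print(model[("a", "c")])
```

On the toy corpus the pair ("a", "c") receives probability 2/3 for "inverted" and 1/3 for "straight". Pairs with an unaligned word or with both words aligned to the same target position are skipped here; a real system would need its own policy for such cases.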

Description

Technical field

[0001] The invention relates to the field of machine translation, in particular to a method and device for generating a reordering model for machine translation.

Background technique

[0002] In recent years, phrase-based statistical machine translation has made great progress in translation quality compared with the word-based statistical machine translation originally proposed by IBM, and has therefore received widespread attention. Put simply, when a phrase-based statistical machine translation system is trained, the bilingual example sentences in the bilingual corpus are first word-aligned, and a bilingual phrase table with probabilities is then extracted based on the bilingual word alignment. During translation, the source language sentence to be translated is first matched against the source language phrases in the phrase table to obtain the target language phrases corresponding to the source language phra...

Application Information

IPC(8): G06F17/28
Inventor 吴华 (Wu Hua), 胡晓光 (Hu Xiaoguang), 王海峰 (Wang Haifeng)
Owner BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD