Dependency coherence constraint-based automatic alignment method for bilingual words

A word alignment technology, applied in special data processing applications, electrical digital data processing, and related fields, that addresses the problem that generative word alignment models do not incorporate syntactic information.

Active Publication Date: 2012-10-03
北京中科凡语科技有限公司
2 Cites, 18 Cited by

AI Technical Summary

Problems solved by technology

[0007] The technical problem to be solved by the present invention is that the generative word alignment model does not incorporat

Examples

Embodiment Construction

[0054] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0055] The bilingual word alignment method based on dependency coherence constraint of the present invention generates a word alignment model according to a bilingual training set, and utilizes the word alignment model to perform word alignment on test sentence pairs. The bilingual training set includes a plurality of training sentence pairs, and each training sentence pair includes a source language sentence and a target language sentence corresponding to each other in semantics.

[0056] The basic idea of the present invention is to use dependency coherence to constrain the word alignment process, so as to control the range over which each word can be aligned on the other side, reduce redundancy, and improve word alignmen...
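The idea of limiting where a word may align by looking at the dependency tree can be illustrated with a small check. The sketch below is not the patent's formulation, which is truncated here. It assumes one common reading of coherence: the subtrees of two sibling modifiers should project onto non-overlapping spans on the other side. The names `subtree_nodes`, `projected_span`, and `sibling_violations`, and the head-array encoding of the tree, are hypothetical.

```python
def subtree_nodes(heads, root):
    """Return the set of word indices in the subtree rooted at `root`.

    `heads[i]` is the index of the head of word i, or -1 for the root word
    (an assumed encoding; the input is assumed to be a well-formed tree).
    """
    children = {}
    for node, head in enumerate(heads):
        children.setdefault(head, []).append(node)
    stack, nodes = [root], set()
    while stack:
        node = stack.pop()
        nodes.add(node)
        stack.extend(children.get(node, []))
    return nodes


def projected_span(nodes, links):
    """Return the (min, max) target positions linked to `nodes`, or None."""
    targets = [j for i, j in links if i in nodes]
    return (min(targets), max(targets)) if targets else None


def sibling_violations(heads, links):
    """List sibling pairs whose subtrees project onto overlapping target spans.

    `links` is a set of (source_index, target_index) alignment links.
    Overlap between sibling subtrees is treated as a coherence violation
    (an assumption made for this illustration).
    """
    violations = []
    for a in range(len(heads)):
        for b in range(a + 1, len(heads)):
            if heads[a] != heads[b] or heads[a] == -1:
                continue  # not siblings
            span_a = projected_span(subtree_nodes(heads, a), links)
            span_b = projected_span(subtree_nodes(heads, b), links)
            if span_a and span_b and span_a[0] <= span_b[1] and span_b[0] <= span_a[1]:
                violations.append((a, b))
    return violations


# Toy tree: word 2 is the root; words 1 and 4 are its modifiers,
# word 0 modifies word 1, and word 3 modifies word 4.
heads = [1, 2, -1, 4, 2]
monotone = {(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)}
crossing = {(0, 0), (1, 3), (2, 2), (3, 1), (4, 4)}
no_violations = sibling_violations(heads, monotone)   # []
one_violation = sibling_violations(heads, crossing)   # [(1, 4)]
```

A model that uses such a constraint would penalize or exclude alignments like `crossing`, which narrows the set of target positions each source word can reasonably link to.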

Abstract

The invention discloses an automatic bilingual word alignment method based on dependency coherence constraints. The method comprises the following steps: performing dependency parsing on the training sentence pairs; in the training stage, training word alignment models based on the dependency coherence constraints of the source language side and of the target language side, using the training sentence pairs and their dependency syntax trees; and in the test stage, using these models to generate, for each test sentence pair, word alignment results that satisfy the source-side and target-side dependency coherence constraints, and combining the two word alignment results into a single result that satisfies the bilingual dependency coherence constraint and balances precision and recall. Compared with the prior art, the method achieves a lower word alignment error rate.
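The abstract does not say how the two alignment results are combined. As a rough illustration of how a combination can trade precision against recall, the sketch below uses a generic symmetrization heuristic: start from the intersection of the two link sets (high precision) and grow toward the union (high recall) by adding neighbouring links. This is a swapped-in, widely used technique, not the patent's stated procedure, and the function name `combine_alignments` is hypothetical.

```python
def combine_alignments(first, second):
    """Combine two alignments given as sets of (source_index, target_index).

    Starts from the intersection and repeatedly adds links from the union
    that touch an existing link (including diagonally) and that cover a
    source or target word not yet aligned. Generic heuristic, shown only
    as an assumption-laden illustration.
    """
    combined = first & second
    union = first | second
    added = True
    while added:
        added = False
        for i, j in sorted(union - combined):
            source_free = all(i != a for a, _ in combined)
            target_free = all(j != b for _, b in combined)
            adjacent = any(max(abs(i - a), abs(j - b)) == 1 for a, b in combined)
            if adjacent and (source_free or target_free):
                combined.add((i, j))
                added = True
    return combined


# The two inputs agree on (0, 0) and (1, 1) and disagree elsewhere.
result_a = {(0, 0), (1, 1), (2, 1)}
result_b = {(0, 0), (1, 1), (3, 3)}
combined = combine_alignments(result_a, result_b)
# (2, 1) is adjacent to (1, 1) and covers the unaligned source word 2, so it
# is added; (3, 3) touches no accepted link and is left out.
```

The result sits between the intersection and the union, which is the sense in which a combined alignment can balance precision and recall.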

Description

Technical field

[0001] The invention belongs to the field of natural language processing, in particular to statistical machine translation and methods for automatic alignment of bilingual words.

Background

[0002] Word alignment, as the name suggests, identifies the translation correspondences, word by word, between two sentences that are translations of each other. Word alignment is an important part of statistical machine translation. It is the basis for extracting phrase tables and reordering rules in phrase-based translation models, and for extracting syntactic translation rules in syntax-based translation models. The quality of word alignment usually has a direct effect on the translation quality of a statistical machine translation system.

[0003] Word alignment methods can be roughly divided into two categories: heuristic methods and statistical methods. The heuristic method judges whether words are aligned by calculating the co-occurrence measure...
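To make the heuristic category concrete, the following sketch scores word pairs with the Dice coefficient, one well-known co-occurrence measure. The source text is truncated before it names a specific measure, so the choice of Dice, the threshold, and the function names `dice_scores` and `align_by_dice` are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def dice_scores(sentence_pairs):
    """Compute Dice scores for all co-occurring (source, target) word pairs.

    Dice(s, t) = 2 * C(s, t) / (C(s) + C(t)), where counts are the number
    of sentence pairs in which the word (or word pair) occurs.
    """
    source_count, target_count, pair_count = Counter(), Counter(), Counter()
    for source, target in sentence_pairs:
        source_types, target_types = set(source), set(target)
        source_count.update(source_types)
        target_count.update(target_types)
        pair_count.update(product(source_types, target_types))
    return {
        (s, t): 2.0 * count / (source_count[s] + target_count[t])
        for (s, t), count in pair_count.items()
    }


def align_by_dice(source, target, scores, threshold=0.5):
    """Link each source word to its best-scoring target word above a threshold."""
    links = set()
    for i, s in enumerate(source):
        best_j = max(range(len(target)), key=lambda j: scores.get((s, target[j]), 0.0))
        if scores.get((s, target[best_j]), 0.0) >= threshold:
            links.add((i, best_j))
    return links


# Toy corpus (invented for this example).
corpus = [
    (["the", "house"], ["das", "haus"]),
    (["the", "book"], ["das", "buch"]),
    (["a", "book"], ["ein", "buch"]),
]
scores = dice_scores(corpus)
links = align_by_dice(["the", "house"], ["das", "haus"], scores)
# links == {(0, 0), (1, 1)}: "the" pairs with "das", "house" with "haus".
```

Each link here is decided independently from surface counts alone, which is the main contrast with statistical methods that model the alignment of a whole sentence pair.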

Application Information

IPC(8): G06F17/28, G06F17/30
Inventor: 宗成庆, 王志国
Owner: 北京中科凡语科技有限公司