Alignment algorithm for bilingual paragraphs

A segment-alignment technique with smoothing parameters, applied in the field of English-Chinese bilingual understanding, which addresses the problem of sparse probability data.

Inactive Publication Date: 2009-09-02
刘建

AI Technical Summary

Problems solved by technology

The probability data in the model (such as the probability of any given English word being translated into any given Chinese word) suffer from serious data sparsity.
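To make the sparsity problem concrete, here is a minimal sketch of one generic remedy, additive (Laplace) smoothing of word translation probabilities. This is an illustration only: the source does not say which smoothing scheme the invention uses, and the function name, the `alpha` parameter, and the toy counts are all assumptions.

```python
from collections import Counter


def smoothed_translation_prob(pair_counts, english_word, chinese_word,
                              chinese_vocab_size, alpha=1.0):
    """Estimate P(chinese_word | english_word) with additive smoothing.

    pair_counts maps (english_word, chinese_word) to a co-occurrence count.
    alpha is a hypothetical smoothing parameter; alpha=0 would give the
    unsmoothed relative frequency, which is zero for every unseen pair.
    """
    total = sum(c for (e, _), c in pair_counts.items() if e == english_word)
    count = pair_counts.get((english_word, chinese_word), 0)
    return (count + alpha) / (total + alpha * chinese_vocab_size)


# Toy counts (invented for illustration, not taken from any corpus).
pair_counts = Counter({("book", "书"): 3, ("book", "预订"): 1})
chinese_vocab_size = 4

seen = smoothed_translation_prob(pair_counts, "book", "书", chinese_vocab_size)
unseen = smoothed_translation_prob(pair_counts, "book", "本", chinese_vocab_size)
```

In a small corpus most word pairs are never observed, so without smoothing almost every entry of the translation table would be zero. The sketch gives unseen pairs a small non-zero probability instead.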



Embodiment Construction

[0057] The alignment scheme can be implemented on a computer to form the final system.


Abstract

The invention provides a method for segment (syntagm) alignment of a bilingual corpus, which is the basis of example-based machine translation (EBMT). It provides an English-Chinese bilingual segment alignment module based on anchor word pairs, together with a corresponding alignment algorithm, and addresses the data sparsity problem of small and medium-sized corpora. Ambiguity in segment segmentation is deferred and resolved by the system at segment-alignment time, which improves the correctness of the segmentation.
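The idea of deferring segmentation ambiguity to alignment time can be sketched as follows. This is a hypothetical illustration, not the patented algorithm, whose details are not given here. It assumes that segments are tuples of words, that anchor word pairs are a set of (English word, Chinese word) tuples, and that a candidate segmentation is scored by how many anchor pairs its segments share with their best-matching Chinese segments. All function names and data are invented for the example.

```python
def anchor_score(en_segment, zh_segment, anchors):
    """Count anchor word pairs (en_word, zh_word) linking the two segments."""
    return sum(1 for e in en_segment for z in zh_segment if (e, z) in anchors)


def align(en_segments, zh_segments, anchors):
    """Link each English segment to its best-scoring Chinese segment.

    Returns the total anchor score and the list of links; a segment with
    no anchor support is linked to None.
    """
    total, links = 0, []
    for en in en_segments:
        best = max(zh_segments, key=lambda zh: anchor_score(en, zh, anchors))
        score = anchor_score(en, best, anchors)
        total += score
        links.append((en, best if score > 0 else None))
    return total, links


def best_segmentation(candidates, zh_segments, anchors):
    """Keep all candidate segmentations until alignment, then pick the best."""
    return max(candidates, key=lambda segs: align(segs, zh_segments, anchors)[0])


# Toy data (assumed): two competing segmentations of the same English sentence.
anchors = {("i", "我"), ("like", "喜欢"),
           ("machine", "机器"), ("translation", "翻译")}
zh_segments = [("我",), ("喜欢",), ("机器", "翻译")]
seg_a = [("i",), ("like",), ("machine", "translation")]
seg_b = [("i",), ("like", "machine"), ("translation",)]

chosen = best_segmentation([seg_b, seg_a], zh_segments, anchors)
```

Here `seg_b` splits "machine translation" across two segments, so one of its anchor pairs cannot be used by any single link. It therefore scores lower than `seg_a`, and the ambiguity is resolved only once the alignment evidence is available.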

Description

Technical field

[0001] The invention relates to English-Chinese bilingual understanding technology within natural language understanding, in particular segment alignment technology.

Background technique

[0002] In recent years, with the development of corpus linguistics, the example-based machine translation (EBMT) method has become one of the new approaches to machine translation. An EBMT system stores in advance a large number of bilingual sentence pairs aligned at the segment level, that is, a bilingual corpus. When translating, the system performs only a shallow analysis of the sentence to be translated, divides it into segments, finds the best translation for each segment in the bilingual corpus according to the context, arranges the translations in a certain order, and finally generates the translated sentence. This method avoids many difficulties in traditional translation methods (such as syntactic analysis, word meaning recognition, etc.), and has certain practicability, e...
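The EBMT workflow described in [0002] can be sketched in a few lines. This is a simplified illustration under stated assumptions. The segment table is a toy stand-in for a segment-aligned bilingual corpus. Greedy longest-match segmentation replaces the shallow analysis. Each segment has a single stored translation, so no context is used to choose among alternatives. The output keeps the source order instead of reordering. None of these choices come from the source.

```python
def segment(words, segment_table):
    """Greedy longest-match split of a word list into known segments.

    An unknown single word is kept as its own segment so the loop always
    advances.
    """
    segments, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):
            candidate = " ".join(words[i:j])
            if candidate in segment_table or j == i + 1:
                segments.append(candidate)
                i = j
                break
    return segments


def translate(sentence, segment_table):
    """Translate segment by segment; unknown segments are marked <like this>."""
    segs = segment(sentence.lower().split(), segment_table)
    return "".join(segment_table.get(s, f"<{s}>") for s in segs)


# Toy segment-aligned entries (invented for illustration).
segment_table = {
    "i": "我",
    "want to": "想",
    "book": "预订",
    "a room": "一个房间",
}

result = translate("I want to book a room", segment_table)
```

With this table the sentence is split into "i", "want to", "book", and "a room", and the stored translations are concatenated. A real system would also have to pick among several translations per segment and reorder them, which is where segment-level alignment quality matters.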


Application Information

IPC(8): G06F17/28
Inventor: 刘建
Owner: 刘建