Automatic extraction and filtration method for Chinese-English phrase translation pairs

An automatic extraction and filtering method, applied in special data processing applications, instruments, and electronic digital data processing. It addresses two problems: syntax-tree constraints are too strict, and recall requirements cannot be met. Its effects are a lower occurrence probability of noise phrases, reduced storage space requirements, and good translation quality.

Active Publication Date: 2010-12-01
INST OF AUTOMATION CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

First, the accuracy of syntax-tree generation is itself a problem. Second, syntax-tree constraints are too strict to meet recall requirements. As a result, most syntax-based systems actually retain all phrases and use syntactic knowledge only to provide reordering information.

Abstract

The present invention provides a method for automatically extracting and filtering Chinese-English phrase translation pairs. The method comprises the following steps: extracting, from the original Chinese-English bilingual sentence pairs, the feature information used to divide language blocks and to filter candidate phrases; determining the language-block division anchor points according to the different feature information, and dividing each original Chinese-English sentence pair into a plurality of individual language blocks; extracting candidate phrases within each language block using the word-alignment information of the original bilingual sentence pair; and filtering the generated candidate phrases according to the feature information on their generation frequency to produce the required phrase pairs. Because phrases are extracted by traversing within language blocks, which is especially effective against the unbounded expansion of empty words, the invention reduces the storage space demanded by an excessive number of extracted phrases and filters out many noise phrases. The invention can generate multiple groups of phrases directly from the fixed word alignment of the current sentence pair, thereby increasing the recall of phrase pairs while maintaining precision.
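The extraction and filtering steps in the abstract can be illustrated with a minimal sketch. It is not the patented procedure: the block boundaries are taken as given input (the anchor-point features are not specified here), blocks are defined on the source side only, the standard alignment-consistency criterion from phrase-based translation stands in for the extraction rule, and the handling of empty-word expansion is omitted. The names consistent_pairs, extract_from_sentence, filter_by_frequency, max_len and min_count are all hypothetical.

```python
from collections import Counter


def consistent_pairs(alignment, span, max_len=7):
    """Enumerate alignment-consistent phrase pairs inside one source block.

    alignment: set of (i, j) links between source word i and target word j.
    span: half-open (start, end) source range of the block (assumed given).
    Returns a list of ((s1, s2), (t1, t2)) half-open index spans.
    """
    start, end = span
    pairs = []
    for s1 in range(start, end):
        # Source phrases never cross the block boundary.
        for s2 in range(s1 + 1, min(end, s1 + max_len) + 1):
            targets = [j for (i, j) in alignment if s1 <= i < s2]
            if not targets:
                continue  # skip source spans with no aligned target word
            t1, t2 = min(targets), max(targets) + 1
            if t2 - t1 > max_len:
                continue
            # Consistency: no target word in [t1, t2) may link outside [s1, s2).
            if all(s1 <= i < s2 for (i, j) in alignment if t1 <= j < t2):
                pairs.append(((s1, s2), (t1, t2)))
    return pairs


def extract_from_sentence(src, tgt, alignment, blocks, max_len=7):
    """Collect candidate phrase pairs (as strings) from every block."""
    out = []
    for span in blocks:
        for (s1, s2), (t1, t2) in consistent_pairs(alignment, span, max_len):
            out.append((" ".join(src[s1:s2]), " ".join(tgt[t1:t2])))
    return out


def filter_by_frequency(pair_counts, min_count=2):
    """Keep only candidate pairs generated at least min_count times."""
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


if __name__ == "__main__":
    src = ["我", "喜欢", "红", "苹果"]
    tgt = ["I", "like", "red", "apples"]
    alignment = {(0, 0), (1, 1), (2, 2), (3, 3)}
    blocks = [(0, 2), (2, 4)]  # hypothetical block boundaries
    counts = Counter(extract_from_sentence(src, tgt, alignment, blocks))
    print(filter_by_frequency(counts, min_count=1))
```

Restricting the traversal to blocks is what shrinks the candidate set: in the toy sentence, two blocks of two words yield fewer pairs than the unsegmented sentence, because phrases spanning the block boundary are never generated.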

Description

A method for automatic extraction and filtering of Chinese-English phrase translation pairs

Technical field

The invention belongs to the field of natural language processing, and in particular relates to methods for statistical machine translation, cross-language information retrieval, and the automatic extraction and filtering of bilingual phrases.

Background

With the advent of the globalized information age, the problem of overcoming language barriers is becoming increasingly pressing. Using computers to translate automatically between different languages has become a challenge shared by all of humanity. At present, statistical methods dominate machine translation research, and among them phrase-based translation models are relatively mature. The basic idea of phrase-based statistical machine translation is to take the phrase as the basic unit of translation. Because phrases contain information about the selection of translated words and t...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/27, G06F17/28
Inventors: 宗成庆, 周玉
Owner: INST OF AUTOMATION CHINESE ACAD OF SCI