
Method and system for extracting resequencing template in machine translation

A technique for extracting resequencing templates in machine translation. It addresses the low coverage of resequencing phenomena, wrong ordering, and inaccuracy found in existing template extraction, and it increases generalizability, preserves efficiency, and reduces the restrictions placed on extraction.

Status: Inactive
Publication date: 2010-05-12
INST OF COMPUTING TECHNOLOGY - CHINESE ACAD OF SCI
0 cites, 27 cited by

AI Technical Summary

Problems solved by technology

However, the translation templates used by prior-art machine translation systems cannot describe the resequencing phenomenon accurately and completely, because of various limitations in the extraction process. When such a translation template is used, the text is translated in source order by default, which results in wrong ordering.
[0011] Therefore, existing methods for automatically extracting resequencing templates yield templates that cover too few of the resequencing phenomena that occur in translation.



Examples


Example 2

[0112] Resequencing instance 2: {state relations} (translated as "diplomatic relationship") and {policy} (translated as "the policy of").

[0113] Resequencing instance 2 is a part of resequencing instance 1; that is, the two instances overlap.

[0114] Template extraction begins with the shortest resequencing instance, which in this embodiment is resequencing instance 2. Since "diplomatic relations" and "policy" are both source-language blocks that begin and end with words tagged as content words, each of them can be replaced by a variable, and the corresponding translations are replaced by the same variables. This yields the resequencing template "X1 of X2", which translates to "X2 of X1".

[0115] Similarly, the resequencing template "X1 after X2" is extracted from resequencing instance 1, and it translates to "X2 after X1".
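
The generalization step described in [0114] can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the `Block` class, the `CONTENT_TAGS` set, the part-of-speech labels, and the index-based encoding of the target side are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumption: which part-of-speech tags count as content words.
CONTENT_TAGS = {"NOUN", "VERB", "ADJ"}


@dataclass
class Block:
    source: list   # (word, POS tag) pairs on the source side
    target: str    # aligned translation of the whole block


def is_variable(block):
    """A block may become a variable if it begins and ends with a content word."""
    return (block.source[0][1] in CONTENT_TAGS
            and block.source[-1][1] in CONTENT_TAGS)


def extract_template(source_parts, target_parts):
    """Generalize one resequencing instance into a (source, target) template.

    source_parts: Blocks and constant strings, in source order.
    target_parts: constant strings and integer indices into source_parts,
                  in target order.
    """
    names = {}
    src_out = []
    for i, part in enumerate(source_parts):
        if isinstance(part, Block) and is_variable(part):
            names[i] = f"X{len(names) + 1}"
            src_out.append(names[i])
        elif isinstance(part, Block):
            # Not generalizable: keep the words of the block as constants.
            src_out.append(" ".join(word for word, _ in part.source))
        else:
            src_out.append(part)

    tgt_out = []
    for part in target_parts:
        if isinstance(part, int):
            tgt_out.append(names.get(part, source_parts[part].target))
        else:
            tgt_out.append(part)
    return " ".join(src_out), " ".join(tgt_out)


# Resequencing instance 2, with hypothetical tags and the connective written as "of".
source = [
    Block([("diplomatic", "ADJ"), ("relations", "NOUN")], "diplomatic relations"),
    "of",
    Block([("policy", "NOUN")], "policy"),
]
target = [2, "of", 0]  # the second block is translated first

template = extract_template(source, target)
```

With these inputs, `template` is the pair `("X1 of X2", "X2 of X1")`, matching the template given in [0114]. A block that fails the content-word test is simply left in the template as constant text.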

[0116] Since the variable part of the sequence instance where the sequence t...


Abstract

The invention relates to a method and a system for extracting a resequencing template in machine translation. The method comprises the following steps: (1) inputting an aligned bilingual corpus and performing word segmentation and part-of-speech tagging on the source-language part of the corpus; (2) performing resequencing analysis on each bilingual sentence pair in the corpus and extracting resequencing instances; (3) dividing each resequencing instance into two parts according to the positions of a word pair in the source language and the target language within the instance, and, for each part, determining the variable part according to the part-of-speech tags and replacing it with a variable. The invention removes the restrictions that the prior art places on translation-template extraction and can extract a variety of resequencing templates, thereby increasing the coverage of resequencing phenomena in translation by resequencing templates.
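
The abstract does not say how step (2) recognizes a resequencing instance. One common reading, assumed here and not taken from the patent text, is that two aligned blocks form such an instance when their order in the source sentence is the reverse of their order in the target sentence. The function name `find_inversions` and the span encoding are hypothetical.

```python
def find_inversions(blocks):
    """Return index pairs (i, j) of blocks whose order is reversed in translation.

    blocks: list of (source_span, target_span) pairs, where each span is an
    inclusive (start, end) pair of token positions.
    Block i and block j are reported when i precedes j in the source sentence
    but follows j in the target sentence.
    """
    inversions = []
    for i, (src_i, tgt_i) in enumerate(blocks):
        for j, (src_j, tgt_j) in enumerate(blocks):
            if src_i[1] < src_j[0] and tgt_i[0] > tgt_j[1]:
                inversions.append((i, j))
    return inversions


# Hypothetical alignment: the first two blocks swap places, the third stays put.
blocks = [
    ((0, 1), (3, 4)),
    ((3, 3), (0, 1)),
    ((4, 4), (5, 5)),
]
inversions = find_inversions(blocks)
```

Each reported pair would then be a candidate for step (3), where the two parts are generalized into variables.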

Description

Technical field

[0001] The invention relates to the field of machine translation, in particular to a method and system for extracting a resequencing template in machine translation.

Background

[0002] A translation template is a knowledge representation commonly used in machine translation to guide translation. It describes the correspondence that must be followed when translating from a source language to a target language. A translation template is a pair of strings, one on the source-language side and one on the target-language side, each composed of constants and variables, and the parts of the source-language and target-language strings correspond one-to-one.

[0003] An example of a simple Chinese-English translation template:

[0004] X today.

[0005] X today.

[0006] The constants in the template are language fragments, also known as terminal symbols. For example, "today" on the source side of the above example corresponds to "today" on the target side, and "." corresponds to ".". A variable refers to ...
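
To make the constant/variable distinction concrete, the following sketch matches the source side of a template against a token string and fills in the target side. It is an illustration under stated assumptions rather than the system described in the patent: variables are assumed to be tokens of the form `X`, `X1`, `X2`, each variable is assumed to occur once per side, and `translate` is a hypothetical callback that translates the text bound to a variable.

```python
import re

VARIABLE = re.compile(r"X\d*")


def bind(source_pattern, sentence):
    """Match a sentence against the source side; return variable bindings or None."""
    parts = []
    for token in source_pattern.split():
        if VARIABLE.fullmatch(token):
            parts.append(f"(?P<{token}>.+?)")   # variable: any non-empty span
        else:
            parts.append(re.escape(token))      # constant: must match literally
    match = re.fullmatch(r"\s+".join(parts), sentence)
    return match.groupdict() if match else None


def apply_template(source_pattern, target_pattern, sentence, translate):
    """Translate a sentence with one template, or return None if it does not match."""
    bindings = bind(source_pattern, sentence)
    if bindings is None:
        return None
    out = []
    for token in target_pattern.split():
        if token in bindings:
            out.append(translate(bindings[token]))  # variable: translate its content
        else:
            out.append(token)                       # constant: copy as given
    return " ".join(out)


# The template of [0004]-[0005], with tokens separated by spaces.
bindings = bind("X today .", "it rained today .")

# A template whose variables change order, using a stand-in translator.
reordered = apply_template("X1 of X2", "X2 of X1", "A of B", lambda s: s.lower())
```

Here `bindings` is `{"X": "it rained"}` and `reordered` is `"b of a"`. Constants must match literally, while each variable absorbs whatever span lies between the constants around it.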


Application Information

IPC(8): G06F17/28
Inventor: 蔡舒
Owner: INST OF COMPUTING TECHNOLOGY - CHINESE ACAD OF SCI