Participle-network-based word alignment fusion method for computer-aided Chinese-to-English translation

A fusion method and word alignment technology, applied in computing, special data processing applications, instruments, etc., can solve problems such as lack of

Inactive Publication Date: 2012-11-28
NANJING UNIV
Cites: 4; Cited by: 0

AI Technical Summary

Problems solved by technology

However, an effective method is still lacking for quickly selecting, for each sentence pair in the word alignment training corpus, a Chinese word segmentation method that is conducive to word alignment.



Examples

Embodiment

[0070] All algorithms used in the present invention are implemented in C#. The experiments were run on a machine with an Intel Xeon X5550 processor (2.66 GHz) and 16 GB of memory. GIZA++, the word alignment toolkit used in the present invention, is a widely used open-source word alignment toolkit; this laboratory compiled it under Cygwin to obtain a version that runs on the Windows platform. The remaining machine translation modules used in the present invention were rewritten in C# based on Moses, the open-source phrase-based statistical machine translation software.

[0071] Data preparation before implementation is as follows: segment the Chinese side of the English-Chinese parallel corpus with K word segmentation tools to obtain K segmentations s_k (k = 1, ..., K); then run conventional word alignment between each s_k (k = 1, ..., K) and the parallel English side to obtain the alignments a_k (k = 1, ..., K).
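The data preparation in [0071] can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the two toy segmenters, the dictionary-lookup aligner (a hypothetical stand-in for a real aligner such as GIZA++), and the function `prepare`. Python is used for brevity rather than the C# of the described implementation.

```python
from typing import Callable, Dict, List, Set, Tuple

Segmenter = Callable[[str], List[str]]
Alignment = Set[Tuple[int, int]]  # (Chinese word index, English word index)
Aligner = Callable[[List[str], List[str]], Alignment]


def char_segmenter(sentence: str) -> List[str]:
    """Toy segmenter (assumption): every character is its own word."""
    return list(sentence)


def make_max_match_segmenter(lexicon: Set[str], max_len: int = 4) -> Segmenter:
    """Toy segmenter (assumption): greedy forward maximum matching."""
    def segment(sentence: str) -> List[str]:
        words, i = [], 0
        while i < len(sentence):
            # Try the longest candidate first; a single character always matches.
            for size in range(min(max_len, len(sentence) - i), 0, -1):
                piece = sentence[i:i + size]
                if size == 1 or piece in lexicon:
                    words.append(piece)
                    i += size
                    break
        return words
    return segment


def make_dictionary_aligner(translations: Dict[str, str]) -> Aligner:
    """Hypothetical stand-in for a statistical aligner: link exact dictionary matches."""
    def align(zh_words: List[str], en_words: List[str]) -> Alignment:
        return {
            (i, j)
            for i, zh in enumerate(zh_words)
            for j, en in enumerate(en_words)
            if translations.get(zh) == en
        }
    return align


def prepare(zh: str, en: List[str], segmenters: List[Segmenter],
            aligner: Aligner) -> Tuple[List[List[str]], List[Alignment]]:
    """Return the K segmentations s_k and their K alignments a_k."""
    segmentations = [segment(zh) for segment in segmenters]
    alignments = [aligner(s_k, en) for s_k in segmentations]
    return segmentations, alignments


segmenters = [char_segmenter, make_max_match_segmenter({"喜欢", "苹果"})]
aligner = make_dictionary_aligner({"我": "I", "喜欢": "like", "苹果": "apples"})
s, a = prepare("我喜欢苹果", ["I", "like", "apples"], segmenters, aligner)
```

In this toy run the character-level segmentation aligns only 我, while the lexicon-based segmentation aligns all three words, which shows why the K alignments a_k can disagree and are worth fusing.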

...

Abstract

The invention provides a word alignment fusion method based on a word segmentation network for computer-aided Chinese-to-English translation. The method comprises the following steps: 1, determining the skeleton alignment: searching for and selecting the optimal skeleton connections with a connection selection algorithm based on connection confidence, and forming the skeleton alignment from them; and 2, projecting the selected skeleton alignment onto each word segmentation to obtain the word alignment under each segmentation. The method improves on conventional word alignment algorithms that rely on a single word segmentation, and can improve both the word alignment quality under each segmentation and the machine translation quality. By fusing word alignment features from multiple segmentations, the final word alignment is more robust, and fewer word alignment errors are caused by segmentation errors or by bilingual segmentation inconsistency.
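The two steps in the abstract can be sketched roughly as follows. This excerpt does not define the word segmentation network or the connection confidence, so the sketch rests on loudly stated assumptions: skeleton units are taken to be single Chinese characters, the confidence of a connection is taken to be the fraction of the K alignments that support it, and a fixed threshold stands in for the search and selection algorithm. All function names are hypothetical.

```python
from collections import Counter
from typing import List, Set, Tuple

Link = Tuple[int, int]  # (Chinese unit index, English word index)


def char_spans(words: List[str]) -> List[range]:
    """Character offsets covered by each word of one segmentation."""
    spans, start = [], 0
    for word in words:
        spans.append(range(start, start + len(word)))
        start += len(word)
    return spans


def to_skeleton(words: List[str], alignment: Set[Link]) -> Set[Link]:
    """Expand word-level links into character-level links (assumed skeleton units)."""
    spans = char_spans(words)
    return {(c, j) for i, j in alignment for c in spans[i]}


def select_skeleton(segmentations: List[List[str]], alignments: List[Set[Link]],
                    threshold: float = 0.5) -> Set[Link]:
    """Step 1 (assumed form): keep links whose vote share reaches the threshold."""
    votes: Counter = Counter()
    for words, alignment in zip(segmentations, alignments):
        votes.update(to_skeleton(words, alignment))
    k = len(segmentations)
    return {link for link, n in votes.items() if n / k >= threshold}


def project(words: List[str], skeleton: Set[Link]) -> Set[Link]:
    """Step 2 (assumed form): a word inherits the links of the characters it covers."""
    spans = char_spans(words)
    return {(i, j) for i, span in enumerate(spans) for c, j in skeleton if c in span}


segmentations = [
    ["我", "喜", "欢", "苹", "果"],
    ["我", "喜欢", "苹果"],
    ["我", "喜欢", "苹", "果"],
]
alignments = [
    {(0, 0)},
    {(0, 0), (1, 1), (2, 2)},
    {(0, 0), (1, 1), (2, 2)},
]
skeleton = select_skeleton(segmentations, alignments)
fused = [project(words, skeleton) for words in segmentations]
```

Under these assumptions the character-level segmentation, which originally aligned only 我, gains links for 喜, 欢 and 苹 from the other two segmentations, while the link for 果 is dropped because only one of the three alignments supports it.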

Description

Technical field

[0001] The invention relates to the field of computer software for language translation, and in particular to a word alignment fusion method based on a word segmentation network for computer Chinese-to-English translation.

Background technique

[0002] With the rapid growth in the amount of information in today's world, increasingly frequent international exchanges, and the rapid popularization and development of computer network technology, language barriers have become more obvious and serious, and people's potential demand for machine translation keeps increasing. Machine translation is the use of computers to translate between different languages. The language being translated from is called the source language, and the language being translated into is called the target language; machine translation is the process of converting the source language into the target language. In recent years, a series of important advances have been made in statistical machine translation method...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/28, G06F17/27
Inventor: 奚宁, 李博渊, 汤光超, 赵迎功, 陈家骏, 戴新宇, 张建兵
Owner: NANJING UNIV