Bilingual sentence alignment method and device

A bilingual sentence alignment method and device that reduce the complexity of building a corpus by hand, improving alignment efficiency and speeding up corpus construction.

Active Publication Date: 2009-07-22
深圳市点通数据有限公司

AI Technical Summary

Problems solved by technology

However, manually adding and enriching the corpus is undoubtedly a huge and complicated task, an



Embodiment Construction

[0030] The method flow of an embodiment of the bilingual sentence alignment method provided by the present invention is described in detail below with reference to Figure 1. As shown in the figure, the bilingual sentence alignment method in this embodiment proceeds as follows:

[0031] First, the sentence segmentation step is executed: the first-language text and the second-language text are each divided into a plurality of sentences at sentence-break marks. In a specific implementation, the first language and the second language can be any combination of two different languages, such as Chinese and English or English and Chinese. This embodiment takes the common combination of Chinese and English as an example. In Chinese, full stops, question marks, and exclamation marks all serve as sentence-break marks. If the text to be divided contains quotation marks, the content between the quotation marks is not split. Starting from the starting point, i...
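The segmentation rule described in [0031] can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function name `split_sentences` and the particular quotation-mark pairs are assumptions made for illustration, and only the Chinese side is shown. The sketch breaks at a full stop, question mark, or exclamation mark unless the mark falls between quotation marks.

```python
# Sentence-break marks named in the text: Chinese full stop, question mark, exclamation mark.
BREAKS = "。？！"
# Assumed quotation-mark pairs; the source does not list which ones it handles.
OPEN_QUOTES = "“「"
CLOSE_QUOTES = "”」"


def split_sentences(text):
    """Split text at break marks, never splitting between quotation marks."""
    sentences = []
    current = []
    depth = 0  # greater than 0 while inside quotation marks
    for ch in text:
        current.append(ch)
        if ch in OPEN_QUOTES:
            depth += 1
        elif ch in CLOSE_QUOTES:
            depth = max(depth - 1, 0)
        elif ch in BREAKS and depth == 0:
            sentences.append("".join(current).strip())
            current = []
    # Keep any trailing text that has no closing break mark.
    tail = "".join(current).strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    for s in split_sentences("他说：“你好。再见！”然后走了。今天天气好吗？"):
        print(s)
```

In the example, the full stop and exclamation mark inside the quotation are ignored, so the quoted speech stays in one sentence with the text around it.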


Abstract

The invention discloses a bilingual sentence alignment method comprising a sentence segmentation step, a word segmentation step, an alignment step, a matching step, and an execution step. The alignment step specifically comprises a region division step, in which the first-language text and the second-language text to be aligned are each divided, according to preset region division rules, into a plurality of comparison regions containing sentences of the first language and the second language. In the matching step, the mutual matching rate of every pair of sentences within each pair of corresponding first-language and second-language comparison regions is calculated, and the combinations of mutually matched first-language and second-language sentences are determined according to the mutual matching rate. In the execution step, the alignment operation is performed on the combination of first-language and second-language sentences with the highest mutual matching rate. The invention also discloses a corresponding bilingual sentence alignment device. The invention can greatly improve alignment efficiency and accelerate corpus construction.
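As a rough illustration of how region division, matching, and execution could fit together, here is a small Python sketch. The abstract gives neither the region division rules nor the formula for the mutual matching rate, so everything concrete below is an assumption: regions are fixed-size blocks of source sentences paired with proportionally placed target blocks, the matching rate is the mean of source-side and target-side coverage under a toy bilingual lexicon, and alignment is a greedy one-to-one choice by highest rate. The names `matching_rate`, `divide_regions`, `align`, and `lexicon` are hypothetical.

```python
def matching_rate(src_tokens, tgt_tokens, lexicon):
    """Assumed 'mutual' score: mean of source coverage and target coverage."""
    if not src_tokens or not tgt_tokens:
        return 0.0
    tgt_set = set(tgt_tokens)
    # Source words that have at least one lexicon translation in the target sentence.
    src_hits = sum(1 for w in src_tokens if lexicon.get(w, set()) & tgt_set)
    # Target words that are a lexicon translation of some source word.
    translations = set()
    for w in src_tokens:
        translations |= lexicon.get(w, set())
    tgt_hits = sum(1 for w in tgt_tokens if w in translations)
    return 0.5 * (src_hits / len(src_tokens) + tgt_hits / len(tgt_tokens))


def divide_regions(n_src, n_tgt, size):
    """Assumed rule: blocks of `size` source sentences, proportional target blocks."""
    n_regions = max(1, -(-n_src // size))  # ceiling division
    regions = []
    for r in range(n_regions):
        src = range(r * size, min((r + 1) * size, n_src))
        tgt = range(r * n_tgt // n_regions, (r + 1) * n_tgt // n_regions)
        regions.append((src, tgt))
    return regions


def align(src_sents, tgt_sents, lexicon, size=2):
    """Within each region, pair sentences greedily by highest matching rate."""
    pairs = []
    for src_idx, tgt_idx in divide_regions(len(src_sents), len(tgt_sents), size):
        scored = sorted(
            ((matching_rate(src_sents[i], tgt_sents[j], lexicon), i, j)
             for i in src_idx for j in tgt_idx),
            key=lambda t: (-t[0], t[1], t[2]),
        )
        used_src, used_tgt = set(), set()
        for rate, i, j in scored:
            if rate > 0 and i not in used_src and j not in used_tgt:
                pairs.append((i, j, rate))
                used_src.add(i)
                used_tgt.add(j)
    return sorted(pairs)


if __name__ == "__main__":
    # Toy lexicon and pre-segmented sentences, invented for this example.
    lexicon = {"猫": {"cat"}, "狗": {"dog"}, "喜欢": {"likes", "like"},
               "鱼": {"fish"}, "骨头": {"bones"}}
    src = [["猫", "喜欢", "鱼"], ["狗", "喜欢", "骨头"]]
    tgt = [["the", "dog", "likes", "bones"], ["the", "cat", "likes", "fish"]]
    for i, j, rate in align(src, tgt, lexicon):
        print(i, j, round(rate, 3))
```

Restricting comparison to regions keeps the number of scored sentence pairs proportional to the region size rather than to the full document length, which is one plausible reading of the efficiency gain the abstract claims.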

Description

Technical field

[0001] The invention relates to computer translation technology, in particular to a bilingual sentence alignment method and device.

Background

[0002] With the rapid expansion of information and the global integration of economy and trade, international communication is becoming more and more frequent. Quickly organizing, converting, and using large amounts of foreign-language material as needed has become a common and urgent need. Driven by this demand, using machine translation systems to assist people in rapid translation and file building has become an unavoidable trend, and computer-aided translation came into being.

[0003] However, there is still a huge gap between machine translation systems and human translation. A very important reason for the poor performance of machine translation systems is the lack of resources. No matter what machine translation method is used, a large number of large-scale knowledge resources are required, and these...


Application Information

IPC(8): G06F17/28
Inventor: 张玉志
Owner: 深圳市点通数据有限公司