Translation model establishing method and system

The invention applies semantic techniques to the construction of translation models, and addresses the problem that current machine translation quality remains far from the intended translation goal.

Inactive Publication Date: 2015-03-04
SUZHOU UNIV

AI Technical Summary

Problems solved by technology

The initial statistical machine translation method was based on the noisy channel model. Researchers later generalized the model in practice and proposed a statistical machine translation method based on maximum entropy. On this basis, statistical machine translation developed further into word-based, phrase-based and syntax-based methods, each of which improved translation performance to some extent compared with earlier translation models. However, these methods are still far from achieving the translation goal of "faithfulness, expressiveness and elegance".

Examples

Embodiment 1

[0071] The applicant found through research that the semantic information of a word or phrase reflects its relevance to the words or phrases in its context. Compared with traditional word-based or phrase-based translation methods, a translation model that incorporates phrase semantic information achieves higher translation quality. Therefore, in order to further improve the translation performance of statistical machine translation, the purpose of the present invention is to integrate phrase semantic information into the statistical machine translation system, thereby realizing a translation process based on phrase semantic information.

[0072] To this end, the first embodiment discloses a method for constructing a translation model that integrates phrase semantic information. Referring to Figure 1, the method may include the following steps:

[0073] S101: Obtain a bilingual parallel corpus, where the bilingual parallel corpus includes a comparative translation of a source language sentence to a target la...

Embodiment 2

[0101] In the second embodiment, referring to Figure 2, the method may further include the following steps:

[0102] S105: Use the translation model to translate the text to be translated.

[0103] Specifically, referring to Figure 3, translating the text to be translated in this step includes:

[0104] S301: Perform phrase segmentation on the sentence of the text to be translated to obtain the phrase sequence corresponding to the text to be translated;

[0105] S302: Extract the phrases from the phrase sequence in order and, for each extracted phrase, retrieve the corresponding aligned phrases from the rule alignment table. The aligned phrases include the source language phrase of the extracted phrase and its N corresponding candidate target language phrases, where N is a natural number not less than 1;

[0106] S303: Retrieve the phrase semantic vectors corresponding to the source language phrase and the N candidate target la...
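Steps S301 to S303 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the segmentation strategy (greedy longest match), the table layouts, the toy entries and all function names are assumptions made for illustration, and the sketch stops at vector retrieval because the remainder of S303 is not shown in this excerpt.

```python
from typing import Dict, List, Optional, Tuple

Vector = List[float]

# Hypothetical toy tables; real tables would be produced during training.
RULE_ALIGNMENT_TABLE: Dict[str, List[str]] = {
    "jiqi fanyi": ["machine translation"],
    "moxing": ["model", "pattern"],
}
SOURCE_PHRASE_VECTORS: Dict[str, Vector] = {
    "jiqi fanyi": [0.9, 0.1],
    "moxing": [0.2, 0.8],
}
TARGET_PHRASE_VECTORS: Dict[str, Vector] = {
    "machine translation": [0.8, 0.2],
    "model": [0.1, 0.9],
    "pattern": [0.4, 0.6],
}


def segment_phrases(tokens: List[str], table: Dict[str, List[str]],
                    max_len: int = 3) -> List[str]:
    """S301 (assumed strategy): greedy longest-match phrase segmentation."""
    phrases: List[str] = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            # Fall back to a single token when no longer phrase is known.
            if candidate in table or n == 1:
                phrases.append(candidate)
                i += n
                break
    return phrases


def lookup_candidates(phrase: str, table: Dict[str, List[str]],
                      n_best: int) -> List[str]:
    """S302: retrieve up to N candidate target phrases for a source phrase."""
    return table.get(phrase, [])[:n_best]


def retrieve_vectors(phrase: str, candidates: List[str],
                     src_vecs: Dict[str, Vector],
                     tgt_vecs: Dict[str, Vector]
                     ) -> Tuple[Optional[Vector], List[Tuple[str, Vector]]]:
    """S303 (first part only): fetch the semantic vectors of the source
    phrase and of each candidate target phrase that has a vector."""
    source_vector = src_vecs.get(phrase)
    target_vectors = [(c, tgt_vecs[c]) for c in candidates if c in tgt_vecs]
    return source_vector, target_vectors


if __name__ == "__main__":
    for p in segment_phrases(["jiqi", "fanyi", "moxing"], RULE_ALIGNMENT_TABLE):
        cands = lookup_candidates(p, RULE_ALIGNMENT_TABLE, n_best=2)
        print(p, retrieve_vectors(p, cands, SOURCE_PHRASE_VECTORS,
                                  TARGET_PHRASE_VECTORS))
```

Keeping the source and target phrase vectors in separate dictionaries mirrors the document's distinction between a source language semantic space and a target language semantic space.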

Embodiment 3

[0116] The third embodiment discloses a translation model construction system, which corresponds to the translation model construction methods disclosed in the above embodiments.

[0117] First, corresponding to the first embodiment and referring to Figure 4, the system includes an acquisition module 100, a first generation module 200, a second generation module 300, and a processing module 400.

[0118] The acquisition module 100 is configured to obtain a bilingual parallel corpus, the bilingual parallel corpus including a comparative translation of a source language sentence to a target language sentence.

[0119] The first generation module 200 is configured to use the bilingual parallel corpus to generate a rule alignment table, a word semantic vector table, and a phrase table. The rule alignment table includes bilingual hierarchical phrase rules, the word semantic vector table includes bilingual word semantic vectors, and the phrase table includes bilingual phrase information.

[012...
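The data handled by the acquisition module and the first generation module can be pictured with a small Python sketch. The class names, field names and table layouts below are hypothetical, and the sketch only models the containers; how the tables are actually filled from the corpus is not shown in this excerpt and is not implemented here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ParallelCorpus:
    """Sentence-aligned bilingual corpus (output of the acquisition module)."""
    pairs: List[Tuple[str, str]]  # (source sentence, target sentence)


@dataclass
class GeneratedTables:
    """Assumed containers for the outputs of the first generation module."""
    # Source-side rule -> target-side rules (bilingual hierarchical phrase rules).
    rule_alignment_table: Dict[str, List[str]] = field(default_factory=dict)
    # Word -> semantic vector, kept separately per language.
    source_word_vectors: Dict[str, List[float]] = field(default_factory=dict)
    target_word_vectors: Dict[str, List[float]] = field(default_factory=dict)
    # Source phrase -> aligned target phrases (bilingual phrase information).
    phrase_table: Dict[str, List[str]] = field(default_factory=dict)


def acquire_corpus(source_sentences: List[str],
                   target_sentences: List[str]) -> ParallelCorpus:
    """Pair up source and target sentences, rejecting misaligned input."""
    if len(source_sentences) != len(target_sentences):
        raise ValueError("source and target sides must have the same length")
    return ParallelCorpus(list(zip(source_sentences, target_sentences)))
```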

Abstract

The invention discloses a translation model establishing method and system. The method comprises the following steps: generating a rule alignment table, a word semantic vector table and a phrase table according to the alignment information of a bilingual parallel corpus; then using the word semantic vector table and the phrase table to generate a source language phrase semantic vector table in a source language semantic space and a target language phrase semantic vector table in a target language semantic space; and finally training on the phrase semantic vector tables of the different semantic spaces, thereby generating a translation model that integrates semantic information. Phrase semantic information can thus be integrated into statistical machine translation. Research shows that semantic information reflects the relevance of a word or phrase to the words or phrases in its context, and that, compared with a conventional word-based or phrase-based translation method, a translation model that integrates phrase semantic information achieves higher translation quality, so the translation performance of statistical machine translation is further improved over the prior art.
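As a rough illustration of how a phrase semantic vector table could be derived from a word semantic vector table and a phrase table, the Python sketch below averages the vectors of the words in each phrase. The averaging rule is an assumption chosen for simplicity; this excerpt does not state which composition function the invention uses, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional

Vector = List[float]


def compose_phrase_vector(phrase: str,
                          word_vectors: Dict[str, Vector]) -> Optional[Vector]:
    """Average the vectors of the known words in a phrase (assumed rule).

    Returns None when no word of the phrase has a vector.
    """
    vectors = [word_vectors[w] for w in phrase.split() if w in word_vectors]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def build_phrase_vector_table(phrases: Iterable[str],
                              word_vectors: Dict[str, Vector]
                              ) -> Dict[str, Vector]:
    """Build a phrase -> vector table for one language (one semantic space)."""
    table: Dict[str, Vector] = {}
    for phrase in phrases:
        vector = compose_phrase_vector(phrase, word_vectors)
        if vector is not None:
            table[phrase] = vector
    return table
```

Calling `build_phrase_vector_table` once with source-side phrases and word vectors and once with target-side ones would yield two separate tables, one per semantic space.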

Description

Technical field

[0001] The invention belongs to the technical field of statistical machine translation, and in particular relates to a method and system for constructing a translation model.

Background art

[0002] In recent years, with the improvement of computing power and the continuous enrichment of corpus resources, statistical machine translation has gradually become one of the most important research topics in the field of natural language processing.

[0003] Statistical machine translation usually involves two main processes: training and decoding. Training refers to learning a statistical translation model from corpus resources according to a certain algorithm; decoding, or translation, refers to translating the text to be translated with the trained translation model. The initial statistical machine translation method was based on the noisy channel model. Later, in practice, the researche...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/28, G06F17/27
Inventor: Xiong Deyi (熊德意), Wang Chaochao (王超超), Zhang Min (张民)
Owner: SUZHOU UNIV