Method and device for adapting a machine translation system based on language database to new field

A technology of machine translation and corpus, applied in the field of information processing

Inactive Publication Date: 2010-05-26
KK TOSHIBA
Cites: 0; Cited by: 21

AI Technical Summary

Problems solved by technology

For such a situation, the above-mentioned existing methods are powerless



Embodiment Construction

[0022] Various preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

[0023] Figure 1 is a flowchart of a method for adapting a corpus-based machine translation system to a new domain according to an embodiment of the present invention. The purpose of this embodiment is to adapt a corpus-based machine translation system that has been trained in one domain to a new domain for which no bilingual corpus, or only a small one, is available.

[0024] As shown in Figure 1, first, in step 105, a piece of source language text in the new domain is obtained. The source language text includes multiple source language sentences.

[0025] In step 110, the source language text in the new domain is translated using the corpus-based machine translation system.

[0026] In step 115, for each source language sentence in the source language text, an evaluation of the tran...



Abstract

The invention provides a method and a system for adapting a corpus-based machine translation system to a new field. The method comprises the following steps: translating a plurality of source language sentences in the new field by using a corpus-based machine translation system that has been trained in another field; selecting, from the plurality of source language sentences, those whose translation results are evaluated below a preset first evaluation threshold; recognizing text fragments related to the new field in the selected source language sentences; and updating the machine translation system by using the plurality of source language sentences and their translation results, together with the text fragments related to the new field and their correct translations. In the invention, a machine translation system that has been well trained outside the new field is further trained with the text fragments related to the new field that are recognized while repeatedly translating text in the new field, so that its translation performance in the new field improves continuously.
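The steps in the abstract can be sketched as one adaptation round. This is a minimal illustration under loud assumptions, not the patented implementation: `ToyTranslator`, `adapt_once`, and `get_correct_translation` are hypothetical names, the evaluation score is approximated by vocabulary coverage, and "text fragments related to the new field" are approximated by words missing from the lookup table. The source does not specify any of these choices.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ToyTranslator:
    """Hypothetical stand-in for a corpus-based MT system (word lookup table)."""
    table: Dict[str, str] = field(default_factory=dict)
    corpus: List[Tuple[str, str]] = field(default_factory=list)

    def translate(self, sentence: str) -> Tuple[str, float]:
        words = sentence.split()
        output = [self.table.get(w, w) for w in words]
        # Assumed evaluation: fraction of words the table can translate.
        known = sum(1 for w in words if w in self.table)
        score = known / len(words) if words else 1.0
        return " ".join(output), score

    def update(self, sentence_pairs, fragment_pairs) -> None:
        # Keep the machine-translated sentence pairs and learn the fragments.
        self.corpus.extend(sentence_pairs)
        self.table.update(fragment_pairs)


def adapt_once(
    mt: ToyTranslator,
    sentences: List[str],
    threshold: float,
    get_correct_translation: Callable[[str], str],
) -> Dict[str, str]:
    """Run one round: translate, select low scores, find fragments, update."""
    results = [(s, *mt.translate(s)) for s in sentences]
    # Select sentences whose evaluation is below the first threshold.
    low = [s for s, _, score in results if score < threshold]
    # Assumed fragment recognition: words unknown to the current table.
    fragments = sorted({w for s in low for w in s.split() if w not in mt.table})
    # Correct translations come from an external source (e.g. a human expert).
    fragment_pairs = {f: get_correct_translation(f) for f in fragments}
    mt.update([(s, t) for s, t, _ in results], fragment_pairs)
    return fragment_pairs


mt = ToyTranslator(table={"the": "le", "cat": "chat"})
glossary = {"stent": "stent_fr"}  # hypothetical correct translations
learned = adapt_once(mt, ["the cat", "the stent"], 0.8, glossary.__getitem__)
```

Calling `adapt_once` repeatedly on new in-domain text mirrors the "repeatedly translating" wording of the abstract: each round leaves fewer sentences below the threshold, provided the correct fragment translations are available.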

Description

Technical field

[0001] The present invention relates to information processing technology, and in particular to a method and apparatus for adapting a corpus-based machine translation system to a new field.

Background technique

[0002] Machine translation technology is mainly divided into rule-based machine translation and corpus-based machine translation.

[0003] In a corpus-based machine translation system, the main translation resources come from bilingual corpora.

[0004] That is to say, in a corpus-based machine translation system, the parallel bilingual data in the bilingual corpus serves as the training basis for machine translation. Such a system uses parallel bilingual data that has been processed by sentence alignment and phrase alignment to train the translation model, and when the user inputs a sentence to be translated, the translation model is used to obtain the target language translation of the input sentence. ...
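To make the idea of training a translation model from phrase-aligned bilingual data concrete, here is a toy sketch. It is an assumption-laden illustration, not the system described in the patent: `train_phrase_table` and `translate` are hypothetical names, probabilities are plain relative frequencies, and decoding is a greedy longest-match pass with no reordering or language model.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

PhraseTable = Dict[str, Dict[str, float]]


def train_phrase_table(aligned_pairs: Iterable[Tuple[str, str]]) -> PhraseTable:
    """Estimate P(target | source) by relative frequency over aligned phrases."""
    counts = defaultdict(Counter)
    for src, tgt in aligned_pairs:
        counts[src][tgt] += 1
    table: PhraseTable = {}
    for src, tgt_counts in counts.items():
        total = sum(tgt_counts.values())
        table[src] = {t: c / total for t, c in tgt_counts.items()}
    return table


def translate(sentence: str, table: PhraseTable, max_len: int = 3) -> str:
    """Greedy left-to-right decoding that prefers the longest known phrase."""
    words = sentence.split()
    output = []
    i = 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in table:
                # Pick the most probable target phrase for this source phrase.
                output.append(max(table[phrase], key=table[phrase].get))
                i += n
                break
        else:
            # Unknown word: copy it through unchanged.
            output.append(words[i])
            i += 1
    return " ".join(output)


# Hypothetical phrase-aligned training data.
pairs = [
    ("machine translation", "traduction automatique"),
    ("machine translation", "traduction automatique"),
    ("machine", "machine"),
    ("corpus", "corpus"),
]
phrase_table = train_phrase_table(pairs)
```

With this table, words the training data never covered are copied through untranslated, which is the kind of gap that appears when a system trained in one field meets text from a new field.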


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/28
Inventor: 吴华, 王海峰
Owner: KK TOSHIBA