Method and device for obtaining corpus, method and system for generating translation model, and method and system for machine translation

A translation-model and corpus technique, applied in the field of machine translation, that addresses insufficient translation accuracy for texts from different fields and improves translation accuracy.

Active Publication Date: 2013-04-17
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] The technical problem to be solved by the present invention is to provide a method and device for acquiring corpus, a method and system for generating translation models, and a method and system for machine translation, so as to overcome the defect in the prior art that translation accuracy is not high enough when translating texts from different fields.



Examples


Embodiment Construction

[0036] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0037] Please refer to Figure 1, which is a schematic structural block diagram of an embodiment of a machine translation system according to the present invention. As shown in Figure 1, the machine translation system includes a classification module 101, a translation module 102, a training module 103, a model generation module 104, and a corpus acquisition module 105.

[0038] The corpus acquisition module 105 is used to acquire training corpus in various fields for use by other modules.

[0039] The corpus acquisition module 105 includes a merging unit 1051, a selection unit 1052, a clustering unit 1053, a training unit 1054, and a classification unit 1055.

[0040] Wherein the merging unit 1051 is used for merging the bilingual sente...


Abstract

The invention provides a method and device for obtaining corpus, a method and system for generating a translation model, and a method and system for machine translation. The machine translation system comprises a classification module and a translation module. The classification module uses a first classification model to classify the text to be translated and thereby determine the field to which the text belongs; the first classification model is obtained by training on corpus from various fields. The translation module translates the text using the field translation model that corresponds to the field to which the text belongs, where each field translation model is obtained by training on the corpus of the corresponding field. In this way, translation accuracy can be improved effectively.
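The classify-then-translate flow described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the function names (`train_domain_classifier`, `classify`, `translate`), the toy training corpus, the naive Bayes classifier standing in for the "first classification model", and the word-substitution lexicons standing in for the field translation models.

```python
import math
from collections import Counter

# Hypothetical field-labelled training corpus (source-language sentences only).
TRAINING_CORPUS = {
    "finance": [
        "the bank raised interest rates",
        "she opened an account at the bank",
        "the loan was approved by the bank",
    ],
    "geography": [
        "the river bank was eroded by the flood",
        "trees grow along the bank of the river",
        "the boat drifted toward the bank",
    ],
}

# Toy stand-ins for per-field translation models: one small lexicon per field.
FIELD_LEXICONS = {
    "finance": {"bank": "银行"},    # financial institution
    "geography": {"bank": "河岸"},  # river bank
}


def train_domain_classifier(corpus):
    """Train a multinomial naive Bayes classifier from field-labelled sentences."""
    word_counts = {
        field: Counter(w for sent in sents for w in sent.lower().split())
        for field, sents in corpus.items()
    }
    vocab = set().union(*word_counts.values())
    total_sents = sum(len(sents) for sents in corpus.values())
    log_priors = {
        field: math.log(len(sents) / total_sents)
        for field, sents in corpus.items()
    }
    return word_counts, vocab, log_priors


def classify(text, model):
    """Return the field with the highest log-probability for the text."""
    word_counts, vocab, log_priors = model
    scores = {}
    for field, counts in word_counts.items():
        total = sum(counts.values())
        score = log_priors[field]
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total + len(vocab)))
        scores[field] = score
    return max(scores, key=scores.get)


def translate(text, model):
    """Classify the text, then apply the lexicon of the selected field."""
    field = classify(text, model)
    lexicon = FIELD_LEXICONS[field]
    output = " ".join(lexicon.get(w, w) for w in text.lower().split())
    return field, output


model = train_domain_classifier(TRAINING_CORPUS)
finance_result = translate("the bank approved the loan", model)
geography_result = translate("they walked along the river bank", model)
```

In this sketch the same source word is rendered differently depending on which field the classifier selects. A real system would replace the lexicons with full translation models trained on each field's corpus.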

Description

Technical field

[0001] The present invention relates to the field of machine translation, in particular to a method and device for acquiring corpus, a method and system for generating a translation model, and a method and system for machine translation.

Background technique

[0002] In machine translation, the translation model used has a great influence on translation quality. Existing machine translation methods do not distinguish between the types of texts to be translated; instead, the same translation model is used to translate all types of texts. This leads to large differences in the quality of translation results for different types of texts.

[0003] For example, the English word "bank" should be translated in the sense of a financial institution in the context of economics and finance, and in the sense of "river bank" in the context of geography. If these cases are not differentiated and the same translation model is used for translation, the quality of translation ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/28, G06F17/30
Inventor: 马艳军, 吴华, 王海峰
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD