Machine translation method and apparatus, and machine translation system training method and apparatus

A machine translation and training-data technology, applied in natural language translation, instruments, special data processing applications, etc. It addresses problems such as degraded translation quality, named entity characters that cannot be translated, and translation loss.

Inactive Publication Date: 2018-05-25
BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD

AI Technical Summary

Problems solved by technology

To keep machine translation timely, the size of the translation vocabulary must be limited, which makes it difficult to cover words that are low-frequency but important.
This is especially true of named entity (NE) characters: because they are not included in the translation vocabulary, they cannot be translated, which can cause translation loss.
As a result, current translation mechanisms struggle to fully cover named entity characters, which seriously degrades translation quality.
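
The coverage problem can be illustrated with a minimal sketch, assuming a toy frequency-capped vocabulary. The corpus, the cap of five entries, the `<unk>` placeholder, and the function names are all illustrative assumptions, not details taken from the patent.

```python
from collections import Counter

def build_vocab(corpus, max_size):
    """Keep only the max_size most frequent tokens (hypothetical size cap)."""
    counts = Counter(tok for sent in corpus for tok in sent.split())
    return {tok for tok, _ in counts.most_common(max_size)}

def encode(sentence, vocab, unk="<unk>"):
    """Replace every out-of-vocabulary token with a placeholder."""
    return [tok if tok in vocab else unk for tok in sentence.split()]

# Toy corpus: the place name "zhangjiajie" is rare, so the cap excludes it.
corpus = [
    "the meeting is in the city",
    "the meeting is in the morning",
    "the city is big",
    "zhangjiajie is a city",
]
vocab = build_vocab(corpus, max_size=5)
encoded = encode("the meeting is in zhangjiajie", vocab)
print(encoded)
```

In this sketch the named entity is the only token that falls outside the capped vocabulary, so it is the only part of the sentence whose content would be lost before translation even begins.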



Examples


Embodiment Construction

[0081] In order to make the above objects, features and advantages of the present invention more comprehensible, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0082] Refer to Figure 1, which shows a flowchart of the steps of an embodiment of a machine translation method of the present invention. The method may specifically include the following steps:

[0083] Step 101: receive an input original character string in a source language. The original character string includes named entity characters and non-named entity characters, and each named entity character has an entity category label to which it belongs.
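
One possible in-memory representation of the Step 101 input is sketched below. It assumes the named entity spans and their labels are already available as `(start, end, label)` triples; the `Segment` class, the `segment` helper, the label names `PERSON` and `LOCATION`, and the sample sentence are hypothetical and do not come from the patent, which does not say here how the labels are assigned.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Segment:
    text: str
    label: Optional[str] = None  # None marks non-named-entity characters

def segment(source: str, spans: List[Tuple[int, int, str]]) -> List[Segment]:
    """Split `source` into labeled NE segments and unlabeled remainders.

    `spans` holds (start, end, label) triples with `end` exclusive.
    The spans are assumed not to overlap.
    """
    segments, cursor = [], 0
    for start, end, label in sorted(spans):
        if cursor < start:
            # Text between two entities carries no label.
            segments.append(Segment(source[cursor:start]))
        segments.append(Segment(source[start:end], label))
        cursor = end
    if cursor < len(source):
        segments.append(Segment(source[cursor:]))
    return segments

source = "李明住在北京"
segments = segment(source, [(0, 2, "PERSON"), (4, 6, "LOCATION")])
for seg in segments:
    print(seg)
```

Concatenating the segment texts reproduces the original string, so nothing is lost by splitting it this way.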

[0084] The embodiment of the present invention can be applied to translation between any two or more languages, for example, translation between Chinese-English, English-Chinese, Chinese-English-Japanese and other languages. In practical applications, users can input charact...


Abstract

An embodiment of the invention provides a machine translation method. The method comprises: receiving an input original character string in a source language, wherein the original character string comprises named entity characters and non-named entity characters, and each named entity character has an entity category label to which it belongs; translating the original character string into an intermediate character string, specifically by replacing each named entity character in the original character string with its entity category label and translating the non-named entity characters into target language characters; and translating the intermediate character string into a target character string, specifically by searching a preset mapping table for the target character that matches the named entity character and its entity category label, and replacing the corresponding entity category label in the intermediate character string with that target character. The method can improve translation quality.
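
The flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the mapping table contents, the tag syntax `<LABEL>`, the function names, and the lookup-based `stub_translate` (a stand-in for a trained translation system) are all hypothetical. The sketch also assumes that entities sharing a label appear in the same order in the source and in the intermediate string.

```python
import re

# Hypothetical preset mapping table: (source entity, label) -> target text.
MAPPING_TABLE = {
    ("李明", "PERSON"): "Li Ming",
    ("北京", "LOCATION"): "Beijing",
}

def to_tagged_source(segments):
    """Replace each named entity with its label tag; keep entities in order."""
    tagged, entities = [], []
    for text, label in segments:
        if label is None:
            tagged.append(text)
        else:
            tagged.append(f"<{label}>")
            entities.append((text, label))
    return "".join(tagged), entities

def stub_translate(tagged_source):
    """Stand-in for the translation system: a fixed lookup, for illustration only."""
    lookup = {"<PERSON>住在<LOCATION>": "<PERSON> lives in <LOCATION>"}
    return lookup[tagged_source]

def fill_tags(intermediate, entities, table):
    """Replace each tag in the intermediate string with its mapped target text."""
    remaining = list(entities)

    def substitute(match):
        label = match.group(1)
        for i, (text, ent_label) in enumerate(remaining):
            if ent_label == label:
                remaining.pop(i)
                # Fall back to the source text if the table has no entry.
                return table.get((text, label), text)
        return match.group(0)  # leave tags with no matching entity untouched

    return re.sub(r"<([A-Z]+)>", substitute, intermediate)

segments = [("李明", "PERSON"), ("住在", None), ("北京", "LOCATION")]
tagged_source, entities = to_tagged_source(segments)
intermediate = stub_translate(tagged_source)
target = fill_tags(intermediate, entities, MAPPING_TABLE)
print(target)
```

Because the translation step only ever sees label tags, the entities themselves do not need to be in the translation vocabulary; they are resolved separately through the table lookup.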

Description

Technical field

[0001] The present invention relates to the technical field of language processing, in particular to a machine translation method and device, a machine translation system training method and device, and a machine translation device and a machine translation system training device.

Background

[0002] At present, globalization has created an urgent need for machine translation (Machine Translation) between multiple languages. Because neural machine translation systems are simple to build and deliver good translation quality, machine translation through such systems has become mainstream.

[0003] However, the heavy demands that a neural machine translation system places on computing equipment, together with its system framework, mean that the size of the translation vocabulary is inversely proportional to the efficiency of machine translation and machine training. Therefore, in o...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/28
CPC: G06F40/58
Inventor: 程善伯, 王宇光, 姜里羊, 陈伟, 王砚峰
Owner: BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD