Translation corpus processing method and device

A technology based on a corpus and post-translation phrases, applied in the field of translation corpus processing, that addresses problems such as differences in words or phrases affecting the judgment of translation accuracy, and achieves the effect of improving accuracy and translation efficiency.

Active Publication Date: 2019-08-20
BEIJING KINGSOFT DIGITAL ENTERTAINMENT CO LTD +1

AI Technical Summary

Problems solved by technology

[0004] In practice, when Chinese is translated into the target English, the target English may have the same semantics as the English to be translated while using different words or phrases, which affects the judgment of translation accuracy.
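The problem can be illustrated with a small sketch. The two sentences and the helper `differing_words` are hypothetical and are not taken from the application; the sketch only shows that a direct comparison reports a mismatch even though the two sentences mean the same thing.

```python
from typing import List, Tuple


def differing_words(source: str, back_translated: str) -> Tuple[List[str], List[str]]:
    """Return the words found only in the source and only in the back-translation."""
    source_tokens = source.lower().split()
    target_tokens = back_translated.lower().split()
    only_in_source = [w for w in source_tokens if w not in target_tokens]
    only_in_target = [w for w in target_tokens if w not in source_tokens]
    return only_in_source, only_in_target


# Hypothetical example: the English to be translated and the target English
# obtained after translating to Chinese and back.
source = "He made a fast decision"
back_translated = "He made a quick decision"

exact_match = source == back_translated
only_in_source, only_in_target = differing_words(source, back_translated)
print(exact_match, only_in_source, only_in_target)
```

An exact comparison fails here solely because "fast" became "quick", which is the kind of wording difference the application sets out to handle.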


Examples


Embodiment Construction

[0027] In the following description, many specific details are set forth to provide a full understanding of this application. However, this application can be implemented in many ways other than those described herein, and those skilled in the art can make similar generalizations without departing from the spirit of this application. Therefore, this application is not limited to the specific implementations disclosed below.

[0028] The terms used in one or more embodiments of this specification are only for the purpose of describing specific embodiments and are not intended to limit one or more embodiments of this specification. The singular forms "a", "said" and "the" used in one or more embodiments of this specification and the appended claims are also intended to include plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" used in one or more embodiments of this specification refers to and includes any or all ...


Abstract

The invention provides a translation corpus processing method and device. The translation corpus processing method comprises the steps of: acquiring a first language translation corpus, a first language corpus phrase and a post-translation phrase; performing word segmentation on the first language translation corpus to obtain a corpus word segmentation table, and performing word segmentation on the first language corpus phrase and the post-translation phrase to obtain a phrase word segmentation table; creating a word probability table according to the corpus word segmentation table and the phrase word segmentation table, wherein the word probability table comprises to-be-selected words and the probabilities corresponding to the to-be-selected words; taking each word in the corpus word segmentation table as a reference word, traversing the reference words to find each reference word that is the same as a to-be-selected word, taking that reference word as a target word, and obtaining the probability of the to-be-selected word corresponding to it; and determining the phrase in the first language translation corpus that corresponds to the first language corpus phrase according to the target words and their probabilities.
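The traversal and phrase-selection steps of the abstract can be sketched as follows. This is a minimal illustration under several assumptions, not the applicant's implementation: segmentation is plain whitespace splitting, the word probability table is filled in by hand (the excerpt does not say how the probabilities are computed), and the rule in the hypothetical `select_phrase` (take the span between the first and last target word whose probability reaches a threshold) is an invented stand-in for the final determination step.

```python
from typing import Dict, List, Optional, Tuple


def segment(text: str) -> List[str]:
    """Toy word segmentation: lowercase and split on whitespace."""
    return text.lower().split()


def find_target_words(
    corpus_tokens: List[str], word_probs: Dict[str, float]
) -> List[Tuple[int, str, float]]:
    """Treat each corpus token as a reference word and keep those that equal
    a to-be-selected word, together with that word's probability."""
    targets = []
    for index, reference_word in enumerate(corpus_tokens):
        if reference_word in word_probs:
            targets.append((index, reference_word, word_probs[reference_word]))
    return targets


def select_phrase(
    corpus_tokens: List[str],
    targets: List[Tuple[int, str, float]],
    threshold: float = 0.5,
) -> Optional[str]:
    """Assumed rule: return the span from the first to the last target word
    whose probability is at least `threshold`, or None if there is none."""
    kept = [index for index, _, prob in targets if prob >= threshold]
    if not kept:
        return None
    return " ".join(corpus_tokens[min(kept): max(kept) + 1])


# Hypothetical inputs: the first language translation corpus and a
# hand-written word probability table for the phrase "fast decision".
corpus = "He made a quick decision after the meeting"
word_probs = {"quick": 0.8, "decision": 0.9, "meeting": 0.1}

corpus_tokens = segment(corpus)
targets = find_target_words(corpus_tokens, word_probs)
phrase = select_phrase(corpus_tokens, targets)
print(phrase)
```

With these inputs, the low-probability match "meeting" is filtered out and the remaining target words delimit the corpus phrase that corresponds to "fast decision".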

Description

Technical field

[0001] This application relates to the field of natural language processing technology, and in particular to a translation corpus processing method and device, a computing device, and a computer-readable storage medium.

Background

[0002] Natural language processing is an important direction in the fields of computer science and artificial intelligence. Natural language processing includes translation between two different languages.

[0003] Take translation between Chinese and English as an example. To test the accuracy of a translation from English to Chinese, the English to be translated is generally translated into Chinese and then translated from Chinese back into the target English. The target English is then compared with the English to be translated to determine whether the translation is accurate.

[0004] In practice, in the process of translating from Chinese to target English, the target English translated from Chinese will have ...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F17/28, G06F17/27, G06F16/33, G06F16/36, G06F16/31
CPC: G06F16/3346, G06F16/374, G06F16/31, G06F40/289, G06F40/58, Y02D10/00
Inventor: 李长亮, 李天阳, 唐剑波, 王献
Owner: BEIJING KINGSOFT DIGITAL ENTERTAINMENT CO LTD