Statistical machine translation method for realizing domain self-adaptation

A statistical machine translation and domain self-adaptation technology, applied in natural language translation, instruments, computing, and related areas. It addresses the difficulty of producing high-quality translations, insufficient domain knowledge, and weak self-adaptive ability, and aims to improve translation accuracy.

Inactive Publication Date: 2017-06-20
成都佳音多语信息技术有限公司
5 cites, 6 cited by

AI Technical Summary

Problems solved by technology

The original data used to train a machine translation system may come from a wide range of fields. When the system encounters rare words and sentence patterns from a specific field, adapting quickly enough to produce high-quality translations is difficult, because corpora in such fields are scarce and the system lacks the knowledge it needs when migrating to them.
At present, several well-known online translation systems handle news translation well (because news corpora are the most abundant), but their adaptive ability is much weaker in fields where corpora are scarce, such as banking and law.

Embodiment Construction

[0018] The present invention is further described below:

[0019] The present invention provides a statistical machine translation method for realizing domain self-adaptation, comprising the following steps:

[0020] a. Based on an existing knowledge system, all Chinese-English nouns and noun phrases are used to build a computer-recognizable tree structure diagram of the knowledge system. The tree structure diagram comprises a number of sequentially arranged, progressively subdivided levels, labeled from 1 to n. The Chinese-English nouns and noun phrases are divided into common nouns and industry nouns: common nouns belong to the first level, and industry nouns are subdivided by field starting from the second level. Common nouns usually do not affect the contextual field, whereas subdivided industry nouns have a greater influence on the field, and more finely subdivided industry vocabulary has a higher degree of influence on ...
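The tree described in step a could be represented roughly as follows. This is only a minimal sketch under stated assumptions: the class `KnowledgeNode`, the function `influence_weight`, the sample fields, and the sample noun are all hypothetical, and the linear weight (level minus one, so that common nouns contribute nothing) is an illustrative choice, since the text here does not give the actual weighting.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class KnowledgeNode:
    """One position in the knowledge tree (hypothetical representation)."""
    name: str
    level: int  # 1 = common nouns, 2..n = increasingly subdivided industry fields
    # Excluded from repr/eq so the parent <-> child links do not recurse forever.
    parent: Optional["KnowledgeNode"] = field(default=None, repr=False, compare=False)
    children: List["KnowledgeNode"] = field(default_factory=list)
    nouns: Dict[str, str] = field(default_factory=dict)  # Chinese noun -> English noun

    def add_child(self, name: str) -> "KnowledgeNode":
        """Create a more finely subdivided field one level below this node."""
        child = KnowledgeNode(name=name, level=self.level + 1, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        """Return the position of this node as a slash-separated path from the root."""
        parts = []
        node: Optional["KnowledgeNode"] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


def influence_weight(node: KnowledgeNode) -> int:
    """Assumed weighting: deeper (more subdivided) levels influence the field more."""
    return node.level - 1


# Illustrative tree: common nouns at level 1, industry fields from level 2 down.
root = KnowledgeNode(name="common", level=1)
finance = root.add_child("finance")
banking = finance.add_child("banking")
banking.nouns["头寸"] = "position"  # sample entry, not taken from the patent
```

With this layout, a noun stored under `banking` sits at level 3 and carries more weight toward its field than one stored under `finance`, which matches the qualitative rule stated above.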

Abstract

The invention discloses a statistical machine translation method that realizes domain self-adaptation. The method comprises the following steps: building a computer-recognizable tree structure diagram of a knowledge system from all Chinese-English bilingual nouns and noun phrases, so that each noun and noun phrase has a corresponding level in the knowledge tree; calculating the sum of the domain influence weights for each domain position point; and comparing these sums to find the domain position point with the highest total influence weight, then determining the corresponding translation vocabulary in that knowledge domain according to a noun dictionary. By simulating the knowledge framework of the human brain, the method lets a computer learn how a human identifies the relevant field when reading and analyzing text, so that the computer can recognize the knowledge field a word belongs to. This realizes domain self-adaptation for machine translation and thereby improves translation accuracy.
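The selection step in the abstract can be sketched as follows. This is an illustration under assumptions, not the patented procedure: the dictionary `NOUN_DICT`, its entries, the functions `pick_domain` and `translate_noun`, the level-minus-one weight, and the rule that every listed sense of a noun votes for its own domain point are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical noun dictionary: source noun -> [(domain position point, level, translation)].
# All entries are illustrative and not taken from the patent.
NOUN_DICT: Dict[str, List[Tuple[str, int, str]]] = {
    "利息": [("common/finance/banking", 3, "interest")],
    "余额": [("common/finance/banking", 3, "balance"), ("common", 1, "remainder")],
    "账户": [("common/finance", 2, "account")],
    "水": [("common", 1, "water")],
}


def pick_domain(nouns: List[str]) -> str:
    """Sum the assumed influence weights per domain point and return the highest."""
    totals: Dict[str, int] = defaultdict(int)
    for noun in nouns:
        for domain, level, _translation in NOUN_DICT.get(noun, []):
            totals[domain] += level - 1  # assumed weight: deeper level, larger weight
    if not totals:
        return "common"  # nothing recognized: fall back to the first level
    return max(totals, key=lambda domain: totals[domain])


def translate_noun(noun: str, domain: str) -> Optional[str]:
    """Prefer the sense registered under the chosen domain, else the first listed sense."""
    senses = NOUN_DICT.get(noun, [])
    for sense_domain, _level, translation in senses:
        if sense_domain == domain:
            return translation
    return senses[0][2] if senses else None


sentence_nouns = ["余额", "利息", "账户", "水"]
domain = pick_domain(sentence_nouns)
translations = {noun: translate_noun(noun, domain) for noun in sentence_nouns}
```

Here the two banking-level nouns outweigh the common-level ones, so the ambiguous noun is resolved to its banking sense rather than its general one.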

Description

Technical field

[0001] The invention belongs to the technical field of statistical machine translation, and in particular relates to a statistical machine translation method for realizing domain self-adaptation.

Background technique

[0002] Statistical machine translation is the most popular type of machine translation in use today. It works by training a translation engine on very large parallel texts as well as monolingual corpora. The system looks for statistical correlations between the source text and the translation. Then, for a source-language sentence, it finds the translation with the highest probability. The translation engine itself has no concept of rules or grammar.

[0003] The main disadvantage of statistical machine translation is that if the training corpus contains no material similar to the input, the resulting translation will be poor. For example, a translation engine trained on technical text will perform poorly on colloquial text. Therefore, t...
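The "translation with the highest probability" mentioned in [0002] is conventionally written in the noisy-channel form below. The notation is an assumption introduced here for illustration and does not come from the patent: f denotes the source-language sentence and e a candidate translation.

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f) \;=\; \arg\max_{e} P(f \mid e)\, P(e)
```

In this standard formulation, the translation model P(f | e) is estimated from the parallel text and the language model P(e) from the monolingual corpus, which is why both kinds of data appear in the training description.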

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/28
CPC: G06F40/44, G06F40/58
Inventor: 梁如昕
Owner: 成都佳音多语信息技术有限公司