Statistical machine translation system based on a syntactic skeleton

A statistical machine translation technology, applied in natural language translation, instruments, computing, etc. It addresses the poor system robustness caused by sparse rules and weak handling of short sentence fragments, and the time-consuming, labor-intensive manual labeling of skeleton information. The model is easy to implement and its effect is notable.

Active Publication Date: 2016-05-11
沈阳雅译网络技术有限公司

AI Technical Summary

Problems solved by technology

[0006] Syntactic translation systems in the prior art suffer from robustness problems because they translate and order short sentence fragments poorly and their rules are sparse; non-syntactic translation systems cannot effectively order sentence components over long distances; and manually labeling skeleton information is time-consuming and labor-intensive. To address these problems, the present invention provides a statistical machine translation system based on a syntactic skeleton. It models the high-level syntactic skeleton of the source language while still translating low-level phrases well, and it proposes a novel representation of the syntactic skeleton for machine translation systems to use.


Examples


Embodiment Construction

[0049] The present invention will be further elaborated below in conjunction with the accompanying drawings of the description.

[0050] As shown in Figure 1, the statistical machine translation system based on a syntactic skeleton of the present invention comprises the following steps:

[0051] 1) Extract non-syntactic translation rules with the probabilistic SCFG hierarchical rule extraction method, for translating the non-skeleton part of the sentence to be translated:

[0052] Using the heuristic restrictions of hierarchical rule extraction, extract probabilistic SCFG grammar rules from parallel sentence pairs that have been aligned but not syntactically analyzed, and use these non-syntactic translation rules to translate the low-level structure of the sentence to be translated;
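The extraction described in step 1 can be illustrated with a minimal sketch in the style of hierarchical phrase-based rule extraction. This is not the patent's implementation: the function names `consistent_phrase_pairs` and `hiero_rules`, the `max_len` limit, the single-gap restriction, and the toy sentence pair are all assumptions made for illustration. The sketch first collects phrase pairs that are consistent with the word alignment, then replaces one nested phrase pair with the non-terminal `X1` to form an SCFG rule. Rule probabilities, which would normally be estimated from counts over a whole corpus, are omitted.

```python
from typing import List, Set, Tuple

Links = Set[Tuple[int, int]]  # (source index, target index) alignment links


def consistent_phrase_pairs(src_len: int, links: Links, max_len: int = 5):
    """Return inclusive spans (i1, i2, j1, j2) consistent with the alignment."""
    pairs = []
    for i1 in range(src_len):
        for i2 in range(i1, min(i1 + max_len, src_len)):
            tgt = [j for (i, j) in links if i1 <= i <= i2]
            if not tgt:
                continue  # skip source spans with no aligned word
            j1, j2 = min(tgt), max(tgt)
            if j2 - j1 + 1 > max_len:
                continue  # assumed heuristic restriction: cap phrase length
            # Consistency: no target word in [j1, j2] links outside [i1, i2].
            if all(i1 <= i <= i2 for (i, j) in links if j1 <= j <= j2):
                pairs.append((i1, i2, j1, j2))
    return pairs


def hiero_rules(src: List[str], tgt: List[str], links: Links, max_len: int = 5):
    """Return (source side, target side) rules with at most one gap X1."""
    pairs = consistent_phrase_pairs(len(src), links, max_len)
    rules = set()
    for outer in pairs:
        i1, i2, j1, j2 = outer
        # Fully lexical rule: the phrase pair itself.
        rules.add((" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1])))
        for inner in pairs:
            a1, a2, b1, b2 = inner
            nested = i1 <= a1 and a2 <= i2 and j1 <= b1 and b2 <= j2
            if not nested or inner == outer:
                continue
            # Replace the nested phrase pair with the non-terminal X1.
            s = src[i1:a1] + ["X1"] + src[a2 + 1:i2 + 1]
            t = tgt[j1:b1] + ["X1"] + tgt[b2 + 1:j2 + 1]
            # Simplified restriction: keep a terminal on both sides.
            if len(s) > 1 and len(t) > 1:
                rules.add((" ".join(s), " ".join(t)))
    return rules


# Toy example (hypothetical data): "zai zhuozi shang" -> "on the table",
# where "zai" and "shang" both align to "on" and "the" is unaligned.
src = ["zai", "zhuozi", "shang"]
tgt = ["on", "the", "table"]
links = {(0, 0), (2, 0), (1, 2)}
rules = hiero_rules(src, tgt, links)
```

On this toy pair the extracted set contains the discontinuous rule `zai X1 shang` / `on the X1`, which is the kind of low-level reordering pattern such rules can capture without any syntactic analysis.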

[0053] 2) Extract syntactic translation rules with the GHKM rule method, for translating the skeleton part of the sentence to b...
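The source text is truncated at this point, so the following is not the patent's procedure. It is a minimal sketch of the frontier-node test on which GHKM-style extraction is commonly built, written under the assumption that the parse tree is on the source side (the patent states that it models source-language syntax). A node counts as a frontier node when the target span covered by its words contains no target word aligned to a source word outside the node; GHKM rules are then cut from the tree at such nodes. The names `Node`, `yield_of`, and `frontier_nodes`, the toy tree, and the choice to skip nodes with no aligned words are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    word_index: int = -1  # source position; meaningful only on leaf nodes


def yield_of(node: Node) -> Set[int]:
    """Return the set of source positions covered by this node."""
    if not node.children:
        return {node.word_index}
    covered: Set[int] = set()
    for child in node.children:
        covered |= yield_of(child)
    return covered


def frontier_nodes(root: Node, links: Set[Tuple[int, int]]) -> List[str]:
    """Return labels of frontier nodes in pre-order."""
    result: List[str] = []

    def visit(node: Node) -> None:
        inside = yield_of(node)
        span = {j for (i, j) in links if i in inside}
        outside = {j for (i, j) in links if i not in inside}
        # Frontier: the closed target span [min, max] holds no target word
        # aligned to a source word outside this node. Nodes with no aligned
        # words are skipped here for simplicity (an assumption).
        if span and not any(min(span) <= j <= max(span) for j in outside):
            result.append(node.label)
        for child in node.children:
            visit(child)

    visit(root)
    return result


# Toy source tree (hypothetical) for "zai zhuozi shang" -> "on the table".
tree = Node("PP", [
    Node("P", word_index=0),
    Node("LCP", [
        Node("NP", [Node("NN", word_index=1)]),
        Node("LC", word_index=2),
    ]),
])
links = {(0, 0), (2, 0), (1, 2)}
frontier = frontier_nodes(tree, links)
```

In this toy tree `P`, `LC`, and `LCP` are not frontier nodes because "zai" and "shang" share the target word "on", so a rule can only be cut at `PP`, `NP`, or `NN`.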


Abstract

The invention relates to a statistical machine translation system based on a syntactic skeleton. The translation process comprises the following steps: 1) non-syntactic translation rules are extracted with a probabilistic SCFG hierarchical rule extraction method and used to translate the non-skeleton part of the sentence to be translated; 2) syntactic translation rules are extracted with the GHKM rule method and used to translate the skeleton part of the sentence to be translated; 3) incomplete syntactic translation rules are generated from the syntactic translation rules, and the non-syntactic and syntactic translation rules are combined so as to integrate the advantages of non-syntactic and syntactic translation systems; 4) a model is generated. In this system, the syntactic translation rules handle translation of the syntactic skeleton and long-distance reordering, while the rules of the non-syntactic translation system handle low-level lexical translation and reordering. The model is easy to implement and its effect is notable.

Description

Technical field

[0001] The invention relates to a technique for modeling source-language syntax in statistical machine translation, in particular to a statistical machine translation system based on a syntactic skeleton.

Background technique

[0002] Statistical Machine Translation (SMT) includes different kinds of translation systems, such as non-syntactic systems based on phrases and hierarchical phrases, and syntactic systems such as tree-to-string and string-to-tree. Each kind has its own advantages and disadvantages. For example, a syntactic translation system has clear advantages in handling long-distance and complex ordering between sentence components, but when its translation rules are sparse or their coverage is low, robustness problems arise that may lead to poor translations. And it has been confirmed that if the syntactic system is impl...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/28, G06F17/27
CPC: G06F40/211, G06F40/42
Inventor: 肖桐, 朱靖波, 张春良, 高瑜泽
Owner 沈阳雅译网络技术有限公司