Construction method and system of incremental-translation-oriented structured language model

A language model and its construction method, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses the problems of overly complex probability models and the inability to compute language model scores while a translation is generated incrementally.

Status: Inactive
Publication Date: 2013-02-27
INST OF COMPUTING TECH CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

The prior approach uses a more complex probability model and must be computed over a complete syntax tree.
It therefore cannot meet the need, in machine translation, to compute language model scores while the translation is being generated incrementally.



Embodiment Construction

[0033] Figure 1 is a flowchart of the method for constructing a structured language model according to the present invention. As shown in Figure 1, the method includes the following steps:

[0034] Step 1: perform dependency parsing, in order, on the incrementally generated translation fragments to obtain a set of dependency tree fragments.

[0035] The main task of this step is to run dependency parsing on the input translation fragments and obtain the corresponding set of dependency tree fragments. Because the most widely used machine translation systems generate translations incrementally, the dependency parsing algorithm must follow the same decoding order. This example uses a shift-reduce algorithm; see Sections 3 and 4 of "Incrementality in Deterministic Dependency Parsing" (Joakim Nivre. 2004. Incrementality in deterministic dependency parsing. In Proceedings of the ACL Workshop Incremental Parsing. Ass...
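To make the shift-reduce idea concrete, the following is a minimal sketch of an arc-standard style transition system that consumes words one at a time. It is an illustration under stated assumptions, not the patent's implementation: the class `ParserState`, its method names, and the choice of the arc-standard action set are all hypothetical. At any point, the stack holds the roots of the dependency tree fragments built so far.

```python
from dataclasses import dataclass, field


@dataclass
class ParserState:
    """Hypothetical parser state for an arc-standard style shift-reduce parser."""
    words: list = field(default_factory=list)  # all tokens seen so far
    stack: list = field(default_factory=list)  # token indices of fragment roots
    heads: dict = field(default_factory=dict)  # dependent index -> head index

    def shift(self, word):
        """Append the next incoming word as a new single-node fragment."""
        self.words.append(word)
        self.stack.append(len(self.words) - 1)

    def left_arc(self):
        """Make the top of the stack the head of the item below it."""
        head = self.stack.pop()
        dep = self.stack.pop()
        self.heads[dep] = head
        self.stack.append(head)

    def right_arc(self):
        """Make the item below the top the head of the top item."""
        dep = self.stack.pop()
        self.heads[dep] = self.stack[-1]

    def fragments(self):
        """Return the root word of every fragment currently on the stack."""
        return [self.words[i] for i in self.stack]


# Words arrive incrementally, as they would from a left-to-right decoder.
state = ParserState()
state.shift("the")
state.shift("dog")
state.left_arc()      # "dog" becomes the head of "the"
state.shift("barks")
state.left_arc()      # "barks" becomes the head of "dog"
print(state.fragments(), state.heads)
```

Because every action only touches the top of the stack, the parser never needs the words that have not been generated yet, which is what lets it run in the same order as the decoder.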


Abstract

The invention discloses a method and a system for constructing a structured language model oriented to incremental translation. The method comprises the following steps: step 1, performing dependency parsing on incrementally generated translation fragments to obtain a set of dependency tree fragments; step 2, extracting discriminative feature instances from the set of dependency tree fragments, and calculating a feature score for each instance with a discriminative dependency parsing model; step 3, pruning the set of dependency tree fragments according to the feature scores, taking the maximum feature score as the score of the structured language model, retaining the highest-scoring fragments, and obtaining the optimized set of dependency tree fragments; and step 4, attaching the next translation fragment to the set of dependency tree fragments through shift-reduce operations, and repeating steps 1, 2 and 3 until the translation is finished and the complete dependency tree has been generated. With this method and system, syntactic information and long-distance dependency information can be incorporated into the language model, an effective optimization algorithm is provided for dynamically computing the structured language model during decoding, and translation quality is improved.
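The four steps in the abstract can be sketched as a small beam search. This is a simplified illustration, not the patented method: the names `Hyp`, `expand`, `slm_scores`, and `beam_size` are hypothetical, and the discriminative model is reduced here to a table of weights on (head, dependent) word pairs, which is an assumption made only to keep the example short.

```python
from typing import NamedTuple


class Hyp(NamedTuple):
    stack: tuple   # root words of the current dependency tree fragments
    arcs: tuple    # (head, dependent) pairs built so far
    score: float   # sum of the feature weights of those arcs


def expand(hyp, word, weights):
    """Shift `word`, then enumerate every chain of reduce actions after it."""
    out = []
    frontier = [Hyp(hyp.stack + (word,), hyp.arcs, hyp.score)]
    while frontier:
        h = frontier.pop()
        out.append(h)
        if len(h.stack) >= 2:
            below, top = h.stack[-2], h.stack[-1]
            rest = h.stack[:-2]
            # Left-arc: the top word becomes the head of the word below it.
            frontier.append(Hyp(rest + (top,), h.arcs + ((top, below),),
                                h.score + weights.get((top, below), 0.0)))
            # Right-arc: the word below becomes the head of the top word.
            frontier.append(Hyp(rest + (below,), h.arcs + ((below, top),),
                                h.score + weights.get((below, top), 0.0)))
    return out


def slm_scores(words, weights, beam_size=4):
    """Return the per-step maximum score and the best final hypothesis."""
    beam = [Hyp((), (), 0.0)]
    scores = []
    for word in words:                      # step 4: attach the next fragment
        candidates = [c for h in beam for c in expand(h, word, weights)]  # step 1
        candidates.sort(key=lambda h: h.score, reverse=True)              # step 2
        beam = candidates[:beam_size]       # step 3: prune to the beam
        scores.append(beam[0].score)        # maximum score is the model score
    return scores, beam[0]


# Hypothetical feature weights for illustration only.
weights = {("dog", "the"): 1.0, ("barks", "dog"): 2.0}
scores, best = slm_scores(["the", "dog", "barks"], weights)
print(scores, best.arcs)
```

The score reported after each word is available before the rest of the sentence exists, which is the property the decoder needs. A real system would score richer features than single word pairs and would track token positions rather than word strings.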

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, and in particular to a structured language model oriented to incremental translation models.

Background

[0002] A statistical language model, which computes the probability of generating natural-language text, plays a vital role in many natural language processing problems. In machine translation, the language model is used to compute the generation probability of each newly generated translation fragment, and the translations with higher probability are kept, which improves translation quality. The n-gram language model, also known as the (n-1)-order Markov model, is the most widely used statistical language model. It is based on the limited-history assumption: the probability of the n-th word depends only on the previous n-1 words. This assumption greatly reduces the complexi...
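The limited-history assumption of the n-gram model can be illustrated with a bigram model (n = 2), in which each word is conditioned only on the one word before it. The sketch below uses unsmoothed maximum-likelihood estimates on a toy corpus; the function names `train_bigram` and `sentence_prob` and the `<s>` start marker are illustrative choices, not taken from the patent.

```python
from collections import Counter


def train_bigram(sentences):
    """Count contexts and bigrams, with <s> marking the sentence start."""
    context, bigram = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent.split()
        for prev, cur in zip(tokens, tokens[1:]):
            context[prev] += 1
            bigram[(prev, cur)] += 1
    return context, bigram


def sentence_prob(sentence, context, bigram):
    """Product of maximum-likelihood bigram probabilities (no smoothing)."""
    tokens = ["<s>"] + sentence.split()
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        if context[prev] == 0:
            return 0.0  # unseen context: the unsmoothed estimate is zero
        prob *= bigram[(prev, cur)] / context[prev]
    return prob


context, bigram = train_bigram(["the dog barks", "the cat sleeps"])
print(sentence_prob("the dog barks", context, bigram))
print(sentence_prob("the cat barks", context, bigram))
```

Since each factor looks back only one word, the model cannot relate words that are far apart in the sentence, which is the kind of long-distance dependency a structured language model is meant to capture.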


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/28
Inventors: 于恒, 米海涛, 刘群
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI