
Phrase rule extracting method based on combination

A phrase rule technology, applied in the fields of special data processing applications, instruments, and electrical digital data processing. It addresses the problems of an oversized phrase rule table that contains noisy data and takes up a lot of hard disk space.

Active Publication Date: 2013-03-27
沈阳雅译网络技术有限公司

AI Technical Summary

Problems solved by technology

[0006] However, the baseline phrase rule extraction method has an unavoidable problem: during rule extraction, the phrase length has to be adjusted mechanically to obtain the optimal phrase rule set.
The extracted phrase rule table is also very large, takes up a lot of hard disk space, and contains a lot of noisy data.



Embodiment Construction

[0025] The present invention is further described below with reference to the accompanying drawings.

[0026] The combination-based phrase rule extraction method of the present invention comprises the following steps:

[0027] Construct a "minimal phrase rule set" from the bilingual corpus;

[0028] Combine rules from the minimal phrase rule set to construct a high-quality phrase rule set that carries more context information, forming the "combined phrase rule set" n-composed;

[0029] For the combination-based extraction, generate the minimal phrase rule set minimal from a given bilingual parallel corpus containing word alignment information, and store it in a hash data structure named minimal;

[0030] Set the number of combinations n and construct the combined phrase rule set n-composed: examine all possible phrase rules against the minimal phrase rule set minimal, that is, determine that the combined phrase rule is composed of several minimal phras...
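The steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes a rule is a phrase pair consistent with the word alignment, that a rule is "minimal" when no other consistent phrase pair lies inside it, and that the number of minimal rules composing a rule is the number of minimal rules it contains. The function names `consistent_spans`, `contains`, and `extract_rules` are hypothetical; only the hash names minimal and n-composed come from the text.

```python
from collections import Counter


def consistent_spans(alignment, src_len):
    """Return tight spans (i1, i2, j1, j2) consistent with the word alignment.

    Assumption: spans without any alignment point are skipped, and the target
    side is not extended over unaligned words.
    """
    spans = []
    for i1 in range(src_len):
        for i2 in range(i1, src_len):
            tgt = [j for (i, j) in alignment if i1 <= i <= i2]
            if not tgt:
                continue
            j1, j2 = min(tgt), max(tgt)
            # Consistent: no target word in [j1, j2] links outside [i1, i2].
            if all(i1 <= i <= i2 for (i, j) in alignment if j1 <= j <= j2):
                spans.append((i1, i2, j1, j2))
    return spans


def contains(outer, inner):
    """True if `inner` is a different span lying inside `outer` on both sides."""
    return (outer != inner
            and outer[0] <= inner[0] and inner[1] <= outer[1]
            and outer[2] <= inner[2] and inner[3] <= outer[3])


def extract_rules(corpus, n):
    """Build the `minimal` and n-composed hash tables (here: Counter dicts)."""
    minimal, composed = Counter(), Counter()
    for src, tgt, alignment in corpus:
        spans = consistent_spans(alignment, len(src))
        # Assumed definition: a minimal rule contains no other consistent span.
        minimal_spans = [s for s in spans
                         if not any(contains(s, t) for t in spans)]
        for s in spans:
            rule = (" ".join(src[s[0]:s[1] + 1]), " ".join(tgt[s[2]:s[3] + 1]))
            if s in minimal_spans:
                minimal[rule] += 1
            # Count how many minimal rules this rule is composed of.
            parts = sum(1 for m in minimal_spans if m == s or contains(s, m))
            if parts <= n:
                composed[rule] += 1
    return minimal, composed


# Toy sentence pair with a monotone one-to-one alignment (hypothetical data).
corpus = [(["a", "b", "c"], ["x", "y", "z"], {(0, 0), (1, 1), (2, 2)})]
minimal, composed = extract_rules(corpus, n=2)
print(sorted(minimal))
print(sorted(composed))
```

With n = 2 the toy example keeps the three single-word rules and the two two-word rules, and drops the three-word rule because it is composed of three minimal rules. The only tuning knob in this sketch is n; no phrase length limit is involved.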


Abstract

The invention relates to a combination-based phrase rule extraction method. The method comprises the following steps: constructing a "minimal phrase rule set" from a bilingual corpus; constructing a combined phrase rule set through combination; generating the minimal phrase rule set from a given bilingual parallel corpus and storing it in a hash data structure; constructing combined phrase rules and using the minimal phrase rule set to determine how many minimal phrase rules each combined phrase rule is formed from; if a combined phrase rule is formed from n or fewer minimal phrase rules in the minimal phrase rule set, putting it into a new hash data structure; and outputting the new minimal phrase rule set and the phrase rules of the combined phrase rule set, which completes one pass of combination-based phrase rule extraction. The method effectively generates a high-quality phrase rule set rich in contextual information. Without reducing translation performance, the phrase rule set extracted by the method is 56.5% smaller than the one extracted by the standard method.

Description

Technical field

[0001] The invention relates to phrase processing technology in phrase-based statistical machine translation systems, and in particular to a combination-based phrase rule extraction method.

Background art

[0002] Phrase-based statistical machine translation systems have demonstrated very competitive performance in the field of machine translation. A large part of the reason the phrase-based approach works is that it relies on a high-quality phrase rule set. In a phrase rule set, each source language phrase is mapped to one or more different target language phrases. In a phrase-based system, a phrase is a sequence of consecutive words and does not need to be a linguistically meaningful unit. Researchers in the field of machine translation have proposed a number of effective phrase rule extraction methods, among which heuristic methods are widely used. This extraction method extracts all the phrase rules that ar...
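The one-to-many mapping in a phrase rule set, as described in the background above, can be pictured as a nested hash table. The sketch below is illustrative only: the helper names `add_rule` and `translations`, the toy phrases, and the counts are all hypothetical and do not come from the patent.

```python
from collections import defaultdict


def add_rule(table, source, target, count=1):
    """Record that `source` was observed translated as `target`."""
    table[source][target] += count


def translations(table, source):
    """Return the target phrases for `source`, most frequent first."""
    options = table.get(source, {})
    return sorted(options.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy entries: one source phrase maps to several target phrases.
phrase_table = defaultdict(lambda: defaultdict(int))
add_rule(phrase_table, "jiqi fanyi", "machine translation", 3)
add_rule(phrase_table, "jiqi fanyi", "MT", 1)
add_rule(phrase_table, "fanyi", "translation", 2)

print(translations(phrase_table, "jiqi fanyi"))
```

Because every extracted phrase pair becomes an entry in such a table, an extraction method that admits many long or noisy pairs directly inflates its size.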


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F17/28
Inventor: 朱靖波, 李强, 肖桐, 张浩
Owner: 沈阳雅译网络技术有限公司