Machine translation optimization method capable of exploring more reference translation version information

A machine translation and multi-reference technology, applied in the fields of natural language translation, machine learning, and instrumentation, that addresses problems such as evaluation bias.

Active Publication Date: 2017-09-05
NANJING UNIV

AI Technical Summary

Problems solved by technology

[0006] Purpose of the invention: Existing machine translation quality evaluation methods rely on a limited set of reference translations, so semantic and expressive diversity leads to evaluation bias. To address this problem, the invention proposes a machine translation quality optimization method that expands the independent reference translations into a reference translation graph. The method can evaluate different word choices and different expressions in machine translation output more fairly and reasonably.


Examples


Embodiment 1

[0109] The evaluation experiment of this embodiment under multiple reference translations is as follows:

[0110] 11. Input the source language file and the corresponding multiple reference translation files, and use GIZA++ to obtain the word alignment information between the source and the reference translations.

[0111] 12. Using the results obtained in step 11 and the dev set (containing multiple reference translations) as input, expand the reference translation information in the dev set, conduct parameter training experiments for the machine translation system, and output the translation results of the dev set.

[0112] 13. After the training is over, test on the test sets MT02, MT04, and MT05, and output the translation results of the corresponding data sets.
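
The alignment output of step 11 is what the method uses to divide each reference translation into phrase blocks. The following is a minimal sketch of one way such a split could work, not the patent's own procedure. It assumes the alignments have already been converted to `i-j` (source index, target index) pairs; the function names `parse_alignment` and `phrase_blocks` and the greedy consistency rule are illustrative assumptions.

```python
def parse_alignment(text):
    """Parse 'i-j' pairs (assumed format) into a set of (src, tgt) index tuples."""
    return {tuple(int(k) for k in pair.split("-")) for pair in text.split()}


def phrase_blocks(src_len, tgt_tokens, alignment):
    """Greedily split a reference into blocks, following source word order.

    A block covers source positions [start, end). It is accepted as soon as
    every target word inside the block's target range aligns only to source
    words inside [start, end). If no such span is found, the block simply
    runs to the end of the sentence (a simplification of this sketch).
    """
    blocks = []
    start = 0
    while start < src_len:
        end = start + 1
        while True:
            tgt = [t for s, t in alignment if start <= s < end]
            if tgt:
                lo, hi = min(tgt), max(tgt)
                consistent = all(
                    start <= s < end for s, t in alignment if lo <= t <= hi
                )
            else:
                consistent = False  # unaligned source words join the next block
            if consistent or end == src_len:
                break
            end += 1
        phrase = tgt_tokens[lo:hi + 1] if tgt else []
        blocks.append((start, end, phrase))
        start = end
    return blocks


# Hypothetical three-word source whose last two words are swapped in the reference.
alignment = parse_alignment("0-0 1-2 2-1")
blocks = phrase_blocks(3, ["a", "b", "c"], alignment)
for block in blocks:
    print(block)
```

Each block records the source span it covers, so blocks from different references that cover the same span can later be treated as alternatives.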

Embodiment 2

[0114] The evaluation experiment of this embodiment under a single reference translation is as follows:

[0115] 11. Input the source language file and the corresponding single reference translation file, and use GIZA++ to obtain the word alignment information between the source and the reference translation.

[0116] 12. Use the result obtained in step 11, the free translation table and the dev set (containing a single reference translation) as input, expand the reference translation information in the dev set, conduct parameter training experiments for the machine translation system, and output the translation results of the dev set.

[0117] 13. After the training is over, test on the test sets MT02, MT04, and MT05, and output the translation results of the corresponding data sets.

Embodiment 3

[0119] In this embodiment, the correlation with manual evaluation is measured for the BLEU method that uses the reference translation graph:

[0120] 11. Input the source language file and the corresponding reference translation file, and use GIZA++ to obtain the word alignment information between the source and the reference translation.

[0121] 12. Sort the corresponding translations of different systems that contain manual evaluation results.

[0122] 13. Score the translation results from different systems with the original method, without using the reference translation graph, and rank the systems by these scores. Then measure the agreement between this ranking and the ranking from the manual evaluation with Kendall's Tau.

[0123] 13. Utilize the result of step 11 to construct a reference translation graph, score the translation results from different systems according to the extended method, and sort the scoring ...
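
The rank agreement used in this embodiment can be illustrated with a short sketch. It computes the tau-a variant of Kendall's Tau (tied pairs count as neither concordant nor discordant); the source does not state which variant was used, and the scores below are hypothetical values chosen only for illustration.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a between two equally long score lists."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1   # both scorings order the pair the same way
        elif sign < 0:
            discordant += 1   # the two scorings disagree on the pair
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical scores for four systems: manual evaluation vs. an automatic metric.
human_scores = [4, 3, 2, 1]
metric_scores = [0.40, 0.35, 0.36, 0.10]
tau = kendall_tau(human_scores, metric_scores)
print(round(tau, 3))
```

A value closer to 1 means the automatic ranking agrees more closely with the manual ranking, which is how the original and the graph-extended scoring can be compared.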


Abstract

The invention discloses a machine translation optimization method capable of exploring more reference translation information. The method mainly includes the following steps: acquiring source-to-target word alignment information with GIZA++; dividing each reference translation into phrase blocks according to the word alignment information; constructing a sub-graph for each reference translation according to the source-side word order; merging the sub-graphs so that multiple parallel reference translations are represented as a single reference translation graph, which links the different reference translations and thereby yields more information; linking the translation to be evaluated with the reference translation graph through the source language; and selecting the path closest to the translation to be evaluated to complete the final quality evaluation. By extending the reference translation information through the graph, the method evaluates machine translation output more thoroughly, and when used in training it better supports the system's parameter learning.
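
The graph construction and path selection described above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: each reference is assumed to be already split into `(src_start, src_end, phrase)` blocks, graph nodes are source positions, the closeness measure is a simple clipped unigram overlap, and all paths are enumerated exhaustively (which is exponential in general). The linking of the hypothesis to the graph through the source side is omitted. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_graph(references):
    """Merge the per-reference sub-graphs: node = source position, edge = phrase block."""
    graph = defaultdict(set)
    for blocks in references:
        for start, end, phrase in blocks:
            graph[start].add((end, tuple(phrase)))
    return graph


def all_paths(graph, src_len, node=0):
    """Yield every token sequence obtained by walking from position 0 to src_len."""
    if node == src_len:
        yield []
        return
    for end, phrase in sorted(graph.get(node, ())):
        for rest in all_paths(graph, src_len, end):
            yield list(phrase) + rest


def overlap(hyp, ref):
    """Clipped unigram overlap (an assumed stand-in for the closeness measure)."""
    return sum((Counter(hyp) & Counter(ref)).values())


def closest_path(graph, src_len, hyp):
    """Pick the graph path that shares the most words with the hypothesis."""
    return max(all_paths(graph, src_len), key=lambda path: overlap(hyp, path))


# Two hypothetical references for a two-position source sentence.
ref1 = [(0, 1, ["the", "cat"]), (1, 2, ["sleeps"])]
ref2 = [(0, 1, ["a", "kitten"]), (1, 2, ["is", "sleeping"])]
graph = build_graph([ref1, ref2])
paths = list(all_paths(graph, 2))
best = closest_path(graph, 2, ["the", "cat", "is", "sleeping"])
print(len(paths), best)
```

In this toy case the selected path combines a block from each reference, a word sequence that neither reference contains on its own, which is the extra information the merged graph is meant to expose.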

Description

Technical field

[0001] The invention belongs to the fields of statistical machine translation and machine translation quality evaluation, and relates to a machine translation optimization method that automatically explores more reference translation information.

Background technique

[0002] Machine translation dates back more than 60 years. Since the 1990s, statistical machine translation has developed rapidly, made great progress, and gradually become a research hotspot in the field of machine translation. Producing a translation is not an end in itself: what matters is how far machine translation helps people accomplish a given task, so the output of a machine translation system needs to be evaluated. Machine translation evaluation is currently a very active field of research.

[0003] Machine translation quality evaluation is divided into two aspects: manual evaluation and ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/28, G06F17/27, G06N99/00
CPC: G06F40/253, G06F40/58, G06N20/00
Inventor: 黄书剑, 季红洁, 戴新宇, 陈家骏, 张建兵
Owner: NANJING UNIV