Neural network machine translation corpus extension method based on statistical phrase table
This machine translation corpus extension method, applied in the fields of computer applications and machine translation, addresses problems such as the difficulty of obtaining satisfactory translation results, and achieves the effect of alleviating adverse effects and improving evaluation metrics.
Examples
Embodiment 1
[0053] This embodiment describes the flow of the method of the present invention and its specific examples.
[0054] Figure 1 is a flow chart of the statistical phrase table-based neural network machine translation corpus expansion method of the present invention, as applied in this embodiment.
[0055] As can be seen from Figure 1, the present invention operates in two stages: 1) the training set expansion stage and 2) the model training stage.
[0056] Take Uyghur-to-Chinese translation as an example, where Uyghur is the source language and Chinese is the target language.
[0057] 1) Training set expansion stage:
[0058] Step 1. Preprocess the original training set according to Definitions 1 through 5. The specific preprocessing procedure varies with the source language and the target language; its purpose is to standardize the training set. Among them, the preprocessing process of th...
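The remaining steps of the expansion stage are not reproduced in this excerpt, so the following is only a minimal sketch of what phrase-table-based expansion could look like, not the procedure claimed by the invention. It assumes a Moses-style phrase table layout (`source ||| target ||| scores ...`); the function names, the `min_score` threshold and the `min_len` filter are hypothetical and chosen purely for illustration.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]


def parse_phrase_table_line(line: str) -> Tuple[str, str, List[float]]:
    """Parse one 'src ||| tgt ||| scores ...' line (Moses-style layout, assumed)."""
    fields = [f.strip() for f in line.split("|||")]
    if len(fields) < 3:
        raise ValueError(f"malformed phrase table line: {line!r}")
    scores = [float(x) for x in fields[2].split()]
    return fields[0], fields[1], scores


def select_phrase_pairs(lines: Iterable[str],
                        min_score: float = 0.5,
                        min_len: int = 2) -> List[Pair]:
    """Keep phrase pairs whose scores all reach min_score (hypothetical filter)."""
    selected: List[Pair] = []
    seen = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        src, tgt, scores = parse_phrase_table_line(line)
        if len(src.split()) < min_len:
            continue  # skip very short source phrases
        if not scores or min(scores) < min_score:
            continue  # skip low-confidence pairs
        if (src, tgt) in seen:
            continue  # skip duplicates
        seen.add((src, tgt))
        selected.append((src, tgt))
    return selected


def expand_training_set(original: List[Pair], phrase_pairs: List[Pair]) -> List[Pair]:
    """Append selected phrase pairs to the original sentence pairs."""
    existing = set(original)
    return list(original) + [p for p in phrase_pairs if p not in existing]


if __name__ == "__main__":
    # Placeholder tokens stand in for real Uyghur and Chinese phrases.
    table = [
        "a b ||| x y ||| 0.8 0.7 0.9 0.6 ||| 0-0 1-1 ||| 3 3 3",
        "c d ||| z ||| 0.1 0.2 0.3 0.4",
    ]
    corpus = [("s one", "t one")]
    print(expand_training_set(corpus, select_phrase_pairs(table)))
```

The expanded list would then be passed to the model training stage in place of the original training set. Whether the invention filters on all scores, on a single translation probability, or on some other criterion cannot be determined from this excerpt.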
Embodiment 2
[0070] The training set of the Uyghur-Chinese news translation task provided by CWMT2017 is randomly split into a training set, a development set and test set 1. In addition, the development set of the Uyghur-Chinese news translation evaluation task provided by CWMT2017 is used as test set 2. With the original training set, development set, test set data and neural machine translation model kept the same, training with the method of the present invention is compared against training without it, using BLEU computed on Chinese characters as the evaluation index. The following experimental results are obtained.
[0071] Table 1. Comparison of BLEU values before and after applying the training set expansion method proposed by the present invention
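To make the evaluation index concrete, character-based BLEU can be sketched as follows. This is an illustrative single-reference, unsmoothed corpus-level BLEU in which each Chinese character is treated as one token; the function names are hypothetical, and the patent does not state which BLEU implementation or settings were actually used.

```python
import math
from collections import Counter
from typing import List, Sequence


def char_tokens(sentence: str) -> List[str]:
    """Split a sentence into characters, dropping whitespace."""
    return [ch for ch in sentence if not ch.isspace()]


def ngram_counts(tokens: Sequence[str], n: int) -> Counter:
    """Count all n-grams of order n in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_char_bleu(hypotheses: Sequence[str],
                     references: Sequence[str],
                     max_n: int = 4) -> float:
    """Corpus-level BLEU over characters, one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = char_tokens(hyp), char_tokens(ref)
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            hc, rc = ngram_counts(h, n), ngram_counts(r, n)
            # Clipped matches: each hypothesis n-gram counts at most
            # as often as it occurs in the reference.
            matches[n - 1] += sum(min(c, rc[g]) for g, c in hc.items())
            totals[n - 1] += sum(hc.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # no smoothing: any zero precision gives BLEU = 0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty for hypotheses shorter than the references.
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity * math.exp(log_precision)


if __name__ == "__main__":
    print(corpus_char_bleu(["我喜欢机器翻译"], ["我喜欢神经机器翻译"]))
```

Scoring at the character level avoids any dependence on a Chinese word segmenter, so the same hypothesis always receives the same score regardless of segmentation tooling.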
[0072]
[0073] The experimental results in Table 1 show that: in the case of the same training set, development set and test set data, the BLEU evaluation ...

