Method for constructing a discourse-level English-Chinese machine translation test set
A construction method in the field of machine translation, applied to building discourse-level English-Chinese machine translation test sets, which can solve problems such as the lack of an evaluation index.
Examples
Example 1
[0088]Source language data:
[0089]Previous sentence: You rich guys think that money can buy anything.
[0090]Current sentence: How right you are.
[0091]Target language data:
[0092]Previous sentence: Your rich people always have money to buy everything.
[0093]Current sentence: You are too right.
[0094]The discourse-level connective test set requires that the current sentence in the source language data contain one of five discourse-level connectives, "as", "or", "while", "since", or "though", and that the part-of-speech tag of that word be one of "CC", "IN", or "WRB". Because Chinese expresses discourse-level connectives in more diverse ways, sentence pairs are first filtered automatically; pairs that meet the conditions in the source language data but whose target language data does not contain the corresponding connective are then checked manually. It is then checked whether the information needed to disambiguate the connective is in the previous sentence, and finally each meaning of e...
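The automatic filtering stage described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the source sentence has already been tokenized and tagged with Penn Treebank style tags by some external tagger, and the function names and the `target_connectives` list of Chinese connectives are hypothetical.

```python
from typing import Iterable, List, Tuple

# The five source-side connectives and the allowed POS tags named in the text.
CONNECTIVES = {"as", "or", "while", "since", "though"}
ALLOWED_TAGS = {"CC", "IN", "WRB"}


def has_source_connective(tagged_sentence: List[Tuple[str, str]]) -> bool:
    """Return True if any token is a listed connective with an allowed tag."""
    return any(
        word.lower() in CONNECTIVES and tag in ALLOWED_TAGS
        for word, tag in tagged_sentence
    )


def needs_manual_check(
    tagged_source: List[Tuple[str, str]],
    target_text: str,
    target_connectives: Iterable[str],
) -> bool:
    """Select pairs whose source has a connective but whose target has none.

    `target_connectives` is a hypothetical, user-supplied list of Chinese
    connective strings; the source text does not specify one.
    """
    if not has_source_connective(tagged_source):
        return False
    return not any(conn in target_text for conn in target_connectives)
```

Pairs for which `needs_manual_check` returns True would then go to the manual stage, where an annotator decides whether the disambiguating information lies in the previous sentence.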
Example 2
[0097]Source language data:
[0098]Previous sentence: Everything is so difficult in life, for me.
[0099]Current sentence: While for others it's all child's play.
[0100]Target language data:
[0101]Previous sentence: For me, life is very difficult.
[0102]Current sentence: It is like children.
[0103]For the ellipsis test set, the current sentences of the source language data are first filtered to keep sentence pairs that contain "do", "does", "can", "could", "should", "is", "am", "are", or "may". The previous sentence of the source language data is then required to contain a verb, i.e., a word tagged "VC", "VE", or "VV". Next, the verb in the current sentence of the target language data is checked for consistency with the previous sentence, and finally a certain number of test cases are selected to constitute the ellipsis test set.
[0104]Then the verbs contained in the previous sentence of the source language data are checked, i.e., words tagged "VE" or "VV", and then the verbs in the current sentence in the targe...
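The ellipsis filter can be sketched in the same way. This is a hedged illustration only: it assumes pre-tokenized and pre-tagged input, it interprets "consistency" as the two sentences sharing at least one verb (the text does not define the check precisely), and all function names are hypothetical.

```python
from typing import List, Tuple

# Auxiliaries and verb tags listed in the text.
AUXILIARIES = {"do", "does", "can", "could", "should", "is", "am", "are", "may"}
VERB_TAGS = {"VC", "VE", "VV"}

Tagged = List[Tuple[str, str]]


def contains_auxiliary(tokens: List[str]) -> bool:
    """True if the current source sentence contains a listed auxiliary."""
    return any(token.lower() in AUXILIARIES for token in tokens)


def verbs(tagged: Tagged) -> List[str]:
    """Collect the words whose tag is VC, VE, or VV."""
    return [word for word, tag in tagged if tag in VERB_TAGS]


def is_ellipsis_candidate(current_tokens: List[str], previous_tagged: Tagged) -> bool:
    """Auxiliary in the current sentence and a verb in the previous one."""
    return contains_auxiliary(current_tokens) and bool(verbs(previous_tagged))


def verbs_consistent(previous_tagged: Tagged, current_tagged: Tagged) -> bool:
    """Assumed reading of 'consistency': the two sentences share a verb."""
    return bool(set(verbs(previous_tagged)) & set(verbs(current_tagged)))
```

For instance, the tokens `["Neither", "do", "I", "."]` pass `contains_auxiliary` because of "do"; whether the pair is kept then depends on the tagged previous sentence and on the consistency check.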
Example 3
[0106]Source language data:
[0107]Previous sentence: You see, she doesn't know.
[0108]Current sentence: Neither do I.
[0109]Target language data:
[0110]Previous sentence: Look, she doesn't know.
[0111]Current sentence: When I don't know.
[0112]Step 4: manually inspect the selected test cases and correct any translation errors.
[0113]Table 1: BLEU automatic score results
[0114]
Model      Pronoun   Discourse-level connective   Ellipsis
THUMT      12.4      9.8                          18.2
CADEC      19.1      15.3                         25.5
BERT-NMT   13.9      12.7                         19.1
[0115]As can be seen from Table 1, in terms of BLEU (bilingual evaluation understudy) value, the CADEC (context-aware decoder) model scores highest on all three linguistic phenomena, indicating that this model gives the best translation quality on the three discourse-level linguistic phenomena. The BERT-NMT (neural machine translation fused with BERT) model has the second-highest BLEU value, and the THUMT (Tsinghua University machine translation) model has the lowest, indicating th...
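The ranking read off Table 1 can be reproduced with a short sketch. The scores below assume the fused number row in the table splits into one-decimal values per phenomenon; the dictionary keys and the function name are hypothetical.

```python
from typing import Dict, List

# BLEU scores per model and phenomenon, as read from Table 1 (assumed split).
BLEU: Dict[str, Dict[str, float]] = {
    "THUMT": {"pronoun": 12.4, "connective": 9.8, "ellipsis": 18.2},
    "CADEC": {"pronoun": 19.1, "connective": 15.3, "ellipsis": 25.5},
    "BERT-NMT": {"pronoun": 13.9, "connective": 12.7, "ellipsis": 19.1},
}


def rank_models(scores: Dict[str, Dict[str, float]], phenomenon: str) -> List[str]:
    """Return model names sorted from highest to lowest BLEU for one phenomenon."""
    return sorted(scores, key=lambda model: scores[model][phenomenon], reverse=True)


if __name__ == "__main__":
    for phenomenon in ("pronoun", "connective", "ellipsis"):
        print(phenomenon, rank_models(BLEU, phenomenon))
```

Under that reading of the table, the order is the same for every phenomenon, which matches the paragraph's statement that CADEC is highest, BERT-NMT second, and THUMT lowest.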