Method and device for fusing multiple machine translation systems
A technology relating to machine translation and the fusion of translation results, applicable to fields such as instruments, special data processing applications, and electrical digital data processing. It addresses the problem of not fully considering decoding-process information and the decoding search space, and it achieves good scalability and improved performance.
Examples
Specific Embodiment 1
[0036] Embodiment 1: The device for fusing multiple machine translation systems in this embodiment includes a monolingual or bilingual preprocessor, a phrase extractor, a language model generator, multiple machine translation system trainers, and decoders.
[0037] The monolingual or bilingual preprocessor preprocesses the monolingual and bilingual corpora; the phrase extractor extracts phrases from the bilingual training corpus and stores them in the phrase table; the language model generator trains the language model from the monolingual training corpus; each machine translation system before fusion is trained using the phrase table and the language model, and the parameter weights obtained from training are used as the weights of the final decoder; the decoder decodes the test corpus to generate translation results, which are then evaluated to output scores.
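The language model generator step can be illustrated with a toy example. The sketch below trains a bigram model with add-one smoothing from a tokenized monolingual corpus. The model order, the smoothing method, and the names `train_bigram_lm` and `log_prob` are assumptions made for illustration; the text above does not specify how the language model is built.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count bigrams and their histories over tokenized sentences.

    Illustrative only: a real system would use a higher-order model
    and a stronger smoothing scheme.
    """
    histories, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        histories.update(padded[:-1])            # every token that has a successor
        bigrams.update(zip(padded, padded[1:]))  # adjacent token pairs
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return histories, bigrams, len(vocab)


def log_prob(tokens, model):
    """Add-one smoothed log-probability of a tokenized sentence."""
    histories, bigrams, vocab_size = model
    padded = ["<s>"] + tokens + ["</s>"]
    total = 0.0
    for prev, word in zip(padded, padded[1:]):
        total += math.log((bigrams[(prev, word)] + 1) / (histories[prev] + vocab_size))
    return total


# Hypothetical two-sentence monolingual corpus.
corpus = [["bush", "held", "talks"], ["sharon", "held", "talks"]]
lm = train_bigram_lm(corpus)
fluent = log_prob(["bush", "held", "talks"], lm)
scrambled = log_prob(["talks", "held", "bush"], lm)
```

A decoder would add such a score, scaled by a trained weight, to the other feature scores of each translation hypothesis, so that word orders seen in the monolingual corpus are preferred.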
Specific Embodiment 2
[0038] Embodiment 2: The method for fusing multiple machine translation systems in this embodiment is implemented in the following steps:
[0039] 1. Perform the preprocessing for the machine translation systems;
[0040] 2. Build a translation hypergraph for each translation system;
[0041] 3. Fuse the two translation hypergraphs and train on the training set;
[0042] The training has two parts: each single machine translation system before fusion uses a bracketing transduction grammar (BTG) reordering model trained by maximum entropy, and the fused machine translation system uses minimum error rate training (MERT);
[0043] 4. Decode the test set to generate translation results and score them, which completes the method for fusing multiple machine translation systems.
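Steps 2 and 3 can be pictured with a small sketch. It assumes that each system's hypergraph is a list of hyperedges keyed by the source span they cover, and that fusion is a union of hyperedges over shared spans followed by a best-derivation search. The `Hyperedge` class, the `fuse` and `best` functions, the spans, and all scores are hypothetical; the steps above do not state the exact fusion operation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperedge:
    head: tuple    # source span (start, end) covered by this edge
    tails: tuple   # child spans, listed in target-side order
    target: str    # target words for a leaf edge, "" for an internal edge
    score: float   # log-score assigned by the originating system
    system: str    # which system proposed the edge


def fuse(*hypergraphs):
    """Union the hyperedges of several systems, grouped by head span."""
    fused = {}
    for edges in hypergraphs:
        for edge in edges:
            fused.setdefault(edge.head, []).append(edge)
    return fused


def best(fused, span, memo=None):
    """Return the best (score, translation) that derives `span`."""
    memo = {} if memo is None else memo
    if span in memo:
        return memo[span]
    result = (float("-inf"), "")
    for edge in fused.get(span, []):
        score = edge.score
        words = [edge.target] if edge.target else []
        for tail in edge.tails:
            tail_score, tail_words = best(fused, tail, memo)
            score += tail_score
            words.append(tail_words)
        if score > result[0]:
            result = (score, " ".join(words))
    memo[span] = result
    return result


# Toy hypergraphs for a six-token source sentence (all values invented).
system_a = [
    Hyperedge((0, 3), (), "Bush and Sharon", -0.5, "A"),
    Hyperedge((3, 6), (), "held talks", -1.0, "A"),
    Hyperedge((0, 6), ((0, 3), (3, 6)), "", -0.1, "A"),  # straight order
]
system_b = [
    Hyperedge((0, 3), (), "Bush with Sharon", -0.9, "B"),
    Hyperedge((3, 6), (), "held a talk", -0.7, "B"),
    Hyperedge((0, 6), ((3, 6), (0, 3)), "", -0.8, "B"),  # inverted order
]

fused = fuse(system_a, system_b)
score, translation = best(fused, (0, 6))
alone_a = best(fuse(system_a), (0, 6))
alone_b = best(fuse(system_b), (0, 6))
```

In this toy case the fused search combines system A's ordering with system B's translation of the second span, a derivation that neither system can produce alone. This is the sense in which fusing at the hypergraph level enlarges the decoding search space. Weights for the fused features would then be tuned, for example with MERT as in step 3.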
[0044] Modern machine translation technology is based on a bilingual grammar, which is a quadruple
[0045] G = (V_N, V_T, P, S), where V_N is the set of no...
Specific Embodiment 3
[0090] Embodiment 3: This embodiment differs from Embodiment 2 in that the preprocessing for the machine translation system is specifically:
[0091] (1) Segment the source-language and target-language text into words;
[0092] (2) Apply part-of-speech tagging to the sentences that require it, and perform bilingual alignment at the same time;
[0093] (3) Parse the sentences that require syntactic analysis;
[0094] (4) Merge the alignment information with the part-of-speech and syntax information;
[0095] (5) Extract phrases and compute the feature scores associated with them.
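Step (5) can be sketched as follows, assuming the common alignment-consistency criterion for phrase pairs and a relative-frequency feature score. Both choices are assumptions, since the steps above do not name the extraction rule or the features. The function names, the romanized source tokens, and the alignment are hypothetical placeholders.

```python
from collections import Counter


def extract_phrases(src, tgt, alignment, max_len=4):
    """Return phrase pairs that are consistent with the word alignment.

    `alignment` is a set of (source_index, target_index) links.
    Simplified: target spans are not extended over unaligned words.
    """
    pairs = []
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to the source span [i1, i2].
            linked = [j for (i, j) in alignment if i1 <= i <= i2]
            if not linked:
                continue
            j1, j2 = min(linked), max(linked)
            if j2 - j1 >= max_len:
                continue
            # Reject the pair if a target word in [j1, j2] aligns outside [i1, i2].
            if any(j1 <= j <= j2 and not (i1 <= i <= i2) for (i, j) in alignment):
                continue
            pairs.append((" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1])))
    return pairs


def relative_frequency(pairs):
    """Feature score: count(src, tgt) / count(src) over the extracted pairs."""
    pair_counts = Counter(pairs)
    src_counts = Counter(s for s, _ in pairs)
    return {(s, t): c / src_counts[s] for (s, t), c in pair_counts.items()}


# Hypothetical segmented sentence pair with a one-to-one alignment.
src = ["bushi", "yu", "shalong", "juxing", "huitan"]
tgt = ["Bush", "and", "Sharon", "held", "talks"]
alignment = {(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)}

pairs = extract_phrases(src, tgt, alignment)
scores = relative_frequency(pairs)
```

The extracted pairs and their scores are what a phrase table would store. A production extractor would also handle unaligned words and compute scores in both translation directions.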
[0096] To make the preprocessing process easier to understand, this embodiment uses a tree-to-string model as an example. Figure 3 shows the word segmentation of the sentence "Bush and Sharon held talks"; Figure 4 shows the part-of-speech tagging of the segmented sentence; Figure 5...