Chinese-Vietnamese unsupervised neural machine translation method fusing an EMD-minimized bilingual dictionary
The method combines a bilingual dictionary with machine translation technology. It is applied in the field of machine translation and addresses problems such as poor translation quality.
Embodiment 1
[0055] Embodiment 1: As shown in Figures 1-7, the Chinese-Vietnamese unsupervised neural machine translation method fusing an EMD-minimized bilingual dictionary proceeds as follows. Step1, obtain the monolingual corpora: 58 million Chinese monolingual corpus entries crawled from the Internet and 30 million Vietnamese monolingual corpus entries.
[0056] Step2, corpus preprocessing: on the basis of Step1, the Chinese and Vietnamese monolingual sentences are word-segmented and part-of-speech tagged, and monolingual word vectors are then trained on the segmented text. For Vietnamese, word segmentation and part-of-speech tagging use the undertheseanlp Vietnamese word segmentation tool; for Chinese, they use the jieba word segmentation tool. word2vec is used to train the Chinese and Vietnamese monolingual word vectors, 300-dimensional for both languages. The 300-dimensional word embeddings are trained using the skip-gr...
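The preprocessing in Step2 could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the helper names (`tag_chinese`, `tag_vietnamese`, `build_corpus`, `train_vectors`) are hypothetical, the `window` and `min_count` values are illustrative defaults not given in the text, and `sg=1` (skip-gram) is only our reading of the truncated "skip-gr..." above. It assumes the third-party packages jieba, underthesea, and gensim (version 4 or later) are installed. Part-of-speech tags are produced by the taggers but discarded here, since only the tokens feed word2vec.

```python
from typing import Callable, Iterable, List, Tuple

Tagged = List[Tuple[str, str]]  # [(token, POS tag), ...]


def tag_chinese(sentence: str) -> Tagged:
    """Segment and POS-tag one Chinese sentence with jieba."""
    import jieba.posseg as pseg  # third-party: pip install jieba
    return [(pair.word, pair.flag) for pair in pseg.cut(sentence)]


def tag_vietnamese(sentence: str) -> Tagged:
    """Segment and POS-tag one Vietnamese sentence with underthesea."""
    from underthesea import pos_tag  # third-party: pip install underthesea
    return [(word, tag) for word, tag in pos_tag(sentence)]


def build_corpus(lines: Iterable[str],
                 tagger: Callable[[str], Tagged]) -> List[List[str]]:
    """Turn raw lines into token lists, one list per non-blank line.

    Assumption: multi-syllable tokens such as "sinh viên" are joined with
    underscores so that each segmented word is a single vocabulary item.
    """
    corpus = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        tokens = [tok.replace(" ", "_")
                  for tok, _tag in tagger(line) if tok.strip()]
        if tokens:
            corpus.append(tokens)
    return corpus


def train_vectors(corpus: List[List[str]], dim: int = 300):
    """Train monolingual word vectors (300-dimensional by default)."""
    from gensim.models import Word2Vec  # third-party: pip install gensim
    model = Word2Vec(
        sentences=corpus,
        vector_size=dim,  # 300 dimensions, as stated in the text
        sg=1,             # skip-gram (assumed from the truncated text)
        window=5,         # illustrative value, not from the source
        min_count=5,      # illustrative value, not from the source
    )
    return model.wv
```

Each language would be processed separately, for example `train_vectors(build_corpus(chinese_lines, tag_chinese))` and `train_vectors(build_corpus(vietnamese_lines, tag_vietnamese))`, where `chinese_lines` and `vietnamese_lines` are placeholder iterables over the raw monolingual text. Passing the tagger as a parameter keeps the corpus-building logic identical for both languages.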