Chinese and Vietnamese neural machine translation method fusing zero pronouns and chapter information

A machine translation technique for zero pronouns, applied in neural architectures, natural language translation, and biological neural network models. It addresses the problem that existing models translate only the simpler cases of omitted pronouns, and it enriches semantic features and improves translation performance.

Inactive Publication Date: 2022-06-07
KUNMING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

Although the Transformer can capture richer semantic information through its multi-head attention mechanism, it often translates only the simpler cases of omitted pronouns.

Examples

Embodiment 1

[0055] Example 1: As shown in Figure 1, the Chinese-Vietnamese neural machine translation method fusing zero pronouns and chapter information proceeds through the following steps:

[0056] Step 1. Crawl, collect, and construct Chinese-Vietnamese-English trilingual parallel data using web crawler technology. Use a matrix alignment method to find the pronouns omitted in the Chinese-Vietnamese bilingual data, and use the dependency parsing library DDParser to analyze the syntactic roles of these omitted pronouns; these roles serve as the ground-truth labels for the classification task. This yields a Chinese-Vietnamese comparable corpus dataset annotated with zero pronoun information. Joint learning allows the parameters of the classification model and the translation model to be learned and updated simultaneously.

[0057] Step 1.1. Crawl, collect, and construct Chinese-Vietnamese-English parallel data using web crawler technology;

[0058] Step1.2. Perform zero pro...
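
The alignment-based search for omitted pronouns in Step 1 can be pictured with a small sketch. Everything here is an illustrative assumption rather than the patent's procedure: the alignment matrix is written by hand, English is taken as the reference side because it rarely drops pronouns, and the pronoun list and the function name `find_dropped_pronouns` are hypothetical. The DDParser analysis that follows in the patent is not reproduced.

```python
# Hypothetical sketch: flag pronouns that appear on a reference side
# (English here, an assumption) but have no aligned token on the other side.

EN_PRONOUNS = {"i", "you", "he", "she", "it", "we", "they"}  # illustrative list


def find_dropped_pronouns(ref_tokens, alignment):
    """Return (index, token) for reference pronouns with no aligned token.

    alignment[i][j] is 1 if reference token i aligns to target token j.
    """
    dropped = []
    for i, tok in enumerate(ref_tokens):
        if tok.lower() in EN_PRONOUNS and not any(alignment[i]):
            dropped.append((i, tok))
    return dropped


# Toy example: English "I like it" against Chinese "喜欢 它",
# where the subject pronoun is omitted on the Chinese side.
ref = ["I", "like", "it"]
align = [
    [0, 0],  # "I"    has no aligned Chinese token
    [1, 0],  # "like" aligns to 喜欢
    [0, 1],  # "it"   aligns to 它
]
result = find_dropped_pronouns(ref, align)
print(result)
```

In this toy case only the subject pronoun is flagged. In practice the alignment matrix would come from an alignment model, and the flagged positions would then be passed to a parser to obtain syntactic-role labels.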

Abstract

The invention relates to a Chinese-Vietnamese neural machine translation method fusing zero pronouns and chapter (document-level) information, and belongs to the technical field of natural language processing. The method comprises the following steps: constructing a Chinese-Vietnamese-English trilingual aligned chapter data set and annotating the Chinese-Vietnamese data with zero pronoun classification labels; obtaining the bilingual features of the source sentence and of its context separately using a self-attention mechanism; pooling and concatenating the source sentence and context features, then classifying the syntactic role of each zero pronoun; and predicting the target sentence from the source sentence and context features through two attention sub-layers. The classification task and the translation task are combined through joint learning, so the parameters of the main task model and the auxiliary model are learned and updated at the same time. Adding chapter information to the classification task improves zero pronoun classification accuracy, and the same information also improves translation performance. By fusing zero pronouns and chapter information, the method effectively improves Chinese-Vietnamese neural machine translation performance.
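
The pooling, concatenation, and joint-learning steps in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patent's implementation: mean pooling, a single linear layer with softmax, three classes, and the weighting factor `lam` are all hypothetical choices, and the translation loss is passed in as a number rather than computed by a decoder.

```python
import math


def mean_pool(vectors):
    """Average a list of equal-length feature vectors (assumed pooling type)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(src_feats, ctx_feats, weights, bias):
    """Pool source and context features, concatenate, apply a linear layer."""
    pooled = mean_pool(src_feats) + mean_pool(ctx_feats)  # list concatenation
    logits = [sum(w * x for w, x in zip(row, pooled)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


def joint_loss(translation_loss, class_probs, gold_label, lam=1.0):
    """Translation loss plus lam times the classifier's cross-entropy."""
    cls_loss = -math.log(class_probs[gold_label])
    return translation_loss + lam * cls_loss


# Toy inputs: two source token vectors, one context vector, three classes.
src = [[1.0, 0.0], [0.0, 1.0]]
ctx = [[2.0, 2.0]]
weights = [[0.0] * 4 for _ in range(3)]  # zero weights give uniform probabilities
bias = [0.0, 0.0, 0.0]

probs = classify(src, ctx, weights, bias)
loss = joint_loss(2.0, probs, gold_label=0, lam=0.5)
print(probs, loss)
```

Because a single scalar loss combines both terms, one backward pass in a real framework would update the translation model and the classifier together, which is the point of the joint learning described above.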

Description

Technical field

[0001] The invention relates to a Chinese-Vietnamese neural machine translation method fusing zero pronouns and chapter information, and belongs to the technical field of natural language processing.

Background

[0002] Exchanges between China and Vietnam are growing closer, and demand for Chinese-Vietnamese translation technology is increasing. Research on translation in low-resource settings such as Chinese-Vietnamese is steadily improving. However, current research mostly targets formal styles, such as news texts and official documents. For informal styles, such as online comments and everyday spoken conversation, the same translation models perform noticeably worse. The main reason for the poor translation performance is that in spoken language and daily dialogue scenar...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/58, G06F40/211, G06K9/62, G06N3/04
CPC: G06F40/58, G06F40/211, G06N3/048, G06F18/2415
Inventor: 余正涛, 王麒鼎
Owner: KUNMING UNIV OF SCI & TECH