A Chinese translation method of multilingual geographical noun roots based on transformer deep learning model

A method combining deep learning with place name knowledge, applied in the field of machine translation. It addresses confused word order, results that do not match Chinese usage habits, and incorrect choices between free translation and transliteration in Chinese translations of place names, with the stated effect of improving translation efficiency and reducing dependence on manual work.

Active Publication Date: 2021-04-09
NANJING WENTUJING INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] At present, high-tech companies such as Google, Microsoft, and Baidu have launched translation products that have been widely praised. However, when translating foreign place names, these products often misapply free translation and transliteration, so that the place names are not translated correctly. When a place name is rendered as an adjective or a special noun, the word order of the Chinese result may be confused, and the result may not match Chinese usage habits.



Examples


Embodiment Construction

[0030] The specific implementation of the present invention is described in detail below with reference to the accompanying drawings. The method for translating multilingual geographical noun roots into Chinese, based on the Transformer deep learning model, includes the following steps:

[0031] (1) Preprocess the original foreign-language place name corpus and the corresponding Chinese translation corpus. First, remove the special characters from the foreign-language place name corpus. Next, expand the abbreviated parts of the cleaned place names according to rules. Finally, convert the expanded corpus to lowercase and apply diacritic substitution.

[0032] 1) Through the method of establishing a special character library combined with string matching, the special characters such as "#$. / -" that exist in the corpus of foreign place names due to encoding conversion and incomplete data cleanin...
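The preprocessing order in step (1) can be illustrated with a minimal Python sketch. It is not the patent's implementation: the special-character set contains only the examples quoted above, the abbreviation table and the extra substitution table are hypothetical, and replacing a special character with a space (rather than deleting it) is an assumption made so that hyphenated names stay tokenized.

```python
import unicodedata

# Special-character library: only the examples quoted in the text ("#$. / -").
SPECIAL_CHARS = {"#", "$", ".", "/", "-"}

# Hypothetical abbreviation rules; the actual rule set is not given in this excerpt.
ABBREVIATIONS = {"st": "saint", "mt": "mount", "ft": "fort"}

# Hypothetical substitutions for letters that Unicode decomposition does not split.
EXTRA_SUBSTITUTIONS = {"ß": "ss", "æ": "ae", "œ": "oe", "ø": "o"}


def remove_special_chars(name):
    """Replace each special character with a space, then collapse whitespace."""
    cleaned = "".join(" " if ch in SPECIAL_CHARS else ch for ch in name)
    return " ".join(cleaned.split())


def expand_abbreviations(name):
    """Expand whole tokens that match an abbreviation rule (case-insensitive)."""
    return " ".join(ABBREVIATIONS.get(tok.lower(), tok) for tok in name.split())


def replace_diacritics(name):
    """Map accented letters to their base letters."""
    for src, dst in EXTRA_SUBSTITUTIONS.items():
        name = name.replace(src, dst)
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def preprocess(name):
    """Apply the steps in the order given in step (1)."""
    name = remove_special_chars(name)
    name = expand_abbreviations(name)
    name = name.lower()
    return replace_diacritics(name)


if __name__ == "__main__":
    for raw in ["St. Louis", "Saint-Étienne", "Gießen#"]:
        print(raw, "->", preprocess(raw))
```

Abbreviations are expanded after special characters are removed so that a form such as "St." is matched as the bare token "St"; this ordering follows the sequence given in the text.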


Abstract

The invention discloses a method for translating multilingual place names into Chinese based on the Transformer model, covering English, French and German. Using a place name language knowledge base together with the linguistic characteristics of the input place name, the method identifies the language of the place name to be translated. According to that language, it selects the corresponding rule from a geographical noun root extraction rule base and extracts the root of the place name. A character embedding model then converts the extracted root text into character vectors. These vectors are input to a Transformer model that has been trained and fine-tuned on a translation corpus of English, French and German geographical noun roots and their corresponding Chinese roots, and the model outputs the final Chinese translation of the root. The resulting Chinese translations of English, French and German geographical noun roots have good readability and conform to Chinese reading habits. The method meets the needs of Chinese translation of multilingual geographical noun roots to a certain extent and has good flexibility and universality.
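The step that turns root text into character vectors can be sketched as follows. This is a minimal illustration only: the excerpt does not specify the character embedding model, so the vocabulary construction, the `<pad>` and `<unk>` tokens, the fixed length, and the randomly initialised lookup table (standing in for learned embeddings) are all assumptions.

```python
import random

PAD, UNK = "<pad>", "<unk>"  # hypothetical reserved tokens


def build_vocab(roots):
    """Assign an integer id to every character seen in the training roots."""
    chars = sorted({ch for root in roots for ch in root})
    return {tok: idx for idx, tok in enumerate([PAD, UNK] + chars)}


def encode(root, vocab, max_len):
    """Map a root to a fixed-length list of character ids, padding the tail."""
    ids = [vocab.get(ch, vocab[UNK]) for ch in root[:max_len]]
    return ids + [vocab[PAD]] * (max_len - len(ids))


def make_embedding_table(vocab_size, dim, seed=0):
    """Random table used as a placeholder for embeddings learned in training."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]


def embed(ids, table):
    """Look up one vector per character id."""
    return [table[i] for i in ids]


if __name__ == "__main__":
    vocab = build_vocab(["river", "mont"])
    table = make_embedding_table(len(vocab), dim=4)
    ids = encode("rio", vocab, max_len=5)
    vectors = embed(ids, table)
    print(ids, len(vectors), len(vectors[0]))
```

In a trained system the lookup table would be learned jointly with the Transformer or supplied by a separate embedding model; the sequence of vectors returned by `embed` is what the encoder would receive.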

Description

Technical field

[0001] The invention relates to the field of machine translation, in particular to a method for translating English, French and German geographical noun roots into Chinese based on a Transformer deep learning model.

Background technique

[0002] As indispensable basic geographic information and public social information, place names are an important bridge connecting various kinds of social information, and they play an important role in national and social management, economic development, cultural construction, national defense and diplomacy. A large number of foreign place names appear in the course of economic exchanges, so a method that can translate foreign place names reasonably is urgently needed.

[0003] In recent years, research on neural machine translation has developed rapidly, and translation quality has improved significantly compared with statistical machine translation. Neural machine translation usually uses an encoder-decoder ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/58, G06F40/289, G06F16/903, G06N3/04, G06N3/08
CPC: G06F40/58, G06F40/289, G06F16/90344, G06N3/084, G06N3/047, G06N3/044, G06N3/045
Inventor: 张雪英, 赵文强, 吴恪涵
Owner: NANJING WENTUJING INFORMATION TECH CO LTD