Translation model training method and device

A translation model training method and device, applied to computing equipment and computer-readable storage media. It addresses the high consumption of computing resources, large model size, and long training period of existing translation models, and achieves a smaller model size while maintaining accuracy and improving performance.

Pending Publication Date: 2020-11-13
BEIJING KINGSOFT DIGITAL ENTERTAINMENT CO LTD

AI Technical Summary

Problems solved by technology

Existing translation models have a complex structure, a large number of parameters in each sub-layer, and a large model size. Training such a model takes a long time and consumes substantial computing resources.


Embodiment Construction

[0074] In the following description, numerous specific details are set forth in order to provide a thorough understanding of the application. However, the present application can be implemented in many other ways different from those described here, and those skilled in the art can make similar generalizations without departing from the spirit of the present application. Therefore, the present application is not limited by the specific implementation disclosed below.

[0075] Terms used in one or more embodiments of the present application are for the purpose of describing specific embodiments only, and are not intended to limit the one or more embodiments of the present application. As used in one or more embodiments of this application and the appended claims, the singular forms "a", "an", and "the" are also intended to include the plural forms unless the context clearly dictates otherwise. It should also be understood that the term "and/or" used in one or more embodiments of th...


Abstract

The invention provides a translation model training method and device. The translation model comprises an encoder and a decoder. The encoder comprises n encoding layers connected in sequence, and the decoder comprises n decoding layers connected in sequence. The self-attention sub-layer of the i-th encoding layer and the self-attention sub-layer of the i-th decoding layer share a self-attention parameter, where n is greater than or equal to 1 and i is greater than or equal to 1 and less than or equal to n. The method comprises the following steps: receiving a training sentence and a target sentence corresponding to the training sentence; obtaining a training sentence vector corresponding to the training sentence and a target sentence vector corresponding to the target sentence; inputting the training sentence vector into the encoder and encoding it to obtain an encoding vector; inputting the encoding vector and the target sentence vector into the decoder and decoding them to obtain a decoding vector; calculating a loss value according to the decoding vector; and adjusting the parameters of the translation model according to the loss value.
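The sharing arrangement in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `SelfAttentionParams`, `Layer`, and `build_shared_stack` are hypothetical, each layer is reduced to its self-attention sub-layer, and the attention is a plain single-head scaled dot-product with no masking, which the source does not specify. The point is only that the i-th encoding layer and the i-th decoding layer hold a reference to the same parameter object.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


class SelfAttentionParams:
    """Hypothetical container for one set of query/key/value projections."""

    def __init__(self, w_q, w_k, w_v):
        self.w_q, self.w_k, self.w_v = w_q, w_k, w_v


def self_attention(x, params):
    """Single-head scaled dot-product self-attention over the rows of x."""
    q = matmul(x, params.w_q)
    k = matmul(x, params.w_k)
    v = matmul(x, params.w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = [[s / scale for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v)


class Layer:
    """An encoding or decoding layer, reduced to its self-attention sub-layer."""

    def __init__(self, self_attn_params):
        self.self_attn_params = self_attn_params


def build_shared_stack(n, make_params):
    """Create n encoder and n decoder layers; layer i of each shares one parameter set."""
    shared = [make_params() for _ in range(n)]
    encoder = [Layer(p) for p in shared]
    decoder = [Layer(p) for p in shared]
    return encoder, decoder


def identity_params():
    # Assumed toy weights: 2x2 identity projections, chosen only for illustration.
    eye = [[1.0, 0.0], [0.0, 1.0]]
    return SelfAttentionParams([r[:] for r in eye], [r[:] for r in eye], [r[:] for r in eye])


encoder, decoder = build_shared_stack(2, identity_params)
x = [[1.0, 0.0], [0.0, 1.0]]
enc_out = self_attention(x, encoder[0].self_attn_params)
dec_out = self_attention(x, decoder[0].self_attn_params)
```

Because both layers reference one object, a single parameter update made during training is seen by the encoder and the decoder alike, and only one copy of the self-attention weights needs to be stored per layer index.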

Description

Technical field

[0001] The present application relates to the technical field of artificial intelligence, and in particular to a translation model training method and device, computing equipment, and a computer-readable storage medium.

Background

[0002] With the improvement of computer computing power, neural networks are being applied more and more widely, for example to build translation models that convert sentences to be translated into target sentences.

[0003] The translation model is an end-to-end network structure that includes an encoder and a decoder. The encoder includes multiple encoding layers, and the decoder includes multiple decoding layers. Each encoding layer includes a self-attention sub-layer and a feed-forward neural network sub-layer. Each decoding layer includes a self-attention sub-layer, an encoding-decoding attention sub-layer, and a feed-forward neural network sub-layer. Each sub-layer has its own parameter...
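To give a rough sense of how the per-sub-layer parameters add up, the following sketch tallies parameters for the layer structure described above, with and without sharing the self-attention parameters between the i-th encoding and decoding layers. The dimensions (`n = 6`, `d_model = 512`, `d_ff = 2048`) and the per-sub-layer formulas are assumptions in the style of a standard Transformer and do not come from the source; biases, embeddings, and normalization parameters are ignored.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class ParamBlock:
    """A named block of trainable parameters; only its size is modelled."""
    name: str
    size: int


def attention_params(name, d_model):
    # Assumed: query, key, value and output projections, each d_model x d_model.
    return ParamBlock(name, 4 * d_model * d_model)


def ffn_params(name, d_model, d_ff):
    # Assumed: two linear maps, d_model -> d_ff -> d_model.
    return ParamBlock(name, 2 * d_model * d_ff)


def build_model(n, d_model, d_ff, share_self_attention):
    """Return (encoder, decoder) as lists of layers, each layer a list of ParamBlocks."""
    encoder, decoder = [], []
    for i in range(1, n + 1):
        enc_sa = attention_params(f"enc{i}.self_attn", d_model)
        if share_self_attention:
            dec_sa = enc_sa  # same object: one parameter set for both layers
        else:
            dec_sa = attention_params(f"dec{i}.self_attn", d_model)
        encoder.append([enc_sa, ffn_params(f"enc{i}.ffn", d_model, d_ff)])
        decoder.append([
            dec_sa,
            attention_params(f"dec{i}.enc_dec_attn", d_model),
            ffn_params(f"dec{i}.ffn", d_model, d_ff),
        ])
    return encoder, decoder


def count_unique_params(encoder, decoder):
    """Sum block sizes, counting a shared block only once."""
    seen = {}
    for layer in encoder + decoder:
        for block in layer:
            seen[id(block)] = block.size
    return sum(seen.values())


n, d_model, d_ff = 6, 512, 2048  # illustrative sizes, not from the source
baseline = count_unique_params(*build_model(n, d_model, d_ff, share_self_attention=False))
shared = count_unique_params(*build_model(n, d_model, d_ff, share_self_attention=True))
saved = baseline - shared
print(f"baseline={baseline} shared={shared} saved={saved}")
```

Under these assumptions the saving equals one self-attention parameter set per layer index, that is `n * 4 * d_model**2` parameters.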

Application Information

IPC(8): G06F40/47, G06F40/126, G06N3/04
CPC: G06F40/47, G06F40/126, G06N3/044, G06N3/045
Inventor: 李长亮, 郭馨泽
Owner: BEIJING KINGSOFT DIGITAL ENTERTAINMENT CO LTD