
Transformer model processing method, readable storage medium and equipment

A model-processing and storage-medium technology, applied in speech analysis, speech recognition, and instruments; it addresses the problem of a large parameter count and achieves the effect of reducing the number of parameters.

Pending Publication Date: 2022-05-10
FOSHAN UNIVERSITY

AI Technical Summary

Problems solved by technology

[0005] The Transformer model performs exceptionally well in speech recognition tasks, but its large number of parameters during training remains a shortcoming.

Method used



Examples


Embodiment 1

[0029] This embodiment provides a Transformer model processing method, as shown in Figure 1, comprising the steps:

[0030] S1. During training, calculate the target layer according to the sub-layer weights of the Transformer model; the target layer is then deleted or retained during the next training.

[0031] Further, step S1 includes:

[0032] S11. For the two sub-layers in the Encoder structure of the Transformer model and the last two sub-layers of the Decoder structure, perform matrix operations on the weights associated with each sub-layer's output;

[0033] The weight associated with a sub-layer's output is the weight matrix W_i nearest that output. Let u_1 and u_2 be all-ones matrices and perform the matrix operation: n_i = u_1 W_i u_2.

[0034] S12. Select the minimum value among the matrix-operation results and take the layer where the minimum value is located as the target layer; a different target layer is obtained for each sub-layer type, and the target layer is deleted in the next training.
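As a loose illustration of steps S11 and S12 (a sketch, not the patent's actual implementation), the code below scores each candidate sub-layer with the all-ones matrix product n_i = u_1 W_i u_2, which equals the sum of the entries of the sub-layer's output-side weight matrix W_i, and marks the layer with the minimal score as the target layer. All function names and shapes here are hypothetical.

```python
import numpy as np

def sublayer_scores(weights):
    """Score each sub-layer by n_i = u1 @ W_i @ u2 with all-ones u1, u2.

    weights: list of 2-D weight matrices, one per candidate sub-layer.
    Because u1 and u2 are all ones, n_i is simply the sum of W_i's entries.
    """
    scores = []
    for W in weights:
        u1 = np.ones((1, W.shape[0]))       # all-ones row vector
        u2 = np.ones((W.shape[1], 1))       # all-ones column vector
        n_i = (u1 @ W @ u2).item()          # scalar: sum of all entries of W
        scores.append(n_i)
    return scores

def target_layer(weights):
    """Index of the sub-layer with the minimal score (the pruning target)."""
    return int(np.argmin(sublayer_scores(weights)))

# Toy example: three sub-layer weight matrices; the one with the
# smallest entry sum is selected for deletion in the next training.
Ws = [np.full((2, 2), 3.0), np.full((2, 2), 1.0), np.full((2, 2), 2.0)]
print(target_layer(Ws))  # -> 1
```

Deleting the selected sub-layer before the next training pass is what reduces the model's parameter count, per the abstract.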

Embodiment 2

[0043] This embodiment provides a computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the Transformer model processing method of Embodiment 1.

[0044] Optionally, the computer-readable storage medium may include: a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a solid-state drive (SSD, Solid State Drive), an optical disc, and the like. The random access memory may include a resistive random access memory (ReRAM, Resistive Random Access Memory) and a dynamic random access memory (DRAM, Dynamic Random Access Memory).

Embodiment 3

[0046] This embodiment provides a device, which can be a computer device or a mobile terminal device such as a mobile phone or a tablet computer. It includes a processor and a memory in which program code is stored; the processor executes the program code to perform the Transformer model processing method of Embodiment 1.

[0047] Those skilled in the art should be aware that, in the foregoing one or more examples, the functions described in the embodiments of the present application may be implemented by hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on, or transmitted over, a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.



Abstract

The invention provides a Transformer model processing method, a readable storage medium, and equipment. The method comprises the step of calculating a target layer according to the sub-layer weights of a Transformer model during training, and deleting or retaining the target layer during the next training. The method automatically identifies the sub-layers to be deleted or retained through forward propagation and backpropagation: the sub-layers to be deleted or retained in the next training are computed from the weights obtained in the current training. This reduces the parameter count of the model and achieves a lightweight Transformer model.

Description

technical field

[0001] The invention belongs to the technical field of speech recognition, and in particular relates to a Transformer model processing method, a readable storage medium, and equipment.

Background technique

[0002] The Transformer model is an Encoder-Decoder (codec) framework that is stacked 6 times. The Encoder structure contains two sub-layers: Multi-Head Attention (MHA) and a Feed-Forward Neural Network (FFN). The Decoder structure is similar to the Encoder structure but has three sub-layers: two MHA sub-layers and one FFN sub-layer. The Encoder and Decoder structures have their own functions. The main function of the Encoder structure is to extract features from the input audio signal; after 6 repetitions, high-level features of the audio signal are obtained. The first MHA sub-layer of the Decoder extracts features from the input character information, and the second MHA sub-layer matches the...
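The stacked structure described in [0002] can be sketched in plain Python (the names below are illustrative, not from the patent). It shows why the parameter count grows: every one of the 6 repetitions of each structure carries its full set of sub-layers, giving 6 × 2 + 6 × 3 = 30 prunable sub-layers in total.

```python
# Illustrative sketch of the 6x-stacked Encoder-Decoder layout described
# in the background section: each Encoder block holds [MHA, FFN], each
# Decoder block holds [MHA, MHA, FFN].
ENCODER_BLOCK = ["MHA", "FFN"]
DECODER_BLOCK = ["MHA", "MHA", "FFN"]

encoder = [list(ENCODER_BLOCK) for _ in range(6)]
decoder = [list(DECODER_BLOCK) for _ in range(6)]

# Total candidate sub-layers across the whole model: 6*2 + 6*3 = 30.
total = sum(len(b) for b in encoder) + sum(len(b) for b in decoder)
print(total)  # -> 30
```

Deleting even one sub-layer per repetition, as the claimed method does across training iterations, therefore removes a meaningful fraction of the model's parameters.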

Claims


Application Information

IPC IPC(8): G10L15/06G10L15/16G10L15/02
CPCG10L15/063G10L15/16G10L15/02
Inventor 阮锦标段志奎陈嘉维于昕梅高国智严世泉王虎伟
Owner FOSHAN UNIVERSITY