Data access boundary-alignment processing method for deep learning half-precision operators

A data access boundary-alignment processing method, applied in the field of deep learning, that solves the problem of misaligned DMA memory access by half-precision operators and achieves reduced memory occupancy, shorter alignment processing time, and improved performance.

Pending Publication Date: 2022-03-22
JIANGNAN INST OF COMPUTING TECH

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to provide a simple and general boundary-alignment processing method for the half-precision data types commonly used in implementing deep learning operators, so as to solve the problem of misaligned DMA access by half-precision operators on heterogeneous many-core platforms.



Examples


Embodiment

[0026] Embodiment: The present invention provides a data access boundary-alignment processing method for deep learning half-precision operators. Based on the computational characteristics of deep learning operators and the spatial distribution of tensors, 4B boundary alignment is applied to specific dimensions of a multi-dimensional tensor. The input data of a four-dimensional tensor is divided into different classes according to the dimensions that actually participate in the calculation, and a different half-precision data alignment method is used for each class;

[0027] Specifically, a different alignment method is selected according to the type of the input operator and the calculation dimension of the input data.

[0028] S1, for one-dimensional calculation (such as activation functions, where the four-dimensional tensor is actually processed as a one-dimensional array), calculate the total amount of data len = N * C * H * W. If len is odd, then because a single half-precision floating-point value occupies 2B, the data does not satisfy the 4B alignment requirement, and a 0 is added at the end of L...
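
As a minimal sketch of the S1 padding rule, assuming the zero is appended to the end of the flattened data (the source text is truncated at that point), the following Python uses only the standard library. The function name pad_flat_half and the tensor shape are hypothetical, chosen for illustration.

```python
import struct

HALF_BYTES = 2    # one half-precision value occupies 2B
DMA_GRANULE = 4   # DMA alignment granularity stated in the text (4B)

def pad_flat_half(values):
    """Pack values as IEEE half floats, appending one 0 if the byte size
    would otherwise not be a multiple of the 4B DMA granule.
    Hypothetical helper; not the patent's actual implementation."""
    padded = list(values)
    if (len(padded) * HALF_BYTES) % DMA_GRANULE != 0:
        padded.append(0.0)  # single zero element restores 4B alignment
    return struct.pack(f"<{len(padded)}e", *padded), len(padded)

# Hypothetical four-dimensional shape, treated as one flat array.
n, c, h, w = 1, 3, 5, 5
length = n * c * h * w  # 75 elements, an odd count
buf, padded_len = pad_flat_half([1.0] * length)
```

Only one extra element is ever needed in this case, because an odd count of 2B values is always exactly 2B short of a 4B boundary.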


Abstract

The invention discloses a data access boundary-alignment processing method for deep learning half-precision operators. The method comprises: applying 4B boundary alignment to specific dimensions of a multi-dimensional tensor according to the computational characteristics of deep learning operators and the spatial distribution of the tensor; and dividing the input data of a four-dimensional tensor into different classes according to the dimensions that actually participate in the calculation, with a different half-precision data boundary-alignment method used for each class. Specifically, a different boundary-alignment method is selected according to the type of the input operator and the calculation dimension of the input data. The method solves the problem of misaligned DMA memory access by half-precision operators on heterogeneous many-core platforms, reduces memory occupancy, effectively shortens alignment processing time, and improves alignment processing performance.

Description

Technical field

[0001] The present invention relates to a data access boundary-alignment processing method for deep learning half-precision operators, and belongs to the field of deep learning technology.

Background technique

[0002] The half-precision data type occupies less memory and shortens computation time, and can effectively improve the performance of deep learning model training. Half-precision operators therefore play an important role in accelerating deep learning model training.

[0003] Data transfer between the control core and the computing cores of a heterogeneous many-core platform is mainly implemented through DMA requests, and DMA only supports 4B-granularity alignment. This means that a DMA request must guarantee that parameters such as the main memory address, the computing-core local memory address, the amount of data transferred, the stride, and other stride-related parameters all meet the 4B-granularity alignment requirement, and the ...
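
The 4B-granularity constraint described in [0003] can be illustrated with a small check. This is only a sketch: the function misaligned_params and the parameter names are hypothetical, and all values are assumed to be expressed in bytes.

```python
DMA_GRANULE = 4  # bytes; the 4B granularity stated in the text

def misaligned_params(**params):
    """Return the names of DMA parameters (in bytes) that are not
    multiples of 4. Hypothetical helper for illustration only."""
    return sorted(name for name, value in params.items()
                  if value % DMA_GRANULE != 0)

# Hypothetical request: 75 half-precision values (2B each) = 150 bytes.
bad = misaligned_params(main_mem_addr=0x1000, local_mem_addr=0x200,
                        transfer_bytes=75 * 2, stride_bytes=8)
```

Here the transfer size is the only parameter that violates the constraint, which is the situation an odd number of half-precision values produces.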


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F13/28
CPC: G06F13/28
Inventor: 刘鑫, 刘沙, 陈德训, 彭超, 黄则强, 高捷, 王宜鹏
Owner: JIANGNAN INST OF COMPUTING TECH