A method and device for improving GEMM computing performance

A technique for optimizing GEMM computing performance, applied in the computer field, that addresses the poor results of existing optimization methods.

Active Publication Date: 2021-10-26
XFUSION DIGITAL TECH CO LTD

AI Technical Summary

Problems solved by technology

In the detection stage of deep learning, existing methods for improving GEMM computing performance are not effective.

Embodiment Construction

[0024] The method and device for improving GEMM computing performance disclosed in this application can be applied when detecting targets with a deep convolutional neural network. As shown in Figure 1, after the deep convolutional neural network has been trained, it is used to detect acquired targets. For example, a face image is obtained through an image acquisition device and, as the target, is passed to the trained deep convolutional neural network, which outputs a detection result indicating the identity of the face.

[0025] In the process of detecting the target, the deep convolutional neural network converts the convolution calculation into GEMM calculation. During research, the applicant found that existing methods for improving the performance of GEMM calculations are aimed at large-scale matrices. Therefore, they should be used in the process of tr...
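To make the contrast with large-scale matrices concrete, the sketch below computes the GEMM dimensions that a single convolution layer produces when one image is processed at detection time. The mapping (M = output channels, K = input channels times kernel area, N = number of output pixels) is the common im2col-style convention and is an assumption here; the text above does not say how the conversion is done. The function name and the layer sizes used with it are hypothetical.

```python
from typing import NamedTuple


class GemmShape(NamedTuple):
    m: int  # rows of the weight matrix (output channels)
    n: int  # columns of the unfolded input (output pixels)
    k: int  # shared inner dimension (in_channels * kernel * kernel)


def conv_to_gemm_shape(in_channels, out_channels, height, width,
                       kernel, stride=1, padding=0):
    """GEMM shape of one square-kernel convolution on a single image.

    Assumes an im2col-style unfolding; this is an illustration,
    not the conversion described in the patent.
    """
    out_h = (height + 2 * padding - kernel) // stride + 1
    out_w = (width + 2 * padding - kernel) // stride + 1
    return GemmShape(m=out_channels,
                     n=out_h * out_w,
                     k=in_channels * kernel * kernel)
```

With hypothetical layer sizes, `conv_to_gemm_shape(3, 64, 224, 224, 3, padding=1)` yields a very wide matrix with a tiny inner dimension, while `conv_to_gemm_shape(512, 512, 7, 7, 3, padding=1)` yields a narrow matrix with a long inner dimension. Neither resembles the large, roughly square matrices that general-purpose GEMM tuning tends to target.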

Abstract

The application provides a method and device for improving GEMM computing performance. The parameters of the general matrix-matrix multiplication (GEMM) calculation to be optimized are obtained, and a target parameter is queried from the parameters of at least one historical GEMM calculation; the target parameter is a parameter that satisfies a preset relationship with the parameters of the GEMM calculation to be optimized. The optimization method corresponding to the target parameter is determined according to a preset correspondence between parameters and optimization methods, and that optimization method is used to optimize the GEMM calculation to be optimized. The parameters of a GEMM calculation are determined based on the sizes of the matrices involved in the calculation. Because the characteristics of the matrices participating in the GEMM calculation to be optimized are used as the basis for optimizing it, the performance of the GEMM calculation can be improved when a deep convolutional neural network is used to detect a target, even if the matrices are small or irregular in shape.
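The two lookups described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the abstract does not define the "preset relationship", so nearest match in log2 space with a distance cutoff is assumed; the historical (M, N, K) triples, the strategy names, and all function names are hypothetical.

```python
import math

# Hypothetical parameters of historical GEMM calculations, as (M, N, K).
HISTORICAL_PARAMS = [(64, 50176, 27), (512, 49, 4608), (1024, 1024, 1024)]

# Hypothetical preset correspondence: parameter -> optimization method.
STRATEGY_FOR = {
    (64, 50176, 27): "pack_b_only",
    (512, 49, 4608): "small_n_kernel",
    (1024, 1024, 1024): "blocked_tiling",
}


def log_distance(a, b):
    """Euclidean distance between two (M, N, K) triples in log2 space."""
    return math.sqrt(sum((math.log2(x) - math.log2(y)) ** 2
                         for x, y in zip(a, b)))


def find_target_params(params, history=HISTORICAL_PARAMS, max_distance=1.0):
    """Return the historical parameter closest to `params`, or None.

    The cutoff stands in for the unspecified 'preset relationship'.
    """
    target = min(history, key=lambda h: log_distance(params, h))
    return target if log_distance(params, target) <= max_distance else None


def select_optimization(params):
    """Map a GEMM's (M, N, K) to an optimization method name, or None."""
    target = find_target_params(params)
    return None if target is None else STRATEGY_FOR[target]
```

A call such as `select_optimization((512, 64, 4608))` falls back on the entry recorded for `(512, 49, 4608)`, while a shape far from every historical entry returns `None`, leaving the caller free to use a default path.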

Description

Technical field

[0001] The present application relates to the computer field, and in particular to a method and device for improving GEMM computing performance.

Background

[0002] General Matrix-matrix Multiplication (GEMM) is generally a calculation on dense matrices and is widely used in deep learning. Deep learning is a branch of machine learning based on learning representations of data.

[0003] With the development of deep learning, the deep convolutional neural network has become the most widely used network structure, and it is widely used in the image and speech fields. The core algorithm of the deep convolutional neural network is convolution calculation, and the current mainstream implementation converts convolution calculation into GEMM calculation. Studies have shown that GEMM calculations occupy most of the computing resources of deep convolutional neural networks. It can be seen that the performance of GEMM calculations direct...
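As an illustration of how a convolution can be expressed as a GEMM, the sketch below unfolds the input into a matrix (the widely used im2col approach) and multiplies it by the flattened filters. The background text does not name a specific conversion, so im2col is an assumption; the code uses plain nested lists, stride 1, and no padding for brevity, and all function names are hypothetical.

```python
def im2col(image, kh, kw):
    """Unfold a C x H x W image into a (C*kh*kw) x (out_h*out_w) matrix."""
    channels, height, width = len(image), len(image[0]), len(image[0][0])
    out_h, out_w = height - kh + 1, width - kw + 1
    rows = []
    for c in range(channels):
        for i in range(kh):
            for j in range(kw):
                # One row per kernel position, one column per output pixel.
                rows.append([image[c][y + i][x + j]
                             for y in range(out_h) for x in range(out_w)])
    return rows


def gemm(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def conv2d_via_gemm(image, filters):
    """Convolve with F x C x kh x kw filters; returns F x (out_h*out_w)."""
    kh, kw = len(filters[0][0]), len(filters[0][0][0])
    # Flatten each filter in the same (channel, row, column) order as im2col.
    weights = [[w for channel in f for row in channel for w in row]
               for f in filters]
    return gemm(weights, im2col(image, kh, kw))
```

Each output row holds one filter's response over all output positions in row-major order. The sliding-window products follow the cross-correlation convention that deep learning frameworks usually call convolution.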

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/16, G06N3/04
CPC: G06F17/16, G06N3/045
Inventor: 齐霁, 张邵敏, 贾海鹏
Owner: XFUSION DIGITAL TECH CO LTD