Deep learning operator optimization method and device, equipment and storage medium

A deep learning operator optimization technology, applied in the field of artificial neural networks, which addresses problems such as increased power consumption, increased system resource occupancy on terminal devices, and poor performance of deep learning frameworks.

Pending Publication Date: 2021-06-08
BEIJING XIAOMI PINECONE ELECTRONICS CO LTD

AI Technical Summary

Problems solved by technology

However, when the existing Buffer method is used to perform fixed-point quantization operations, the data is loaded from the processor's level-2 cache (L2 cache), resulting in slow quantization and poor performance of the deep learning framework when running neural network model inference. This in turn increases the system resource usage and power consumption of the terminal device.
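
For concreteness, a minimal sketch of the Buffer-based quantization path described above is given below as an OpenCL C kernel; the kernel name, parameters, and int8 output format are illustrative assumptions rather than the patent's concrete code. Loads from a __global buffer such as in[i] are typically served through the GPU's L2 cache, which is the slow path identified here.

    /* Illustrative baseline (assumed, not from the patent): fixed-point (int8)
     * quantization that reads its input from a plain OpenCL Buffer. Loads from
     * __global memory normally go through the L2 cache, the slow path above. */
    __kernel void quantize_from_buffer(__global const float *in,
                                       __global char *out,
                                       const float scale,
                                       const int zero_point,
                                       const int n)
    {
        int i = get_global_id(0);
        if (i >= n) return;

        /* Affine fixed-point quantization: q = round(x / scale) + zero_point */
        int q = convert_int_rte(in[i] / scale) + zero_point;

        /* Clamp to the signed 8-bit range before storing. */
        out[i] = (char)clamp(q, -128, 127);
    }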




Embodiment Construction

[0054] Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with aspects of the present disclosure as recited in the appended claims.

[0055] Explanation of terms:

[0056] OpenCL (Open Computing Language) is the first open, royalty-free standard for general-purpose parallel programming of heterogeneous systems. It provides a unified programming environment in which software developers can write efficient, lightweight code for high-performance computing servers, desktop computing systems, and hand-held devices, and it is widely applicable to multi-core processors (CPUs), graphics processing units (GPUs), Cell-type architectures, and other parallel ...
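
To make the Buffer and Image objects referred to in this disclosure concrete, a host-side sketch using the standard OpenCL C API is shown below; the single-channel float format, the 2D dimensions, and the omission of error handling are simplifying assumptions for illustration.

    #include <CL/cl.h>

    /* Illustrative host-side setup (assumptions: CL_R/CL_FLOAT format, 2D data,
     * no error handling): the same data can live in a linear Buffer or in an
     * Image2D object; kernels read the latter through the image/L1 cache path. */
    static void create_memory_objects(cl_context ctx, size_t width, size_t height)
    {
        cl_int err;

        /* Plain linear Buffer: kernel reads take the generic global-memory
         * (L2-cached) path. */
        cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_ONLY,
                                    width * height * sizeof(float), NULL, &err);

        /* Image2D object: kernel reads via read_imagef() can use the GPU's
         * texture/L1 cache hardware. */
        cl_image_format fmt = { CL_R, CL_FLOAT };
        cl_image_desc desc = { 0 };
        desc.image_type = CL_MEM_OBJECT_TYPE_IMAGE2D;
        desc.image_width = width;
        desc.image_height = height;
        cl_mem img = clCreateImage(ctx, CL_MEM_READ_ONLY, &fmt, &desc, NULL, &err);

        /* ... enqueue writes and kernels using buf and img here ... */

        clReleaseMemObject(img);
        clReleaseMemObject(buf);
    }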



Abstract

The invention relates to a deep learning operator optimization method and device, equipment and a storage medium. The method comprises the steps of: in response to target data being detected in a first-level cache (L1 cache) of a graphics processor, calling an Image-object read method to read the target data from the first-level cache into the processor; performing a secondary quantization operation on the target data in the processor to obtain an operation result; and writing the operation result into the main memory of the graphics processor. By using the L1-cache acceleration mechanism of the graphics processor, the invention increases the speed at which the target data is read, thereby speeding up the subsequent quantization of the neural network based on the read target data, improving the interaction efficiency between the deep learning framework and the memory, increasing the execution speed of the deep learning framework, and reducing the system resource occupation and power consumption of the terminal.
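
A minimal kernel-side sketch of the flow summarized in this abstract might look like the following OpenCL C code; it assumes the target data is laid out as a 2D single-channel float image, uses the standard read_imagef() image-read path (served by the L1/texture cache on many GPUs), and writes int8 results to a buffer in the GPU's main (global) memory. The kernel name, data layout, and quantization parameters are assumptions for illustration, not the patent's implementation.

    /* Illustrative sketch of the claimed flow (names and layout assumed):
     * read the target data through the Image object / L1 cache path,
     * quantize it in the processor, and write the result to main memory. */
    __constant sampler_t smp = CLK_NORMALIZED_COORDS_FALSE |
                               CLK_ADDRESS_CLAMP_TO_EDGE |
                               CLK_FILTER_NEAREST;

    __kernel void quantize_from_image(__read_only image2d_t in,
                                      __global char *out,
                                      const float scale,
                                      const int zero_point,
                                      const int width)
    {
        int x = get_global_id(0);
        int y = get_global_id(1);

        /* read_imagef() uses the image/L1 cache path rather than the
         * generic L2-cached buffer path. */
        float v = read_imagef(in, smp, (int2)(x, y)).x;

        /* Affine fixed-point quantization of the value read from L1. */
        int q = convert_int_rte(v / scale) + zero_point;

        /* Write the operation result back to main (global) memory. */
        out[y * width + x] = (char)clamp(q, -128, 127);
    }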

Description

Technical Field

[0001] The present disclosure relates to the technical field of artificial neural networks, and in particular to a deep learning operator optimization method, device, equipment, and storage medium.

Background Technique

[0002] With the rapid development of artificial intelligence technology, deep learning neural networks have achieved great success in pattern recognition tasks such as image classification, object detection, image segmentation, speech recognition, and machine translation. However, a neural network with better performance usually has more model parameters and therefore higher computational complexity, which manifests in two aspects: space (storage) and time (computation). Quantization of neural networks therefore becomes particularly important, especially for applications running on terminal devices or embedded chips.

[0003] In related technologies, the neural network on a terminal device can be accelerated by using a fixed-point ...
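
As generic background on the fixed-point quantization mentioned here, the following C sketch derives an affine int8 scale and zero point from a tensor's value range and quantizes a single value; it illustrates the common technique only and is not the patent's specific scheme.

    #include <math.h>
    #include <stdio.h>

    /* Generic affine int8 quantization (background illustration only):
     * map the float range [min_v, max_v] onto the integer range [-128, 127]. */
    static void affine_params(float min_v, float max_v,
                              float *scale, int *zero_point)
    {
        *scale = (max_v - min_v) / 255.0f;                    /* float step per int8 level */
        *zero_point = (int)lroundf(-128.0f - min_v / *scale); /* integer that real 0.0 maps to */
    }

    static signed char quantize(float x, float scale, int zero_point)
    {
        long q = lroundf(x / scale) + zero_point;
        if (q < -128) q = -128;          /* clamp to the int8 range */
        if (q >  127) q =  127;
        return (signed char)q;
    }

    int main(void)
    {
        float scale;
        int zp;
        affine_params(-1.0f, 1.0f, &scale, &zp);
        printf("scale=%f zero_point=%d q(0.5)=%d\n",
               scale, zp, quantize(0.5f, scale, zp));
        return 0;
    }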

Claims


Application Information

Patent Type & Authority Applications(China)
IPC IPC(8): G06F9/50G06N3/08
CPCG06F9/5016G06N3/08Y02D10/00G06N3/063G06T1/60G06N3/04G06F7/483G06F12/0897G06F2212/454G06F2212/455G06F2212/302G06V10/82G06V10/95G06F18/21
Inventor 李滨
Owner BEIJING XIAOMI PINECONE ELECTRONICS CO LTD