Fixed-point quantization method and device for deep learning model

A fixed-point quantization technology, applied in the field of deep learning, that addresses slow inference in deep learning networks, growth in model storage size, and the resulting difficulty of deploying applications, so as to save time and improve the speed and efficiency of quantization.

Pending Publication Date: 2021-09-17
珠海亿智电子科技有限公司

AI Technical Summary

Problems solved by technology

[0002] Common deep learning models usually have a large number of parameters and layers in order to improve network accuracy, which sharply increases model storage size and slows inference. Because of this slow inference, many high-precision deep learning networks can only run on GPU systems with high computing power, which makes them difficult to deploy in real applications.

Examples

Detailed Description of the Embodiments

[0030] Embodiments of the present invention are described in detail below, and examples of them are shown in the drawings, in which the same or similar reference numerals designate the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the figures are exemplary, serve only to explain the present invention, and should not be construed as limiting it.

[0031] In the description of the present invention, "several" means one or more, and "multiple" means two or more. "Greater than", "less than", "exceeding", etc. are understood as excluding the stated number, while "above", "below", "within", etc. are understood as including it. Where "first" and "second" are used, they serve only to distinguish technical features and cannot be understood as indicating or implying relative importance or implicitly indicating the number...

Abstract

The invention discloses a fixed-point quantization method and device for a deep learning model. The method comprises inputting calibration data to a target model and, taking the model parameters and then the activation values of the target model as the quantization object in turn, executing the following steps: inputting the calibration set data; extracting the quantization object of the target model layer by layer; obtaining a distribution histogram of the quantization object; scaling the distribution histogram through an adaptive KL divergence equation; obtaining the KL divergence values corresponding to different decimal point positions for a preset number of quantization bits; and comparing these values to obtain a first quantization result for the quantization object. This overcomes the defect that the KL divergence algorithm considers only probability and improves the quantization result. While a given accuracy of the quantized model is maintained, quantization speed is greatly increased, quantization efficiency is improved, and time is saved.
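The search over decimal point positions described in the abstract can be illustrated with a minimal sketch. The adaptive KL divergence equation and the histogram scaling step are not given in this excerpt, so the code below substitutes the standard KL divergence between a histogram of the original values and a histogram of the same values after a fixed-point round trip. The function names (`quantize_fixed_point`, `kl_divergence`, `choose_frac_bits`), the 8-bit width, the candidate range, and the bin count are all assumptions made for illustration and do not come from the patent.

```python
import numpy as np


def quantize_fixed_point(x, frac_bits, total_bits=8):
    """Round x onto a signed fixed-point grid with `frac_bits` fractional bits."""
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (total_bits - 1))
    qmax = 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(x * scale), qmin, qmax)  # integer codes, saturated
    return q / scale                              # back to real values


def kl_divergence(p, q, eps=1e-10):
    """Standard KL(P || Q) between two histograms (a stand-in for the
    patent's adaptive variant, which this excerpt does not specify)."""
    p = p / p.sum()
    q = np.maximum(q / q.sum(), eps)  # floor empty bins to avoid log(0)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def choose_frac_bits(values, total_bits=8, candidates=None, num_bins=128):
    """Return the decimal point position (fractional bit count) with the
    lowest KL divergence, plus the score of every candidate."""
    values = np.asarray(values, dtype=np.float64).ravel()
    if candidates is None:
        candidates = range(total_bits)  # assumed search range: 0 .. total_bits-1
    max_abs = np.abs(values).max()
    edges = np.linspace(-max_abs, max_abs, num_bins + 1)
    ref_hist = np.histogram(values, bins=edges)[0].astype(np.float64)

    scores = {}
    for fb in candidates:
        deq = quantize_fixed_point(values, fb, total_bits)
        deq = np.clip(deq, -max_abs, max_abs)  # keep values inside the histogram range
        q_hist = np.histogram(deq, bins=edges)[0].astype(np.float64)
        scores[fb] = kl_divergence(ref_hist, q_hist)
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Synthetic stand-in for one layer's weights or activations.
    rng = np.random.default_rng(0)
    layer_values = rng.uniform(-0.9, 0.9, 20000)
    best, scores = choose_frac_bits(layer_values)
    print("chosen fractional bits:", best)
    for fb, kl in scores.items():
        print(f"  frac_bits={fb}: KL={kl:.4f}")
```

In this sketch the histogram is deliberately coarse (128 bins) so that an 8-bit grid can populate every bin when the decimal point is well placed. A position that is too coarse leaves bins empty, and one that is too fine saturates the tails, and both raise the divergence. Following the abstract, the same search would be run layer by layer, first on the model parameters and then on the activation values collected from the calibration set.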

Description

Technical field

[0001] The invention relates to the technical field of deep learning, in particular to a method and device for fixed-point quantization of a deep learning model.

Background

[0002] Common deep learning models usually have a large number of parameters and layers in order to improve network accuracy, which sharply increases model storage size and slows inference. Because of this slow inference, many high-precision deep learning networks can only run on GPU systems with high computing power, which makes them difficult to deploy in real applications.

[0003] As deep learning has grown in popularity in recent years, the demand for deploying it has increased, and device-side hardware often uses low-precision computing units in order to balance cost and speed. Compared with full-precision models, or mixed-precision models that include half precision, low-precision models are less precise but infer faster and occupy less memory. Therefore, how to convert between full-precision and low-p...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06N3/08, G06K9/62
CPC: G06N3/08, G06N3/045, G06F18/2415
Inventor: Not disclosed
Owner: 珠海亿智电子科技有限公司