
Quantization model generation method and terminal

A model quantization technology, applied in the field of neural networks. It addresses the problems of accuracy loss in quantized models, reduced model accuracy, and inaccurate gradient estimation, and achieves the effect of reducing accuracy loss and improving accuracy.

Active Publication Date: 2021-04-13
SANLI VIDEO FREQUENCY SCI & TECH SHENZHEN
15 Cites · 1 Cited by
  • Summary
  • Abstract
  • Description
  • Claims
  • Application Information

AI Technical Summary

Problems solved by technology

[0003] In the prior art, a deep-learning-based floating-point 32-bit target detection model is usually quantized to 8 bits: most of the fp32 (float, floating-point) multiplication calculations in the original model are converted into int8 (integer) multiplications and fp32 additions. However, after the quantization operation, even if a pseudo-quantization node is introduced for training, a certain loss of accuracy still results.
[0004] Another method quantizes a deep-learning-based floating-point 32-bit image classification model to a low bit width, converting the fp32 multiplication calculations in the original model into underlying low-bit dot-product bit operations, and at the same time introduces quantization training into the fp32 model to further reduce the accuracy loss of the quantized model. However, the gradient introduced by this training is approximated with the straight-through estimator, and the inaccurate estimation still causes the model accuracy to drop.
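The straight-through estimator mentioned above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function names (`fake_quantize`, `ste_grad`) and the int8 range are assumptions:

```python
import numpy as np

def fake_quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Forward pass of a pseudo-quantization node: quantize to the
    int8 range, then dequantize back to float so training sees the
    quantization error."""
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax)
    return scale * (q - zero_point)

def ste_grad(upstream_grad, x, scale, zero_point, qmin=-128, qmax=127):
    """Straight-through estimator: treat round() as the identity, so
    the gradient passes through unchanged inside the clipping range
    and is zeroed outside it. This approximation is the source of the
    estimation error discussed above."""
    q = x / scale + zero_point
    mask = (q >= qmin) & (q <= qmax)
    return upstream_grad * mask
```

Because `round()` has zero gradient almost everywhere, the identity approximation is the only practical choice, but it does not match the true (degenerate) gradient, which is why accuracy can still drop.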

Method used


Image

  • Quantization model generation method and terminal

Examples

Experimental program
Comparison scheme
Effect test

Embodiment 1

[0072] Please refer to figures 1 and 3-5. A quantization model generation method of this embodiment comprises the steps:

[0073] S1. Use the preset data set to train the target detection model to obtain a converged floating-point target detection model;

[0074] In this embodiment, the preset data set is PASCAL VOC, the target detection algorithm applied is RetinaNet, and the target detection model is the Backbone network;

[0075] For example, use RetinaNet to train the Backbone on the PASCAL VOC dataset to obtain a converged floating-point target detection model, that is, the converged fp32 model;

[0076] S2. Quantize the target detection model and perform training based on gradient estimation to obtain a converged first quantized model;

[0077] Wherein, the linear quantization method applied in the quantization process can be selected arbitrarily;

[0078] In this embodiment, the applied linear quantization method is shown in the following formula:

[0079] r=Round(S(q-Z))...
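The formula above is truncated, so its exact form cannot be recovered from this excerpt. A common linear quantization scheme consistent with the symbols used (scale S, zero point Z, float value r, integer code q) is sketched below; treat the role assignments as assumptions, not the patent's definition:

```python
import numpy as np

def quantize(r, S, Z, qmin=-128, qmax=127):
    """Map float values r to int8 codes: q = clip(round(r/S + Z)).
    S is the scale, Z the zero point (assumed roles)."""
    return np.clip(np.round(r / S + Z), qmin, qmax).astype(np.int8)

def dequantize(q, S, Z):
    """Recover approximate float values: r ~ S * (q - Z), matching
    the S(q - Z) term visible in the patent's formula."""
    return S * (q.astype(np.float32) - Z)

# Round-trip with an assumed scale/zero point for values in [-1, 1].
r = np.array([-1.0, 0.0, 0.5, 1.0], dtype=np.float32)
S, Z = 1.0 / 127, 0
q = quantize(r, S, Z)
r_hat = dequantize(q, S, Z)
```

The round-trip error is bounded by half the scale S, which is the per-value precision loss linear quantization introduces before any training compensates for it.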

Embodiment 2

[0092] Please refer to figures 1 and 3-5. On the basis of Embodiment 1, this embodiment further defines how to optimize the quantized target detection model:

[0093] Aiming at the problems of decreased accuracy and difficulty in convergence, the first quantization model is optimized using the principle of "knowledge distillation";

[0094] Specifically, determine the joint loss function of the target detection model and the first quantization model according to their respective classification loss functions, joint classification loss functions, respective regression loss functions, and joint regression loss functions;

[0095] For example, as shown in Figure 5, the upper part of Figure 5 is the convergent fp32 model, whose network outputs the category and position of the target simultaneously, that is, Classification and Regression. The lower part of Figure 5 is the int8 model that converges during quantization-aware traini...
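A joint loss of the kind described above, combining each model's own classification and regression losses with distillation terms that pull the int8 student toward the fp32 teacher, might look like the following sketch. The specific loss choices (cross-entropy, smooth L1, KL divergence) and the weighting `alpha` are assumptions, not taken from the patent text:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    # Mean negative log-likelihood of the true classes.
    p = softmax(logits)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def smooth_l1(a, b):
    # Standard smooth L1 (Huber) regression loss.
    d = np.abs(a - b)
    return np.mean(np.where(d < 1.0, 0.5 * d ** 2, d - 0.5))

def kl_div(p_teacher, p_student):
    # KL divergence from student to teacher class distributions.
    return np.mean(np.sum(p_teacher * np.log((p_teacher + 1e-12) /
                                             (p_student + 1e-12)), axis=-1))

def joint_loss(fp32_cls, fp32_reg, int8_cls, int8_reg,
               labels, boxes, alpha=0.5):
    """Hypothetical joint loss: each model's own task losses against
    the ground truth, plus distillation terms tying the int8 outputs
    to the fp32 teacher's outputs."""
    task = (cross_entropy(fp32_cls, labels) + cross_entropy(int8_cls, labels)
            + smooth_l1(fp32_reg, boxes) + smooth_l1(int8_reg, boxes))
    distill = (kl_div(softmax(fp32_cls), softmax(int8_cls))
               + smooth_l1(int8_reg, fp32_reg))
    return task + alpha * distill
```

When the student exactly matches the teacher, the distillation terms vanish and only the task losses remain, so the extra terms act purely as a penalty on the teacher-student gap.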

Embodiment 3

[0111] In this embodiment, on the basis of Embodiment 1 or Embodiment 2, it is further verified whether the converged second quantized model obtained through joint training can reduce the accuracy loss after model quantization and improve the accuracy of the target detection model after quantization:

[0112] Use the VOC2007 test set of the PASCAL VOC benchmark to test the convergent fp32 model, the convergent int8 model, and the convergent optimal int8 model respectively, and obtain the corresponding mAP (mean average precision), as shown in the following table:

[0113]
class           fp32 model   int8 model   Optimal int8 model
aeroplane       0.839        0.797        0.769
bicycle         0.849        0.810        0.829
bird            0.850        0.796        0.832
boat            0.657        0.630        0.603
bottle          0.618        0.547        0.609
bus             0.851        0.793        0.799
car             0.876        0.858        0.865
cat             0.933        0.922        0.928
chair           0.558        0.492        0.506
cow             0.802        0.701        0.752
dining table    ...
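Averaging the per-class AP values that survive in this truncated excerpt gives a rough sense of the comparison. Note this is only a partial illustration: the full PASCAL VOC mAP averages all 20 classes, and the remaining rows are cut off above:

```python
# Per-class AP for the ten classes visible in the (truncated) table.
ap = {
    "fp32":    [0.839, 0.849, 0.850, 0.657, 0.618, 0.851, 0.876, 0.933, 0.558, 0.802],
    "int8":    [0.797, 0.810, 0.796, 0.630, 0.547, 0.793, 0.858, 0.922, 0.492, 0.701],
    "optimal": [0.769, 0.829, 0.832, 0.603, 0.609, 0.799, 0.865, 0.928, 0.506, 0.752],
}
# Partial mAP = mean of the visible per-class APs only.
partial_map = {k: sum(v) / len(v) for k, v in ap.items()}
```

Over these ten classes, the optimized int8 model recovers part of the gap between the plain int8 model and the fp32 model, consistent with the effect the embodiment claims.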


PUM

No PUM

Abstract

The invention discloses a quantization model generation method and terminal. The method comprises the steps of: training a target detection model with a data set to obtain a converged floating-point target detection model; quantizing it and training based on gradient estimation to obtain a converged first quantization model, which suffers precision loss and reduced precision because the quantization operation and the gradient estimation are inaccurate; and finally performing joint training on the target detection model and the first quantization model based on their joint loss function to obtain a converged second quantization model. Through the joint training, the target detection model can guide the first quantization model on the basis of the knowledge distillation principle, so that the first quantization model learns the feature extraction capability of the target detection model and an optimal second quantization model is obtained. The precision loss of the model after quantization is thereby reduced, and the precision of the quantized target detection model is improved.

Description

technical field

[0001] The invention relates to the technical field of neural networks, in particular to a quantization model generation method and terminal.

Background technique

[0002] With the development of deep learning technology, in order to speed up the running of neural network models and facilitate the deployment of neural networks on mobile terminals, quantization techniques are usually used to quantize floating-point computing models into fixed-point computing models.

[0003] In the prior art, a deep-learning-based floating-point 32-bit target detection model is usually quantized to 8 bits: most of the fp32 (float, floating-point) multiplication calculations in the original model are converted into int8 (integer) multiplications and fp32 additions. However, after the quantization operation, even if a pseudo-quantization node is introduced for training, a certain loss of accuracy still results.

[0004] Ther...

Claims


Application Information

Patent Timeline
no application
IPC(8): G06N3/08, G06N3/04
CPC: G06N3/084, G06N3/045
Inventor: 潘成龙 (Pan Chenglong), 张宇 (Zhang Yu), 刘东剑 (Liu Dongjian)
Owner SANLI VIDEO FREQUENCY SCI & TECH SHENZHEN