Quantization model generation method and terminal
A quantized-model generation technology, applied in the field of neural networks, addresses the problems of accuracy loss in quantized models, reduced model accuracy, and inaccurate estimation, and achieves the effect of reducing accuracy loss and improving accuracy.
Examples
Embodiment 1
[0072] Referring to Figures 1 and 3-5, a method for generating a quantized model according to this embodiment comprises the following steps:
[0073] S1. Train the target detection model on a preset data set to obtain a converged floating-point target detection model;
[0074] In this embodiment, the preset data set is PASCAL VOC, the target detection algorithm applied is RetinaNet, and the target detection model is the Backbone;
[0075] For example, RetinaNet is used to train the Backbone on the PASCAL VOC data set to obtain a converged floating-point target detection model, that is, the converged fp32 model;
[0076] S2. Quantize the target detection model and perform training based on gradient estimation to obtain a converged first quantized model;
[0077] Wherein, the linear quantization method applied in the quantization process can be selected arbitrarily;
[0078] In this embodiment, the applied linear quantization method is shown in the following formula:
[0079] r=Round(S(q-Z))...
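The linear quantization and gradient-estimation steps of S2 can be illustrated with a minimal sketch. It assumes the common affine scheme in which a real value r maps to an integer q = round(r / S) + Z and is recovered as r ≈ S(q - Z); the formula in this embodiment is truncated in this text, so the exact form used may differ. The function names, the int8 range, the min/max choice of S and Z, and the use of a straight-through estimator for the gradient are all assumptions made for illustration, not details confirmed by the source.

```python
def choose_qparams(r_min, r_max, q_min=-128, q_max=127):
    """Pick a scale S and zero point Z mapping [r_min, r_max] onto [q_min, q_max]."""
    # Widen the range so that 0.0 is always representable (assumption).
    r_min, r_max = min(r_min, 0.0), max(r_max, 0.0)
    if r_max == r_min:
        return 1.0, 0
    scale = (r_max - r_min) / (q_max - q_min)
    zero_point = int(round(q_min - r_min / scale))
    zero_point = max(q_min, min(q_max, zero_point))
    return scale, zero_point


def quantize(r, scale, zero_point, q_min=-128, q_max=127):
    """Real value -> clipped integer code."""
    q = int(round(r / scale)) + zero_point
    return max(q_min, min(q_max, q))


def dequantize(q, scale, zero_point):
    """Integer code -> approximate real value, r = S * (q - Z)."""
    return scale * (q - zero_point)


def fake_quantize(r, scale, zero_point):
    """Quantize then dequantize, as done in the forward pass during training."""
    return dequantize(quantize(r, scale, zero_point), scale, zero_point)


def ste_grad(r, upstream_grad, scale, zero_point, q_min=-128, q_max=127):
    """Straight-through estimator (assumed): rounding is treated as identity,
    so the gradient passes through unchanged inside the representable range
    and is zero where the value was clipped."""
    lo = scale * (q_min - zero_point)
    hi = scale * (q_max - zero_point)
    return upstream_grad if lo <= r <= hi else 0.0


if __name__ == "__main__":
    s, z = choose_qparams(-1.0, 1.0)
    for value in (0.0, 0.5, -0.73, 100.0):
        print(value, quantize(value, s, z), fake_quantize(value, s, z))
```

Inside the representable range the round-trip error of `fake_quantize` is at most half a scale step, while values outside the range are clipped and receive zero gradient under the assumed estimator.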
Embodiment 2
[0092] Referring to Figures 1 and 3-5, on the basis of Embodiment 1, this embodiment further defines how to optimize the quantized target detection model:
[0093] To address the problems of decreased accuracy and difficulty in convergence, the first quantized model is optimized using the principle of "knowledge distillation";
[0094] Specifically, the joint loss function of the target detection model and the first quantized model is determined from their respective classification loss functions, their joint classification loss function, their respective regression loss functions, and their joint regression loss function;
[0095] For example, as shown in Figure 5, the upper part of Figure 5 is the converged fp32 model, whose network outputs both the category and the position of the target, that is, Classification and Regression. The lower part of Figure 5 is the int8 model that converges during quantization-aware traini...
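The joint loss described in [0094] can be sketched as follows. The source names six ingredients (a classification loss and a regression loss for each model, plus a joint classification loss and a joint regression loss) but the text here is truncated before their definitions, so every concrete choice below is an assumption: plain cross-entropy stands in for the classification loss, smooth L1 for the regression loss, a soft cross-entropy between the two models' class distributions for the joint classification loss, mean squared error between the two models' box outputs for the joint regression loss, and a single hypothetical `distill_weight` to combine them.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Classification loss against a hard ground-truth label."""
    return -math.log(softmax(logits)[target_index])


def soft_cross_entropy(student_logits, teacher_logits):
    """Assumed joint classification loss: student matched to teacher probabilities."""
    p_teacher = softmax(teacher_logits)
    p_student = softmax(student_logits)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


def smooth_l1(pred, target):
    """Regression loss against ground-truth box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total / len(pred)


def mse(a, b):
    """Assumed joint regression loss: distance between the two models' boxes."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def joint_loss(fp32_out, int8_out, gt_class, gt_box, distill_weight=1.0):
    """Sum the respective and joint terms; each *_out is (class_logits, box)."""
    fp32_logits, fp32_box = fp32_out
    int8_logits, int8_box = int8_out
    respective = (
        cross_entropy(fp32_logits, gt_class)
        + cross_entropy(int8_logits, gt_class)
        + smooth_l1(fp32_box, gt_box)
        + smooth_l1(int8_box, gt_box)
    )
    joint = soft_cross_entropy(int8_logits, fp32_logits) + mse(int8_box, fp32_box)
    return respective + distill_weight * joint


if __name__ == "__main__":
    teacher = ([2.0, 0.0, -1.0], [0.1, 0.2, 0.3, 0.4])
    student = ([1.5, 0.3, -0.8], [0.12, 0.18, 0.33, 0.41])
    print(joint_loss(teacher, student, gt_class=0, gt_box=[0.1, 0.2, 0.3, 0.4]))
```

In this sketch the joint terms shrink as the int8 outputs approach the fp32 outputs, which is the sense in which the fp32 model guides the quantized one.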
Embodiment 3
[0111] In this embodiment, on the basis of Embodiment 1 or Embodiment 2, it is further verified whether the converged second quantized model obtained through joint training can reduce the accuracy loss after model quantization and improve the accuracy of the target detection model after quantization:
[0112] The VOC2007 test set of the PASCAL VOC test standard is used to test the converged fp32 model, the converged int8 model, and the converged optimal int8 model respectively, and the corresponding mAP (mean average precision, an objective evaluation metric) is obtained, as shown in the following table:
[0113]
class         fp32 model  int8 model  Optimal int8 model
aeroplane     0.839       0.797       0.769
bicycle       0.849       0.810       0.829
bird          0.850       0.796       0.832
boat          0.657       0.630       0.603
bottle        0.618       0.547       0.609
bus           0.851       0.793       0.799
car           0.876       0.858       0.865
cat           0.933       0.922       0.928
chair         0.558       0.492       0.506
cow           0.802       0.701       0.752
dining table  ...
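Since mAP is the mean of the per-class average precision values, the comparison can be illustrated with a short sketch over the ten complete rows visible above. The table is truncated in this text, so the means computed here are partial and are not the mAP figures reported for the full VOC2007 class set; the variable and function names are illustrative only.

```python
# Per-class AP for the ten complete rows: (fp32, int8, optimal int8).
AP_BY_CLASS = {
    "aeroplane": (0.839, 0.797, 0.769),
    "bicycle": (0.849, 0.810, 0.829),
    "bird": (0.850, 0.796, 0.832),
    "boat": (0.657, 0.630, 0.603),
    "bottle": (0.618, 0.547, 0.609),
    "bus": (0.851, 0.793, 0.799),
    "car": (0.876, 0.858, 0.865),
    "cat": (0.933, 0.922, 0.928),
    "chair": (0.558, 0.492, 0.506),
    "cow": (0.802, 0.701, 0.752),
}


def mean_ap(column):
    """Mean of per-class AP for one model column (0=fp32, 1=int8, 2=optimal int8)."""
    values = [row[column] for row in AP_BY_CLASS.values()]
    return sum(values) / len(values)


def classes_improved():
    """Classes where the optimal int8 model scores higher than the plain int8 model."""
    return [name for name, (_, int8, opt) in AP_BY_CLASS.items() if opt > int8]


if __name__ == "__main__":
    for label, col in (("fp32", 0), ("int8", 1), ("optimal int8", 2)):
        print(f"{label}: partial mean AP = {mean_ap(col):.4f}")
    print("improved classes:", classes_improved())
```

Listing the classes per row, rather than only the mean, also shows where the jointly trained model does not improve on the plain int8 model (aeroplane and boat in the rows shown).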