Deep learning model tuning method and computing device

A deep learning model tuning method and computing device, applied in the field of neural networks, that address problems such as loss of data accuracy and help ensure that the selected model is optimal and that its precision data are accurate.

Pending Publication Date: 2021-07-20
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

[0004] However, converting the model from the FP32 model to the INT8 model will result in a lo



Embodiment Construction

[0052] The present disclosure is described below based on examples, but it is not limited to these examples. The following detailed description sets forth some specific details; those skilled in the art can fully understand the present disclosure without the description of these details. To avoid obscuring the essence of the present disclosure, well-known methods, procedures, and processes are not described in detail. Additionally, the drawings are not necessarily drawn to scale.

[0053] The following terms are used in this document.

[0054] Acceleration unit: also known as a neural network acceleration unit. It addresses the low efficiency of general-purpose processors in some special-purpose fields (for example, processing images or performing the various operations of neural networks): it is a processing unit designed to improve efficiency in these special-purpose fields, for the data processi...


Abstract

A computing device includes a memory, a scheduling unit, and an acceleration unit. The acceleration unit executes each quantized model, and the memory stores instructions. The scheduling unit reads the instructions to perform the following steps: creating a plurality of configuration combinations for the deep learning model, each configuration combination specifying a combination of values for a plurality of quantization configuration parameters; performing a quantization operation on the deep learning model based on each configuration combination to obtain a plurality of quantized models; sequentially deploying the plurality of quantized models to the acceleration unit and receiving from the acceleration unit the precision data corresponding to each quantized model; and, based on the precision data of the plurality of quantized models, obtaining an optimal model whose precision loss meets a set condition. According to the embodiments of the disclosure, the neural network acceleration unit and the scheduling unit cooperate with each other so that an optimal model with small precision loss can be obtained quickly.
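The steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the parameter names in `SEARCH_SPACE`, the `quantize` and `evaluate_on_accelerator` callables, and the "loss no greater than `max_loss`" condition are all hypothetical stand-ins for details the text does not specify.

```python
from itertools import product

# Hypothetical quantization configuration parameters and candidate values;
# the source does not name the actual parameters.
SEARCH_SPACE = {
    "calibration_method": ["minmax", "kl"],
    "per_channel": [False, True],
    "calibration_batches": [8, 32],
}


def create_config_combinations(space):
    """Enumerate every value combination of the configuration parameters."""
    names = list(space)
    return [dict(zip(names, values))
            for values in product(*(space[name] for name in names))]


def tune(model, space, quantize, evaluate_on_accelerator, fp32_accuracy, max_loss):
    """Quantize the model once per configuration, evaluate each result, and
    return the (config, accuracy) pair with the highest accuracy among the
    candidates whose accuracy loss is at most max_loss (None if there is none).

    quantize and evaluate_on_accelerator are caller-supplied callables that
    stand in for the quantization step and for the acceleration unit.
    """
    best = None
    for config in create_config_combinations(space):
        quantized = quantize(model, config)            # quantization operation
        accuracy = evaluate_on_accelerator(quantized)  # precision data
        loss = fp32_accuracy - accuracy
        if loss <= max_loss and (best is None or accuracy > best[1]):
            best = (config, accuracy)
    return best
```

An exhaustive grid is used here only because it is the simplest way to build "a plurality of configuration combinations"; the source does not say how the combinations are chosen.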

Description

Technical Field

[0001] The present disclosure relates to the field of neural networks, and in particular to a deep learning model tuning method and computing device.

Background

[0002] In the field of neural networks, inference refers to deploying a pre-trained deep learning model for use in actual business scenarios. Because inference directly faces the user, inference performance is critical, especially for enterprise-grade products.

[0003] Regarding inference performance, in addition to optimization at the hardware level, model quantization is one of the important means of improving inference performance at the algorithm level. There are currently many methods for model quantization; converting a model from a 32-bit single-precision floating-point (FP32) model to an 8-bit integer (INT8) model is one of them. Deep learning models are usually built with 32-bit single-precision floating-point numbers. ...
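To make the FP32-to-INT8 conversion concrete, the sketch below applies symmetric per-tensor quantization to a short list of floating-point values and then measures the round-trip error. The scheme, the function names, and the sample values are assumptions chosen for illustration; the source does not state which quantization method is used.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127].

    The scale is chosen so that the largest absolute value maps to 127.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map INT8 values back to approximate floating-point values."""
    return [q * scale for q in quantized]


# Illustrative values only; not taken from the source.
weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Per-element round-trip error, bounded by about half a quantization step.
errors = [abs(w - r) for w, r in zip(weights, restored)]
```

Small values such as 0.003 fall below half a quantization step and collapse to zero, which is one simple way to see where quantization error comes from.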


Application Information

IPC(8): G06N3/08, G06N3/04
CPC: G06N3/082, G06N3/045
Inventors: 赵晓辉, 李书森
Owner: ALIBABA GRP HLDG LTD