Model compression method and processing equipment based on pruning, weight sharing and coding

A compression method and weight-sharing technology, applied in the field of model compression, addresses the poor ability to compute artificial intelligence models, insufficient model compression, and the absence of weight sharing. It reduces computation and storage overhead, improves capability, and lowers energy consumption.

Status: Inactive
Publication Date: 2022-04-15

SHENZHEN HONGDIAN TECH CORP

Problems solved by technology

[0007] The purpose of the present invention is to provide a model compression method and processing device based on pruning, weight sharing and encoding. It addresses the technical problems in the prior art that model compression is insufficient and that weights are neither shared nor encoded, which leaves mobile devices with poor capability for computing AI models in real time.

Examples

Embodiment 1

[0027] As shown in Figure 1, the present invention provides a model compression method based on pruning, weight sharing and encoding, which includes the following steps.

S100: Through normal training, a model composed of original network weights and neurons is obtained. Training uses existing neural network techniques and yields the initial model before compression.

S200: According to a set threshold, the connection network between neurons in the model is pruned; that is, unnecessary original network weights and their corresponding neurons are removed, so that only the weight parameters important to the network are retained. The specific threshold can be set according to the distribution of the model's original network weights or the computing power of the mobile terminal, for example an original network weight of 0.05. Specifically, the original network weights and corresponding neurons that are lower...
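The threshold pruning in S200 can be illustrated with a minimal sketch. It is not the patent's implementation: the function name `prune_weights`, the list-of-rows layout, the comparison on absolute values, and the rule that a neuron is dropped once all of its incoming connections are pruned are assumptions made for illustration. Only the example threshold of 0.05 comes from the text.

```python
def prune_weights(weights, threshold=0.05):
    """Zero out connections whose magnitude is below `threshold`.

    `weights` is a list of rows, one row of incoming weights per neuron.
    Comparing absolute values is an assumption of this sketch.
    """
    pruned = [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]
    # Assumption: a neuron whose incoming connections were all pruned
    # is removed together with its weights.
    kept_rows = [i for i, row in enumerate(pruned) if any(w != 0.0 for w in row)]
    return [pruned[i] for i in kept_rows], kept_rows


layer = [
    [0.40, -0.02, 0.07],
    [0.01, 0.03, -0.04],   # every connection is below the threshold
    [-0.30, 0.00, 0.049],
]
pruned_layer, kept_neurons = prune_weights(layer, threshold=0.05)
print(kept_neurons)   # indices of the neurons that survive pruning
print(pruned_layer)
```

In this toy layer the second neuron has no connection at or above 0.05, so it is removed entirely, while the other two keep only their larger weights.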

Embodiment 2

[0038] A processing device comprising: one or more processors; and a memory for storing one or more computer programs. The one or more processors execute the one or more computer programs stored in the memory, so that the one or more processors carry out the model compression method based on pruning, weight sharing and encoding of the present invention. The processing device reduces the computational load and storage overhead of the artificial intelligence model through pruning, weight sharing, weight smoothing and encoding operations, has strong applicability, and also reduces energy consumption.
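To make the storage saving from the encoding operation concrete, the following sketch compares a fixed-length code with a variable-length code over a list of shared-weight index values. The source only says that variable-length bits are used; choosing a Huffman code here is an assumption, and the function name `huffman_code_lengths` and the toy index list are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over `symbols`."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(n, i, {s: 0}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


# Toy index stream: index 0 is far more frequent than the others.
indices = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2
lengths = huffman_code_lengths(indices)
variable_bits = sum(lengths[i] for i in indices)
fixed_bits = 2 * len(indices)  # 4 distinct indices need 2 bits each
print(lengths, variable_bits, fixed_bits)
```

Frequent index values receive shorter codes, so the skewed stream above needs 28 bits instead of the 32 bits a fixed 2-bit code would use. The gain depends entirely on how uneven the index distribution is.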

[0039] Those skilled in the art will understand that all or part of the features or steps of the above method embodiments can be realized as methods, data processing systems or computer programs, and that these features can be implemented entirely in software without dedicated hardware, or in hardware combined with software. The foregoing computer program may be stored in one...

Abstract

The invention discloses a model compression method and processing equipment based on pruning, weight sharing and coding. It relates to the technical field of model compression and solves the technical problems that model compression is insufficient and that mobile equipment has poor capability for computing an artificial intelligence model in real time. The method comprises the following steps: S100, obtaining an initial model composed of original network weights and neurons through normal training; S200, pruning the model according to a set threshold to obtain a result model; S300, sharing weights across the connection network among the neurons of the result model through clustering, to obtain shared network weights and the corresponding neurons; S400, smoothing the shared network weights obtained from weight sharing; and S500, encoding the smoothed shared network weights with variable-length bits and storing the index values of the shared network weights to obtain a compressed model. The method compresses the model fully and improves the capability of mobile equipment to compute the artificial intelligence model in real time.
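The weight sharing in S300 can be sketched as follows. The abstract says only that sharing is done "through clustering"; using one-dimensional k-means with centroids initialised evenly across the weight range is an assumption, as are the function name `share_weights` and the sample weights. The smoothing step S400 is not specified in the available text and is left out.

```python
def share_weights(weights, n_clusters, n_iter=20):
    """Cluster scalar weights with 1-D k-means.

    Returns (centroids, indices): each weight is represented by the index
    of its cluster, and all weights in a cluster share one centroid value.
    Requires n_clusters >= 2.
    """
    lo, hi = min(weights), max(weights)
    # Assumption: spread the initial centroids evenly over the weight range.
    step = (hi - lo) / (n_clusters - 1)
    centroids = [lo + k * step for k in range(n_clusters)]
    indices = [0] * len(weights)
    for _ in range(n_iter):
        # Assignment step: each weight picks its nearest centroid.
        indices = [
            min(range(n_clusters), key=lambda k: abs(w - centroids[k]))
            for w in weights
        ]
        # Update step: each centroid becomes the mean of its members.
        for k in range(n_clusters):
            members = [w for w, i in zip(weights, indices) if i == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return centroids, indices


weights = [0.10, 0.12, 0.11, -0.30, -0.32, 0.50, 0.52]
centroids, indices = share_weights(weights, n_clusters=3)
# The model now stores 3 shared values plus one small index per connection.
reconstructed = [centroids[i] for i in indices]
print(centroids, indices)
```

After sharing, only the centroid table and the per-connection index values need to be stored, which is what makes the index encoding in S500 worthwhile.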

Description

Technical field

[0001] The invention relates to the technical field of model compression, in particular to a model compression method and processing equipment based on pruning, weight sharing and encoding.

Background

[0002] For a long time, machine learning researchers have been committed to developing deeper and larger models to achieve higher precision and accuracy, but this has also led to models with a large number of parameters, high storage requirements, and complex calculations. Such models can usually run only on GPUs and servers, and the amount of calculation is unrealistic for edge terminals such as mobile phones and smart devices. Model compression that preserves the precision and accuracy of the network model has therefore become an important direction in machine learning.

[0003] By using the unique model compression algorithm technology, the corresponding amount of calculation can be completed on the edge terminal. The model compr...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06N3/08, G06N3/06, G06K9/62

CPC: G06N3/082, G06N3/061, G06F18/23

Inventor: 张小虎, 左绍舟, 龚潇, 刘文强, 丁靖

Owner: SHENZHEN HONGDIAN TECH CORP
