Model compression method and processing equipment based on pruning, weight sharing and coding

A model compression method using pruning and weight sharing, applied in the field of model compression. It addresses insufficient model compression, the absence of weight sharing, and the poor ability to compute artificial intelligence models, and it reduces computation and storage overhead, improves computing capability, and reduces energy consumption.

Status: Inactive. Publication date: 2022-04-15
SHENZHEN HONGDIAN TECH CORP

AI Technical Summary

Problems solved by technology

[0007] The purpose of the present invention is to provide a model compression method and processing device based on pruning, weight sharing and encoding, so as to solve the technical problems in the prior art that model compression is insufficient and that weights are neither shared nor encoded, which leaves mobile devices with poor capability for computing artificial intelligence models in real time.

Examples

Embodiment 1

[0027] As shown in Figure 1, the present invention provides a model compression method based on pruning, weight sharing and encoding, which includes the following steps. S100: Through normal training, a model composed of original network weights and neurons is obtained; the network is trained with existing neural network techniques, yielding the initial model before compression. S200: According to a set threshold (the specific threshold can be chosen according to the distribution of the model's original network weights or the computing power of the mobile terminal, for example an original network weight of 0.05), the connection network between neurons in the model is pruned; that is, unnecessary original network weights and their corresponding neurons are removed, so that only the weight parameters important to the network are retained. Specifically, the original network weights and corresponding neurons that are lower...
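The threshold pruning of step S200 could be sketched roughly as follows. This is an illustrative sketch only, not the patent's implementation: the function names `prune_weights` and `live_neurons` are hypothetical, weights are compared by absolute value (an assumption, since the text is truncated before it states the comparison), and a neuron is treated as removable when all of its connections have been pruned.

```python
from typing import List


def prune_weights(weights: List[List[float]], threshold: float = 0.05) -> List[List[float]]:
    """Zero every connection whose absolute weight is below the threshold.

    Each row holds the weights of one neuron's connections.
    Comparing by magnitude is an assumption made for this sketch.
    """
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def live_neurons(pruned: List[List[float]]) -> List[int]:
    """Return the indices of neurons that keep at least one non-zero weight."""
    return [i for i, row in enumerate(pruned) if any(w != 0.0 for w in row)]


# Toy weight matrix: three neurons with three connections each.
weights = [
    [0.30, -0.02, 0.07],
    [0.01, 0.04, -0.03],
    [-0.50, 0.049, 0.05],
]
pruned = prune_weights(weights, threshold=0.05)
kept = live_neurons(pruned)
```

With the example threshold of 0.05, the second neuron loses all of its connections and drops out, while the other two keep only their larger weights.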

Embodiment 2

[0038] A processing device comprising: one or more processors; and a memory for storing one or more computer programs, the one or more processors being configured to execute the one or more computer programs stored in the memory, so that the one or more processors carry out the model compression method based on pruning, weight sharing and encoding of the present invention. Through pruning, weight sharing, weight smoothing and encoding operations, the processing device reduces the computational load and storage overhead of the artificial intelligence model, is widely applicable, and also reduces energy consumption.

[0039] Those skilled in the art will understand that all or part of the features/steps of the above method embodiments can be realized as methods, data processing systems or computer programs, and that these features can be implemented entirely in software, without hardware, or by a combination of hardware and software. The foregoing computer program may be stored in one...

Abstract

The invention discloses a model compression method and processing equipment based on pruning, weight sharing and coding. It relates to the technical field of model compression and solves the technical problems that model compression is insufficient and that mobile equipment has poor capability for computing an artificial intelligence model in real time. The method comprises the following steps: S100, obtaining an initial model composed of original network weights and neurons through normal training; S200, pruning the connection network between neurons of the model according to a set threshold to obtain a result model; S300, sharing weights across the connection network between neurons of the result model through clustering to obtain shared network weights and corresponding neurons; S400, smoothing the shared network weights obtained from weight sharing; and S500, encoding the smoothed shared network weights with variable-length bits and storing the index values of the shared network weights to obtain a compressed model. The method compresses the model fully and improves the capability of mobile equipment to compute the artificial intelligence model in real time.
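Steps S300 and S500 might look roughly like the sketch below. The abstract names neither the clustering algorithm nor the variable-length code, so one-dimensional k-means and Huffman coding are assumptions here. The function names `share_weights` and `huffman_code` are hypothetical, and the smoothing of S400 is left out because the abstract does not define it.

```python
import heapq
from collections import Counter
from typing import Dict, List, Tuple


def share_weights(weights: List[float], k: int, iters: int = 20) -> Tuple[List[float], List[int]]:
    """Cluster weights into k shared values with 1-D k-means (assumed algorithm, k >= 2).

    Returns the codebook of shared weights and, for every input weight,
    the index of the codebook entry that replaces it.
    """
    lo, hi = min(weights), max(weights)
    # Spread the initial centroids evenly over the weight range (an assumption).
    codebook = [lo + (hi - lo) * i / (k - 1) for i in range(k)]

    def assign() -> List[int]:
        return [min(range(k), key=lambda c: abs(w - codebook[c])) for w in weights]

    for _ in range(iters):
        indices = assign()
        for c in range(k):
            members = [w for w, i in zip(weights, indices) if i == c]
            if members:
                codebook[c] = sum(members) / len(members)
    return codebook, assign()


def huffman_code(symbols: List[int]) -> Dict[int, str]:
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, unique tie-breaker, {symbol: code built so far}).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n0, _, c0 = heapq.heappop(heap)
        n1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (n0 + n1, tie, merged))
        tie += 1
    return heap[0][2]


# Toy example: eight surviving weights shared across three cluster centres.
weights = [0.30, 0.31, 0.29, -0.50, -0.52, 0.07, 0.30, 0.28]
codebook, indices = share_weights(weights, k=3)
code = huffman_code(indices)
bitstream = "".join(code[i] for i in indices)
```

In this toy example the eight index values take 11 bits in total, against 16 bits for a fixed 2-bit index, because the most frequent cluster receives a 1-bit code. Only the small codebook and the encoded indices would need to be stored.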

Description

Technical field

[0001] The invention relates to the technical field of model compression, in particular to a model compression method and processing equipment based on pruning, weight sharing and encoding.

Background

[0002] For a long time, machine learning researchers have been committed to developing deeper and larger models to achieve higher precision and accuracy, but this has also led to models with a large number of parameters, high storage requirements, and complex computation. Usually, such a model can run only on GPUs and servers, and the amount of computation is unrealistic for edge terminals such as mobile phones and smart devices. With the precision and accuracy of the network model preserved, model compression has become an important direction in machine learning.

[0003] By using model compression algorithm technology, the corresponding amount of computation can be completed on the edge terminal. The model compr...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/08, G06N3/06, G06K9/62
CPC: G06N3/082, G06N3/061, G06F18/23
Inventors: 张小虎, 左绍舟, 龚潇, 刘文强, 丁靖
Owner: SHENZHEN HONGDIAN TECH CORP