A neural network weight compression method based on non-uniform quantization and its application method
A neural-network weight compression technology based on non-uniform quantization. It addresses the problem that a single quantization scheme cannot represent all connection weights well at the same time, thereby reducing storage capacity, resolving this contradiction, and preserving system performance.
Embodiment Construction
[0029] The present invention will be described in detail below in conjunction with the accompanying drawings and embodiments.
[0030] As shown in Figure 2, the present invention provides a method for compressing neural network weights based on non-uniform quantization. The method groups the connection weights obtained after training the neural network, normalizes them by the maximum value, and then compresses and encodes them. The specific process is as follows:
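The overall pipeline (group, normalize by the maximum value, quantize non-uniformly) can be sketched as follows. The patent does not specify the codebook, so the power-of-two spacing below is an illustrative choice, not the claimed scheme; all function and variable names are assumptions.

```python
import numpy as np

def normalize_and_quantize(weights, bits=4):
    """Hedged sketch: normalize weights by the maximum magnitude, then
    apply a non-uniform (here: logarithmically spaced) quantizer whose
    levels are packed more densely near zero."""
    w = np.asarray(weights, dtype=float)
    max_abs = np.abs(w).max()
    w_norm = w / max_abs                    # maximum-value normalization -> [-1, 1]
    levels = 2 ** (bits - 1)
    # Illustrative non-uniform codebook: 0 plus powers of two up to 1
    codebook = np.concatenate(([0.0], 2.0 ** -np.arange(levels - 2, -1, -1)))
    signs = np.sign(w_norm)
    # Map each magnitude to the index of the nearest codebook level
    idx = np.abs(codebook[None, :] - np.abs(w_norm)[:, None]).argmin(axis=1)
    w_q = signs * codebook[idx] * max_abs   # de-normalized reconstruction
    return w_q, idx, max_abs
```

Storing only the small level index per weight (plus the shared scale `max_abs`) rather than a full-precision value is what reduces the stored capacity.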
[0031] 1) Because branch-pruning operations are usually performed while the connection weights are trained, the weight distribution of the trained neural network presents a bimodal (double-hump) distribution, as shown in Figure 3. Therefore, the connection weights are grouped according to the data probability:
[0032] 1.1) The weights are divided at 0 into two groups: group 0 and group 1.
[0033] 1.2) Add the offset value C0 to the weights in group 0, where the offset value is the mean value of group 0, so that the m...
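Steps 1.1 and 1.2 can be sketched as below. The excerpt is truncated, so this assumes the offset C0 is applied so that group 0 ends up centered at zero (i.e., the offset added equals minus the group mean); names are illustrative.

```python
import numpy as np

def group_and_center(weights):
    """Sketch of steps 1.1-1.2: split the weights at 0, then re-center
    group 0 using its mean as the offset value C0 (assumed interpretation
    of the truncated text)."""
    w = np.asarray(weights, dtype=float)
    # 1.1) Divide the trained connection weights at 0 into two groups
    group0 = w[w < 0]        # left hump of the bimodal distribution
    group1 = w[w >= 0]       # right hump
    # 1.2) C0 is the mean value of group 0; removing it moves that
    # group's center to zero, narrowing the range the quantizer covers
    c0 = group0.mean()
    group0_centered = group0 - c0
    return group0_centered, group1, c0
```

Re-centering each hump of the bimodal distribution reduces the dynamic range each group's quantizer must cover, which serves the stated goals of reducing capacity while preserving system performance.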