Binary neural network compression method based on weight sensitivity

A binary neural network and hardware compression technology, applied in instruments, adaptive control, control/regulation systems, etc. It addresses problems such as high hardware resource overhead and power consumption, corrupted weights, and reduced recognition accuracy, and has the effect of preserving recognition accuracy.

Active Publication Date: 2019-01-15
Zhou Jun

AI Technical Summary

Problems solved by technology

Although hardware architecture optimization and neural network compression can save storage and power consumption to a certain extent, neither is as simple to implement as a binary neural network.
[0006] Second, the recognition accuracy of binary neural networks is low. Among the various binary neural networks, for classification tasks, BinaryConnect and BNN perform well only on smaller data sets such as the handwritten digit set MNIST, the common object recognition data set CIFAR, and the real-world street house number data set SVHN; when a very large data set such as ImageNet is used instead, the recognition accuracy of BinaryConnect and BNN drops severely.
[0007] Third, traditional compression methods use storage devices such as 6T SRAM, which incur relatively large hardware resource overhead and power consumption and thus limit the scale of the neural network that a chip can implement. Although the binary neural network performs well on traditional hardware, its fault tolerance is not fully exploited.
However, near-threshold / sub-threshold voltage technologies still face challenges such as uncertainty and variability. At low supply voltages, circuits are susceptible to disturbances that can corrupt the weights stored in traditional memory devices. If the entire traditional memory device adopts near-threshold / sub-threshold voltage technology, the recognition accuracy of the neural network is severely degraded.



Examples


Embodiment 1

[0069] As shown in Figures 1 to 5, this embodiment provides a binary neural network hardware compression method based on weight sensitivity. It should be noted that serial-number terms such as "first", "second", and "third" in this embodiment are only used to distinguish parts of the same kind. The method includes the following steps:

[0070] In the first step, binary neural network training is used to obtain the weight matrix and raw accuracy.

[0071] In the second step, the sensitivity of each weight matrix is evaluated as follows:

[0072] (21) Preset an error probability P to model the unreliability of new memory devices and near-threshold / sub-threshold voltages, where P is a number greater than 0 and less than 1; that is, each weight in the weight matrix is flipped (1→-1 or -1→1) with probability P.

[0073] (22) Errors are injected into the binary neural network weights of each weight matrix in sequence with the error probability P to obtain the ...
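The error model in steps (21) and (22) amounts to flipping each stored ±1 weight independently with probability P. A minimal sketch of that corruption step is given below, assuming the weights are held in NumPy arrays of +1/-1 values; the function name `inject_bit_flips` and the array shapes are illustrative, not part of the patent.

```python
import numpy as np

def inject_bit_flips(weights, p, rng=None):
    """Flip each binary weight (+1 <-> -1) independently with probability p.

    `weights` is assumed to be a NumPy array of +1/-1 entries, as produced
    by binary neural network training.
    """
    rng = np.random.default_rng() if rng is None else rng
    flip_mask = rng.random(weights.shape) < p          # True where an error occurs
    return np.where(flip_mask, -weights, weights)      # 1 -> -1, -1 -> 1

# Example: corrupt one weight matrix with error probability P = 0.05
W = np.sign(np.random.randn(128, 64))                  # stand-in for a trained binary layer
W_noisy = inject_bit_flips(W, p=0.05)
```

Repeating this corruption for each weight matrix in turn and re-measuring the recognition accuracy yields the per-matrix sensitivities used in the later steps.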

Embodiment 2

[0088] As shown in Figure 6, this embodiment provides a binary neural network hardware compression method based on weight sensitivity, which combines sensitivity analysis with binary particle swarm optimization to search for the combination of weight matrices to which the recognition accuracy of the binary neural network is least sensitive. In the binary particle swarm optimization algorithm, a community of M particles searches for the optimal value in a D-dimensional target space: particle positions are updated according to a velocity update formula, the quality of each solution is evaluated with a fitness function, and the optimal value is found through iterative updates. It should be noted that serial-number terms such as "fourth" and "fifth" in this embodiment are only used to distinguish similar components. The binary neural network hardware compression method includes the following steps:

[0089] The binary neural ...
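To make the search described above more concrete, the sketch below implements a generic binary particle swarm optimization loop: M particles move in a D-dimensional binary space (one bit per weight matrix), velocities are updated with the usual inertia, cognitive, and social terms, bits are re-sampled through a sigmoid transfer function, and a caller-supplied fitness function scores each candidate combination. The constants `w`, `c1`, `c2` and the `fitness` interface are illustrative assumptions; the patent's exact update formula and fitness definition are not reproduced in this excerpt.

```python
import numpy as np

def binary_pso(fitness, D, M=20, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic binary PSO: maximizes fitness(bits) over {0,1}^D."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(M, D))          # particle positions (bit vectors)
    V = rng.uniform(-1, 1, size=(M, D))          # particle velocities
    pbest = X.copy()                              # personal best positions
    pbest_val = np.array([fitness(x) for x in X])
    gbest = pbest[np.argmax(pbest_val)].copy()    # global best position
    gbest_val = pbest_val.max()

    for _ in range(iters):
        r1, r2 = rng.random((M, D)), rng.random((M, D))
        V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)   # velocity update
        prob = 1.0 / (1.0 + np.exp(-V))                             # sigmoid transfer
        X = (rng.random((M, D)) < prob).astype(int)                 # re-sample bits
        vals = np.array([fitness(x) for x in X])
        better = vals > pbest_val
        pbest[better], pbest_val[better] = X[better], vals[better]
        if vals.max() > gbest_val:
            gbest, gbest_val = X[vals.argmax()].copy(), vals.max()
    return gbest, gbest_val

# Toy usage: prefer bit vectors with many 1s (many matrices marked non-sensitive)
best_bits, best_score = binary_pso(lambda bits: bits.sum(), D=10)
```

In this setting, `fitness` would reward bit vectors whose selected (non-sensitive) weight matrices can be exposed to errors or moved to low-cost storage while keeping the accuracy loss within the preset limit.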

Embodiment 3

[0108] As shown in Figure 7, this embodiment provides a binary neural network hardware compression method based on weight sensitivity. Serial-number terms such as "first", "second", and "third" in this embodiment are only used to distinguish similar parts. Specifically, the method includes the following steps:

[0109] In the first step, binary neural network training is used to obtain the weight matrix and raw accuracy.

[0110] In the second step, the sensitivity of each weight matrix is evaluated as follows:

[0111] (21) Preset an error probability P to model the unreliability of new memory devices and near-threshold / sub-threshold voltages, where P is a number greater than 0 and less than 1; that is, each weight in the weight matrix is flipped (1→-1 or -1→1) with probability P.

[0112] (22) Errors are injected into the binary neural network weights of each weight matrix in sequence with the error probability P to obtain the first accuracy of the bin...
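Steps (21) and (22) of this embodiment can be read as a per-matrix sensitivity scan: corrupt one weight matrix at a time with error probability P, re-evaluate the network, and record the drop from the raw accuracy. The sketch below assumes a caller-supplied `evaluate(weight_matrices)` routine that returns recognition accuracy; that routine, the averaging over repeated trials, and the NumPy representation are illustrative assumptions.

```python
import numpy as np

def matrix_sensitivities(weight_matrices, evaluate, p, trials=10, seed=0):
    """Sensitivity of each weight matrix = raw accuracy minus the accuracy measured
    with that single matrix corrupted at error probability p (mean over `trials` runs)."""
    rng = np.random.default_rng(seed)
    raw_accuracy = evaluate(weight_matrices)
    sensitivities = []
    for i, W in enumerate(weight_matrices):
        drops = []
        for _ in range(trials):
            flip = rng.random(W.shape) < p
            corrupted = list(weight_matrices)
            corrupted[i] = np.where(flip, -W, W)      # 1 -> -1, -1 -> 1
            drops.append(raw_accuracy - evaluate(corrupted))
        sensitivities.append(float(np.mean(drops)))
    return raw_accuracy, sensitivities
```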



Abstract

The invention discloses a binary neural network compression method based on weight sensitivity. The method comprises the following steps: training a binary neural network to acquire weight matrices and the original accuracy; assessing the sensitivity of each weight matrix; presetting a sensitivity threshold and dividing the weight matrices into a sensitive set and a non-sensitive set; assessing the sensitivity of the non-sensitive set; adjusting the sensitivity threshold to acquire the best non-sensitive set of weight matrices, whose sensitivity is equal to a preset maximum accuracy loss value; and storing the best non-sensitive set in a novel storage device or in a traditional storage device using near-threshold / sub-threshold voltage technology. The method has the advantages of low power consumption, high recognition rate, good universality, and low cost, and has broad market prospects in the field of hardware compression technology.
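Read procedurally, the abstract describes a threshold sweep: rank the weight matrices by sensitivity, grow the non-sensitive set from the least sensitive matrix upward, and stop when the joint accuracy loss of that set reaches the preset maximum. A minimal sketch of such a sweep is shown below; the `loss_of_set` evaluator and the greedy ordering are assumptions standing in for the patent's full procedure.

```python
def best_non_sensitive_set(sensitivities, loss_of_set, max_loss):
    """Raise the sensitivity threshold over the observed sensitivity values and keep
    the largest non-sensitive set whose joint accuracy loss stays within max_loss."""
    order = sorted(range(len(sensitivities)), key=lambda i: sensitivities[i])
    best_set = []
    for k in range(1, len(order) + 1):
        candidate = order[:k]                    # matrices with the k smallest sensitivities
        if loss_of_set(candidate) > max_loss:    # joint corruption now costs too much accuracy
            break
        best_set = candidate
    return best_set
```

The matrices in the returned set are the ones the abstract describes as candidates for the novel storage device or for traditional storage operated at near-threshold / sub-threshold voltage.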

Description

Technical field

[0001] The invention relates to the technical field of hardware compression, and in particular to a binary neural network hardware compression method based on weight sensitivity.

Background technique

[0002] At present, in order to reduce the resource overhead and power consumption required for hardware implementation of neural networks, the mainstream approaches include hardware architecture optimization, neural network compression, and binary neural networks. Hardware architecture optimization designs more efficient ways of implementing neural networks at the hardware level, reducing the memory resources occupied by data and the redundancy in memory access and computation, so as to reduce resource overhead and power consumption. Neural network compression reduces the number of weights and the number of quantization bits in the neural network to compress the network model, ...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G05B13/02
CPC: G05B13/024
Inventor: Zhou Jun, Wang Yin
Owner: Zhou Jun