Binary neural network compression method based on weight sensitivity

A binary neural network and hardware compression technology, applied in instruments, adaptive control, control/regulation systems, etc. It addresses problems such as high hardware resource overhead and power consumption, corrupted weights, and reduced recognition accuracy, and has the effect of preserving recognition accuracy.

Active Publication Date: 2019-01-15
Zhou Jun

AI Technical Summary

Problems solved by technology

Although hardware architecture optimization and neural network compression can save storage and power consumption to a certain extent, neither is as simple to implement as a binary neural network.
[0006] Second, the recognition accuracy of binary neural networks is low. Among the various binary neural networks, for classification tasks, BinaryConnect and BNN perform well only on smaller data sets such as the handwritten digit set MNIST, the common object recognition data set CIFAR, and the real-world street house number data set SVHN; when a very large data set such as ImageNet is used instead, the recognition accuracy of BinaryConnect and BNN drops severely.
[0007] Third, traditional compression methods use storage devices such as 6T SRAM, which incur relatively large hardware resource overhead and power consumption and thus limit the scale of the neural network that a chip can implement. Although the binary neural network performs well on traditional hardware, its fault tolerance is not fully exploited.
However, near-threshold / sub-threshold voltage technologies still face challenges such as uncertainty and variability. At low supply voltages, circuits are susceptible to disturbances that can corrupt the weights stored in traditional memory devices. If the entire traditional memory device adopts near-threshold / sub-threshold voltage technology, the recognition accuracy of the neural network is severely degraded.



Examples


Embodiment 1

[0069] As shown in Figures 1 to 5, this embodiment provides a binary neural network hardware compression method based on weight sensitivity. It should be noted that serial-number terms such as "first", "second", and "third" in this embodiment are only used to distinguish parts of the same kind. The method includes the following steps:

[0070] In the first step, binary neural network training is used to obtain the weight matrix and raw accuracy.

[0071] In the second step, the sensitivity of each weight matrix is evaluated as follows:

[0072] (21) Preset an error probability P to model the unreliability of new memory devices and near-threshold / sub-threshold voltages, where P is a number greater than 0 and less than 1; that is, each weight in the weight matrix is flipped (1→-1 or -1→1) with probability P.

[0073] (22) Errors are injected into the binary neural network weights of each weight matrix in sequence with the error probability P to obtain the ...
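The error model in steps (21) and (22) amounts to flipping each stored ±1 weight independently with probability P. A minimal sketch of that corruption step is given below, assuming the weights are held in NumPy arrays of +1/-1 values; the function name `inject_bit_flips` and the array shapes are illustrative, not part of the patent.

```python
import numpy as np

def inject_bit_flips(weights, p, rng=None):
    """Flip each binary weight (+1 <-> -1) independently with probability p.

    `weights` is assumed to be a NumPy array of +1/-1 entries, as produced
    by binary neural network training.
    """
    rng = np.random.default_rng() if rng is None else rng
    flip_mask = rng.random(weights.shape) < p          # True where an error occurs
    return np.where(flip_mask, -weights, weights)      # 1 -> -1, -1 -> 1

# Example: corrupt one weight matrix with error probability P = 0.05
W = np.sign(np.random.randn(128, 64))                  # stand-in for a trained binary layer
W_noisy = inject_bit_flips(W, p=0.05)
```

Repeating this corruption for each weight matrix in turn and re-measuring the recognition accuracy yields the per-matrix sensitivities used in the later steps.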

Embodiment 2

[0088] As shown in Figure 6, this embodiment provides a binary neural network hardware compression method based on weight sensitivity, which combines sensitivity analysis with binary particle swarm optimization to search for the combination of weight matrices to which the recognition accuracy of the binary neural network is least sensitive. In the binary particle swarm optimization algorithm, a community of M particles searches for the optimal value in a D-dimensional target space: particle positions are updated according to a velocity update formula, the quality of each solution is evaluated with a fitness function, and the optimal value is found through iterative updates. It should be noted that serial-number terms such as "fourth" and "fifth" in this embodiment are only used to distinguish similar components. The binary neural network hardware compression method includes the following steps:

[0089] The binary neural ...
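To make the search described above more concrete, the sketch below implements a generic binary particle swarm optimization loop: M particles move in a D-dimensional binary space (one bit per weight matrix), velocities are updated with the usual inertia, cognitive, and social terms, bits are re-sampled through a sigmoid transfer function, and a caller-supplied fitness function scores each candidate combination. The constants `w`, `c1`, `c2` and the `fitness` interface are illustrative assumptions; the patent's exact update formula and fitness definition are not reproduced in this excerpt.

```python
import numpy as np

def binary_pso(fitness, D, M=20, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic binary PSO: maximizes fitness(bits) over {0,1}^D."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(M, D))          # particle positions (bit vectors)
    V = rng.uniform(-1, 1, size=(M, D))          # particle velocities
    pbest = X.copy()                              # personal best positions
    pbest_val = np.array([fitness(x) for x in X])
    gbest = pbest[np.argmax(pbest_val)].copy()    # global best position
    gbest_val = pbest_val.max()

    for _ in range(iters):
        r1, r2 = rng.random((M, D)), rng.random((M, D))
        V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)   # velocity update
        prob = 1.0 / (1.0 + np.exp(-V))                             # sigmoid transfer
        X = (rng.random((M, D)) < prob).astype(int)                 # re-sample bits
        vals = np.array([fitness(x) for x in X])
        better = vals > pbest_val
        pbest[better], pbest_val[better] = X[better], vals[better]
        if vals.max() > gbest_val:
            gbest, gbest_val = X[vals.argmax()].copy(), vals.max()
    return gbest, gbest_val

# Toy usage: prefer bit vectors with many 1s (many matrices marked non-sensitive)
best_bits, best_score = binary_pso(lambda bits: bits.sum(), D=10)
```

In this setting, `fitness` would reward bit vectors whose selected (non-sensitive) weight matrices can be exposed to errors or moved to low-cost storage while keeping the accuracy loss within the preset limit.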

Embodiment 3

[0108] As shown in Figure 7, this embodiment provides a binary neural network hardware compression method based on weight sensitivity. Serial-number terms such as "first", "second", and "third" in this embodiment are only used to distinguish similar parts. Specifically, the method includes the following steps:

[0109] In the first step, binary neural network training is used to obtain the weight matrix and raw accuracy.

[0110] In the second step, the sensitivity of each weight matrix is evaluated as follows:

[0111] (21) Preset an error probability P to model the unreliability of new memory devices and near-threshold / sub-threshold voltages, where P is a number greater than 0 and less than 1; that is, each weight in the weight matrix is flipped (1→-1 or -1→1) with probability P.

[0112] (22) Errors are injected into the binary neural network weights of each weight matrix in sequence with the error probability P to obtain the first accuracy of the bin...
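Steps (21) and (22) of this embodiment can be read as a per-matrix sensitivity scan: corrupt one weight matrix at a time with error probability P, re-evaluate the network, and record the drop from the raw accuracy. The sketch below assumes a caller-supplied `evaluate(weight_matrices)` routine that returns recognition accuracy; that routine, the averaging over repeated trials, and the NumPy representation are illustrative assumptions.

```python
import numpy as np

def matrix_sensitivities(weight_matrices, evaluate, p, trials=10, seed=0):
    """Sensitivity of each weight matrix = raw accuracy minus the accuracy measured
    with that single matrix corrupted at error probability p (mean over `trials` runs)."""
    rng = np.random.default_rng(seed)
    raw_accuracy = evaluate(weight_matrices)
    sensitivities = []
    for i, W in enumerate(weight_matrices):
        drops = []
        for _ in range(trials):
            flip = rng.random(W.shape) < p
            corrupted = list(weight_matrices)
            corrupted[i] = np.where(flip, -W, W)      # 1 -> -1, -1 -> 1
            drops.append(raw_accuracy - evaluate(corrupted))
        sensitivities.append(float(np.mean(drops)))
    return raw_accuracy, sensitivities
```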



Abstract

The invention discloses a binary neural network compression method based on weight sensitivity. The method comprises the following steps: training a binary neural network to acquire weight matrices and the original accuracy; assessing the sensitivity of each weight matrix; presetting a sensitivity threshold and dividing the weight matrices into a sensitive set and a non-sensitive set; assessing the sensitivity of the non-sensitive set; adjusting the sensitivity threshold to acquire the best non-sensitive set of weight matrices, whose sensitivity is equal to a preset maximum accuracy loss value; and storing the best non-sensitive set in a novel storage device or in a traditional storage device using near-threshold / sub-threshold voltage technology. The method has the advantages of low power consumption, high recognition rate, good universality, and low cost, and has broad market prospects in the field of hardware compression technology.
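Read procedurally, the abstract describes a threshold sweep: rank the weight matrices by sensitivity, grow the non-sensitive set from the least sensitive matrix upward, and stop when the joint accuracy loss of that set reaches the preset maximum. A minimal sketch of such a sweep is shown below; the `loss_of_set` evaluator and the greedy ordering are assumptions standing in for the patent's full procedure.

```python
def best_non_sensitive_set(sensitivities, loss_of_set, max_loss):
    """Raise the sensitivity threshold over the observed sensitivity values and keep
    the largest non-sensitive set whose joint accuracy loss stays within max_loss."""
    order = sorted(range(len(sensitivities)), key=lambda i: sensitivities[i])
    best_set = []
    for k in range(1, len(order) + 1):
        candidate = order[:k]                    # matrices with the k smallest sensitivities
        if loss_of_set(candidate) > max_loss:    # joint corruption now costs too much accuracy
            break
        best_set = candidate
    return best_set
```

The matrices in the returned set are the ones the abstract describes as candidates for the novel storage device or for traditional storage operated at near-threshold / sub-threshold voltage.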

Description

Technical field

[0001] The invention relates to the technical field of hardware compression, and in particular to a binary neural network hardware compression method based on weight sensitivity.

Background technique

[0002] At present, in order to reduce the resource overhead and power consumption required for hardware implementation of neural networks, the mainstream approaches include hardware architecture optimization, neural network compression, and binary neural networks. Hardware architecture optimization designs more efficient ways of implementing neural networks at the hardware level, reducing the memory resources occupied by data and the redundancy in memory access and computation, so as to reduce resource overhead and power consumption. Neural network compression reduces the number of weights and the number of quantization bits in the neural network to compress the network model, ...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G05B13/02
CPC: G05B13/024
Inventor: Zhou Jun, Wang Yin
Owner: Zhou Jun