Method and system for a quantized neural network capable of switching bit widths online

Neural network bit-width technology, applied in neural learning methods, biological neural network models, neural architectures, etc., can solve problems such as the inability to change bit widths flexibly.

Pending Publication Date: 2020-12-18

SHANGHAI JIAO TONG UNIV



Problems solved by technology

The prior art can only quantize the neural network to 8 bits and cannot flexibly change the bit width while the network is running.



Examples


Embodiment 1

[0073] A method according to the present invention for a quantized neural network capable of switching bit widths online, comprising:

[0074] Step M1: Integrate deep neural networks with different bit widths into a supernetwork, in which the networks of all bit widths share the same network architecture;

[0075] Step M2: The supernetwork operates at different bit widths. For any bit width, the corresponding intermediate-layer features are obtained, and each intermediate-layer feature is processed by the batch normalization layer corresponding to that bit width;

[0076] Step M3: Train the supernetwork through supervised learning, simulating quantization noise in the training stage, until the consistency loss function between the low-bit mode and the high-bit mode converges, to obtain the trained supernetwork;

[0077] Step M4: Use the preset quantizer to extract the quantized neural network of the target bit width from the trained supernetwork for low-bit inference;
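Steps M1, M2 and M4 can be illustrated with a minimal sketch in plain Python. It is not the patent's implementation: the class name `SwitchableLinear`, the helper `quantize_uniform`, the symmetric per-row uniform quantizer, and the choice of a single linear layer are all assumptions made for illustration. The sketch shows one set of shared full-precision weights, one set of batch normalization statistics per bit width, and a `switch` call that changes the bit width without retraining.

```python
import math


def quantize_uniform(values, bits):
    """Symmetric uniform quantizer (assumed stand-in for the 'preset quantizer')."""
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits")
    levels = 2 ** (bits - 1) - 1              # e.g. 7 positive levels at 4 bits
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / levels
    return [round(v / scale) * scale for v in values]


class SwitchableLinear:
    """Hypothetical layer: shared weights, separate BN statistics per bit width."""

    def __init__(self, weights, bit_widths, eps=1e-5):
        self.weights = weights                # shared by every bit width (Step M1)
        self.eps = eps
        n_out = len(weights)
        # One set of batch-norm statistics per bit width (Step M2).
        self.bn_stats = {
            b: {"mean": [0.0] * n_out, "var": [1.0] * n_out} for b in bit_widths
        }
        self.active_bits = max(bit_widths)

    def switch(self, bits):
        """Select the bit width used by subsequent forward passes."""
        if bits not in self.bn_stats:
            raise ValueError(f"unsupported bit width: {bits}")
        self.active_bits = bits

    def forward(self, batch, training=False):
        # Quantize the shared weights on the fly for the active bit width (Step M4).
        w_q = [quantize_uniform(row, self.active_bits) for row in self.weights]
        pre = [[sum(w * x for w, x in zip(row, sample)) for row in w_q]
               for sample in batch]
        stats = self.bn_stats[self.active_bits]
        if training:
            # Update only the statistics that belong to the active bit width.
            for c in range(len(w_q)):
                column = [p[c] for p in pre]
                mean = sum(column) / len(column)
                stats["mean"][c] = mean
                stats["var"][c] = sum((v - mean) ** 2 for v in column) / len(column)
        return [[(p[c] - stats["mean"][c]) / math.sqrt(stats["var"][c] + self.eps)
                 for c in range(len(w_q))] for p in pre]


layer = SwitchableLinear([[0.5, -1.0], [0.25, 0.75]], bit_widths=(2, 4, 8))
batch = [[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]]
for bits in (2, 8):
    layer.switch(bits)
    layer.forward(batch, training=True)       # calibrates that bit width's BN only
```

Keeping the statistics in a dictionary keyed by bit width means that calibrating one bit width never disturbs another, which is the property Step M2 relies on.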

[007...

Embodiment 2

[0132] Embodiment 2 is a modification of Embodiment 1.

[0133] In view of the defects in the prior art, the object of the present invention is to provide a quantized neural network capable of switching bit widths online, so that the deep neural network can be deployed with different bit widths without any additional training.

[0134] Figure 1 shows a flow chart of the quantized neural network that can switch the bit width online according to the present invention. The method builds a supernetwork that integrates neural networks of different bit widths into the same network structure; for each bit width, features are processed with a separate batch normalization layer to ensure network convergence. A consistency loss function constrains the consistency between the low-bit mode and the high-bit mode in the training phase to reduce the error caused by quantization. The supernetwork is optimized by quantization-aware training, and fast inference a...
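The text available here does not give the form of the consistency loss function. As a hedged illustration only, the sketch below assumes a Kullback-Leibler divergence between the output distributions of the high-bit mode and the low-bit mode, added to the supervised task loss. The function names, the `weight` parameter, and the choice of KL divergence are assumptions, not details taken from the patent.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_total = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_total for z in logits]


def consistency_loss(low_bit_logits, high_bit_logits):
    """KL(p_high || p_low): penalizes the low-bit mode for drifting from the high-bit mode."""
    log_p_high = log_softmax(high_bit_logits)
    log_p_low = log_softmax(low_bit_logits)
    return sum(math.exp(lh) * (lh - ll) for lh, ll in zip(log_p_high, log_p_low))


def total_loss(task_loss, low_bit_logits, high_bit_logits, weight=1.0):
    """Supervised task loss plus the weighted consistency term (weight is hypothetical)."""
    return task_loss + weight * consistency_loss(low_bit_logits, high_bit_logits)


high = [1.0, 2.0, 3.0]        # logits from the high-bit mode
low = [0.2, 1.5, 2.1]         # logits from the low-bit mode on the same input
penalty = consistency_loss(low, high)
```

The term is zero when both modes produce the same distribution and grows as the low-bit output diverges, so minimizing it pulls the low-bit mode toward the high-bit mode during training.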


Abstract

The invention provides a method and a system for a quantized neural network capable of switching bit widths online. The method comprises: integrating deep neural networks with different bit widths into a supernetwork, wherein all the networks with different bit widths share the same network architecture; operating the supernetwork at different bit widths and, for any bit width, obtaining the corresponding intermediate-layer features and processing each of them with a corresponding batch normalization layer; training the supernetwork through supervised learning while simulating quantization noise in the training stage, until a consistency loss function between a low-bit mode and a high-bit mode converges, so as to obtain a trained supernetwork; and extracting the quantized neural network of the target bit width from the trained supernetwork with a preset quantizer to perform low-bit inference. The bit width can be switched at will, without retraining the neural network, to adapt to different hardware deployment environments.

Description

Technical field

[0001] The invention relates to the fields of computer vision and image processing, in particular to a method and system for a quantized neural network capable of switching bit widths online.

Background technique

[0002] With the increasing complexity of deep neural networks (DNNs), there are often great challenges in deploying deep neural networks. Therefore, model compression and model acceleration have received increasing attention in machine learning. An important research direction is quantization of deep neural networks, which quantizes model weights and intermediate layer activations to smaller bit widths. Due to the reduced bit width, quantized deep neural networks have smaller model sizes and can leverage efficient fixed-point computations for fast inference. However, when the bit width is reduced to 4 bits or even smaller, there will be a significant loss of precision. To alleviate this problem, quantization-aware training is commonly employed t...
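The trade-off described in the background, between fixed-point efficiency and precision at low bit widths, can be seen in a small sketch. It assumes a symmetric quantization scheme with one shared scale per vector; the function names and the sample values are hypothetical and chosen only for illustration.

```python
def to_fixed_point(values, bits):
    """Map floats to signed integers plus a shared scale (assumed symmetric scheme)."""
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / levels
    return [round(v / scale) for v in values], scale


def fixed_point_dot(weights, activations, bits):
    """Dot product accumulated in integers, rescaled to a float once at the end."""
    w_int, w_scale = to_fixed_point(weights, bits)
    a_int, a_scale = to_fixed_point(activations, bits)
    acc = sum(w * a for w, a in zip(w_int, a_int))    # integer-only accumulation
    return acc * w_scale * a_scale


weights = [0.31, -0.72, 0.05, 0.48]
activations = [1.20, 0.40, -0.90, 0.75]
exact = sum(w * a for w, a in zip(weights, activations))

# Absolute error of the fixed-point result at each bit width.
errors = {bits: abs(fixed_point_dot(weights, activations, bits) - exact)
          for bits in (8, 4, 2)}
```

For these sample values the error grows as the bit width shrinks from 8 to 4 to 2.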


Application Information


IPC(8): G06N3/04, G06N3/08

CPC: G06N3/08, G06N3/045

Inventor: Zhang Ya (张娅), Du Kunyuan (杜昆原), Wang Yanfeng (王延峰)

Owner: SHANGHAI JIAO TONG UNIV
