
Method and system for a quantized neural network with online-switchable bit width

A neural-network bit-width quantization technology, applied to neural learning methods, biological neural network models, neural architectures, etc., addressing problems such as the inability to change the bit width during network operation

Pending Publication Date: 2020-12-18
SHANGHAI JIAO TONG UNIV

AI Technical Summary

Problems solved by technology

Existing methods can quantize the neural network only to 8 bits and cannot flexibly change the bit width during network operation



Examples


Embodiment 1

[0073] A method according to the present invention for a quantized neural network whose bit width can be switched online, comprising the following steps (minimal illustrative sketches of the steps appear after the list):

[0074] Step M1: Integrate deep neural networks with different bit widths into a supernetwork, where the networks of all bit widths share the same network architecture;

[0075] Step M2: Operate the supernetwork at different bit widths; for any bit width, obtain the corresponding network intermediate-layer features and process each intermediate-layer feature with its own batch normalization layer;

[0076] Step M3: Train the supernetwork through supervised learning, simulating quantization noise during the supernetwork training stage, until the consistency loss function between the low-bit mode and the high-bit mode converges, obtaining the trained supernetwork;

[0077] Step M4: Use the preset quantizer to extract the quantized neural network of the target bit width from the trained supernetwork for low-bit inference;

[007...
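The excerpt contains no code, so the following is a minimal sketch of Step M1's weight sharing, assuming a PyTorch implementation with a symmetric per-tensor quantizer; the class and function names are illustrative, not the patent's:

```python
import torch
import torch.nn as nn

def fake_quantize(x, num_bits):
    """Symmetric uniform fake quantization: quantize, then dequantize,
    so the tensor stays floating point but carries quantization error.
    A straight-through estimator keeps the operation differentiable."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.detach().abs().max().clamp(min=1e-8) / qmax
    q = torch.round(x / scale).clamp(-qmax, qmax) * scale
    return x + (q - x).detach()  # forward: q; backward: identity

class SwitchableQuantConv2d(nn.Conv2d):
    """One set of shared full-precision weights serves every bit width;
    the active bit width is chosen at run time."""
    def __init__(self, *args, bit_widths=(2, 4, 8), **kwargs):
        super().__init__(*args, **kwargs)
        self.bit_widths = tuple(bit_widths)
        self.active_bits = max(bit_widths)  # default to highest precision

    def forward(self, x):
        w_q = fake_quantize(self.weight, self.active_bits)
        return self._conv_forward(x, w_q, self.bias)
```

Because every bit width reuses the same weight tensor, switching `active_bits` changes the precision without storing a separate model per bit width.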
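Step M2's per-bit-width batch normalization could then be sketched as follows, again as an assumption rather than the patented design; keeping a separate BatchNorm2d per bit width accounts for the differing feature statistics each quantization level produces:

```python
import torch.nn as nn

class SwitchableBatchNorm2d(nn.Module):
    """One BatchNorm2d per candidate bit width; only the one matching
    the currently active bit width processes the features."""
    def __init__(self, num_features, bit_widths=(2, 4, 8)):
        super().__init__()
        # ModuleDict keys must be strings
        self.bns = nn.ModuleDict(
            {str(b): nn.BatchNorm2d(num_features) for b in bit_widths}
        )
        self.active_bits = max(bit_widths)

    def forward(self, x):
        return self.bns[str(self.active_bits)](x)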
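For Step M3, the patent names a consistency loss between the low-bit and high-bit modes but this excerpt does not define it; the sketch below assumes a KL-divergence term against the detached high-bit output, and a hypothetical `model.set_bits` helper that sets `active_bits` on every switchable module:

```python
import torch.nn.functional as F

def supernet_loss(model, images, labels, bit_widths=(2, 4, 8)):
    """Supervised loss at the highest bit width, plus a consistency
    term pulling each low-bit output toward the high-bit output."""
    high = max(bit_widths)
    model.set_bits(high)  # assumed helper, not the patent's API
    logits_high = model(images)
    loss = F.cross_entropy(logits_high, labels)
    target = F.softmax(logits_high.detach(), dim=1)  # high-bit mode as teacher
    for bits in sorted(b for b in bit_widths if b != high):
        model.set_bits(bits)
        logits_low = model(images)
        loss = loss + F.kl_div(
            F.log_softmax(logits_low, dim=1), target, reduction="batchmean"
        )
    return loss
```

The quantization noise of Step M3 is already simulated inside the forward pass by `fake_quantize` above, so this loss only has to tie the bit-width modes together.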
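Finally, a sketch of Step M4's extraction, assuming the same symmetric quantizer and that deployment consumes integer weights plus per-tensor scales; the traversal and output format are illustrative:

```python
import torch

@torch.no_grad()
def extract_quantized_weights(model, target_bits):
    """Materialize integer weights and per-tensor scales from the
    trained shared weights for one fixed deployment bit width."""
    qmax = 2 ** (target_bits - 1) - 1
    extracted = {}
    for name, module in model.named_modules():
        if isinstance(module, (torch.nn.Conv2d, torch.nn.Linear)):
            w = module.weight
            scale = w.abs().max().clamp(min=1e-8) / qmax
            # int8 storage covers any target bit width up to 8 bits
            w_int = torch.round(w / scale).clamp(-qmax, qmax).to(torch.int8)
            extracted[name] = {"weight_int": w_int, "scale": scale.item()}
    return extracted
```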

Embodiment 2

[0132] Embodiment 2 is a modification of Embodiment 1.

[0133] In view of the defects in the prior art, the object of the present invention is to provide a quantized neural network capable of switching bit widths online, so that a deep neural network can be deployed at different bit widths without any additional training.

[0134] As shown in Figure 1, a flow chart of the quantized neural network with online-switchable bit width of the present invention: the method builds a supernetwork that integrates the neural networks of different bit widths into the same network structure, and for each bit width the features are processed with a separate batch normalization layer to ensure network convergence. Through the consistency loss function step, the consistency between the low-bit mode and the high-bit mode is constrained in the training phase to reduce the error caused by quantization. The supernetwork is optimized by quantization-aware training, and fast inference a...
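Tying these stages together, a quantization-aware training loop for the supernetwork might look like the following sketch, reusing the `supernet_loss` sketch from Embodiment 1; the optimizer and hyperparameters are illustrative assumptions:

```python
import torch

def train_supernet(model, loader, bit_widths=(2, 4, 8), epochs=10, lr=0.01):
    """Optimize the shared weights so that every bit-width mode of the
    supernetwork converges under the consistency-loss training scheme."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = supernet_loss(model, images, labels, bit_widths)
            loss.backward()
            optimizer.step()
    return model
```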



Abstract

The invention provides a method and a system for a quantized neural network with online-switchable bit width, and the method comprises the steps of: integrating deep neural networks with different bit widths into a supernetwork, wherein the networks of all bit widths share the same network architecture; operating the supernetwork at different bit widths, and for any bit width, obtaining the corresponding network intermediate-layer features and processing each intermediate-layer feature with a corresponding batch normalization layer; training the supernetwork through supervised learning, simulating quantization noise during the supernetwork training stage until the consistency loss function between the low-bit mode and the high-bit mode converges, so as to obtain a trained supernetwork; and extracting the quantized neural network of the target bit width from the trained supernetwork with a preset quantizer to perform low-bit inference. Without retraining the neural network, the bit width can be switched at will to adapt to different hardware deployment environments.

Description

Technical field

[0001] The invention relates to the fields of computer vision and image processing, and in particular to a method and system for a quantized neural network capable of switching bit widths online.

Background technique

[0002] With the increasing complexity of deep neural networks (DNNs), deploying them often poses great challenges, so model compression and model acceleration have received increasing attention in machine learning. An important research direction is quantization of deep neural networks, which quantizes model weights and intermediate-layer activations to smaller bit widths. Owing to the reduced bit width, quantized deep neural networks have smaller model sizes and can leverage efficient fixed-point computation for fast inference. However, when the bit width is reduced to 4 bits or even fewer, there is a significant loss of precision. To alleviate this problem, quantization-aware training is commonly employed t...
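For background (a standard formulation, not quoted from the patent), the symmetric uniform quantizer that such quantization-aware methods typically simulate maps a weight $w$ at bit width $b$ to

$$
\hat{w} = s \cdot \operatorname{clamp}\!\left(\operatorname{round}\!\left(\frac{w}{s}\right),\; -2^{b-1},\; 2^{b-1}-1\right),
\qquad s = \frac{\max_i |w_i|}{2^{b-1}-1},
$$

so reducing $b$ shrinks storage and enables fixed-point arithmetic but coarsens the representable values, which is why accuracy degrades sharply at 4 bits and below.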


Application Information

IPC(8): G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/045
Inventor: 张娅, 杜昆原, 王延峰
Owner: SHANGHAI JIAO TONG UNIV