
Cursor-based adaptive quantization for deep neural networks

A neural network and cursor technology, applicable to biological neural network models, neural architectures, neural learning methods, and related fields.

Active Publication Date: 2021-05-25
BAIDU COM TIMES TECH (BEIJING) CO LTD +1

AI Technical Summary

Problems solved by technology

[0004] While existing quantization-based approaches (mainly using a fixed-bit scheme to represent the entire DNN model) yield encouraging compression ratios while maintaining model performance, a single fixed bit width may not be the best choice for the trade-off between model size and performance.
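For reference, the fixed-bit scheme criticized above can be sketched as uniform symmetric quantization applied with the same bit width to every layer. This is an illustrative sketch, not the patent's formulation; the function name and the max-abs scaling rule are assumptions.

```python
import numpy as np

def quantize_fixed_bit(w, bits=8):
    """Uniform symmetric quantization of a weight tensor to a fixed bit width.

    Illustrative sketch: every layer would use the same `bits`, which is
    exactly the one-size-fits-all trade-off the patent argues against.
    """
    levels = 2 ** (bits - 1) - 1            # e.g. 127 representable magnitudes for 8 bits
    scale = np.max(np.abs(w)) / levels      # map the largest weight onto the top level
    q = np.round(w / scale)                 # integer code for each weight
    return q * scale                        # de-quantized (reconstructed) weights
```

Lowering `bits` shrinks storage but raises reconstruction error uniformly across all layers, regardless of how sensitive each layer actually is.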




Embodiment Construction

[0016] In the following description, for purposes of explanation, specific details are set forth in order to provide an understanding of the present disclosure. It will be apparent, however, to one skilled in the art that the present disclosure may be practiced without these details. Furthermore, those skilled in the art will appreciate that the embodiments of the present disclosure described below can be implemented in various ways, for example as a process, an apparatus, a system, a device, a method, or instructions on a tangible computer-readable medium.

[0017] Components or modules shown in the figures are illustrative of embodiments of the disclosure and are intended to avoid obscuring the disclosure. It should also be understood that throughout this discussion, components may be described as separate functional units (which may include subunits), but those skilled in the art will recognize that various components, or portions thereof, may be divided into separate components, or may be integrated together.



Abstract

Deep neural network (DNN) model quantization may be used to reduce storage and computation burdens by decreasing the bit width. In a method for cursor-based adaptive quantization of a neural network, a multiple-bit quantization mechanism is formulated as a differentiable architecture search (DAS) process with a continuous cursor that represents a possible quantization bit width. The cursor-based DAS adaptively searches for a quantization bit width for each layer. The DAS process may be accelerated via an alternative approximate optimization process designed for the mixed quantization scheme of a DNN model. A new loss function is used in the search process to simultaneously optimize the accuracy and the parameter size of the model. In the quantization step, the two integers closest to the cursor may be adopted together as the bit widths to quantize the DNN, reducing quantization noise and avoiding the local convergence problem.
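The "two integers closest to the cursor" idea can be illustrated with a small sketch. The linear blend by the cursor's fractional part and the helper names below are my assumptions for illustration, not necessarily the patent's exact formulation:

```python
import numpy as np

def quantize(w, bits):
    """Uniform symmetric quantization at an integer bit width (helper)."""
    levels = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / levels
    return np.round(w / scale) * scale

def cursor_quantize(w, cursor):
    """Quantize with the two integer bit widths adjacent to a continuous cursor.

    Illustrative assumption: a cursor of 4.3 mixes the 4-bit and 5-bit
    quantizations of the layer with weights 0.7 and 0.3, so the cursor stays
    differentiable for the architecture-search step.
    """
    lo = int(np.floor(cursor))
    hi = lo + 1
    frac = cursor - lo                      # distance to the lower integer
    return (1 - frac) * quantize(w, lo) + frac * quantize(w, hi)
```

When the cursor sits exactly on an integer, the blend reduces to ordinary fixed-bit quantization at that bit width; in between, the layer's output varies smoothly with the cursor, which is what lets a gradient-based search move it.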

Description

Technical Field [0001] The present disclosure generally relates to systems and methods for computer learning that can provide improved computer performance, features, and uses. More particularly, the present disclosure relates to systems and methods for efficiently reducing the memory size of deep neural networks. Background [0002] Deep learning (DL) has achieved great success in various fields such as gaming, natural language processing, speech recognition, and computer vision. However, its huge computational burden and large memory consumption still limit many potential applications, especially on mobile devices and embedded systems. [0003] Much effort has been invested in compressing the size of DL models and speeding up their training and testing. These efforts can be broadly classified into four categories: network pruning, low-rank approximation, knowledge distillation, and network quantization. Among them, network quantization methods jointly optimize the ...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/08; G06N3/04
CPC: G06N3/08; G06N3/063; G06N3/04
Inventors: 李抱朴, 范彦文, 程治宇, 包英泽
Owner BAIDU COM TIMES TECH (BEIJING) CO LTD