Batch normalization layer fusion and quantization method for model inference in AI neural network engine

A neural network and normalization technology, applied to biological neural network models, neural learning methods, kernel methods, and related areas, addressing problems such as reduced computational efficiency of CNNs.

Pending Publication Date: 2020-12-25
BAIDU USA LLC

AI Technical Summary

Problems solved by technology

Retrieving results from memory reduces the computational efficiency of the CNN.




Embodiment Construction

[0019] The following detailed description provides implementations and examples of batch normalization (BN) layer fusion and quantization methods for model inference in an artificial intelligence (AI) network engine. References to neural networks (NN) include any type of neural network that can be used in an AI network engine, including deep neural networks (DNN) and convolutional neural networks (CNN). For example, a NN may be an instance of a machine learning algorithm or process, and may be trained to perform a given task, such as classifying input data (e.g., an input image). Training a NN may include determining weights and biases for data passing through the NN, including determining batch normalization (BN) parameters used for the NN's inference.
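For reference, the BN parameters mentioned above are, in the standard formulation (not reproduced in the source), the learned per-channel scale γ and shift β together with the running mean μ and variance σ² accumulated during training; the transform applied to an activation x is

    y = γ · (x − μ) / √(σ² + ε) + β

where ε is a small constant added for numerical stability.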

[0020] Once trained, a NN can perform a task by computing an output using the parameters, weights, and biases of any number of layers to produce activations that determine the classification or score of the input data. ...
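The paragraph above is truncated in the source, and the quantization named in the title is not detailed here. Purely as a sketch, assuming a conventional symmetric per-tensor int8 scheme applied to the merged parameters (the function names, scale choice, and rounding below are assumptions, not taken from the patent):

    import numpy as np

    def quantize_symmetric_int8(w):
        # Assumed scheme: symmetric per-tensor int8, with the scale chosen from
        # the maximum absolute value so the tensor maps onto [-127, 127].
        max_abs = float(np.abs(w).max())
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        # Approximate float recovery, e.g. for checking quantization error.
        return q.astype(np.float32) * scale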



Abstract

A batch normalization (BN) layer fusion and quantization method for model inference in an artificial intelligence (AI) network engine is disclosed. A method for a neural network (NN) includes merging batch normalization (BN) layer parameters with NN layer parameters and computing merged BN layer and NN layer functions using the merged BN and NN layer parameters. A rectified linear unit (RELU) function can be merged with the BN and NN layer functions.
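A minimal NumPy sketch of this kind of fusion: the per-channel BN parameters are folded into the preceding layer's weights and bias so that inference computes a single merged layer, with the ReLU applied directly to its output. The tensor layout, names, and epsilon value are illustrative assumptions, not the claimed method.

    import numpy as np

    def fuse_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-5):
        # w: conv weights (out_channels, in_channels, kH, kW); b: bias (out_channels,)
        # gamma, beta, mean, var: per-output-channel BN parameters (out_channels,)
        scale = gamma / np.sqrt(var + eps)        # per-channel BN scale
        w_fused = w * scale[:, None, None, None]  # fold scale into each filter
        b_fused = (b - mean) * scale + beta       # fold shift into the bias
        return w_fused, b_fused

    # At inference the merged layer is computed once, and the ReLU can be applied
    # directly to its output, e.g. y = np.maximum(conv(x, w_fused) + b_fused, 0.0),
    # where conv(...) stands for any convolution routine.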

Description

Technical Field

[0001] Embodiments of the invention generally relate to data and computational processing, including neural network (NN) hardware and software, for improved inference performance. More specifically, embodiments of the present invention relate to batch normalization layer fusion and quantization methods for model inference in artificial intelligence (AI) network engines.

Background

[0002] A neural network (NN) is a type of machine learning that mimics the human brain. For example, a NN takes an input (vector) and passes it through a series of hidden layers of nodes (neurons) that transform the input using weights or filters. Each node can be connected to nodes in the previous layer to receive input, and the last output layer or fully connected layer can provide classification or category scores. One type of NN commonly used for image classification is a convolutional neural network (CNN), for example, using a CNN to determine whether an input imag...
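As an illustration of the layer structure described above, a minimal fully connected forward pass (shapes, names, and the choice of ReLU for hidden layers are hypothetical, not from the patent):

    import numpy as np

    def forward(x, layers):
        # layers: list of (weight, bias) pairs; hidden layers use ReLU and the
        # final layer's raw outputs serve as classification / category scores.
        for i, (w, b) in enumerate(layers):
            x = x @ w + b
            if i < len(layers) - 1:
                x = np.maximum(x, 0.0)
        return x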


Application Information

IPC(8): G06N3/04; G06N3/08
CPC: G06N3/08; G06N3/045; G06N3/063; G06N3/048; G06N3/04; G06N20/10
Inventor: 郭敏
Owner: BAIDU USA LLC