Mixed precision training of an artificial neural network

A technique that sets bit widths for an artificial neural network, applied to mixed precision training of artificial neural networks. It addresses problems such as accuracy loss and achieves improved performance, a higher level of parallelization, and high precision.

Pending Publication Date: 2021-11-09
MICROSOFT TECH LICENSING LLC

AI Technical Summary

Problems solved by technology

However, the use of quantized-precision floating-point formats can have c



Examples


Example 1

[0098] Example 1: A computer-implemented method, including: defining an artificial neural network (ANN) including a plurality of node layers; setting a first bit width for activation values associated with a first layer of the plurality of node layers; setting a second bit width for activation values associated with a second layer of the plurality of node layers; and, during training of the ANN or inference from the ANN, applying a first activation function to the first layer, thereby generating a first plurality of activation values having the first bit width, and applying a second activation function to the second layer, thereby generating a second plurality of activation values having the second bit width.
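The per-layer activation bit widths of Example 1 can be illustrated with a minimal sketch. It assumes a simple uniform quantizer as a stand-in for whatever number format the application actually uses; `quantize`, `dense`, `layer_forward`, the layer sizes, and the bit widths 8 and 4 are all hypothetical choices made for illustration.

```python
import math

def quantize(x, bit_width, max_abs=1.0):
    """Assumed stand-in quantizer: snap x to a signed uniform grid of the given bit width."""
    levels = 2 ** (bit_width - 1) - 1          # e.g. 127 steps per side for 8 bits
    clipped = max(-max_abs, min(max_abs, x))   # keep x inside the representable range
    return round(clipped / max_abs * levels) / levels * max_abs

def dense(inputs, weights):
    """Weighted sum per output node; weights holds one row per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def layer_forward(inputs, weights, activation, bit_width):
    """Apply the activation function, then emit activations at this layer's bit width."""
    return [quantize(activation(z), bit_width) for z in dense(inputs, weights)]

# First layer emits 8-bit activations, second layer emits 4-bit activations.
first = layer_forward([0.5, -0.25], [[0.3, 0.8], [-0.6, 0.1]], math.tanh, bit_width=8)
second = layer_forward(first, [[0.7, -0.4]], math.tanh, bit_width=4)
```

Each layer carries its own `bit_width`, so the same forward routine serves both training and inference while the precision of the activations differs from layer to layer.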

Example 2

[0099] Example 2: The computer-implemented method of Example 1, further comprising: setting a third bit width for weights associated with the first layer of the plurality of node layers, wherein during training of the ANN or inference from the ANN, the first layer of the plurality of node layers generates weights having the third bit width; and setting a fourth bit width for weights associated with the second layer of the plurality of node layers, wherein during training of the ANN or inference from the ANN, the second layer of the plurality of node layers generates weights having the fourth bit width.
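A similar sketch can illustrate per-layer weight bit widths as in Example 2. It assumes a plain gradient-descent update followed by the same kind of uniform quantizer; the layer names, weight and gradient values, learning rate, and bit widths are hypothetical and are not taken from the application.

```python
def quantize(x, bit_width, max_abs=1.0):
    """Assumed stand-in quantizer: snap x to a signed uniform grid of the given bit width."""
    levels = 2 ** (bit_width - 1) - 1
    clipped = max(-max_abs, min(max_abs, x))
    return round(clipped / max_abs * levels) / levels * max_abs

def sgd_update(weights, grads, lr, bit_width):
    """Take one gradient step, then store the result at the layer's weight bit width."""
    return [quantize(w - lr * g, bit_width) for w, g in zip(weights, grads)]

# Hypothetical per-layer configuration: a "third" and a "fourth" bit width for weights.
layer_weight_bits = {"layer1": 8, "layer2": 4}
weights = {"layer1": [0.30, -0.62], "layer2": [0.71, 0.05]}
grads = {"layer1": [0.10, -0.20], "layer2": [-0.30, 0.40]}

updated = {
    name: sgd_update(weights[name], grads[name], lr=0.1, bit_width=layer_weight_bits[name])
    for name in weights
}
```

Keeping the bit widths in a table keyed by layer makes the weight precision a per-layer setting that is independent of the activation precision.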

Example 3

[0100] Example 3: The computer-implemented method of Example 1, wherein the first layer of the plurality of node layers comprises an input layer, wherein the second layer of the plurality of node layers comprises an output layer, and wherein the first bit width and the second bit width are set to a bit width associated with a set of the remaining node layers.


Abstract

The use of mixed precision values when training an artificial neural network (ANN) can increase performance while reducing cost. Certain portions and/or steps of an ANN may be selected to use higher or lower precision values when training. Additionally, or alternatively, early phases of training are accurate enough with lower levels of precision to quickly refine an ANN model, while higher levels of precision may be used to increase accuracy for later steps and epochs. Similarly, different gates of a long short-term memory (LSTM) may be supplied with values having different precisions.
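The idea of using lower precision early in training and higher precision later can be sketched as a simple schedule. This is only an illustration under stated assumptions: the function name `bit_width_for_epoch`, the bit widths 4 and 16, and the halfway switch point are hypothetical and are not specified by the abstract.

```python
def bit_width_for_epoch(epoch, total_epochs, low_bits=4, high_bits=16, switch_fraction=0.5):
    """Return low_bits for early epochs and high_bits once training passes switch_fraction."""
    if epoch < total_epochs * switch_fraction:
        return low_bits    # early phase: coarse values are assumed accurate enough
    return high_bits       # later phase: finer values to refine accuracy

# Bit width used in each of six hypothetical epochs.
schedule = [bit_width_for_epoch(epoch, total_epochs=6) for epoch in range(6)]
print(schedule)  # [4, 4, 4, 16, 16, 16]
```

A training loop could look up the bit width at the start of each epoch and pass it to whatever quantization routine it uses for that epoch.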

Description

Background

[0001] Artificial neural networks ("ANNs" or "NNs") are applied to many applications in artificial intelligence ("AI") and machine learning ("ML"), including image recognition, speech recognition, search engines, and other suitable applications. An ANN is usually trained over multiple "epochs." In each epoch, the ANN trains over all of the training data in multiple steps. In each step, the ANN first makes a prediction for an instance of the training data (referred to herein as a "sample"). This step is commonly referred to as a "forward pass" (also referred to herein as a "forward training pass"), although a step may also include a backward pass.

[0002] To make a prediction, a training data sample is fed to the first layer of the ANN, which is commonly referred to as the "input layer." Each layer of the ANN then computes a function over its inputs, often using learned parameters ("weights"), to generate an input for the next layer. The output of the last layer, the "output layer," i...


Application Information

IPC(8): G06N3/04, G06N3/063, G06N3/08
CPC: G06N3/063, G06N3/084, G06N3/088, G06N3/048, G06N3/044
Inventor: 朱海杉, T·纳, D·洛, E·S·钟
Owner: MICROSOFT TECH LICENSING LLC