Sparse accelerator applied to on-chip training

A sparse accelerator for on-chip training, applied in the fields of biological neural network models, neural architectures, and neural learning methods. It addresses the problem of invalid operations during on-chip training, improving hardware utilization and eliminating those operations accurately.

Pending Publication Date: 2022-05-13
NANJING UNIV

AI Technical Summary

Problems solved by technology

[0006] This application provides a sparse accelerator applied to on-chip training, which is used to solve the technical problem that after the acceleration of the existing accele



Examples


Example Embodiment

[0057] In order to make the objectives, technical solutions and advantages of the present application clearer, the embodiments of the present application will be further described in detail below with reference to the accompanying drawings.

[0058] The terminology used in the following embodiments is for the purpose of describing particular embodiments only and is not intended to limit the present application. As used in the specification of this application and the appended claims, the singular expressions "a," "an," "the," and "above" are intended to also include expressions such as "one or more," unless the context clearly dictates otherwise. It should also be understood that, in the following embodiments of the present application, "at least one" and "one or more" refer to one, two, or more, and "a plurality" refers to two or more. The term "and/or", used to describe the association relationship of related objects, indicates that there can be t...



Abstract

In the sparse accelerator applied to on-chip training provided by the embodiments of the invention, the input values in the input value buffer module, the reference values in the reference value buffer module, and the mask in the mask buffer module are dynamically adjusted in the different acceleration stages. A coarse-grained unit first screens out invalid operations at coarse granularity, and a fine-grained unit included in each processing unit of the processing module then screens out the remaining invalid operations at fine granularity, so that all invalid operations in the three training stages of on-chip training are eliminated. In addition, the multiple computing cores can perform acceleration in parallel, and the multiple processing units in the processing module of each computing core can also perform acceleration in parallel, which further improves the hardware utilization of the sparse accelerator. The sparse accelerator can therefore efficiently and accurately eliminate all invalid operations in the three stages of on-chip training.
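The two-level screening described in the abstract can be illustrated in software. The following is only a minimal sketch under stated assumptions, not the patented hardware: it assumes the mask marks with 1 the positions that need computing, that a coarse-grained pass skips whole blocks whose mask is all zero, and that a fine-grained pass skips individual multiply-accumulates where the mask bit, the input value, or the reference value is zero. The names `coarse_screen`, `fine_screen_dot`, `sparse_dot`, and the `block` parameter are hypothetical.

```python
# Illustrative two-level zero skipping for a masked dot product.
# All names and the block-based grouping are assumptions for this sketch.

def coarse_screen(inputs, mask, block):
    """Coarse-grained pass: keep only blocks whose mask has at least one 1."""
    return [s for s in range(0, len(inputs), block) if any(mask[s:s + block])]


def fine_screen_dot(inputs, refs, mask, start, block):
    """Fine-grained pass: multiply-accumulate only unmasked, non-zero pairs."""
    acc, macs = 0, 0
    for i in range(start, min(start + block, len(inputs))):
        if mask[i] and inputs[i] != 0 and refs[i] != 0:
            acc += inputs[i] * refs[i]
            macs += 1  # count the operations that were actually performed
    return acc, macs


def sparse_dot(inputs, refs, mask, block=4):
    """Return (result, number of multiply-accumulates actually executed)."""
    total, macs = 0, 0
    for start in coarse_screen(inputs, mask, block):
        acc, n = fine_screen_dot(inputs, refs, mask, start, block)
        total += acc
        macs += n
    return total, macs


if __name__ == "__main__":
    inputs = [0, 3, 0, 0, 0, 0, 0, 0, 2, 0, 5, 1]
    refs = [1, 2, 3, 4, 5, 6, 7, 8, 0, 1, 2, 3]
    mask = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
    print(sparse_dot(inputs, refs, mask))
```

In this sketch the result equals the dense masked dot product, while the returned operation count shows how many multiply-accumulates survive both screening passes.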

Description

Technical field

[0001] This application relates to the field of computer and electronic information technology, and in particular to a sparse accelerator applied to on-chip training.

Background

[0002] In recent years, convolutional neural networks (CNNs) have performed well in the fields of computer vision, speech recognition, and natural language processing. To improve recognition accuracy, the CNN model needs to be trained on-chip: the model is fine-tuned with the user data on the network, which in turn improves the accuracy of the CNN model when it is used.

[0003] The on-chip training process includes three stages: the forward propagation (FP) stage, the backward propagation (BP) stage, and the weight gradient (WG) calculation stage. In the FP stage, the activation values of the previous layer are used as the input activation values of the current layer, combined with the convolution kernel weight cor...
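The three training stages named above can be sketched for a single layer. This is a textbook illustration under stated assumptions, not the method of the application: it uses a one-dimensional valid convolution followed by ReLU, plain Python lists, and hypothetical function names (`forward`, `backward`, `weight_gradient`). It shows where zeros arise in each stage: zero activations in FP, zero gradients behind inactive ReLU outputs in BP, and both in WG.

```python
# Textbook FP / BP / WG for one 1-D convolution layer with ReLU.
# Function names and the 1-D layout are assumptions for this sketch.

def forward(x, w):
    """FP: y[i] = sum_k w[k] * x[i + k], followed by ReLU."""
    n_out = len(x) - len(w) + 1
    y = [sum(w[k] * x[i + k] for k in range(len(w))) for i in range(n_out)]
    a = [max(0, v) for v in y]
    return y, a


def backward(grad_a, y, w, n_in):
    """BP: gate the gradient through ReLU, then propagate it to the input."""
    dy = [g if v > 0 else 0 for g, v in zip(grad_a, y)]
    dx = [0] * n_in
    for i, g in enumerate(dy):
        if g == 0:
            continue  # a zero gradient contributes nothing and can be skipped
        for k, wk in enumerate(w):
            dx[i + k] += wk * g
    return dy, dx


def weight_gradient(dy, x, k_size):
    """WG: dw[k] = sum_i dy[i] * x[i + k]."""
    return [sum(dy[i] * x[i + k] for i in range(len(dy))) for k in range(k_size)]


if __name__ == "__main__":
    x, w = [1, 0, 2, 0], [1, -1]
    y, a = forward(x, w)
    dy, dx = backward([1, 1, 1], y, w, len(x))
    dw = weight_gradient(dy, x, len(w))
    print(a, dy, dx, dw)
```

Each product in the three functions that has a zero operand leaves the result unchanged, which is the kind of invalid operation a sparse accelerator aims to skip.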


Application Information

IPC(8): G06N3/04, G06N3/063, G06N3/08
CPC: G06N3/063, G06N3/082, G06N3/084, G06N3/045
Inventor: 王中风, 黄健, 鲁金铭
Owner: NANJING UNIV