
Implementation method for parallel acceleration of ResNet based on general neural network processor

A neural-network implementation technology, applied in the field of parallel-accelerated ResNet on general-purpose neural network processors. It addresses problems such as poor application performance and frequent memory reads and writes, and achieves short model convergence time, fast image recognition, and low energy consumption.

Inactive Publication Date: 2021-11-16
苏州仰思坪半导体有限公司
Cites: 0 · Cited by: 0

AI Technical Summary

Problems solved by technology

To implement these operations on a GPU, they must be decomposed into basic scalar multiplications and additions, which produces a large number of intermediate results and frequent memory reads and writes; the resulting application performance is poor.




Embodiment Construction

[0027] In order to make the above purposes, features, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, where no conflict arises, the embodiments of the present application and the features in those embodiments may be combined with each other.

[0028] In the following description, many specific details are set forth in order to provide a full understanding of the present invention. However, the invention can also be implemented in ways other than those described here; the protection scope of the invention is therefore not limited by the specific embodiments disclosed below.

[0029] As shown in Figure 1, an implementation method for parallel acceleration of ResNet based on a general-purpose neural network processor includes the following steps: Step S1: Load the data s...



Abstract

The invention discloses an implementation method for parallel acceleration of ResNet based on a general-purpose neural network processor. The method comprises the following steps: load the data set and weights from the central cache region into registers; for the convolutional layer, load the weight matrix from the registers into the matrix multiplication unit, stream the data-set matrix into the matrix multiplication unit, write the result back to the registers, and repeat until all data are processed; complete batch standardization with the vector compression unit; complete linear rectification (ReLU) with the SIMD operation unit; complete the pooling layer with the SIMD operation unit and the vector compression unit; complete the fully connected layer with the matrix multiplication unit and write the result back to the registers; finally, write the results in the registers back to the central cache region. Using the matrix multiplication unit for the convolutional and fully connected layers achieves optimal performance and performance-to-power ratio, with lower energy consumption, shorter model convergence time, and faster image recognition.
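The abstract's pipeline can be sketched in plain numpy. This is only an illustrative model of the described dataflow, not the patent's implementation: `conv_as_matmul` stands in for the matrix multiplication unit (convolution lowered to one matrix product), `batch_norm` for the reductions the vector compression unit performs, and `relu` for the elementwise SIMD operation; all function names and the single-channel, valid-padding simplification are assumptions.

```python
import numpy as np

def im2col(x, k):
    """Unfold k x k patches of a single-channel feature map into rows,
    so a convolution becomes one matrix multiplication (valid padding)."""
    h, w = x.shape
    rows = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            rows.append(x[i:i + k, j:j + k].ravel())
    return np.array(rows)

def conv_as_matmul(x, weight):
    """Convolution executed as a single matrix product -- the role the
    matrix multiplication unit plays in the described method."""
    k = weight.shape[0]
    cols = im2col(x, k)             # (output positions, k*k)
    out = cols @ weight.ravel()     # one matmul per output map
    n = x.shape[0] - k + 1
    return out.reshape(n, n)

def batch_norm(x, eps=1e-5):
    """Batch standardization via mean/variance reductions (the kind of
    operation assigned to the vector compression unit)."""
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def relu(x):
    """Linear rectification: an elementwise max, a natural SIMD operation."""
    return np.maximum(x, 0.0)

# Toy run: 4x4 input, 3x3 mean filter, then batch norm and ReLU.
x = np.arange(16, dtype=float).reshape(4, 4)
w = np.ones((3, 3)) / 9.0
y = relu(batch_norm(conv_as_matmul(x, w)))
print(y.shape)  # (2, 2)
```

The point of the im2col lowering is exactly the abstract's claim: once convolution is expressed as one large matrix product, a dedicated matrix multiplication unit can execute it without the flood of scalar intermediate results mentioned in the problem statement.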

Description

technical field [0001] The invention relates in particular to a method for realizing parallel acceleration of ResNet based on a general-purpose neural network processor. Background technique [0002] The residual network (ResNet) is a machine learning model. Deep network training historically suffered from vanishing gradients: repeated matrix multiplication can shrink the gradient between the output and the initial layers, so as the network deepens, model performance saturates and eventually degrades. ResNet alleviates vanishing gradients by adding shortcut connections, so the model can fall back on learning an identity function and performance is guaranteed not to deteriorate as layers are added. ResNet made computer image recognition practical and is now widely and importantly used. With the development of deep neural network technology, the number of la...
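The shortcut described in the background can be made concrete with a minimal sketch. This is a simplified fully-connected residual block, not the patent's architecture; the weight matrices `w1`, `w2` and the function name are illustrative assumptions. The key property is that the block computes y = F(x) + x, so when F contributes almost nothing the block reduces to the identity, which is why stacking more blocks cannot make the network worse.

```python
import numpy as np

def residual_block(x, w1, w2):
    """One simplified residual block: linear -> ReLU -> linear, plus the
    shortcut. The block learns a residual F(x) instead of the full mapping."""
    f = np.maximum(x @ w1, 0.0) @ w2   # F(x)
    return f + x                       # shortcut: y = F(x) + x

# With near-zero weights, F(x) ~ 0 and the block passes x through almost
# unchanged -- the identity fallback that mitigates vanishing gradients.
rng = np.random.default_rng(0)
x = rng.standard_normal(4)
w = 1e-3 * rng.standard_normal((4, 4))
y = residual_block(x, w, w)
print(np.allclose(y, x, atol=1e-3))  # True: the block is near-identity
```

The same additive structure helps the gradient: since dy/dx = dF/dx + I, the identity term carries the gradient past the block even when dF/dx is tiny, which is the mechanism behind the "constant function" remark in the background section.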


Application Information

Patent Timeline
Patent Timeline: no application events
Patent Type & Authority: Application (China)
IPC(8): G06N3/04; G06N3/063; G06N3/08; G06F9/38
CPC: G06N3/063; G06N3/08; G06F9/3887; G06N3/045
Inventors: 杨龚轶凡; 闯小明; 郑瀚寻; 王润哲
Owner: 苏州仰思坪半导体有限公司