Programmable Deep Neural Network Processor

What is Al technical title?
Al technical title is built by PatSnap Al team. It summarizes the technical point description of the patent document.
A deep neural network and processor technology, applied in the field of programmable deep neural network processors, can solve the problems of high power consumption, frequent transmission of chip data, etc., achieve low power consumption, low cost, and improve hardware utilization.

Active Publication Date: 2020-09-04

周军

View PDF10 Cites 0 Cited by

Summary
Abstract
Description
Claims
Application Information

AI Technical Summary
This helps you quickly interpret patents by identifying the three key elements:
Problems solved by technology
Method used
Benefits of technology

Problems solved by technology

Repeated loading of the same filter will lead to frequent data transmission on-chip or off-chip, resulting in large power consumption

[0009] Fifth: For deep convolutional neural networks, multiply-accumulate operations generate most of the power consumption

Method used

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine

Image

Smart Image Click on the blue labels to locate them in the text.

Viewing Examples

Smart Image

Examples

Experimental program

Comparison scheme

Effect test

Embodiment 1

[0058] Embodiment 1: see Figure 1 to Figure 7 . In the prior art, by figure 1 It can be seen that since the points in the output feature map are calculated on a behavior basis, it is necessary to wait for multiple lines in the output feature map to complete. This makes pipelining difficult, and it also requires a first-in-first-out memory to store all points in a row, which increases hardware overhead.

[0059] Depend on figure 2 It can be seen that the present invention and figure 1 Differently, we propose a cluster-based convolution operation, which computes the points in the output feature map in units of clusters instead of rows.

[0060] Depend on image 3 , Figure 4 , Figure 5 It can be seen that in the prior art, after convolution, the output results of different input feature maps need to be added (such as image 3 ), which is usually done by computing points with the same location from different input feature maps and adding them together (eg Figure 4 ),...

Embodiment 2

[0081] Example 2: see Figure 8 , the system constructs a block diagram of a specific embodiment. Among them, DDR3, JTAG, DDR controller, selector, arbitrator, feature map buffer and filter buffer constitute the storage part of the programmable deep neural network processor. The data comes from three parts, and one part is loaded through the JTAG port. The data, that is, user instructions and other upper instructions, part of which is data such as weights and feature maps, and part of it is intermediate data processed by the present invention, which needs to be temporarily stored in DDR3.

[0082] Therefore, DDR3 is used to store data. When the program control unit is working, the data is read from DDR3 to the chip, JTAG is used to write all data into DDR3, and the DDR controller is used to control whether DDR3 is read or written; the data passes through the DDR controller. After the read and write control, enter the arbitrator through the selector, where the selector is used...

Embodiment 3

[0085] Embodiment 3: see image 3 and Figure 4 , assuming one input feature map and one output feature map.

[0086] The pixel of the input feature map is Xin*Xin is 256*256, the pixel of the corresponding weight data is 11*11, and the convolution step S is 4;

[0087] Its processing method is:

[0088] (1) The program control unit obtains the user instruction, analyzes the user instruction, and obtains the parameters of the convolutional neural network; the parameters include that the pixel of the input feature map is Xin*Xin, which is 256*256, and the pixel Y*Y of the corresponding weight data is 11*11, the convolution step size S is 4, the input feature map is one, and the output feature map is one;

[0089] Then, the program control unit reads a feature map from the feature map buffer as an input feature map, and obtains its corresponding weight data from the filter buffer according to the input feature map, wherein the pixel of the input image is Xin*Xin , the pixel ...

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine

Login to View More

PUM

Login to View More

Abstract

The invention discloses a programmable deep neural network processor, which includes a program control unit, a filter buffer area, and a feature map buffer area. The feature map buffer area is used to cache multiple feature maps. The filter buffer area is used for It also includes a layer processing engine, and the convolution unit part of the layer processing engine includes a multiplication and accumulation unit, a convolution accumulation unit and a feature map accumulation unit arranged in sequence, and the feature map cache The area and the filter buffer area are connected to the input end of the layer processing engine, and a data shaping and multiplexing unit is also arranged between the feature map buffer area and the input end of the layer processing engine. The present invention realizes a low-power, low-cost programmable deep neural network processor through the multiplexing control of the multiply-accumulate unit, the feature map data reading control, the feature map accumulation control, and the redundant data elimination control.

Description

technical field [0001] The invention relates to a deep neural network processor, in particular to a programmable deep neural network processor. Background technique [0002] Today, artificial intelligence based on deep neural networks has been proven to assist or even replace humans in many applications, such as autonomous driving, image recognition, medical diagnosis, gaming, financial data analysis, and search engines. This makes artificial intelligence algorithms a research hotspot. However, the related algorithms lack the matching hardware (especially the core chip) support. Traditional CPUs and GPUs are not specifically developed for artificial intelligence algorithms, and have major problems in terms of performance, power consumption, and hardware overhead. In recent years, there have been some dedicated artificial intelligence processors, which are mainly based on FPGA (Field Programmable Gate Array) or ASIC (Application Specific Integrated Circuit) platforms, such ...

Claims

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine

Login to View More

Application Information

Patent Timeline

Login to View More

Patent Type & Authority Patents(China)

IPC IPC(8): G06N3/04G06N3/063

CPCG06N3/063G06N3/045

Inventor 周军王波

Owner 周军

Features

R&D
Intellectual Property
Life Sciences
Materials
Tech Scout

Why Patsnap Eureka

Unparalleled Data Quality
Higher Quality Content
60% Fewer Hallucinations

Social media

Patsnap Eureka Blog

Learn More

Browse by: Latest US Patents, China's latest patents, Technical Efficacy Thesaurus, Application Domain, Technology Topic, Popular Technical Reports.

Programmable Deep Neural Network Processor

AI Technical Summary This helps you quickly interpret patents by identifying the three key elements: Problems solved by technologyMethod usedBenefits of technology

Problems solved by technology

Method used

Image

Examples

Embodiment 1

Embodiment 2

Embodiment 3

PUM

Abstract

Description

Claims

Application Information

AI Technical Summary
This helps you quickly interpret patents by identifying the three key elements:
Problems solved by technology
Method used
Benefits of technology