Programmable deep neural network processor

A deep neural network processor technology, applied in the field of programmable deep neural network processors, that addresses problems such as high power consumption and frequent on-chip and off-chip data transfer, and achieves low power consumption and low cost.

Active Publication Date: 2018-09-11

周军

10 Cites, 7 Cited by

AI Technical Summary

Problems solved by technology

Repeatedly loading the same filter leads to frequent on-chip or off-chip data transfer, resulting in high power consumption.

[0009] Fifth: in deep convolutional neural networks, multiply-accumulate operations account for most of the power consumption.

Examples

Embodiment 1

[0058] Embodiment 1: see Figure 1 to Figure 7. In the prior art, as Figure 1 shows, the points in the output feature map are computed row by row, so it is necessary to wait for multiple rows of the output feature map to complete. This makes pipelining difficult, and it also requires a first-in-first-out (FIFO) memory to store all points in a row, which increases hardware overhead.

[0059] As Figure 2 shows, and unlike Figure 1, the present invention proposes a cluster-based convolution operation, which computes the points in the output feature map in units of clusters instead of rows.
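The difference between row-based and cluster-based computation can be sketched as follows. This is an illustration only: the 2*2 cluster size, the traversal order inside a cluster, and all function names are assumptions and are not taken from the patent text. The sketch only shows that the two traversal orders produce the same output feature map while completing output points in a different sequence.

```python
def conv_point(fmap, kernel, r, c, stride):
    """Compute one output point as a sum of products over the kernel window."""
    k = len(kernel)
    return sum(
        fmap[r * stride + i][c * stride + j] * kernel[i][j]
        for i in range(k)
        for j in range(k)
    )


def row_order(out_h, out_w):
    """Row-based order: finish one whole output row before starting the next."""
    return [(r, c) for r in range(out_h) for c in range(out_w)]


def cluster_order(out_h, out_w, cluster=2):
    """Cluster-based order: finish one cluster*cluster block of points at a time.

    The default cluster size of 2 is an assumption made for illustration.
    """
    order = []
    for r0 in range(0, out_h, cluster):
        for c0 in range(0, out_w, cluster):
            for r in range(r0, min(r0 + cluster, out_h)):
                for c in range(c0, min(c0 + cluster, out_w)):
                    order.append((r, c))
    return order


def convolve(fmap, kernel, stride, order_fn):
    """Convolve without padding, visiting output points in the given order."""
    k = len(kernel)
    out_h = (len(fmap) - k) // stride + 1
    out_w = (len(fmap[0]) - k) // stride + 1
    out = [[0] * out_w for _ in range(out_h)]
    for r, c in order_fn(out_h, out_w):
        out[r][c] = conv_point(fmap, kernel, r, c, stride)
    return out


# Small hypothetical input: a 6*6 feature map and a 3*3 kernel, stride 1.
fmap = [[r * 6 + c for c in range(6)] for r in range(6)]
kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]
by_rows = convolve(fmap, kernel, 1, row_order)
by_clusters = convolve(fmap, kernel, 1, cluster_order)
```

In this sketch, the first 2*2 block of output points is complete after four points in cluster order, whereas in row order it is complete only once the second output row has been started.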

[0060] As Figure 3, Figure 4 and Figure 5 show, in the prior art, after convolution, the output results of different input feature maps need to be added (as in Figure 3), which is usually done by computing points at the same location from different input feature maps and adding them together (e.g., Figure 4),...
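The prior-art accumulation described in [0060] can be sketched as below. The sketch assumes that each input feature map has already been convolved into a partial output map of the same size; the function name accumulate_feature_maps and the sample values are hypothetical.

```python
def accumulate_feature_maps(partial_maps):
    """Add points at the same location across per-input-map convolution results."""
    height = len(partial_maps[0])
    width = len(partial_maps[0][0])
    total = [[0] * width for _ in range(height)]
    for partial in partial_maps:  # one partial result per input feature map
        for r in range(height):
            for c in range(width):
                total[r][c] += partial[r][c]
    return total


# Three hypothetical partial results, one per input feature map.
partials = [
    [[1, 2], [3, 4]],
    [[10, 20], [30, 40]],
    [[100, 200], [300, 400]],
]
output_map = accumulate_feature_maps(partials)
```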

Embodiment 2

[0081] Embodiment 2: see Figure 8, a system block diagram of a specific embodiment. The DDR3, JTAG, DDR controller, selector, arbiter, feature map buffer and filter buffer constitute the storage part of the programmable deep neural network processor. The data comes from three sources: data loaded through the JTAG port, that is, user instructions and other upper-level instructions; data such as weights and feature maps; and intermediate data produced by the present invention, which needs to be temporarily stored in DDR3.

[0082] Therefore, DDR3 is used to store data. When the program control unit is working, data is read from DDR3 onto the chip; JTAG is used to write all data into DDR3, and the DDR controller controls whether DDR3 is read or written. After read/write control by the DDR controller, the data enters the arbiter through the selector, where the selector is used...

Embodiment 3

[0085] Embodiment 3: see Figure 3 and Figure 4, assuming one input feature map and one output feature map.

[0086] The input feature map size Xin*Xin is 256*256 pixels, the corresponding weight data size is 11*11, and the convolution stride S is 4;

[0087] The processing method is:

[0088] (1) The program control unit obtains the user instruction, parses it, and obtains the parameters of the convolutional neural network: the input feature map size Xin*Xin is 256*256 pixels, the corresponding weight data size Y*Y is 11*11, the convolution stride S is 4, there is one input feature map, and there is one output feature map;

[0089] Then, the program control unit reads a feature map from the feature map buffer as the input feature map, and obtains its corresponding weight data from the filter buffer according to the input feature map, wherein the size of the input image is Xin*Xin, the pixel ...
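As a worked check of the parameters in [0086], the sketch below computes the output feature map size and the multiply-accumulate count for one input and one output feature map. It assumes no padding and that incomplete windows at the border are dropped (floor division); neither assumption is stated in the text above, and the function names are hypothetical.

```python
def conv_output_size(x_in, y, stride):
    """Output width/height of a square convolution, assuming no padding."""
    return (x_in - y) // stride + 1


def mac_count(x_in, y, stride, n_in=1, n_out=1):
    """Multiply-accumulate operations: one per kernel weight per output point."""
    x_out = conv_output_size(x_in, y, stride)
    return x_out * x_out * y * y * n_in * n_out


# Parameters from [0086]: Xin = 256, Y = 11, S = 4, one input and one output map.
x_out = conv_output_size(256, 11, 4)  # (256 - 11) // 4 + 1
macs = mac_count(256, 11, 4)
print(x_out, macs)
```

Under these assumptions the output feature map is 62*62 and the layer needs 465,124 multiply-accumulate operations.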

Abstract

The invention discloses a programmable deep neural network processor comprising a program control unit, a filter buffer and a feature map buffer. The feature map buffer caches a plurality of feature maps, and the filter buffer caches the weight data matched with each feature map. The processor further comprises a layer processing engine, whose convolution unit comprises a multiply-accumulate unit, a convolution accumulation unit and a feature map accumulation unit, arranged in that order. The feature map buffer and the filter buffer are connected to the input of the layer processing engine, and a data shaping and multiplexing unit is further arranged between the feature map buffer and the input of the layer processing engine. The invention performs multiplexing control of the multiply-accumulate unit, feature map data read control and feature map accumulation control, and implements redundant-data removal control, thereby achieving a programmable deep neural network processor with low power consumption and low cost.

Description

Technical field

[0001] The invention relates to a deep neural network processor, in particular to a programmable deep neural network processor.

Background art

[0002] Today, artificial intelligence based on deep neural networks has been shown to assist or even replace humans in many applications, such as autonomous driving, image recognition, medical diagnosis, gaming, financial data analysis, and search engines. This has made artificial intelligence algorithms a research hotspot. However, these algorithms lack matching hardware support, especially at the level of the core chip. Traditional CPUs and GPUs were not developed specifically for artificial intelligence algorithms and have major problems in terms of performance, power consumption, and hardware overhead. In recent years, some dedicated artificial intelligence processors have appeared, mainly based on FPGA (Field Programmable Gate Array) or ASIC (Application Specific Integrated Circuit) platforms, such ...

Application Information

IPC(8): G06N3/04, G06N3/063

CPC: G06N3/063, G06N3/045

Inventor: 周军, 王波

Owner: 周军
