
A dynamically reconfigurable convolution neural network accelerator architecture oriented to the field of the Internet of things

The technology combines convolutional neural networks with the Internet of Things and is applied in the field of dynamically reconfigurable convolutional neural network accelerator architectures. It addresses the problems of existing hardware implementations, namely high power consumption, low energy efficiency (performance/power), and inapplicability to intelligent mobile terminals, and achieves a simple network structure, reduced external memory access, and low power consumption.

Active Publication Date: 2019-03-08
XI AN JIAOTONG UNIV
  • Summary
  • Abstract
  • Description
  • Claims
  • Application Information

AI Technical Summary

Problems solved by technology

However, existing hardware implementations have high power consumption and low energy efficiency (performance/power), and therefore cannot be applied to intelligent mobile terminals such as smartphones, wearable devices, or self-driving cars.
In this context, reconfigurable processors have been shown to be a form of parallel computing architecture that combines high flexibility with high energy efficiency. Improving processing performance through a dedicated processor is one way around the limitations constraining the further development of multi-core CPU and FPGA technology, and may become one of the paths to realizing high-performance deep learning SoCs in the future.



Examples


Embodiment

[0090] For the speed metric, the superiority of the present invention comes from the design of the processing-unit array and the cache architecture. First, each processing unit adopts the Winograd convolution acceleration algorithm. For example, for a convolution of 5×5 input data with a 3×3 kernel at stride 1, traditional convolution requires 81 multiplication operations, whereas each processing unit needs only 25. In addition, the processing-unit array processes the input channels and output channels of the convolutional network with a certain degree of parallelism, which further speeds up the convolution operation. On the other hand, the cache architecture has two working modes. In the on-chip working mode, the data generated by the intermediate layers of the convolutional neural network does not need to be stored off-chip and can be sent directly to the next layer of the network. For lightweight convolutional neural...
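The multiplication saving can be illustrated with the classic one-dimensional Winograd F(2,3) transform, a smaller variant of the same idea as the F(3×3, 3×3) configuration implied above (3×3 output tile from a 5×5 input, 25 multiplications versus 81). The transform matrices below are the standard published ones, not taken from the patent:

```python
import numpy as np

# Standard transform matrices for 1-D Winograd F(2,3): 2 outputs from a
# 4-tap input tile and a 3-tap kernel, using 4 elementwise multiplications
# instead of the 6 required by direct convolution.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)   # input transform
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])                # kernel transform
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)    # output transform

def winograd_f23(d, g):
    """Two outputs of a stride-1 convolution of 4-tap input d with
    3-tap kernel g, via 4 elementwise multiplications."""
    U = G @ g            # transformed kernel (precomputable offline)
    V = BT @ d           # transformed input tile
    return AT @ (U * V)  # elementwise product, then output transform

d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([1.0, 0.5, 0.25])

# Direct convolution for reference (6 multiplications)
direct = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                   d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])
print(winograd_f23(d, g))  # matches direct: [2.75, 4.5]
```

Nesting this 1-D transform in two dimensions yields the 2-D counts quoted in the text; the kernel transform `G @ g` is computed once per kernel, so only the elementwise products cost multiplications at run time.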



Abstract

The invention relates to a dynamically reconfigurable convolutional neural network accelerator architecture oriented to the field of the Internet of Things, comprising a buffer architecture, a processing-unit array, a computing module, and a controller. The buffer architecture stores data from external memory or data generated during computation, organizes and arranges that data, and transmits it to the processing-unit array for calculation. The processing-unit array receives data from the buffer architecture and, after convolution processing, stores the results back in the buffer architecture. The computing module receives data from the processing-unit array, applies one of three operations (pooling, normalization, or an activation function), and stores the output data in the buffer architecture. The controller sends commands to the buffer architecture, the processing-unit array, and the computing module, and provides an external interface for communicating with the external system. By designing a processing-unit array with high parallelism and high utilization, together with a buffer structure that improves the data reuse rate, the invention improves the performance of the convolutional neural network accelerator and reduces its power consumption.
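The dataflow among the four modules named above can be sketched in software. All class and method names here are illustrative assumptions for exposition, not identifiers from the patent, and the convolution is a plain 1-D stand-in for the accelerator's Winograd-based processing units:

```python
# Hypothetical software model of the dataflow: controller sequences one
# layer as buffer -> PE array -> compute module -> buffer.

class Buffer:
    """Stands in for the on-chip buffer architecture."""
    def __init__(self):
        self.store = {}
    def write(self, key, data):
        self.store[key] = data
    def read(self, key):
        return self.store[key]

class PEArray:
    """Stand-in for the processing-unit array (direct 1-D convolution)."""
    def convolve(self, data, kernel):
        k = len(kernel)
        return [sum(data[i + j] * kernel[j] for j in range(k))
                for i in range(len(data) - k + 1)]

class ComputeModule:
    """Applies one of the post-convolution operations."""
    def apply(self, data, op):
        if op == "relu":       # activation function
            return [max(0.0, x) for x in data]
        if op == "maxpool2":   # pooling, window 2, stride 2
            return [max(data[i], data[i + 1])
                    for i in range(0, len(data) - 1, 2)]
        raise ValueError(f"unknown op: {op}")

class Controller:
    """Issues commands so intermediate data stays 'on chip' (in Buffer)."""
    def __init__(self):
        self.buf, self.pe, self.cm = Buffer(), PEArray(), ComputeModule()
    def run_layer(self, in_key, out_key, kernel, op):
        conv = self.pe.convolve(self.buf.read(in_key), kernel)
        self.buf.write(out_key, self.cm.apply(conv, op))

ctrl = Controller()
ctrl.buf.write("layer0", [1.0, -2.0, 3.0, -4.0, 5.0])
ctrl.run_layer("layer0", "layer1", kernel=[1.0, 0.0, -1.0], op="relu")
print(ctrl.buf.read("layer1"))  # → [0.0, 2.0, 0.0]
```

The point of the sketch is the on-chip working mode described in [0090]: the output of one layer is written back to the buffer and read as the next layer's input, with no off-chip store in between.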

Description

technical field

[0001] The invention belongs to the field of neural network accelerators, and in particular relates to a dynamically reconfigurable convolutional neural network accelerator architecture oriented to the field of the Internet of Things.

Background technique

[0002] Artificial intelligence is currently one of the most active areas of computer science. Deep learning, as the main way of realizing artificial intelligence, has likewise developed rapidly. As the number of network layers and the number of neurons per layer increase, the computational complexity of a model grows steeply with network size. Consequently, the training and inference speed of deep learning algorithms increasingly depends on large-scale computing platforms such as cloud computing. For the hardware acceleration of deep learning algorithms, there are usually three types of implementation: multi-core CPU, GPU, and FPGA...

Claims


Application Information

IPC(8): G06N3/04, G06N3/063, G06N3/08
CPC: G06N3/063, G06N3/08, G06N3/045
Inventor 杨晨 (Yang Chen), 王逸洲 (Wang Yizhou), 王小力 (Wang Xiaoli), 耿莉 (Geng Li)
Owner XI AN JIAOTONG UNIV