Cellular neural network hardware architecture optimization method

A neural network and hardware architecture technology, applied to biological neural network models and their physical implementation. It addresses problems such as redundant computation, slow memory reads, and the lack of reconfigurable design-space exploration, achieving optimal computing performance and reduced memory bandwidth.

Status: Inactive | Publication Date: 2018-09-28
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

[0012] 1. In most existing cellular neural network hardware designs, the iterative operation is executed in only one operation module, which does not make full use of the parallelism of the hardware.
[0013] 2. Current work does not account for the fact that repeatedly reading and writing the same parameters in the neural network occupies unnecessary memory bandwidth and storage resources, nor does it exploit this property. Because memory reads on an FPGA are much slower than computation, these repeated parameter accesses cause redundant work and degrade computing performance.
[0014] 3. Many cellular neural network systems do not make full use of hardware resources and bandwidth. This lack of reconfigurable design-space exploration makes it difficult for the hardware to achieve optimal performance.




Embodiment Construction

[0043] The present invention will be further described below in conjunction with the accompanying drawings.

[0044] As shown in figure 1, the cellular neural network hardware architecture consists of an external memory, a memory interface controller, an on-chip input cache, an on-chip output cache, a computing acceleration unit, and an AXI4 bus. Because on-chip storage resources are limited, data is first read from the external memory into the on-chip input cache through the memory interface controller and the AXI4 bus, and the computation is performed in the computing acceleration unit. The computing acceleration unit comprises a template RAM, a data transmission control unit, an iteration unit A, and several iteration units B connected in sequence; the operation of the entire cellular neural network is completed through the iteration-unit pipeline. The details are as follows:

[0045] Iteration unit A (IU A): performs the initial calculation operation in the iterativ...
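The pipelined dataflow described above (external memory, input cache, a chain of iteration units, output cache) can be sketched in software. This is a minimal illustrative model and not the patented design: the function names, the tile size, the unit count, and the `step` placeholder (which stands in for one cellular neural network iteration) are all assumptions made for the sketch.

```python
def run_accelerator(external_memory, n_iter_units=4, tile_size=8):
    """Illustrative model of the accelerator dataflow: tiles are read from
    external memory into an input cache, streamed through a chain of
    iteration units (IU A followed by several IU Bs), and written to an
    output cache. `step` is a placeholder for one CNN iteration."""
    def step(tile):
        # Placeholder computation; a real unit would apply the 3x3 templates.
        return [v + 1 for v in tile]

    output_memory = []
    for start in range(0, len(external_memory), tile_size):
        tile = external_memory[start:start + tile_size]  # into input cache
        for _ in range(n_iter_units):                    # iteration-unit chain
            tile = step(tile)
        output_memory.extend(tile)                       # into output cache
    return output_memory
```

The point of the structure is that each iteration unit performs one iteration of the network, so chaining units trades hardware area for pipeline throughput instead of looping within a single module.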


Abstract

The invention discloses a cellular neural network hardware architecture optimization method. The method comprises constructing a cellular neural network hardware architecture and carrying out system-level, module-level, and design-space-level optimization of a computing acceleration unit. The architecture consists of an external memory, a memory interface controller, an on-chip input cache, an on-chip output cache, the computing acceleration unit, and a bus. The computing acceleration unit comprises a plurality of iteration units connected in sequence, and each iteration unit comprises a plurality of parallel operation modules. Data is processed in the computing acceleration unit, the result is written into the on-chip output cache, and the operation of the whole cellular neural network is completed by the iteration units in a pipelined manner. System-level optimization realizes parallel computation of the cellular neural network; module-level optimization makes full use of the memory bandwidth of the hardware and reduces data transmission delay; and design-space-level optimization achieves optimal system computing performance under limited hardware resources.

Description

Technical field

[0001] The invention belongs to the field of hardware accelerator design, and in particular relates to an optimization method for a cellular neural network hardware architecture.

Background technique

[0002] With the increasing demand of artificial intelligence for low-power devices, the shortcomings of traditional image processing applications, such as low data processing speed and high power consumption, are becoming more and more obvious. As an effective means to improve processing performance and reduce energy consumption, the cellular neural network has gradually been applied in fields such as noise elimination, edge detection, and path planning, and has attracted extensive attention from both academia and industry.

[0003] The cellular neural network is a locally connected nonlinear structure composed of a large number of cells. Each cell has a template composed of a 3x3 matrix and is connected to its 8 adjacent cells. The parameters in the tem...
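The cell dynamics described above, where each cell is coupled to its 8 neighbors through 3x3 templates, correspond to the standard Chua-Yang cellular neural network model. The following sketch is a generic forward-Euler discretization of that model, not code from the patent; the function name, the zero-padding at the borders, and the step size `dt` are illustrative assumptions.

```python
import numpy as np

def cnn_step(x, u, A, B, z, dt=0.1):
    """One forward-Euler step of the Chua-Yang cellular neural network:
        dx/dt = -x + A (*) y + B (*) u + z,   y = 0.5 * (|x+1| - |x-1|)
    where (*) is correlation with a 3x3 template over the 8-neighborhood
    (zero padding at the borders), A is the feedback template, B the
    feedforward (control) template, and z the bias."""
    y = 0.5 * (np.abs(x + 1.0) - np.abs(x - 1.0))  # saturated cell output
    yp = np.pad(y, 1)                               # zero-padded neighbors
    up = np.pad(u, 1)
    H, W = x.shape
    fb = np.zeros_like(x)                           # feedback term  A (*) y
    ff = np.zeros_like(x)                           # feedforward term B (*) u
    for di in range(3):
        for dj in range(3):
            fb += A[di, dj] * yp[di:di + H, dj:dj + W]
            ff += B[di, dj] * up[di:di + H, dj:dj + W]
    return x + dt * (-x + fb + ff + z)
```

Iterating `cnn_step` until the state settles is exactly the repeated operation that the patent's iteration units implement in hardware, with the templates A and B held in the template RAM.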

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/06
CPC: G06N3/061
Inventor: 卓成, 刘仲阳
Owner: ZHEJIANG UNIV