An OPU instruction set definition method for CNN acceleration

An instruction set definition technique, applied in program control design, biological neural network models, instruments, etc., that addresses the difficulty of controlling operation starting points, the large uncertainty of instruction execution time, and the limited variation of starting conditions, and thereby achieves shorter instructions, accurate prediction of instruction order, and improved universality.

Active Publication Date: 2019-07-26
深圳市比昂芯科技有限公司

AI Technical Summary

Problems solved by technology

When external memory is used, cycle-level simulation of memory read and write operations is not very accurate, because refresh time and other overhead may arise during external memory access. If an instruction is executed immediately after decoding, the order of operations can be controlled only by the order of the instruction sequence; and if operation cycles are not simulated accurately, it becomes difficult to control the starting points of operations executed in parallel. Meanwhile, the starting conditions of the main operations vary little and are usually triggered once a preceding step reaches a certain state, so instruction execution time is highly uncertain. An instruction set definition method is therefore needed that overcomes these problems, provides an OPU instruction set that reorganizes the mappings of networks with different structures into a specific structure, and optimizes the universality of the processor controlled by the instructions, so that different target networks can be configured through instructions and general CNN acceleration can be realized through the OPU.



Examples

Embodiment 1

[0050] A method for defining an OPU instruction set for CNN acceleration includes defining conditional instructions, defining unconditional instructions, and setting the instruction granularity.

[0051] When the defined instruction set is used for CNN acceleration, it is necessary to define the instruction types, the operation corresponding to each instruction, the general parameters, and the instruction granularity. The general parameters include the instruction length and the instruction sequence. Running OPU instructions includes step 1: read an instruction block (the instruction set is the list of all instructions; an instruction block is a group of consecutive instructions, and the instructions used to execute one network comprise multiple instruction blocks); step 2: the unconditional instructions in the instruction block are obtained and executed directly, and the parameters contained in the unconditional instructions ar...
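
The run-time behavior described here and in the abstract can be sketched in a few lines of Python. This is a minimal software model, not the patented hardware: the class names `Unconditional`, `Conditional` and `Opu`, the `signal` method, and the string event names that stand in for hard-wired trigger conditions are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Unconditional:
    params: dict  # values to write into the parameter register


@dataclass
class Conditional:
    name: str
    trigger: str  # hypothetical event name standing in for a hard-wired trigger


@dataclass
class Opu:
    param_reg: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def load_block(self, block):
        """Read one instruction block."""
        for ins in block:
            if isinstance(ins, Unconditional):
                # Executed directly on read: overwrite parameter register entries.
                self.param_reg.update(ins.params)
            else:
                # Conditional instructions wait for their trigger condition.
                self.pending.append(ins)

    def signal(self, event):
        """Report that a trigger condition (modelled as an event name) is met."""
        still_waiting = []
        for ins in self.pending:
            if ins.trigger == event:
                # Execute with the parameters currently held in the register.
                self.log.append((ins.name, dict(self.param_reg)))
            else:
                still_waiting.append(ins)
        self.pending = still_waiting


opu = Opu()
opu.load_block([
    Unconditional({"addr": 0, "n": 4}),
    Conditional("read", "start"),
    Conditional("compute", "read_done"),
])
opu.signal("read_done")  # only "compute" fires; "read" keeps waiting
```

In this model the order in which conditional instructions run is decided by when their triggers arrive, not by their position in the block, which is the property the text relies on to avoid depending on exact cycle counts.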

Embodiment 2

[0064] Based on Embodiment 1, the conditional instructions of the present application comprise six kinds of instructions, including read-storage instructions, write-storage instructions, data capture instructions, data post-processing instructions and calculation instructions. A conditional instruction is executed after its trigger condition, which is hard-wired in hardware, is met. The register of a conditional instruction includes a parameter register and a trigger condition register, and the conditional instruction is configured with the parameters provided by the unconditional instructions.

[0065] The read-storage instruction includes a read-storage operation according to mode A1 and a read-storage operation according to mode A2. The parameters that can be configured for the read-storage instruction include the starting address, the number of operands, the post-read processing mode, and the on-chip storage location.

[0066] Mode A1: read n numbers backward from the specified address, where n is a p...
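
As a rough illustration of the four configurable parameters and of a mode-A1-style read, the sketch below models external memory as a Python list. The names `ReadParams` and `read_mode_a1` are hypothetical, and the sketch assumes that "backward from the specified address" means the n consecutive values starting at that address and continuing toward higher addresses; the source text is truncated here, so that reading is an assumption, and the post-read processing mode and on-chip location are carried as opaque labels only.

```python
from dataclasses import dataclass


@dataclass
class ReadParams:
    start_address: int      # starting address in external memory
    num_operands: int       # n, the number of values to read
    post_read_mode: str     # post-read processing mode (opaque label in this sketch)
    on_chip_location: str   # destination buffer name (opaque label in this sketch)


def read_mode_a1(memory, params):
    """Return n consecutive values starting at start_address (assumed direction)."""
    end = params.start_address + params.num_operands
    if params.start_address < 0 or end > len(memory):
        raise IndexError("read range falls outside the modelled memory")
    return memory[params.start_address:end]


memory = list(range(100, 110))
values = read_mode_a1(memory, ReadParams(2, 3, "none", "buf0"))
```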

Embodiment 3

[0076] Based on Embodiment 1, when the instruction set is used for CNN acceleration, the instruction sequence contains runs of consecutive repeated instructions. The instruction set definition therefore also defines how the instruction sequence is written: if the sequence contains several consecutive repetitions of an instruction, only a single instruction is set, and it is executed repeatedly until the contents of the trigger condition register and the parameter register are updated. In other words, only the first instruction of a run is defined, and the trigger condition register and the parameter register keep their contents until they are updated. This helps accelerate different target networks through quick configuration of instructions.
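
The effect of defining only the first instruction of a run can be sketched with a small encoder. The function name `compress` and the string instruction names are assumptions for illustration; the sketch only shows how a sequence shrinks when consecutive repeats are dropped, and does not model the registers that keep the instruction repeating on the hardware side.

```python
from itertools import groupby


def compress(sequence):
    """Keep only the first instruction of each run of consecutive repeats."""
    return [key for key, _run in groupby(sequence)]


full = ["read", "compute", "compute", "compute", "write"]
short = compress(full)            # consecutive "compute" entries collapse to one
saved = len(full) - len(short)    # number of instructions no longer stored
```

Non-adjacent repeats are kept, since only consecutive repetitions are covered by the rule in the text.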

[0077] Multiple parameters need to be defined in an unconditional instruction, so the corresponding instruction is long. In order to reduce the instruction length, a unified metho...

Abstract

The invention discloses an OPU instruction set definition method for CNN acceleration and relates to the field of instructions for CNN acceleration processors. The method comprises the steps of defining conditional instructions, defining unconditional instructions, and setting the instruction granularity. The unconditional instructions provide configuration parameters for the conditional instructions. Each conditional instruction has a trigger condition that is hard-wired in hardware and a corresponding trigger condition register, and the conditional instruction is executed after the trigger condition is met. An unconditional instruction is executed directly after being read, replacing the content of the parameter register. The calculation mode of the parallel input and output channels is selected, and the instruction granularity is set, according to the CNN network and the acceleration requirements. The instruction set provided by the invention avoids the problem that the order of instructions cannot be predicted because of the large uncertainty of operation cycles. The instruction set and the corresponding processor, the OPU, can be implemented on an FPGA or an ASIC. The OPU can accelerate different target CNN networks without hardware reconstruction.

Description

Technical field

[0001] The invention relates to the field of methods for defining CNN accelerator instruction sets, in particular to a method for defining an OPU instruction set for CNN acceleration.

Background

[0002] Deep convolutional neural networks (CNNs) have demonstrated high accuracy in various applications, such as visual object recognition, speech recognition, and object detection. However, this breakthrough in accuracy comes at a high computational cost, which requires acceleration by computing clusters, GPUs and FPGAs. Among these, FPGA accelerators have the advantages of high energy efficiency, good flexibility, and strong computing power, especially for deep CNN applications on edge devices, such as speech recognition and visual object recognition on smartphones. Developing such an accelerator usually involves architecture exploration and optimization, RTL programming, hardware implementation, and software-hardware interface development. With the development, ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/30; G06N3/04
CPC: G06F9/30003; G06N3/045; Y02D10/00
Inventor: 喻韵璇, 王铭宇
Owner: 深圳市比昂芯科技有限公司