
Scalable parallel data loading device and method

A data-loading technology, applied in the field of key processing devices for the input data of hardware accelerators, that addresses the problems of wasted accelerator resources during data reuse, overly complicated data segmentation and arrangement schemes, and excessive buffer sizes. Its effects include reduced space complexity, improved practical efficiency, and simplified software arrangement of input data.

Pending Publication Date: 2021-07-09
苏州芯启微电子科技有限公司

AI Technical Summary

Problems solved by technology

In the inventions described in patent documents 1 and 3, the varying sizes of the different neural-network layers and their differing degrees of data reuse cause accelerator resources to be wasted, so other heterogeneous processors must be enlisted to help resolve data-related problems. The storage method described in patent document 3 must back up additional data, resulting in an excessively large buffer. The method of patent document 2 adopts the idea of reconfigurable computing; although it pays great attention to avoiding wasted resources, its data segmentation and arrangement schemes are very complicated. The method of patent document 4 is too tightly coupled to the design of the central processing unit, making its design and implementation complexity too high.




Embodiment Construction

[0029] The present invention will be described in further detail below with reference to the accompanying drawings and examples.

[0030] Figure 1 is a structural diagram of a scalable parallel data loading device in a deep convolutional neural network hardware accelerator of the present invention, which comprises a parallel input register array (IRA) 202 and a parallel input data access engine (IDE) 203. The figure also illustrates a simplified connection design between the device of the present invention and the parallel hardware computing element array (PEA) 1. The array 1 is composed of several parallel hardware computing units (PE) 101; there is a one-to-one fixed connection between each PE and an output of the IDE, which greatly simplifies circuit complexity, area, and power consumption.
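As a rough structural sketch of the arrangement just described (not taken from the patent; all class and method names are hypothetical), the following Python model captures the key point of paragraph [0030]: the IDE reads the IRA and drives one output lane per PE, with lane i hard-wired to PE i, so no crossbar or arbitration sits between them.

```python
# Hypothetical structural model of IRA 202, IDE 203 and PEA 1; the patent
# provides no code, so names and interfaces here are illustrative only.
from dataclasses import dataclass, field
from typing import List

@dataclass
class InputRegisterArray:              # IRA 202: buffers a tile of the input tensor
    registers: List[float] = field(default_factory=list)

@dataclass
class InputDataEngine:                 # IDE 203: reads the IRA, drives N output lanes
    ira: InputRegisterArray
    num_lanes: int

    def read_lanes(self, offsets: List[int]) -> List[float]:
        # One register read per lane; lane i is fixed-wired to PE i,
        # so no crossbar or arbitration logic is required.
        assert len(offsets) == self.num_lanes
        return [self.ira.registers[o] for o in offsets]

@dataclass
class ProcessingElementArray:          # PEA 1: num_pes parallel PEs (101)
    num_pes: int

    def consume(self, lane_data: List[float]) -> None:
        assert len(lane_data) == self.num_pes  # one-to-one fixed connection
        # ... each PE would begin its multiply-accumulate here ...
```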

[0031] The connection design of the registers constituting the parallel input register array (IRA) 202 may adopt different design architectures, such as full connection-multip...
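The sentence above is cut off after "full connection-multip...". Assuming it refers to a multiplexer-based full-connection scheme, a minimal sketch of such a design (hypothetical function, not from the patent) could look like this, with one N-to-1 multiplexer per output port:

```python
# Hypothetical full-connection (crossbar) register read: every output port
# can select any register through its own N-to-1 multiplexer.
from typing import List, Sequence

def crossbar_select(registers: Sequence[float], selects: List[int]) -> List[float]:
    # Output port i passes register selects[i]. Maximum routing flexibility,
    # but multiplexer area grows with the register count per port.
    return [registers[s] for s in selects]
```

The fixed one-to-one IDE-to-PE wiring of paragraph [0030] sits at the opposite end of this trade-off: no multiplexer cost, but no routing flexibility.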



Abstract

The invention provides a scalable parallel data loading device and method for accelerating the parallel loading of data tensors. The device can be conveniently scaled and expanded, provides data input for parallel execution units of any scale, and offers high data bandwidth. It comprises a parallel input register array, whose size follows a fixed rule relative to the size of the computing-unit array so that the hardware can be customized from that rule, and a parallel input data access engine, which accesses the data in the register array in parallel using a specific control algorithm and circuit structure optimized for chip implementation. The method comprises an input-data transformation that determines the order of the data after dimension reduction. Together these form a hardware data-processing scheme, comprising a transformation algorithm and an addressing rule, that improves the utilization of the local spatial information of the input data, provides high-bandwidth data input to the parallel acceleration computing units, and reduces the number of accesses to main memory.
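The abstract does not disclose the concrete transformation algorithm or addressing rule. As a stand-in, the well-known im2col lowering illustrates the general idea of a dimension-reducing transform: overlapping convolution windows are flattened into rows so that each parallel computing unit can be fed one contiguous row (illustrative sketch only, not the patent's method).

```python
# Illustrative only: im2col-style lowering as a stand-in for the patent's
# (undisclosed) input-data transformation.
import numpy as np

def im2col(x: np.ndarray, k: int, stride: int = 1) -> np.ndarray:
    # Flatten each k-by-k window of a 2-D input into one row. Overlapping
    # windows share elements -- the reuse an on-chip register array exploits
    # to cut main-memory traffic.
    h, w = x.shape
    rows = []
    for i in range(0, h - k + 1, stride):
        for j in range(0, w - k + 1, stride):
            rows.append(x[i:i + k, j:j + k].ravel())
    return np.stack(rows)              # shape: (num_windows, k * k)

x = np.arange(16, dtype=np.float32).reshape(4, 4)
print(im2col(x, k=3).shape)            # (4, 9): four overlapping 3x3 windows
```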

Description

Technical field

[0001] The invention belongs to the fields of computer hardware, hardware acceleration for the deployment of artificial neural network algorithms, tensor-computing hardware, and digital integrated circuit design, and specifically relates to a key processing device for the input data of a deep convolutional neural network hardware acceleration chip, and to its design method.

Background technique

[0002] A deep convolutional neural network algorithm is composed of multiple specific neuron algorithm layers and hidden layers, chiefly convolutional layers, whose main operator is the convolution of matrices or vectors. The main characteristics of this computing task are a large amount of input data, the coupling of the input data with spatial feature information, frequent overlap between the data used by each convolution and data that has already been computed, and input data that is typically in tensor format. The required calculation data is ext...
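As a small numeric illustration of the overlap described in [0002] (hypothetical helper, not from the patent): in a stride-1 k-by-k convolution, an interior input element is read by up to k² different output positions, so buffering it on chip saves up to k² − 1 main-memory fetches.

```python
# Hypothetical helper quantifying input-data reuse in a convolution.
def reuse_count(k: int, stride: int = 1) -> int:
    # Number of output windows that touch one interior input element:
    # ceil(k / stride) window start positions per axis, squared for 2-D.
    per_axis = (k + stride - 1) // stride
    return per_axis * per_axis

print(reuse_count(3))   # 9: a 3x3 stride-1 kernel reads each interior pixel 9 times
```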


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F30/373; G06F30/27; G06N3/04; G06N3/08
CPC: G06F30/373; G06F30/27; G06N3/08; G06N3/045
Inventor: 杨旭光
Owner: 苏州芯启微电子科技有限公司