GPU data processing system based on multi-input single-output FIFO structure

A data processing system and single-output technology, applied in the field of GPU data processing. It addresses the problems that a conventional FIFO cannot accept multiple pieces of information in parallel, which blocks the parallel output information channels and reduces GPU data processing efficiency. The effects are that blockage is avoided, data acquisition efficiency is improved, and the system has wide application value.

Active Publication Date: 2022-06-21
METAX INTEGRATED CIRCUITS (SHANGHAI) CO LTD

AI Technical Summary

Problems solved by technology

[0002] GPU-based data processing involves many parallel processing scenarios, and the information output in parallel must be stored in a FIFO (First In First Out) queue for subsequent use. However, an existing FIFO is a first-in, first-out queue that accepts only one piece of information at a time and cannot accept multiple pieces in parallel. In a parallel output scenario, the parallel outputs must therefore be written one at a time, which inevitably blocks the parallel output information channels and reduces GPU data processing efficiency.



Examples


Embodiment 1

[0015] The allocation unit transmits each second data acquisition request to the corresponding cache memory based on the cache memory identification information in that request. After passing through the allocation unit, the N second data acquisition requests are divided into P paths.
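The routing step in paragraph [0015] can be sketched in software as follows. This is only an illustrative model, not the patented hardware: the function name `allocate`, the dictionary fields `cache_id` and `addr`, and the use of zero-based cache indices are all assumptions made for the example.

```python
from typing import Dict, List


def allocate(requests: List[Dict], p: int) -> List[List[Dict]]:
    """Split N requests into P paths, one path per cache memory.

    The "cache_id" field is a hypothetical stand-in for the cache memory
    identification information carried by each second data acquisition request.
    """
    paths: List[List[Dict]] = [[] for _ in range(p)]
    for req in requests:
        cache_id = req["cache_id"]
        if not 0 <= cache_id < p:
            raise ValueError(f"cache_id {cache_id} is outside 0..{p - 1}")
        # Requests for the same cache keep their arrival order on that path.
        paths[cache_id].append(req)
    return paths


# Three requests (N=3) routed across four caches (P=4).
requests = [
    {"cache_id": 2, "addr": 0x1000},
    {"cache_id": 0, "addr": 0x0040},
    {"cache_id": 2, "addr": 0x1040},
]
paths = allocate(requests, p=4)
```

Here two requests land on the path for cache 2 and one on the path for cache 0, while the other two paths stay empty.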

[0016] As an embodiment, each of the P cache memories corresponds to one physical address storage interval and is used to obtain from memory the data corresponding to the physical addresses in that interval; the P physical address storage intervals do not overlap. It can be understood that, based on the correspondence between each cache memory and its physical address storage interval, the upstream device can directly specify the corresponding cache memory when sending the first data acquisition request. Each of the physical address storage intervals includes a plurality of...

Embodiment 2

[0029] To illustrate with a specific example, assume P=4 and that input ports F2 and F4 each receive a third data acquisition request. The request on input port F2 is mapped to output port E1, and the request on input port F4 is mapped to output port E2. Output ports E1 and E2 then store the requests from input ports F2 and F4 into the FIFO in parallel: E1 stores the request from F2 in row WR of the FIFO, and E2 stores the request from F4 in row WR+1 of the FIFO.
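The P=4 example above can be modelled with a short software sketch. It is a behavioural illustration under stated assumptions, not the hardware design: the class name `MultiInputFifo`, the read pointer, the fixed depth, the wrap-around addressing, and the update rule WR = WR + Q are all assumptions, since the text only says that WR is updated after the parallel store.

```python
from typing import List, Optional


class MultiInputFifo:
    """Behavioural model of a multi-input single-output FIFO."""

    def __init__(self, depth: int) -> None:
        self.depth = depth
        self.rows: List[Optional[str]] = [None] * depth
        self.wr = 0     # write pointer: next row to be written (WR)
        self.rd = 0     # read pointer (assumption; not described in the text)
        self.count = 0  # number of occupied rows

    def push_parallel(self, ports: List[Optional[str]]) -> int:
        """Store the requests present on the P input ports; return Q."""
        # Mapper: compact the active input ports, in port order, onto E1..EQ.
        outputs = [req for req in ports if req is not None]
        q = len(outputs)
        if self.count + q > self.depth:
            raise OverflowError("not enough free rows for Q parallel writes")
        # Output port E(i+1) writes row WR+i. Hardware would do these writes
        # at the same time; the loop only models the resulting row contents.
        for i, req in enumerate(outputs):
            self.rows[(self.wr + i) % self.depth] = req
        self.wr = (self.wr + q) % self.depth  # assumed update: WR advances by Q
        self.count += q
        return q

    def pop(self) -> Optional[str]:
        """Single output: remove and return the oldest stored request."""
        if self.count == 0:
            raise IndexError("FIFO is empty")
        req = self.rows[self.rd]
        self.rows[self.rd] = None
        self.rd = (self.rd + 1) % self.depth
        self.count -= 1
        return req


# P=4 input ports F1..F4; only F2 and F4 carry a request.
fifo = MultiInputFifo(depth=8)
q = fifo.push_parallel([None, "req_F2", None, "req_F4"])
first = fifo.pop()
```

With WR starting at 0, the request from F2 lands in row 0 and the request from F4 in row 1, and the single read port then returns the F2 request first.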

[0030] In Embodiment 2 of the present invention, by providing the mapper, the multi-input single-output FIFO, and the write pointer, Q third data acquisition requests processed in parallel can be input into the FIFO in parallel, avoiding blockage of any third data acquisition request channel and improving the data acquisition effi...


Abstract

The invention relates to a GPU (Graphics Processing Unit) data processing system based on a multi-input single-output FIFO (First In First Out) structure, comprising a mapper, a FIFO and a write pointer. The mapper comprises P input ports and P output ports {E1, E2, ..., EP}. The P input ports are used to input Q third data acquisition requests in parallel and map them to the first Q output ports {E1, E2, ..., EQ}, and the first Q output ports store the Q third data acquisition requests into the FIFO. The FIFO is a multi-input single-output FIFO that accepts Q third data acquisition requests in parallel and outputs the stored requests one at a time. The write pointer always points to the next row of the FIFO to be written; the row it currently points to has the value WR, and WR is updated after the mapper stores the Q third data acquisition requests in the FIFO in parallel. The invention improves the data processing efficiency of the GPU.

Description

Technical field

[0001] The invention relates to the technical field of GPU data processing, in particular to a GPU data processing system based on a multi-input single-output FIFO structure.

Background

[0002] GPU-based data processing involves many parallel processing scenarios, and the information output in parallel must be stored in a FIFO (First In First Out) queue for subsequent use. However, an existing FIFO is a first-in, first-out queue that accepts only one piece of information at a time and cannot accept multiple pieces in parallel. In a parallel output scenario, the parallel outputs must therefore be written one at a time, which inevitably blocks the parallel output information channels and reduces GPU data processing efficiency. How to realize parallel multi-input for a FIFO and improve the data processing efficiency of the GPU has therefore become a technical problem to be solv...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/50, G06K9/62, G06T1/20, G06F5/06, G06F12/06, G06F12/0877
CPC: G06F9/5011, G06T1/20, G06F5/065, G06F12/0646, G06F12/0877, G06F18/25, Y02D10/00
Inventor: name not disclosed
Owner METAX INTEGRATED CIRCUITS (SHANGHAI) CO LTD