CUDA multi-thread processing method and system and related equipment

A multi-thread processing method for CUDA, applied in the computer field. It addresses the increased execution delay of kernel functions and the resulting loss of parallel-processing efficiency in CUDA, and it saves hardware cost, improves efficiency, and reduces time overhead.

Active Publication Date: 2022-02-08
AZURENGINE TECH ZHUHAI INC

AI Technical Summary

Problems solved by technology

Before a kernel function executes its threads, an index usually has to be generated for each thread. On low-complexity hardware, generating all the indexes in a thread block takes many clock cycles. This increases the execution delay of the kernel function, which reduces the efficiency of parallel processing in CUDA.



Embodiment Construction

[0100] To help those skilled in the art better understand the solution of the present application, the technical solutions in the embodiments of the application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by persons of ordinary skill in the art from the embodiments in this application, without creative effort, fall within the scope of protection of this application.

[0101] The terms "comprising" and "having" and any variations thereof appearing in the specification, claims and drawings of this application are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally furth...


Abstract

The invention provides a CUDA (Compute Unified Device Architecture) multi-thread processing method, system and related equipment. The method comprises the following steps. Configuration information corresponding to a kernel function is obtained. If the historical configuration information contains no target historical configuration information matching this configuration information, a three-dimensional thread index is generated according to the configuration information; the generated index is then compressed and packaged according to the configuration information and stored in memory. If matching target historical configuration information does exist, the historical three-dimensional index corresponding to it is obtained; that historical index is then compressed and packaged according to the target historical configuration information and stored in memory. According to the embodiments of the invention, the efficiency of multi-thread parallel processing in CUDA can be improved.
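The flow in the abstract can be sketched on the host side in C++. This is only an illustration under stated assumptions: `Config`, `Index3`, `IndexStore`, `prepare` and `bits_for` are hypothetical names, the configuration is reduced to the thread-block dimensions, and the text here does not say how the index is compressed and packaged, so fixed-width bit fields sized from the block dimensions stand in for that step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical configuration: only the thread-block dimensions are modeled.
struct Config {
    std::uint32_t x, y, z;
    bool operator<(const Config& o) const {
        return std::tie(x, y, z) < std::tie(o.x, o.y, o.z);
    }
};

struct Index3 { std::uint32_t x, y, z; };

// Number of bits needed to represent the values 0 .. n-1.
inline std::uint32_t bits_for(std::uint32_t n) {
    std::uint32_t b = 0;
    while ((1u << b) < n) ++b;
    return b;
}

class IndexStore {
public:
    // Returns the packed indices for cfg. *reused (if given) reports whether
    // a matching historical configuration was found.
    std::vector<std::uint32_t> prepare(const Config& cfg, bool* reused = nullptr) {
        auto it = history_.find(cfg);
        const bool hit = (it != history_.end());
        if (!hit) {
            // No matching history: generate the 3D indices and remember them.
            it = history_.emplace(cfg, generate(cfg)).first;
        }
        if (reused) *reused = hit;
        // Both paths compress and package according to the configuration.
        return pack(it->second, cfg);
    }

private:
    static std::vector<Index3> generate(const Config& c) {
        // Assumed bounds so that the packed fields fit in one 32-bit word.
        assert(c.x >= 1 && c.x <= 1024 && c.y >= 1 && c.y <= 1024 &&
               c.z >= 1 && c.z <= 1024);
        std::vector<Index3> out;
        out.reserve(static_cast<std::size_t>(c.x) * c.y * c.z);
        for (std::uint32_t z = 0; z < c.z; ++z)
            for (std::uint32_t y = 0; y < c.y; ++y)
                for (std::uint32_t x = 0; x < c.x; ++x)
                    out.push_back(Index3{x, y, z});
        return out;
    }

    // Assumed packing scheme: x, y, z stored as adjacent bit fields in one word.
    static std::vector<std::uint32_t> pack(const std::vector<Index3>& idx,
                                           const Config& c) {
        const std::uint32_t bx = bits_for(c.x), by = bits_for(c.y);
        std::vector<std::uint32_t> memory;  // stands in for the target memory
        memory.reserve(idx.size());
        for (const Index3& i : idx)
            memory.push_back(i.x | (i.y << bx) | (i.z << (bx + by)));
        return memory;
    }

    std::map<Config, std::vector<Index3>> history_;
};
```

Calling `prepare` twice with the same `Config` generates the indices only on the first call; the second call reuses the stored history and repeats only the packing step.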

Description

Technical field

[0001] The present application relates to the field of computer technology, and in particular to a CUDA multi-thread processing method, system and related equipment.

Background art

[0002] CUDA (Compute Unified Device Architecture) is a computing platform launched by the graphics card manufacturer NVIDIA. It uses the C language as its programming language and provides extensive capabilities for developing high-performance computing instructions. Computation in CUDA is built on kernel functions (kernels) and threads: a kernel function corresponds to a thread grid, a thread grid contains several thread blocks, and a thread block contains several threads. Before a kernel function executes its threads, an index usually has to be generated for each thread. On low-complexity hardware, generating all the indexes in a thread block takes many clock cycles. The executio...
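To make the cost of index generation concrete, here is a minimal C++ sketch that enumerates every three-dimensional thread index in one block sequentially, one index per loop iteration. `Dim3`, `ThreadIdx`, `generate_block_indices` and `linear_id` are hypothetical names for this example, not CUDA types or functions. The loop order follows the CUDA convention that x varies fastest.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for CUDA's dim3 / threadIdx; not the real CUDA types.
struct Dim3 { std::uint32_t x, y, z; };
struct ThreadIdx { std::uint32_t x, y, z; };

// Generate every three-dimensional thread index in one block, one per loop
// iteration, with x varying fastest.
std::vector<ThreadIdx> generate_block_indices(const Dim3& block) {
    assert(block.x > 0 && block.y > 0 && block.z > 0);
    std::vector<ThreadIdx> out;
    out.reserve(static_cast<std::size_t>(block.x) * block.y * block.z);
    for (std::uint32_t z = 0; z < block.z; ++z)
        for (std::uint32_t y = 0; y < block.y; ++y)
            for (std::uint32_t x = 0; x < block.x; ++x)
                out.push_back(ThreadIdx{x, y, z});
    return out;
}

// Linear thread ID within the block: x + y*Dx + z*Dx*Dy.
std::uint32_t linear_id(const ThreadIdx& t, const Dim3& block) {
    return t.x + t.y * block.x + t.z * block.x * block.y;
}
```

For a 4 x 2 x 2 block the loop body runs 16 times. A generator that produces one index per step pays a cost proportional to the block size on every kernel launch unless the result is reused.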


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/38, G06T1/20
CPC: G06F9/3851, G06T1/20, Y02D10/00
Inventor: 雷宇李原朱建斌付尧永田敏雄
Owner: AZURENGINE TECH ZHUHAI INC