
Method and system for managing nested execution streams

A method and system for managing nested execution streams, applied in the fields of electric digital data processing, multiprogramming arrangements, and program control design; it addresses problems such as negative impact on system performance and achieves the effect of improved processing efficiency.

Active Publication Date: 2013-12-04
NVIDIA CORP

AI Technical Summary

Problems solved by technology

If the GPU uses locking while queuing new tasks, system performance will be negatively affected
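
To make the problem concrete, the following is a minimal sketch (not taken from the disclosure) of a device-side task queue guarded by a single lock. Every block that wants to enqueue work must spin on the same lock word, so enqueuers serialize and stall one another; the queue layout, capacity, and all names here are illustrative assumptions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical fixed-capacity queue guarded by one device-side lock.
// gLock, gQueue, and gTail are illustrative names, not from the disclosure.
#define QUEUE_CAPACITY 1024
__device__ int gLock = 0;                 // 0 = free, 1 = held
__device__ int gQueue[QUEUE_CAPACITY];
__device__ int gTail = 0;

__global__ void enqueueWithLock(int value)
{
    if (threadIdx.x != 0) return;         // one enqueuer per block

    // Spin until the lock is acquired; every waiting block burns cycles
    // here, which is the contention the disclosure seeks to avoid.
    while (atomicCAS(&gLock, 0, 1) != 0) { }

    int slot = gTail;
    if (slot < QUEUE_CAPACITY) {
        gQueue[slot] = value + blockIdx.x;
        gTail = slot + 1;
    }

    __threadfence();                      // publish the queue update
    atomicExch(&gLock, 0);                // release the lock
}

int main()
{
    enqueueWithLock<<<256, 32>>>(0);
    cudaDeviceSynchronize();

    int tail = 0;
    cudaMemcpyFromSymbol(&tail, gTail, sizeof(int));
    printf("enqueued %d tasks\n", tail);
    return 0;
}
```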




Embodiment Construction

[0021] In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without one or more of these specific details.

[0022] System Overview

[0023] Figure 1 is a block diagram illustrating a computer system 100 configured to implement one or more aspects of the present invention. Computer system 100 includes a central processing unit (CPU) 102 and system memory 104 in communication via an interconnection path that may include a memory bridge 105. The memory bridge 105 may be, for example, a north bridge chip, connected to an I/O (input/output) bridge 107 via a bus or other communication path 106 (e.g., a HyperTransport link). I/O bridge 107, which may be, for example, a south bridge chip, receives user input from one or more user input devices 108 (e.g., keyboard, mouse) and forwards the input to C...



Abstract

One embodiment of the present disclosure sets forth an enhanced way for GPUs to queue new computational tasks into a task metadata descriptor queue (TMDQ). Specifically, memory for context data is pre-allocated when a new TMDQ is created. A new TMDQ may be integrated with an existing TMDQ, in which case the computational tasks within that TMDQ include tasks from each of the original TMDQs. A scheduling operation is executed on completion of each computational task in order to preserve sequential execution of tasks without the use of atomic locking operations. One advantage of the disclosed technique is that GPUs are enabled to queue computational tasks within TMDQs, and also to create an arbitrary number of new TMDQs nested to any arbitrary depth, without intervention by the CPU. Processing efficiency is enhanced because the GPU does not need to wait while the CPU creates and queues tasks.
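
The disclosure does not tie the technique to any particular programming interface, but the capability it describes, a kernel queuing further work on the GPU itself to arbitrary nesting depth without a round trip to the CPU, is what CUDA dynamic parallelism exposes. The sketch below only illustrates that usage pattern, not the internal TMDQ mechanism; kernel names and launch shapes are arbitrary.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Child work queued entirely from the device.
__global__ void childKernel(int parentBlock)
{
    printf("child launched by block %d, thread %d\n", parentBlock, threadIdx.x);
}

// Parent kernel: one thread per block creates a device-side stream and
// queues child work into it, with no CPU involvement after the initial launch.
__global__ void parentKernel()
{
    if (threadIdx.x == 0) {
        cudaStream_t s;
        // Device-created streams must be non-blocking.
        cudaStreamCreateWithFlags(&s, cudaStreamNonBlocking);
        childKernel<<<1, 4, 0, s>>>(blockIdx.x);
        cudaStreamDestroy(s);
    }
    // The parent kernel does not complete until its child kernels complete.
}

int main()
{
    // Dynamic parallelism typically requires sm_35 or newer and
    // compilation with: nvcc -rdc=true -lcudadevrt
    parentKernel<<<2, 32>>>();
    cudaDeviceSynchronize();
    return 0;
}
```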

Description

Technical Field

[0001] The present invention relates generally to computer architecture, and, more particularly, to methods and systems for managing nested execution streams.

Background

[0002] In a conventional computing system having both a central processing unit (CPU) and a graphics processing unit (GPU), the CPU determines which specific computing tasks are performed by the GPU and in what order. GPU computing tasks typically comprise highly parallel, highly similar operations across parallel datasets, such as images or sets of images. In a conventional GPU execution model, the CPU initiates a specific computing task by selecting a corresponding thread program and directing the GPU to execute a set of parallel instances of that thread program. In the conventional GPU execution model, the CPU is often the only entity that can initiate execution of a thread program on the GPU. After all thread instances have finished executing, the GPU must notify the CPU and wait for...
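
For contrast with the disclosed technique, here is a minimal host-driven sketch of the conventional model described above: the CPU selects each task, launches it, and synchronizes before it can decide on and queue the next one, so the GPU idles during every round trip. The kernel, array size, and scale factors are arbitrary choices for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One stage of GPU work; in the conventional model each stage is
// selected and ordered by the CPU.
__global__ void stage(float *data, int n, float scale)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * scale + 1.0f;
}

int main()
{
    const int n = 1 << 20;
    float *d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));

    // The CPU queues one task, waits for the GPU to finish and notify it,
    // then queues the next -- the serialization the disclosure avoids.
    for (int pass = 0; pass < 4; ++pass) {
        stage<<<(n + 255) / 256, 256>>>(d, n, 0.5f * (pass + 1));
        cudaDeviceSynchronize();
    }

    float firstElement = 0.0f;
    cudaMemcpy(&firstElement, d, sizeof(float), cudaMemcpyDeviceToHost);
    printf("first element after 4 passes: %f\n", firstElement);

    cudaFree(d);
    return 0;
}
```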


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F9/50
CPC: G06F9/50; G06F9/46; G06F2209/483; G06F9/4881
Inventor: Luke Durant
Owner: NVIDIA CORP