
A method, device and computer storage medium for analyzing GPU performance

A technique for analyzing a GPU performance model, applied in the field of GPU performance analysis. It addresses the low running efficiency, poor scalability, and large error rates of existing approaches, with the aim of reducing the error rate and producing more accurate performance data.

Active Publication Date: 2022-05-27
芯瞳半导体技术(山东)有限公司

AI Technical Summary

Problems solved by technology

[0004] The simulation modeling method faithfully simulates the hardware execution process and yields realistic simulation data. However, because the simulation model has to simulate execution on the real GPU, its running efficiency is low. If the GPU architecture is adjusted, the simulation model must be rebuilt for the adjusted architecture. Using simulation modeling for GPU performance statistics therefore suffers from relatively poor scalability and a long development cycle.
With the analytical modeling method, the analysis model does not simulate the actual execution of each instruction; it only performs modeling and analysis on the input instruction information to obtain the performance results. Analytical modeling therefore computes GPU performance statistics very efficiently, has a simple structural design, and scales well. In practice, however, if the analysis model does not process the instructions at a fine enough granularity, the final GPU performance statistics will have a large error rate.

Embodiment Construction

[0024] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.

[0025] The instructions that a GPU can process generally fall into three types: arithmetic logic instructions, memory access instructions, and branch and jump instructions. Conventional analytical modeling of GPU performance statistics covers arithmetic logic instructions and memory access instructions, and can accurately calculate the execution time of each instruction of those two types. Branch and jump instructions are a different matter: existing conventional solutions do not handle them at all, yet they have a large impact on instruction execution performance....
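To illustrate why branch and jump instructions matter for the estimate, here is a minimal Python sketch, not the patent's implementation. The program encoding (a list of `(kind, target)` tuples), the `branch_taken` mapping, and the restriction to forward jumps are all assumptions made for illustration. It shows that a thread which takes a jump skips the instructions in between, so a model that charges every thread for every instruction would overestimate that thread's work.

```python
def executed_indices(program, branch_taken):
    """Return the indices of the instructions one thread actually executes.

    program: list of (kind, target) tuples (hypothetical encoding); `target`
        is the jump destination index for "branch" entries, None otherwise.
    branch_taken: dict mapping a branch's index to True/False for this thread.
    Only forward jumps are supported in this sketch.
    """
    executed = []
    pc = 0
    while pc < len(program):
        kind, target = program[pc]
        executed.append(pc)
        if kind == "branch" and branch_taken.get(pc, False):
            if target <= pc:
                raise ValueError("this sketch only handles forward jumps")
            pc = target  # jump taken: instructions in between are skipped
        else:
            pc += 1
    return executed


program = [("alu", None), ("branch", 3), ("mem", None), ("alu", None)]
taken = executed_indices(program, {1: True})       # skips the memory access
not_taken = executed_indices(program, {1: False})  # executes everything
```

In this example the thread that takes the branch never issues the memory access at index 2, which is the kind of per-thread difference a branch-unaware model cannot capture.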

Abstract

The embodiments of the present invention disclose a method, device, and computer storage medium for analyzing GPU performance. The method may include: obtaining the instruction list produced by running the target program in a set environment, the number of threads to be started, and each thread's execution result for every instruction in the instruction list; having the simulation scheduler in the GPU performance model to be analyzed start thread simulators in that model according to the number of threads to be started; having each thread simulator traverse every instruction in the instruction list and, during the traversal, execute each instruction according to its instruction execution control value, so as to measure the duration of executing the traversed instruction; and, once all instructions in the instruction list have been traversed, obtaining the total execution time taken by all thread simulators to execute all instructions in the instruction list.
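The flow in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `LATENCY` table and its cycle counts, the representation of the instruction execution control value as a per-instruction boolean, and the choice to report the slowest thread as the total (assuming the threads run in parallel) are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical per-kind latencies in cycles; the values are illustrative only.
LATENCY = {"alu": 1, "mem": 4, "branch": 2}


@dataclass
class Instruction:
    name: str
    kind: str  # "alu", "mem", or "branch"


class ThreadSimulator:
    """Traverses the instruction list and accumulates the modeled duration."""

    def __init__(self, control_values):
        # Assumption: one boolean per instruction, True if this thread executes it.
        self.control_values = control_values
        self.elapsed = 0

    def run(self, instructions):
        for instr, execute in zip(instructions, self.control_values):
            if execute:
                self.elapsed += LATENCY[instr.kind]
        return self.elapsed


def simulate(instructions, control_values_per_thread):
    """Plays the scheduler's role: start one simulator per thread, then collect times."""
    simulators = [ThreadSimulator(cv) for cv in control_values_per_thread]
    per_thread = [sim.run(instructions) for sim in simulators]
    # Assumption: threads run in parallel, so the slowest one sets the total.
    return per_thread, max(per_thread)


instructions = [
    Instruction("add", "alu"),
    Instruction("jmp", "branch"),
    Instruction("load", "mem"),
    Instruction("mul", "alu"),
]
controls = [
    [True, True, True, True],   # thread 0 executes every instruction
    [True, True, False, True],  # thread 1 skips the load after its jump
]
per_thread, total = simulate(instructions, controls)
```

A different aggregation, such as summing the per-thread times, could be swapped in at the last line of `simulate` if the model treats the threads as serialized.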

Description

Technical field

[0001] Embodiments of the present invention relate to the technical field of graphics processing units (GPUs), and in particular to a method, an apparatus, and a computer storage medium for analyzing GPU performance.

Background

[0002] In GPU performance statistics, Instructions Per Cycle (IPC) is a relatively important GPU performance indicator. It represents how many instructions the GPU can process in each clock cycle, and it is calculated from the execution time and the main (clock) frequency of the system.

[0003] GPU performance statistics usually require modeling GPU performance. Specifically, GPU performance is usually modeled in two ways. One is simulation modeling, for example using software simulation to build a simulation model of the GPU and carrying out the real execution process on that model to obtain realistic performance data for the GPU; The s...
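As a small worked example of the IPC relationship described in the background, the sketch below uses the standard definition: the number of executed instructions divided by the number of elapsed clock cycles, where the cycle count is execution time multiplied by clock frequency. The function and parameter names are illustrative and do not come from the patent.

```python
def instructions_per_cycle(instruction_count, execution_time_s, clock_hz):
    """IPC = instructions / cycles, with cycles = execution time * clock frequency."""
    cycles = execution_time_s * clock_hz
    if cycles <= 0:
        raise ValueError("execution time and clock frequency must be positive")
    return instruction_count / cycles


# 1000 instructions in 0.5 s at 1000 Hz is 500 cycles, so IPC is 2.0.
ipc = instructions_per_cycle(1000, 0.5, 1000)
```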

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F11/34
CPC: G06F11/3457
Inventor: 齐航空, 张竞丹, 李亮
Owner: 芯瞳半导体技术(山东)有限公司