
Performance Evaluation of Algorithmic Tasks and Dynamic Parameterization on Multi-Core Processing Systems

A performance evaluation technology for multi-core processing systems. It addresses the following problems: conventional techniques do not take into account the optimizations possible with regard to DMA operations, the performance of DMA lists is often not as good as that of contiguous DMA, and DMA operations can affect the performance of the algorithmic tasks being performed. The effect achieved is efficient evaluation of the performance of algorithmic tasks.

Inactive Publication Date: 2009-06-04
GLOBALFOUNDRIES INC

AI Technical Summary

Benefits of technology

[0012] An illustrative embodiment of the present invention meets the above-noted need by providing techniques for more efficiently evaluating the performance of algorithmic tasks on a target multi-core processing system. Results of a benchmark indicative of a measure of performance of a template characterizing an algorithmic task to be evaluated on a target multi-core processing system can be collected and stored. The stored performance results can be used to dynamically determine optimal performance parameters with which to schedule a task at run-time.
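The run-time use of stored benchmark results can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Params` fields, the `RESULTS` table, the task name `fft_1d`, and the timing values are hypothetical and do not come from the patent text.

```python
from typing import Dict, NamedTuple


class Params(NamedTuple):
    """Hypothetical DMA-related parameters for one benchmark run."""
    block_bytes: int   # size of each DMA transfer (assumed parameter)
    num_buffers: int   # buffering depth, e.g. 2 for double buffering (assumed)


# Hypothetical stored benchmark results: measured seconds per task execution.
# The numbers are invented for illustration only.
RESULTS: Dict[str, Dict[Params, float]] = {
    "fft_1d": {
        Params(1024, 1): 4.0e-3,
        Params(4096, 2): 2.5e-3,
        Params(16384, 2): 2.9e-3,
    },
}


def best_params(task: str, max_block_bytes: int) -> Params:
    """Pick the fastest stored parameter set that fits the current limit."""
    candidates = {
        params: seconds
        for params, seconds in RESULTS[task].items()
        if params.block_bytes <= max_block_bytes
    }
    if not candidates:
        raise ValueError(f"no stored result for {task!r} fits {max_block_bytes} bytes")
    # Lowest measured time wins.
    return min(candidates, key=candidates.get)
```

A scheduler could call `best_params("fft_1d", max_block_bytes=16384)` just before dispatching the task, so the lookup costs a dictionary scan rather than a new benchmark run.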

Problems solved by technology

Unfortunately, these conventional techniques work offline and generate optimal code for fixed configurations. (Fastest Fourier Transform in the West (FFTW), a C subroutine library, may work dynamically at run-time, but that is only useful if the plan (i.e., outcome) is to be reused multiple times; otherwise it is more beneficial to store and reuse the plan rather than run it every time.)
Moreover, these techniques do not take into account optimizations possible with regard to DMA operations (e.g., they do not search the DMA parameter space).
Some of the issues involved include the following:
  • DMA operations tend to have high latencies, discouraging working iteratively on small blocks / vectors.
  • Performance of DMA lists is often not as good as that of contiguous DMA.
In single-ported local memory units, the DMA operations can undesirably interfere with the computation, thereby impacting the performance of the algorithmic task(s) being performed (for instance, the core could starve for instructions if DMA is given higher priority than the local memory unit).
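The latency issue listed above can be made concrete with a simple cost model, sketched here in Python. The model (a fixed start-up latency plus a transfer time proportional to block size) and the figures used in the usage note are assumptions chosen for illustration, not measurements or formulas from the patent.

```python
def effective_bandwidth(block_bytes: int, latency_s: float,
                        peak_bytes_per_s: float) -> float:
    """Bytes per second actually achieved by one DMA transfer.

    Assumed model: each transfer pays a fixed start-up latency and then
    streams the block at the peak rate.
    """
    transfer_time = latency_s + block_bytes / peak_bytes_per_s
    return block_bytes / transfer_time
```

Under this model, with an assumed 1 microsecond latency and an assumed peak rate of 10 GB/s, `effective_bandwidth(128, 1e-6, 1e10)` is far below `effective_bandwidth(16384, 1e-6, 1e10)`, because the fixed latency dominates small transfers. This is one reason a search over the DMA parameter space can matter.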



Examples


Embodiment Construction

[0022] One or more embodiments of the present invention provide a means for evaluating the performance of algorithmic tasks that use DMA for data transfers on a multi-core processing system. Furthermore, aspects of the invention can be used for dynamically determining optimal performance parameters for a scheduled task at run-time based at least in part on results of the performance evaluation, as will become apparent to those skilled in the art given the teachings of the invention provided herein, although the invention is not limited to such an application. While certain aspects of the invention are described herein in the context of illustrative program code implementations, it should be understood that the present invention is not limited to the specific implementations shown.

[0023] With reference to FIG. 1, a flow chart 100 of exemplary method steps is shown for at least a portion of an exemplary method for evaluating the performance of algorithmic tasks that use DMA for data tra...



Abstract

Apparatus for evaluating the performance of DMA-based algorithmic tasks on a target multi-core processing system includes a memory and at least one processor coupled to the memory. The processor is operative: to input a template for a specified task, the template including DMA-related parameters specifying DMA operations and computational operations to be performed; to evaluate performance for the specified task by running a benchmark on the target multi-core processing system, the benchmark being operative to generate data access patterns using DMA operations and invoking prescribed computation routines as specified by the input template; and to provide results of the benchmark indicative of a measure of performance of the specified task corresponding to the target multi-core processing system.
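The template-driven benchmark described in the abstract might be sketched as follows. This is a host-side Python analogy only: the `TaskTemplate` fields and the `run_benchmark` function are hypothetical names, and a plain slice copy stands in for a real DMA transfer into a core's local memory, so the timings it produces say nothing about actual DMA hardware.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class TaskTemplate:
    """Hypothetical template: DMA-related parameters plus a compute routine."""
    name: str
    total_bytes: int                  # amount of data the task processes
    block_sizes: Tuple[int, ...]      # candidate DMA block sizes to try
    compute: Callable[[bytes], int]   # computation applied to each block


def run_benchmark(template: TaskTemplate) -> Dict[int, float]:
    """Time the task once per candidate block size; return seconds per size."""
    source = bytes(template.total_bytes)  # stand-in for shared memory
    results: Dict[int, float] = {}
    for block in template.block_sizes:
        start = time.perf_counter()
        for offset in range(0, template.total_bytes, block):
            # Slice copy stands in for a DMA "get" into local memory.
            local = source[offset:offset + block]
            template.compute(local)
        results[block] = time.perf_counter() - start
    return results
```

For example, `run_benchmark(TaskTemplate("sum", 1 << 20, (256, 4096, 65536), sum))` returns one timing per block size, which could then be stored for later lookup.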

Description

CROSS-REFERENCE TO RELATED APPLICATION(S)

[0001] The present application is related to a commonly assigned U.S. application entitled “Performance Evaluation of Algorithmic Tasks and Dynamic Parameterization on Multi-core Processing Systems,” identified by attorney docket number IN920070084US1, and filed on even date herewith, the disclosure of which is incorporated by reference herein in its entirety.

FIELD OF THE INVENTION

[0002] The present invention relates to the electrical, electronic, and computer arts, and, more particularly, to evaluating the performance of algorithmic tasks.

BACKGROUND OF THE INVENTION

[0003] A multi-core computing system typically includes some combination of shared memory units, accessible by all cores, and/or local memory units, associated with individual cores. Most of the cores, although not necessarily all, access these memory units using direct memory access (DMA). Access to local memory units may be direct and/or some cores may have direct access to the sha...


Application Information

IPC(8): G06F9/50, G06F9/44
CPC: G06F11/3404, G06F11/3433, G06F11/3447, G06F11/3428
Inventors: GUNNELS, JOHN A.; KAPOOR, SHAKTI; KOTHARI, RAVI; SABHARWAL, YOGISH; SEXTON, JAMES C.
Owner: GLOBALFOUNDRIES INC