Optimizing compiler for improving application performance on many-core coprocessors

A compiler for many-core coprocessors, applied in the field of compilers, that addresses the significant developer effort and experimentation otherwise required to maximize performance, and that eliminates one or more redundant data transfers.

Inactive Publication Date: 2013-02-28
NEC LAB AMERICA
Cites: 8, Cited by: 11

AI Technical Summary

Benefits of technology

[0007] A method for compiling includes parsing code of an application stored in a computer readable storage medium to identify one or more parallelizable code portions. At least one parallelizable code portion is optimized by transforming offload construct code portions to provide an optimized application. Transforming offload construct code portions includes one or more of: moving an offload construct from within a loop of the at least one parallelizable code portion to outside the loop; moving a declaration of a variable from outside the offload construct to inside the offload construct; transforming code to generate a direct memory access transfer; and eliminating one or more redundant data transfers of the variable within the at least one parallelizable code portion.
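The effect of moving an offload construct from inside a loop to outside it can be illustrated with a small simulation. The sketch below is not the patented compiler; `FakeCoprocessor`, `scale`, and the transfer counter are hypothetical names introduced only to count host-to-device and device-to-host copies under each code shape.

```python
class FakeCoprocessor:
    """Hypothetical stand-in for a coprocessor; it only counts data copies."""

    def __init__(self):
        self.transfers = 0

    def offload(self, fn, data):
        self.transfers += 1           # copy-in of the live-in data
        result = fn(list(data))       # work that would run on the device
        self.transfers += 1           # copy-out of the live-out data
        return result


def scale(values):
    """Toy parallelizable kernel: double every element."""
    return [2 * v for v in values]


def offload_inside_loop(dev, data, iterations):
    """Original shape: one offload (and two copies) per loop iteration."""
    for _ in range(iterations):
        data = dev.offload(scale, data)
    return data


def offload_hoisted(dev, data, iterations):
    """Transformed shape: the whole loop runs inside a single offload."""
    def whole_loop(values):
        for _ in range(iterations):
            values = scale(values)
        return values

    return dev.offload(whole_loop, data)
```

Both shapes compute the same result, but in this model the hoisted version performs two copies in total, while the in-loop version performs two copies per iteration.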

Problems solved by technology

However, porting legacy applications to the MIC architecture may involve manually identifying highly parallel code portions and corresponding data transfers, which may require significant developer effort and experimentation to maximize performance.




Embodiment Construction

[0018] In accordance with the present principles, a compiler for x86-based many-core coprocessors is provided to port legacy applications to benefit from many-core architecture. Preferably, the compiler receives an annotated application identifying parallelizable code portions. For each parallelizable code portion, the compiler first performs a liveness analysis to determine variables that are to be copied in to (live-in variables) and out of (live-out variables) the many-core coprocessor. An array bound analysis is also performed to determine the start and end location of each array/pointer used in the code portion as a size in memory.
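As a rough illustration of the two analyses, the sketch below computes live-in and live-out sets for a straight-line code region and the index range touched by a simple counted loop. It is a simplified model under stated assumptions, not the compiler's actual algorithm: the region is given as an ordered list of (defined names, used names) pairs, arrays are treated as whole variables, and `live_in_out` and `array_bounds` are hypothetical helper names.

```python
def live_in_out(region, used_after):
    """Return (live_in, live_out) for a straight-line region.

    region:     ordered list of (defs, uses) sets, one pair per statement
    used_after: names that are read after the region finishes
    """
    defined = set()
    live_in = set()
    for defs, uses in region:
        # A name read before any definition in the region must be copied in.
        live_in |= set(uses) - defined
        defined |= set(defs)
    # A name written in the region and read afterwards must be copied out.
    live_out = defined & set(used_after)
    return live_in, live_out


def array_bounds(lo, hi, offsets):
    """Index range touched by `for i in range(lo, hi): a[i + k]`, k in offsets.

    Returns (start, end, size) with `end` inclusive and `size` in elements.
    """
    start = lo + min(offsets)
    end = (hi - 1) + max(offsets)
    return start, end, end - start + 1


# Region modelled on:  c[i] = a[i] + b[i + 1];  s = s + c[i]
region = [({"c"}, {"a", "b"}), ({"s"}, {"s", "c"})]
live_in, live_out = live_in_out(region, used_after={"s"})
```

Here `a`, `b`, and `s` are read before being written in the region, so they are live-in, and only `s` is read afterwards, so it is the sole live-out variable. For array `b`, accessed as `b[i + 1]` over `range(0, 100)`, `array_bounds(0, 100, [1])` gives the range 1 through 100.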

[0019] The compiler then transforms the parallelizable code portions by inserting an offload construct before the parallelizable code portions. In/out/inout clauses are passed as arguments of the offload construct and are populated based on the results of the liveness analysis and array bound analysis. In a preferred embodiment, the parallelizable code ...
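One way to picture how the clauses might be populated is a small generator that turns analysis results into an offload directive. This is only a sketch: `offload_pragma` is a hypothetical helper, and the `target(mic)` and `length(...)` spellings are assumed here to follow the Language Extensions for Offload style rather than taken from the text above.

```python
def offload_pragma(live_in, live_out, lengths, target="mic"):
    """Build an offload directive string from analysis results.

    live_in / live_out: sets of variable names from a liveness analysis
    lengths:            element counts for arrays, from a bound analysis
    """
    both = sorted(live_in & live_out)       # copied in and back out
    ins = sorted(live_in - live_out)        # copied in only
    outs = sorted(live_out - live_in)       # copied out only

    def clause(kind, names):
        if not names:
            return ""
        parts = []
        for name in names:
            if name in lengths:
                # Arrays carry an explicit size (assumed clause syntax).
                parts.append(f"{name}:length({lengths[name]})")
            else:
                parts.append(name)
        return f" {kind}({', '.join(parts)})"

    return (f"#pragma offload target({target})"
            + clause("in", ins) + clause("out", outs) + clause("inout", both))


pragma = offload_pragma(
    live_in={"a", "b", "s"},
    live_out={"c", "s"},
    lengths={"a": 100, "b": 101, "c": 100},
)
```

Variables that are both live-in and live-out land in the `inout` clause, and scalars such as `s` are listed without a length.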



Abstract

A system and method for compiling includes parsing code of an application stored in a computer readable storage medium to identify one or more parallelizable code portions. At least one parallelizable code portion is optimized by transforming offload construct code portions to provide an optimized application.

Description

RELATED APPLICATION INFORMATION

[0001] This application claims priority to provisional application Ser. No. 61/527,147 filed on Aug. 25, 2011 and provisional application Ser. No. 61/605,370 filed on Mar. 1, 2012, both of which are incorporated herein by reference.

BACKGROUND

[0002] 1. Technical Field

[0003] The present invention relates to a compiler, and more specifically to an optimizing compiler for many-core coprocessors.

[0004] 2. Description of the Related Art

[0005] Many core processors, such as the Intel™ Many Integrated Core (MIC), are aimed at accelerating multi-core high performance computing (HPC) applications. Legacy applications can be compiled and executed on MIC by selectively inserting Language Extensions for Offload (LEO) keywords in the application code identifying parallel code portions to be offloaded to the MIC coprocessor. The goal is to improve overall application performance by taking advantage of the large number of cores on MIC (for multithreading) and the wide singl...


Application Information

Patent Type & Authority: Applications (United States)
IPC(8): G06F9/45
CPC: G06F8/4441, G06F8/443, G06F8/456, G06F8/451, G06F9/52
Inventor: RAVI, NISHKAM; BAO, TAO; OZTURK, OZCAN; CHAKRADHAR, SRIMAT
Owner: NEC LAB AMERICA