69 results about "Optimizing compiler" patented technology

In computing, an optimizing compiler is a compiler that tries to minimize or maximize some attributes of an executable computer program. Common requirements are to minimize a program's execution time, memory requirement, and power consumption (the last two being popular for portable computers).

Loop allocation for optimizing compilers

Inactive | US6651246B1 | Software engineering | Program control | Data dependency graph | Reachability

Loop allocation for optimizing compilers includes the generation of a program dependence graph for a source code segment. Control dependence graph representations of the nested loops, from innermost to outermost, are generated and data dependence graph representations are generated for each level of nested loop as constrained by the control dependence graph. An interference graph is generated with the nodes of the data dependence graph. Weights are generated for the edges of the interference graph reflecting the affinity between statements represented by the nodes joined by the edges. Nodes in the interference graph are given weights reflecting resource usage by the statements associated with the nodes. The interference graph is partitioned using a profitability test based on the weights of edges and nodes and on a correctness test based on the reachability of nodes in the data dependence graph. Code is emitted based on the partitioned interference graph.

Owner:IBM CORP

Lifetime-sensitive instruction scheduling mechanism and method

Inactive | US6305014B1 | Effective instruction | High degree of parallelism | Software engineering | Specific program execution arrangements | Scheduling instructions | Degree of parallelism

An instruction scheduler in an optimizing compiler schedules instructions in a computer program by determining the lifetimes of fixed registers in the computer program. By determining the lifetimes of fixed registers, the instruction scheduler can achieve a schedule that has a higher degree of parallelism by relaxing dependences between instructions in independent lifetimes of a fixed register so that instructions can be scheduled earlier than would otherwise be possible if those dependences were precisely honored.

Owner:IBM CORP

System and method for scheduling instructions to maximize outstanding prefetches and loads

Inactive | US6918111B1 | Minimizes processor stalls | Software engineering | Digital computer details | Scheduling instructions | Parallel computing

The present invention discloses a method and device for ordering memory operation instructions in an optimizing compiler for a processor that can potentially enter a stall state if a memory queue is full. The method uses a dependency graph coupled with one or more memory queues. The dependency graph shows the dependency relationships between instructions in a program being compiled. After the dependency graph is created, the ready nodes are identified. Dependency graph nodes that correspond to memory operations may add an element to the memory queue or remove one or more elements from it. The ideal situation is to keep the memory queue as full as possible without exceeding the maximum desirable number of elements, by scheduling memory operations to maximize their parallelism while avoiding stalls on the target processor.

Owner:ORACLE INT CORP
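
The scheduling idea in the abstract above can be illustrated with a minimal sketch. It assumes a hypothetical instruction model in which each node has a `queue_delta` (positive when issuing it adds an entry to the memory queue, negative when it frees entries); the function name `schedule`, the parameters, and the tie-breaking rule are illustrative assumptions, not taken from the patent.

```python
from collections import defaultdict

def schedule(nodes, deps, queue_delta, max_queue):
    """Greedy list scheduling that keeps a modelled memory queue full but not overfull.

    nodes: instruction names in program order
    deps: list of (before, after) dependency edges
    queue_delta: name -> change in memory-queue occupancy when the node is issued
    max_queue: maximum desirable queue occupancy (assumed to be known)
    """
    succs = defaultdict(list)
    pending = {n: 0 for n in nodes}  # number of unscheduled predecessors
    for before, after in deps:
        succs[before].append(after)
        pending[after] += 1

    ready = [n for n in nodes if pending[n] == 0]
    order, occupancy = [], 0
    while ready:
        # Ready nodes that would not push the queue past the limit.
        safe = [n for n in ready if occupancy + queue_delta.get(n, 0) <= max_queue]
        if safe:
            # Prefer the node that fills the queue the most.
            pick = max(safe, key=lambda n: queue_delta.get(n, 0))
        else:
            # Nothing is safe: take the node that adds the least.
            pick = min(ready, key=lambda n: queue_delta.get(n, 0))
        ready.remove(pick)
        order.append(pick)
        occupancy = max(0, occupancy + queue_delta.get(pick, 0))
        for s in succs[pick]:
            pending[s] -= 1
            if pending[s] == 0:
                ready.append(s)
    return order
```

With three independent load/use pairs and `max_queue=2`, this sketch issues two loads back to back, then interleaves uses so the modelled queue never exceeds two entries.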

Directed least recently used cache replacement method

Inactive | US20020152361A1 | Performance maximization | Improve cache hit ratio | Energy efficient ICT | Memory addressing/allocation/relocation | Parallel computing | Least recently frequently used

Fine-grained control of cache maintenance improves cache hit rate and processor performance by storing age values and aging rates for the code lines held in the cache. These values direct a least recently used (LRU) strategy for casting out lines that become less likely, over time, to be needed by the processor accessing the cache. The invention allows an arbitrary age value to be entered when a code line is initially stored in or accessed from the cache, and it controls the rate at which the age of each code line is incremented, in response to a limited set of command instructions that may be placed in a program manually or automatically by an optimizing compiler.

Owner:IBM CORP
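
A minimal software sketch of the replacement policy described in the abstract above. The class `DirectedLRUCache` and its `age` and `rate` parameters are hypothetical stand-ins for the command instructions the abstract mentions; the patent describes a hardware cache, so this is only a toy model of the policy.

```python
class DirectedLRUCache:
    """Toy cache whose cast-out choice is steered by per-line age and aging rate."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = {}  # tag -> [age, aging_rate]

    def access(self, tag, age=0, rate=1):
        """Store or touch a line, setting its age and aging rate.

        Returns the tag that was cast out, or None if nothing was evicted.
        """
        # Every resident line ages by its own rate on each access.
        for entry in self.lines.values():
            entry[0] += entry[1]
        evicted = None
        if tag not in self.lines and len(self.lines) >= self.capacity:
            # Cast out the line with the greatest age.
            evicted = max(self.lines, key=lambda t: self.lines[t][0])
            del self.lines[evicted]
        self.lines[tag] = [age, rate]
        return evicted
```

In this model a line stored with `rate=0` never ages, so it survives a streaming line stored with a high rate even though the streaming line was touched more recently, which is the difference from plain LRU.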

Programming model oriented to neural network heterogeneous computing platform

Active | CN107239315A | Good computing power | Improve scalability | Link editing | Version control | Nerve network | Network model

The invention provides a programming model oriented to a neural network heterogeneous computing platform. Specifically, the invention provides a compiling method and system for a heterogeneous computing platform, and a program running support method and system for it. A trained neural network model is input to a neural network (NN) optimizing compiler to generate an NN assembly file corresponding to the neural network. The NN assembly file is input to an NN assembler to generate an NN binary file corresponding to the neural network. A host compiler tool chain compiles and assembles a neural network application program that the user has developed in a high-level language, generating in turn a corresponding host assembly file and host binary file. The host linker links the NN binary file and the host binary file to generate a single mixed-link executable file. The technical scheme provided by the invention offers good computing performance, strong extensibility, strong compatibility, and high flexibility.

Owner:XILINX INC

Method and computer program product for precise feedback data generation and updating for compile-time optimizations

Inactive | US7120906B1 | Speed up execution | Provide flexibility | Software engineering | Program control | SPECint | Theoretical computer science

A method and computer program product, within an optimizing compiler, for precise feedback data generation and updating. The method and computer program use instrumentation and annotation of frequency values to allow feedback data to stay current during the multiple optimizations that the program code undergoes during compilation. Global propagation of known precise feedback values is used to replace approximate and unavailable values, and global verification of feedback data after optimization is employed to detect discrepancies. The method and computer program also provide improved instrumentation to anticipate cloning when code is cloned during certain compiler optimizations, and they handle inlined procedures. The result is compiled executables with improved SPECint benchmarks.

Owner:MORGAN STANLEY +1

Compiling Source Code For Debugging With Variable Value Restoration Based On Debugging User Activity

Inactive | US20130283243A1 | Error detection/correction | Software engineering | Program instruction | Processor register

Compiling source code includes receiving, by an optimizing compiler from a debugger, a variable value modification profile that specifies locations in the source code at which variable values were modified during a debug session; compiling the source code, including: inserting snapshots at one or more of the locations in the source code at which variable values were modified, each snapshot including a breakpoint; and only for each snapshot at a location in the source code at which variable values were modified: inserting, between the breakpoint and remaining source code at the location of the snapshot, a module of computer program instructions that when executed retrieves a current value of a variable and stores the current value in a register; and recording the location of each inserted snapshot; and providing, to the debugger by the optimizing compiler, the recorded locations of each inserted snapshot along with the compiled source code.

Owner:IBM CORP

Vectorization in an optimizing compiler

Inactive | US20140237460A1 | Software testing/debugging | Program control | Parallel computing | Computer engineering

An optimizing compiler includes a vectorization mechanism that optimizes a computer program by substituting code that includes one or more vector instructions (vectorized code) for one or more scalar instructions. The cost of the vectorized code is compared to the cost of the code with only scalar instructions. When the cost of the vectorized code is less than the cost of the code with only scalar instructions, the vectorization mechanism determines whether the vectorized code will likely result in processor stalls. If not, the vectorization mechanism substitutes the vectorized code for the code with only scalar instructions. When the vectorized code will likely result in processor stalls, the vectorization mechanism does not substitute the vectorized code, and the code with only scalar instructions remains in the computer program.

Owner:IBM CORP
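
The two-step decision in the abstract above (compare costs, then check for likely stalls) reduces to a small predicate. This is only a sketch: the function name `choose_code` is hypothetical, and the cost figures and stall prediction are assumed to come from elsewhere in the compiler.

```python
def choose_code(scalar_cost, vector_cost, likely_stalls):
    """Pick which version of the code stays in the program.

    The vectorized version is substituted only when it is cheaper than the
    scalar version AND is not predicted to cause processor stalls.
    """
    if vector_cost < scalar_cost and not likely_stalls:
        return "vector"
    return "scalar"
```

For example, `choose_code(10, 6, likely_stalls=True)` keeps the scalar code even though the vectorized code has the lower cost.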

Code Motion Based on Live Ranges in an Optimizing Compiler

Inactive | US20100162220A1 | Software engineering | Program control | Source code | Motion recognition

Optimizing program code in a static compiler by determining the live ranges of variables and determining which live ranges are candidates for moving code from the use site to the definition site of source code. Live ranges for variables in a flow graph are determined. Selected live ranges are determined as candidates in which code will be moved from a use site within the source code to a definition site within the source code. Optimization opportunities within the source code are identified based on the code motion.

Owner:IBM CORP
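
The first step in the abstract above, determining live ranges over a flow graph, is commonly done with a backward dataflow analysis. The sketch below uses the textbook iterative liveness algorithm, which is an assumption here and not necessarily the patent's method; it does not attempt the code motion itself. The function name and input shapes are illustrative.

```python
def live_out_sets(blocks, succs):
    """Iterative backward liveness analysis over a control flow graph.

    blocks: name -> (use, defs), where use holds variables read before any
            write in the block and defs holds variables written in the block
    succs:  name -> list of successor block names
    Returns name -> set of variables live on exit from that block.
    """
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b, (use, defs) in blocks.items():
            # A variable is live out of b if it is live into any successor.
            out = set()
            for s in succs.get(b, []):
                out |= live_in[s]
            # It is live into b if b reads it, or it passes through unwritten.
            new_in = set(use) | (out - set(defs))
            if out != live_out[b] or new_in != live_in[b]:
                live_out[b], live_in[b] = out, new_in
                changed = True
    return live_out
```

A variable's live range then spans the blocks where it appears in the live-in or live-out set, from its definition site to its last use site.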

Phantom serializing compiler and method of operation of same

Active | US7886283B2 | Fine grained control | Improve performance | Software engineering | Program control | Operational system | Application software

An alternative to a real time operating system (RTOS) is provided based on serializing compilers. A serializing compiler can transform a multitasking application into an equivalent and optimized monolithic sequential code, to be compiled with the embedded processor's native optimizing compiler, effectively filling the RTOS gap. The serializing compiler can analyze the tasks at compile time and generate a fine-tuned, application specific infrastructure to support multitasking, resulting in a more efficient executable than one that is intended to run on top of a generic RTOS. By having control over the application execution and context switches, the serializing compiler enables the fine grain control of task timing while enhancing overall performance. The serializing compiler technology strengthens existing compilers, making them timing and task-aware. The Phantom compiler provides a fully automated mechanism to synthesize a single threaded, ANSI C / C++ program from a multithreaded C / C++ (extended with POSIX) program.

Owner:RGT UNIV OF CALIFORNIA

Compiling Source Code For Debugging With Expanded Snapshots

Inactive | US20130275948A1 | Error detection/correction | Specific program execution arrangements | Source code | Debugger

Debugging source code includes: tracking, by a debugger during a debug session, duration of user examination of source code locations; providing, by the debugger to an optimizing compiler, a source code examination profile specifying source code locations examined by the user during the debug session; and receiving, by the debugger from the optimizing compiler: compiled source code for debugging, the compiled source code comprising, at each of one or more source code locations specified in the source code examination profile: a snapshot before the source code of the source code location, followed by an expanded snapshot, the expanded snapshot including computer program instructions to enable, during a debug session, examination of variable values changing during execution of the source code at the source code location; and a recording of snapshot locations and expanded snapshot locations.

Owner:IBM CORP

Compiler optimizations for vector instructions

Active | US20160048445A1 | Reduce in quantity | Improve performance | Handling data according to predetermined rules | Software maintenance/management | Vector element | Parallel computing

An optimizing compiler includes a vector optimization mechanism that optimizes vector instructions by eliminating one or more vector element reverse operations. The compiler can generate code that includes multiple vector element reverse operations that are inserted by the compiler to account for a mismatch between the endian bias of the instruction and the endian preference indicated by the programmer or programming environment. The compiler then analyzes the code and reduces the number of vector element reverse operations to improve the run-time performance of the code.

Owner:IBM CORP
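
The simplest case of the optimization in the abstract above is that two back-to-back element reversals cancel. The sketch below shows only that case on a toy chain of unary vector operations; the op names `vreverse` and `vneg` are hypothetical, and the analysis the patent describes is presumably more general than this peephole.

```python
def eliminate_reverse_pairs(ops):
    """Drop adjacent pairs of "vreverse" ops, since reversing twice is the identity."""
    out = []
    for op in ops:
        if op == "vreverse" and out and out[-1] == "vreverse":
            out.pop()  # this reverse cancels the previous one
        else:
            out.append(op)
    return out

def run(ops, vec):
    """Interpret a chain of unary vector ops (toy semantics, for checking only)."""
    table = {
        "vreverse": lambda v: v[::-1],
        "vneg": lambda v: [-x for x in v],
    }
    for op in ops:
        vec = table[op](vec)
    return vec
```

Running both the original and the reduced chain through `run` on the same input is a quick way to confirm that removing the pair did not change the result.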

Techniques for checking whether a complex digital object conforms to a standard

Active | US20080071825A1 | Easy access | Digital data processing details | Metadata multimedia retrieval | DICOM | Documentation

Techniques for validating complex digital objects such as DICOM objects. The techniques employ a declarative validation document which employs a declarative constraint language to specify the constraints to which the complex digital object is subject. A validator performs an evaluation of the constraint document with regard to the complex digital object. The complex digital object is valid if all of the constraints in the validation document are satisfied. The constraint document may be compiled by an optimizing compiler and the validator may apply the resulting compiled constraint specification to an in-memory representation of the digital object which has been optimized for fast reference. An example is given of the use of the techniques with DICOM objects.

Owner:ORACLE INT CORP
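
The validation scheme in the abstract above (declarative constraints, compiled once, then applied to an in-memory object) can be sketched as follows. The `(field, op, expected)` triple format is a made-up stand-in for the patent's constraint language, the dictionary stands in for the in-memory object representation, and no optimization is attempted; `Modality` and `Rows` are used only as familiar DICOM attribute names.

```python
import operator

# Hypothetical operator vocabulary for the toy constraint language.
OPS = {
    "==": operator.eq,
    "<=": operator.le,
    ">=": operator.ge,
    "in": lambda value, allowed: value in allowed,
}

def compile_constraints(spec):
    """Turn declarative (field, op, expected) triples into one validator function."""
    checks = [(field, OPS[op], expected) for field, op, expected in spec]

    def validate(obj):
        # The object is valid only if every constraint is satisfied.
        return all(field in obj and test(obj[field], expected)
                   for field, test, expected in checks)

    return validate
```

Compiling once and reusing the returned `validate` function avoids re-reading the constraint document for every object checked.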

Compiling source code for debugging with user preferred snapshot locations

Inactive | US9111033B2 | Software testing/debugging | Specific program execution arrangements | Source code file | Optimizing compiler

Compiling source code for debugging, including: receiving, by an optimizing compiler from a debugger, a user specification of preferred breakpoint locations in the source code; compiling, by the optimizing compiler, the source code, wherein compiling includes inserting a snapshot at one or more of the preferred breakpoint locations, and recording the location of each inserted snapshot; and providing, to the debugger by the optimizing compiler, the recorded locations of each inserted snapshot along with the compiled source code.

Owner:INT BUSINESS MASCH CORP

System and Method for Optimizing Compiler Performance by Object Collocation

Inactive | US20110055819A1 | Software engineering | Program control | Collocation | Interference graph

A computer-implemented method, system, and computer program product for performing object collocation on a computer system are provided. The method includes analyzing a sequence of computer instructions for object allocations and uses of the allocated objects. The method further includes creating an allocation interference graph of object allocation nodes with edges indicating pairs of allocations to be omitted from collocation. The method also includes coloring the allocation interference graph such that adjacent nodes are assigned different colors, and creating an object allocation at a program point prior to allocations of a selected color from the allocation interference graph. The method additionally includes storing an address associated with the created object allocation in a collocation pointer, and replacing a use of each allocation of the selected color with a use of the collocation pointer to collocate multiple objects.

Owner:IBM CORP
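
The coloring step in the abstract above can be sketched with a greedy graph coloring. The greedy strategy and the names `color_allocations` and `collocation_groups` are assumptions for illustration; the abstract only requires that adjacent nodes receive different colors, and it does not say which coloring algorithm is used.

```python
def color_allocations(nodes, interference_edges):
    """Greedy coloring: allocations joined by an edge never share a color."""
    neighbors = {n: set() for n in nodes}
    for a, b in interference_edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    color = {}
    for n in nodes:
        # Take the smallest color not already used by a colored neighbor.
        used = {color[m] for m in neighbors[n] if m in color}
        c = 0
        while c in used:
            c += 1
        color[n] = c
    return color

def collocation_groups(color):
    """Group allocations by color; each group is a candidate for one combined allocation."""
    groups = {}
    for n, c in color.items():
        groups.setdefault(c, []).append(n)
    return groups
```

Each resulting group corresponds to a set of allocations that could be served by a single allocation reached through one collocation pointer.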

Compiler driven mechanism for registration and deregistration of memory pages

Inactive | US20090276765A1 | Efficient solution | Easy to do | Software engineering | Program control | Parallel computing | Term memory

A method, system and article of manufacture are disclosed for registering and deregistering memory pages in a computer system. The method comprises the steps of hoisting register and deregister calls in a given routine where temporal locality is present to overlap computation and communication; using software pipelined registration and deregistration where spatial locality is observed; and using intra-procedural and inter-procedural analysis by a compiler of the computer system to deregister dynamically allocated buffers. The preferred embodiment of the invention is based on an optimizing compiler. The compiler is used to extract information such as addresses of buffers which are being reused repeatedly (temporal locality), preferably in a loop. The compiler may also find information about spatial locality, such as arrays whose indexes are used in a well-defined manner in a series of messages, for example, array pages being accessed in a pre-defined pattern in a loop.

Owner:IBM CORP

Debugging device and debugging method

Active | US8370810B2 | Error detection/correction | Specific program execution arrangements | Processing Instruction | Source code

A debugging device configured to debug a program includes an analysis section configured to analyze information of a code that does not need to be debugged in which a predetermined processing instruction is described, the code being generated by optimization of a compiler for a source code of the program, and an output section configured to output processing content information, a start address, and an end address of the code that does not need to be debugged which are obtained by the analysis.

Owner:KK TOSHIBA

Breaking read barrier to apply optimizations

Active | US20050149587A1 | Minimize changes | Memory addressing/allocation/relocation | Special data processing applications | Waste collection | Computer science

A garbage collection system that needs to meet real-time requirements utilizes a read barrier that is implemented in an optimizing compiler. The read barrier is implemented with a forwarding pointer positioned in a header of each object. The forwarding pointer points to the object unless the object has been moved. The barrier is optimized by breaking the barrier and applying barrier sinking to sink the read barrier to its point of use and by using sub-expression elimination. A null-check for the read barrier is combined with a null-check required by the real-time application. All objects are located and moved with the collector to minimize variations in mutator utilization.

Owner:TWITTER INC
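
The forwarding-pointer read barrier in the abstract above can be modelled in a few lines. This is a toy model with hypothetical names (`Obj`, `move`, `read_field`); it shows the header pointer and the shared null check, not the barrier sinking or sub-expression elimination that the compiler applies.

```python
class Obj:
    """Heap object whose header holds a forwarding pointer."""

    def __init__(self, **fields):
        self.forward = self  # points to the object itself until it is moved
        self.fields = dict(fields)

def move(obj):
    """Simulate the collector relocating an object and updating its header."""
    copy = Obj(**obj.fields)
    obj.forward = copy
    return copy

def read_field(ref, name):
    """Read a field through the barrier.

    A single null check serves both the application and the barrier.
    """
    if ref is None:
        raise ValueError("null reference")
    # Follow the forwarding pointer so readers always see the current copy.
    return ref.forward.fields[name]
```

After `move`, a read through the stale reference still returns the value held by the relocated copy, which is what lets the collector move objects while the mutator keeps running.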

Code outlining without trampolines

Active | US7735074B2 | Optimal code layout | Improve locality | Software engineering | Specific program execution arrangements | Coding block | Parallel computing

A system and method for optimizing compiler performance including outlining cold code at link time, rather than compile time, such that trampolines are not required. Branch instructions connecting a hot block to a cold block can be converted from a short branch distance limit to a longer branch distance limit, further optimizing code performance. Editors, implementing a plurality of windows that can be maintained for each function, can display the maximum distance that code blocks can be safely outlined. Other implementations allow the optimal placement of code that is significantly greater in size than the maximum possible branch distance.

Owner:ORACLE INT CORP

Compiler optimizations for vector instructions

Active | US20160048379A1 | Reduce in quantity | Improve performance | Error detection/correction | Handling data according to predetermined rules | Vector element | Parallel computing

An optimizing compiler includes a vector optimization mechanism that optimizes vector instructions by eliminating one or more vector element reverse operations. The compiler can generate code that includes multiple vector element reverse operations that are inserted by the compiler to account for a mismatch between the endian bias of the instruction and the endian preference indicated by the programmer or programming environment. The compiler then analyzes the code and reduces the number of vector element reverse operations to improve the run-time performance of the code.

Owner:IBM CORP

Method and apparatus for selecting references for prefetching in an optimizing compiler

Inactive | US7234136B2 | Software engineering | Memory addressing/allocation/relocation | Systems analysis | Array data structure

One embodiment of the present invention provides a system that generates code to perform anticipatory prefetching for data references. During operation, the system receives code to be executed on a computer system. Next, the system analyzes the code to identify data references to be prefetched. This analysis can involve: using a two-phase marking process in which blocks that are certain to execute are considered before other blocks; and analyzing complex array subscripts. Next, the system inserts prefetch instructions into the code in advance of the identified data references. This insertion can involve: dealing with non-constant or unknown stride values; moving prefetch instructions into preceding basic blocks; and issuing multiple prefetches for the same data reference.

Owner:ORACLE INT CORP
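
The insertion step in the abstract above, placing prefetch instructions ahead of the identified data references, can be illustrated on a toy loop body. The tuple-based instruction format, the function name `insert_prefetches`, and the fixed `distance` parameter are all assumptions for this sketch; the two-phase marking and stride handling that the abstract mentions are not modelled.

```python
def insert_prefetches(body, distance):
    """Insert one prefetch ahead of each distinct array reference in a loop body.

    body: list of tuples; loads look like ("load", array, index_var)
    distance: how many iterations ahead to prefetch (assumed to be given)
    """
    out, seen = [], set()
    for instr in body:
        if instr[0] == "load" and instr[1:] not in seen:
            seen.add(instr[1:])  # prefetch each (array, index) reference once
            _, array, index = instr
            out.append(("prefetch", array, f"{index}+{distance}"))
        out.append(instr)
    return out
```

A repeated load of the same reference gets no second prefetch in this sketch, whereas the abstract notes that a real implementation may issue multiple prefetches for the same data reference.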

Debugging device and debugging method

Active | US20090144705A1 | Error detection/correction | Specific program execution arrangements | Processing Instruction | Source code

A debugging device configured to debug a program includes an analysis section configured to analyze information of a code that does not need to be debugged in which a predetermined processing instruction is described, the code being generated by optimization of a compiler for a source code of the program, and an output section configured to output processing content information, a start address, and an end address of the code that does not need to be debugged which are obtained by the analysis.

Owner:KK TOSHIBA

Vectorization in an optimizing compiler

Inactive | US9047077B2 | Software testing/debugging | Specific program execution arrangements | Optimizing compiler | Computer program

An optimizing compiler includes a vectorization mechanism that optimizes a computer program by substituting code that includes one or more vector instructions (vectorized code) for one or more scalar instructions. The cost of the vectorized code is compared to the cost of the code with only scalar instructions. When the cost of the vectorized code is less than the cost of the code with only scalar instructions, the vectorization mechanism determines whether the vectorized code will likely result in processor stalls. If not, the vectorization mechanism substitutes the vectorized code for the code with only scalar instructions. When the vectorized code will likely result in processor stalls, the vectorization mechanism does not substitute the vectorized code, and the code with only scalar instructions remains in the computer program.

Owner:IBM CORP

Method and apparatus for inserting prefetch instructions in an optimizing compiler

Inactive | US7257810B2 | Software engineering | Digital computer details | Array data structure | Systems analysis

One embodiment of the present invention provides a system that generates code to perform anticipatory prefetching for data references. During operation, the system receives code to be executed on a computer system. Next, the system analyzes the code to identify data references to be prefetched. This analysis can involve: using a two-phase marking process in which blocks that are certain to execute are considered before other blocks; and analyzing complex array subscripts. Next, the system inserts prefetch instructions into the code in advance of the identified data references. This insertion can involve: dealing with non-constant or unknown stride values; moving prefetch instructions into preceding basic blocks; and issuing multiple prefetches for the same data reference.

Owner:ORACLE INT CORP

Method for fast compilation of preverified JAVA bytecode to high quality native machine code

Inactive | US6978451B2 | Binary to binary | Specific program execution arrangements | Parallel computing | Java bytecode

The present invention is a new method and apparatus to perform fast compilation of platform independent bytecode instruction listings into high quality machine code in a single sequential pass. More specifically, the present invention creates a new method and apparatus for the translation of platform neutral bytecode into high quality machine code in a single sequential pass in which information from the preceding instruction translation is used to mimic an optimizing compiler without the extensive memory and time requirements. Where the preceding instruction translation cannot be used due to no direct control flow, information from comprehensive stack maps is then used.

Owner:MYRIAD GROUP

Compiler optimizations for vector operations that are reformatting-resistant

Active | US20170052769A1 | Reduce in quantity | Improve encoding performance | Software engineering | Program control | Scalar Value | Running time

An optimizing compiler includes a vector optimization mechanism that optimizes vector operations that are reformatting-resistant, such as source instructions that do not have a corresponding reformatting operation, sink instructions that do not have a corresponding reformatting operation, a source instruction that is a scalar value, a sink instruction that may produce a scalar value, and an internal operation that depends on lanes being in a specified order. The ability to optimize vector instructions that are reformatting-resistant reduces the number of operations to improve the run-time performance of the code.

Owner:IBM CORP

Optimizing compiler for improving application performance on many-core coprocessors

Inactive | US20130055224A1 | Eliminating one or more redundant data transfers | Program synchronisation | Software engineering | Coprocessor | Optimizing compiler

A system and method for compiling includes parsing code of an application stored in a computer readable storage medium to identify one or more parallelizable code portions. At least one parallelizable code portion is optimized by transforming offload construct code portions to provide an optimized application.

Owner:NEC LAB AMERICA

Method and system for implementing invocation stubs for the application programming interfaces embedding with function overload resolution for dynamic computer programming languages

Inactive | US20160246622A1 | Well formed | Software reuse | Software simulation/interpretation/emulation | Image resolution | Application programming interface

Systems and methods for increasing the execution speed of external API function invocation and runtime checks. The techniques generate invocation stubs for an application programming interface embedding with function overload resolution, so that a script or program written in a dynamic high-level programming language can reuse an existing code base from another high-level programming language and be more flexible than traditional approaches. The method further involves compiling the high-level code templates to native code to obtain optimized native code templates, using an optimizing compiler subsystem designed for runtime use with the virtual machine. With some of the described techniques, invocation stubs are generated by a compiler when a corresponding API import instruction is encountered at runtime, and those stubs bridge the application programming interface to the actual programming language for usage.

Owner:SIMONYAN KARLEN

Compiler optimizations for vector instructions

Active | US9619214B2 | Reduce in quantity | Improve performance | Error detection/correction | Handling data according to predetermined rules | Vector element | Parallel computing

An optimizing compiler includes a vector optimization mechanism that optimizes vector instructions by eliminating one or more vector element reverse operations. The compiler can generate code that includes multiple vector element reverse operations that are inserted by the compiler to account for a mismatch between the endian bias of the instruction and the endian preference indicated by the programmer or programming environment. The compiler then analyzes the code and reduces the number of vector element reverse operations to improve the run-time performance of the code.

Owner:INT BUSINESS MACHINES CORP
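
The abstract for US9619214B2 above describes removing compiler-inserted vector element reverse operations. As a rough illustration only, the Python sketch below cancels the simplest case: a reverse applied directly to the result of another reverse. The tiny IR (`Instr`, the op names `vload`, `vreverse`, `vstore`) and the function name are hypothetical and are not taken from the patent; the sketch also assumes each value name is defined exactly once and does not attempt the patent's broader analysis.

```python
from typing import NamedTuple, Optional, Tuple

class Instr(NamedTuple):
    dest: Optional[str]      # value defined by the instruction (None for stores)
    op: str                  # hypothetical op names: "vload", "vreverse", "vstore", ...
    args: Tuple[str, ...]    # names of the values or memory locations read

def eliminate_double_reverses(code):
    """Cancel reverse-of-reverse pairs, then drop reverses nobody reads.

    Assumes every value name is defined once (SSA-like form).
    """
    producer = {}   # value name -> instruction that defines it
    alias = {}      # value name -> earlier value known to be identical
    rewritten = []
    for ins in code:
        # Read through any alias discovered so far.
        args = tuple(alias.get(a, a) for a in ins.args)
        ins = Instr(ins.dest, ins.op, args)
        if ins.op == "vreverse":
            src = producer.get(args[0])
            if src is not None and src.op == "vreverse":
                # reverse(reverse(x)) == x, so later uses can read x directly.
                alias[ins.dest] = src.args[0]
                continue
        if ins.dest is not None:
            producer[ins.dest] = ins
        rewritten.append(ins)

    # Backward sweep: a reverse whose result is never read is dead.
    live = set()
    kept = []
    for ins in reversed(rewritten):
        if ins.op == "vreverse" and ins.dest not in live:
            continue
        live.update(ins.args)
        kept.append(ins)
    kept.reverse()
    return kept

# A vector copy where both the load and the store were given a reverse.
copy_loop_body = [
    Instr("t0", "vload", ("a",)),
    Instr("t1", "vreverse", ("t0",)),
    Instr("t2", "vreverse", ("t1",)),
    Instr(None, "vstore", ("t2", "b")),
]
optimized = eliminate_double_reverses(copy_loop_body)
```

In this example both reverses disappear and the store reads `t0` directly. A reverse whose result feeds a lane-sensitive use is left in place, because the backward sweep only drops reverses that nothing reads.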

Parallelization of irregular reductions via parallel building and exploitation of conflict-free units of work at runtime

Inactive | US20110088020A1 | Good parallel speedup | Keeping memory footprint | Software engineering | Program control | Parallel computing | Running time

An optimizing compiler device, a method, and a computer program product capable of performing parallelization of irregular reductions. The method includes receiving, at a compiler, a program and selecting, at compile time, at least one unit of work (UW) from the program, each UW configured to operate on at least one reduction operation, where at least one reduction operation in the UW operates on a reduction variable whose address is determinable only when the program runs. At run time, for each successive current UW, a list of reduction operations accessed by that unit of work is recorded. It is also determined at run time whether the reduction operations accessed by the current UW conflict with any reduction operations recorded as having been accessed by previously selected units of work, and the unit of work is assigned as a conflict-free unit of work (CFUW) when no conflicts are found. Finally, at least two processing threads are scheduled for parallel run-time operation, each processing a respective one of the at least two assigned CFUWs.

Owner:IBM CORP
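
The run-time steps in the abstract for US20110088020A1 above (record the reduction addresses each unit of work touches, test for conflicts against earlier units, run the conflict-free units in parallel) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: the function names, the representation of a unit of work as a list of `(index, value)` updates, and the round-by-round batching are all hypothetical, and the batches are built sequentially here, whereas the patent title describes building them in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

def build_conflict_free_batches(units):
    """Group units of work so no two units in a batch touch the same address.

    Each unit is a list of (index, value) reduction updates whose indices
    are only known at run time.
    """
    batches = []
    pending = list(units)
    while pending:
        touched = set()            # addresses recorded for this batch so far
        batch, deferred = [], []
        for uw in pending:
            addrs = {idx for idx, _ in uw}
            if touched.isdisjoint(addrs):
                touched |= addrs   # record the accesses of this unit
                batch.append(uw)   # conflict-free with earlier units in the batch
            else:
                deferred.append(uw)  # conflicts: retry in a later batch
        batches.append(batch)
        pending = deferred
    return batches

def run_unit(acc, uw):
    # Apply every reduction update of one unit of work.
    for idx, val in uw:
        acc[idx] += val

def parallel_irregular_reduction(acc, units, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in build_conflict_free_batches(units):
            # Units in a batch write disjoint addresses, so no locking is needed.
            list(pool.map(lambda uw: run_unit(acc, uw), batch))
    return acc

units = [[(0, 1), (2, 1)], [(1, 5)], [(2, 3), (3, 4)], [(0, 2)]]
result = parallel_irregular_reduction([0] * 4, units)
```

Here the first two units share no index and form one batch, while the last two conflict with the first batch and are deferred to a second one. Because the units inside a batch never write the same element, the threads do not need atomic updates or private copies of the reduction array.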
