Multi-threaded system for performing atomic binary translations

A multi-threaded binary translation technology, applied in the field of multi-threaded software, addresses two problems: the unnecessarily large performance overhead of mutual exclusion software primitives, and the failure of existing approaches to work correctly for multi-core target systems. It achieves the effects of minimizing overhead, speeding up subsequent store execution, and minimizing extra check overhead.

Status: Inactive; Publication Date: 2015-05-28
NXP USA INC
9 Cites, 11 Cited by

AI Technical Summary

Benefits of technology

[0039] In the method 600, when operating in fast mode, the data store is executed in an unmodified manner. In contrast, in accurate mode the goal is to clear the Reservation Valid (RV) flags of all vCPUs that hold a valid reservation on the same coherency granule as the shared memory address. To optimize this process and minimize the extra check overhead for most store instructions, the core Instruction Set Simulation TLB array data structure has a Reservation Monitor (RM) flag per entry. Stores to pages whose TLB entry has the RM flag clear (implying no active reservations on that page in the system) take the fastest path, in a manner similar to fast mode. Stores to pages whose TLB entry has the RM flag set, however, must clear the RV flag of every vCPU whose RV flag is set and whose RA value lies in the coherency granule of the store instruction address. Store instructions are also responsible for setting the RM flag of TLB entries where any vCPU has a valid reservation falling in that page. To minimize this overhead, a store sets RM flags only if LRMC and GRMC are not equal, i.e., if a new reservation address has become active since the last store execution. Further, the store instruction execution clears the RM flag on its TLB page if there are no longer any active reservations on that page. As will be apparent to a person skilled in the art, the method 600 therefore has the potential to speed up execution of subsequent stores on the page in question.
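The store path described in paragraph [0039] can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions, not the patented implementation: the excerpt does not expand LRMC and GRMC, so the code treats GRMC as a global counter incremented on each new reservation and LRMC as a per-vCPU copy of the last value seen. The page size, granule size, per-vCPU RM table, and all class and function names (`VCpu`, `System`, `reserve`, `store`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

PAGE_SIZE = 4096  # assumed page size; not given in the excerpt
GRANULE = 64      # assumed coherency-granule size; not given in the excerpt


def page_of(addr: int) -> int:
    return addr // PAGE_SIZE


def granule_of(addr: int) -> int:
    return addr // GRANULE


@dataclass
class VCpu:
    ra: Optional[int] = None   # reservation address (RA)
    rv: bool = False           # Reservation Valid flag (RV)
    lrmc: int = 0              # assumed: last GRMC value seen by this vCPU
    # Assumed layout: one RM flag per TLB entry, keyed here by page number.
    rm: Dict[int, bool] = field(default_factory=dict)


@dataclass
class System:
    memory: Dict[int, int] = field(default_factory=dict)
    vcpus: List[VCpu] = field(default_factory=list)
    grmc: int = 0              # assumed: bumped on every new reservation
    accurate: bool = True      # False models fast mode


def reserve(system: System, cpu: VCpu, addr: int) -> None:
    """Record a new reservation, as a load-linked instruction would."""
    cpu.ra, cpu.rv = addr, True
    system.grmc += 1


def store(system: System, cpu: VCpu, addr: int, value: int) -> None:
    """Plain store with the accurate-mode reservation checks."""
    if system.accurate:
        # Refresh RM flags only if a reservation appeared since the last store.
        if cpu.lrmc != system.grmc:
            for other in system.vcpus:
                if other.rv:
                    cpu.rm[page_of(other.ra)] = True
            cpu.lrmc = system.grmc
        page = page_of(addr)
        if cpu.rm.get(page, False):
            # Slow path: invalidate reservations in the stored-to granule.
            for other in system.vcpus:
                if other.rv and granule_of(other.ra) == granule_of(addr):
                    other.rv = False
            # Clear RM once no active reservation remains on this page.
            if not any(o.rv and page_of(o.ra) == page for o in system.vcpus):
                cpu.rm[page] = False
    # Fast mode, or RM clear: the store itself is unmodified.
    system.memory[addr] = value
```

In this model a store to a page with no monitored reservation pays only one counter comparison and one flag lookup, which is the overhead the paragraph aims to minimize. A real multi-threaded translator would additionally need host-level atomic operations or memory barriers around these updates, which the sketch omits.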

Problems solved by technology

When considering multi-core architectures, sequential target simulation is prohibitively slow, thereby motivating the use of parallel simulation in which multiple threads may be running target ISAs.
One challenge with the parallel simulation of atomic instructions relates to the complexity of parallel access to shared memory locations by multiple contending threads.
However, mutual exclusion software primitives have unnecessarily large performance overhead.
Hence, wait-free non-blocking algorithms are preferable to lock-free ones; however, wait-free non-blocking algorithms typically have inherent race conditions and will not work correctly for multi-core target systems.




Embodiment Construction

[0012] The detailed description set forth below in connection with the appended drawings is intended as a description of presently preferred embodiments of the invention, and is not intended to represent the only forms in which the present invention may be practised. It is to be understood that the same or equivalent functions may be accomplished by different embodiments that are intended to be encompassed within the spirit and scope of the invention. In the drawings, like numerals are used to indicate like elements throughout. Furthermore, the terms “comprises,” “comprising,” or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a module, circuit, device components, structures and method steps that comprise a list of elements or steps do not include only those elements but may include other elements or steps not expressly listed or inherent to such module, circuit, device components or steps. An element or step preceded by “comprises . . . a” does n...



Abstract

A multi-threaded binary translation system performs atomic operations by a thread; such operations include processing a load linked instruction and a store conditional instruction. The store conditional instruction updates data stored in a shared memory address only when at least three conditions are satisfied: a copy of the load linked shared memory address of the load linked instruction is the same as the store conditional shared memory address; a reservation flag indicates that the thread has a valid reservation; and the copy of data stored by the load linked instruction is the same as the data stored in the store conditional shared memory address.
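The three conditions in the abstract can be made concrete with a minimal single-threaded sketch. It is an illustration under assumptions rather than the claimed implementation: the names `VCpu`, `load_linked`, and `store_conditional` are hypothetical, memory is modelled as a plain dictionary, and the sketch assumes the reservation is consumed whether or not the store succeeds. In a real multi-threaded translator the final compare and update would have to be a single atomic step on the host, which this model does not show.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class VCpu:
    """Per-thread reservation state (field names are illustrative)."""
    ra: Optional[int] = None  # copy of the load-linked address
    rd: Optional[int] = None  # copy of the data read by the load-linked
    rv: bool = False          # reservation flag


def load_linked(cpu: VCpu, memory: Dict[int, int], addr: int) -> int:
    """Read memory and record address, data, and a valid reservation."""
    cpu.ra = addr
    cpu.rd = memory[addr]
    cpu.rv = True
    return cpu.rd


def store_conditional(cpu: VCpu, memory: Dict[int, int],
                      addr: int, value: int) -> bool:
    """Update memory only if all three conditions from the abstract hold."""
    ok = bool(
        cpu.rv                      # the thread holds a valid reservation
        and cpu.ra == addr          # same address as the load-linked
        and memory[addr] == cpu.rd  # data unchanged since the load-linked
    )
    if ok:
        memory[addr] = value
    # Assumption: the reservation is consumed on success and on failure.
    cpu.rv = False
    return ok
```

A typical use is a retry loop: call `load_linked`, compute the new value, and repeat until `store_conditional` returns `True`. The data comparison is what lets the store fail when another thread changed the location even if the reservation flag was not cleared in time.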

Description

BACKGROUND OF THE INVENTION

[0001] The present invention relates generally to multi-threaded software and, more particularly, to a system for performing an atomic operation by a thread in a multi-threaded binary translation system.

[0002] Binary translation is the simulation of one (target) Instruction Set Architecture (ISA) with another (host) ISA. The performing of binary translations (target simulations) can be optionally accompanied with optimization and code instrumentation in which the host and target ISA may be the same or different architectures.

[0003] When considering multi-core architectures, sequential target simulation is prohibitively slow, thereby motivating the use of parallel simulation in which multiple threads may be running target ISAs. In this regard the target hardware architecture provides hardware guaranteed atomic instructions for implementing synchronization primitives in a shared memory cache coherent multi-core environment or system. More specifically, when an ...


Application Information

Patent Type & Authority: Applications (United States)
IPC(8): G06F12/08
CPC: G06F12/0875, G06F2212/6042, G06F12/084, G06F8/45, G06F8/52, G06F9/3004, G06F9/45558
Inventors: MATHUR, ASHISH; JAIN, SANDEEP
Owner: NXP USA INC