
Partitioning and thread-aware based performance optimization method of last level cache (LLC)

A last-level-cache optimization technology, applied to memory systems, memory address allocation/relocation, and instruments, which addresses the difficulty of improving the performance of the on-chip multiprocessor global cache.

Inactive Publication Date: 2010-12-15
SUZHOU INST FOR ADVANCED STUDY USTC
Cites: 2 · Cited by: 14

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a method for optimizing the performance of the last-level cache based on partition awareness and thread awareness, which solves the problems in the prior art that the performance of the multi-processor global cache on a chip is difficult to improve when multiple threads are concurrent.



Examples


Embodiment Construction

[0055] The solution above is further described below in conjunction with specific embodiments. It should be understood that these examples illustrate the present invention and do not limit its scope. The implementation conditions used in the examples may be adjusted to the conditions of specific manufacturers; conditions not indicated are the usual conditions of routine experiments.

[0056] Embodiment. This embodiment uses Multi2sim, an event-driven, cycle-accurate multi-core simulator based on the x86 instruction set architecture, to evaluate the effectiveness of the PAE-TIP method. The target simulation platform is a 4-core processor in which all cores share a 4 MB, 16-way set-associative second-level cache. Each processor core is a 4-issue, out-of-order superscalar and has private first-level instruction and data caches. See Table 2 for detailed configuration information...



Abstract

The invention discloses a partitioning- and thread-aware performance optimization method for a last-level cache (LLC), used for the communication of data and commands between a multi-core processor and a shared last-level cache. The method comprises the following steps: when the multi-core processor executes applications concurrently, the shared last-level cache selects a candidate replacement block through a partitioning-aware eviction method; a thread-aware dynamic insertion policy (TADIP) chooses between the basic LRU (Least Recently Used) insertion method and a bimodal insertion method for inserting a new data block into the cache; and a thread-aware dynamic promotion policy with feedback (TADPP-F) chooses between a most-recently-used promotion method and a single-step promotion method for promoting a hit block in the LLC. The method improves the performance of the chip multiprocessor (CMP) global cache under simultaneous multithreading.
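The abstract describes three per-set decisions: where a newly inserted block enters the recency stack (MRU for classic LRU insertion vs. the LRU end for the bimodal alternative) and how a hit block is promoted (straight to MRU vs. a single step). A minimal sketch of one set, with illustrative names not taken from the patent text (the actual PAE-TIP/TADIP/TADPP-F selection logic is not reproduced here):

```python
from collections import deque

class CacheSet:
    """Toy model of one n-way cache set as a recency stack.
    Index 0 is the MRU position; the rightmost element is LRU."""

    def __init__(self, ways=4):
        self.ways = ways
        self.stack = deque()

    def lookup(self, block, promote_mru=True):
        """On a hit: move the block to MRU (classic LRU update) or
        promote it by a single position (single-step promotion)."""
        if block not in self.stack:
            return False
        if promote_mru:
            self.stack.remove(block)
            self.stack.appendleft(block)
        else:
            i = self.stack.index(block)
            if i > 0:  # swap with the next-more-recent block
                self.stack[i], self.stack[i - 1] = self.stack[i - 1], self.stack[i]
        return True

    def insert(self, block, insert_mru=True):
        """On a miss: evict the LRU block if the set is full, then insert
        the new block at MRU (classic LRU insertion) or at the LRU end
        (the bimodal alternative of a dynamic insertion policy)."""
        victim = self.stack.pop() if len(self.stack) == self.ways else None
        if insert_mru:
            self.stack.appendleft(block)
        else:
            self.stack.append(block)
        return victim
```

In a thread-aware policy the `promote_mru` / `insert_mru` choice would be made per thread, typically by set-dueling between the two alternatives; here they are plain parameters so the two behaviours can be compared directly.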

Description

Technical field

[0001] The invention belongs to the technical field of improving the performance of the last-level cache of a multi-core processor in an information processing system, and in particular relates to a method for improving cache performance through partition-aware eviction and thread-aware insertion in the last-level cache.

Background technique

[0002] With the development of microprocessor technology, multi-core processors, especially the on-chip multiprocessor (Chip Multiprocessor, CMP), have become the main technical route for building contemporary high-performance microprocessors. A shared last-level cache (Last Level Cache, LLC) managed by a Least Recently Used (LRU) policy has been widely adopted by contemporary on-chip multi-core CMPs. However, past research results have shown that when concurrently executing threads interfere with each other, or when the working set of the load exceeds the cache capacity, the performance of the last leve...
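The failure mode named in the background, a working set exceeding cache capacity under LRU, can be shown with a toy fully-associative LRU model (this sketch is illustrative and not part of the patent): a cyclic trace over one more block than the cache holds gets zero hits, while the same trace shrunk to fit hits on every access after the first pass.

```python
from collections import OrderedDict

def run_trace(trace, capacity):
    """Count hits for a fully-associative LRU cache of `capacity` blocks."""
    cache = OrderedDict()
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # refresh recency on a hit
        else:
            if len(cache) == capacity:
                cache.popitem(last=False)  # evict the LRU block
            cache[block] = True
    return hits

# Working set fits: 4 cold misses, then every access hits.
hits_fit = run_trace(list(range(4)) * 10, capacity=4)      # 36 hits / 40 refs

# Working set one block larger than capacity: LRU thrashes, zero hits.
hits_thrash = run_trace(list(range(5)) * 10, capacity=4)   # 0 hits / 50 refs
```

This is the pathology that insertion policies such as TADIP target: inserting streaming blocks at the LRU end instead of MRU keeps part of the working set resident rather than cycling the whole set.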

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F 12/08; G06F 9/38; G06F 12/121
Inventors: 吴俊敏, 赵小雨, 隋秀峰, 尹巍, 唐轶轩, 朱小东
Owner: SUZHOU INST FOR ADVANCED STUDY USTC