Processing-in-memory (PIM) program partitioning method and device based on pure miss detection

A pure miss detection method and a processing-in-memory (storage-compute integrated) technology, applied in program control design, machine execution devices, memory architecture access/allocation, etc. It addresses system performance loss, unnecessary code transfer, and degraded overall performance of the PIM system, and achieves good performance while avoiding unnecessary code transfer and excessive task migration.

Active Publication Date: 2021-07-16
ZHEJIANG LAB +1

AI Technical Summary

Problems solved by technology

In a modern storage system with concurrent memory access, a pure miss may cause a greater performance loss to the system as a whole.
[0007] On the other hand, limited by existing process conditions such as chip area and heat dissipation, the processing cores on the PIM side cannot match the performance of the CPU side, and this gap is unlikely to close in the short term.
Under the usual PIM architecture, existing program partitioning methods often apply a coarse partitioning criterion, so more code than necessary is executed by the PIM computing unit, which may hurt the overall performance of the PIM system.
[0008] In summary, because current program partitioning methods do not fully consider concurrent memory access, code that causes either type of cache miss (pure or non-pure) may be offloaded to the PIM computing unit, resulting in unnecessary code transfer to the PIM.
At the same time, because of the performance difference between the CPU and the PIM processing unit, migrating too many tasks to the PIM side may reduce the performance of the overall PIM system.



Examples


Detailed Description of the Embodiments

[0030] Specific embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings. It should be understood that the specific embodiments described here are only used to illustrate and explain the present invention, and are not intended to limit the present invention.

[0031] Most existing PIM program partitioning strategies do not fully consider the concurrency of memory accesses. An accurate criterion is needed to identify which parts of the code cause the CPU to wait on memory fetches, so that those parts can be moved to the PIM for execution. The present invention proposes a method for partitioning program code between the CPU and the PIM processing unit that accounts for memory access concurrency: it uses the pure miss rate of the last-level cache to identify the code segments that actually cause the CPU to stall, and offloads them to the PIM for execution. This method can also reduce the amount of code that is transferred, so that the overal...
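To illustrate why the choice of criterion matters, the following minimal Python sketch compares a decision based on the conventional last-level cache miss rate with one based on the pure miss rate. The function name `compare_criteria`, the counts, and the threshold value are all hypothetical and chosen only for illustration; the excerpt does not give concrete numbers.

```python
def compare_criteria(accesses, misses, pure_misses, threshold):
    """Compare offload decisions under two criteria (illustrative only).

    accesses    -- total last-level cache accesses sampled for a loop body
    misses      -- all misses among those accesses
    pure_misses -- misses judged to be pure (a subset of `misses`)
    threshold   -- preset rate above which the loop is offloaded to the PIM
    """
    if accesses <= 0:
        raise ValueError("accesses must be positive")
    if not 0 <= pure_misses <= misses <= accesses:
        raise ValueError("expected 0 <= pure_misses <= misses <= accesses")
    miss_rate = misses / accesses
    pure_miss_rate = pure_misses / accesses
    return {
        "miss_rate": miss_rate,
        "pure_miss_rate": pure_miss_rate,
        "offload_by_miss_rate": miss_rate > threshold,
        "offload_by_pure_miss_rate": pure_miss_rate > threshold,
    }


# Hypothetical loop: many misses, but most are hidden by concurrent accesses.
result = compare_criteria(accesses=1000, misses=300, pure_misses=50, threshold=0.1)
print(result["offload_by_miss_rate"], result["offload_by_pure_miss_rate"])
```

With these made-up counts, a plain miss-rate criterion would offload the loop, while the pure-miss-rate criterion would keep it on the CPU, which is the kind of unnecessary code transfer the method aims to avoid.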


Abstract

The invention discloses a processing-in-memory (PIM) program partitioning method and device based on pure miss detection. The method comprises the following steps. S1: determine whether a piece of code is a loop body. S2: if so, execute part of the instructions of the loop body on the CPU and calculate the pure miss rate by detecting pure misses, where the detection comprises S21: when a cache hit occurs, record the current cache line information into a hit list, and have the miss status holding register (MSHR) record the missing cache line information into a miss list; and S22: compare the hit list with the miss list to find the pure misses. S3: if the pure miss rate is greater than a preset threshold, assign the loop body to the PIM computing unit for execution. The device comprises a CPU and a PIM unit. The CPU is provided with a partitioning and identification module, which samples loops and comprises a last-level cache and a miss status holding register connected to each other; a pure miss detection component is arranged in the last-level cache.
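Steps S1 to S3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say what the list entries contain or how the lists are compared, so the sketch assumes each entry carries the cycle range during which the access is in flight, and it treats a miss as pure when at least one of its cycles overlaps no hit (a definition borrowed from concurrency-aware memory access models). The names `Access`, `find_pure_misses`, `pure_miss_rate`, and `should_offload_to_pim` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    line_addr: int  # cache line address (assumed entry field)
    start: int      # first cycle the access is in flight (assumed)
    end: int        # last cycle the access is in flight, inclusive (assumed)


def find_pure_misses(hit_list, miss_list):
    """S22: compare the two lists; keep misses with a cycle no hit covers."""
    hit_cycles = set()
    for hit in hit_list:
        hit_cycles.update(range(hit.start, hit.end + 1))
    return [
        miss for miss in miss_list
        if any(c not in hit_cycles for c in range(miss.start, miss.end + 1))
    ]


def pure_miss_rate(hit_list, miss_list):
    """S2: pure misses divided by all sampled accesses (assumed denominator)."""
    total = len(hit_list) + len(miss_list)
    if total == 0:
        return 0.0
    return len(find_pure_misses(hit_list, miss_list)) / total


def should_offload_to_pim(is_loop_body, hit_list, miss_list, threshold):
    """S1 and S3: only loop bodies above the preset threshold are offloaded."""
    if not is_loop_body:
        return False
    return pure_miss_rate(hit_list, miss_list) > threshold


# Hypothetical sample of one loop body: two hits and two misses.
hits = [Access(0x40, 0, 3), Access(0x80, 2, 5)]
misses = [Access(0xC0, 1, 4), Access(0x100, 4, 9)]
print(pure_miss_rate(hits, misses))                     # 0.25
print(should_offload_to_pim(True, hits, misses, 0.2))   # True
```

In this sample the first miss is fully overlapped by hits in cycles 1 to 4, so only the second miss counts as pure, giving a rate of 1 in 4 accesses.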

Description

Technical field

[0001] The invention relates to the technical field of computer architecture design, and in particular to a processing-in-memory (PIM) program partitioning method and device based on pure miss detection.

Background

[0002] With the rapid development of information technologies such as big data, cloud computing, and artificial intelligence, information processing has shifted from compute-intensive to data-intensive. The traditional von Neumann architecture faces the memory wall bottleneck and struggles to meet the computing power demands of the information society. The challenges come mainly from two aspects: first, in the von Neumann architecture, where storage and computing are separated, frequent data transfer between computing units and storage units has become the bottleneck for power consumption and system performance; second, as the data volume increases, the effectiveness of the traditional meth...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/38, G06F9/30, G06F12/0811, G06F12/0842, G06F12/0875
CPC: G06F9/3836, G06F9/30145, G06F9/3802, G06F12/0811, G06F12/0842, G06F12/0875, G06F2212/452, Y02D10/00
Inventor: 邹兴奇, 闫亮
Owner: ZHEJIANG LAB