A low memory overhead hot spot data identification method for hybrid storage systems

Hot data identification for hybrid storage, applied in memory systems, digital data processing, special data processing applications, etc. It addresses the inability to delete recorded pages and the resulting growth in memory overhead, with the effects of improved accuracy and efficiency, reduced memory overhead, and increased flexibility.

Active Publication Date: 2017-12-12
NAT UNIV OF DEFENSE TECH
Cites: 3; Cited by: 0

AI Technical Summary

Problems solved by technology

[0005] 1) A Bloom Filter cannot record repeated visits to data. No matter how many times page x is accessed, the Bloom Filter only sets the k bits corresponding to x, namely h_1(x), h_2(x), ..., h_k(x), to 1; it takes no action to record the number of times x is accessed.
Because it holds no visit count information, a Bloom Filter cannot identify hotspot data from the visit history.
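The limitation in [0005] can be illustrated with a minimal sketch of a standard Bloom filter. The class name, the sizes `m` and `k`, and the use of salted SHA-256 digests as hash functions are illustrative assumptions, not details taken from the patent. The sketch shows that re-adding the same page leaves the bit array unchanged, so no access count can be recovered from it.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter sketch (sizes and hash choice are assumptions)."""

    def __init__(self, m=1024, k=4):
        self.m = m                  # number of bits in the array
        self.k = k                  # number of hash functions
        self.bits = bytearray(m)    # one byte per bit, for readability only

    def _positions(self, item):
        # Derive k positions h_1(x)..h_k(x) from salted SHA-256 digests.
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{item}".encode()).digest()[:8], "big") % self.m
            for i in range(self.k)
        ]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item):
        return all(self.bits[p] for p in self._positions(item))


bf = BloomFilter()
bf.add("page-x")
snapshot = bytes(bf.bits)

# Access the same page many more times: the filter state does not change.
for _ in range(1000):
    bf.add("page-x")
unchanged = bytes(bf.bits) == snapshot
```

After one access and after 1001 accesses the filter is bit-for-bit identical, which is why a single Bloom filter cannot distinguish a hot page from a page seen once.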
[0006] 2) A Bloom Filter can only keep recording new pages and cannot delete any page it has already recorded, so the recorded access history grows longer and longer. In fact, early access history has no value for hotspot data identification, and recording an overly long access history increases memory overhead.
[0008] The MBFs method uses multiple Bloom Filters to record the number of page visits and deletes early access history by periodically clearing one Bloom Filter, so it can efficiently identify hot data from the access history. However, the MBFs method has two defects. On the one hand, because it uses multiple Bloom Filter data structures, its memory overhead is still high. On the other hand, the range of page access counts that MBFs can record is limited: assuming the MBFs method maintains n Bloom Filters, the maximum number of visits it can record for a specific page is n; if the number of visits to the page exceeds n, the excess visits cannot be recorded in the Bloom Filters, and some hotspot information is lost.
These two shortcomings cannot be remedied at the same time: increasing the counting range (n) inevitably increases memory overhead, so the MBFs method is also difficult to apply to large-scale hybrid storage systems.
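The counting limit described in [0008] can be sketched as follows. This is a deliberately simplified model and not the MBFs algorithm itself: it assumes each access is recorded in the first filter that does not yet contain the page, and it omits the periodic clearing step. The names `MultiBloomCounter`, `positions`, and all parameter values are hypothetical.

```python
import hashlib


def positions(item, k, m):
    # k bit positions for an item, from salted SHA-256 digests (assumed hash choice).
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{item}".encode()).digest()[:8], "big") % m
        for i in range(k)
    ]


class MultiBloomCounter:
    """Simplified multi-filter counter: at most n accesses recorded per page."""

    def __init__(self, n=4, m=1024, k=4):
        self.n, self.m, self.k = n, m, k
        self.filters = [bytearray(m) for _ in range(n)]

    @staticmethod
    def _contains(bits, pos):
        return all(bits[p] for p in pos)

    def access(self, item):
        pos = positions(item, self.k, self.m)
        for bits in self.filters:
            if not self._contains(bits, pos):
                for p in pos:
                    bits[p] = 1
                return
        # Every filter already holds the item: this access is not recorded.

    def count(self, item):
        pos = positions(item, self.k, self.m)
        return sum(self._contains(bits, pos) for bits in self.filters)

    def memory_bits(self):
        # Memory grows linearly with the counting range n.
        return self.n * self.m


mbf = MultiBloomCounter(n=4)
for _ in range(10):
    mbf.access("page-x")
recorded = mbf.count("page-x")   # saturates at n
```

Ten accesses are recorded as only four, and raising the ceiling means raising `n`, which multiplies `memory_bits()` by the same factor. That coupling is the trade-off the paragraph describes.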

Embodiment Construction

[0041] The present invention will be further described below in conjunction with the accompanying drawings and specific preferred embodiments, but the protection scope of the present invention is not limited thereby.

[0042] As shown in Figure 1, this embodiment is a low-memory-overhead hotspot data identification method for a hybrid storage system. The specific implementation steps are:

[0043] 1) Definition and initialization of the UBF: define a Bloom filter with ultra-large counting capacity (UBF), which records the number of page visits through a bit array and k hash functions that map each page to the bit array; initialize the bit array to 0 and set up each hash function;

[0044] 2) Query page popularity: when page x receives an access request, query the popularity of page x through the bit array; the popularity of page x is the number of bits with value 1 among the k bits correspondi...

Abstract

The invention discloses a low-memory-overhead hotspot data identification method for a hybrid storage system. The specific steps are: 1) define a Bloom filter with ultra-large counting capacity (UBF) to record the number of page visits; 2) when page x is accessed, query the popularity of page x; 3) if x has not been recorded in the UBF, record x by flipping several of the bits corresponding to x in the UBF from 0 to 1; if x has already been recorded in the UBF, flip one of its corresponding 0 bits to 1 with a certain probability, thereby increasing the popularity of x; 4) compare the popularity of page x with the specified popularity threshold; if the popularity of page x is greater than the threshold, the data corresponding to page x is identified as hot data; then return to step 2). The invention has the advantages of low memory overhead, the ability to identify hotspot data in large data volumes, and high identification accuracy.
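The four steps in the abstract can be sketched roughly as below. Several details are not given in this excerpt and are filled in as labeled assumptions: how many bits are set on the first record (`init_bits`), the rule for deciding that a page is "already recorded" (here, at least `init_bits` of its bits are 1), a fixed flip probability (`flip_prob`), and the hash construction. The class name `UBFSketch` is hypothetical; this is an illustration of the idea, not the patented implementation.

```python
import hashlib
import random


class UBFSketch:
    """Illustrative single-bit-array popularity filter (parameters are assumptions)."""

    def __init__(self, m=4096, k=16, init_bits=2, flip_prob=0.5, seed=0):
        self.m, self.k = m, k
        self.init_bits = init_bits      # ASSUMED: bits set when a page is first recorded
        self.flip_prob = flip_prob      # ASSUMED: fixed probability of a later flip
        self.bits = bytearray(m)        # step 1: bit array initialized to 0
        self.rng = random.Random(seed)  # seeded for reproducible runs

    def _positions(self, page):
        # k hash positions from salted SHA-256 digests, duplicates removed.
        raw = (
            int.from_bytes(hashlib.sha256(f"{i}:{page}".encode()).digest()[:8], "big") % self.m
            for i in range(self.k)
        )
        return list(dict.fromkeys(raw))

    def popularity(self, page):
        # Step 2: popularity is the number of 1 bits among the page's bits.
        return sum(self.bits[p] for p in self._positions(page))

    def access(self, page):
        pos = self._positions(page)
        zeros = [p for p in pos if not self.bits[p]]
        ones = len(pos) - len(zeros)
        if ones < self.init_bits:
            # Step 3a: page treated as not yet recorded; flip several 0 bits to 1.
            for p in zeros[: self.init_bits - ones]:
                self.bits[p] = 1
        elif zeros and self.rng.random() < self.flip_prob:
            # Step 3b: page already recorded; flip one 0 bit with some probability.
            self.bits[self.rng.choice(zeros)] = 1
        return self.popularity(page)

    def is_hot(self, page, threshold):
        # Step 4: hot if popularity exceeds the threshold.
        return self.popularity(page) > threshold


ubf = UBFSketch()
first = ubf.access("page-x")
for _ in range(200):
    ubf.access("page-x")
ubf.access("page-y")
hot_x = ubf.is_hot("page-x", threshold=8)
hot_y = ubf.is_hot("page-y", threshold=8)
```

In this sketch a frequently accessed page accumulates 1 bits until it crosses the threshold, while a page seen once stays near `init_bits`. All pages share one bit array of `m` bits, in contrast to keeping n separate filters.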

Description

Technical field

[0001] The invention relates to the technical field of large-scale hybrid storage, in particular to a low-memory-overhead method for identifying hotspot data in a hybrid storage system.

Background

[0002] Current large-scale storage systems place high demands on both performance and cost, and a hybrid storage architecture is a solution that can meet both at once. In a hybrid storage system, small-capacity, high-performance high-end devices are used to ensure performance, while low-performance, low-price low-end devices are used to reduce cost. In such a heterogeneous storage system, high-end devices are therefore generally used to hold frequently accessed hot data, and low-end devices to hold infrequently accessed cold data. Achieving this optimization depends to a large extent on accurate identification of hot data.

[0003] Hot data is generally identified through statistic...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F12/0882, G06F17/30
Inventor: 肖侬, 陈志广, 卢宇彤, 周恩强, 张伟, 董勇
Owner: NAT UNIV OF DEFENSE TECH