A cache management method, apparatus and device

By training a cache write prediction model online and dynamically adjusting the cache write strategy, the problem of low hit rate in long-tail distribution and linear scan scenarios of the cache system is solved, and more efficient cache management is achieved.

CN114817319B · Active · Publication Date: 2026-06-23 · HUAWEI CLOUD COMPUTING TECHNOLOGIES CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: HUAWEI CLOUD COMPUTING TECHNOLOGIES CO LTD
Filing Date: 2021-04-16
Publication Date: 2026-06-23

AI Technical Summary

Technical Problem

Existing caching systems have low hit rates in long-tail distribution and linear scan scenarios, and it is difficult to adaptively adjust write strategies to cope with changes in business data.

Method used

An online adaptive cache writing method is adopted. By training a cache write prediction model, the future hit probability of data is predicted based on a data-driven algorithm, and the writing strategy is dynamically adjusted.

Benefits of technology

It improves cache hit rate, enhances the performance and adaptability of the caching system, and reduces cache flushing speed.

✦ Generated by Eureka AI based on patent content.

Abstract

The application provides a cache management method, which is used for a cache management device. After receiving a first data write request, the cache management method trains a cache write prediction model according to related parameters of the first data, writes the first data into a cache, receives a second data write request, and determines whether to write the second data into the cache according to the cache write prediction model. The cache management method trains the cache write prediction model online by using a data-driven algorithm, and effectively improves the cache hit rate.

Description

Technical Field

[0001] This application relates to the field of caching, and in particular to a cache management method, apparatus, and device.

Background Technology

[0002] Application-level caching is a common storage component used in systems such as databases, content delivery networks, and data storage to accelerate data access. The main function of caching is to temporarily store data that will be frequently accessed by the system in the near future, thereby reducing the average latency of data access. Cache media offer fast access (read and write) speeds but have small storage capacities. The core competitive advantage of caching is a high hit rate. The hit rate is the proportion of data requests that are served from the cache.

[0003] Therefore, improving cache hit rate has become the most pressing issue in the industry.

Summary of the Invention

[0004] This application provides a cache management method that can improve cache hit rate.

[0005] The first aspect of this application provides a cache management method for a cache management device. The method includes: receiving a first data write request, the first data write request requesting that first data in a hard disk be written to a cache; training a cache write prediction model based on relevant parameters of the first data; writing the first data to the cache; receiving a second data write request, the second data write request requesting that second data in a hard disk be written to the cache; and determining whether to write the second data to the cache based on the cache write prediction model.

[0006] This cache management method improves the accuracy of writing probability prediction for data to be written by training a cache write prediction model online, thereby increasing the cache hit rate.

[0007] In some possible designs, the method further includes: receiving the first batch of data write requests; and determining the first data write request and the second data write request from the first batch of data write requests according to sampling rules.

[0008] In some possible designs, the method further includes: obtaining the identifier of the data to be written carried by each data write request in the first batch of data write requests; determining a data write request in the first batch as the first data write request when the hash value of the identifier it carries is divisible by the sample value; and determining a data write request in the first batch as the second data write request when the hash value of the identifier it carries is not divisible by the sample value.

[0009] In some possible designs, the method further includes: receiving a second batch of data write requests; and determining, based on the trained cache write prediction model, whether to write at least one piece of data from the data to be written corresponding to the second batch of data write requests into the cache.

[0010] In some possible designs, the method further includes: using at least one of the following parameters as training inputs to the cache write prediction model: the average size of the data in the cache, the total number of write requests per historical period, the total number of read requests per historical period, the average eviction period of the data in the cache, and the relevant parameters of the first data; determining the write probability of the first data as the training output of the cache write prediction model based on the request and eviction status of the first data within one average eviction period after the first data request occurs; and training the cache write prediction model based on the training inputs and the training outputs.
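The training step in this design can be illustrated with a minimal sketch. Several things here are assumptions rather than details given in the text: the model type (a logistic regression updated by stochastic gradient descent), the labeling rule (1.0 if the data is requested again within one average eviction period, otherwise 0.0), and all names such as `OnlineLogisticModel`, `make_label`, and `make_features`. Feature scaling is left to the caller.

```python
import math


class OnlineLogisticModel:
    """Logistic regression updated one sample at a time (assumed model type)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        """Return the predicted write probability in (0, 1)."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def train_one(self, features, label):
        """One SGD step on the log loss for a single (input, output) pair."""
        error = self.predict(features) - label
        for i, x in enumerate(features):
            self.weights[i] -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error


def make_label(request_time, later_request_times, avg_eviction_period):
    """Assumed labeling rule: 1.0 if the data is requested again within
    one average eviction period after `request_time`, else 0.0."""
    deadline = request_time + avg_eviction_period
    hit_again = any(request_time < t <= deadline for t in later_request_times)
    return 1.0 if hit_again else 0.0


def make_features(avg_size, writes_per_period, reads_per_period,
                  avg_eviction_period, item_write_count, item_read_count):
    """Collect the training inputs listed above into one vector (hypothetical layout)."""
    return [avg_size, writes_per_period, reads_per_period,
            avg_eviction_period, item_write_count, item_read_count]
```

Updating the model one sample at a time keeps training online: each sampled request contributes a single cheap update, so the model can follow changes in the request distribution without a separate offline training phase.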

[0011] In some possible designs, the relevant parameters of the first data include at least one of the following: the number of historical requests to write the first data, and the number of historical requests to read the first data.

[0012] In some possible designs, the method further includes: obtaining a write threshold; obtaining the write probability of the second data based on the cache write prediction model; and determining whether to write the second data to the cache based on the write threshold and the write probability of the second data.

[0013] In some possible designs, the method further includes: using at least one of the average size of the data in the cache, the average eviction period of the data in the cache, the total number of write requests per historical period, the total number of read requests per historical period, and the relevant parameters of the second data as the prediction input of the cache write prediction model; and feeding the prediction input to the trained cache write prediction model to obtain the write probability of the second data.
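The threshold comparison in these designs can be sketched as follows. The function name `should_write_to_cache`, the use of a plain callable for the trained model, and the choice of `>=` for the comparison are all assumptions made for illustration.

```python
def should_write_to_cache(model_predict, prediction_input, write_threshold):
    """Decide whether to write the second data into the cache.

    model_predict: callable mapping a prediction input to a write
                   probability in [0, 1] (stands in for the trained model).
    prediction_input: feature vector for the second data.
    write_threshold: probability at or above which the data is written
                     (the inclusive comparison is an assumption).
    """
    write_probability = model_predict(prediction_input)
    return write_probability >= write_threshold
```

A higher threshold makes the cache more selective, which may suit workloads dominated by one-time accesses; a lower threshold admits more data.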

[0014] A second aspect of this application provides a cache management apparatus, comprising a communication unit and a processing unit: the communication unit is configured to receive a first data write request, the first data write request being used to request that first data in the hard disk be written to the cache; the processing unit is configured to train a cache write prediction model based on relevant parameters of the first data; and write the first data to the cache; the communication unit is further configured to receive a second data write request, the second data write request being used to request that second data in the hard disk be written to the cache; the processing unit is further configured to determine whether to write the second data to the cache based on the cache write prediction model.

[0015] In some possible designs, the communication unit is configured to receive the first batch of data write requests; the processing unit is configured to determine the first data write request and the second data write request from the first batch of data write requests according to the sampling rules.

[0016] In some possible designs, the processing unit is configured to obtain the identifier of the data to be written carried by each data write request in the first batch of data write requests; to determine a data write request in the first batch as the first data write request when the hash value of the identifier it carries is divisible by the sample value; and to determine a data write request in the first batch as the second data write request when the hash value of the identifier it carries is not divisible by the sample value.

[0017] In some possible designs, the communication unit is used to receive a second batch of data write requests; the processing unit is used to determine, based on the trained cache write prediction model, whether to write at least one piece of data from the data to be written corresponding to the second batch of data write requests into the cache.

[0018] In some possible designs, the processing unit is configured to use at least one of the following as the training input of the cache write prediction model: the average size of the data in the cache, the total number of write requests per historical period, the total number of read requests per historical period, the average eviction period of the data in the cache, and the relevant parameters of the first data; to determine the write probability of the first data as the training output of the cache write prediction model based on the request and eviction status of the first data within one average eviction period after the first data request occurs; and to train the cache write prediction model based on the training input and the training output.

[0019] In some possible designs, the relevant parameters of the first data include at least one of the following: the number of historical requests to write the first data, and the number of historical requests to read the first data.

[0020] In some possible designs, the communication unit is used to obtain a write threshold; the processing unit is used to obtain the write probability of the second data based on the cache write prediction model; and based on the write threshold and the write probability of the second data, to determine whether to write the second data into the cache.

[0021] In some possible designs, the processing unit is configured to use at least one of the average size of the data in the cache, the average eviction period of the data in the cache, the total number of write requests per historical period, the total number of read requests per historical period, and the relevant parameters of the second data as the prediction input of the cache write prediction model; and to feed the prediction input to the trained cache write prediction model to obtain the write probability of the second data.

[0022] A third aspect of this application provides a computing device cluster including at least one computing device, each computing device including a processor and a memory; the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device to cause the computing device to perform a method as provided in the first aspect or any possible design of the first aspect.

[0023] The fourth aspect of this application provides a computer program product comprising instructions that, when executed by a cluster of computer devices, cause the cluster of computer devices to perform a method as provided in the first aspect or any possible design of the first aspect.

[0024] The fifth aspect of this application provides a computer-readable storage medium including computer program instructions that, when executed by a cluster of computing devices, cause the cluster of computing devices to perform a method as provided in the first aspect or any possible design of the first aspect.

Attached Figure Description

[0025] To more clearly illustrate the technical methods of the embodiments of this application, the accompanying drawings used in the embodiments will be briefly described below.

[0026] Figure 1 is a schematic diagram of a possible application scenario applicable to the embodiments of this application.

[0027] Figure 2 is a flowchart of a possible cache management method applicable to embodiments of this application.

[0028] Figure 3 is a schematic diagram of a possible cache management device structure applicable to embodiments of this application.

[0029] Figure 4 is a schematic diagram of a possible computing device structure applicable to embodiments of this application.

[0030] Figure 5 is a schematic diagram of a possible computing device cluster structure applicable to embodiments of this application.

[0031] Figure 6 is a schematic diagram of a possible computing device cluster structure applicable to embodiments of this application.

[0032] Figure 7 is a schematic diagram of a possible computing device cluster structure applicable to embodiments of this application.

Detailed Implementation

[0033] The terms "first" and "second" used in the embodiments of this application are for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Therefore, a feature defined with "first" and "second" may explicitly or implicitly include one or more of that feature.

[0034] First, some technical terms involved in the embodiments of this application will be introduced.

[0035] Caching: A middle layer between fast storage media and service systems, used to temporarily store recently accessed data and reduce data access latency.

[0036] Cache write: The process of deciding whether to write data to the cache when a data write request arrives at the caching system.

[0037] Cache hit rate: When a user accesses an acceleration node, a cache hit occurs if the node has cached the data to be accessed. If not, the data needs to be retrieved from the origin server, resulting in a cache miss. The data retrieval process is synchronous with the user's access, so even if new data is retrieved, the user will not experience any delay. The hit rate equals the number of hits divided by the sum of the number of hits and misses. Cache hit rate is one of the important factors in judging the effectiveness of acceleration.

[0038] Access speeds of cache media are typically faster than those of the system's main storage media, with speed differences often exceeding several orders of magnitude. Application systems often use memory as a cache medium for hard drives and remote networks. However, the available storage capacity of cache media is extremely small relative to the data size of the application system. For example, in a typical internet service database, the database usually stores petabytes (PB) of data, but the cache capacity is only a few gigabytes (GB), or even less than 1 GB.

[0039] Because cache media offers fast access speeds, a higher cache hit rate generally translates to better system performance. However, due to the limited storage capacity of cache media, data needs to be continuously evicted to free up storage space for newly written data. Common cache eviction algorithms include Least Recently Used (LRU) and First-In, First-Out (FIFO). These methods tend to retain newer data and evict older data.

[0040] In current mainstream cloud and internet services, the frequency of data requests typically follows a long-tail distribution, with the vast majority of requests concentrated on a small subset of data, and most data being accessed only once. Under this data distribution, a large amount of newly requested data is constantly written to memory, while older data is continuously evicted. However, most of the newly written data will never be accessed a second time, often resulting in low cache hit rates. On the other hand, linear data traversal is also a common business practice in internet services, where data is scanned linearly one by one, with no second access during the scan. Furthermore, various network attacks often attempt to scan the entire system's data set. In scenarios like long-tail distribution and linear scanning, a large amount of data is accessed only once in a short period. Using mainstream cache eviction algorithms can lead to rapid flushing of the cache media (also known as thrashing), meaning data is constantly written to the cache with extremely low hit rates.
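The thrashing effect described above can be reproduced with a toy simulation. The sketch below is purely illustrative: the `LRUCache` class, the cache capacity, and the two synthetic workloads (a small hot set, with and without an interleaved one-pass linear scan) are assumptions chosen for the demonstration and do not come from this application.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU cache that writes every missed key (no write filtering)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, key):
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        self.items[key] = True  # always write on a miss
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used key
        return False

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def simulate(requests, capacity):
    """Replay a request stream and return the resulting hit rate."""
    cache = LRUCache(capacity)
    for key in requests:
        cache.access(key)
    return cache.hit_rate()


def hot_only(rounds, hot_keys):
    """Workload that repeatedly requests a small hot set."""
    for _ in range(rounds):
        yield from hot_keys


def hot_with_scan(rounds, hot_keys, scan_per_round):
    """Same hot set, but each round is followed by never-repeated scan keys."""
    next_scan = 0
    for _ in range(rounds):
        yield from hot_keys
        for _ in range(scan_per_round):
            yield ("scan", next_scan)
            next_scan += 1


if __name__ == "__main__":
    hot = ["h0", "h1", "h2", "h3"]
    print(simulate(hot_only(100, hot), capacity=8))          # 0.99
    print(simulate(hot_with_scan(100, hot, 8), capacity=8))  # 0.0
```

With a capacity of 8, each burst of 8 one-time scan keys pushes all 4 hot keys out before they are requested again, so the hit rate falls from 0.99 to 0.0 even though the hot set would fit in the cache comfortably.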

[0041] In this context, introducing cache write functionality can typically reduce cache flushing speed and improve hit rate. Cache write refers to determining whether data is valuable enough to be included in the cache system before it enters the cache medium; if it is not valuable, the data is not placed in the cache medium.

[0042] Currently, there are already relevant methods for managing cache writes.

[0043] For example, to cope with linear scanning, some systems do not allow data accessed for the first time recently to enter the cache, but only write data accessed a second or subsequent time to the cache. In addition, some systems use statistical information, such as counting the recent access count of each piece of data, and set a threshold; if access exceeds this threshold, the data is written to the cache, otherwise it is blocked. There are also some write techniques based on rules or frequency statistics.
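The rule-based strategies just described can be sketched as a single counter-based filter. The class name `FrequencyAdmission` and its parameters are hypothetical, and for brevity the sketch counts all past accesses rather than only recent ones. With `threshold=1` it behaves as the "write only on the second or subsequent access" rule.

```python
from collections import Counter


class FrequencyAdmission:
    """Write data to the cache only once its access count exceeds a fixed threshold."""

    def __init__(self, threshold=1):
        self.threshold = threshold
        self.counts = Counter()

    def should_write(self, key):
        """Record one access to `key` and report whether it may be written."""
        self.counts[key] += 1
        return self.counts[key] > self.threshold


if __name__ == "__main__":
    policy = FrequencyAdmission(threshold=1)
    print(policy.should_write("a"))  # False: first access is blocked
    print(policy.should_write("a"))  # True: second access is written
```

The threshold here is fixed before the system sees any traffic, so the policy cannot adjust when the request distribution changes.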

[0044] These methods typically have poor business adaptability, as little is known about the actual data that will be entered into the system before it goes live. In such cases, pre-determining rules or deciding on statistical indicators and thresholds makes the results difficult to predict.

[0045] Furthermore, in common application systems, the distribution of data changes according to business needs and changing scenarios. If the write strategy cannot adapt to these changes, a decrease in cache hit rate is likely to occur.

[0046] In view of this, embodiments of this application provide a technical solution capable of adaptively adjusting the write strategy, namely, an online adaptive cache write method. Driven by a data-driven algorithm, the system automatically predicts the future hit probability of data during runtime, learns the write model online, and thus quickly adapts to changes in business data, achieving accurate and efficient write results.

[0047] To make the technical solution of this application clearer and easier to understand, the scenario for cache write method 200 is introduced below with reference to Figure 1.

[0048] In one possible implementation, after user 101 clicks on application 102, application 102 will be triggered to initiate a data read request to cache 103. Application 102 can be a web application or a third-party application on a smart terminal, etc.

[0049] It should be noted that the data read request initiated by application 102 to cache 103 will be recorded by cache management device 104. Specifically, the identification information of the requested data and the request time will be recorded. Furthermore, the historical number of times this data has been requested can be updated.

[0050] When the data exists in cache 103, cache 103 will return the data to application 102. Simultaneously, cache management device 104 will record the result of the data request. That is, the data read request has been responded to.

[0051] When the data is not present in cache 103, application 102 will initiate a data read request to hard disk 105. Simultaneously, cache management device 104 will record the result of this data request. That is, the data read request was not responded to. Hard disk 105 may include solid-state drives, traditional hard disks, and hybrid hard disks.

[0052] When the data exists in hard disk 105, hard disk 105 will return the data to application 102. At the same time, cache management device 104 will decide whether to write the data to cache 103.

[0053] In this type of possible implementation, the request to write data to cache 103 is triggered by user 101 clicking application 102.

[0054] If the data is not present on hard disk 105, the system will return a message to application 102 indicating that the data cannot be found.
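The read path described in the preceding paragraphs can be summarized in a short sketch. The function `handle_read`, the dict-based stand-ins for cache 103 and hard disk 105, and the `should_write` callable representing the decision made by cache management device 104 are all assumptions for illustration.

```python
def handle_read(key, cache, disk, should_write, log):
    """Serve one read request following the flow described above.

    cache, disk: dict-like stores standing in for cache 103 and hard disk 105.
    should_write: callable(key) -> bool, standing in for the write decision
                  made by cache management device 104.
    log: list that receives (key, outcome) records of each request result.
    """
    if key in cache:
        log.append((key, "hit"))  # the read request was answered by the cache
        return cache[key]
    log.append((key, "miss"))     # the cache could not answer the request
    if key not in disk:
        return None               # the data cannot be found
    value = disk[key]
    if should_write(key):         # decide whether to write the data to the cache
        cache[key] = value
    return value
```

Keeping the write decision behind a callable makes it possible to swap a fixed rule for a learned prediction model without changing the read path.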

[0055] In one possible implementation, the request to write data to cache 103 can be triggered by application 102. For example, it could be triggered by refresh and warm-up of the content delivery network (CDN).

[0056] Specifically, for a Uniform Resource Locator (URL) that the tenant specifies in application 102 as needing to be refreshed or warmed up, application 102 will actively retrieve the updated data from the origin server. This allows user 101 to access the CDN without having to return to the tenant's origin server to retrieve the data. Therefore, after the refresh or warm-up action is triggered, the cache management device 104 will determine whether to write this data to the cache 103.

[0057] The following describes a cache writing method 200 provided in this application. This cache writing method can be run on a cache management device 104.

[0058] The flowchart of cache write method 200 is shown in Figure 2. This cache writing method comprises four parts: request information recording, prediction model training, write judgment, and cache data eviction.

[0059] First, the cache management device 104 may receive request information including data write requests and data read requests. By recording parameters such as the request time and number of requests, data can be provided for training the prediction model. Specifically, the request information recording part includes steps S201 to S204.

[0060] S201: Cache management device 104 receives a data read request.

[0061] As mentioned earlier, after user 101 clicks on application 102, application 102 will initiate a data read request to cache management device 104. In other words, when a data read request arrives, cache management device 104 receives the data read request. The data read request requests that data in cache 103 be returned to application 102.

[0062] S202: The cache management device 104 records the request information for the data to be read.

[0063] When a data read request arrives, the cache management device 104 records the information of the data to be read and the request time. The data information can be an identifier (ID). Optionally, it can also be information such as a universally unique identifier (UUID) for the data. The following section will use ID as an example.

[0064] Based on the recorded time of each data read request, at least one of the following parameters can be calculated: the number of read requests per cycle for each data point and the historical average number of read requests per cycle. The cycle length can be set as needed. For example, if the cycle is 1 second, the number of read requests per cycle indicates the number of times the data is requested to be read within 1 second, while the historical average number of read requests per cycle indicates the average number of read requests per cycle for the data over a historical counting period. The historical counting period can be set as needed.

[0065] Optionally, to record the number of read requests per cycle for each data point within a certain time period, an 8-byte data group can be maintained for each data point. Further, the 8 bytes are divided into 32 equal units. Each unit is 2 bits, numbered 1, 2, ..., 32. Each unit can represent one of four states (00, 01, 10, 11) of the number of read requests per cycle for that data, corresponding to zero, one, two, and three or more read requests, respectively. In other words, by maintaining an 8-byte data group for each data point, the number of read requests per cycle for that data over the past 32 cycles can be recorded. Using the above storage method, the per-cycle read request counts for many cycles can be stored in a relatively small storage space.
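The packed 2-bit layout can be sketched with plain integer arithmetic. The function names are hypothetical, and the sketch indexes units from 0 (the current cycle) to 31 (the oldest) rather than from 1 to 32; a Python integer masked to 64 bits stands in for the 8-byte data group.

```python
UNITS = 32                      # number of cycles remembered
BITS_PER_UNIT = 2               # 00, 01, 10, 11
MAX_COUNT = 3                   # 0b11 means "three or more"
MASK_64 = (1 << (UNITS * BITS_PER_UNIT)) - 1


def record_request(history):
    """Count one request in the current cycle (unit 0), saturating at 3."""
    if (history & 0b11) < MAX_COUNT:
        history += 1  # no carry into unit 1 because unit 0 is below 0b11
    return history


def start_new_cycle(history):
    """Shift every unit one cycle into the past; the oldest unit is dropped."""
    return (history << BITS_PER_UNIT) & MASK_64


def count_at(history, cycles_ago):
    """Return the saturated count recorded `cycles_ago` cycles back (0 = current)."""
    if not 0 <= cycles_ago < UNITS:
        raise IndexError("cycles_ago must be in 0..31")
    return (history >> (cycles_ago * BITS_PER_UNIT)) & 0b11


def to_bytes(history):
    """Serialize the history into exactly 8 bytes."""
    return history.to_bytes(8, "big")


if __name__ == "__main__":
    h = 0
    for _ in range(5):          # five requests in the first cycle
        h = record_request(h)
    h = start_new_cycle(h)
    h = record_request(h)       # one request in the next cycle
    print(count_at(h, 0), count_at(h, 1))  # 1 3
```

Saturating at 3 trades precision for space: the history distinguishes data that was never, rarely, or repeatedly requested in a cycle, which is typically what a write decision needs, at a fixed cost of 8 bytes per data item.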

[0066] Optionally, based on the request time of each recorded data read request, the total number of read requests within each period can also be obtained. That is, the total number of read requests per period. Similarly, after recording the total number of read requests within each period, it is possible to retain the total number of read requests per period for a certain historical time period. The length of the historical time period can be set as needed. Furthermore, the total number of read requests per period for a certain time period can also be recorded by maintaining a data group of several bytes. For specific methods, refer to the method described above for maintaining the data group corresponding to the number of read requests per period.

[0067] It should be noted that steps S201 and S202 are not mandatory. As mentioned earlier, when the data write request is not triggered by user 101, but by application 102's refresh or warm-up action, steps S201 and S202 are not mandatory.

[0068] S203: Cache management device 104 receives a data write request.

[0069] As mentioned earlier, there are two scenarios for data write requests: triggered by user 101 and triggered by application 102. In both scenarios, the cache management device 104 will receive the data write request. The data write request requests that the requested data in the hard disk be written to the cache.

[0070] In one possible implementation, after user 101 clicks on application 102, application 102 will be triggered to initiate a data write request to cache 103. At the same time, the data write request initiated by application 102 to cache 103 will be recorded by cache management device 104.

[0071] When the requested data exists in cache 103, cache 103 will return the data to application 102. Simultaneously, cache management device 104 will record the result of the data request. That is, the data read request has been responded to.

[0072] If the requested data is not present in cache 103, application 102 will initiate a data read request to hard disk 105. Simultaneously, cache management device 104 will record the data request result. That is, the data read request did not receive a response.

[0073] When the data exists in hard disk 105, hard disk 105 will return the requested data to application 102. Simultaneously, cache management device 104 will decide whether to write the requested data to cache 103. That is, cache management device 104 receives the write request for the requested data.

[0074] In this type of possible implementation, the request to write data to cache 103 is triggered by user 101 clicking application 102. That is, it is triggered by steps S201 and S202.

[0075] In one possible implementation, the request to write data to cache 103 may be triggered by application 102. For example, it could be triggered by refresh and warm-up of the content delivery network (CDN). Therefore, cache management device 104 will receive the data write request.

[0076] In some possible implementations, the cache management device 104 will receive a batch of data write requests while accumulating a certain amount of training data. This "certain amount" can be a sampling threshold or can be set as needed.

[0077] For example, the first batch of data write requests refers to a set of data write requests received by the cache management device 104 within a certain time period. In step S205, the first data write request and the second data write request in the first batch of data write requests can be determined according to the sampling rules. The first data write request or the second data write request may contain one or more data write requests. Furthermore, the first data mentioned below refers to the data requested to be written in the first data write request. Similarly, the second data refers to the data requested to be written in the second data write request.

[0078] The following section will use the first batch of data write requests as an example.

[0079] S204: Cache management device 104 records request information for data to be written.

[0080] When the first batch of data write requests arrives, record the ID and request time of each data item in the first batch. Furthermore, at least one of the following parameters can be obtained: the number of write requests per cycle for each data item, the historical average number of write requests per cycle, and the total number of write requests per cycle. The cycle length can be set as needed. For example, if the cycle is 1 second, the number of write requests per cycle indicates the number of times each data item is requested to be written within 1 second, while the historical average number of write requests per cycle for each data item indicates the average number of write requests per cycle for each data item over the historical counting period.

[0081] Based on the request time of each recorded data write request, the total number of write requests per cycle can be obtained. This metric indicates the total number of write requests for all data received by the cache management device 104 within one second.
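The per-cycle statistics described above can be derived from recorded request times with a few counters. The class `RequestRecorder` and its method names are hypothetical, and the sketch keeps exact counts in dictionaries for clarity rather than using a compact encoding.

```python
from collections import Counter, defaultdict


class RequestRecorder:
    """Record (ID, request time) pairs and derive per-cycle request counts."""

    def __init__(self, cycle_seconds=1.0):
        self.cycle_seconds = cycle_seconds
        self.per_item = defaultdict(Counter)  # data ID -> {cycle index: count}
        self.total = Counter()                # cycle index -> count over all data

    def record(self, data_id, request_time):
        """Record one request for `data_id` arriving at `request_time` (seconds)."""
        cycle = int(request_time // self.cycle_seconds)
        self.per_item[data_id][cycle] += 1
        self.total[cycle] += 1

    def item_count(self, data_id, cycle):
        """Number of requests for one data item in one cycle."""
        return self.per_item.get(data_id, {}).get(cycle, 0)

    def total_count(self, cycle):
        """Total number of requests for all data in one cycle."""
        return self.total.get(cycle, 0)

    def historical_average(self, data_id, first_cycle, last_cycle):
        """Average requests per cycle for one item over an inclusive cycle range."""
        n_cycles = last_cycle - first_cycle + 1
        counts = self.per_item.get(data_id, {})
        total = sum(counts.get(c, 0) for c in range(first_cycle, last_cycle + 1))
        return total / n_cycles


if __name__ == "__main__":
    recorder = RequestRecorder(cycle_seconds=1.0)
    for data_id, t in [("a", 0.1), ("a", 0.9), ("b", 0.5), ("a", 2.3)]:
        recorder.record(data_id, t)
    print(recorder.item_count("a", 0))                # 2
    print(recorder.total_count(0))                    # 3
    print(recorder.historical_average("a", 0, 3))     # 0.75
```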

[0082] Furthermore, after recording the number of write requests for each data item per cycle, it is possible to retain the number of write requests for each cycle for a certain period of time. In large-scale data systems with a large data volume, storing the number of write requests for each data item per cycle for a long period of time would occupy a large amount of storage space. Therefore, in order to reduce storage space, methods such as optimizing data storage and shortening the retention period can be adopted.

[0083] Taking data storage optimization as an example, an 8-byte data group can be maintained for each piece of data. Further, these 8 bytes are divided into 32 equal units. Each unit is 2 bits, numbered 1, 2, ..., 32. Each unit can represent one of four states (00, 01, 10, 11) of the number of write requests per cycle for that data, corresponding to zero, one, two, and three or more write requests, respectively. In other words, by maintaining an 8-byte data group for each piece of data, the number of write requests per cycle for that data over the past 32 cycles can be recorded. Using this storage method, the per-cycle write request counts for many cycles can be stored in a relatively small amount of storage space.

[0084] Based on the request time of each recorded data write request, the total number of write requests within each period can be obtained. That is, the total number of write requests per period. Similarly, after recording the total number of write requests within each period, it is possible to retain the total number of write requests per period for a certain period. Furthermore, it is also possible to maintain a data group of several bytes to record the total number of write requests per period for a certain period. For the specific method of maintaining the data group, please refer to the method described above for maintaining the data group corresponding to the number of write requests per period.

[0085] It should be noted that the cache management device 104 can receive read requests and write requests independently. Therefore, there is no fixed order between steps S203 and S204 and steps S201 and S202. In other words, steps S203 and S204 can be executed before or after steps S201 and S202. Optionally, steps S203 and S204 can also be executed simultaneously with steps S201 and S202.

[0086] After receiving data read requests and data write requests in steps S201 to S204 and recording relevant information, a portion of the data and related information can be used to train the cache write prediction model. Specifically, the prediction model training part includes steps S205 to S207.

[0087] S205: The cache management device 104 determines whether the data to be written meets the sampling rules.

[0088] Based on the identifiers of each piece of data to be written obtained in step S204, it can be determined whether each piece of data to be written meets the sampling rules. Specifically, the first data write request and the second data write request in the first batch of data write requests can be determined according to the sampling rules. That is, the first piece of data to be written and the second piece of data to be written in the first batch of data to be written can be determined.

[0089] The first data to be written indicates data that meets the sampling rules, while the second data to be written indicates data that does not meet the sampling rules. Simultaneously, the first data to be written is written to the cache and used to train the write prediction model. For the second data to be written, the process proceeds to step S209.

[0090] Optionally, in large-scale data systems, the data volume is so large that it is difficult to use every data point as a training sample for the prediction model. Therefore, it is necessary to sample a portion of the data as training samples. At least two factors influence sampling: the sampling rate and the sampling rule.

[0091] The sampling rate indicates the average probability that each data point will be sampled, and its value is typically between 0 and 1. The lower the sampling rate, the slower the sample collection process.

[0092] Sampling rules can be determined based on the identifiers of data in the system. For example, sampling rules can be determined based on identifiers such as ID, URL, or UUID. This application uses ID as an example to establish sampling rules. Generally, ID can be a string or a number. Therefore, after obtaining the calculation result using the data's ID, it can be determined whether to sample the data based on the calculation result and the sampled value. The calculation method can be a neural network or a hash algorithm, etc. This application does not limit the calculation method. The sampled value can be set as needed. For example, the sampled value can be equal to the sampling rate.

[0093] Specifically, the hash operation result of the data ID is taken modulo the sampled value. If the remainder is 0, the data to be written is used as the first data to be written, that is, as training data. Conversely, if the remainder is not 0, the data is not sampled, and the process proceeds to step S209.
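
A hash-based sampling rule of this kind might look like the following sketch. It reads the rule as "hash of the ID modulo the sampled value equals 0" and treats the sampled value as a positive integer N, so that roughly one ID in N is sampled. The names `stable_hash` and `is_sampled` and the use of SHA-256 are assumptions; this application does not limit the calculation method.

```python
import hashlib


def stable_hash(data_id: str) -> int:
    """Map an ID (string form) to a non-negative integer, stable across runs."""
    digest = hashlib.sha256(data_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def is_sampled(data_id: str, sampled_value: int) -> bool:
    """True if the data is 'first data to be written' (a training sample).

    Assumption: sampled_value is a positive integer modulus.
    """
    return stable_hash(data_id) % sampled_value == 0
```

Deciding from the ID alone means the same piece of data always falls on the same side of the rule, so its later read and write requests can be attributed to the sample without extra bookkeeping.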

[0094] S206: The cache management device 104 marks the first data to be written that meets the sampling rules.

[0095] Based on the access and eviction status of the first data to be written, the cache management device 104 can mark the first data to be written.

[0096] In one possible implementation, the first data to be written can be marked based on the request status of the first data to be written within a certain period after the judgment in S205. That is, the marking status of the first data to be written can be determined based on the number of times the first data to be written is requested to be written and the number of times it is read within an average cache eviction cycle.

[0097] The length of the time period can be set as needed. The following section will use the average cache eviction period as an example; the specific method for obtaining the average cache eviction period will be explained in detail in step S213.

[0098] In this possible implementation, the first data to be written is marked as hot data when it meets at least one of the following conditions within an average cache eviction cycle: it has been requested to be written at least once, or it has been requested to be read at least once.

[0099] Conversely, if the first data to be written meets both of the following conditions within an average cache eviction cycle: it is not requested to be written, and it is not requested to be read, then the first data to be written is marked as cold data.
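
The hot and cold marking conditions can be sketched as below. This is an illustrative reading only: the `SampleWindow` record, the `label_sample` function, and the numeric encoding of hot as 1 and cold as 0 are assumptions, and the caller is assumed to count only requests that arrive inside the observation window.

```python
from dataclasses import dataclass

HOT, COLD = 1, 0  # assumed numeric encoding of the two labels


@dataclass
class SampleWindow:
    sampled_at: float   # time of the sampling decision in S205
    writes: int = 0     # write requests seen inside the window
    reads: int = 0      # read requests seen inside the window


def label_sample(window: SampleWindow, now: float, avg_eviction_period: float):
    """Return HOT, COLD, or None while the window is still open."""
    if window.writes > 0 or window.reads > 0:
        return HOT      # at least one write or read request
    if now - window.sampled_at >= avg_eviction_period:
        return COLD     # window elapsed with no request at all
    return None
```

A hot label can be assigned as soon as the first request arrives, whereas a cold label can only be assigned once the full window has elapsed.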

[0100] In one possible implementation, the first data to be written can also be labeled based on its eviction status over a future average eviction cycle.

[0101] As mentioned earlier, in S205, if the data to be written meets the sampling rules, the first data to be written is marked and then directly written to the cache. Since the cache has limited storage space, the stored data needs to be evicted according to the eviction rules. The specific eviction rules will be detailed in S212.

[0102] If the first piece of data to be written is neither requested to be written nor requested to be read within one cache eviction cycle and before being evicted by the cache, then the first piece of data to be written is marked as cold data after the cache evicts it.

[0103] It should be noted that after labeling the first piece of data to be written, the labeling status of that data will not be modified. After completing the labeling of the data to be written, the process can proceed to step S207 for training the prediction model.

[0104] S207: The cache management device 104 trains the prediction model based on the recorded information and the labeled first data to be written.

[0105] After labeling the first data to be written in S206, the prediction model can be trained based on the labeling information of each first data to be written and at least one of the following parameters: number of write requests per cycle, total number of write requests per cycle, average size of cached data, average cache eviction cycle, number of read requests per cycle, and total number of read requests per cycle.

[0106] Specifically, based on the total number of write requests per cycle obtained in S202, the average total number of write requests per cycle can be obtained. As mentioned earlier, a data set of several bytes is maintained in S202, recording the total number of write requests per cycle over a certain period of time. Furthermore, the average total number of write requests per cycle can be obtained by averaging this data set.

[0107] Optionally, the average total number of read requests per cycle obtained in S204 can be calculated. Specifically, as mentioned earlier, a data set of several bytes is maintained in S204 to record the total number of read requests per cycle over a certain period of time. Further, the average total number of read requests per cycle can be obtained by averaging this data set.

[0108] The average size of cached data indicates the average size of the data stored in the cache medium. The specific method for obtaining this value will be described in detail in step S211.

[0109] The average cache eviction period indicates the average time from when each piece of data enters the cache to when it is evicted. The specific method for obtaining this period will be described in detail in step S213.

[0110] The prediction model can be trained by taking at least one of the following as inputs: the average size of cached data, the average cache eviction period, the average total number of write requests per cycle, the number of write requests per cycle for each piece of data to be written, the average total number of read requests per cycle, and the number of read requests per cycle for each piece of data to be written; and by taking the annotation information of each piece of data to be written as the output.
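
One way to assemble these inputs into a single feature vector is sketched below. The `GlobalStats` record, the `build_features` function, and the ordering of the features are assumptions for illustration; this application does not prescribe a particular layout.

```python
from dataclasses import dataclass, field
from statistics import fmean
from typing import List


@dataclass
class GlobalStats:
    avg_cached_size: float          # average size of cached data
    avg_eviction_period: float      # average cache eviction period
    total_writes_per_cycle: List[int] = field(default_factory=list)
    total_reads_per_cycle: List[int] = field(default_factory=list)


def _mean(values) -> float:
    """Average of the recorded per-cycle totals; 0.0 if nothing is recorded yet."""
    return fmean(values) if values else 0.0


def build_features(stats: GlobalStats, item_writes, item_reads) -> List[float]:
    """Concatenate system-wide averages with the per-item request histories."""
    return [
        float(stats.avg_cached_size),
        float(stats.avg_eviction_period),
        _mean(stats.total_writes_per_cycle),
        _mean(stats.total_reads_per_cycle),
        *map(float, item_writes),
        *map(float, item_reads),
    ]
```

The same function can serve both training and prediction, which keeps the two input layouts identical.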

[0111] The prediction model can be an artificial intelligence model such as a backpropagation neural network or a long short-term memory network. It should be noted that this application does not limit the method of establishing the prediction model, and since it is prior art, it will not be described in detail.

[0112] In some possible implementations, training of the prediction model can be initiated when the number of labeled data points in S206 that have not yet been used for training reaches or exceeds a sampling threshold. The sampling threshold can be set as needed. For example, it can be set to initiate one round of training for the prediction model when the number of labeled training data points reaches 10,000.

[0113] In some possible implementations, the number of training epochs for the prediction model can be set. Based on the number of epochs the prediction model has already been trained and an epoch threshold, it can be determined whether further training of the model is needed. When the number of training epochs reaches or exceeds the epoch threshold, training of the prediction model can be discontinued.

[0114] For example, when the epoch threshold is set to 100, after the prediction model has been trained for 100 epochs, sampling and labeling of the data to be written can be discontinued. Furthermore, training of the prediction model can be discontinued.

[0115] In some possible implementations, after training the prediction model with the first data to be written, the first data to be written stored in the prediction model can be deleted.
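
The interplay of the sampling threshold, the epoch threshold, and the deletion of used samples can be sketched as follows. The `TrainingScheduler` class and its `train_fn` callback are hypothetical; the actual training routine (for example a backpropagation network) is left to the caller.

```python
class TrainingScheduler:
    """Batches labeled samples and triggers training rounds (illustrative only)."""

    def __init__(self, sampling_threshold: int = 10_000, epoch_threshold: int = 100):
        self.sampling_threshold = sampling_threshold
        self.epoch_threshold = epoch_threshold
        self.pending = []       # labeled samples not yet used for training
        self.rounds_done = 0

    @property
    def finished(self) -> bool:
        """True once the epoch threshold is reached; sampling can then stop."""
        return self.rounds_done >= self.epoch_threshold

    def add_sample(self, features, label, train_fn) -> bool:
        """Queue one labeled sample; return True if a training round was run."""
        if self.finished:
            return False
        self.pending.append((features, label))
        if len(self.pending) < self.sampling_threshold:
            return False
        train_fn(self.pending)  # caller-supplied training routine
        self.pending = []       # used samples are dropped after training
        self.rounds_done += 1
        return True
```

With the default values this would run one round per 10,000 labeled samples and stop after 100 rounds, matching the examples given in the text.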

[0116] In some possible implementations, the output, i.e., the annotation information of each of the first data to be written, can be preprocessed. Specifically, the cold / hot data in the annotation state can be converted into 0 / 1 or 1 / 0. In this type of possible implementation, the output of the prediction model can be controlled between 0 and 1.

[0117] After sampling the data to be written and training the prediction model in steps S205 to S207, a trained prediction model is obtained. For data that does not meet the sampling rules in S205, its writing probability can be determined based on the prediction model. The prediction model can be the model trained in step S207 or the model before training in step S207. Specifically, the model used for prediction needs to undergo at least one round of training.

[0118] The write judgment part includes steps S208 to S211.

[0119] S208: The cache management device 104 generates a prediction model and uses it to predict the probability of writing data to be predicted.

[0120] After training the prediction model in S207, a prediction model that has completed at least one round of training can be obtained, which is used to predict the writing status of the data to be predicted. The data to be predicted refers to the second data to be written that does not meet the special rules determined in step S209. The specific method for obtaining the second data to be written that does not meet the special rules will be described in step S209.

[0121] It should be noted that the prediction model used in this step to predict the write probability of the second batch of data to be written may differ from the prediction model trained in step S207. In some possible implementations, the cache write prediction model used in step S208 is the cache write prediction model trained in step S207. That is, the cache write prediction model trained on the data corresponding to the previous batch of data write requests. In other words, the cache write prediction model trained on the first batch of data to be written can be used to predict a portion of the data in the second batch of data corresponding to the second batch of data write requests. The occurrence time of the second batch of data write requests should be later than the occurrence time of the first batch of data write requests.

[0122] By taking at least one of the following as inputs—the number of write requests per cycle for the data to be predicted, the average size of the cached data, the average cache eviction period, the average total number of write requests per cycle, the average total number of read requests per cycle, and the number of read requests per cycle for the second data to be written—the write status of the data to be predicted can be predicted.

[0123] After using the prediction model to predict the write status of the data to be predicted, an output value between 0 and 1 can be obtained. That is, the write probability of the data to be predicted.

[0124] S209: The cache management device 104 determines whether the second data to be written, which does not meet the sampling rules, meets the special rules.

[0125] After evaluation in S205, the second piece of data to be written that does not meet the sampling rules will be transferred to S209 for processing. In S209, it will be further determined whether the data meets the special rules.

[0126] Special rules include allowing writing to certain data and prohibiting access to certain data. For example, according to the tenant's requirements, certain types of files may not be written, and files with specific domain names may not be written.

[0127] For the second piece of data to be written that meets the special rules, write it to the cache. That is, proceed to step S211.

[0128] For the second data to be written that does not meet the special rules, proceed to step S210.

[0129] It should be noted that step S209 is an optional step in cache writing method 200.

[0130] S210: The cache management device 104 determines the write status of the second data to be written based on the write probability and write threshold of the data that does not meet the special rules.

[0131] The write status of the data can be determined based on the write threshold and the write probability of the second piece of data to be written that does not meet the sampling rules. The write threshold can be set as needed.

[0132] The write probability of the second piece of data that does not meet the sampling rules can be obtained using the prediction model in S208. That is, the second piece of data that does not meet the sampling rules is used as the data to be predicted in the prediction model. Specifically, at least one of the following is used as the input to the prediction model: the number of write requests per cycle for this data, the average size of cached data, the average cache eviction period, the average total number of write requests per cycle, the average total number of read requests per cycle, and the number of read requests per cycle for this data. The write status of this data can then be predicted.

[0133] In some possible implementations, after predicting the write status of the data using a predictive model, an output value between 0 and 1 can be obtained. That is, the write probability of the data.

[0134] Furthermore, based on the write probability and the write threshold, the write status of the second data to be written that does not meet the sampling rules can be determined.

[0135] Specifically, when the write probability is greater than or equal to the write threshold, the data will be written to the cache. When the write probability is less than the write threshold, the data will not be written to the cache.

[0136] In some possible implementations, the prediction model can incorporate the write threshold into the prediction model itself. That is, the prediction model can directly output the result of whether to write the second piece of data to be written to the cache.

[0137] It should be noted that predicting the write probability with the prediction model presupposes that the prediction model has been trained at least once in step S207. In other words, the prediction model has not been trained until the accumulated amount of labeled data in step S206 reaches the sampling threshold. In that case, the write status of the second data to be written that does not meet the sampling rules can be determined according to existing technologies. These include write rules based on specific rules or frequency statistics, which will not be elaborated further.
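
The write decision across steps S205, S209, and S210 can be summarized in the following sketch. The function `decide_write` and its parameters are hypothetical: `model` stands for any callable returning a write probability between 0 and 1, and `fallback` stands for an existing rule-based or frequency-based policy used before the model has been trained.

```python
from typing import Callable, Optional, Sequence


def decide_write(
    is_sampled: bool,
    meets_special_rule: bool,
    features: Sequence[float],
    model: Optional[Callable[[Sequence[float]], float]],
    write_threshold: float,
    fallback: Callable[[Sequence[float]], bool],
) -> bool:
    """Return True if the data should be written to the cache."""
    if is_sampled:            # S205: first data is always written
        return True
    if meets_special_rule:    # S209: optional special rules
        return True
    if model is None:         # no trained model yet: use an existing policy
        return fallback(features)
    probability = model(features)          # S208: predicted write probability
    return probability >= write_threshold  # S210: compare with the threshold
```

Raising `write_threshold` admits less data to the cache; lowering it admits more. The appropriate value is a tuning choice that the text leaves open.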

[0138] S211: The cache management device 104 writes the data to be written to the cache and counts the size of the written data.

[0139] In step S211, the data to be written is determined to include at least the following three categories:

[0140] In some possible implementations, the data to be written can be the data to be written that satisfies the sampling rule in step S205. That is, the first data to be written, whose ID modulo the sampled value is 0.

[0141] In some possible implementations, the data to be written can also be the second data to be written that satisfies the special rules in step S209. Here, the second data to be written refers to the data whose ID modulo a sampled value is not 0.

[0142] In some possible implementations, the data to be written can also be a second data to be written that does not meet the special rules and whose write probability is not less than the write threshold in step S210.

[0143] Optionally, while writing the data to be written to the cache, it is necessary to record the size of each piece of data being written, so as to calculate the average size of the data in the cache medium at the current moment. Further, this average data size can be used as an input in step S207 to train the prediction model.
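
A running average of the cached data size can be kept with two counters, as in the sketch below. The `CacheSizeTracker` class is hypothetical, and subtracting the size on eviction is an assumption made so that the average reflects the data currently in the cache.

```python
class CacheSizeTracker:
    """Tracks the average size of the items currently in the cache."""

    def __init__(self):
        self.total_bytes = 0
        self.count = 0

    def on_write(self, size: int) -> None:
        """Record an item written to the cache in S211."""
        self.total_bytes += size
        self.count += 1

    def on_evict(self, size: int) -> None:
        """Remove an evicted item from the statistics (assumed behaviour)."""
        self.total_bytes -= size
        self.count -= 1

    def average(self) -> float:
        """Average item size at the current moment; 0.0 for an empty cache."""
        return self.total_bytes / self.count if self.count else 0.0
```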

[0144] After determining whether the second data to be written is to be written, a portion of the second data to be written will be written to the cache. Because the storage space of the cache is limited, some data needs to be evicted periodically. Specifically, the cache data eviction process includes steps S212 to S214.

[0145] S212: Determine whether to evict some data in the cache based on the eviction rules.

[0146] Because the data storage capacity of the cache medium is limited, it is necessary to continuously evict data to free up storage space for caching newly written data. When the amount of data written exceeds the capacity, a portion of the data can be selectively evicted. Specific eviction rules are existing technology and will not be elaborated further. Common cache data eviction algorithms include Least Recently Used (LRU) and First-In, First-Out (FIFO) methods.

[0147] For write data that is determined not to be evicted, no operation is performed on it. For write data that is determined to be evicted, the data is removed from the cache.
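
As an illustration of one of the common eviction rules mentioned above, the following sketch implements a byte-capacity LRU policy. It is a generic example, not the eviction rule of this application; the `LRUCache` class and the decision to keep a single oversized item rather than reject it are assumptions.

```python
from collections import OrderedDict


class LRUCache:
    """Byte-capacity cache that evicts the least recently used keys."""

    def __init__(self, capacity_bytes: int):
        self.capacity = capacity_bytes
        self.used = 0
        self.items = OrderedDict()  # key -> size, oldest first

    def read(self, key) -> bool:
        """Return True on a hit and mark the key as most recently used."""
        if key not in self.items:
            return False
        self.items.move_to_end(key)
        return True

    def write(self, key, size: int) -> list:
        """Insert or refresh a key; return the keys evicted to make room."""
        if key in self.items:
            self.used -= self.items.pop(key)
        self.items[key] = size
        self.used += size
        evicted = []
        while self.used > self.capacity and len(self.items) > 1:
            old_key, old_size = self.items.popitem(last=False)
            self.used -= old_size
            evicted.append(old_key)
        return evicted
```

The list returned by `write` is the natural hook for the later steps: each evicted key can be passed on to update labels and eviction statistics.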

[0148] S213: The cache management device 104 updates the labeling status of the evicted data and calculates the average cache eviction cycle.

[0149] As mentioned above, the data to be written to the cache in step S211 includes at least three cases. In some possible implementations, the data to be written may be the first data to be written that satisfies the sampling rule in step S205. That is, the first data to be written will simultaneously wait for marking in step S206.

[0150] Based on the eviction status of the first data to be written in the cache, the first data to be written can be labeled. Specifically, for evicted data, if the data belongs to the first data to be written and has not been labeled in step S206, then the first data to be written is labeled as cold data. Further, the first data to be written that has been labeled in step S206 is transferred to step S207 to be used as training data for the prediction model.

[0151] Furthermore, the average cache eviction period can be obtained based on the eviction time and write time of each evicted data. The write time is the request write time of the data recorded in step S202. Specifically, the eviction period of each evicted data is obtained by subtracting its write time from its eviction time. Further, the average cache eviction period can be determined by calculating the average eviction period of each evicted data. The evicted data can be all historically evicted data. Optionally, the evicted data can also be evicted data from a past period. Further, this average cache eviction period can be used as input in step S207 to train the prediction model.
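
The calculation of the average cache eviction period can be sketched as follows. The `EvictionPeriodTracker` class is hypothetical; its optional `window` argument covers both variants in the text, with `None` averaging over all historically evicted data and an integer averaging over only the most recent evictions.

```python
from collections import deque
from typing import Optional


class EvictionPeriodTracker:
    """Averages (eviction time - write time) over evicted data."""

    def __init__(self, window: Optional[int] = None):
        # maxlen=None keeps every period; an integer keeps only the newest ones
        self.periods = deque(maxlen=window)

    def on_evict(self, write_time: float, eviction_time: float) -> None:
        """Record the eviction period of one evicted item."""
        self.periods.append(eviction_time - write_time)

    def average(self) -> float:
        """Average cache eviction period; 0.0 before any eviction."""
        return sum(self.periods) / len(self.periods) if self.periods else 0.0
```

A bounded window lets the average follow changes in the workload, at the cost of a noisier estimate.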

[0152] S214: The cache management device 104 outputs the evicted data.

[0153] The data determined to be evicted in step S212 is output.

[0154] It should be noted that there is no fixed execution order for steps S213 and S214. That is, step S213 can be executed before or after step S214. Optionally, step S213 and step S214 can also be executed simultaneously.

[0155] This application also provides a cache management device 104. As shown in Figure 3, it includes:

[0156] Communication unit 302 is used to receive data read requests in S201 and data write requests in S203. Communication unit 302 is also used to receive a set sampling threshold in S207. In S210, communication unit 302 is used to obtain a write threshold. Communication unit 302 is also used to receive special rules set by the tenant in S209.

[0157] Storage unit 304 is used to store request information for data to be read recorded in S202 and request information for data to be written recorded in S204. In cache management method 200, storage unit 304 is used to store relevant information of the first data to be written determined in S205. Storage unit 304 is also used to store parameters in the trained prediction model in S207. The sampling threshold received in S207 and the write threshold received in S210 are also stored in storage unit 304. Storage unit 304 is also used to store special rules set by the tenant in S209.

[0158] In cache management method 200, cache unit 306 is used to cache the first data to be written determined in S205. The second data to be written that meets the special rules and is determined to be written to the cache in S209, and the second data to be written whose write probability is not less than the write threshold in S210, will both be cached in cache unit 306.

[0159] Processing unit 308 is used to execute the recording operations in S202 and S204, and store the request information of the data to be read and the data to be written into storage unit 304. Processing unit 308 is also used in S205 to judge the current data to be written, and determine the first data to be written and the second data to be written. In cache management method 200, processing unit 308 is used to label the first data to be written. Further, the operation of training the prediction model based on the recorded request information and labeling information in S207 is also performed by processing unit 308. In S208, processing unit 308 is used to perform the operation of predicting the write probability of the data to be predicted. Processing unit 308 is also used in S209 to determine whether the second data to be written meets a special rule. In S210, the operation of determining whether to write the second data to be written to the cache based on the write probability and write threshold obtained in S208 is also performed by processing unit 308. The operation of writing the data to be written to the cache and calculating the size of the data in the cache in S211 is also performed by processing unit 308. The processing unit 308 is also used to output a portion of the data in the cache according to the eviction rules in S212. In S213, the processing unit 308 performs the operation of updating the labeling status of the evicted data according to the data eviction status and calculating the average eviction cycle.

[0160] Specifically, the processing unit 308 may include a recording unit 310, a training unit 312, a decision unit 314, and an eviction unit 316.

[0161] Recording unit 310 is used to perform the recording operations in S202 and S204, and store the request information for the data to be read and the data to be written into storage unit 304. Decision unit 314 is used in S205 to judge the current data to be written and determine the first data to be written and the second data to be written. In cache management method 200, after determining the first data to be written, training unit 312 is used to label the first data to be written. Furthermore, the operation of training the prediction model based on the recorded request information and labeling information in S207 is also performed by training unit 312.

[0162] In S208, decision unit 314 performs the operation of predicting the write probability of the data to be predicted. Decision unit 314 is also used in S209 to determine whether the second data to be written meets a special rule. In S210, the operation of determining whether to write the second data to be written to the cache based on the write probability and write threshold obtained in S208 is also performed by decision unit 314. In S211, the operation of writing the data to be written to the cache and calculating the size of the data in the cache is also performed by decision unit 314. The eviction unit 316 is used in S212 to output a portion of the data in the cache according to the eviction rules. In S213, the operation of updating the labeling status of the evicted data based on the data eviction status and calculating the average eviction cycle is performed by eviction unit 316.

[0163] This application also provides a computing device 400. As shown in Figure 4, the computing device includes a bus 402, a processor 404, a memory 406, and a communication interface 408. The processor 404, memory 406, and communication interface 408 communicate with each other via the bus 402. The computing device 400 can be a server or a terminal device. It should be understood that this application does not limit the number of processors and memories in the computing device 400.

[0164] Bus 402 can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. Buses can be categorized as address buses, data buses, control buses, etc. For ease of representation, the bus 402 is represented by only one line in Figure 4, but this does not mean that there is only one bus or one type of bus. The bus 402 may include a path for transmitting information between various components of the computing device 400 (e.g., memory 406, processor 404, communication interface 408).

[0165] Processor 404 may include any one or more processors such as a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).

[0166] The memory 406 may include volatile memory, such as random access memory (RAM). The memory 406 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid-state drive (SSD). The memory 406 stores executable program code, which the processor 404 executes to implement the aforementioned cache management method 200. Specifically, the memory 406 stores the instructions with which the cache management device 104 executes the cache management method 200.

[0167] The communication interface 408 uses transceiver modules such as, but not limited to, network interface cards and transceivers to enable communication between the computing device 400 and other devices or communication networks.

[0168] This application also provides a computing device cluster. As shown in Figure 5, the computing device cluster includes at least one computing device 400. The computing device cluster may consist entirely of terminal devices, entirely of cloud servers, or a combination of cloud servers and terminal devices.

[0169] In the three deployment methods of computing device clusters described above, the memory 406 of one or more computing devices 400 in the computing device cluster can store the same instructions of the cache management device 104 for executing the cache management method 200.

[0170] In some possible implementations, one or more computing devices 400 in the computing device cluster can also be used to execute some of the instructions of the cache management device 104 for executing the cache management method 200. In other words, a combination of one or more computing devices 400 can jointly execute the instructions of the cache management device 104 for executing the cache management method 200.

[0171] It should be noted that the memory 406 in different computing devices 400 in the computing device cluster can store different instructions for executing some functions of the cache management device 104.

[0172] Figure 6 shows one possible implementation. As shown in Figure 6, two computing devices 400A and 400B are connected via the communication interface 408. The memory in computing device 400A stores instructions for executing the functions of the communication unit 302, storage unit 304, recording unit 310, training unit 312, decision unit 314, and eviction unit 316. The memory in computing device 400B stores instructions for executing the functions of the cache unit 306. In other words, the memories 406 of computing devices 400A and 400B jointly store the instructions of the cache management device 104 for executing the cache management method 200.

[0173] The connection method between computing devices shown in Figure 6 takes into account that the cache management method 200 provided in this application requires high-speed write and read operations on the data in the cache unit 306. Therefore, the cache function is delegated to the computing device 400B.

[0174] It should be understood that the functions of the computing device 400A shown in Figure 6 can also be performed by multiple computing devices 400. Similarly, the functions of the computing device 400B can also be performed by multiple computing devices 400.

[0175] In some possible implementations, one or more computing devices in a computing device cluster can be connected via a network. This network can be a wide area network (WAN) or a local area network (LAN), etc. Figure 7 shows one possible implementation. As shown in Figure 7, the two computing devices 400C and 400D are connected via a network. Specifically, each is connected to the network through its own communication interface. In this possible implementation, the memory 406 in computing device 400C stores instructions for executing the functions of the communication unit 302, storage unit 304, recording unit 310, decision unit 314, and eviction unit 316. Meanwhile, the memory 406 in computing device 400D stores instructions for executing the functions of the cache unit 306 and training unit 312.

[0176] The connection method between computing devices shown in Figure 7 takes into account that the cache management method 200 provided in this application needs to perform high-speed write and read operations on the data in the cache unit 306 and a large amount of computation to train the prediction model. Therefore, the functions implemented by the cache unit 306 and the training unit 312 are delegated to the computing device 400D.

[0177] It should be understood that the functions of the computing device 400C shown in Figure 7 can also be performed by multiple computing devices 400. Similarly, the functions of the computing device 400D can also be performed by multiple computing devices 400.

[0178] This embodiment of the application also provides a computer-readable storage medium. The computer-readable storage medium can be any available medium on which a computing device can store data, or a data storage device such as a data center containing one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state drive). The computer-readable storage medium includes instructions that instruct the computing device to execute the cache management method 200 applied to the cache management device 104 described above.

[0179] This application also provides a computer program product containing instructions. The computer program product may be a software or program product containing instructions, capable of running on a computing device or stored on any usable medium. When the computer program product is run on at least one computing device, it causes the at least one computing device to perform the cache management method 200 described above.

[0180] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the protection scope of the technical solutions of the embodiments of the present invention.

Claims

1. A cache management method, characterized in that the cache management method is used in a cache management device, and the method includes: receiving a first data write request, the first data write request being used to request that the first data in the hard disk be written to the cache; training a cache write prediction model based on the relevant parameters of the first data; writing the first data into the cache; receiving a second data write request, the second data write request being used to request that second data in the hard disk be written to the cache; and determining, based on the cache write prediction model, whether to write the second data into the cache; wherein the step of training the cache write prediction model based on the relevant parameters of the first data includes: using the average size of the data in the cache, the total number of write requests per historical cycle, the total number of read requests per historical cycle, and the average eviction cycle of the data in the cache, along with the relevant parameters of the first data, as the training input of the cache write prediction model; determining, based on the request and eviction status of the first data within one average eviction cycle after the request for the first data occurs, the write probability of the first data as the training output of the cache write prediction model; and training the cache write prediction model based on the training input and the training output; and wherein the relevant parameters of the first data include at least one of the following: the number of historical requests to write the first data, and the number of historical requests to read the first data.
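For illustration only, the training step recited in claim 1 might be sketched as follows. The claim does not fix a model type or an exact labeling rule, so everything here is an assumption: the names `CacheStats`, `build_features`, `label_write_probability` and `OnlineLogisticModel` are hypothetical, the model is a plain logistic regression updated one sample at a time, and the training output is simplified to 1.0 when the data is requested again within one average eviction cycle and 0.0 otherwise (the eviction status mentioned in the claim is not modeled separately).

```python
import math
from dataclasses import dataclass


@dataclass
class CacheStats:
    """Global statistics named in claim 1 (field names are hypothetical)."""
    avg_data_size: float       # average size of the data in the cache
    writes_per_cycle: float    # total number of write requests per historical cycle
    reads_per_cycle: float     # total number of read requests per historical cycle
    avg_eviction_cycle: float  # average eviction cycle of the data in the cache


def build_features(stats, hist_writes, hist_reads):
    """Training input: global statistics plus the per-data request counters."""
    return [stats.avg_data_size, stats.writes_per_cycle, stats.reads_per_cycle,
            stats.avg_eviction_cycle, float(hist_writes), float(hist_reads)]


def label_write_probability(request_time, next_request_time, avg_eviction_cycle):
    """Assumed labeling rule: 1.0 if the data is requested again within one
    average eviction cycle after the request, otherwise 0.0."""
    if next_request_time is None:
        return 0.0
    within_cycle = next_request_time - request_time <= avg_eviction_cycle
    return 1.0 if within_cycle else 0.0


class OnlineLogisticModel:
    """Assumed model: logistic regression trained by per-sample gradient steps."""

    def __init__(self, n_features, learning_rate=0.01):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() in range
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        """One online training step on a (training input, training output) pair."""
        error = self.predict(x) - y
        self.weights = [w - self.learning_rate * error * xi
                        for w, xi in zip(self.weights, x)]
        self.bias -= self.learning_rate * error
```

In this sketch the features are assumed to be pre-scaled to comparable ranges; a real deployment would likely need normalization, which the claim does not address.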

2. The method as described in claim 1, characterized in that the first data write request and the second data write request belong to the first batch of data write requests, and the method further includes: receiving the first batch of data write requests; and determining, according to the sampling rules, the first data write request and the second data write request from the first batch of data write requests.

3. The method as described in claim 2, characterized in that determining, according to the sampling rules, the first data write request and the second data write request from the first batch of data write requests includes: obtaining the identifier of the data to be written carried in each data write request in the first batch of data write requests; determining, as the first data write request, a data write request in the first batch of data write requests for which the hash value of the identifier of the data to be written is divisible by the sampled value; and determining, as the second data write request, a data write request in the first batch of data write requests for which the hash value of the identifier of the data to be written is not divisible by the sampled value.
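The hash-based sampling rule of claim 3 can be illustrated with a short sketch. The hash function is not specified in the claim; SHA-256 truncated to 64 bits is an assumed choice here (Python's built-in `hash()` for strings is randomized per process, so it would not give a stable split). The function names `identifier_hash`, `is_training_request` and `split_batch` are hypothetical.

```python
import hashlib


def identifier_hash(identifier: str) -> int:
    """Stable integer hash of a data identifier (SHA-256 is an assumption)."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def is_training_request(identifier: str, sampled_value: int) -> bool:
    """True if the hash of the identifier is divisible by the sampled value,
    i.e. the request is treated as a 'first data write request'."""
    return identifier_hash(identifier) % sampled_value == 0


def split_batch(identifiers, sampled_value):
    """Split a batch into training requests (hash divisible by the sampled
    value) and prediction requests (hash not divisible)."""
    training, prediction = [], []
    for identifier in identifiers:
        if is_training_request(identifier, sampled_value):
            training.append(identifier)
        else:
            prediction.append(identifier)
    return training, prediction
```

With a reasonably uniform hash, a sampled value of N routes roughly one request in N to training, and the same identifier always lands on the same side of the split.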

4. The method according to any one of claims 1 to 3, characterized in that the method further includes: receiving the second batch of data write requests; and determining, based on the cache write prediction model, whether to write at least one piece of data from the data to be written corresponding to the second batch of data write requests into the cache.

5. The method as described in claim 1, characterized in that the step of determining whether to write the second data to the cache based on the cache write prediction model includes: obtaining the write threshold; obtaining the write probability of the second data based on the cache write prediction model; and determining, based on the write threshold and the write probability of the second data, whether to write the second data to the cache.

6. The method as described in claim 5, characterized in that the step of obtaining the write probability of the second data based on the cache write prediction model includes: using at least one of the following parameters, namely the average size of data in the cache, the average eviction cycle of data in the cache, the total number of write requests per historical cycle, and the total number of read requests per historical cycle, and the relevant parameters of the second data, as the prediction input of the cache write prediction model; and using the prediction input as the input to the cache write prediction model to obtain the write probability of the second data.
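As a rough sketch of the decision flow in claims 5 and 6, the prediction input is passed to the model to obtain a write probability, which is then compared with the write threshold. The claims do not say whether the comparison is strict, so admitting data when the probability is greater than or equal to the threshold is an assumption. `decide_cache_write` and `toy_model` are hypothetical names, and `toy_model` is only a stand-in for a trained cache write prediction model.

```python
from typing import Callable, Sequence


def decide_cache_write(prediction_input: Sequence[float],
                       model: Callable[[Sequence[float]], float],
                       write_threshold: float) -> bool:
    """Return True if the data should be written into the cache."""
    write_probability = model(prediction_input)
    # Assumed rule: admit when the probability reaches the threshold.
    return write_probability >= write_threshold


def toy_model(prediction_input: Sequence[float]) -> float:
    """Hypothetical stand-in model: more historical reads of the data
    (last feature) yields a higher write probability."""
    historical_reads = prediction_input[-1]
    return historical_reads / (historical_reads + 1.0)


# Prediction input layout assumed here: average data size, average eviction
# cycle, writes per historical cycle, reads per historical cycle, historical
# writes of the data, historical reads of the data.
hot_data = [0.5, 0.3, 0.2, 0.8, 1.0, 9.0]
cold_data = [0.5, 0.3, 0.2, 0.8, 1.0, 0.0]

write_hot = decide_cache_write(hot_data, toy_model, write_threshold=0.5)
write_cold = decide_cache_write(cold_data, toy_model, write_threshold=0.5)
```

Raising the write threshold makes the cache more selective, which is one way such a scheme could keep one-off scan traffic from displacing frequently read data.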

7. A cache management device, characterized in that the device includes a communication unit and a processing unit: the communication unit is used to receive a first data write request, which is used to request that the first data in the hard disk be written to the cache; the processing unit is configured to train a cache write prediction model based on the relevant parameters of the first data, and to write the first data into the cache; the communication unit is further configured to receive a second data write request, the second data write request being used to request that the second data in the hard disk be written to the cache; the processing unit is further configured to determine whether to write the second data into the cache based on the cache write prediction model; the processing unit is used to take at least one of the following: average size of data in the cache, total number of write requests per historical cycle, total number of read requests per historical cycle, average data eviction cycle in the cache, and relevant parameters of the first data as the training input of the cache write prediction model; to determine, based on the request and eviction status of the first data within one average eviction cycle after the request for the first data occurs, the write probability of the first data as the training output of the cache write prediction model; and to train the cache write prediction model based on the training input and the training output; the relevant parameters of the first data include at least one of the following: the number of historical requests to write the first data, and the number of historical requests to read the first data.

8. The apparatus as claimed in claim 7, characterized in that the communication unit is used to receive the first batch of data write requests; and the processing unit is used to determine the first data write request and the second data write request from the first batch of data write requests according to the sampling rules.

9. The apparatus as claimed in claim 8, characterized in that the processing unit is configured to: obtain the identifier of the data to be written carried by each data write request in the first batch of data write requests; determine, as the first data write request, a data write request in the first batch of data write requests for which the hash value of the identifier of the data to be written is divisible by the sampled value; and determine, as the second data write request, a data write request in the first batch of data write requests for which the hash value of the identifier of the data to be written is not divisible by the sampled value.

10. The apparatus according to any one of claims 7 to 9, characterized in that the processing unit is configured to receive a second batch of data write requests and, based on the cache write prediction model, determine whether to write at least one piece of data from the data to be written corresponding to the second batch of data write requests into the cache.

11. The apparatus as claimed in claim 11, characterized in that the communication unit is used to obtain a write threshold; the processing unit is used to obtain the write probability of the second data according to the cache write prediction model, and to determine whether to write the second data into the cache according to the write threshold and the write probability of the second data.

12. The apparatus as claimed in claim 11, characterized in that the processing unit is configured to use at least one of the average size of data in the cache, the average eviction cycle of data in the cache, the total number of write requests per historical cycle, the total number of read requests per historical cycle, and the relevant parameters of the second data as the prediction input of the cache write prediction model; and to use the prediction input as the input of the cache write prediction model to obtain the write probability of the second data.

13. A computing device cluster, characterized in that it includes at least one computing device, each computing device including a processor and memory; the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device to cause the computing device cluster to perform the method as described in any one of claims 1 to 6.

14. A computer program product containing instructions, characterized in that, when the instructions are executed by a cluster of computing devices, the instructions cause the cluster of computing devices to perform the method as described in any one of claims 1 to 6.

15. A computer-readable storage medium, characterized in that it includes computer program instructions which, when executed by a cluster of computing devices, cause the cluster of computing devices to perform the method as described in any one of claims 1 to 6.