Cache management method and related device

By merging the read cache and write cache into the same data structure and utilizing address mapping and a linked list of dirty data blocks in a hash table, the problem of low lookup efficiency caused by the independent read cache and write cache in existing technologies is solved, thus achieving efficient cache management.

CN118760397B · Active · Publication Date: 2026-06-16 · DAPUSTOR CORP

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
DAPUSTOR CORP
Filing Date
2024-06-28
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

In existing cache management methods, the read cache and the write cache are set up independently, so a read lookup requires two lookup operations, one in the data structure of each cache, which results in low lookup efficiency.

Method used

The read cache and write cache are merged into a single data structure. By using address mapping and a linked list of dirty data blocks in a hash table, the overlapping area between the logical address range and the target dirty data block is determined. The data that has been transferred is then retrieved by searching within the merged data structure.

Benefits of technology

Merging the caches shortens the lookup path so that a read lookup needs only a single lookup operation, which improves the overall lookup efficiency of cache management.

✦ Generated by Eureka AI based on patent content.

[Figure CN118760397B_ABST]

Abstract

Embodiments of the present application disclose a cache management method and related device for managing a cache while improving lookup efficiency. The method comprises: obtaining a read search command; performing address mapping on a first logical address range based on a first preset address mapping rule to obtain at least one target first logical address range; determining, among at least one target dirty data block in a dirty data block linked list of a target hash table, at least one first target dirty data block whose logical address range overlaps the at least one target first logical address range, the first target dirty data block being either a first read dirty data block or a first write dirty data block; if the data transmission state of the at least one first target dirty data block is "transmission complete", obtaining the target data in the overlapping logical address range of the at least one first target dirty data block; and obtaining a search result of the read search command based on the target data.

Description

Technical Field

[0001] This application relates to the field of cache management, and more specifically, to cache management methods, cache management devices, computer equipment, computer-readable storage media, and computer program products containing instructions or computer programs.

Background Technology

[0002] Currently, most enterprise-grade SSDs (Solid State Drives) use DRAM space as a cache for read and write I/O. Due to the physical characteristics of DRAM, host operations on data in DRAM are much faster than operations on data in NAND. Efficiently utilizing cache space is a key factor in reducing read and write latency, improving QoS, and optimizing performance. There are generally two types of caches on the disk: a write cache and a read cache. The write cache holds user data sent by the host. When certain conditions are met, the data is flushed to NAND in the background for persistence and is released after it has been persisted. The read cache holds prefetched data: the disk prepares in the cache, in advance, the data that the host is expected to read. If a read command hits, the data is returned to the host directly from the cache. When the host issues read commands to the disk, the read lookup can therefore be performed on the cache.

[0003] The existing cache management method is a read-write cache separation management method, which specifically refers to setting up separate read caches and write caches in the SSD. The read cache and write cache belong to different data structures and are independent of each other, without interference. When performing a read lookup, the cache lookup needs to be performed separately in the write cache and the read cache. For example, suppose the current host needs to access a data block. First, it needs to check if the data block exists in the write cache. If it does not exist, it needs to check if the data block exists in the read cache. If the data block is neither in the write cache nor the read cache, the data block needs to be read from the NAND flash memory.
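The separated lookup described above can be sketched as follows. This is only an illustration: the function name `legacy_read_lookup` and the use of plain dictionaries keyed by a block address are assumptions made for the example, not structures taken from the application.

```python
def legacy_read_lookup(lba, write_cache, read_cache, nand):
    """Return (data, cache_lookups) under the separated-cache scheme.

    The dict-based caches are hypothetical stand-ins for the two
    independent data structures described in the text.
    """
    lookups = 0
    for cache in (write_cache, read_cache):  # write cache is checked first
        lookups += 1
        if lba in cache:
            return cache[lba], lookups
    return nand[lba], lookups  # missed both caches, fall back to NAND


write_cache = {1: b"w"}
read_cache = {2: b"r"}
nand = {1: b"old", 2: b"r", 3: b"n"}
```

A block that lives only in the read cache, or only in NAND, costs two cache lookups in this scheme, which is the overhead the application sets out to remove.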

[0004] However, in existing cache management methods, read cache and write cache are set up independently, that is, read cache and write cache belong to different data structures and are managed by their respective data structures. Therefore, when performing a read lookup operation, it is necessary to search in the data structure corresponding to the read cache and the data structure corresponding to the write cache, which requires two lookup operations. As a result, the overall lookup path is long and the lookup efficiency is low.

Summary of the Invention

[0005] This application provides a cache management method, cache management device, computer equipment, computer-readable storage medium, and computer program product containing instructions or computer programs for cache management while improving lookup efficiency.

[0006] In a first aspect, embodiments of this application provide a cache management method, including:

[0007] Obtain a read search command, wherein the read search command is used to search for and read data within a first logical address range;

[0008] Based on the first preset address mapping rule, the first logical address range is mapped to obtain at least one target first logical address range after address mapping;

[0009] In at least one target dirty data block in the dirty data block linked list of the target hash table, at least one first target dirty data block whose logical address range overlaps with the at least one target first logical address range is determined, wherein the first target dirty data block is either a first read dirty data block or a first write dirty data block.

[0010] If the data transmission status of the at least one first target dirty data block is "transmission completed", then the target data with overlapping logical address ranges in the at least one first target dirty data block is obtained.

[0011] The search results of the first read search command are obtained based on the target data.

[0012] In a second aspect, embodiments of this application provide a cache management device, including:

[0013] The obtaining unit is used to obtain a read search command, wherein the read search command is used to search for and read data within a first logical address range;

[0014] The address mapping unit is used to perform address mapping on the first logical address range based on a first preset address mapping rule to obtain at least one target first logical address range after address mapping.

[0015] The determining unit is configured to determine, in at least one target dirty data block of the dirty data block linked list of the target hash table, at least one first target dirty data block whose logical address range overlaps with the at least one target first logical address range, wherein the first target dirty data block is either a first read dirty data block or a first write dirty data block.

[0016] The acquisition unit is configured to acquire target data with overlapping logical address ranges in the at least one first target dirty data block if the data transmission status of the at least one first target dirty data block is "transmission completed".

[0017] The obtaining unit is further configured to obtain the search result of the first read search command based on the target data.

[0018] In a third aspect, embodiments of this application provide a computer device, including:

[0019] A central processing unit, a memory, input/output interfaces, wired or wireless network interfaces, and a power supply;

[0020] The memory is either a short-term storage memory or a persistent storage memory;

[0021] The central processing unit is configured to communicate with the memory and execute instructions in the memory to perform the aforementioned cache management method.

[0022] In a fourth aspect, embodiments of this application provide a computer-readable storage medium including instructions that, when executed on a computer, cause the computer to perform the aforementioned cache management method.

[0023] In a fifth aspect, embodiments of this application provide a computer program product containing instructions that, when run on a computer, cause the computer to execute the aforementioned cache management method.

[0024] As can be seen from the above technical solutions, the embodiments of this application have the following advantages: A read lookup command can be obtained, and the first logical address range can be mapped based on a first preset address mapping rule to obtain at least one target first logical address range after address mapping. In at least one target dirty data block in the dirty data block linked list of the target hash table, at least one first target dirty data block whose logical address range overlaps with the at least one target first logical address range is determined. The first target dirty data block is either a first read dirty data block or a first write dirty data block. If the data transmission status of the at least one first target dirty data block is "transmission complete," then the target data with overlapping logical address ranges in the at least one first target dirty data block is obtained, and the lookup result of the first read lookup command is obtained based on the target data. The read cache and write cache are merged into the same data structure, and data in both the read cache and write cache can be managed simultaneously in this data structure. Therefore, when performing a read lookup operation, it is not necessary to search in the data structure corresponding to the read cache and the data structure corresponding to the write cache separately; only the merged data structure needs to be searched. Only one lookup operation is required, thus the overall lookup path is shorter and the lookup efficiency is higher.

Attached Figure Description

[0025] Figure 1 is a schematic diagram of the architecture of a cache management system disclosed in an embodiment of this application;

[0026] Figure 2 is a flowchart illustrating a cache management method disclosed in an embodiment of this application;

[0027] Figure 3 is a flowchart illustrating another cache management method disclosed in an embodiment of this application;

[0028] Figure 4 is a flowchart illustrating a read lookup method for unified management of read and write caches disclosed in an embodiment of this application;

[0029] Figure 5 is a flowchart illustrating another cache management method disclosed in an embodiment of this application;

[0030] Figure 6 is a schematic diagram of the existing SSD write cache management framework;

[0031] Figure 7 is a schematic diagram of the existing SSD read cache management framework;

[0032] Figure 8 is a flowchart illustrating another cache management method disclosed in an embodiment of this application;

[0033] Figure 9 is a flowchart illustrating a unified read cache insertion method for cache management disclosed in an embodiment of this application;

[0034] Figure 10 is a flowchart illustrating another cache management method disclosed in an embodiment of this application;

[0035] Figure 11 is a flowchart illustrating a unified write cache insertion method for cache management disclosed in an embodiment of this application;

[0036] Figure 12 is a schematic diagram of the structure of a cache management device disclosed in an embodiment of this application;

[0037] Figure 13 is a schematic diagram of another cache management device disclosed in an embodiment of this application;

[0038] Figure 14 is a schematic diagram of the structure of a computer device disclosed in an embodiment of this application.

Detailed Implementation

[0039] This application provides a cache management method, cache management device, computer equipment, computer-readable storage medium, and computer program product containing instructions or computer programs for cache management while improving lookup efficiency.

[0040] Referring to Figure 1, the architecture of the cache management system 100 in this embodiment includes:

[0041] The system includes a cache management device 101 and a file management device 102 (a software program such as a file system or application). When performing cache management, the cache management device 101 can connect to the file management device 102. The cache management device 101 can receive read search commands sent by the file management device 102, and can perform address mapping on a first logical address range based on a first preset address mapping rule to obtain at least one target first logical address range after address mapping. In at least one target dirty data block in the dirty data block linked list of the target hash table, at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range is identified. The first target dirty data block is either a first read dirty data block or a first write dirty data block. If the data transmission status of at least one first target dirty data block is "transmission complete," then the target data with overlapping logical address ranges in the at least one first target dirty data block is obtained. Based on the target data, the search result of the first read search command is obtained and returned to the file management device 102.

[0042] The cache management device 101 can be firmware or software used to manage data caching between computer memory and external storage.

[0043] The file management device 102 can be a device corresponding to a software program, which includes, but is not limited to, file systems and applications. The file system is a software component of the operating system responsible for managing files and directories and providing a file system interface for applications. Applications are software programs that run on computers or other electronic devices, designed to perform specific tasks or complete specific jobs, such as document editing, and can run on various operating systems and platforms.

[0044] The following are explanations of some key terms in the embodiments of this application:

[0045] DirtyNode: DirtyNode is a write cache management data structure with unit data granularity. Its main data fields include logical address range, data address, data transfer status, cache reference count, and a set of bitmaps used to manage data status.

[0046] Data states represented by the bitmaps in the bitmap set: each bitmap represents one data state. The bitmaps are Dirtybmp, Validbmp, Pendingbmp, Sentbmp, and Donebmp.

[0047] Dirtybmp: Indicates data that can be flushed to NAND; bit 1 indicates a flushable state. The bitmap may contain non-contiguous bit segments, therefore DirtyNode data may require multiple flushes. After data flushing, the corresponding bit segment is cleared to 0. A Dirtybmp value of 0 does not necessarily mean that all DirtyNode data has been flushed; Pendingbmp must be considered. The initial value is LEN << (SLBA % 32).

[0048] Validbmp: Identifies valid data. When data is "invalidated," the corresponding bit needs to be cleared to 0. This bitmap needs to be checked when a read hits the cache. The initial value is the same as Dirtybmp.

[0049] Pendingbmp: Indicates data awaiting flushing. Typically, during a write conflict, old data is being flushed, and the data in the new DirtyNode within the conflict range needs to be flushed with a delay. After the old data has been flushed, Pendingbmp clears the corresponding bit segments to 0 based on the flushed data and updates Dirtybmp.

[0050] Sentbmp: Identifies the data that has been sent and flushed. This bitmap actually represents the data being flushed to the NAND and the data that has been flushed, and is used to assist in updating Dirtybmp.

[0051] Donebmp: Indicates data that has been successfully flushed. It is usually used together with Sentbmp to calculate the data currently being flushed.
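To make the field list and the bitmap relationships concrete, here is a minimal sketch. The field names, types, and the two helper functions are assumptions for illustration only. In particular, computing the in-flight data as `sent & ~done` is one plausible reading of "used together with Sentbmp to calculate the data currently being flushed", and `flush_outstanding` is a hypothetical helper, not something defined in the application.

```python
from dataclasses import dataclass


@dataclass
class DirtyNode:
    # Field names are illustrative; the text lists the fields but not their layout.
    slba: int                    # start of the logical address range
    length: int                  # number of data units covered
    data_addr: int = 0           # where the cached data lives
    transfer_done: bool = False  # data transfer status
    ref_count: int = 0           # cache reference count
    dirty_bmp: int = 0           # units that can be flushed to NAND
    valid_bmp: int = 0           # units holding valid data
    pending_bmp: int = 0         # units whose flush is delayed
    sent_bmp: int = 0            # units sent for flushing (in flight or finished)
    done_bmp: int = 0            # units whose flush has finished


def in_flight(node: DirtyNode) -> int:
    """Units sent to NAND whose flush has not finished yet (assumed formula)."""
    return node.sent_bmp & ~node.done_bmp


def flush_outstanding(node: DirtyNode) -> bool:
    """True while any unit is flushable, delayed, or still in flight (assumed rule)."""
    return bool(node.dirty_bmp | node.pending_bmp | in_flight(node))
```

The second helper reflects the remark that a Dirtybmp of 0 alone does not prove that everything has been flushed.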

[0052] Hash tables, also known as hash lists, are data structures that allow for fast insertion and retrieval, widely used in the Linux kernel. The principle is as follows: when a node is to be inserted, its information is first mapped to a bucket in the hash table using a hash function, and then the node is added to the linked list within that bucket. When searching for a node, the same hash function is used to find the bucket based on the node information, and then the search is performed on the linked list within that specific bucket.
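The bucket-plus-list principle can be sketched in a few lines. This is a generic illustration, not the kernel's or the application's implementation: the class name `ChainedHashTable` is hypothetical, Python's built-in `hash` stands in for the hash function, and a Python list stands in for each bucket's linked list.

```python
class ChainedHashTable:
    """Toy hash table with separate chaining (one list per bucket)."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        # Map the node's key to a bucket with the hash function.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        # Add the node to the list inside its bucket.
        self._bucket(key).append((key, value))

    def find(self, key):
        # Same hash function, then search only that bucket's list.
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None
```

Keys that collide simply share a bucket, so a lookup only ever walks the nodes of one bucket rather than every node in the table.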

[0053] Based on the cache management system shown in Figure 1, please refer to Figure 2. Figure 2 is a flowchart illustrating a cache management method disclosed in an embodiment of this application. The method includes:

[0054] 201. Obtain the read search command, wherein the read search command is used to search for and read data in the first logical address range.

[0055] In this embodiment, when performing cache management, a read lookup command can be obtained, wherein the read lookup command is used to find and read data in the first logical address range.

[0056] Specifically, for example, the first logical address range could be [20KB, 100KB] or [100KB, 150KB].

[0057] 202. Based on the first preset address mapping rule, perform address mapping on the first logical address range to obtain at least one target first logical address range after address mapping.

[0058] After obtaining the read search command, the first logical address range can be mapped based on the first preset address mapping rule to obtain at least one target first logical address range after address mapping.

[0059] Specifically, continuing with the example above, for a first logical address range of [20KB, 100KB], the first logical address range [20KB, 100KB] can be mapped based on the first preset address mapping rule to obtain a target first logical address range [20KB, 100KB]. For a first logical address range of [100KB, 150KB], the first logical address range [100KB, 150KB] can be mapped based on the first preset address mapping rule to obtain two target first logical address ranges [100KB, 128KB] and [128KB, 150KB].

[0060] 203. In at least one target dirty data block in the dirty data block linked list of the target hash table, determine at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range, wherein the first target dirty data block is either a first read dirty data block or a first write dirty data block.

[0061] After the first logical address range has been mapped based on the first preset address mapping rule to obtain at least one target first logical address range, at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range can be determined among the at least one target dirty data block in the dirty data block linked list of the target hash table. The first target dirty data block is either the first read dirty data block or the first write dirty data block.

[0062] Specifically, for example, given a first logical address range of [20KB, 100KB], after obtaining a target first logical address range of [20KB, 100KB] after address mapping, it can be determined that there is a first target dirty data block (which could be a first read dirty data block or a first write dirty data block, with a logical address range of [0KB, 50KB]) overlapping with this target first logical address range of [20KB, 100KB], with an overlap range of [20KB, 50KB]. For a first logical address range of [100KB, 150KB], after obtaining two target first logical address ranges of [100KB, 128KB] and [128KB, 150KB] after address mapping, the method for determining at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range is similar to the above and is not limited here.
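The overlap in this example is an ordinary interval intersection. A minimal sketch, assuming ranges are `(start, end)` tuples in KB treated as half-open intervals (the application does not state whether the end address is inclusive):

```python
def overlap(a, b):
    """Return the intersection of two (start, end) ranges, or None if disjoint."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start < end else None
```

With the numbers from the text, `overlap((20, 100), (0, 50))` yields `(20, 50)`, while a block at `(0, 50)` does not intersect a target range of `(100, 128)`.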

[0063] The method for determining at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range in at least one target dirty data block in the dirty data block linked list of the target hash table may include, but is not limited to, the following two methods:

[0064] (1) Using only traversal:

[0065] Based on the target logical address range, the dirty data block linked list of the target hash table is traversed to determine the dirty data block that is hit. The target hash table includes a dirty data block linked list.

[0066] Specifically, you can start traversing from the head node until you find a dirty data block that meets the criteria.

[0067] (2) The method of traversing after determining the corresponding bucket:

[0068] Based on the target logical address range and hash algorithm, the corresponding bucket is determined in the target hash table. The linked list of the bucket is traversed to determine the dirty data blocks. The target hash table includes multiple hash buckets, and each hash bucket includes a linked list of dirty data blocks.

[0069] Specifically, we can first use a hash algorithm to obtain the buckets corresponding to the target logical address range, and then traverse the linked list starting from the head node of that bucket until we find a dirty data block that meets the criteria. Compared to the method of simply traversing, this method can reduce the number of invalid traversals and improve search efficiency.
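The difference between the two methods can be sketched as below. Everything specific here is an assumption for illustration: the hash (base-region number modulo the bucket count), the 128KB region size, the four buckets, and the use of Python lists in place of linked lists. The sketch also assumes that neither a block nor a target range crosses a region boundary.

```python
BASE_REGION_KB = 128   # assumed base region size
NUM_BUCKETS = 4        # assumed bucket count


def overlaps(block, rng):
    """True if half-open (start, end) ranges share at least one address."""
    return block[0] < rng[1] and rng[0] < block[1]


def bucket_index(start_kb):
    """Assumed hash: base-region number modulo the bucket count."""
    return (start_kb // BASE_REGION_KB) % NUM_BUCKETS


def find_by_traversal(blocks, rng):
    """Method (1): walk the list from the head; returns (block, nodes_visited)."""
    for visited, block in enumerate(blocks, start=1):
        if overlaps(block, rng):
            return block, visited
    return None, len(blocks)


def find_by_bucket(buckets, rng):
    """Method (2): hash to a bucket, then walk only that bucket's list."""
    return find_by_traversal(buckets[bucket_index(rng[0])], rng)


blocks = [(0, 50), (256, 300), (384, 400), (128, 160)]
buckets = [[] for _ in range(NUM_BUCKETS)]
for b in blocks:
    buckets[bucket_index(b[0])].append(b)
```

For the target range `(128, 150)`, plain traversal visits four nodes before it hits `(128, 160)`, whereas the bucket-first method visits one.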

[0070] It is worth mentioning that the first target dirty data block can be either the first read dirty data block or the first write dirty data block. In other words, the read cache and the write cache are merged into the same data structure, and the data in the read cache and the write cache can be managed simultaneously in this data structure. Therefore, during the read lookup process, when determining at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range, it is only necessary to search in the merged data structure. Only one lookup operation is required. Therefore, the path to determine at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range is shorter and more efficient.

[0071] 204. If the data transmission status of at least one first target dirty data block is "transmission completed", then acquire the target data with overlapping logical address ranges in at least one first target dirty data block.

[0072] After determining at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range in at least one target dirty data block in the dirty data block linked list of the target hash table, if the data transmission status of at least one first target dirty data block is complete, the target data with overlapping logical address ranges in at least one first target dirty data block can be obtained.

[0073] Specifically, if the first target dirty data block is the first read dirty data block, a data transmission status of "transmission complete" indicates that the first read dirty data block has finished reading data from the NAND. If the first target dirty data block is the first write dirty data block, a data transmission status of "transmission complete" indicates that the host has finished writing data into the first write dirty data block. More specifically, continuing with the example above, for a first logical address range of [20KB, 100KB], after determining the first target dirty data block (either the first read dirty data block or the first write dirty data block, with a logical address range of [0KB, 50KB]), if the data transmission status of the first target dirty data block is "transmission complete," then the target data in the overlapping logical address range ([20KB, 50KB]) within the first target dirty data block is obtained.

[0074] Understandably, acquiring the target data in the overlapping logical address range only when the data transmission status of the at least one first target dirty data block is "transmission complete" ensures that the acquired data is up-to-date and accurate, thereby improving the accuracy of data retrieval. If data were acquired before the transmission had completed, old or inaccurate data might be obtained, affecting the accuracy of data retrieval.
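The gating on transmission status can be sketched as follows. The names `TransferState`, `Block`, and `read_overlap` are hypothetical, one byte stands in for 1KB of data, and returning `None` for a block whose transfer is still in progress is an assumption of this sketch rather than the behaviour prescribed by the application.

```python
from dataclasses import dataclass
from enum import Enum


class TransferState(Enum):
    IN_PROGRESS = 0
    DONE = 1


@dataclass
class Block:
    start_kb: int
    end_kb: int
    state: TransferState
    data: bytes  # toy payload: one byte per KB


def read_overlap(block, start_kb, end_kb):
    """Return the overlapping slice only if the block's transfer is complete."""
    if block.state is not TransferState.DONE:
        return None  # data may still be changing; do not serve it yet
    lo = max(block.start_kb, start_kb)
    hi = min(block.end_kb, end_kb)
    if lo >= hi:
        return None  # no overlap with the requested range
    return block.data[lo - block.start_kb:hi - block.start_kb]
```

For a block covering `[0, 50)` and a request for `[20, 100)`, the call returns nothing while the transfer is in progress and the 30 bytes for `[20, 50)` once the state is `DONE`.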

[0075] 205. Obtain the search results of the first read search command based on the target data.

[0076] If the data transmission status of at least one first target dirty data block is "transmission complete", then after obtaining the target data with overlapping logical address ranges in at least one first target dirty data block, the search result of the first read search command can be obtained based on the target data.

[0077] Specifically, continuing with the example above, after obtaining the target data (data with a logical address range of [20KB, 50KB]) in the overlapping range within the first target dirty data block (logical address range of [0KB, 50KB]), the data for the remaining part of the requested range, [50KB, 100KB], can be obtained separately (e.g., from NAND) and combined with the target data of [20KB, 50KB], and the combined data is used as the search result of the first read search command.

[0078] It is understandable that, for a first logical address range of [20KB, 100KB], if the logical address range of the first target dirty data block with overlap is determined to include [20KB, 100KB], then the target data within [20KB, 100KB] of the first target dirty data block can be directly obtained, and this target data within [20KB, 100KB] of the first target dirty data block can be used as the search result of the first read search command. For a first logical address range of [100KB, 150KB], the method for obtaining the search result of the first read search command based on the target data is similar to the method corresponding to the first logical address range of [20KB, 100KB], and is not limited here.
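Assembling the search result from cached target data plus data fetched elsewhere can be sketched as below. The function `assemble` and the callback `nand_read` are hypothetical, ranges are half-open `(start, end)` in KB, one byte stands in for 1KB, and the cached segments are assumed to be non-overlapping and already clipped to the request.

```python
def assemble(request, cached_segments, nand_read):
    """Build the read result from cache hits, filling gaps via nand_read(start, end)."""
    start, end = request
    result = bytearray()
    cursor = start
    for seg_start, seg_end, data in sorted(cached_segments):
        if cursor < seg_start:
            result += nand_read(cursor, seg_start)  # gap before this cache hit
        result += data
        cursor = seg_end
    if cursor < end:
        result += nand_read(cursor, end)  # tail not covered by the cache
    return bytes(result)


def fake_nand_read(start, end):
    """Stand-in for a NAND read; returns a marker byte per KB."""
    return b"N" * (end - start)
```

For a request of `(20, 100)` with one cached segment `(20, 50, ...)`, the result is the 30 cached bytes followed by 50 bytes read for `[50, 100)`. If a cached block covers the whole request, no NAND read is issued.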

[0079] In this embodiment, a read lookup command can be obtained. Based on a first preset address mapping rule, a first logical address range is mapped to obtain at least one target first logical address range. In at least one target dirty data block in the dirty data block linked list of the target hash table, at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range is determined. The first target dirty data block is either a first read dirty data block or a first write dirty data block. If the data transmission status of at least one first target dirty data block is "transmission complete," then the target data with overlapping logical address ranges in at least one first target dirty data block is obtained. The lookup result of the first read lookup command is obtained based on the target data. The read cache and write cache are merged into the same data structure, and this data structure can manage data in both the read cache and write cache simultaneously. Therefore, when performing a read lookup operation, it is not necessary to search separately in the data structures corresponding to the read cache and the write cache; only the merged data structure needs to be searched. Only one lookup operation is required, resulting in a shorter overall lookup path and higher search efficiency.

[0080] Based on the embodiment described in Figure 2, although using a unified data structure to manage data in both the read and write caches can shorten the lookup path and improve lookup efficiency to some extent during cache management, some data may still frequently fail to be found and/or users' higher lookup efficiency requirements may not be met, resulting in low data completeness, low accuracy, and/or low lookup efficiency. To address these issues, a detailed embodiment of this solution, which can be selectively implemented, is provided below. Please refer to Figure 3. Figure 3 is a flowchart illustrating another cache management method disclosed in an embodiment of this application. Another embodiment of the cache management method of this application includes:

[0081] 301. Obtain the read search command, wherein the read search command is used to search for and read data in the first logical address range.

[0082] Step 301 in this embodiment is similar to step 201 in the embodiment shown in Figure 2 and will not be described in detail here.

[0083] 302. If the first logical address range crosses a boundary at a multiple of the base region size, then the first logical address range is split at the multiples of the base region size to obtain multiple first logical address portion ranges, wherein each first logical address portion range is a target first logical address range, and/or, if the first logical address range lies within a range bounded by multiples of the base region size, then the first logical address range is taken as the target first logical address range.

[0084] After obtaining the read search command, if the first logical address range crosses a boundary at a multiple of the base region size, the first logical address range can be split at the multiples of the base region size to obtain multiple first logical address portion ranges, where each first logical address portion range is a target first logical address range, and/or, if the first logical address range lies within a range bounded by multiples of the base region size, then the first logical address range is taken as the target first logical address range.

[0085] Specifically, for example, if the base region size is 128KB and the first logical address range is [20KB, 100KB], then the first logical address range [20KB, 100KB] lies within a range bounded by multiples of the base region size (128KB), so [20KB, 100KB] can be used as the target first logical address range. If the first logical address range is [100KB, 150KB], then the first logical address range [100KB, 150KB] crosses a multiple of the base region size (128KB), so the first logical address range is split into two first logical address sub-ranges [100KB, 128KB] and [128KB, 150KB], where these two first logical address sub-ranges [100KB, 128KB] and [128KB, 150KB] are the target first logical address ranges.

[0086] It is worth mentioning that if the first logical address range spans a range that is a multiple of the base region size, the first logical address range is split into multiples of the base region size. The resulting first logical address ranges are all within the range of multiples of the base region size. Based on this, the first target dirty data block with overlapping addresses is then determined. This can reduce the situation where some data in the first logical address range that exceeds the alignment range of multiples of the base region size cannot be found, thereby improving the integrity and accuracy of the data read and search.
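The splitting rule can be sketched as follows, assuming half-open `(start, end)` ranges in KB and a base region size of 128KB as in the example. The function name `split_by_base_region` is hypothetical.

```python
def split_by_base_region(start, end, base=128):
    """Split [start, end) at every multiple of `base` that it crosses."""
    parts = []
    cursor = start
    while cursor < end:
        boundary = (cursor // base + 1) * base  # next multiple of the base size
        stop = min(end, boundary)
        parts.append((cursor, stop))
        cursor = stop
    return parts
```

A range such as `(20, 100)` comes back unchanged as a single target range, while `(100, 150)` is split into `(100, 128)` and `(128, 150)`. Every resulting range then falls inside one base region.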

[0087] 303. In at least one target dirty data block of the dirty data block linked list of the target hash table, determine at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range, wherein the first target dirty data block is either a first read dirty data block or a first write dirty data block.

[0088] If the first logical address range spans a range that is a multiple of the base region size, then the first logical address range is split into multiples of the base region size. After obtaining multiple first logical address partial ranges, at least one first target dirty data block can be determined in at least one target dirty data block in the dirty data block linked list of the target hash table, where the logical address range overlaps with at least one target first logical address range. The first target dirty data block is either the first read dirty data block or the first write dirty data block.

[0089] The method for determining at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range in at least one target dirty data block in the dirty data block linked list of the target hash table may be as follows: first, based on a hash algorithm, determine at least one hash value corresponding to at least one target first logical address range, wherein the at least one hash value is used to map at least one target bucket in the target hash table corresponding to at least one target first logical address range; then, in at least one target bucket corresponding to at least one target first logical address range, determine at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range, wherein the target hash table includes multiple hash buckets, and the hash buckets include a dirty data block linked list.

[0090] The method of first determining at least one hash value corresponding to at least one target first logical address range based on a hash algorithm, and then determining at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range in at least one target bucket corresponding to at least one target first logical address range, may include, but is not limited to, the following two methods:

[0091] (1) For each target first logical address range, divide the starting logical address of the target first logical address range by the base area size (e.g., 128KB) to obtain an integer value, and take the modulo of the integer value with the target hash table size to obtain the remainder. The remainder is the hash value, which is used to represent the bucket number to which the target first logical address range is mapped. The target hash table size is used to represent the number of buckets. The target first logical address range includes the starting logical address and the ending logical address.

[0092] Specifically, after mapping the first logical range to obtain the target first logical address range, a hash function (hash algorithm) can be designed as LMA / base area size (e.g., 128KB) % HASH_TABLE_SIZE, where HASH_TABLE_SIZE is the size of the hash table, LMA is the starting logical address of the target first logical address range, " / " represents division, and "%" represents modulo operation. This allows the HashList to be used to manage DirtyNodes, and ensures that the target first logical address range does not cross a base area size (e.g., 128KB) alignment range. This avoids situations where parts exceeding the base area size (e.g., 128KB) alignment range cannot be found, thereby improving the accuracy and completeness of the search.

[0093] (2) For each target first logical address range, the sum of the starting logical address and the ending logical address of the target first logical address range is taken modulo the target hash table size to obtain the remainder, where the remainder is the hash value, which is used to represent the bucket number to which the target first logical address range is mapped. The target first logical address range includes the starting logical address and the ending logical address.

[0094] Specifically, if HASH_TABLE_SIZE = 4, the starting logical address (LMA) of a target first logical address range is 200KB, and the ending logical address (EMA) is 204KB, then adding LMA and EMA gives 404KB. Taking 404 modulo 4 gives a remainder of 0, so the range is mapped to bucket number 0, which is the first hash bucket.
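For illustration only, the two hash methods described above can be written as the short functions below. The function names and the choice of KB as the address unit are assumptions of this sketch; HASH_TABLE_SIZE = 4 is taken from the worked example.

```python
BASE_AREA_KB = 128    # base area size from the text
HASH_TABLE_SIZE = 4   # number of buckets, as in the worked example


def bucket_by_region(lma_kb, base_kb=BASE_AREA_KB, table_size=HASH_TABLE_SIZE):
    """Method (1): (LMA / base area size) % HASH_TABLE_SIZE, integer division."""
    return (lma_kb // base_kb) % table_size


def bucket_by_sum(lma_kb, ema_kb, table_size=HASH_TABLE_SIZE):
    """Method (2): (LMA + EMA) % HASH_TABLE_SIZE."""
    return (lma_kb + ema_kb) % table_size


bucket_method_1 = bucket_by_region(100)     # 100 // 128 = 0, 0 % 4 = 0
bucket_method_2 = bucket_by_sum(200, 204)   # 404 % 4 = 0
```

With method (1), every starting address inside the same 128KB region maps to the same bucket, which is why the range is first split so that it does not cross a 128KB alignment boundary.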

[0095] It's worth noting that, for a given target first logical address range, a hash algorithm is first used to determine the hash value (bucket number). Then, within the bucket corresponding to this target first logical address range, the first target dirty data block that logically overlaps with this target first logical address range is searched. Compared to the existing method of traversing through a large number of dirty data blocks in a single dirty data block linked list, this method of first determining the bucket and then traversing the smaller number of dirty data blocks in the dirty data block linked list of that bucket (the hash table has multiple buckets, and each bucket has a dirty data block linked list) improves search efficiency.
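A minimal sketch of this two-stage search (bucket first, then the bucket's linked list) is given below. It is not the embodiment's implementation: the class name `DirtyBlock`, the use of Python lists in place of linked lists, the bucket count of 4, and the half-open ranges are all assumptions made for illustration.

```python
from dataclasses import dataclass

BASE_KB = 128      # base area size
NUM_BUCKETS = 4    # hypothetical hash table size


@dataclass
class DirtyBlock:
    start_kb: int
    end_kb: int     # exclusive end (assumption)
    is_read: bool   # True: read dirty data block, False: write dirty data block


def bucket_of(start_kb):
    return (start_kb // BASE_KB) % NUM_BUCKETS


def insert_block(table, block):
    table[bucket_of(block.start_kb)].append(block)


def find_overlaps(table, start_kb, end_kb):
    """Traverse only the bucket that the lookup range hashes to."""
    hits = []
    for block in table[bucket_of(start_kb)]:
        lo = max(block.start_kb, start_kb)
        hi = min(block.end_kb, end_kb)
        if lo < hi:  # non-empty overlap
            hits.append((block, (lo, hi)))
    return hits


table = [[] for _ in range(NUM_BUCKETS)]
insert_block(table, DirtyBlock(0, 50, is_read=False))
insert_block(table, DirtyBlock(60, 90, is_read=True))
insert_block(table, DirtyBlock(128, 160, is_read=False))

overlaps = [rng for _, rng in find_overlaps(table, 20, 100)]
```

Read and write dirty data blocks sit in the same bucket list here, so a single traversal finds hits of both kinds.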

[0096] 304. If the data transmission status of at least one first target dirty data block is "transmission completed", then acquire the target data with overlapping logical address ranges in at least one first target dirty data block.

[0097] 305. Obtain the search results of the first read search command based on the target data.

[0098] If the data transmission status of at least one first target dirty data block is "transmission complete", then acquire the target data with overlapping logical address ranges from at least one first target dirty data block.

[0099] Steps 304 to 305 in this embodiment are similar to steps 204 to 205 in the embodiment illustrated in Figure 2 and will not be described in detail here.

[0100] Understandably, during cache-managed read lookups, while methods such as splitting the first logical address range into multiples of the base region size, and / or using hash algorithms to determine the target bucket corresponding to the target first logical address range, and then identifying the first target dirty data block whose logical address range overlaps with the target first logical address range, can improve the integrity, accuracy, and / or efficiency of the searched data to some extent, read lookup scenarios may still encounter situations where the integrity and accuracy of the search results corresponding to the read lookup command are relatively low. To address these issues, two methods can be used, which are described below:

[0101] (i) By accurately controlling the release status of the first target dirty data block, the situation where the data of the first target dirty data block is released without returning search results is reduced, thereby improving the completeness and accuracy of obtaining the search results corresponding to the read search command.

[0102] One method for accurately controlling the release status of the first target dirty data block is to set the release status of at least one first target dirty data block to a non-released state before acquiring target data with overlapping logical address ranges in at least one first target dirty data block, and to update the release status of at least one first target dirty data block from a non-released state to a releaseable state after acquiring target data with overlapping logical address ranges in at least one first target dirty data block.

[0103] If a first target dirty data block is hit by multiple read lookup commands, in order to ensure that each read lookup command can return the corresponding complete lookup result, it is necessary to use cached reference count to accurately update the release status of the first target dirty data block.

[0104] Specifically, each first target dirty data block has a corresponding cache reference count. When a first target dirty data block is hit by a read search command, its release status is changed to non-release. More specifically, when a first target dirty data block is hit by M read search commands, its cache reference count is increased by M. When N read search commands return the search results, the cache reference count is decreased by N. This continues until the cache reference count of the first target dirty data block is reduced to 0, at which point the release status of the first target dirty data block is updated from non-release to releaseable.

[0105] For example, if a first target dirty data block is hit only by read search command A, its cache reference count is 1. Only when the search result of read search command A is returned will the release status of the first target dirty data block change from non-released to releasable, so that the data in the first target dirty data block can be released. As another example, if the first target dirty data block is hit by both read search command A and read search command B, its cache reference count is 2. When only the search result of read search command A has been returned, the cache reference count of the first target dirty data block is 1. Only when the search result of read search command B is also returned will the cache reference count of the first target dirty data block be 0, so that the data in the first target dirty data block can be released.
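The reference counting described above can be sketched as follows. The class and method names are hypothetical, and the sketch tracks read hits only (it does not model the flush count that a write dirty data block may also carry).

```python
class RefCountedBlock:
    """Hypothetical dirty data block that tracks outstanding read hits."""

    def __init__(self):
        self.ref_count = 0

    @property
    def releasable(self):
        # Releasable only when no read search command still needs the data.
        return self.ref_count == 0

    def hit(self, commands=1):
        """Called when `commands` read search commands hit this block."""
        self.ref_count += commands

    def result_returned(self, commands=1):
        """Called when `commands` read search commands have returned results."""
        if commands > self.ref_count:
            raise ValueError("more results returned than hits recorded")
        self.ref_count -= commands


block = RefCountedBlock()
block.hit(2)                         # hit by commands A and B
block.result_returned()              # A returns its result
still_pinned = not block.releasable  # B has not returned yet
block.result_returned()              # B returns its result
now_releasable = block.releasable
```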

[0106] (ii) Based on the overall hit rate of the read search command and the target data, obtain the search results of the first read search command to reduce the situation where only part of the hit data is returned without the user's knowledge, so as to improve the completeness and accuracy of obtaining the search results corresponding to the read search command.

[0107] The method for obtaining the search result of the first read search command based on the overall hit rate of the read search command and the target data can be as follows: if the overall hit rate of the read search command is a full hit, then the target data is used as the search result; and / or, if the overall hit rate of the read search command is a partial hit, then the data of the missing logical address range is obtained from the flash memory, and the data of the missing logical address range and the target data are used as the search result.
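One possible way to derive the overall hit result and the missing logical address ranges is sketched below. The function names, the half-open KB ranges, and the assumption that every hit range has already been clipped to the requested range are illustrative choices; the label "miss" for a request with no hit at all is added here and is not a term used in the embodiment.

```python
def missing_ranges(request, hits):
    """Return the parts of request = (start, end) not covered by any hit."""
    start, end = request
    gaps = []
    cursor = start
    for lo, hi in sorted(hits):
        if lo > cursor:
            gaps.append((cursor, lo))  # hole before this hit
        cursor = max(cursor, hi)
    if cursor < end:
        gaps.append((cursor, end))     # hole after the last hit
    return gaps


def overall_hit(request, hits):
    """Classify the lookup and list the ranges still to be read from flash."""
    gaps = missing_ranges(request, hits)
    if not gaps:
        return "full", gaps
    if not hits:
        return "miss", gaps
    return "partial", gaps


kind, from_flash = overall_hit((20, 100), [(60, 90), (20, 50)])
```

For a partial hit, the ranges in `from_flash` would be read from the flash memory and combined with the target data to form the search result.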

[0108] It is understood that the disk-based cache management in this application embodiment may include updating the data state of DirtyNodes and distinguishing between read and write caches, implementation methods for reusing, adding, modifying, or redefining fields, differences in the representation of read and write caches in DirtyNodes, and changes to cache reference counts, flush information fields, bitmaps, and other content. These details of cache management are described below:

[0109] (1) A Bitmap represents a collection of unit data, with each bit representing the unit data at the corresponding position. Write caches typically use Dirtybmp / Validbmp / Pendingbmp / Sentbmp / Donebmp. For ease of explanation, assume the maximum IO size supported by the disk is 128KB, and the maximum size of a DirtyNode is also 128KB. The Bitmap is generated based on the relative position of the DirtyNode's address range within 128KB; for example, Dirtybmp would be LEN << (SLBA % 32).

[0110] (2) The state of DirtyNode data can be updated within the disk by performing fast operations on the Bitmap (mainly binary AND-OR operations). To ensure that DirtyNode is compatible with the read cache, these fields can be reused, added to, modified, or redefined.

[0111] (3) First, a new flag field needs to be added to distinguish between read and write caches, because the update methods for read and write caches are different. The read cache can reuse fields such as logical address range, data address, data transfer status, and cache reference count. The read cache needs to record the logical address access for prefetching data so that the host read command can check whether it hits and the hit range. Only based on the hit range can the correct data address in the cache that needs to be returned to the host be correctly calculated.

[0112] (4) In the write cache, the data transfer status indicates whether the host is writing data to the cache. For example, status 1 indicates that the data is about to be written or is being written, and the data transfer is not yet complete. Status 0 indicates that the data writing is complete. In the read cache, this field is reused with a redefined meaning: the data transfer status indicates whether NAND data is being written to the cache. For example, status 1 indicates that the data is about to be prefetched or is being prefetched, and the data has not yet been read. Status 0 indicates that the data read from the NAND is complete.

[0113] (5) The read cache (data structure: dirty data block) can reuse the cache reference count field to represent hit information. In the write cache (data structure: dirty data block), this field represents the number of flushes required and the number of read hits. When a flush is required or a read hit occurs, the count is incremented by 1. After the flush is completed or the hit data transfer is finished, the count is decremented by 1. When the count reaches 0, the write cache can be released. The read cache only uses this field to represent hit information, i.e., the number of read hits. The count is incremented by the number of read hits to the cache and decremented after data transfer is completed. When the count reaches 0, the cache can be evicted according to a certain read cache strategy.

[0114] (6) The DirtyNode also has a flush information field. Since the read cache does not need to be flushed, this field is reserved for the read cache and is not used. The DirtyNode of the read cache does not generate a FlushEntry structure. Similarly, in the bitmap that manages the data state, Dirtybmp / Pendingbmp / Sentbmp / Donebmp are related to the write cache flushing. The read cache does not use them. Only Validbmp is used to represent the valid data in the read cache.

[0115] (7) Once the read cache is represented by DirtyNodes, its usage and management change. A write cache DirtyNode generally represents the cache of a single write command, while prefetching with the read cache usually caches the data of multiple read commands in advance. For example, if data covering a large logical address range needs to be read, the access must be split into multiple DirtyNodes. How the data is split depends on the speed of the host read commands. Each read cache DirtyNode is managed independently and is no longer managed as a large block cache.
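The field reuse described in items (1) to (7) can be pictured with the hedged sketch below. All names are hypothetical. The sketch assumes one bitmap bit per 4KB unit (32 bits per 128KB DirtyNode) and reads the expression LEN << (SLBA % 32) as a run of LEN set bits starting at bit SLBA % 32; both points are interpretations, not statements of the embodiment.

```python
from dataclasses import dataclass

BITS_PER_NODE = 32  # assumed: 128KB DirtyNode / 4KB per bitmap bit


def range_bitmap(slba_units, length_units):
    """Bitmap with `length_units` bits set from bit slba_units % 32 (assumed reading)."""
    offset = slba_units % BITS_PER_NODE
    if length_units < 1 or offset + length_units > BITS_PER_NODE:
        raise ValueError("range must fit inside one DirtyNode")
    return ((1 << length_units) - 1) << offset


@dataclass
class DirtyNode:
    is_read_cache: bool    # new flag field: read cache vs. write cache
    slba_units: int        # starting logical address, in 4KB units
    length_units: int
    transferring: int = 1  # 1 = data not ready, 0 = transfer complete
    ref_count: int = 0     # read hits (plus pending flushes for write cache)
    valid_bmp: int = 0     # the only bitmap a read cache node uses

    def mark_transfer_done(self):
        self.transferring = 0
        self.valid_bmp |= range_bitmap(self.slba_units, self.length_units)

    def covers(self, slba_units, length_units):
        """True if the requested units are all valid and ready (bitwise AND check)."""
        want = range_bitmap(slba_units, length_units)
        return self.transferring == 0 and (self.valid_bmp & want) == want


node = DirtyNode(is_read_cache=True, slba_units=5, length_units=3)
ready_before = node.covers(5, 3)  # prefetch still in flight
node.mark_transfer_done()
ready_after = node.covers(5, 3)
```

The flush-related bitmaps and the flush information field are omitted because, as stated above, the read cache does not use them.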

[0116] To facilitate understanding of this embodiment, a specific example is given below. Please refer to Figure 4, which is a flowchart illustrating a read lookup method for unified read / write cache management disclosed in an embodiment of this application. As can be understood from Figure 4, unified cache management for read lookups (a type of cache management) can be implemented, and the specific technical implementation may include the following steps:

[0117] 1. First, after receiving a read command (read search command), the disk obtains the read range information;

[0118] 2. Next, check if the read range spans the 128KB (base area size) alignment range. If so, split it into two lookup ranges (target first logical address range). If not, no splitting is required.

[0119] 3. Next, calculate the hash bucket for each search range, and traverse the buckets to find a hit in the read cache (first dirty read data block) and the write cache (first dirty write data block);

[0120] 4. Furthermore, if a DirtyNode is hit, its data transmission field (used to characterize the data transmission status) needs to be checked. If it is 1, the data is not ready (transmission is not complete) and the cache cannot be used; the search is then stopped until the data is ready. If it is 0 (transmission is complete), the cache is available; the hit address range and the cache address are then recorded for subsequent data transmission, and the cache reference count of the hit Node is incremented by 1 to ensure that the cache is not released.

[0121] 5. Furthermore, after all search ranges have been searched, the overall hit rate of the read command should be calculated. This result can be used to determine whether data can be returned directly after a hit.

[0122] 6. Finally, the read hit lookup ends.
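Steps 1 to 6 above can be tied together in the following sketch. It is an illustration under stated assumptions rather than the disclosed firmware: the names are hypothetical, ranges are half-open and in KB, the hash table has 4 buckets, hit ranges are assumed never to overlap one another, and undoing earlier reference count increments when an unready node is met is a choice made only for this sketch.

```python
from dataclasses import dataclass

BASE_KB = 128     # base area size (step 2)
NUM_BUCKETS = 4   # hypothetical hash table size


@dataclass
class DirtyNode:
    start_kb: int
    end_kb: int            # exclusive end (assumption)
    transferring: int = 0  # 1 = data not ready, 0 = transfer complete
    ref_count: int = 0


def split(start_kb, end_kb):
    """Step 2: cut the read range at every 128KB boundary."""
    pieces = []
    while start_kb < end_kb:
        stop = min((start_kb // BASE_KB + 1) * BASE_KB, end_kb)
        pieces.append((start_kb, stop))
        start_kb = stop
    return pieces


def read_lookup(table, start_kb, end_kb):
    hits = []
    for lo, hi in split(start_kb, end_kb):
        bucket = table[(lo // BASE_KB) % NUM_BUCKETS]  # step 3
        for node in bucket:
            a, b = max(lo, node.start_kb), min(hi, node.end_kb)
            if a >= b:
                continue                               # no overlap
            if node.transferring == 1:                 # step 4: not ready
                for pinned, _ in hits:
                    pinned.ref_count -= 1              # assumption: undo pins
                return "wait", []
            node.ref_count += 1                        # pin against release
            hits.append((node, (a, b)))
    # Step 5: overall hit, assuming hit ranges do not overlap each other.
    covered = sum(b - a for _, (a, b) in hits)
    if covered == end_kb - start_kb:
        return "full", hits
    return ("partial" if hits else "miss"), hits


table = [[] for _ in range(NUM_BUCKETS)]
first, second = DirtyNode(100, 128), DirtyNode(128, 150)
table[0].append(first)
table[1].append(second)
status, hits = read_lookup(table, 100, 150)
```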

[0123] In this embodiment, a read lookup command can be obtained. Based on a first preset address mapping rule, a first logical address range is mapped to obtain at least one target first logical address range. In at least one target dirty data block in the dirty data block linked list of the target hash table, at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range is determined. The first target dirty data block is either a first read dirty data block or a first write dirty data block. If the data transmission status of at least one first target dirty data block is "transmission complete," then the target data with overlapping logical address ranges in at least one first target dirty data block is obtained. The lookup result of the first read lookup command is obtained based on the target data. The read cache and write cache are merged into the same data structure, and this data structure can manage data in both the read cache and write cache simultaneously. Therefore, when performing a read lookup operation, it is not necessary to search separately in the data structures corresponding to the read cache and the write cache; only the merged data structure needs to be searched. Only one lookup operation is required, resulting in a shorter overall lookup path and higher search efficiency. Secondly, if the first logical address range spans a range that is a multiple of the base region size, the first logical address range is split into parts that are multiples of the base region size. The resulting split parts of the first logical address range are all within the range of multiples of the base region size. Based on this, the first target dirty data block with overlapping addresses is determined. This reduces the possibility of data exceeding the alignment range of multiples of the base region size being unsearchable, thus improving the integrity and accuracy of the read search. 
Next, for a given target first logical address range, a hash algorithm is first used to determine the hash value (bucket number). Then, within the bucket corresponding to this target first logical address range, the first target dirty data block with logical address overlap is searched. Compared to the existing method of traversing a large number of dirty data blocks in a single dirty data block linked list, this method of first determining the bucket and then traversing the smaller number of dirty data blocks in the bucket's dirty data block linked list (the hash table has multiple buckets, each bucket has a dirty data block linked list) improves search efficiency. Furthermore, by accurately controlling the release status of the first target dirty data block, the situation where the data of the first target dirty data block is released without returning search results is reduced, thereby improving the completeness and accuracy of obtaining the search results corresponding to the read search command. Finally, based on the overall hit rate of the read search command and the target data, the search results of the first read search command are obtained, thereby reducing the situation where only partial hit data is returned without the user's knowledge, and further improving the completeness and accuracy of obtaining the search results corresponding to the read search command.

[0124] Based on the embodiment described in Figure 3, unified read and write cache management can be achieved during cache lookup. However, in actual cache management, read and / or write cache insertion may also occur. Therefore, a method for unified read and / or write cache insertion is needed. A detailed embodiment of this solution, which can be selectively implemented, is provided below. Please refer to Figure 5, which is a flowchart illustrating another cache management method disclosed in an embodiment of this application. Another embodiment of the cache management method of this application includes:

[0125] 501. Obtain the target operation command, wherein the target operation command includes a read operation command and / or a write operation command, the read operation command is used to read data in the second logical address range, and the write operation command is used to write the updated data into the second logical address range.

[0126] In this embodiment, when performing cache management, a target operation command can be obtained, wherein the target operation command includes a read operation command and / or a write operation command. The read operation command is used to read data in the second logical address range, and the write operation command is used to write updated data into the second logical address range.

[0127] Specifically, for example, the second logical address range could be [20KB, 100KB] or [100KB, 150KB].

[0128] 502. Based on the second preset address mapping rule, perform address mapping on the second logical address range to obtain at least one target second logical address range after address mapping, so as to obtain at least one target operation dirty data block corresponding to the target operation command based on the at least one target second logical address range.

[0129] After obtaining the target operation command, the second logical address range can be mapped based on the second preset address mapping rule to obtain at least one target second logical address range after address mapping, so as to obtain at least one target operation dirty data block corresponding to the target operation command based on at least one target second logical address range.

[0130] Specifically, continuing with the example above, for a second logical address range of [20KB, 100KB], the second logical address range [20KB, 100KB] can be mapped based on the second preset address mapping rule to obtain a target second logical address range [20KB, 100KB] and a target operation dirty data block (corresponding to a logical address range of [20KB, 100KB]). For a second logical address range of [100KB, 150KB], the second logical address range [100KB, 150KB] can be mapped based on the second preset address mapping rule to obtain two target second logical address ranges, [100KB, 128KB] and [128KB, 150KB], and two target operation dirty data blocks (corresponding to logical address ranges of [100KB, 128KB] and [128KB, 150KB], respectively).

[0131] 503. In at least one initial dirty data block of the dirty data block linked list of the initial hash table, determine at least one second initial dirty data block whose logical address range overlaps with at least one target second logical address range, wherein the second initial dirty data block is either a second initial read dirty data block or a second initial write dirty data block.

[0132] Based on the second preset address mapping rule, the second logical address range is address mapped to obtain at least one target second logical address range after address mapping. After obtaining at least one target operation dirty data block corresponding to the target operation command based on at least one target second logical address range, at least one second initial dirty data block can be determined in at least one initial dirty data block of the dirty data block linked list of the initial hash table, where the logical address range overlaps with at least one target second logical address range. The second initial dirty data block is either the second initial read dirty data block or the second initial write dirty data block.

[0133] Specifically, for example, given a second logical address range of [20KB, 100KB], after obtaining a target second logical address range of [20KB, 100KB] through address mapping, it can be determined that there is a second initial dirty data block (which could be a second initial read dirty data block or a second initial write dirty data block, with a logical address range of [0KB, 50KB]) overlapping with this target second logical address range of [20KB, 100KB], with an overlap range of [20KB, 50KB]. For a second logical address range of [100KB, 150KB], after obtaining two target second logical address ranges of [100KB, 128KB] and [128KB, 150KB] through address mapping, the method for determining at least one second initial dirty data block whose logical address range overlaps with at least one target second logical address range is similar to the above and is not limited here.

[0134] The method for determining, in at least one initial dirty data block of the dirty data block linked list of the initial hash table, at least one second initial dirty data block whose logical address range overlaps with at least one target second logical address range is similar to step 203 of the embodiment shown in Figure 2, whose logic can be referred to; it will not be repeated here.

[0135] 504. Based on a preset invalidation rule, invalidate the data in the conflict area between at least one second initial dirty data block and at least one target operation dirty data block to obtain at least one second initial dirty data block and at least one target operation dirty data block after invalidation.

[0136] After determining at least one second initial dirty data block in the dirty data block linked list of the initial hash table where the logical address range overlaps with at least one target second logical address range, the data in the conflict area between at least one second initial dirty data block and at least one target operation dirty data block can be invalidated based on a preset invalidation rule, resulting in at least one second initial dirty data block and at least one target operation dirty data block after invalidation.

[0137] It is important to understand that the conflict region is the area where the logical addresses of the second initial dirty data block and the target operation dirty data block are the same.

[0138] It is worth mentioning that the preset invalidation rule is used to invalidate the data in the conflict area between the second initial dirty data block and the target operation dirty data block. Specifically, the invalidation method means that only the data of one of the two (the second initial dirty data block and the target operation dirty data block) is invalidated to ensure that there is only one data at each logical address, thereby ensuring data consistency.
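The embodiment does not spell out the preset invalidation rule at this point, so the sketch below uses one hypothetical rule purely for illustration: when a read (prefetch) block is inserted over an existing write block, the conflict area is invalidated in the incoming read block so that the written data survives; in every other case the conflict area is invalidated in the existing block. Valid data is tracked per KB in a set only for simplicity, and all names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    start_kb: int
    end_kb: int     # exclusive end (assumption)
    is_read: bool
    valid: set = field(default_factory=set)

    def __post_init__(self):
        if not self.valid:
            # Initially every KB in the block's range holds valid data.
            self.valid = set(range(self.start_kb, self.end_kb))


def invalidate_conflict(existing, incoming):
    """Invalidate the conflict area in exactly one of the two blocks."""
    conflict = existing.valid & incoming.valid
    if incoming.is_read and not existing.is_read:
        incoming.valid -= conflict  # hypothetical rule: keep written data
    else:
        existing.valid -= conflict  # hypothetical rule: keep the newer block
    return conflict


old_read = Block(0, 50, is_read=True)
new_write = Block(20, 100, is_read=False)
conflict = invalidate_conflict(old_read, new_write)
```

Whatever rule is chosen, the property the text relies on is that after invalidation no logical address is valid in both blocks.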

[0139] 505. Insert at least one dirty data block of the target operation after invalidation into the dirty data block linked list of the initial hash table, so as to obtain the dirty data block linked list of the target hash table based on at least one second initial dirty data block after invalidation and at least one dirty data block of the target operation.

[0140] After obtaining at least one second initial dirty data block and at least one target operation dirty data block after invalidation, the at least one target operation dirty data block after invalidation can be inserted into the dirty data block linked list of the initial hash table to obtain the dirty data block linked list of the target hash table based on the at least one second initial dirty data block and at least one target operation dirty data block.

[0141] It is worth mentioning that data consistency can be guaranteed only on the basis of the at least one second initial dirty data block and the at least one target operation dirty data block obtained after the data in the conflict area between them has been invalidated.

[0142] It is also worth mentioning that actual read and write scenarios may involve complex situations. For example, there may be combined read and write scenarios involving inserting into the read cache, inserting into the write cache, and read lookup. Or, there may be a large number of times each of these operations is performed. By using a preset invalidation rule to invalidate only the data in one of the two dirty data blocks—the second initial dirty data block and the target operation dirty data block—the uniqueness and accuracy (the latest data) of the data in the conflict area can be guaranteed. Therefore, data consistency can be ensured.

[0143] It is also worth mentioning that, regarding the insertion methods for write cache and read cache, both write DirtyNode and read DirtyNode can be inserted into HashList, and read commands can search in HashList to resolve conflicts between read and write caches.

[0144] 506. Obtain the read search command, wherein the read search command is used to search for and read data in the first logical address range.

[0145] 507. Based on the first preset address mapping rule, perform address mapping on the first logical address range to obtain at least one target first logical address range after address mapping.

[0146] 508. In at least one target dirty data block of the dirty data block linked list of the target hash table, determine at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range, wherein the first target dirty data block is either a first read dirty data block or a first write dirty data block.

[0147] 509. If the data transmission status of at least one first target dirty data block is "transmission completed", then acquire the target data with overlapping logical address ranges in at least one first target dirty data block.

[0148] 510. Obtain the search results of the first read search command based on the target data.

[0149] Steps 506 to 510 in this embodiment are similar to steps 301 to 305 in the embodiment illustrated in Figure 3 and will not be described in detail here.

[0150] To facilitate understanding of the read / write cache insertion method for unified read / write cache management in this embodiment, the differences between this method and existing read / write cache insertion methods are described below:

[0151] I. Existing read / write cache insertion methods:

[0152] (1) Existing write cache insertion method:

[0153] Write cache and read cache are managed separately through the SSD firmware's cache management module. Please refer to Figure 6, which is a schematic diagram of an existing SSD write cache management framework. As can be seen from Figure 6, the WriteCacheManager is primarily responsible for the allocation, insertion, lookup, flushing, and release of write cache. Internally, data is organized in the form of a linked list of DirtyNodes. Each DirtyNode records information about a write operation, including the logical media address range, data address, and flushing position. Since the write command size does not exceed the MDTS (Maximum Data Transfer Size), the length of data recorded by a DirtyNode does not exceed the MDTS. Before a write command begins transmitting data, the WriteCacheManager allocates cache space for the command. After transmission is complete, the DirtyNode is inserted into the linked list. Each DirtyNode generates a corresponding FlushEntry and adds it to the flushing queue. When the cached data reaches a certain threshold, the WriteCacheManager selects a certain number of FlushEntries from the flushing queue and flushes them to the NAND flash. After the data flushing is complete, the DirtyNode and its corresponding cache space can be released.

[0154] (2) Existing read cache insertion method:

[0155] Read caching is used in scenarios where the host issues specific I / O models, such as sequential reads or hotspot reads. Once the disk identifies such a model, it prefetches data from the NAND flash into the cache. Based on the principle of spatial locality of data access, this increases the probability of the host hitting the cache, thus accelerating read commands. Figure 7 is a schematic diagram of an existing SSD read cache management framework. As can be seen from Figure 7, the ReadCacheManager is responsible for the allocation, lookup, and release of the read cache. Internally, the cache is typically organized using a circular queue of CacheBlocks. A CacheBlock is generally a large block of physically contiguous memory used to prefetch data for multiple read I / O operations. When the firmware detects a need for read caching, the ReadCacheManager allocates a CacheBlock from the queue and then issues a read command to the NAND flash memory. Once the data has been prefetched from the NAND flash memory, the CacheBlock is marked as readable. The CacheBlock records information such as the logical media address of the read cache, data length, data address, prefetch status, and hit count. The ReadCacheManager can evict and release a CacheBlock once it confirms that the CacheBlock is readable and that no read command is hitting it.

[0156] To ensure data consistency, the firmware caching system must adhere to the following principles: the write cache must keep the data up-to-date, and only one valid copy of the same LBA data exists in the cache (write cache or read cache). Frequent read and write cache lookups and updates occur during read and write command processing. For example, when a write command inserts data into the write cache, it first searches the DirtyNode linked list within the write cache. If a conflict is found, the write cache for the corresponding LBA range is updated to keep it up-to-date. Then, it searches the read cache to check for conflicts in the CacheBlock circular queue. If a conflict is found, the corresponding CacheBlock is evicted. Similarly, when processing a read command, it first searches the write cache. If the read range is hit, the data is returned directly from the write cache. If the write cache misses, it searches the read cache. If the data is hit, it is returned from the read cache. If neither is hit, the data is not in the cache and is finally read from the NAND flash. When initiating prefetching to insert data into the read cache, it first searches for conflicts within the write cache. If a conflict exists, prefetching is abandoned, and the write cache is evicted. If no conflict exists, the search continues in the read cache, and the old read cache is evicted. As can be seen from the above process, when processing read and write commands and updating read and write caches, the corresponding processing can only be carried out after the cache is searched.
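The two-stage read path described above can be sketched as follows. This is only an illustration of the separate-lookup flow, not the firmware's actual code: the names `legacy_read_lookup` and `overlaps`, the tuple layout, and the use of half-open ranges in KB are all assumptions, and a "hit" is simplified to any overlap.

```python
from collections import deque

def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open ranges [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end

def legacy_read_lookup(write_list, read_queue, start, end):
    """Search the write cache first, then the read cache, then fall back to NAND.

    Returns (source, number_of_structures_searched).
    """
    searched = 1
    for node_start, node_end in write_list:       # DirtyNode linked list
        if overlaps(node_start, node_end, start, end):
            return "write_cache", searched
    searched += 1
    for blk_start, blk_end in read_queue:         # CacheBlock circular queue
        if overlaps(blk_start, blk_end, start, end):
            return "read_cache", searched
    return "nand", searched

# Hypothetical cache contents (ranges in KB).
write_list = [(0, 16), (64, 96)]
read_queue = deque([(128, 256)])

print(legacy_read_lookup(write_list, read_queue, 70, 80))
print(legacy_read_lookup(write_list, read_queue, 130, 140))
print(legacy_read_lookup(write_list, read_queue, 300, 310))
```

Any request that misses the write cache pays for a second traversal of a different structure, which is the cost the unified design aims to remove.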

[0157] From (1) and (2), it can be seen that, in the existing read and write cache insertion methods, write cache management uses the DirtyNode as the management unit and the DirtyNode linked list as the management data structure, while read cache management uses the CacheBlock as the management unit and the CacheBlock queue as the management data structure. Since the management granularity and management structure of the two caches differ, the caches cannot be managed uniformly, and without unified management, unified lookup is impossible.

[0158] It is worth noting that, to ensure data consistency, the read and write caches must be checked for conflicts and updated before insertion. Since old cached data may exist in both the read cache and the write cache, current methods must search (traverse) both caches whenever new cached data is inserted, resulting in long search paths. Specifically, to ensure data consistency, existing methods must evict both the old write cache and the old read cache when a new write cache is inserted, and must check for an existing write cache and evict the old read cache before inserting a new read cache. Traditional separate management of read and write caches therefore requires traversing the write cache and the read cache separately, which makes maintaining data consistency inefficient. As a result, while disk data consistency is ensured, the long search path leads to high command processing latency. Furthermore, separate read / write cache management increases software management complexity and maintenance costs, and the low cache search efficiency is detrimental to improving system performance.

[0159] II. The read / write cache insertion method for unified read / write cache management in this embodiment:

[0160] The core idea of the cache management design in this embodiment is to manage both read and write caches using a unified data structure called DirtyNode. DirtyNodes are then organized using a hash linked list structure. When inserting data into the read or write cache, the DirtyNode is first hashed according to certain rules to determine the hash bucket, and then the insertion operation is performed on the linked list within that hash bucket. This way, both read and write caches can reside on this linked list. Before insertion, collisions only need to be checked and resolved among the DirtyNodes on that linked list to ensure cache data consistency. When a read or write command searches the cache, the same rules are applied: the command range is first hashed to determine the linked list to search, and then a search is performed on that linked list. This simultaneously completes the search for both read and write caches.
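A minimal sketch of such a hash linked list is given below. The hash rule (`bucket_of`), the bucket count, the field names, and the use of half-open ranges in KB are assumptions made for illustration; the text only states that DirtyNodes are hashed "according to certain rules". Collision resolution is omitted here, so the sample nodes are chosen not to overlap one another.

```python
from dataclasses import dataclass

BASE_REGION = 128  # KB; assumed base region size, matching the 128KB example in the text
NUM_BUCKETS = 8    # hypothetical bucket count

@dataclass
class DirtyNode:
    start: int      # start logical address in KB (inclusive)
    end: int        # end logical address in KB (exclusive)
    is_write: bool  # True for a write cache node, False for a read cache node

def bucket_of(start):
    """Hypothetical hash rule: one 128KB region per bucket, modulo the bucket count."""
    return (start // BASE_REGION) % NUM_BUCKETS

class HashList:
    def __init__(self):
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def insert(self, node):
        # The caller guarantees the node does not cross a 128KB boundary.
        self.buckets[bucket_of(node.start)].append(node)

    def lookup(self, start, end):
        """One traversal returns overlapping read nodes and write nodes alike."""
        chain = self.buckets[bucket_of(start)]
        return [n for n in chain if n.start < end and start < n.end]

table = HashList()
table.insert(DirtyNode(20, 60, is_write=True))     # write cache node
table.insert(DirtyNode(64, 128, is_write=False))   # read cache node, same bucket
table.insert(DirtyNode(128, 256, is_write=False))  # read cache node, next bucket

hits = table.lookup(50, 70)
print([(n.start, n.end, n.is_write) for n in hits])
```

The lookup for [50KB, 70KB) finds the write node and the read node in the same pass over one bucket, rather than in two passes over two structures.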

[0161] It is worth mentioning that this embodiment proposes a unified read / write cache management method, which can improve data consistency efficiency. It aims to achieve unified read / write cache lookup while ensuring disk data consistency, resulting in shorter lookup paths, simplified read / write cache management complexity, reduced disk processing latency, and improved disk performance. Therefore, this embodiment can efficiently look up and update the cache, thereby improving disk system performance. Secondly, this embodiment designs a unified read / write cache management method for read / write cache lookup processing. While ensuring data consistency, it uniformly searches for read / write cache conflicts, improving cache lookup and update efficiency, thus improving disk read / write performance. Simultaneously, it represents and manages the read / write cache using the same data structure, achieving unified cache management, reducing cache management complexity, and facilitating maintenance by developers. Finally, unified read / write cache management enables simultaneous read / write cache lookups, improving cache lookup and update efficiency.

[0162] In this embodiment, a read lookup command can be obtained. Based on a first preset address mapping rule, a first logical address range is mapped to obtain at least one target first logical address range. In at least one target dirty data block in the dirty data block linked list of the target hash table, at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range is determined. The first target dirty data block is either a first read dirty data block or a first write dirty data block. If the data transmission status of at least one first target dirty data block is "transmission complete," then the target data with overlapping logical address ranges in at least one first target dirty data block is obtained. The lookup result of the first read lookup command is obtained based on the target data. The read cache and write cache are merged into the same data structure, and this data structure can manage data in both the read cache and write cache simultaneously. Therefore, when performing a read lookup operation, it is not necessary to search separately in the data structures corresponding to the read cache and the write cache; only the merged data structure needs to be searched. Only one lookup operation is required, resulting in a shorter overall lookup path and higher search efficiency. Secondly, if the first logical address range spans a range that is a multiple of the base region size, the overlapping first target dirty data block can be split and then determined. This reduces the possibility of data exceeding the alignment range of the base region size being unsearchable, thus improving the integrity and accuracy of read lookups. Next, by first determining the bucket and then traversing the bucket's dirty data block list (the hash table has multiple buckets, each with a dirty data block list), search efficiency is improved. 
Furthermore, by accurately controlling the release status of the first target dirty data block, the situation where the first target dirty data block is released without returning a search result is reduced, thus improving the integrity and accuracy of obtaining the search results corresponding to the read lookup command. In addition, based on the overall hit rate of the read lookup command and the target data, the search results of the first read lookup command are obtained, reducing the possibility that only part of the hit data is returned without the user's knowledge, further improving the integrity and accuracy of obtaining the search results corresponding to the read lookup command. Finally, by invalidating only the data in one of the two dirty data blocks (the second initial dirty data block and the target operation dirty data block), it is ensured that there is only one copy of the data for each logical address, thereby guaranteeing data consistency.

[0163] Based on the embodiment described in Figure 5, during cache management, read and / or write cache insertion can be performed for unified management of read and write caches. The address mapping method and the specific invalidation handling method differ depending on whether a read cache or a write cache is inserted. In order to further achieve the effect of data unification, it is necessary to design corresponding address mapping methods and specific invalidation handling methods for read cache insertion and write cache insertion.

[0164] In this embodiment, the address mapping method and the specific invalidation handling method for read cache insertion and write cache insertion are different, and they are described below:

[0165] 1. When inserting into the read cache, the address mapping method involves base region size multiple alignment processing (if the second logical address range is within the range of base region size multiples), and / or base region size multiple splitting and base region size multiple alignment processing (if the second logical address range crosses the range of base region size multiples). The invalidation method is to mark the data in the conflict region of the second initial read dirty data block as invalid (if the hit is the second initial read dirty data block), and to mark the data in the conflict region of the target read operation dirty data block as invalid (if the hit is the second initial write dirty data block).

[0166] In this embodiment of the application, when inserting into the read cache, the insertion can be performed using the corresponding address mapping method and invalidation method. Please refer to Figure 8, which is a flowchart illustrating another cache management method disclosed in an embodiment of this application. Another embodiment of the cache management method of this application includes:

[0167] 801. Obtain a read operation command, wherein the read operation command is used to read data in the second logical address range.

[0168] In this embodiment, when performing cache management, a read operation command can be obtained, wherein the read operation command is used to read data in the second logical address range.

[0169] Specifically, for example, the second logical address range could be [20KB, 100KB] or [100KB, 150KB].

[0170] 802. If the second logical address range is within a multiple of the base region size, then the second logical address range is aligned to the multiple of the base region size; and / or if the second logical address range spans a multiple of the base region size, then the second logical address range is split and aligned to the multiple of the base region size to obtain at least one target second logical address range.

[0171] After receiving the read operation command, if the second logical address range is within a multiple of the base region size, the second logical address range can be aligned to the base region size multiple. And / or if the second logical address range spans a multiple of the base region size, the second logical address range can be split and aligned to the base region size multiple to obtain at least one target second logical address range.

[0172] Wherein, if the second logical address range is within a multiple of the base region size, then the second logical address range is aligned to the base region size multiple; and / or if the second logical address range spans a multiple of the base region size, then the second logical address range is split and aligned to the base region size multiple. A method to obtain at least one target second logical address range could be as follows: if the second logical address range is within a multiple of the base region size, then the starting second logical address is aligned downwards to the base region size multiple, and the ending second logical address is aligned upwards to the base region size multiple, resulting in an aligned second logical address range. Wherein, the aligned second logical address range is the target second logical address range; and / or, if the second logical address range spans a range that is a multiple of the base region size, then the second logical address range is split into multiple second logical address sub-ranges by the base region size, the starting second logical addresses of the multiple second logical address sub-ranges are aligned downwards by the base region size multiple, and the ending second logical addresses of the multiple second logical address sub-ranges are aligned upwards by the base region size multiple, to obtain multiple aligned second logical address sub-ranges, wherein the multiple aligned second logical address sub-ranges are multiple target second logical address ranges.

[0173] Specifically, for example, if the base region size is 128KB and the read cache needs to prefetch 128KB of data but the range spans a 128KB alignment boundary, the range can first be split. For example, the range [SLMA, ELMA] can be split into two parts: [SLMA, 128KB*N] and [128KB*N, ELMA]. Then, the start of each part is aligned downwards and the end of each part is aligned upwards to a multiple of 128KB, so the ranges become [128KB*(N-1), 128KB*N] and [128KB*N, 128KB*(N+1)]. This prefetch uses two read cache DirtyNodes and sends two 128KB read commands to the NAND to obtain the data.
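The split-then-align mapping for read cache insertion can be sketched as below. The function name `map_read_range` is hypothetical, addresses are in KB, and ranges are treated as half-open [start, end), which is an assumption; the text writes them as [SLMA, ELMA].

```python
BASE_REGION = 128  # KB, base region size used in the example above

def map_read_range(slma, elma, base=BASE_REGION):
    """Split [slma, elma) at every multiple of `base`, then align each piece outward.

    Each resulting target range is exactly one base region: its start is aligned
    downwards and its end is aligned upwards to a multiple of `base`.
    """
    if slma >= elma:
        raise ValueError("empty range")
    targets = []
    region_start = (slma // base) * base      # align the start downwards
    while region_start < elma:
        targets.append((region_start, region_start + base))
        region_start += base
    return targets

print(map_read_range(20, 100))    # within one region  -> [(0, 128)]
print(map_read_range(100, 150))   # spans 128KB        -> [(0, 128), (128, 256)]
```

With N = 1, the second call reproduces the example in the text: [SLMA, 128KB*N] and [128KB*N, ELMA] become [0, 128KB] and [128KB, 256KB], so two read cache DirtyNodes are needed.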

[0174] 803. Obtain at least one target second logical address range of data from the flash memory, and obtain at least one target read operation dirty data block based on the data of at least one target second logical address range.

[0175] After obtaining at least one target second logical address range, at least one target second logical address range of data can be obtained from the flash memory, and at least one target read operation dirty data block can be obtained based on the data of at least one target second logical address range.

[0176] Specifically, at least one target second logical address range of data can be obtained from the flash NAND, and at least one target read operation dirty data block can be created based on the data of at least one target second logical address range.

[0177] 804. In at least one initial dirty data block of the dirty data block linked list of the initial hash table, determine at least one second initial dirty data block whose logical address range overlaps with at least one target second logical address range, wherein the second initial dirty data block is either a second initial read dirty data block or a second initial write dirty data block.

[0178] After obtaining data from at least one target second logical address range from the flash memory and obtaining at least one target read operation dirty data block based on the data from at least one target second logical address range, at least one second initial dirty data block whose logical address range overlaps with at least one target second logical address range can be determined from at least one initial dirty data block in the dirty data block linked list of the initial hash table, wherein the second initial dirty data block is either a second initial read dirty data block or a second initial write dirty data block.

[0179] Specifically, the method for determining, in at least one initial dirty data block of the dirty data block linked list of the initial hash table, at least one second initial dirty data block whose logical address range overlaps with at least one target second logical address range is similar to step 503 in Figure 5; that step can be referred to for the logic, which will not be repeated here.

[0180] 805. Mark the data in the conflict region corresponding to at least one target read operation dirty data block in at least one second initial read dirty data block as invalid, and / or mark the data in the conflict region corresponding to at least one second initial write dirty data block in at least one target read operation dirty data block as invalid.

[0181] After determining at least one second initial dirty data block in the dirty data block linked list of the initial hash table where the logical address range overlaps with at least one target second logical address range, the data in the conflict region corresponding to at least one target read operation dirty data block in the at least one second initial read dirty data block can be marked as invalid, and / or the data in the conflict region corresponding to at least one second initial write dirty data block in the at least one target read operation dirty block can be marked as invalid.

[0182] It should be understood that if the hit is a second initial read dirty data block, marking the data in the conflict region of the second initial read dirty data block (where the data is old) as invalid ensures that only one copy of the cache exists for a logical address and that the data retained for that conflict region is the latest. If the hit is a second initial write dirty data block, the data in its conflict region is the latest, while the data in the target read operation dirty data block (obtained from NAND) is not necessarily the latest; marking the data in the conflict region of the target read operation dirty data block as invalid therefore likewise ensures that only one copy of the cache exists for a logical address and that the retained data is the latest.

[0183] 806. Insert at least one dirty data block of the target operation after invalidation into the dirty data block linked list of the initial hash table, so as to obtain the dirty data block linked list of the target hash table based on at least one second initial dirty data block after invalidation and at least one dirty data block of the target operation.

[0184] After marking the data in the conflict region corresponding to at least one target read operation dirty data block in at least one second initial dirty data block as invalid, and / or marking the data in the conflict region corresponding to at least one second initial write dirty data block in at least one target read operation dirty data block as invalid, the at least one target operation dirty data block after invalidation can be inserted into the dirty data block linked list of the initial hash table to obtain the dirty data block linked list of the target hash table based on the at least one second initial dirty data block after invalidation and the at least one target operation dirty data block.

[0185] It is important to understand that inserting at least one dirty data block of the target operation after invalidation into the dirty data block linked list of the initial hash table, and obtaining the dirty data block linked list of the target hash table based on at least one second initial dirty data block and at least one dirty data block of the target operation after invalidation, can improve the feasibility of data unification.

[0186] To facilitate understanding of this embodiment, a specific example is given below. Please refer to Figure 9, which is a flowchart illustrating a unified read cache insertion method for cache management disclosed in an embodiment of this application. As can be seen from Figure 9, a unified read cache insertion method (a type of cache management) can be implemented, and its specific technical implementation may include the following steps:

[0187] 1. When the disk identifies a specific read model, it initiates a prefetch (e.g., reads data from the second logical address range via a read operation command) to prepare for insertion into the read cache;

[0188] 2. Determine the logical address range (second logical address range) of the read cache based on the current location and read speed of the host read command, assuming it is [SLMA, ELMA];

[0189] 3. The start logical address (SLMA) and end logical address (ELMA) of the read cache are aligned to 128KB. The aligned range is then split into 128KB (base area size), and multiple DirtyNodes are requested. Each DirtyNode is initialized and its data transfer status is set to 1.

[0190] 4. After finding the linked list in the bucket using the hash function, start traversing to find collisions;

[0191] 5. If a write DirtyNode conflict occurs, the conflicting data on the read Node (the target read operation dirty data block) is marked as invalid, because the data in the write cache is definitely the latest, while the data on the NAND may not be the latest. If the read Node has no valid data after conflict resolution, i.e., Validbmp is 0, then the caching of that Node is canceled.

[0192] 6. If a read DirtyNode conflict occurs, the conflicting data on the old read Node (second initial read dirty data block) is marked as invalid to ensure that only one copy of the cache exists for a logical address;

[0193] 7. After traversing the linked list and resolving the collisions, the allocated DirtyNode (the target dirty data block for the read operation) can be inserted into the HashList, and then the NAND data can be read.

[0194] 8. After the data is read, the data transmission status is cleared to 0, and the cache insertion is complete.
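Steps 3 to 8 above can be sketched as follows. This is an illustrative model under stated assumptions, not the patented implementation: the 4KB granularity of Validbmp, the dictionary keyed by region index standing in for the hash function, and the names `mask`, `insert_read_cache`, and `complete_transfer` are all hypothetical. Release of old nodes whose bitmap becomes empty is not modeled.

```python
from dataclasses import dataclass

BASE = 128                 # KB, base region size from the text's example
UNIT = 4                   # KB covered by one Validbmp bit (assumed granularity)
BITS = BASE // UNIT
FULL = (1 << BITS) - 1     # bitmap with every unit of a region valid

def mask(start, end, region):
    """Bitmap covering [start, end) KB inside the region that begins at `region`."""
    lo = (max(start, region) - region) // UNIT
    hi = (min(end, region + BASE) - region + UNIT - 1) // UNIT
    return ((1 << hi) - 1) & ~((1 << lo) - 1) if hi > lo else 0

@dataclass
class DirtyNode:
    region: int            # 128KB-aligned start address in KB
    validbmp: int          # one bit per unit that holds valid data
    is_write: bool
    transferring: int = 1  # 1 while data is in flight, 0 once the transfer completes

def insert_read_cache(hash_list, slma, elma):
    """Steps 3 to 7: align, split, resolve conflicts, insert. Returns inserted nodes."""
    inserted = []
    region = (slma // BASE) * BASE
    while region < elma:
        new = DirtyNode(region, FULL, is_write=False)
        chain = hash_list.setdefault(region // BASE, [])  # stand-in for the hash rule
        for old in chain:
            conflict = old.validbmp & new.validbmp
            if old.is_write:
                new.validbmp &= ~conflict   # write cache is newest: drop the read bits
            else:
                old.validbmp &= ~conflict   # old read data is stale: drop its bits
        if new.validbmp:                    # Validbmp == 0 means the node is canceled
            chain.append(new)
            inserted.append(new)
        region += BASE
    return inserted

def complete_transfer(nodes):
    """Step 8: the NAND read has finished, so clear the transfer status."""
    for node in nodes:
        node.transferring = 0

# A write node already caches [20KB, 100KB); prefetch [100KB, 150KB).
hash_list = {0: [DirtyNode(0, mask(20, 100, 0), is_write=True)]}
nodes = insert_read_cache(hash_list, 100, 150)
complete_transfer(nodes)

# A later prefetch of [128KB, 256KB) invalidates the older read node for that region.
more = insert_read_cache(hash_list, 128, 256)
```

After the first insertion, the read node for region 0 keeps only the units that the write node does not cover, so each logical address has at most one valid cached copy.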

[0195] 807. Obtain the read search command, wherein the read search command is used to search for and read data in the first logical address range.

[0196] 808. Based on the first preset address mapping rule, perform address mapping on the first logical address range to obtain at least one target first logical address range after address mapping.

[0197] 809. In at least one target dirty data block of the dirty data block linked list of the target hash table, determine at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range, wherein the first target dirty data block is either a first read dirty data block or a first write dirty data block.

[0198] 810. If the data transmission status of at least one first target dirty data block is "transmission completed", then acquire the target data with overlapping logical address ranges in at least one first target dirty data block.

[0199] 811. Obtain the search results of the first read search command based on the target data.

[0200] Steps 807 to 811 in this embodiment are similar to steps 506 to 510 in the embodiment shown in Figure 5 and will not be described in detail here.

[0201] In this embodiment, a read lookup command can be obtained. Based on a first preset address mapping rule, a first logical address range is mapped to obtain at least one target first logical address range. In at least one target dirty data block in the dirty data block linked list of the target hash table, at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range is determined. The first target dirty data block is either a first read dirty data block or a first write dirty data block. If the data transmission status of at least one first target dirty data block is "transmission complete," then the target data with overlapping logical address ranges in at least one first target dirty data block is obtained. The lookup result of the first read lookup command is obtained based on the target data. The read cache and write cache are merged into the same data structure, and this data structure can manage data in both the read cache and write cache simultaneously. Therefore, when performing a read lookup operation, it is not necessary to search separately in the data structures corresponding to the read cache and the write cache; only the merged data structure needs to be searched. Only one lookup operation is required, resulting in a shorter overall lookup path and higher search efficiency. Secondly, if the first logical address range spans a range that is a multiple of the base region size, the overlapping first target dirty data block can be split and then determined. This reduces the possibility of data exceeding the alignment range of the base region size being unsearchable, thus improving the integrity and accuracy of read lookups. Next, by first determining the bucket and then traversing the bucket's dirty data block list (the hash table has multiple buckets, each with a dirty data block list), search efficiency is improved. 
Furthermore, by accurately controlling the release status of the first target dirty data block, the situation where the first target dirty data block is released without returning a search result is reduced, thus improving the integrity and accuracy of obtaining the search results corresponding to the read lookup command. In addition, based on the overall hit rate of the read lookup command and the target data, the search results of the first read lookup command are obtained, reducing the possibility that only part of the hit data is returned without the user's knowledge, further improving the integrity and accuracy of obtaining the search results corresponding to the read lookup command. Moreover, by invalidating only the data in one of the two dirty data blocks (the second initial dirty data block and the target operation dirty data block), it is ensured that there is only one copy of the data at each logical address, thus guaranteeing data consistency. Finally, if the second initial read dirty data block is hit, the data in the conflict region of the second initial read dirty data block (where the data in the conflict region is old data) is marked as invalid. If the second initial write dirty data block is hit, the data in the conflict region of the target read operation dirty data block is marked as invalid. This ensures that there is only one copy of the cache for a logical address, and that the data in the conflict region is the latest data.

[0202] 2. When inserting into the write cache, the address mapping method involves non-split and non-aligned processing (if the second logical address range is within a multiple of the base region size), or split processing based on a multiple of the base region size (if the second logical address range spans a multiple of the base region size). The invalidation method is to mark the data in the conflict regions of the second initial write dirty data block and the second initial read dirty data block as invalid.

[0203] In this embodiment of the application, when inserting into the write cache, the insertion can be performed using the corresponding address mapping method and invalidation method. Please refer to Figure 10, which is a flowchart illustrating another cache management method disclosed in an embodiment of this application. Another embodiment of the cache management method of this application includes:

[0204] 1001. Obtain the write operation command, wherein the write operation command is used to write the updated data to the second logical address range.

[0205] In this embodiment, when performing cache management, a write operation command can be obtained, wherein the write operation command is used to write the updated data into the second logical address range.

[0206] Specifically, for example, the second logical address range could be [20KB, 100KB] or [100KB, 150KB].

[0207] 1002. If the second logical address range spans a range that is a multiple of the base region size, then the second logical address range is split into multiples of the base region size, and / or if the second logical address range is within a range that is a multiple of the base region size, then the second logical address range is not split and is not aligned, so as to obtain at least one target second logical address range.

[0208] After receiving the write operation command, if the second logical address range spans a range that is a multiple of the base region size, the second logical address range can be split into multiples of the base region size, and / or if the second logical address range is within a range that is a multiple of the base region size, the second logical address range can be left unsplit and unaligned to obtain at least one target second logical address range.

[0209] Specifically, for example, if the base region size is 128KB and the second logical address range is [20KB, 100KB], then the target second logical address range is [20KB, 100KB] because the second logical address range is within a multiple of the base region size; if the second logical address range is [100KB, 150KB], then the second logical address range can be split into two target second logical address ranges, namely [100KB, 128KB] and [128KB, 150KB], because the second logical address range spans a multiple of the base region size.

[0210] It is worth noting that splitting ensures that DirtyNodes whose logical address ranges span an alignment boundary of the base region size (e.g., 128KB) are correctly inserted into the corresponding hash buckets, thus guaranteeing the accuracy of DirtyNode management and lookup. Without splitting, such a DirtyNode would be inserted into the wrong hash bucket, leading to problems with lookup or conflict resolution. Therefore, splitting guarantees the correctness and validity of the HashList. Secondly, when a write cache entry needs to be inserted into the HashList but its logical address range spans a 128KB alignment boundary, only splitting is needed before insertion; alignment is not required, and data consistency is still achieved.
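The split-without-align mapping for write cache insertion can be sketched as follows. The names `map_write_range` and `bucket_of` are hypothetical, addresses are in KB, ranges are treated as half-open, and the one-region-per-bucket hash rule is an assumption.

```python
BASE_REGION = 128  # KB, base region size used in the example above

def map_write_range(start, end, base=BASE_REGION):
    """Split [start, end) at multiples of `base` without aligning the endpoints."""
    if start >= end:
        raise ValueError("empty range")
    targets = []
    cursor = start
    while cursor < end:
        boundary = (cursor // base + 1) * base   # next multiple of base above cursor
        piece_end = min(boundary, end)
        targets.append((cursor, piece_end))
        cursor = piece_end
    return targets

def bucket_of(start, base=BASE_REGION):
    """Hypothetical hash rule: each base region maps to its own bucket index."""
    return start // base

print(map_write_range(20, 100))    # -> [(20, 100)]
print(map_write_range(100, 150))   # -> [(100, 128), (128, 150)]
print([bucket_of(s) for s, _ in map_write_range(100, 150)])  # -> [0, 1]
```

Because every piece stays inside one base region, hashing a piece's start address always selects the bucket that a later lookup for any address in that piece would also select.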

[0211] 1003. Obtain updated data for at least one target second logical address range, and obtain at least one target write operation dirty data block based on the updated data for at least one target second logical address range.

[0212] After obtaining at least one target second logical address range, updated data of at least one target second logical address range can be obtained, and at least one target write operation dirty data block can be obtained based on the updated data of at least one target second logical address range.

[0213] Specifically, the updated data of the target second logical address range can refer to the data obtained after modifying the original data of the target second logical address range.

[0214] 1004. In at least one initial dirty data block of the dirty data block linked list of the initial hash table, determine at least one second initial dirty data block whose logical address range overlaps with at least one target second logical address range, wherein the second initial dirty data block is either a second initial read dirty data block or a second initial write dirty data block.

[0215] After obtaining updated data for at least one target second logical address range and obtaining at least one target write operation dirty data block based on the updated data for at least one target second logical address range, at least one second initial dirty data block can be determined from at least one initial dirty data block in the dirty data block linked list of the initial hash table, wherein the logical address range overlaps with at least one target second logical address range, and the second initial dirty data block is either a second initial read dirty data block or a second initial write dirty data block.

[0216] Specifically, the method for determining, in at least one initial dirty data block of the dirty data block linked list of the initial hash table, at least one second initial dirty data block whose logical address range overlaps with at least one target second logical address range is similar to step 503 in the embodiment shown in Figure 5 and can be understood by reference to it; it is not repeated here.

[0217] 1005. Mark the data in the conflict region corresponding to at least one target write operation dirty data block in at least one second initial write dirty data block as invalid, and / or mark the data in the conflict region corresponding to at least one target write operation dirty data block in at least one second initial read dirty data block as invalid.

[0218] After determining at least one second initial dirty data block in the dirty data block linked list of the initial hash table where the logical address range overlaps with at least one target second logical address range, the data in the conflict region corresponding to at least one target write operation dirty data block in the at least one second initial write dirty data block can be marked as invalid, and / or the data in the conflict region corresponding to at least one target write operation dirty data block in the at least one second initial read dirty data block can be marked as invalid.

[0219] It is important to understand that since the data in the conflict region of the target write dirty data block is the latest, regardless of whether the hit is the second initial write dirty data block or the second initial read dirty data block, the data in the conflict region of the second initial write dirty data block or the second initial read dirty data block can be directly marked as invalid.
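Marking a conflict region as invalid can be sketched with a per-node validity list. This is an assumption-laden illustration: the `DirtyNode` fields, the 4KB granularity `UNIT_KB`, and the function `invalidate_conflict` are hypothetical, and all ranges are assumed to be half-open and aligned to `UNIT_KB`.

```python
from dataclasses import dataclass, field

UNIT_KB = 4  # hypothetical amount of data tracked by one validity flag


@dataclass
class DirtyNode:
    start_kb: int
    end_kb: int
    is_write: bool
    valid: list = field(default_factory=list)  # one flag per UNIT_KB of data

    def __post_init__(self):
        if not self.valid:
            units = (self.end_kb - self.start_kb) // UNIT_KB
            self.valid = [True] * units


def invalidate_conflict(old, new):
    """Clear old's validity flags where it overlaps new; return flags cleared.

    The same routine is used whether `old` is a read node or a write node,
    because the incoming write always carries the latest data.
    """
    lo = max(old.start_kb, new.start_kb)
    hi = min(old.end_kb, new.end_kb)
    cleared = 0
    for kb in range(lo, hi, UNIT_KB):  # empty when the ranges do not overlap
        idx = (kb - old.start_kb) // UNIT_KB
        if old.valid[idx]:
            old.valid[idx] = False
            cleared += 1
    return cleared
```

For example, a cached read node covering [0KB, 128KB) that conflicts with a new write to [20KB, 100KB) loses validity only for the 80KB overlap; the rest of the node remains usable for later read lookups.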

[0220] 1006. Insert at least one dirty data block of the target operation after invalidation into the dirty data block linked list of the initial hash table, so as to obtain the dirty data block linked list of the target hash table based on at least one second initial dirty data block after invalidation and at least one dirty data block of the target operation.

[0221] After marking as invalid the data in the conflict region corresponding to at least one target write operation dirty data block in at least one second initial write dirty data block, and / or in at least one second initial read dirty data block, the at least one target operation dirty data block after invalidation can be inserted into the dirty data block linked list of the initial hash table, so as to obtain the dirty data block linked list of the target hash table based on the at least one second initial dirty data block after invalidation and the at least one target operation dirty data block.

[0222] Step 1006 in this embodiment is similar to step 606 in the embodiment shown in Figure 6 and will not be described in detail here.

[0223] To facilitate understanding of this embodiment, a specific example is given below. Please refer to Figure 11, which is a flowchart illustrating a unified write cache insertion method for cache management disclosed in an embodiment of this application. As shown in Figure 11, a unified write cache insertion process (a type of cache management) can be implemented. The specific technical implementation may include the following steps:

[0224] 1. After receiving a write command (write operation command), the disk begins to request DirtyNodes. If the write command range spans a 128KB (base area size) alignment range, it is split into two DirtyNodes (target second logical address range), and the fields on them are initialized, with the data transfer status set to 1.

[0225] 2. After finding the hash bucket using a hash function based on the logical address range on the DirtyNode, start traversing the linked list on the bucket to search for collisions;

[0226] 3. If a write DirtyNode (second initial dirty data block) conflict is found, the conflicting portion of the valid data on the write Node (second initial dirty data block) will be marked as invalid.

[0227] 4. If a read DirtyNode (second initial dirty data block) conflict is found, the conflicting part of the valid data on the read Node (second initial dirty data block) is marked as invalid. That is, the conflicting position on the Validbmp of the read DirtyNode (second initial dirty data block) is cleared to 0, so that the read command cannot hit the data on the read cache.

[0228] 5. After traversing the linked list and resolving conflicts, the allocated DirtyNode (the target dirty data block for write operations) can be inserted into the HashList, and then the data transmission of commands can begin;

[0229] 6. Data transmission complete, data transmission status cleared to 0, cache insertion complete.
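Steps 1 to 6 above can be sketched end to end as follows. This is a simplified illustration under stated assumptions, not the disclosed firmware: the class `UnifiedCache`, the field names, the 4KB bitmap granularity `UNIT_KB`, and the use of the base region index as the bucket key are all hypothetical, and a Python list stands in for each bucket's linked list.

```python
from collections import defaultdict
from dataclasses import dataclass

BASE_KB = 128  # base region size used in the text
UNIT_KB = 4    # hypothetical amount of data covered by one Validbmp bit


@dataclass
class DirtyNode:
    start_kb: int
    end_kb: int
    is_write: bool
    transferring: int = 1  # data transfer status: 1 = in progress, 0 = done
    validbmp: int = 0      # bit i set means unit i of the node is valid

    def __post_init__(self):
        units = (self.end_kb - self.start_kb) // UNIT_KB
        self.validbmp = (1 << units) - 1  # all units start out valid


class UnifiedCache:
    def __init__(self):
        # Bucket key -> list of DirtyNodes (stand-in for the linked list).
        self.buckets = defaultdict(list)

    @staticmethod
    def _split(start_kb, end_kb):
        pieces, cur = [], start_kb
        while cur < end_kb:
            nxt = min(end_kb, (cur // BASE_KB + 1) * BASE_KB)
            pieces.append((cur, nxt))
            cur = nxt
        return pieces

    def insert_write(self, start_kb, end_kb):
        nodes = []
        for s, e in self._split(start_kb, end_kb):
            node = DirtyNode(s, e, is_write=True)  # step 1: status set to 1
            chain = self.buckets[s // BASE_KB]     # step 2: find the bucket
            for old in chain:                      # steps 3 and 4: conflicts
                lo, hi = max(old.start_kb, s), min(old.end_kb, e)
                for kb in range(lo, hi, UNIT_KB):
                    bit = (kb - old.start_kb) // UNIT_KB
                    old.validbmp &= ~(1 << bit)    # clear conflicting bit
            chain.append(node)                     # step 5: insert the node
            nodes.append(node)
        return nodes

    @staticmethod
    def finish_transfer(nodes):                    # step 6: status cleared
        for node in nodes:
            node.transferring = 0
```

Read and write DirtyNodes are handled by the same conflict loop, which mirrors the point made in steps 3 and 4: the incoming write holds the latest data, so the overlapping bits of any older node are cleared regardless of its type.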

[0230] 1007. Obtain the read search command, which is used to search for and read data in the first logical address range.

[0231] 1008. Based on the first preset address mapping rule, perform address mapping on the first logical address range to obtain at least one target first logical address range after address mapping.

[0232] 1009. In at least one target dirty data block of the dirty data block linked list of the target hash table, determine at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range, wherein the first target dirty data block is either a first read dirty data block or a first write dirty data block.

[0233] 1010. If the data transmission status of at least one first target dirty data block is "transmission completed", then acquire the target data with overlapping logical address ranges in at least one first target dirty data block.

[0234] 1011. Obtain the search results of the first read search command based on the target data.

[0235] Steps 1007 to 1011 in this embodiment are similar to steps 506 to 510 in the embodiment shown in Figure 5 and will not be described in detail here.

[0236] In this embodiment, a read lookup command can be obtained. Based on a first preset address mapping rule, a first logical address range is mapped to obtain at least one target first logical address range. In at least one target dirty data block in the dirty data block linked list of the target hash table, at least one first target dirty data block whose logical address range overlaps with at least one target first logical address range is determined. The first target dirty data block is either a first read dirty data block or a first write dirty data block. If the data transmission status of at least one first target dirty data block is "transmission complete," then the target data with overlapping logical address ranges in at least one first target dirty data block is obtained. The lookup result of the first read lookup command is obtained based on the target data. The read cache and write cache are merged into the same data structure, and this data structure can manage data in both the read cache and write cache simultaneously. Therefore, when performing a read lookup operation, it is not necessary to search separately in the data structures corresponding to the read cache and the write cache; only the merged data structure needs to be searched. Only one lookup operation is required, resulting in a shorter overall lookup path and higher search efficiency. Secondly, if the first logical address range spans a range that is a multiple of the base region size, the overlapping first target dirty data block can be split and then determined. This reduces the possibility of data exceeding the alignment range of the base region size being unsearchable, thus improving the integrity and accuracy of read lookups. Next, by first determining the bucket and then traversing the bucket's dirty data block list (the hash table has multiple buckets, each with a dirty data block list), search efficiency is improved. 
Furthermore, by accurately controlling the release status of the first target dirty data block, the first target dirty data block is less likely to be released before the search result is returned, which improves the integrity and accuracy of the search results obtained for the read lookup command. In addition, the search results of the first read lookup command are obtained based on the overall hit rate of the read lookup command and the target data, which reduces the possibility that only part of the hit data is returned without the user's knowledge and further improves the integrity and accuracy of the search results. Moreover, by invalidating only the data in the conflict region between the second initial dirty data block and the target operation dirty data block, it is ensured that only one valid copy of the data exists for each logical address, thus guaranteeing data consistency. Finally, the splitting process allows DirtyNodes whose logical address ranges span the alignment range of the base region size (e.g., 128KB) to be correctly inserted into the corresponding hash buckets, ensuring the accuracy of DirtyNode management and retrieval. When the write cache needs to insert data into the HashList but the logical address range spans the alignment range of the base region size, only splitting is required before insertion; alignment is not necessary, and data consistency is still achieved.
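The single-lookup read path summarized above can be sketched as follows. This is a simplified illustration under several assumptions: the names `DirtyNode`, `split`, and `read_lookup` are hypothetical, the bucket key is assumed to be the base region index, one byte stands in for each KB of data, validity bitmaps are omitted, and a node whose transfer is still in progress is simply treated as a miss.

```python
from dataclasses import dataclass

BASE_KB = 128  # base region size used in the text


@dataclass
class DirtyNode:
    start_kb: int
    end_kb: int
    is_write: bool
    transferring: int  # 1 = transfer in progress, 0 = transfer complete
    data: bytes        # one byte per KB, purely for illustration


def split(start_kb, end_kb):
    """Cut [start_kb, end_kb) at every base region boundary it crosses."""
    pieces, cur = [], start_kb
    while cur < end_kb:
        nxt = min(end_kb, (cur // BASE_KB + 1) * BASE_KB)
        pieces.append((cur, nxt))
        cur = nxt
    return pieces


def read_lookup(buckets, start_kb, end_kb):
    """Return cached bytes for [start_kb, end_kb) on a full hit, else None.

    Read nodes and write nodes live in the same buckets, so one traversal
    per piece is enough.
    """
    found = {}
    for s, e in split(start_kb, end_kb):
        for node in buckets.get(s // BASE_KB, []):
            if node.transferring:
                continue  # data not yet complete; not usable in this sketch
            lo, hi = max(node.start_kb, s), min(node.end_kb, e)
            for kb in range(lo, hi):
                found[kb] = node.data[kb - node.start_kb]
    if len(found) != end_kb - start_kb:
        return None  # partial hit is reported as a miss, never silently
    return bytes(found[kb] for kb in range(start_kb, end_kb))
```

Returning `None` on a partial hit reflects the concern raised above about partially returned hit data; a real implementation could instead fetch the missing part from flash.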

[0237] The cache management method in the embodiments of this application has been described above. The cache management device in the embodiments of this application is described below. Referring to Figure 12, one embodiment of the cache management device in this application includes:

[0238] The obtaining unit 1201 is used to obtain a read search command, wherein the read search command is used to search for and read data in a first logical address range;

[0239] Address mapping unit 1202 is used to perform address mapping on the first logical address range based on a first preset address mapping rule to obtain at least one target first logical address range after address mapping.

[0240] The determining unit 1203 is used to determine, in at least one target dirty data block of the dirty data block linked list of the target hash table, at least one first target dirty data block whose logical address range overlaps with the at least one target first logical address range, wherein the first target dirty data block is either a first read dirty data block or a first write dirty data block.

[0241] The acquisition unit 1204 is used to acquire target data with overlapping logical address ranges in the at least one first target dirty data block if the data transmission status of the at least one first target dirty data block is transmission completed.

[0242] The obtaining unit 1201 is also used to obtain the search result of the first read search command based on the target data.

[0243] In this embodiment, a read lookup command can be obtained. Based on a first preset address mapping rule, the first logical address range is mapped to obtain at least one target first logical address range. In at least one target dirty data block in the dirty data block linked list of the target hash table, at least one first target dirty data block whose logical address range overlaps with the at least one target first logical address range is determined. The first target dirty data block is either a first read dirty data block or a first write dirty data block. If the data transmission status of the at least one first target dirty data block is "transmission complete," then the target data with overlapping logical address ranges in the at least one first target dirty data block is obtained. Based on the target data, the lookup result of the first read lookup command is obtained. The read cache and write cache are merged into the same data structure, and this data structure can manage data in both the read cache and write cache simultaneously. Therefore, when performing a read lookup operation, it is not necessary to search separately in the data structures corresponding to the read cache and the write cache; only the merged data structure needs to be searched. Only one lookup operation is required, resulting in a shorter overall lookup path and higher search efficiency.

[0244] The cache management device in the embodiments of this application is described in detail below. Referring to Figure 13, another embodiment of the cache management device in this application includes:

[0245] The obtaining unit 1301 is used to obtain a read search command, wherein the read search command is used to search for and read data in a first logical address range;

[0246] Address mapping unit 1302 is used to perform address mapping on the first logical address range based on a first preset address mapping rule to obtain at least one target first logical address range after address mapping.

[0247] The determining unit 1303 is used to determine, in at least one target dirty data block of the dirty data block linked list of the target hash table, at least one first target dirty data block whose logical address range overlaps with the at least one target first logical address range, wherein the first target dirty data block is either a first read dirty data block or a first write dirty data block.

[0248] The acquisition unit 1304 is used to acquire target data with overlapping logical address ranges in the at least one first target dirty data block if the data transmission status of the at least one first target dirty data block is transmission completed.

[0249] The obtaining unit 1301 is also used to obtain the search result of the first read search command based on the target data.

[0250] The address mapping unit 1302 is specifically used to split the first logical address range at multiples of the base region size if the first logical address range spans a range that is a multiple of the base region size, to obtain multiple first logical address portion ranges, wherein each first logical address portion range is a target first logical address range; and / or, if the first logical address range is within a multiple of the base region size, to use the first logical address range as the target first logical address range.

[0251] The determining unit 1303 is specifically used to determine at least one hash value corresponding to the at least one target first logical address range based on a hash algorithm, wherein the at least one hash value is used to map the at least one target bucket corresponding to the at least one target first logical address range in the target hash table, and in the at least one target bucket corresponding to the at least one target first logical address range, at least one first target dirty data block whose logical address range overlaps with the at least one target first logical address range is determined, wherein the target hash table includes multiple hash buckets, and the hash buckets include a dirty data block linked list.

[0252] The cache management device further includes: an invalidation unit 1305; and an insertion unit 1306.

[0253] The obtaining unit 1301 is further configured to obtain a target operation command, wherein the target operation command includes a read operation command and / or a write operation command, the read operation command is used to read data in the second logical address range, and the write operation command is used to write updated data into the second logical address range;

[0254] The address mapping unit 1302 is further configured to perform address mapping on the second logical address range based on the second preset address mapping rule to obtain at least one target second logical address range after address mapping, so as to obtain at least one target operation dirty data block corresponding to the target operation command based on the at least one target second logical address range;

[0255] The determining unit 1303 is further configured to determine, in at least one initial dirty data block of the dirty data block linked list of the initial hash table, at least one second initial dirty data block whose logical address range overlaps with the at least one target second logical address range, wherein the second initial dirty data block is either a second initial read dirty data block or a second initial write dirty data block.

[0256] The invalidation unit 1305 is used to invalidate the data in the conflict area between the at least one second initial dirty data block and the at least one target operation dirty data block based on a preset invalidation rule, so as to obtain at least one second initial dirty data block and at least one target operation dirty data block after invalidation.

[0257] The insertion unit 1306 is used to insert at least one target operation dirty data block after invalidation into the dirty data block linked list of the initial hash table, so as to obtain the dirty data block linked list of the target hash table based on at least one second initial dirty data block after invalidation and at least one target operation dirty data block.

[0258] The address mapping unit 1302 is specifically used to perform reference region size multiple alignment processing on the second logical address range if the second logical address range is within a multiple of the reference region size, and / or to perform reference region size multiple splitting and reference region size multiple alignment processing on the second logical address range if the second logical address range crosses the range of the reference region size multiple, so as to obtain the at least one target second logical address range, wherein the target operation command is a read operation command, the at least one target operation dirty data block is at least one target read operation dirty data block, obtain the data of the at least one target second logical address range from the flash memory, and obtain the at least one target read operation dirty data block based on the data of the at least one target second logical address range;

[0259] The invalidation unit 1305 is specifically used to mark the data in the conflict region corresponding to the at least one target read operation dirty data block in at least one second initial read dirty data block as invalid, and / or to mark the data in the conflict region corresponding to the at least one second initial write dirty data block in at least one target read operation dirty data block as invalid.

[0260] The address mapping unit 1302 is specifically configured to: if the second logical address range is within a multiple of the reference region size, then perform downward alignment of the starting second logical address by a multiple of the reference region size, and upward alignment of the ending second logical address by a multiple of the reference region size, to obtain an aligned second logical address range, wherein the aligned second logical address range is the target second logical address range; and / or, if the second logical address range spans a multiple of the reference region size, then perform splitting processing of the second logical address range by a multiple of the reference region size, to obtain multiple second logical address sub-ranges, perform downward alignment of the starting second logical addresses of the multiple second logical address sub-ranges by a multiple of the reference region size, and upward alignment of the ending second logical addresses of the multiple second logical address sub-ranges by a multiple of the reference region size, to obtain an aligned multiple second logical address sub-ranges, wherein the aligned multiple second logical address sub-ranges are multiple target second logical address ranges, and the second logical address range includes a starting second logical address and an ending second logical address.
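The split-and-align rule for read operations can be sketched as follows. This reflects one reading of the paragraph above, namely that aligning the start downward and the end upward to multiples of the region size expands each piece to the whole region that contains it; the function name `align_read_range` is hypothetical, addresses are in KB, ranges are assumed half-open, and the 128KB region size is taken from the earlier example.

```python
BASE_KB = 128  # region size assumed from the earlier example in the text


def align_read_range(start_kb, end_kb, base_kb=BASE_KB):
    """Split [start_kb, end_kb) per region, then align each piece outward.

    Aligning a piece's start down and its end up to multiples of base_kb
    yields the full region that contains the piece.
    """
    if not 0 <= start_kb < end_kb:
        raise ValueError("expected 0 <= start_kb < end_kb")
    first = start_kb // base_kb      # region holding the start address
    last = (end_kb - 1) // base_kb   # region holding the last addressed KB
    return [(r * base_kb, (r + 1) * base_kb) for r in range(first, last + 1)]


inside = align_read_range(20, 100)     # stays within one region
crossing = align_read_range(100, 150)  # crosses the 128KB boundary
```

Under this interpretation, a read that stays inside one region produces a single region-sized target range, while a read that crosses a boundary produces one region-sized target range per region touched. This contrasts with the write path, where pieces are split but not aligned.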

[0261] The address mapping unit 1302 is specifically used to perform a base region multiple splitting process on the second logical address range if the second logical address range crosses a range that is a multiple of the base region size, and / or to perform a non-split and non-aligned process on the second logical address range if the second logical address range is within a range that is a multiple of the base region size, so as to obtain the at least one target second logical address range, obtain the updated data of the at least one target second logical address range, and obtain the at least one target write operation dirty data block based on the updated data of the at least one target second logical address range, wherein the target operation command is a write operation command, and the at least one target operation dirty data block is at least one target write operation dirty data block;

[0262] The invalidation unit 1305 is specifically used to mark the data in the conflict region corresponding to the at least one target write operation dirty data block in the at least one second initial write dirty data block as invalid, and / or to mark the data in the conflict region corresponding to the at least one target write operation dirty data block in the at least one second initial read dirty data block as invalid.

[0263] In this embodiment, each unit in the cache management device performs the operations of the cache management device in the embodiment shown in Figure 2 above, which will not be described in detail here.

[0264] Referring to Figure 14, one embodiment of the computer device 1400 in this application includes:

[0265] Central processing unit 1401, memory 1405, input / output interface 1404, wired or wireless network interface 1403, and power supply 1402.

[0266] Memory 1405 is either a short-term storage memory or a persistent storage memory;

[0267] The central processing unit 1401 is configured to communicate with the memory 1405 and execute instructions stored in the memory 1405 to perform the method in the embodiments shown in Figures 2 to 5 and Figures 8 to 11 above.

[0268] This application also provides a computer-readable storage medium, which includes instructions that, when executed on a computer, cause the computer to perform the method in the embodiments shown in Figures 2 to 5 and Figures 8 to 11 above.

[0269] This application also provides a computer program product containing instructions, which, when run on a computer, causes the computer to perform the method in the embodiments shown in Figures 2 to 5 and Figures 8 to 11 above.

[0270] It should be understood that although the steps in the flowcharts of the embodiments described above are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the embodiments described above may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.

[0271] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the systems, devices, and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.

[0272] In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection between apparatuses or units through some interfaces, and may be electrical, mechanical, or other forms.

[0273] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0274] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0275] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

Claims

1. A cache management method, characterized in that it includes: obtaining a read search command, wherein the read search command is used to search for and read data within a first logical address range; performing address mapping on the first logical address range based on a first preset address mapping rule to obtain at least one target first logical address range after address mapping; determining, in at least one target dirty data block in the dirty data block linked list of the target hash table, at least one first target dirty data block whose logical address range overlaps with the at least one target first logical address range, wherein the first target dirty data block is either a first read dirty data block or a first write dirty data block; if the data transmission status of the at least one first target dirty data block is "transmission completed", obtaining the target data with overlapping logical address ranges in the at least one first target dirty data block; and obtaining the search result of the read search command based on the target data; wherein, before determining, in at least one target dirty data block in the dirty data block linked list of the target hash table, at least one first target dirty data block whose logical address range overlaps with the at least one target first logical address range, the method further includes: obtaining a target operation command, wherein the target operation command includes a read operation command and / or a write operation command, the read operation command is used to read data in a second logical address range, and the write operation command is used to write updated data into the second logical address range; performing address mapping on the second logical address range based on a second preset address mapping rule to obtain at least one target second logical address range after address mapping, so as to obtain at least one target operation dirty data block corresponding to the target operation command based on the at least one target second logical address range; determining, in at least one initial dirty data block of the dirty data block linked list of the initial hash table, at least one second initial dirty data block whose logical address range overlaps with the at least one target second logical address range, wherein the second initial dirty data block is either a second initial read dirty data block or a second initial write dirty data block; invalidating, based on a preset invalidation rule, the data in the conflict area between the at least one second initial dirty data block and the at least one target operation dirty data block, to obtain at least one second initial dirty data block and at least one target operation dirty data block after invalidation; and inserting the at least one target operation dirty data block after invalidation into the dirty data block linked list of the initial hash table, so as to obtain the dirty data block linked list of the target hash table based on the at least one second initial dirty data block after invalidation and the at least one target operation dirty data block.

2. The method of claim 1, wherein the step of mapping the first logical address range based on a first preset address mapping rule to obtain at least one target first logical address range after address mapping includes: if the first logical address range spans a range that is a multiple of the base region size, splitting the first logical address range at multiples of the base region size to obtain multiple first logical address portion ranges, wherein each first logical address portion range is a target first logical address range; and / or, if the first logical address range is within a multiple of the base region size, taking the first logical address range as the target first logical address range.

3. The method of claim 1, wherein, The target hash table includes multiple hash buckets, and each hash bucket includes a linked list of dirty data blocks; The step of determining at least one first target dirty data block whose logical address range overlaps with the at least one target first logical address range in at least one target dirty data block in the dirty data block linked list of the target hash table includes: Based on a hash algorithm, at least one hash value corresponding to the at least one target first logical address range is determined, wherein the at least one hash value is used to map the at least one target bucket in the target hash table corresponding to the at least one target first logical address range; In at least one target bucket corresponding to the at least one target first logical address range, at least one first target dirty data block whose logical address range overlaps with the at least one target first logical address range is identified.

4. The method of claim 1, wherein the target operation command is a read operation command, and the at least one target operation dirty data block is at least one target read operation dirty data block; the step of performing address mapping on the second logical address range based on a second preset address mapping rule to obtain at least one target second logical address range after address mapping, and obtaining at least one target operation dirty data block corresponding to the target operation command based on the at least one target second logical address range, includes: if the second logical address range is within a range that is a multiple of the base region size, aligning the second logical address range to multiples of the base region size, and/or, if the second logical address range spans a range that is a multiple of the base region size, splitting the second logical address range and aligning it to multiples of the base region size, to obtain the at least one target second logical address range; and obtaining data of the at least one target second logical address range from the flash memory, and obtaining the at least one target read operation dirty data block based on the data of the at least one target second logical address range; the step of invalidating the data in the conflict region between the at least one second initial dirty data block and the at least one target operation dirty data block based on a preset invalidation rule, to obtain at least one second initial dirty data block and at least one target operation dirty data block after invalidation, includes: marking as invalid the data, in the at least one second initial dirty data block, of the conflict region corresponding to the at least one target read operation dirty data block; and/or marking as invalid the data, in the at least one target read operation dirty data block, of the conflict region corresponding to the at least one second initial write dirty data block.

5. The method of claim 4, wherein the second logical address range includes a starting second logical address and an ending second logical address; the step of, if the second logical address range is within a range that is a multiple of the base region size, aligning the second logical address range to multiples of the base region size, and/or, if the second logical address range spans a range that is a multiple of the base region size, splitting the second logical address range and aligning it to multiples of the base region size, to obtain the at least one target second logical address range, includes: if the second logical address range is within a range that is a multiple of the base region size, aligning the starting second logical address downward to a multiple of the base region size and aligning the ending second logical address upward to a multiple of the base region size, to obtain an aligned second logical address range, wherein the aligned second logical address range is the target second logical address range; and/or, if the second logical address range spans a range that is a multiple of the base region size, splitting the second logical address range according to multiples of the base region size to obtain multiple second logical address partial ranges, aligning the starting second logical addresses of the multiple second logical address partial ranges downward to multiples of the base region size, and aligning the ending second logical addresses of the multiple second logical address partial ranges upward to multiples of the base region size, to obtain multiple aligned second logical address ranges, wherein the multiple aligned second logical address ranges are multiple target second logical address ranges.
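
The read-side mapping of claims 4 and 5 (split, then align the start downward and the end upward) can be sketched as follows. The function names `align_read_range` and `map_read_range`, the half-open range convention with an exclusive end address, and the region size of 8 are assumptions made for illustration, not details from the patent.

```python
def align_read_range(start, end, region_size):
    """Align start down and end up to multiples of region_size.

    Hypothetical helper; [start, end) is half-open, so the exclusive
    end address is rounded up with ceiling division.
    """
    aligned_start = (start // region_size) * region_size
    aligned_end = -(-end // region_size) * region_size
    return aligned_start, aligned_end


def map_read_range(start, end, region_size):
    """Split [start, end) at base region boundaries, then align each part."""
    if region_size <= 0 or start >= end:
        raise ValueError("need region_size > 0 and start < end")
    parts = []
    cur = start
    while cur < end:
        nxt = min((cur // region_size + 1) * region_size, end)
        parts.append(align_read_range(cur, nxt, region_size))
        cur = nxt
    return parts


# Range inside one base region: only alignment is applied.
print(map_read_range(9, 12, 8))
# Range crossing two boundaries: split first, then each part is aligned.
print(map_read_range(5, 20, 8))
```

In this sketch every target range of a read ends up covering whole base regions, whereas the write-side rule of claim 6 keeps the unaligned partial ranges.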

6. The method of claim 1, wherein the target operation command is a write operation command, and the at least one target operation dirty data block is at least one target write operation dirty data block; the step of performing address mapping on the second logical address range based on a second preset address mapping rule to obtain at least one target second logical address range after address mapping, and obtaining at least one target operation dirty data block corresponding to the target operation command based on the at least one target second logical address range, includes: if the second logical address range spans a range that is a multiple of the base region size, splitting the second logical address range according to multiples of the base region size, and/or, if the second logical address range is within a range that is a multiple of the base region size, neither splitting nor aligning the second logical address range, to obtain the at least one target second logical address range; and obtaining updated data for the at least one target second logical address range, and obtaining the at least one target write operation dirty data block based on the updated data for the at least one target second logical address range; the step of invalidating the data in the conflict region between the at least one second initial dirty data block and the at least one target operation dirty data block based on a preset invalidation rule, to obtain at least one second initial dirty data block and at least one target operation dirty data block after invalidation, includes: marking as invalid the data, in the at least one second initial dirty data block, of the conflict region corresponding to the at least one target write operation dirty data block; and/or marking as invalid the data, in the at least one second initial read dirty data block, of the conflict region corresponding to the at least one target write operation dirty data block.
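
One possible reading of the invalidation rules in claims 4 and 6 is sketched below in Python. Several points are assumptions rather than statements of the claims: the names `DirtyBlock`, `conflict`, and `invalidate`; recording invalid data as a list of sub-ranges; and, for an incoming read block, treating the first clause of claim 4 as applying to existing read dirty data blocks only, so that existing write data is never discarded in favor of data read from flash.

```python
from dataclasses import dataclass, field


@dataclass
class DirtyBlock:
    start: int
    end: int   # exclusive
    kind: str  # "read" or "write"
    # Sub-ranges marked invalid (assumed representation).
    invalid: list = field(default_factory=list)


def conflict(a, b):
    """Return the overlapping sub-range of two blocks, or None."""
    lo, hi = max(a.start, b.start), min(a.end, b.end)
    return (lo, hi) if lo < hi else None


def invalidate(existing, new):
    """Apply the assumed invalidation rule before inserting `new`."""
    for old in existing:
        region = conflict(old, new)
        if region is None:
            continue
        if new.kind == "write":
            # Claim 6: a new write supersedes old read and write data.
            old.invalid.append(region)
        elif old.kind == "write":
            # Claim 4: cached write data supersedes freshly read data.
            new.invalid.append(region)
        else:
            # Assumed reading of claim 4: old read data is replaced.
            old.invalid.append(region)


old_write = DirtyBlock(0, 8, "write")
old_read = DirtyBlock(8, 16, "read")
new_write = DirtyBlock(4, 12, "write")
invalidate([old_write, old_read], new_write)
print(old_write.invalid, old_read.invalid, new_write.invalid)
```

After invalidation, the new block would be appended to the dirty data block linked list, so that any given logical address has valid data in at most one block.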

7. A cache management device, characterized by comprising: an obtaining unit, used to obtain a read search command, wherein the read search command is used to search for and read data within a first logical address range; an address mapping unit, used to perform address mapping on the first logical address range based on a first preset address mapping rule to obtain at least one target first logical address range after address mapping; a determining unit, configured to determine, in at least one target dirty data block of the dirty data block linked list of the target hash table, at least one first target dirty data block whose logical address range overlaps with the at least one target first logical address range, wherein the first target dirty data block is either a first read dirty data block or a first write dirty data block; and an acquisition unit, configured to acquire, from the at least one first target dirty data block, the target data in the overlapping logical address range if the data transmission state of the at least one first target dirty data block is transmission completed; the obtaining unit is further configured to obtain the search result of the read search command based on the target data; the obtaining unit is further configured to obtain a target operation command, wherein the target operation command includes a read operation command and/or a write operation command, the read operation command is used to read data in a second logical address range, and the write operation command is used to write updated data into the second logical address range; the address mapping unit is further configured to perform address mapping on the second logical address range based on a second preset address mapping rule to obtain at least one target second logical address range after address mapping, so as to obtain at least one target operation dirty data block corresponding to the target operation command based on the at least one target second logical address range; the determining unit is further configured to determine, in at least one initial dirty data block of the dirty data block linked list of the initial hash table, at least one second initial dirty data block whose logical address range overlaps with the at least one target second logical address range, wherein the second initial dirty data block is either a second initial read dirty data block or a second initial write dirty data block; the cache management device further includes: an invalidation processing unit, used to invalidate the data in the conflict area between the at least one second initial dirty data block and the at least one target operation dirty data block based on a preset invalidation rule, so as to obtain at least one second initial dirty data block and at least one target operation dirty data block after invalidation; and the acquisition unit is further configured to insert the at least one target operation dirty data block after invalidation into the dirty data block linked list of the initial hash table, so as to obtain the dirty data block linked list of the target hash table based on the at least one second initial dirty data block after invalidation and the at least one target operation dirty data block.

8. A computer device, comprising: a central processing unit and a memory; the memory is either a short-term storage memory or a persistent storage memory; the central processing unit is configured to communicate with the memory and execute instructions in the memory to perform the method according to any one of claims 1 to 6.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium includes instructions that, when executed on a computer, cause the computer to perform the method according to any one of claims 1 to 6.

10. A computer program product comprising instructions or a computer program, characterized in that, when the computer program product is run on a computer, it causes the computer to perform the method according to any one of claims 1 to 6.