A hard disk space management method and device, electronic equipment and readable medium

By clearing and managing the logical storage areas of the ZNS disk and using a RocksDB database to store metadata, the FTL policy and GC function are implemented automatically. This removes the complexity of users managing the reserved space of the ZNS disk themselves and simplifies FTL and GC management.

CN119759265B · Active · Publication Date: 2026-06-16 · CHINA TELECOM CLOUD TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: CHINA TELECOM CLOUD TECH CO LTD
Filing Date: 2024-12-05
Publication Date: 2026-06-16

AI Technical Summary

Technical Problem

When using the reserved space of a ZNS disk, users must implement the FTL strategy and GC function themselves, which sets a relatively high barrier to entry.

Method used

The patent provides a disk space management method that clears the logical storage areas, uses a RocksDB database to store metadata, and, when writing a data block, determines the target area and updates its storage status, thereby implementing the FTL strategy and GC function automatically.

Benefits of technology

It automates the management of the FTL strategy and GC function of the ZNS disk, reducing the complexity of user operation.


Abstract

Embodiments of the present application provide a hard disk space management method and device, electronic equipment, and a readable medium, applied to a hard disk that includes at least one logical storage area. The logical storage area of the hard disk is cleared, and a preset database is used to store the storage usage state and / or storage data state of the logical storage area; the metadata of the logical storage area is organized in units of data blocks. While a preset to-be-written data block is being written to the hard disk, it is determined whether a target logical storage area exists in the hard disk; the target logical storage area stores data related to the to-be-written data block. If it exists, the to-be-written data block is written to the target logical storage area, and the storage usage state and / or storage data state of the target logical storage area in the database is updated. Based on the database, garbage data blocks in any logical storage area of the hard disk are cleared, so that the FTL strategy and GC function of the ZNS disk are implemented automatically.

Description

Technical Field

[0001] This invention relates to the field of hard disk technology, and in particular to a hard disk space management method, a hard disk space management device, an electronic device, and a computer-readable medium.

Background Technology

[0002] Solid-state drives (SSDs) typically reserve 15% of their space for garbage collection (GC) and bad block replacement. This reserved space is not visible to the user.

[0003] A ZNS (Zoned Namespace) SSD is a special type of SSD that exposes the reserved space to the user, who can manage this space directly.

[0004] When using the reserved space on a ZNS disk, users need to manage the disk space themselves and implement the FTL (Flash Translation Layer) strategy and GC function, which involves a relatively steep learning curve. Here, the FTL strategy refers to translating the user's logical block address (LBA) into a physical location on the disk, i.e., locating the data to be operated on.

Summary of the Invention

[0005] This invention provides a hard disk space management method, apparatus, electronic device, and computer-readable storage medium to solve the problem that users of the reserved space in a ZNS disk must manage the disk space themselves and implement the FTL (Flash Translation Layer) strategy and GC function, which sets a high barrier to entry.

[0006] This invention discloses a space management method for a hard disk, applied to a hard disk including at least one logical storage area, the method comprising:

[0007] The logical storage area of the hard disk is cleared, and the storage usage status and / or storage data status of the logical storage area are stored in a preset database; the metadata of the logical storage area is organized in units of data blocks.

[0008] During the process of writing the preset data block to be written to the hard disk, it is determined whether a target logical storage area exists in the hard disk; the target logical storage area stores data related to the data block to be written.

[0009] If it exists, the data block to be written is written to the target logical storage area, and the storage usage status and / or storage data status of the target logical storage area in the database are updated;

[0010] Based on the database, remove garbage data blocks from any of the logical storage areas in the hard disk.

[0011] Optionally, the storage usage state includes an idle state; when the logical storage area is cleared, the logical storage area is in an idle state; the method includes:

[0012] If the target logical storage area does not exist in the hard disk, the data block to be written is written to any logical storage area that is in the idle state.

[0013] Optionally, the storage usage status further includes an active status; updating the storage usage status and / or storage data status of the target logical storage area in the database includes:

[0014] During the process of writing the data block to be written to the target logical storage area, the storage usage status of the target logical storage area is updated from the idle state to the active state;

[0015] After the data block to be written is written to the target logical storage area, the storage data status in the target logical storage area is updated.

[0016] Optionally, the stored data status includes a clearing identifier for at least one data block; the step of clearing garbage data blocks in any of the logical storage areas of the hard disk based on the database includes:

[0017] The logical storage area containing the garbage data blocks in the hard disk is designated as the logical storage area to be processed.

[0018] Modify the clearing identifier of the garbage data block so that it indicates that the garbage data block is a data block to be cleared;

[0019] Based on the clearing identifier, obtain the proportion information of the data blocks to be cleared in the logical storage area to be processed;

[0020] Based on the proportion of data blocks to be cleared, determine whether to clear the data blocks to be cleared.

[0021] Optionally, determining whether to clear the data blocks to be cleared based on the proportion information of the data blocks to be cleared includes:

[0022] Based on the proportion information of the data blocks to be cleared, determine whether the proportion of the data blocks to be cleared in the logical storage area to be processed exceeds a preset proportion threshold.

[0023] If the proportion of the data blocks to be cleared in the logical storage area to be processed exceeds a preset proportion threshold, then the data blocks to be cleared are cleared.

[0024] Optionally, clearing the data block to be cleared includes:

[0025] Based on the clearing identifier of each data block, determine a target data block in the logical storage area to be processed that is not a data block to be cleared;

[0026] The target data block is transferred to another logical storage area on the hard disk, excluding the logical storage area to be processed.

[0027] Clear the logical storage area to be processed.

[0028] Optionally, the method includes:

[0029] After the target data block is transferred to another logical storage area in the hard disk, the storage data state of the logical storage area that received the target data block is modified;

[0030] After the logical storage area to be processed is cleared, modify the storage usage status and / or the storage data status of the logical storage area to be processed.

[0031] This invention also discloses a space management device for a hard disk, applied to a hard disk, the hard disk including at least one logical storage area, the device comprising:

[0032] The clearing module is used to clear the logical storage area of the hard disk and store the storage usage status and / or storage data status of the logical storage area using a preset database; the metadata of the logical storage area is organized in units of data blocks.

[0033] The target logical storage area determination module is used to determine whether a target logical storage area exists in the hard disk during the process of writing a preset data block to be written to the hard disk; the target logical storage area stores data related to the data block to be written.

[0034] The first writing module is used to write the data block to be written into the target logical storage area if it exists, and update the storage usage status and / or storage data status of the target logical storage area in the database.

[0035] The cleaning module is used to clean up garbage data blocks in any of the logical storage areas of the hard disk based on the database.

[0036] Optionally, the storage usage state includes an idle state; when the logical storage area is cleared, the logical storage area is in an idle state; the device includes:

[0037] The second write module is used to write the data block to be written to any logical storage area that is in the idle state if the target logical storage area does not exist in the hard disk.

[0038] Optionally, the storage usage state further includes an active state; the first write module includes:

[0039] The first state update submodule is used to update the storage usage state of the target logical storage area from the idle state to the active state while the data block to be written is being written to the target logical storage area.

[0040] The second state update submodule is used to update the storage data state in the target logical storage area after the data block to be written is written to the target logical storage area.

[0041] Optionally, the stored data status includes a clearing identifier for at least one data block; the cleaning module includes:

[0042] The to-be-processed area determination submodule is used to designate the logical storage area containing the garbage data blocks in the hard disk as the logical storage area to be processed;

[0043] The modification submodule is used to modify the clearing identifier of the garbage data block so that the clearing identifier of the garbage data block indicates that the garbage data block is a data block to be cleared;

[0044] The submodule for obtaining the proportion of data blocks to be cleared is used to obtain the proportion of data blocks to be cleared in the logical storage area to be processed based on the clearing identifier.

[0045] The clearing determination submodule is used to determine whether to clear the data blocks to be cleared based on the proportion information of the data blocks to be cleared.

[0046] Optionally, the clearing determination submodule includes:

[0047] The threshold determination unit is used to determine whether the proportion of the data blocks to be cleared in the logical storage area to be processed exceeds a preset proportion threshold based on the proportion information of the data blocks to be cleared.

[0048] The clearing unit is used to clear the data blocks to be cleared if the proportion of the data blocks to be cleared in the logical storage area to be processed exceeds a preset proportion threshold.

[0049] Optionally, the clearing unit includes:

[0050] The target data block determination subunit is used to determine, based on the clearing identifier of the data block, a target data block in the logical storage area to be processed that is not the data block to be cleared;

[0051] A transfer subunit is used to transfer the target data block to other logical storage areas in the hard disk besides the logical storage area to be processed.

[0052] The clearing subunit is used to clear the logical storage area to be processed.

[0053] Optionally, the device includes:

[0054] The first state modification module is used to modify the storage data state of the logical storage area that received the target data block after the target data block is transferred to another logical storage area in the hard disk;

[0055] The second state modification module is used to modify the storage usage state and / or the storage data state of the logical storage area to be processed after the logical storage area to be processed is cleared.

[0056] This invention also discloses an electronic device, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus;

[0057] The memory is used to store computer programs;

[0058] When the processor executes a program stored in the memory, it implements the method described in the embodiments of the present invention.

[0059] This invention also discloses one or more computer-readable media storing instructions that, when executed by one or more processors, cause the processors to perform the methods described in this invention.

[0060] The embodiments of the present invention have the following advantages:

[0061] In this embodiment of the invention, the hard disk includes at least one logical storage area. The logical storage area of the hard disk is cleared, and the storage usage status and / or storage data status of the logical storage area are stored using a preset database; the metadata of the logical storage area is organized in units of data blocks. While a preset data block to be written is being written to the hard disk, it is determined whether a target logical storage area exists on the hard disk; the target logical storage area stores data related to the data block to be written. If it exists, the data block to be written is written to the target logical storage area, and the storage usage status and / or storage data status of the target logical storage area in the database are updated. Based on the database, garbage data blocks in any logical storage area of the hard disk are cleared. The result is a RocksDB-based ZNS disk space management scheme that implements the FTL strategy and GC function of the ZNS disk automatically.

Attached Figure Description

[0062] Figure 1 is a flowchart illustrating the steps of a hard disk space management method provided in an embodiment of the present invention;

[0063] Figure 2 is a schematic diagram illustrating the relationship between a zone and a chunk provided in an embodiment of the present invention;

[0064] Figure 3 is a state diagram of a logical storage area provided in an embodiment of the present invention;

[0065] Figure 4 is a schematic diagram of writing a data block to be written to a target logical storage area according to an embodiment of the present invention;

[0066] Figure 5 is a flowchart illustrating a GC function provided in an embodiment of the present invention;

[0067] Figure 6 is a structural block diagram of a hard disk space management device provided in an embodiment of the present invention;

[0068] Figure 7 is a block diagram of an electronic device provided in an embodiment of the present invention;

[0069] Figure 8 is a schematic diagram of a computer-readable medium provided in an embodiment of the present invention.

Detailed Implementation

[0070] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.

[0071] To facilitate understanding of the technical solutions and effects of the embodiments of the present invention, the prior art of the present invention will be briefly described below.

[0072] Solid-state drives (SSDs) are widely used in the storage field. SSDs typically reserve 15% of their space for garbage collection (GC) and bad block replacement; this reserved space is not visible to the user.

[0073] A ZNS (Zoned Namespace) SSD is a special type of SSD that exposes the reserved space to the user, who can manage this space directly.

[0074] A ZNS disk consists of multiple zones (logical storage areas). A zone must be opened before data can be written to it. A standard zone can range from hundreds of megabytes to tens of gigabytes in size.

[0075] In a ZNS disk, each zone is append-only: data can only be added at the end of the zone and cannot be overwritten, so writes within a zone must be performed sequentially. Zones also support random reads, allowing users to read data from any position within the zone.

[0076] When using the reserved space on a ZNS disk, users need to manage the disk space themselves and implement the FTL (Flash Translation Layer) strategy and GC function, which involves a relatively steep learning curve. Here, the FTL strategy refers to translating the user's logical block address (LBA) into a physical location on the disk, i.e., locating the data to be operated on.

[0077] Referring to Figure 1, a flowchart of a hard disk space management method provided in an embodiment of the present invention is shown. The method is applied to a hard disk that includes at least one logical storage area, and specifically includes the following steps:

[0078] Step 101: Clear the logical storage area of the hard disk, and use a preset database to store the storage usage status and / or storage data status of the logical storage area; the metadata of the logical storage area is organized in units of data blocks;

[0079] In this embodiment of the invention, the ZNS hard disk includes at least one logical storage area.

[0080] Referring to Figure 2, the relationship between zones and chunks according to an embodiment of the present invention is shown. ZNS hard drives manage metadata at the zone level. A zone can consist of multiple chunks, so the metadata of a zone is composed of the metadata of its chunks; that is, the metadata of a logical storage area is organized in units of data blocks. Specifically:

[0081]

[0082]

[0083] Wherein, `zone_id` is the unique identifier of the logical storage area, `chunk_id` is the unique identifier of a chunk, `chunk_len` is the length of the chunk, `offset_of_zone` is the offset of the chunk within the zone, `is_purged` indicates whether the chunk has been purged (purging means that the space of the chunk can be released), `is_sealed` indicates whether the chunk has been sealed (sealing means that the chunk can no longer be written to), `empty` denotes empty, and `append data` means adding new content to the end of the existing data without overwriting it. In this embodiment of the invention, a preset database can be used to store the storage usage status and / or storage data status of the logical storage area. The storage usage status indicates whether the logical storage area is in the idle, active, or full state; the storage data status comprises the length of each data block in the logical storage area and the clearing identifier of each data block.
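
The fields described above can be pictured as a pair of record types. The following is a minimal Python sketch, assuming the zone metadata is simply a list of per-chunk records; the class names `ChunkMeta` and `ZoneMeta` and the helper `used_bytes` are hypothetical and do not come from the patent, which only names the fields.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChunkMeta:
    chunk_id: int              # unique identifier of the chunk
    chunk_len: int             # number of bytes written for this chunk so far
    offset_of_zone: int        # byte offset of the chunk inside its zone
    is_purged: bool = False    # True once the chunk's space may be released
    is_sealed: bool = False    # True once the chunk accepts no more writes


@dataclass
class ZoneMeta:
    zone_id: int                                            # unique identifier of the zone
    chunks: List[ChunkMeta] = field(default_factory=list)   # empty right after a reset

    def used_bytes(self) -> int:
        # Hypothetical helper: zones are append-only, so the bytes in use
        # are the sum of the lengths of all chunks recorded so far.
        return sum(c.chunk_len for c in self.chunks)


zone = ZoneMeta(zone_id=0)
zone.chunks.append(ChunkMeta(chunk_id=7, chunk_len=4096, offset_of_zone=0))
```

A freshly reset zone corresponds to `ZoneMeta` with an empty `chunks` list, which matches the "empty metadata" that the description persists after formatting.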

[0084] In this embodiment of the invention, all zones are reset (the reset operation), which clears their original data. This amounts to formatting the zones: after formatting, the metadata of each zone is empty, and this empty metadata is stored in RocksDB, a high-performance embedded key-value database used here to store and manage zone metadata.

[0085] In a ZNS disk, only one data block (chunk) can be written to a zone at a time. This is because append writes in a ZNS SSD are performed zone by zone, and a chunk being written to a zone cannot be interleaved with other chunks. Only after the chunk is sealed can other chunks be written to that zone. A chunk is sealed once its writes are complete, meaning that the data in that chunk is finalized and will not be modified again.

[0086] Referring to Figure 3, a state diagram of a logical storage area provided in an embodiment of the present invention is shown.

[0087] In this scheme, the zone status is divided into three types:
idle: the zone is not currently being written to by any chunk and can be written to by new chunks;
active: the zone is being written to by a chunk and cannot be written to by other chunks at this time;
full: the zone is full and cannot be written to by any more chunks.
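
The three states can be modelled as a small state machine. The sketch below is illustrative only: the transition table is an assumption pieced together from the write, seal, and GC behaviour described elsewhere in this document (Figure 3 itself is not reproduced here), and the names `ZoneState`, `ALLOWED`, and `transition` are hypothetical.

```python
from enum import Enum


class ZoneState(Enum):
    IDLE = "idle"      # not being written by any chunk; new chunks may start here
    ACTIVE = "active"  # one chunk is currently being written
    FULL = "full"      # no room for further chunks


# Assumed transitions (not taken from Figure 3):
ALLOWED = {
    (ZoneState.IDLE, ZoneState.ACTIVE),  # a chunk starts writing into the zone
    (ZoneState.ACTIVE, ZoneState.IDLE),  # chunk sealed and enough space remains
    (ZoneState.ACTIVE, ZoneState.FULL),  # chunk sealed and too little space remains
    (ZoneState.FULL, ZoneState.IDLE),    # zone reset after garbage collection
}


def transition(current: ZoneState, target: ZoneState) -> ZoneState:
    # Reject any state change that the assumed table does not list.
    if (current, target) not in ALLOWED:
        raise ValueError(f"illegal zone transition: {current.value} -> {target.value}")
    return target


state = transition(ZoneState.IDLE, ZoneState.ACTIVE)
```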

[0088] In this embodiment of the invention, the empty metadata is persisted to the RocksDB database: a new record is created in RocksDB for the metadata of a given data block (such as a chunk), but at this point the record holds no actual content; it is empty or a placeholder.

[0089] Specifically, this process may include the following steps:

[0090] Define a key: First, you need to define a unique key that will be used to identify this metadata record in RocksDB. A key is typically a string or binary sequence that uniquely represents the associated data block.

[0091] Create a null value: Then, create a null value, which may be an empty string, an empty array, an empty object, or other data structure that represents "no content".

[0092] Write to RocksDB: Write the key and the null value to RocksDB as a key-value pair. This process is called "persistence" because the data is saved to non-volatile storage and is not lost even if the system restarts.
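
The three steps above (define a key, create a null value, write the pair) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the key layout (`zone/` prefix plus a big-endian id), the JSON encoding of the empty value, and the `KVStore` class are all assumptions, and `KVStore` is an in-memory stand-in rather than RocksDB, so nothing here is actually durable.

```python
import json
import struct


def zone_key(zone_id: int) -> bytes:
    # Hypothetical key layout: fixed prefix plus big-endian 64-bit id,
    # so that keys sort in zone order under bytewise comparison.
    return b"zone/" + struct.pack(">Q", zone_id)


def empty_zone_value() -> bytes:
    # "No content": an empty chunk list, serialized as JSON (an assumed encoding).
    return json.dumps({"chunks": []}).encode("utf-8")


class KVStore:
    """In-memory stand-in for the key-value database (not RocksDB itself)."""

    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes):
        return self._data.get(key)


db = KVStore()
db.put(zone_key(3), empty_zone_value())
stored = json.loads(db.get(zone_key(3)))
```

With a real RocksDB instance the `put` call would map onto the database's own write operation, which is what makes the record survive a restart.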

[0093] In a specific example, the process of clearing the zones is as follows: first, the ZNS disk's API is called to retrieve a list of all zones; then, the reset API is called on each zone to clear its data. After formatting, the metadata of each zone is empty, and this metadata is persisted to RocksDB. At this point, all zones are in the idle state.
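
The formatting pass just described can be sketched as a single loop. Everything device-related below is hypothetical: `FakeZnsDevice`, `list_zones`, and `reset_zone` stand in for whatever API the actual ZNS disk exposes, and a plain dict stands in for RocksDB.

```python
class FakeZnsDevice:
    """Hypothetical stand-in for a ZNS device; real devices are driven through their own API."""

    def __init__(self, zone_count: int):
        # Each zone starts with some leftover bytes to make the reset visible.
        self.zones = {z: bytearray(b"old") for z in range(zone_count)}

    def list_zones(self):
        return sorted(self.zones)

    def reset_zone(self, zone_id: int) -> None:
        self.zones[zone_id] = bytearray()


def format_device(device, db) -> list:
    idle = []
    for zone_id in device.list_zones():
        device.reset_zone(zone_id)                      # clear the zone's data
        db[zone_id] = {"state": "idle", "chunks": []}   # persist the empty metadata
        idle.append(zone_id)                            # the zone is now available
    return idle


device = FakeZnsDevice(zone_count=4)
db = {}  # dict standing in for the metadata database
idle_zones = format_device(device, db)
```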

[0094] Step 102: During the process of writing the preset data block to be written to the hard disk, it is determined whether there is a target logical storage area in the hard disk; the target logical storage area stores data related to the data block to be written.

[0095] In this embodiment of the invention, during the process of writing a preset data block to be written to the hard disk, it is determined whether a target logical storage area exists in the hard disk; the target logical storage area stores data related to the data block to be written.

[0096] Step 103: If it exists, write the data block to be written into the target logical storage area, and update the storage usage status and / or storage data status of the target logical storage area in the database.

[0097] In this embodiment of the invention, if a target logical storage area exists in the hard disk, the data block to be written is written to the target logical storage area, and the storage usage status and / or storage data status of the target logical storage area in the database are updated.

[0098] In some embodiments of the present invention, the storage usage state includes an idle state; the logical storage area is in an idle state when the logical storage area is cleared; the method includes:

[0099] If the target logical storage area does not exist in the hard disk, the data block to be written is written to any logical storage area that is in the idle state.

[0100] Referring to Figure 4, a schematic diagram of writing a data block to be written to a target logical storage area according to an embodiment of the present invention is shown.

[0101] For read and write operations, the business logic is unaware of which zone the data is specifically written to. Two functions, append and read, are defined to perform data write and read operations on a chunk-by-chunk basis.

[0102] The `int append(uint64_t chunk_id, void *data, uint64_t data_length)` function appends data to a specified chunk. `chunk_id` is the unique identifier of the chunk, `data` is a pointer to the data to be written, and `data_length` is the length of the data. The function returns an integer, which may indicate a success or error code. `uint64_t` is a data type representing an unsigned 64-bit integer. `void*` is a pointer type that can point to data of any type.

[0103] The `int read(uint64_t chunk_id, void *data, uint64_t read_length)` function reads data from a specified chunk. `chunk_id` is a unique identifier for the chunk, `data` is a pointer to a buffer used to store the read data, and `read_length` is the length of the data to be read. The function returns an integer, which may indicate either a success or error code.

[0104] In this embodiment of the invention, these two functions prepare the ground for garbage collection (GC), a mechanism for managing SSD storage space. Since GC operations may move chunks to different zones, the location of a chunk may change, so upper-layer applications and business logic should not depend on the physical location of a chunk. Instead, the system provides an abstraction layer, referred to as "shielding," that hides the underlying implementation details: an application requests data by chunk identifier (`chunk_id`) without needing to know which zone of the hard drive holds it, and the system is responsible for finding and returning the correct data based on that identifier. This makes upper-layer applications simpler and more stable.
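
The shielding idea can be illustrated with a small lookup table from `chunk_id` to the chunk's current location. This is a sketch under stated assumptions: the `ChunkIndex` class and its methods `place`, `read`, `relocate`, and `zone_of` are hypothetical, zones are modelled as in-memory byte arrays, and the old copy of a relocated chunk is simply left behind as garbage.

```python
class ChunkIndex:
    def __init__(self):
        self._where = {}   # chunk_id -> (zone_id, offset_of_zone, chunk_len)
        self._zones = {}   # zone_id -> bytearray standing in for the zone's contents

    def place(self, chunk_id: int, zone_id: int, data: bytes) -> None:
        zone = self._zones.setdefault(zone_id, bytearray())
        offset = len(zone)           # append-only: new data lands at the end
        zone.extend(data)
        self._where[chunk_id] = (zone_id, offset, len(data))

    def read(self, chunk_id: int) -> bytes:
        # The caller supplies only the chunk id; the zone is looked up internally.
        zone_id, offset, length = self._where[chunk_id]
        return bytes(self._zones[zone_id][offset:offset + length])

    def relocate(self, chunk_id: int, new_zone_id: int) -> None:
        # What GC might do: copy the chunk elsewhere and repoint the mapping.
        self.place(chunk_id, new_zone_id, self.read(chunk_id))

    def zone_of(self, chunk_id: int) -> int:
        return self._where[chunk_id][0]


index = ChunkIndex()
index.place(42, zone_id=0, data=b"hello")
before = index.read(42)
index.relocate(42, new_zone_id=5)
after = index.read(42)
```

The caller's `read(42)` returns the same bytes before and after the move, which is the property the abstraction layer is meant to guarantee.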

[0105] When a write request for a chunk is received, the following steps are taken to process it: Before writing data, the system checks whether a zone has already been allocated for this chunk. If no zone has been allocated for the chunk, the system retrieves a new zone from the list of idle zones to store the chunk.

[0106] In a specific example, the process of writing data to the hard drive is as follows. The user calls the `Append(uint64_t chunk_id, void *data, uint64_t data_len)` interface. The system first checks whether a zone already stores data for this chunk. If so, it continues writing to that zone; after a successful write, it updates the length of the chunk in that zone's metadata and persists it to RocksDB. If the chunk is being written for the first time, the system takes a zone from the idle list, writes the chunk to the zone, changes the zone's status to active, and adds it to the active list; after the data is written successfully, the zone's metadata is persisted to RocksDB. To seal a chunk, the system only needs to find the corresponding zone, set the chunk's `is_sealed` value to `true` in the zone's metadata, and persist it to RocksDB. It then checks whether the zone has enough remaining space: if so, it changes the zone's status to idle; otherwise, it changes the zone's status to full.
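
The write and seal flow above can be sketched in a few dozen lines. This is a simplified illustration, not the patented implementation: `ZoneManager`, `ZONE_CAPACITY`, and `MIN_FREE` are hypothetical, a dict stands in for RocksDB, zones are in-memory byte arrays, and the return codes (0 for success, -1 for failure) are an assumed convention.

```python
import copy

ZONE_CAPACITY = 16   # hypothetical zone size in bytes, tiny for illustration
MIN_FREE = 4         # hypothetical threshold for "enough remaining space"


class ZoneManager:
    def __init__(self, zone_count: int):
        self.db = {}                                           # stand-in for RocksDB
        self.data = {z: bytearray() for z in range(zone_count)}
        self.meta = {z: {"state": "idle", "chunks": {}} for z in range(zone_count)}
        self.chunk_zone = {}                                   # chunk_id -> zone_id

    def _persist(self, zone_id: int) -> None:
        self.db[zone_id] = copy.deepcopy(self.meta[zone_id])

    def _take_idle_zone(self, needed: int):
        for z, m in self.meta.items():
            if m["state"] == "idle" and len(self.data[z]) + needed <= ZONE_CAPACITY:
                return z
        return None

    def append(self, chunk_id: int, data: bytes) -> int:
        zone_id = self.chunk_zone.get(chunk_id)
        if zone_id is None:
            # First write of this chunk: take an idle zone and mark it active.
            zone_id = self._take_idle_zone(len(data))
            if zone_id is None:
                return -1
            self.meta[zone_id]["state"] = "active"
            self.meta[zone_id]["chunks"][chunk_id] = {
                "chunk_len": 0, "offset_of_zone": len(self.data[zone_id]),
                "is_sealed": False, "is_purged": False}
            self.chunk_zone[chunk_id] = zone_id
        chunk = self.meta[zone_id]["chunks"][chunk_id]
        if chunk["is_sealed"] or len(self.data[zone_id]) + len(data) > ZONE_CAPACITY:
            return -1
        self.data[zone_id].extend(data)     # append-only write
        chunk["chunk_len"] += len(data)     # update the chunk length
        self._persist(zone_id)              # persist the zone metadata
        return 0

    def seal(self, chunk_id: int) -> None:
        zone_id = self.chunk_zone[chunk_id]
        meta = self.meta[zone_id]
        meta["chunks"][chunk_id]["is_sealed"] = True
        free = ZONE_CAPACITY - len(self.data[zone_id])
        meta["state"] = "idle" if free >= MIN_FREE else "full"
        self._persist(zone_id)


mgr = ZoneManager(zone_count=2)
rc1 = mgr.append(1, b"abcdef")     # first write: zone 0 becomes active
rc2 = mgr.append(1, b"gh")         # continues in the same zone
mgr.seal(1)                        # 8 bytes remain, so zone 0 returns to idle
rc3 = mgr.append(2, b"ijklmnop")   # chunk 2 fills the rest of zone 0
mgr.seal(2)                        # no space remains, so zone 0 becomes full
```

Unlike the description, this sketch persists once at the end of `seal`, after the state change, rather than immediately after setting `is_sealed`; a real implementation would also need to handle a chunk that outgrows its zone.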

[0107] In some embodiments of the present invention, the storage usage state further includes an active state; updating the storage usage state and / or storage data state of the target logical storage area in the database includes:

[0108] During the process of writing the data block to be written to the target logical storage area, the storage usage status of the target logical storage area is updated from the idle state to the active state;

[0109] After the data block to be written is written to the target logical storage area, the storage data status in the target logical storage area is updated.

[0110] In this embodiment of the invention, after data is written, the system needs to update the metadata. Metadata updates mainly include updating the status of the chunk in the database (e.g., whether it is in an active write state) and the length of the chunk. The database indicates the status and length of the chunks in the zone.

[0111] Step 104: Based on the database, clear the garbage data blocks in any of the logical storage areas of the hard disk.

[0112] In this embodiment of the invention, garbage data blocks in any logical storage area of the hard disk can be cleared according to the database.

[0113] In some embodiments of the present invention, the stored data status includes a clearing identifier for at least one data block; the step of clearing garbage data blocks in any of the logical storage areas of the hard disk based on the database includes:

[0114] The logical storage area containing the garbage data blocks in the hard disk is designated as the logical storage area to be processed.

[0115] Modify the clearing identifier of the garbage data block so that it indicates that the garbage data block is a data block to be cleared;

[0116] Based on the clearing identifier, obtain the proportion information of the data blocks to be cleared in the logical storage area to be processed;

[0117] Based on the proportion of data blocks to be cleared, determine whether to clear the data blocks to be cleared.

[0118] In this embodiment of the invention, the stored data status includes a clearing identifier for at least one data block. A logical storage area on the hard disk that contains garbage data blocks can be designated as the logical storage area to be processed. The clearing identifier of each garbage data block is modified so that it marks the block as a data block to be cleared; based on the clearing identifiers, the proportion of data blocks to be cleared in the logical storage area to be processed is obtained; and based on this proportion information, it is determined whether to clear the data blocks to be cleared.

[0119] In some embodiments of the present invention, determining whether to clear the data block to be cleared based on the proportion information of the data block to be cleared includes:

[0120] Based on the proportion information of the data blocks to be cleared, determine whether the proportion of the data blocks to be cleared in the logical storage area to be processed exceeds a preset proportion threshold.

[0121] If the proportion of the data blocks to be cleared in the logical storage area to be processed exceeds a preset proportion threshold, then the data blocks to be cleared are cleared.

[0122] In this embodiment of the invention, when determining whether to clear data blocks based on the proportion information of the data blocks to be cleared, it is determined whether the proportion of data blocks to be cleared in the logical storage area to be processed exceeds a preset proportion threshold. If the proportion of data blocks to be cleared in the logical storage area to be processed exceeds the preset proportion threshold, then the data blocks to be cleared are cleared.

[0123] In some embodiments of the present invention, clearing the data block to be cleared includes:

[0124] Based on the clearing identifier of the data block, determine that the target data block in the logical storage area to be processed is not the data block to be cleared;

[0125] The target data block is transferred to another logical storage area on the hard disk, excluding the logical storage area to be processed.

[0126] Clear the logical storage area to be processed.

[0127] In this embodiment of the invention, when clearing a data block to be cleared, it is necessary to determine, based on the clearing identifier of the data block, a target data block that is not the data block to be cleared in the logical storage area to be processed. The target data block is then transferred to another logical storage area on the hard disk, excluding the logical storage area to be processed, and the logical storage area to be processed is cleared.

[0128] In some embodiments of the present invention, the method includes:

[0129] After the target data block is transferred to another logical storage area in the hard disk, the storage data state of the logical storage area that received the target data block is modified;

[0130] After the logical storage area to be processed is cleared, modify the storage usage status and / or the storage data status of the logical storage area to be processed.

[0131] In this embodiment of the invention, after the target data block is transferred to another logical storage area in the hard disk, the storage data state of the logical storage area that received the target data block is modified; after the logical storage area to be processed is cleared, the storage usage state and / or storage data state of the logical storage area to be processed is modified.
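The relocation and state updates described in paragraphs [0123] to [0131] can be sketched as follows. This is an in-memory illustration only: no data is actually copied on a device, and the `Zone` and `Chunk` classes, the `reclaim` function, and the state strings are assumptions rather than names from the source.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    chunk_id: int
    length: int
    is_purged: bool = False  # clearing identifier


@dataclass
class Zone:
    zone_id: int
    state: str = "idle"  # storage usage state: "idle", "active" or "full"
    chunks: list = field(default_factory=list)  # storage data state


def reclaim(victim: Zone, dest: Zone) -> int:
    """Move blocks that are not marked for clearing, then clear the victim."""
    if victim.zone_id == dest.zone_id:
        raise ValueError("destination must differ from the zone being processed")
    survivors = [c for c in victim.chunks if not c.is_purged]
    # Transfer the target data blocks and update the receiving zone's state.
    dest.chunks.extend(survivors)
    if survivors and dest.state == "idle":
        dest.state = "active"
    # Clear the processed zone and update its usage and data state.
    victim.chunks = []
    victim.state = "idle"
    return len(survivors)
```

Updating the receiving zone before clearing the processed zone mirrors the order given in the text, so the valid data is recorded in its new location before the old one is released.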

[0132] Referring to Figure 5, a flowchart of a GC function provided in an embodiment of the present invention is shown.

[0133] When a chunk needs to be deleted, only the zone's metadata is modified: the chunk's `is_purged` property is set to `true`. This makes deletion very simple and efficient. However, since the data space is not actually released, a lot of garbage data can accumulate within the zone. `is_purged` is the clearing identifier.

[0134] In this embodiment of the invention, the ratio of valid data to space is calculated through the zone's metadata. When the valid data in a zone is less than 50%, GC needs to be started to move the valid data to a new zone, and then the old zone is reset to release space.

[0135] In Figure 5, zone_id=1 stores 3 chunks, of which chunk=1 and chunk=3 have been marked for deletion. GC is started at this point: the valid chunk=2 is moved to zone_id=2, and zone_id=1 is then reset, which reclaims the space of zone_id=1. After the data is moved, the modified metadata of zone_id=1 and the modified metadata of zone_id=2 are persisted to RocksDB.

[0136] In a specific example, the GC function's process is as follows:

[0137] When deleting a chunk, simply locate the corresponding zone, set the chunk's `is_purged=true` in the zone's metadata, and persist the metadata to RocksDB. At this point, the total valid space of the zone can be recalculated. If the valid space is less than 50%, garbage collection (GC) begins on that zone. The valid data in this zone is read and written to a new zone. After the write succeeds, the old zone is reset, its metadata is cleared and written back to RocksDB, and its state is set to idle, which completes the zone's space reclamation.
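The delete-then-collect flow of this example can be sketched end to end in Python. A plain `dict` stands in for RocksDB here. The key layout `zone/<id>`, the JSON encoding, the `ZONE_CAPACITY` value, and all function names are assumptions made for illustration. The physical read and rewrite of chunk data is omitted, and only the metadata changes are shown.

```python
import json

ZONE_CAPACITY = 300  # hypothetical zone size in bytes


def load(db, zone_id):
    return json.loads(db[f"zone/{zone_id}"])


def save(db, zone_id, meta):
    db[f"zone/{zone_id}"] = json.dumps(meta)


def valid_ratio(meta):
    """Share of the zone's capacity that still holds valid data."""
    valid = sum(c["length"] for c in meta["chunks"] if not c["is_purged"])
    return valid / ZONE_CAPACITY


def delete_chunk(db, zone_id, chunk_id, spare_zone_id):
    """Mark a chunk as purged; run GC if valid data drops below 50%.

    Returns True when the zone was garbage collected.
    """
    meta = load(db, zone_id)
    for c in meta["chunks"]:
        if c["chunk_id"] == chunk_id:
            c["is_purged"] = True
    save(db, zone_id, meta)  # persist the deletion mark first
    if valid_ratio(meta) >= 0.5:
        return False
    # GC: record surviving chunks in the spare zone, then reset this zone.
    survivors = [c for c in meta["chunks"] if not c["is_purged"]]
    spare = load(db, spare_zone_id)
    spare["chunks"] += survivors
    if survivors:
        spare["state"] = "active"
    save(db, spare_zone_id, spare)
    save(db, zone_id, {"state": "idle", "chunks": []})
    return True
```

Run against the Figure 5 scenario (three equal chunks in zone 1, with chunks 1 and 3 deleted in turn), the second deletion triggers collection and leaves chunk 2 recorded under zone 2.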

[0138] In this embodiment of the invention, the hard disk includes at least one logical storage area. The logical storage area of ​​the hard disk is cleared, and the storage usage status and / or storage data status of the logical storage area are stored using a preset database. The metadata of the logical storage area is data blocks. During the process of writing preset data blocks to be written to the hard disk, it is determined whether a target logical storage area exists on the hard disk. The target logical storage area stores data related to the data blocks to be written. If it exists, the data blocks to be written are written to the target logical storage area, and the storage usage status and / or storage data status of the target logical storage area in the database are updated. Based on the database, garbage data blocks in any logical storage area of ​​the hard disk are cleared. A ZNS disk space management scheme based on RocksDB is provided, which realizes the automatic implementation of the FTL implementation strategy and GC function of the ZNS disk.

[0139] In this embodiment of the invention, a simple and efficient management scheme for ZNS disk space is provided at the software level. This scheme ensures the reliability of metadata by storing the metadata in RocksDB. A set of read and write interfaces for ZNS disks is encapsulated. These interfaces no longer read and write by LBA, but directly provide read and write operations on chunks.
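A chunk-oriented interface of the kind described here might look like the following sketch. It simulates append-only zones with `bytearray` objects and keeps the chunk index in memory, whereas the scheme in the text persists metadata in RocksDB. The class name `ChunkStore` and its methods are hypothetical and are not the interfaces of the patent.

```python
class ChunkStore:
    """Callers address data by chunk ID; zone and offset stay internal."""

    def __init__(self, zone_count, zone_size):
        self.zone_size = zone_size
        self.zones = [bytearray() for _ in range(zone_count)]
        self.index = {}  # chunk_id -> (zone_id, offset, length)

    def write_chunk(self, chunk_id, data):
        """Append the chunk to the first zone with room; return its zone ID."""
        if chunk_id in self.index:
            raise KeyError(f"chunk {chunk_id!r} already exists")
        for zone_id, zone in enumerate(self.zones):
            if len(zone) + len(data) <= self.zone_size:
                offset = len(zone)  # zones are only ever appended to
                zone.extend(data)
                self.index[chunk_id] = (zone_id, offset, len(data))
                return zone_id
        raise OSError("no zone has enough free space")

    def read_chunk(self, chunk_id):
        """Return the chunk's bytes using the recorded zone, offset and length."""
        zone_id, offset, length = self.index[chunk_id]
        return bytes(self.zones[zone_id][offset:offset + length])
```

The caller never sees a logical block address, and the mapping from chunk ID to a zone and offset plays the role of the simplified FTL.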

[0140] It should be noted that, for the sake of simplicity, the method embodiments are all described as a series of actions. However, those skilled in the art should understand that the embodiments of the present invention are not limited to the described order of actions, because according to the embodiments of the present invention, some steps can be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions involved are not necessarily essential to the embodiments of the present invention.

[0141] Referring to Figure 6, a structural block diagram of a hard disk space management device provided in an embodiment of the present invention is shown. The device is applied to a hard disk, which includes at least one logical storage area, and may specifically include the following modules:

[0142] The clearing module 601 is used to clear the logical storage area of ​​the hard disk and store the storage usage status and / or storage data status of the logical storage area using a preset database; the metadata of the logical storage area is a data block.

[0143] The target logical storage area determination module 602 is used to determine whether a target logical storage area exists in the hard disk during the process of writing a preset data block to be written to the hard disk; the target logical storage area stores data related to the data block to be written.

[0144] The first writing module 603 is used to write the data block to be written into the target logical storage area if it exists, and update the storage usage status and / or storage data status of the target logical storage area in the database.

[0145] The cleaning module 604 is used to clean up garbage data blocks in any of the logical storage areas of the hard disk based on the database.

[0146] In an optional embodiment of the present invention, the storage usage state includes an idle state; the logical storage area is in an idle state when the logical storage area is cleared; the device includes:

[0147] The second write module is used to write the data block to be written to any logical storage area that is in the idle state if the target logical storage area does not exist in the hard disk.

[0148] In an optional embodiment of the present invention, the storage usage state further includes an active state; the first write module includes:

[0149] The first state update submodule is used to update the storage usage state of the target logical storage area from the idle state to the active state while the data block to be written is being written to the target logical storage area.

[0150] The second state update submodule is used to update the storage data state in the target logical storage area after the data block to be written is written to the target logical storage area.
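The write path handled by the determination and write modules can be sketched as below. The source does not define how data is judged to be "related" to the block being written, so this sketch assumes a hypothetical `group` tag. The dictionary fields, the function names, and the rule for entering the full state are likewise assumptions.

```python
def pick_zone(zones, group, size):
    """Prefer an active zone holding related data; otherwise any idle zone."""
    for z in zones:
        if (z["state"] == "active" and z["group"] == group
                and z["used"] + size <= z["capacity"]):
            return z
    for z in zones:
        if z["state"] == "idle" and size <= z["capacity"]:
            return z
    return None


def write_block(zones, group, size):
    """Write a block of `size` bytes and update usage and data state."""
    z = pick_zone(zones, group, size)
    if z is None:
        raise RuntimeError("no suitable zone available")
    if z["state"] == "idle":
        z["state"] = "active"  # usage state: idle -> active
        z["group"] = group
    z["used"] += size          # data state updated after the write
    if z["used"] == z["capacity"]:
        z["state"] = "full"
    return z["zone_id"]
```

Grouping related blocks into the same zone means they tend to become garbage together, which can reduce the amount of valid data that must be moved during GC.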

[0151] In an optional embodiment of the present invention, the storage data status includes a clearing identifier for at least one data block; the cleaning module includes:

[0152] The to-be-processed logical storage area determination submodule is used to designate the logical storage area containing the garbage data blocks in the hard disk as the logical storage area to be processed;

[0153] The modification submodule is used to modify the clearing identifier of the garbage data block so that the clearing identifier of the garbage data block indicates that the garbage data block is a data block to be cleared;

[0154] The submodule for obtaining the proportion of data blocks to be cleared is used to obtain the proportion of data blocks to be cleared in the logical storage area to be processed based on the clearing identifier.

[0155] The clearing judgment submodule is used to determine whether to clear the data block to be cleared based on the proportion information of the data block to be cleared.

[0156] In an optional embodiment of the present invention, the clearing judgment submodule includes:

[0157] The threshold determination unit is used to determine whether the proportion of the data blocks to be cleared in the logical storage area to be processed exceeds a preset proportion threshold based on the proportion information of the data blocks to be cleared.

[0158] The clearing unit is used to clear the data blocks to be cleared if the proportion of the data blocks to be cleared in the logical storage area to be processed exceeds a preset proportion threshold.

[0159] In an optional embodiment of the present invention, the clearing unit includes:

[0160] The target data block determination subunit is used to determine, based on the clearing identifier of the data block, a target data block in the logical storage area to be processed that is not the data block to be cleared;

[0161] A transfer subunit is used to transfer the target data block to other logical storage areas in the hard disk besides the logical storage area to be processed.

[0162] The clearing subunit is used to clear the logical storage area to be processed.

[0163] In an optional embodiment of the present invention, the device includes:

[0164] The first state modification module is used to modify the storage data state of the logical storage area that received the target data block after the target data block is transferred to another logical storage area in the hard disk;

[0165] The second state modification module is used to modify the storage usage state and / or the storage data state of the logical storage area to be processed after the logical storage area to be processed is cleared.

[0166] As the device embodiment is basically similar to the method embodiment, the description is relatively simple, and relevant parts can be found in the description of the method embodiment.

[0167] In addition, embodiments of the present invention also provide an electronic device which, as shown in Figure 7, includes a processor 701, a communication interface 702, a memory 703, and a communication bus 704, wherein the processor 701, the communication interface 702, and the memory 703 communicate with each other through the communication bus 704.

[0168] Memory 703 is used to store computer programs;

[0169] When processor 701 executes a program stored in memory 703, it performs the following steps:

[0170] The logical storage area of ​​the hard disk is cleared, and the storage usage status and / or storage data status of the logical storage area are stored in a preset database; the metadata of the logical storage area is a data block.

[0171] During the process of writing the preset data block to be written to the hard disk, it is determined whether a target logical storage area exists in the hard disk; the target logical storage area stores data related to the data block to be written.

[0172] If it exists, the data block to be written is written to the target logical storage area, and the storage usage status and / or storage data status of the target logical storage area in the database are updated;

[0173] Based on the database, remove garbage data blocks from any of the logical storage areas in the hard disk.

[0174] In an optional embodiment of the present invention, the storage usage state includes an idle state; the logical storage area is in an idle state when the logical storage area is cleared; the method includes:

[0175] If the target logical storage area does not exist in the hard disk, the data block to be written is written to any logical storage area that is in the idle state.

[0176] In an optional embodiment of the present invention, the storage usage state further includes an active state; updating the storage usage state and / or storage data state of the target logical storage area in the database includes:

[0177] During the process of writing the data block to be written to the target logical storage area, the storage usage status of the target logical storage area is updated from the idle state to the active state;

[0178] After the data block to be written is written to the target logical storage area, the storage data status in the target logical storage area is updated.

[0179] In an optional embodiment of the present invention, the stored data status includes at least one data block clearing identifier; the step of clearing garbage data blocks in any of the logical storage areas of the hard disk based on the database includes:

[0180] The logical storage area containing the garbage data blocks in the hard disk is designated as the logical storage area to be processed.

[0181] Modify the clearing identifier of the garbage data block so that it indicates that the garbage data block is a data block to be cleared;

[0182] Based on the clearing identifier, obtain the proportion information of the data blocks to be cleared in the logical storage area to be processed;

[0183] Based on the proportion of data blocks to be cleared, determine whether to clear the data blocks to be cleared.

[0184] In an optional embodiment of the present invention, determining whether to clear the data blocks to be cleared based on the proportion information of the data blocks to be cleared includes:

[0185] Based on the proportion information of the data blocks to be cleared, determine whether the proportion of the data blocks to be cleared in the logical storage area to be processed exceeds a preset proportion threshold.

[0186] If the proportion of the data blocks to be cleared in the logical storage area to be processed exceeds a preset proportion threshold, then the data blocks to be cleared are cleared.

[0187] In an optional embodiment of the present invention, clearing the data block to be cleared includes:

[0188] Based on the clearing identifier of the data block, determine a target data block in the logical storage area to be processed that is not a data block to be cleared;

[0189] The target data block is transferred to another logical storage area on the hard disk, excluding the logical storage area to be processed.

[0190] Clear the logical storage area to be processed.

[0191] In an optional embodiment of the present invention, the method includes:

[0192] After the target data block is transferred to another logical storage area in the hard disk, the storage data state of the logical storage area that received the target data block is modified;

[0193] After the logical storage area to be processed is cleared, modify the storage usage status and / or the storage data status of the logical storage area to be processed.

[0194] The communication bus mentioned above can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. This communication bus can be divided into address bus, data bus, control bus, etc. For ease of illustration, only one thick line is used to represent it in the diagram, but this does not mean that there is only one bus or one type of bus.

[0195] The communication interface is used for communication between the aforementioned terminal and other devices.

[0196] The memory may include random access memory (RAM) or non-volatile memory, such as at least one disk storage device. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.

[0197] The processors mentioned above can be general-purpose processors, including central processing units (CPUs), network processors (NPs), etc.; they can also be digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.

[0198] As shown in Figure 8, in another embodiment of the present invention, a computer-readable storage medium 801 is also provided, which stores instructions that, when executed on a computer, cause the computer to perform the hard disk space management method described in the above embodiments.

[0199] In another embodiment of the present invention, a computer program product containing instructions is also provided, which, when run on a computer, causes the computer to execute a hard disk space management method described in the above embodiments.

[0200] In the above embodiments, implementation can be achieved entirely or partially through software, hardware, firmware, or any combination thereof. When implemented using software, it can be implemented entirely or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of the present invention are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium that a computer can access or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)).

[0201] It should be noted that, in this document, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0202] The various embodiments in this specification are described in a related manner. Similar or identical parts between embodiments can be referred to mutually. Each embodiment focuses on describing the differences from other embodiments. In particular, the system embodiments are basically similar to the method embodiments, so the description is relatively simple; relevant parts can be referred to the descriptions of the method embodiments.

[0203] The above description is merely a preferred embodiment of the present invention and is not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention are included within the scope of protection of the present invention.

Claims

1. A hard disk space management method, characterized in that it is applied to a hard disk, wherein the hard disk includes at least one logical storage area and the hard disk is a zoned namespace (ZNS) solid-state drive, the method comprising: clearing the logical storage area of the hard disk, and storing the storage usage status and / or storage data status of the logical storage area in a preset database; the metadata of the logical storage area is data blocks; the storage data status refers to the length of the data block in the logical storage area and the clearing identifier of the data block; during the process of writing a preset data block to be written to the hard disk, determining whether a target logical storage area exists in the hard disk; the target logical storage area stores data related to the data block to be written; if it exists, writing the data block to be written to the target logical storage area, and updating the storage usage status and / or storage data status of the target logical storage area in the database; based on the database, clearing garbage data blocks from any of the logical storage areas in the hard disk; wherein the storage usage status includes an idle state, an active state, and a full state; and updating the storage usage status and / or storage data status of the target logical storage area in the database includes: during the process of writing the data block to be written to the target logical storage area, updating the storage usage status of the target logical storage area from the idle state to the active state; and after the data block to be written is written to the target logical storage area, updating the storage data status in the target logical storage area.

2. The method according to claim 1, characterized in that, The storage usage state includes an idle state; when the logical storage area is cleared, the logical storage area is in an idle state; the method includes: If the target logical storage area does not exist in the hard disk, the data block to be written is written to any logical storage area that is in the idle state.

3. The method according to claim 1, characterized in that, The stored data status includes at least one data block clearing identifier; the step of clearing garbage data blocks in any of the logical storage areas of the hard disk based on the database includes: The logical storage area containing the garbage data blocks in the hard disk is designated as the logical storage area to be processed. By modifying the clearing identifier of the garbage data block, the clearing identifier of the garbage data block indicates that the garbage data block is a data block to be cleared; Based on the clearing identifier, obtain the proportion information of the data blocks to be cleared in the logical storage area to be processed; Based on the proportion of data blocks to be cleared, determine whether to clear the data blocks to be cleared.

4. The method according to claim 3, characterized in that, The step of determining whether to clear the data blocks to be cleared based on the proportion information of the data blocks to be cleared includes: Based on the proportion information of the data blocks to be cleared, determine whether the proportion of the data blocks to be cleared in the logical storage area to be processed exceeds a preset proportion threshold. If the proportion of the data blocks to be cleared in the logical storage area to be processed exceeds a preset proportion threshold, then the data blocks to be cleared are cleared.

5. The method according to claim 4, characterized in that, the process of clearing the data block to be cleared includes: based on the clearing identifier of the data block, determining a target data block in the logical storage area to be processed that is not a data block to be cleared; transferring the target data block to another logical storage area on the hard disk, excluding the logical storage area to be processed; and clearing the logical storage area to be processed.

6. The method according to claim 5, characterized in that, the method includes: after the target data block is transferred to another logical storage area in the hard disk, modifying the storage data state of the logical storage area that received the target data block; and after the logical storage area to be processed is cleared, modifying the storage usage status and / or the storage data status of the logical storage area to be processed.

7. A hard disk space management device, characterized in that it is applied to a hard disk, the hard disk including at least one logical storage area, the hard disk being a zoned namespace (ZNS) solid-state drive, the device comprising: a clearing module, used to clear the logical storage area of the hard disk and store the storage usage status and / or storage data status of the logical storage area using a preset database; the metadata of the logical storage area is data blocks; the storage data status refers to the length of the data block in the logical storage area and the clearing identifier of the data block; a target logical storage area determination module, used to determine whether a target logical storage area exists in the hard disk during the process of writing a preset data block to be written to the hard disk; the target logical storage area stores data related to the data block to be written; a write module, used to write the data block to be written into the target logical storage area if it exists, and update the storage usage status and / or storage data status of the target logical storage area in the database; and a cleaning module, used to clean up garbage data blocks in any of the logical storage areas of the hard disk based on the database; wherein the storage usage status includes an idle state, an active state, and a full state; and the write module includes: a first state update submodule, used to update the storage usage state of the target logical storage area from the idle state to the active state while the data block to be written is being written to the target logical storage area; and a second state update submodule, used to update the storage data state in the target logical storage area after the data block to be written is written to the target logical storage area.

8. An electronic device, characterized in that, It includes a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus; The memory is used to store computer programs; When the processor executes a program stored in the memory, it implements the method as described in any one of claims 1-6.

9. One or more computer-readable media having instructions stored thereon that, when executed by one or more processors, cause the processors to perform the method as described in any one of claims 1-6.