Hard disk-based storage capacity determination method, storage system, server, and medium
By rate limiting the data sent from the host to the cache and flushing the cached data to the hard drive, the problems of reduced user experience and data security caused by inaccurate hard drive storage capacity are solved, achieving a more accurate volume mounting mode and ensuring data security.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- INSPUR SUZHOU INTELLIGENT TECH CO LTD
- Filing Date
- 2024-11-29
- Publication Date
- 2026-06-26
AI Technical Summary
In storage systems, because the data cached in memory includes both written and deleted data, there is a large difference between the actual storage capacity of the hard drive and the storage capacity visible to the user, which reduces the user experience and data security.
By obtaining the current used capacity and capacity threshold of the hard drive, when the current used capacity exceeds the threshold, the data sent from the host to the cache is rate-limited, the data rate and the number of requests are adjusted, the host write IO response time is extended, the cached data is flushed to the hard drive, and the hard drive's used capacity is updated to determine the volume's mount mode.
It extends the host write I/O response time, avoids data write failures, improves data security and user experience, and ensures system stability and responsiveness.
Figure CN119576237B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of computer technology, and in particular to a method for determining storage capacity based on a hard disk, a storage system, a server, and a medium.
Background Technology
[0002] In a storage system, the host first writes data to a memory cache, and then the data in the memory cache is written to the hard disk in a flush manner.
[0003] Once the amount of data stored on the hard drive reaches a certain threshold, the volume switches to read-only mode. However, the comparison between the stored data on the hard drive and the threshold corresponds to the stored data visible to the user. Since the data cached in memory includes both written and deleted data, the final stored data written to the hard drive may be less than the threshold. Because the volume has already switched to read-only mode, this results in a significant difference between the stored data visible to the user and the actual storage capacity of the hard drive. Furthermore, when the volume is in read-only mode, data written to the memory cache by the host may fail during the write process, reducing data security and degrading the user experience.
[0004] Therefore, improving the user experience and data writing security are issues that urgently need to be addressed by those skilled in the art.
Summary of the Invention
[0005] The purpose of this invention is to provide a method for determining storage capacity based on hard disks, a storage system, a server, and a medium to address the problems of reduced user experience and reduced data security.
[0006] To address the aforementioned technical problems, this invention provides a method for determining storage capacity based on a hard disk, comprising:
[0007] Get the current used capacity and the externally visible capacity threshold of the hard drive;
[0008] When the current usage capacity exceeds the capacity threshold, the data sent from the host to the cache is rate-limited to obtain rate-limited data, which is then sent to the cache.
[0009] The cached current storage data is flushed to the hard disk, and the current used capacity of the hard disk is updated so that the mount mode of the volume can be determined based on the updated current used capacity of the hard disk and the capacity threshold.
[0010] In one aspect, rate-limiting the data sent from the host to the cache to obtain rate-limited data includes:
[0011] Obtain the critical usage capacity of the hard disk and the rate at which the host sends current data to the cache; wherein the critical usage capacity is greater than the capacity threshold;
[0012] The first target rate limiting level is determined based on the current used capacity and the critical usage capacity.
[0013] The rate of the current data is adjusted according to the rate limit corresponding to the first target rate limiting level in order to complete the rate limiting process of the host sending data to the cache.
[0014] In another aspect, rate-limiting the data sent from the host to the cache to obtain rate-limited data includes:
[0015] Obtain the number of unit requests and the threshold number of requests sent from the host to the cache; wherein the number of unit requests is greater than the threshold number of requests;
[0016] The number of rate-limited requests is determined based on the unit request count and the threshold request count;
[0017] The data sent from the host to the cache is distributed according to the number of rate-limiting requests to complete the rate-limiting process for the data sent from the host to the cache.
[0018] In another aspect, rate-limiting the data sent from the host to the cache to obtain rate-limited data includes:
[0019] Obtain the initial rate and critical rate limit of the current data sent from the host to the cache;
[0020] The initial rate of the current data is adjusted according to the critical rate limiting rate to complete the rate limiting process of the host sending data to the cache.
[0021] In another aspect, flushing the cached current storage data to the hard disk includes:
[0022] Obtain the cache reserved space of the hard disk and the write speed of the hard disk; wherein, the storage data capacity of the cache reserved space is greater than the storage data capacity of the cache; the write speed of the hard disk corresponds to a rate greater than the rate limit corresponding to the data sent from the host to the cache;
[0023] Receive the currently stored data from the cache;
[0024] The flush time of the hard drive is determined based on the current stored data and the write speed;
[0025] The current stored data is flushed to the cache reserved space of the hard disk according to the flush time.
[0026] In another aspect, updating the current used capacity of the hard drive includes:
[0027] Obtain the data capacity corresponding to the written data and deleted data of the currently stored data in the cache;
[0028] The first data capacity is obtained by adding the current used capacity of the hard disk to the data capacity of the written data;
[0029] The updated current used capacity of the hard drive is obtained by subtracting the data capacity of the deleted data from the first data capacity.
[0030] In another aspect, determining the volume mount mode based on the updated current used capacity of the hard drive and the capacity threshold includes:
[0031] If the current used capacity of the updated hard disk is greater than or equal to the capacity threshold, then the mount mode of the volume is determined to be read-only mode.
[0032] If the current used capacity of the updated hard drive is less than the capacity threshold, then the mount mode of the volume is determined to be read-write mode.
[0033] Correspondingly, after determining the volume's mount mode, the following steps are also included:
[0034] When the volume is mounted in read-only mode, switch the read-write business mode to query mode;
[0035] When the volume is mounted in read / write mode, the updated current used capacity of the hard disk is subtracted from the capacity threshold to obtain the second data capacity.
[0036] The mapping relationship between data capacity range and rate limiting level is obtained in advance; wherein, the data capacity range and the rate limiting level have a one-to-one mapping relationship.
[0037] The target data capacity range corresponding to the second data capacity is determined in the mapping relationship;
[0038] Determine the corresponding second target rate limiting level based on the target data capacity range;
[0039] The rate-limited data is sent to the cache according to the rate-limiting rate corresponding to the second target rate-limiting level.
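The preventative rate limiting described in paragraphs [0035] to [0039] can be sketched as follows. This is a minimal illustration, not the patented implementation: the range boundaries, the level numbers, the rates, and the function name `second_target_level` are all hypothetical values chosen for the example.

```python
from bisect import bisect_right

# Hypothetical mapping (assumption): upper bounds, in GiB, of the ranges of
# remaining capacity, with exactly one rate-limiting level per range.
# Ranges: [0, 10), [10, 50), [50, 200), [200, inf)
RANGE_UPPER_BOUNDS_GIB = [10, 50, 200]
LEVELS = [3, 2, 1, 0]  # 3 = strongest limiting, 0 = no limiting
LEVEL_RATE_MIB_S = {0: None, 1: 400, 2: 100, 3: 20}  # None = unthrottled


def second_target_level(capacity_threshold_gib: float,
                        updated_used_gib: float) -> int:
    """Map the second data capacity (threshold - used) to a level."""
    second_data_capacity = capacity_threshold_gib - updated_used_gib
    if second_data_capacity <= 0:
        # Used capacity >= threshold means the volume is read-only instead.
        raise ValueError("volume is not in read-write mode")
    # Find the range that contains the remaining capacity.
    index = bisect_right(RANGE_UPPER_BOUNDS_GIB, second_data_capacity)
    return LEVELS[index]


level = second_target_level(1000, 960)   # 40 GiB of headroom remain
rate = LEVEL_RATE_MIB_S[level]           # rate at which data goes to the cache
```

Keeping the ranges sorted and non-overlapping is what makes the mapping one-to-one, which matches the requirement stated in paragraph [0036].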
[0040] To address the aforementioned technical problems, the present invention also provides a storage system, including a host and a hard disk; wherein the cache is located on the host;
[0041] The host is used to execute the steps of the hard disk-based storage capacity determination method described above.
[0042] To address the aforementioned technical problems, the present invention also provides a server, comprising:
[0043] Memory, used to store computer programs;
[0044] A processor, used to implement the steps of the disk-based storage capacity determination method as described above when executing the computer program.
[0045] To address the aforementioned technical problems, the present invention also provides a computer-readable storage medium storing a computer program, which, when executed by a processor, implements the steps of the disk-based storage capacity determination method as described above.
[0046] The beneficial effects of this invention are as follows. First, when the current used capacity of the hard drive exceeds the capacity threshold, the invention does not immediately switch the volume's mount mode. Instead, it rate-limits the data sent from the host to the cache, which reduces the amount of data the host sends and extends the host's write I/O response time. Before the volume's mount mode is changed to read-only, rate limiting therefore delays and reduces the data written to the cache by the host, giving the cache time to tally its current stored data and flush it to the hard drive. This prevents write failures for data still in the cache when the volume switches to read-only mode, improving data security. Second, when the current stored data is flushed from the cache to the hard drive, the invention takes into account that the cached data includes both written and deleted data, and counts the data actually stored on the hard drive after the flush as the updated current used capacity. The final volume mount mode is determined from this updated current used capacity and the capacity threshold, which avoids a significant discrepancy between the stored data visible to the user and the actual storage capacity of the hard drive, and the degraded user experience that would follow. In particular, the invention accounts for the case where the deleted data in the memory cache exceeds the written data, so that the updated current used capacity is smaller than the used capacity before the update; this avoids prematurely setting the volume's mount mode to read-only and thereby improves the user experience.
[0047] Furthermore, a first target rate limiting level is determined from the current used capacity and the critical usage capacity, and the rate of the current data is adjusted to the rate limit corresponding to that level, completing the rate limiting process. Limiting the data traffic rate in this way extends the host's write I/O response time, delays and reduces the data written to the cache, and keeps the rate limiting operation simple. Alternatively, the data sent from the host to the cache is limited by reducing the number of requests: the rate limiting mechanism adjusts dynamically based on the request count, protecting the backend from excessive requests and ensuring system stability and responsiveness. Adjusting the initial rate to the critical rate limit in one step applies the highest rate limiting level and maximizes the extension of the host's write I/O response time. The current used capacity of the hard drive is updated based on the current stored data in the cache: once that data has been flushed, the data actually stored on the hard drive is taken as the updated current used capacity, so that the final volume mount mode reflects both the original used capacity of the hard drive and the data that was in the cache. The flush time is determined from the current stored data and the write speed of the hard drive, and the amount of data flushed is controlled according to that time so that it does not exceed the hard drive's cache reserved space, protecting data from being overwritten and improving data security. Finally, once the volume's mount mode has been determined, the host's read-write operations are switched to query operations in read-only mode, and preventative rate limiting is applied in read-write mode, which improves the user experience.
[0048] In addition, the present invention also provides a storage system, server, and medium that have the same beneficial effects as the above-described method for determining storage capacity based on hard disks.
Attached Figure Description
[0049] To more clearly illustrate the embodiments of the present invention, the accompanying drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0050] Figure 1 is a flowchart of a method for determining storage capacity based on a hard disk, provided in an embodiment of the present invention;
[0051] Figure 2 is a schematic diagram of a method for quickly flushing cached data to disk, provided in an embodiment of the present invention;
[0052] Figure 3 is a structural diagram of a hard disk-based storage capacity determination device provided in an embodiment of the present invention;
[0053] Figure 4 is a structural diagram of a server provided in an embodiment of the present invention.
Detailed Implementation
[0054] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the protection scope of the present invention.
[0055] The core of this invention is to provide a method for determining storage capacity based on hard disks, a storage system, a server, and a medium to solve the problems of reduced user experience and reduced data security.
[0056] To enable those skilled in the art to better understand the present invention, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.
[0057] In the storage industry, especially in all-flash storage systems, append-only writes are generally used to achieve wear leveling for solid-state drives (SSDs). As a result, the computed capacity fluctuates within a range, and that range corresponds to the amount of data held in the cache module. The data in the cache may overwrite previous data or may be new data, and the proportions cannot be predicted in advance. An L1P1 insertion typically refers to the data cache within the L1 cache (Level 1 Cache). The L1 cache is part of the Central Processing Unit (CPU) and is used to store recently accessed data for fast access. The L1 cache is usually divided into an instruction cache and a data cache, with L1P1 referring to the first level of the L1 data cache. A write may consist of two actions, the insertions I(L1P1) and I(P1L1), or of three actions: the deletion D(P0L0) and the insertions I(L1P1) and I(P1L1). With two actions, the pool capacity statistics only increase. With three actions, the order of execution during write and flush is not fixed: I(L1P1) and I(P1L1) may be executed first and D(P0L0) afterwards, or D(P0L0) may be executed first, followed by I(L1P1) and I(P1L1). If D(P0L0) is executed first, the capacity statistics are relatively accurate. If I(L1P1) and I(P1L1) are executed first and D(P0L0) last, the capacity statistics may increase before the deletion is applied. When the amount of data deleted exceeds the amount of data written, the capacity statistics at a given moment are therefore inaccurate. All-flash pools reserve a certain amount of capacity to accommodate cached data. When the capacity reaches the read-only condition, the volume becomes read-only. However, the volume may not actually have met the read-only condition and yet be presented to the user as a read-only volume, leading to a poor user experience and unnecessary trouble. Even though the read-only condition has not truly been met, once the volume on the storage pool is read-only, write operations from the host fail. The hard disk-based storage capacity determination method provided in this invention can solve the above technical problems.
[0058] Figure 1 is a flowchart of a method for determining storage capacity based on a hard disk, as provided in an embodiment of the present invention. As shown in Figure 1, the method includes:
[0059] S11: Obtain the current used capacity and the externally visible capacity threshold of the hard drive;
[0060] S12: When the current usage capacity exceeds the capacity threshold, the data sent from the host to the cache is rate-limited and the rate-limited data is then sent to the cache.
[0061] S13: Flush the cached current storage data to the hard disk and update the current used capacity of the hard disk so as to determine the volume mount mode based on the updated current used capacity of the hard disk and the capacity threshold.
[0062] Specifically, the current used capacity of the hard drive corresponds to the storage area of the hard drive that is visible to the user, and the externally visible capacity threshold is a preset threshold. In a conventional solution, as soon as the current used capacity exceeds the capacity threshold, the mount mode of the corresponding volume is changed to read-only mode, so that the host's services perform query operations only. This embodiment takes into account that data still held in the cache and not yet written to the hard drive is not included in the capacity statistics. Two situations are possible: the data written in the cache is greater than or equal to the data deleted, or the data deleted is greater than the data written. In the latter case, once the cache is flushed to the hard drive, the resulting current used capacity is less than the used capacity obtained in step S11. The conventional solution would then have switched the volume to read-only too early, before the actual used capacity reached the capacity threshold, degrading the user experience.
[0063] In step S12, when the current used capacity exceeds the capacity threshold, the volume's mount mode is not modified immediately. Instead, the data sent from the host to the cache is rate-limited, and the rate-limited data is then sent to the cache. In this embodiment, rate limiting extends the host's write I/O response time, delays and reduces the data written to the cache, and gives the cache sufficient time to tally its current stored data and flush it to the hard drive. The rate limiting may restrict either the data traffic or the number of requests; no specific limitation is made here. When data traffic is limited, the limit can be set to the highest level (minimum data traffic) in one step, or the rate limiting level can be chosen based on the difference between the current used capacity and the critical usage capacity. The rate limiting levels are graded, with a different data traffic limit at each level, so that the hard drive's real-time used capacity can be tallied while the host keeps sending data at an appropriate speed.
[0064] In some embodiments, rate-limiting is performed on the data sent from the host to the cache to obtain rate-limited data, including:
[0065] Obtain the critical usage capacity of the hard drive and the rate at which the host sends current data to the cache; where the critical usage capacity is greater than the capacity threshold.
[0066] The first target rate limiting level is determined based on the current used capacity and the critical usage capacity.
[0067] The rate of the current data is adjusted according to the rate limiting rate corresponding to the first target rate limiting level in order to complete the rate limiting process of the host sending data to the cache.
[0068] Specifically, the hard drive's critical usage capacity is obtained. The critical usage capacity is the maximum amount of data that the hard drive is allowed to hold, and it is greater than the capacity threshold. The first target rate limiting level is determined from the difference between the critical usage capacity and the current used capacity. A mapping is established in advance between this capacity difference, the rate limiting level, and the rate limit corresponding to each level. The smaller the capacity difference, the higher the rate limiting level and the stronger the limiting, that is, the lower the permitted data rate. Once the first target rate limiting level has been determined from the capacity difference, its rate limit is looked up in the pre-established mapping, and the rate of the current data is adjusted to that rate limit to complete the rate limiting process.
[0069] This embodiment determines a first target rate limiting level based on the current used capacity and the critical usage capacity, and then adjusts the rate of the current data to the rate limit corresponding to that level to complete the rate limiting process. Limiting the data traffic rate in this way extends the host's write I/O response time, delays and reduces the data written by the host to the cache, and keeps the rate limiting operation simple.
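A minimal sketch of this level selection follows, assuming a small pre-established table. The table contents, the units, and the names `LEVEL_TABLE`, `first_target_level`, and `limited_rate` are hypothetical; the sketch also assumes that a higher level means a lower permitted rate, in line with the statement that the highest level corresponds to the minimum data traffic.

```python
# Hypothetical pre-established mapping (assumption), ordered from most to
# least headroom: (minimum capacity difference in GiB, level, rate in MiB/s).
LEVEL_TABLE = [
    (80.0, 1, 200.0),
    (40.0, 2, 50.0),
    (0.0, 3, 10.0),  # highest level, lowest permitted rate
]


def first_target_level(current_used_gib: float, critical_used_gib: float):
    """Return (level, rate limit) for the given capacity difference."""
    difference = critical_used_gib - current_used_gib
    if difference < 0:
        raise ValueError("used capacity exceeds the critical usage capacity")
    for min_difference, level, rate in LEVEL_TABLE:
        if difference >= min_difference:
            return level, rate
    raise AssertionError("LEVEL_TABLE must end with a 0.0 entry")


def limited_rate(current_rate_mib_s: float, current_used_gib: float,
                 critical_used_gib: float) -> float:
    """Adjust the host's current data rate to the level's rate limit."""
    _, rate_limit = first_target_level(current_used_gib, critical_used_gib)
    # A host that is already slower than the limit is left unchanged.
    return min(current_rate_mib_s, rate_limit)
```

With this table, a host sending at 500 MiB/s while 50 GiB remain before the critical usage capacity would be slowed to 50 MiB/s.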
[0070] In other embodiments, rate-limiting is applied to the data sent from the host to the cache to obtain rate-limited data, including:
[0071] Get the number of unit requests and the threshold number of requests sent from the host to the cache; where the number of unit requests is greater than the threshold number of requests.
[0072] The number of rate-limited requests is determined based on the number of requests per unit and the threshold number of requests.
[0073] The data sent from the host to the cache is distributed according to the number of rate-limited requests to complete the rate-limiting process for data sent from the host to the cache.
[0074] Specifically, when processing business logic, the server frequently accesses certain data, so rate limiting is achieved here by reducing the number of requests the host sends. The request count limit restricts how many requests per second are admitted at the request entry point; requests beyond the limit are rejected. The unit request count is the number of requests transmitted per second, i.e., the number currently entering the system at the request entry point. The threshold request count is used to cap it: the unit request count is adjusted down to the threshold request count, which becomes the rate-limited request count. The host then sends data to the cache according to the rate-limited request count, completing the rate limiting process.
[0075] This embodiment provides a rate limiting mechanism that limits the data sent from the host to the cache by reducing the number of requests. The mechanism adjusts dynamically based on the request count, protecting the backend from excessive requests and ensuring system stability and responsiveness.
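The request-count variant can be illustrated with a fixed one-second window that admits at most the threshold number of requests and rejects the rest. This is only a sketch under assumptions: the class name `RequestLimiter`, the fixed-window policy, and the explicit `now_s` timestamp parameter are illustrative choices, not details taken from the source.

```python
def rate_limited_request_count(unit_requests: int,
                               threshold_requests: int) -> int:
    """Cap the per-second request count at the threshold request count."""
    return min(unit_requests, threshold_requests)


class RequestLimiter:
    """Fixed one-second window admitting at most `threshold` requests."""

    def __init__(self, threshold: int):
        if threshold <= 0:
            raise ValueError("threshold must be positive")
        self.threshold = threshold
        self.window = None   # index of the current one-second window
        self.admitted = 0    # requests admitted in the current window

    def allow(self, now_s: float) -> bool:
        """Return True if a request arriving at time now_s is admitted."""
        window = int(now_s)
        if window != self.window:
            # A new second has started: reset the counter.
            self.window = window
            self.admitted = 0
        if self.admitted < self.threshold:
            self.admitted += 1
            return True
        return False  # over the limit for this second: reject


limiter = RequestLimiter(threshold=3)
decisions = [limiter.allow(0.1 * i) for i in range(5)]  # five requests in 0.4 s
```

Passing the timestamp in explicitly keeps the example deterministic; a real entry point would more likely read a monotonic clock.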
[0076] In other embodiments, rate-limiting is applied to the data sent from the host to the cache to obtain rate-limited data, including:
[0077] Get the initial rate and critical rate limit of the current data sent from the host to the cache;
[0078] The initial rate of current data is adjusted based on the critical rate limiting rate to complete the rate limiting process of data sent from the host to the cache.
[0079] Specifically, the rate limiting process here refers to adjusting the initial rate of the current data sent from the host to the cache to the critical rate limit in one step. In other words, the current data is sent according to the critical rate limit. The critical rate limit is the minimum rate, which corresponds to the highest level of rate limiting.
[0080] This embodiment adjusts the initial rate to the critical rate limit in one step, applying the highest level of rate limiting and thereby maximizing the extension of the host's write input/output (I/O) response time.
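As a rough illustration of why the one-step limit lengthens the write response time, the sketch below treats the response time of a single write as its size divided by the permitted rate. That model, the figures, and the function names are simplifying assumptions made for the example.

```python
def one_step_limit(initial_rate_mib_s: float,
                   critical_rate_mib_s: float) -> float:
    """Drop straight to the critical rate limit (the minimum rate)."""
    if critical_rate_mib_s <= 0:
        raise ValueError("critical rate limit must be positive")
    return min(initial_rate_mib_s, critical_rate_mib_s)


def write_response_time_s(io_size_mib: float, rate_mib_s: float) -> float:
    """Simplified model: time to accept one write at the given rate."""
    return io_size_mib / rate_mib_s


before = write_response_time_s(64.0, 800.0)                        # unthrottled
after = write_response_time_s(64.0, one_step_limit(800.0, 16.0))   # throttled
```

In this example the same 64 MiB write takes 0.08 s before limiting and 4 s after, which is the extra time the cache gains for flushing.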
[0081] In step S13, the current stored data in the cache is flushed to the hard disk. The flush time is calculated from the amount of data on the backend disk and in the cache. How the flush time is determined is not limited: it can be derived from the data volume and the write speed, or from the configured commit interval, the minimum number of pages per commit, and the maximum interval between commits. The flushing speed can also be adjusted to reduce the flush time.
[0082] The current used capacity of the hard drive is then updated based on the current stored data flushed to the hard drive.
[0083] In some embodiments, updating the current used capacity of the hard drive includes:
[0084] Get the data capacity corresponding to the written data and deleted data of the currently stored cached data;
[0085] The first data capacity is obtained by adding the current used capacity of the hard drive to the data capacity of the written data.
[0086] Subtracting the data capacity of the deleted data from the first data capacity yields the updated current used capacity of the hard drive.
[0087] Specifically, considering that the current stored data includes both written and deleted data, it is necessary to obtain the data capacity corresponding to the written and deleted data of the current stored data, and determine the updated current used capacity of the hard drive based on the current used capacity of the hard drive, the data capacity of the written data, and the data capacity of the deleted data.
[0088] The current used capacity of the hard drive is added to the data capacity of the written data to obtain the first data capacity; the data capacity of the deleted data is then subtracted from the first data capacity to obtain the updated current used capacity of the hard drive.
[0089] This embodiment updates the current used capacity of the hard drive based on the current stored data in the cache. After the current stored data in the cache has been flushed, the data actually stored on the hard drive is counted as the updated current used capacity. The final volume mount mode is then determined from the updated current used capacity and the capacity threshold, so that it reflects both the original used capacity of the hard drive and the data that was in the cache.
[0090] The current used capacity of the hard drive is updated in real time, and the volume's mount mode is determined from the updated current used capacity and the capacity threshold. After the flush, the volume is checked against the read-only condition. If the condition is met, the volume is set to read-only mode; otherwise, it remains in read-write mode.
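The capacity update and the mount mode decision amount to a short calculation, sketched below. The subtraction is read as first data capacity minus deleted data capacity, and the comparison follows the rule that a used capacity greater than or equal to the threshold yields read-only mode. Function names and the integer units are illustrative assumptions.

```python
def updated_used_capacity(current_used: int, written: int,
                          deleted: int) -> int:
    """Used capacity after the cache has been flushed to the hard drive."""
    first_data_capacity = current_used + written   # add the written data
    return first_data_capacity - deleted           # remove the deleted data


def mount_mode(updated_used: int, capacity_threshold: int) -> str:
    """Read-only at or above the threshold, read-write below it."""
    if updated_used >= capacity_threshold:
        return "read-only"
    return "read-write"


# Example: 905 GiB used against a 900 GiB threshold, with 20 GiB of writes
# and 40 GiB of deletions still sitting in the cache.
used = updated_used_capacity(905, 20, 40)
mode = mount_mode(used, 900)
```

Here the volume looked over the threshold before the flush, but the updated used capacity is 885 GiB, so it stays in read-write mode.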
[0091] In some embodiments of an all-flash system, volume capacity usage is counted at the pool level. As volume usage increases, more and more data is written. When the pool capacity reaches the read-only condition, the order in which the write cache module executes the insertions I(L1P1) and I(P1L1) and the deletion D(P0L0) is uncertain, so whether the storage pool capacity will increase or decrease is also uncertain. Therefore, when the capacity reaches the read-only condition, the pool is not immediately set to read-only. Instead, the rate limiting applied to the storage device's responses to the host is set to the highest level, slowing down the storage's responses to host I/O. This creates back pressure on the host, so that the IOPS (Input/Output Operations Per Second) issued by the host are kept within a very low range. At the same time, a flag is set, and both the driver cache and the write cache start flushing to disk immediately. The capacity is obtained in real time while flushing is in progress. After the flush, the pool is checked against the read-only condition. If the condition is met, the pool status is set to read-only. If not, the rate limiting level is set again according to the capacity, and the flag is cleared. These steps are repeated each time the read-only condition is reached, until the pool truly meets the read-only condition and its status is set to read-only.
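One pass of this sequence can be sketched as a small state update. Everything named here is hypothetical: the `Pool` fields, the `level_for_capacity` policy and its boundaries, and the modelling of a flush as a net change of written minus deleted capacity are assumptions made to keep the example short.

```python
from dataclasses import dataclass

MAX_LEVEL = 3  # highest rate-limiting level (assumption)


@dataclass
class Pool:
    used: int                  # capacity currently counted for the pool
    read_only_threshold: int   # capacity at which the read-only condition holds
    read_only: bool = False
    flushing: bool = False     # the flag set while the caches are flushed
    throttle_level: int = 0


def level_for_capacity(pool: Pool) -> int:
    """Hypothetical policy: less headroom means a higher level."""
    headroom = pool.read_only_threshold - pool.used
    if headroom > 100:
        return 0
    if headroom > 50:
        return 1
    if headroom > 10:
        return 2
    return MAX_LEVEL


def on_threshold_reached(pool: Pool, cached_written: int,
                         cached_deleted: int) -> Pool:
    """Handle one occurrence of the read-only condition being reached."""
    pool.throttle_level = MAX_LEVEL   # back-pressure the host first
    pool.flushing = True              # set the flag and start flushing
    pool.used += cached_written - cached_deleted  # effect of the flush
    if pool.used >= pool.read_only_threshold:
        pool.read_only = True         # condition truly met
    else:
        pool.throttle_level = level_for_capacity(pool)  # relax by capacity
        pool.flushing = False         # clear the flag
    return pool
```

Calling `on_threshold_reached` again whenever the counted capacity crosses the threshold reproduces the repeat-until-truly-read-only behaviour described above.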
[0092] This invention provides a method for determining storage capacity based on a hard disk. The method obtains the current used capacity of the hard disk and a publicly visible capacity threshold. When the current used capacity exceeds the capacity threshold, it performs rate limiting on data sent from the host to the cache to obtain rate-limited data, which is then sent to the cache. The current stored data in the cache is then flushed to the hard disk, and the current used capacity of the hard disk is updated. This allows for determining the volume's mount mode based on the updated current used capacity and the capacity threshold. On one hand, when the current used capacity of the hard disk exceeds the capacity threshold, this invention does not immediately switch the volume's mount mode. Instead, it performs rate limiting on the data sent from the host to the cache. This reduces the amount of data processed by rate limiting, extending the host's write I / O response time. Simultaneously, before changing the volume's mount mode to read-only mode, rate limiting delays and reduces the amount of data written to the cache by the host, providing time for the cache to statistically analyze the current stored data. This facilitates flushing the data to the hard disk and prevents write failures of the corresponding data in the host's cache when the volume switches to read-only mode, thus improving data security. On the other hand, the current cached data is flushed to the hard drive. Considering that the cached data includes both written and deleted data, the final data actually written to the hard drive is calculated as the current used capacity of the updated hard drive. The volume mount mode is then determined based on the current used capacity of the updated hard drive and a capacity threshold. This avoids a significant discrepancy between the user-visible storage data and the actual hard drive capacity, which could negatively impact the user experience. 
This invention fully accounts for the case in which the deleted data in the memory cache exceeds the written data, so that the updated current used capacity of the hard drive is less than the original current used capacity. It thereby avoids switching the volume to read-only mount mode prematurely, which improves the user experience.
[0093] In some embodiments, flushing the cached current stored data to disk includes:
[0094] Obtain the hard drive's cache reserved space and the hard drive write speed; where the data capacity of the cache reserved space is greater than the data capacity of the cache, and the hard drive write speed is greater than the rate limit applied to the data sent from the host to the cache;
[0095] Receive the current stored data from the cache;
[0096] Determine the hard drive's flush time based on the current stored data and the write speed;
[0097] Flush the current stored data to the hard drive's cache reserved space according to the flush time.
[0098] Specifically, the system obtains the hard drive's cache reserved space. The cache reserved space (over-provisioning, OP) is the portion of the drive's capacity that is not accessible to the user; its main function is to improve the performance and durability of the SSD. With this reserved space, the SSD controller can better manage wear on the flash memory (NAND) over its lifespan, thereby extending the life of the drive.
[0099] The hard drive write speed is also obtained. The data capacity of the cache reserved space is greater than the data capacity of the cache. To complete the flush as quickly as possible, and to prevent the host from sending data to the cache so fast that the cache accumulates a large amount of new data, the hard drive write speed here is greater than the rate limit applied to the data sent from the host to the cache.
[0100] The hard drive flush time is determined from the current stored data in the cache and the write speed, i.e., flush time = current stored data / write speed. The current stored data is then flushed to the hard drive's cache reserved space according to the flush time.
[0101] In this embodiment, the flush time is determined from the current write speed. The amount of data written to the write cache is controlled according to the flush time, ensuring that the amount of cached data to be flushed does not exceed the hard drive's cache reserved space, which protects the data from being overwritten and improves data security.
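The flush-time relation of this embodiment can be sketched in Python as follows. The function names and the byte and bytes-per-second units are assumptions made for illustration. The helper `max_new_data_during_flush` is one possible reading of "controlling the amount of data written to the write cache according to the flush time", not a formula given in the specification.

```python
def preconditions_hold(cache_capacity: int, op_space: int,
                       disk_write_bps: float, host_limit_bps: float) -> bool:
    """Check the two stated conditions: the reserved (OP) space is larger
    than the cache, and the disk writes faster than the host rate limit."""
    return op_space > cache_capacity and disk_write_bps > host_limit_bps


def flush_time_seconds(cached_bytes: int, disk_write_bps: float) -> float:
    """flush time = current stored data / write speed."""
    if disk_write_bps <= 0:
        raise ValueError("write speed must be positive")
    return cached_bytes / disk_write_bps


def max_new_data_during_flush(cached_bytes: int, disk_write_bps: float,
                              host_limit_bps: float) -> float:
    """Assumed interpretation: the most data a rate-limited host can add
    to the cache while the current flush is still running."""
    return host_limit_bps * flush_time_seconds(cached_bytes, disk_write_bps)
```

Because the disk write speed exceeds the host rate limit, the data admitted during a flush in this sketch is always smaller than the data being flushed, so the cache drains over time instead of growing.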
[0102] In other embodiments, flushing the cached current stored data to disk includes:
[0103] Obtain the hard drive's cache reserved space and initial write speed;
[0104] Determine the amount of written data and the amount of deleted data in the cache's current stored data;
[0105] If the amount of deleted data is greater than the amount of written data, adjust the initial write speed according to the speed adjustment step size to obtain the adjusted write speed;
[0106] Determine the hard drive's flush time based on the adjusted write speed and the current stored data in the cache;
[0107] Flush the current stored data to the hard drive's cache reserved space according to the flush time.
[0108] Specifically, to shorten the hard drive flush time, the hard drive write speed needs to be increased. Whether to increase it is decided by comparing the amount of written data with the amount of deleted data in the cache's current stored data. If the amount of deleted data exceeds the amount of written data, the initial write speed is adjusted by the speed adjustment step size to obtain the adjusted write speed. Because deleting more data than is written reduces the used capacity of the hard drive, the flush should also be sped up so that it completes as early as possible.
[0109] The hard drive flush time is determined based on the adjusted write speed and the current stored data in the cache. The current stored data is then flushed to the hard drive's cache reserved space according to the flush time. The determination of the flush time and the flushing to the cache reserved space are the same as in the above embodiment and will not be repeated here.
[0110] This embodiment adjusts the initial write speed according to the speed adjustment step size when the amount of deleted data is greater than the amount of written data, thereby shortening the hard drive's flush time and ensuring data security.
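A minimal sketch of the speed adjustment in this embodiment is shown below. The specification does not say how the step size is applied, so adding a single step to the initial speed is an assumption, as are the function and parameter names.

```python
def adjusted_write_speed(initial_bps: float, written_bytes: int,
                         deleted_bytes: int, step_bps: float) -> float:
    """Raise the write speed by one step when deletes outweigh writes.

    Assumption: the adjustment is a single additive step. The text only
    says the speed is adjusted "according to the speed adjustment step size".
    """
    if deleted_bytes > written_bytes:
        return initial_bps + step_bps
    return initial_bps


def flush_time_with_adjustment(cached_bytes: int, initial_bps: float,
                               written_bytes: int, deleted_bytes: int,
                               step_bps: float) -> float:
    """Flush time computed from the (possibly) adjusted write speed."""
    speed = adjusted_write_speed(initial_bps, written_bytes,
                                 deleted_bytes, step_bps)
    return cached_bytes / speed
```

For example, with 10000 bytes cached, an initial speed of 2000 bytes/s, and a step of 500 bytes/s, a delete-heavy cache is flushed in 4.0 s instead of 5.0 s.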
[0111] In some embodiments, determining the volume mount mode based on the updated current used capacity of the hard drive and a capacity threshold includes:
[0112] If the current used capacity of the updated hard drive is greater than or equal to the capacity threshold, then the volume's mount mode is determined to be read-only.
[0113] If the current used capacity of the updated hard drive is less than the capacity threshold, then the volume's mount mode is determined to be read-write mode.
[0114] Correspondingly, after determining the volume's mount mode, the following steps are also included:
[0115] When the volume is mounted in read-only mode, switch the read-write service mode to query mode;
[0116] When the volume is mounted in read / write mode, the second data capacity is obtained by subtracting the current used capacity of the updated hard drive from the capacity threshold.
[0117] The mapping relationship between data capacity range and rate limiting level is obtained in advance; where the data capacity range and rate limiting level have a one-to-one mapping relationship.
[0118] Determine the target data capacity range corresponding to the second data capacity in the mapping relationship;
[0119] Determine the corresponding second target rate limiting level based on the target data capacity range;
[0120] The rate-limited data is sent to the cache according to the rate-limiting rate corresponding to the second target rate-limiting level.
[0121] Specifically, regarding the determination of the volume mount mode, if the current used capacity of the updated hard drive is greater than or equal to the capacity threshold, the volume mount mode is determined to be read-only; if the current used capacity of the updated hard drive is less than the capacity threshold, the volume mount mode is determined to be read-write.
[0122] After the volume's mount mode is determined, if it is read-only mode, the host's read / write service mode is switched to query mode, and only query services are provided. If it is read / write mode, the updated current used capacity of the hard drive does not exceed the publicly visible capacity threshold. However, if the host kept sending data to the cache at the original data rate, the actual used capacity could exceed the publicly visible capacity threshold once the subsequently cached data is flushed to the hard drive. Rate limiting is therefore still required even though the capacity threshold has not been exceeded.
[0123] The updated current used capacity of the hard drive is subtracted from the capacity threshold to obtain the second data capacity. The second data capacity falls within one of the data capacity ranges of the mapping relationship, in which data capacity ranges and rate limiting levels correspond one to one. This mapping relationship can be the same as or different from the one used to determine the first target rate limiting level in the embodiment described above; it is not limited here. Furthermore, the rate limiting rates corresponding to these rate limiting levels can be smaller than the rate limiting rate corresponding to the first target rate limiting level described above. Because the publicly visible capacity threshold has not yet been exceeded, this rate limiting is preventative.
[0124] Here, the target data capacity range corresponding to the second data capacity is determined based on the mapping relationship, and the corresponding second target rate limiting level is determined based on the target data capacity range. The rate-limited data (which corresponds to the rate-limited data in step S12) is then sent to the cache at the corresponding rate limiting rate.
[0125] This embodiment provides a process for determining the mount mode of a volume: in read-only mode, the host's read / write service is changed to a query service; in read / write mode, preventative rate limiting is performed, which improves the user experience.
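The mount-mode decision and the lookup of the second target rate limiting level can be sketched as follows. The range boundaries, the level numbers, and the function names are hypothetical values chosen for illustration; the specification only requires a one-to-one mapping between data capacity ranges and rate limiting levels.

```python
from bisect import bisect_right

# Hypothetical mapping from remaining headroom (GiB) to rate limiting level:
# [0, 10) -> level 3, [10, 50) -> level 2, [50, 200) -> level 1, >= 200 -> level 0
RANGE_UPPER_BOUNDS = [10, 50, 200]
LEVELS = [3, 2, 1, 0]


def mount_mode(updated_used: int, threshold: int) -> str:
    """Read-only when the updated used capacity reaches the threshold."""
    return "read-only" if updated_used >= threshold else "read-write"


def second_target_level(updated_used: int, threshold: int) -> int:
    """Map the second data capacity (threshold - used) to a limiting level."""
    second_capacity = threshold - updated_used
    if second_capacity <= 0:
        raise ValueError("volume is read-only; no preventative limit applies")
    # bisect_right finds the range whose half-open interval holds the value.
    return LEVELS[bisect_right(RANGE_UPPER_BOUNDS, second_capacity)]
```

Keeping the ranges in a sorted list makes the lookup a binary search, and a smaller headroom maps to a stricter level, which matches the preventative intent of the embodiment.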
[0126] When the pool's capacity reaches the read-only condition (the used capacity of the volumes in the pool exceeds the pool's externally visible capacity), rate limiting is applied. The storage system has reserved over-provisioning (OP) space, and this OP space is larger than the cache capacity. The flush time is calculated based on the amount of data on the backend disk and in the cache. This time is used to control the amount of data written to the cache, ensuring that the total amount of cached data does not exceed the reserved OP space, so that the OP space can accommodate the data in the cache. In addition, once the capacity that the pool offers externally is reached, the pool must remain read-only, which protects data during overwrites and flushing and prevents data loss in the event of a power outage.
[0127] Once the pool capacity reaches the read-only condition, a flag is set, and the cache and write cache modules are immediately flushed to disk. During this time, the host's write I/O response time is extended, but external services can still be provided without interrupting host services. After flushing to disk, the capacity is retrieved again to check whether the read-only condition has been reached. If it has, the pool's status is set to read-only; otherwise, the rate limiting condition is set based on the capacity, keeping the rate limit within a reasonable range. If the read-only condition is indeed reached, the status of all volumes in the pool is set to read-only, and these volumes can only provide query services externally.
[0128] This makes capacity statistics more accurate and avoids situations where the pool is already read-only when the actual read-only condition has not been met.
[0129] Figure 2 is a schematic diagram of a fast flushing process for cached data provided in an embodiment of the present invention. As shown in Figure 2, when the metadata write cache module is full, a flag is set, the driver cache and write cache are flushed, and the capacity is recalculated. Based on the current capacity, it is decided whether to set the volumes in the pool to the read-only state.
[0130] Furthermore, the present invention also provides a storage system including a host and a hard disk; wherein the cache is located on the host;
[0131] The host is used to perform the steps of the above-described method for determining storage capacity based on hard disks.
[0132] For an introduction to the storage system provided by the present invention, please refer to the above method embodiments. The present invention will not be described in detail here, but it has the same beneficial effects as the above-described method for determining storage capacity based on hard disks.
[0133] The foregoing has described in detail various embodiments of the hard disk-based storage capacity determination method. On this basis, the present invention also discloses a hard disk-based storage capacity determination apparatus corresponding to the above method. Figure 3 is a structural diagram of a hard disk-based storage capacity determination apparatus provided in an embodiment of the present invention. As shown in Figure 3, the hard disk-based storage capacity determination apparatus includes:
[0134] The acquisition module 11 is used to acquire the current used capacity and the externally visible capacity threshold of the hard drive;
[0135] The rate limiting module 12 is used to rate limit the data sent from the host to the cache when the current usage capacity is greater than the capacity threshold, and then send the rate-limited data to the cache.
[0136] The flushing processing module 13 is used to flush the cached current storage data to the hard disk and update the current used capacity of the hard disk so as to determine the mount mode of the volume based on the updated current used capacity of the hard disk and the capacity threshold.
[0137] Since the embodiments of the apparatus correspond to the method embodiments described above, please refer to the description of the method embodiments for details of the apparatus embodiments, which will not be repeated here.
[0138] For a description of the storage capacity determination device based on a hard disk provided by the present invention, please refer to the above method embodiments. The present invention will not be described again here, but it has the same beneficial effects as the above-described storage capacity determination method based on a hard disk.
[0139] Figure 4 is a structural diagram of a server provided in an embodiment of the present invention. As shown in Figure 4, the server includes:
[0140] Memory 21 is used to store computer programs;
[0141] Processor 22 is used to implement the steps of a hard disk-based storage capacity determination method when executing a computer program.
[0142] The processor 22 may include one or more processing cores, such as a quad-core processor or an octa-core processor. The processor 22 may be implemented using at least one of the following hardware forms: Digital Signal Processor (DSP), Field-Programmable Gate Array (FPGA), or Programmable Logic Array (PLA). The processor 22 may also include a main processor and a coprocessor. The main processor, also known as the Central Processing Unit (CPU), is used to process data in the wake-up state; the coprocessor is a low-power processor used to process data in the standby state. In some embodiments, the processor 22 may integrate a Graphics Processing Unit (GPU), which is responsible for rendering and drawing the content to be displayed on the screen. In some embodiments, the processor 22 may also include an Artificial Intelligence (AI) processor, which handles computational operations related to machine learning.
[0143] The memory 21 may include one or more computer-readable storage media, which may be non-transitory. The memory 21 may also include high-speed random access memory and non-volatile memory, such as one or more disk storage devices or flash memory devices. In this embodiment, the memory 21 is used to store at least the following computer program 211, which, after being loaded and executed by the processor 22, is capable of implementing the relevant steps of the disk-based storage capacity determination method disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 21 may also include an operating system 212 and data 213, etc., and the storage method may be temporary storage or permanent storage. The operating system 212 may include Windows, Unix, Linux, etc. The data 213 may include, but is not limited to, the data involved in the disk-based storage capacity determination method, etc.
[0144] In some embodiments, the server may further include a display screen 23, an input / output interface 24, a communication interface 25, a power supply 26, and a communication bus 27.
[0145] Those skilled in the art can understand that the structure shown in Figure 4 does not constitute a limitation on the server, which may include more or fewer components than illustrated.
[0146] The processor 22 implements the hard disk-based storage capacity determination method provided in any of the above embodiments by calling instructions stored in the memory 21.
[0147] For an introduction to the server provided by this invention, please refer to the above method embodiments. This invention will not be described in detail here, but it has the same beneficial effects as the above-described method for determining storage capacity based on hard disk.
[0148] Furthermore, the present invention also provides a computer-readable storage medium storing a computer program, which, when executed by processor 22, implements the steps of the above-described disk-based storage capacity determination method.
[0149] It is understood that if the methods in the above embodiments are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and executes all or part of the steps of the methods in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0150] For an introduction to the computer-readable storage medium provided by the present invention, please refer to the above method embodiments. The present invention will not be described again here, but it has the same beneficial effects as the above-described method for determining storage capacity based on hard disk.
[0151] Furthermore, the present invention also provides a computer program product, including a computer program / instructions that, when executed by a processor, implement the steps of a method for determining storage capacity based on a hard disk.
[0152] For an introduction to the computer program product provided by the present invention, please refer to the above method embodiments. The present invention will not be described in detail here, but it has the same beneficial effects as the above-described method for determining storage capacity based on hard disk.
[0153] The present invention has provided a detailed description of a hard disk-based storage capacity determination method, storage system, server, and medium. The various embodiments in the specification are described in a progressive manner, with each embodiment focusing on its differences from other embodiments. Similar or identical parts between embodiments can be referred to interchangeably. For the apparatus disclosed in the embodiments, since it corresponds to the method disclosed in the embodiments, the description is relatively simple, and relevant parts can be referred to in the method section. It should be noted that those skilled in the art can make several improvements and modifications to the present invention without departing from the principles of the invention, and these improvements and modifications also fall within the protection scope of the present invention.
[0154] It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes the element.
Claims
1. A method for determining storage capacity based on a hard disk, characterized in that it comprises: obtaining the current used capacity of the hard disk and an externally visible capacity threshold; when the current used capacity exceeds the capacity threshold, rate limiting the data sent from the host to the cache to obtain rate-limited data, and sending the rate-limited data to the cache; flushing the current stored data in the cache to the hard disk, and updating the current used capacity of the hard disk, so that the mount mode of the volume is determined based on the updated current used capacity of the hard disk and the capacity threshold; correspondingly, flushing the current stored data in the cache to the hard disk comprises: obtaining the cache reserved space of the hard disk and the write speed of the hard disk, wherein the data capacity of the cache reserved space is greater than the data capacity of the cache, and the write speed of the hard disk is greater than the rate limit applied to the data sent from the host to the cache; receiving the current stored data from the cache; determining the flush time of the hard disk based on the current stored data and the write speed; and flushing the current stored data to the cache reserved space of the hard disk according to the flush time; correspondingly, updating the current used capacity of the hard disk comprises: obtaining the data capacities corresponding to the written data and the deleted data in the current stored data of the cache; adding the data capacity of the written data to the current used capacity of the hard disk to obtain a first data capacity; and subtracting the data capacity of the deleted data from the first data capacity to obtain the updated current used capacity of the hard disk.
2. The method for determining storage capacity based on a hard disk according to claim 1, characterized in that rate limiting the data sent from the host to the cache to obtain the rate-limited data comprises: obtaining the critical used capacity of the hard disk and the rate at which the host sends current data to the cache, wherein the critical used capacity is greater than the capacity threshold; determining a first target rate limiting level based on the current used capacity and the critical used capacity; and adjusting the rate of the current data according to the rate limiting rate corresponding to the first target rate limiting level, so as to complete the rate limiting of the data sent from the host to the cache.
3. The method for determining storage capacity based on a hard disk according to claim 1, characterized in that, Rate-limiting is applied to the data sent from the host to the cache to obtain the rate-limited data, including: Obtain the number of unit requests and the threshold number of requests sent from the host to the cache; wherein the number of unit requests is greater than the threshold number of requests; The number of rate-limited requests is determined based on the unit request count and the threshold request count; The data sent from the host to the cache is distributed according to the number of rate-limiting requests to complete the rate-limiting process for the data sent from the host to the cache.
4. The method for determining storage capacity based on a hard disk according to claim 1, characterized in that, Rate-limiting is applied to the data sent from the host to the cache to obtain the rate-limited data, including: Obtain the initial rate and critical rate limit of the current data sent from the host to the cache; The initial rate of the current data is adjusted according to the critical rate limiting rate to complete the rate limiting process of the host sending data to the cache.
5. The method for determining storage capacity based on a hard disk according to claim 1, characterized in that determining the mount mode of the volume based on the updated current used capacity of the hard disk and the capacity threshold comprises: if the updated current used capacity of the hard disk is greater than or equal to the capacity threshold, determining that the mount mode of the volume is read-only mode; if the updated current used capacity of the hard disk is less than the capacity threshold, determining that the mount mode of the volume is read-write mode; correspondingly, after the mount mode of the volume is determined, the method further comprises: when the mount mode of the volume is read-only mode, switching the read-write service mode to query mode; when the mount mode of the volume is read-write mode, subtracting the updated current used capacity of the hard disk from the capacity threshold to obtain a second data capacity; obtaining in advance a mapping relationship between data capacity ranges and rate limiting levels, wherein the data capacity ranges and the rate limiting levels are in one-to-one correspondence; determining, in the mapping relationship, the target data capacity range corresponding to the second data capacity; determining the corresponding second target rate limiting level based on the target data capacity range; and sending the rate-limited data to the cache at the rate limiting rate corresponding to the second target rate limiting level.
6. A storage system, characterized in that it comprises a host and a hard disk, wherein the cache is located on the host, and the host is used to perform the steps of the hard disk-based storage capacity determination method according to any one of claims 1 to 5.
7. A server, characterized in that it comprises: a memory, used to store a computer program; and a processor, configured to implement the steps of the hard disk-based storage capacity determination method according to any one of claims 1 to 5 when executing the computer program.
8. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores a computer program that, when executed by a processor, implements the steps of the disk-based storage capacity determination method as described in any one of claims 1 to 5.
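As an illustration only, the capacity update recited in claim 1 can be written as a short derivation. The symbols are assumptions introduced here and do not appear in the claims: $U$ is the current used capacity of the hard disk, $W$ and $D$ are the data capacities of the written and deleted data in the cache, $C_1$ is the first data capacity, $U'$ is the updated current used capacity, and $T$ is the capacity threshold.

```latex
\begin{aligned}
C_1 &= U + W, \\
U'  &= C_1 - D = U + W - D, \\
\text{mount mode} &=
\begin{cases}
\text{read-only},  & U' \ge T, \\
\text{read-write}, & U' < T.
\end{cases}
\end{aligned}
```

For example, with the hypothetical values $U = 900$, $W = 40$, $D = 70$, and $T = 900$ (all in GiB), $C_1 = 940$ and $U' = 870 < T$, so the volume stays in read-write mode even though $U$ had already reached the threshold before the flush.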