Disk wear leveling method and device in distributed cluster, electronic equipment and medium
By migrating data directly from the write cache of the source disk to the target disk in a distributed cluster, the method addresses the shortened disk lifespan and low migration efficiency caused by uneven disk wear, achieving more efficient data migration and extended disk lifespan.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ZHEJIANG UNIVIEW TECH CO LTD
- Filing Date
- 2022-07-11
- Publication Date
- 2026-06-23
AI Technical Summary
In distributed clusters, uneven disk wear leads to shortened disk lifespan and low data migration efficiency. Existing approaches first flush data from the write cache to the source disk and then read it back from the source disk to migrate it to the target disk, which increases the number of disk writes and thus affects performance and lifespan.
Migrating directly from the source disk's write cache to the target disk avoids evicting and flushing data to the source disk, reducing the number of times the source disk is flushed and improving data migration efficiency.
The approach slows the reduction of disk lifespan, improves data migration efficiency, reduces the write load on the source disk, and optimizes disk performance.
Figure CN117420943B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of disk wear leveling technology, and more particularly to a disk wear leveling method, apparatus, electronic device, and medium in a distributed cluster.
Background Technology
[0002] Disks, as a common data storage medium, are widely used in distributed clusters. Since disks have a limited write / erase lifespan, it's crucial to distribute workloads as evenly as possible within a distributed storage cluster. However, due to differences in partitioning attributes or storage node roles, disks experience varying degrees of wear and tear. Therefore, effectively leveling disk wear and migrating data efficiently when disks become worn are critical issues in distributed cluster applications.
[0003] The current main solution is to migrate all or part of the data from storage nodes with high wear and tear to other storage nodes. Specifically, hot data is calculated based on the system's own heat index, the data to be migrated is found, read from the source disk and written to the target disk, and the data on the source disk is deleted.
[0004] However, the above approach requires recording the write count for each cache entry, which can create an additional write burden, especially for small objects, and can also impact disk performance. Furthermore, during data migration, data must first be written to the source disk and then read from it to migrate to other disks. This process increases the number of disk writes, shortening disk lifespan and increasing data migration time, thus reducing data migration efficiency.
Summary of the Invention
[0005] This invention provides a disk wear leveling method, apparatus, electronic device, and medium in a distributed cluster, which can directly migrate data from the write cache of the source disk to the target disk without having to flush the write cache data to the source disk before data migration. This reduces the number of write operations to the source disk, helps slow down disk lifespan reduction, and improves data migration efficiency.
[0006] According to one aspect of the present invention, a disk wear leveling method in a distributed cluster is provided, the method comprising:
[0007] Identify the source disk for data migration and the target disk for data migration from the cloud storage distributed cluster;
[0008] The data to be migrated is determined from the write cache corresponding to the source disk;
[0009] The data to be migrated is migrated from the write cache corresponding to the source disk to the target disk.
[0010] Optionally, determining the data to be migrated from the write cache corresponding to the source disk includes:
[0011] Determine the number of times the data to be stored in the write cache is overwritten within a target time period; wherein the target time period runs from after the data to be stored is written to the write cache until before the data to be stored is evicted from the write cache; the data to be stored is data that has not been written to the source disk;
[0012] The data to be migrated is determined from the data to be stored based on the number of times it has been overwritten.
[0013] Optionally, the number of times the data has been overwritten is recalculated after the write service of the write cache associated device is restarted.
[0014] Optionally, before migrating the data to be migrated from the write cache corresponding to the source disk to the target disk, the method further includes:
[0015] Determine the integrity of the data to be migrated;
[0016] If the data to be migrated is incomplete, the corresponding supplementary data to be migrated is read from the source disk and filled into the write cache, so that it forms complete array stripe data together with the data to be migrated.
[0017] Accordingly, migrating the data to be migrated from the write cache corresponding to the source disk to the target disk includes:
[0018] The array stripe data in the write cache is migrated from the write cache corresponding to the source disk to the target disk.
[0019] Optionally, determining the integrity of the data to be migrated includes:
[0020] Determine the comparison result between the size of the data to be migrated in the write cache and the preset array stripe size;
[0021] If the data to be migrated is smaller than the preset array stripe size, then the data to be migrated is determined to be incomplete.
[0022] Optionally, determining the data to be migrated from the write cache corresponding to the source disk includes:
[0023] Determine the capacity of the data to be stored in the write cache;
[0024] Data to be stored that has a capacity greater than a preset capacity threshold is identified as data to be migrated.
[0025] Optionally, determining the source disk for data migration and the target disk for data migration from the cloud storage distributed cluster includes:
[0026] The source disk for data migration and the target disk for data migration are determined from the cloud storage distributed cluster based on disk wear parameters; wherein, the disk wear parameters include at least one of the following: number of erase / write cycles, remaining lifespan, and remaining memory.
[0027] According to another aspect of the present invention, a disk wear leveling device in a distributed cluster is provided, comprising:
[0028] The disk determination module is used to determine the source disk for data migration and the target disk for data migration from the cloud storage distributed cluster;
[0029] The data determination module is used to determine the data to be migrated from the write cache corresponding to the source disk;
[0030] The data migration module is used to migrate the data to be migrated from the write cache corresponding to the source disk to the target disk.
[0031] According to another aspect of the present invention, a disk wear leveling electronic device in a distributed cluster is provided, the electronic device comprising:
[0032] At least one processor; and
[0033] A memory communicatively connected to the at least one processor; wherein,
[0034] The memory stores a computer program that can be executed by the at least one processor, the computer program being executed by the at least one processor to enable the at least one processor to perform the disk wear leveling method in a distributed cluster according to any embodiment of the present invention.
[0035] According to another aspect of the present invention, a computer-readable storage medium is provided, the computer-readable storage medium storing computer instructions for causing a processor to execute and implement the disk wear leveling method in a distributed cluster as described in any embodiment of the present invention.
[0036] The technical solution of this invention involves determining the source disk for data migration and the target disk for data migration from a cloud storage distributed cluster; determining the data to be migrated from the write cache corresponding to the source disk; and migrating the data to be migrated from the write cache corresponding to the source disk to the target disk. This technical solution enables direct migration of data from the write cache of the source disk to the target disk, eliminating the need to flush the data to be migrated from the write cache to the source disk before data migration. This reduces the number of write operations to the source disk, helps slow down the shortening of the source disk's lifespan, and improves data migration efficiency.
[0037] It should be understood that the description in this section is not intended to identify key or essential features of the embodiments of the present invention, nor is it intended to limit the scope of the invention. Other features of the invention will become readily apparent from the following description.
Attached Figure Description
[0038] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0039] Figure 1 is a flowchart of a disk wear leveling method in a distributed cluster according to Embodiment 1 of the present invention;
[0040] Figure 2 is an internal software logic diagram of a single node in a distributed cluster provided according to Embodiment 1 of the present invention;
[0041] Figure 3 is a flowchart of a disk wear leveling method in a distributed cluster according to Embodiment 2 of the present invention;
[0042] Figure 4 is a schematic diagram of a disk wear leveling device in a distributed cluster according to Embodiment 3 of the present invention;
[0043] Figure 5 is a schematic diagram of the structure of an electronic device that implements a disk wear leveling method in a distributed cluster according to an embodiment of the present invention.
Detailed Implementation
[0044] To enable those skilled in the art to better understand the present invention, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of the present invention.
[0045] It should be noted that the terms "first," "second," "target," etc., used in the specification, claims, and accompanying drawings of this invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that embodiments of the invention described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.
[0046] Embodiment 1
[0047] Figure 1 is a flowchart of a disk wear leveling method in a distributed cluster according to Embodiment 1 of the present invention. This embodiment is applicable to situations where wear leveling is performed on disks in a distributed cluster. The method can be executed by a disk wear leveling device in the distributed cluster, which can be implemented in hardware and / or software. This disk wear leveling device can be configured in an electronic device with data processing capabilities. As shown in Figure 1, the method includes:
[0048] S110: Determine the source disk for data migration and the target disk for data migration from the cloud storage distributed cluster.
[0049] The technical solution in this embodiment is applicable to cloud storage distributed cluster application scenarios. A cloud storage distributed cluster refers to a system that uses cluster applications, grid technology, or distributed file systems to aggregate a large number of various types of storage devices on a network so that they work collaboratively and jointly provide data storage and service access functions. Specifically, a cloud storage distributed cluster contains multiple nodes. Figure 2 is a software logic diagram of a single node in a distributed cluster provided in Embodiment 1 of the present invention, in which LV represents a disk partition, Cache represents a write cache, RAID represents a disk array, DISK represents a disk, and IO represents input / output. As shown in Figure 2, there is a caching interface between the upper-layer business logic and the disk, divided into a write cache and a read cache. All business data must pass through this cache; in other words, all write operations must first go through the write cache before reaching the disk. Furthermore, to improve business performance, an array cache can be configured. Specifically, before business data is written to disk, it must first be written to the array write cache. The write cache typically uses a portion of memory to sort and consolidate data, thereby improving write performance.
[0050] In this context, the source disk can refer to the disk from which data is migrated, and the target disk can refer to the disk to which data is migrated. It should be noted that the source and target disks can be located on the same node or on different nodes within the cloud storage distributed cluster. This embodiment does not impose any restrictions on the location of the source and target disks.
[0051] In this embodiment, the source disk and target disk need to be determined from the cloud storage distributed cluster first. Optionally, determining the source disk for data migration and the target disk for data migration from the cloud storage distributed cluster includes: determining the source disk for data migration and the target disk for data migration from the cloud storage distributed cluster based on disk wear parameters; wherein, the disk wear parameters include at least one of the following: erase / write cycles, remaining lifetime, and remaining memory.
[0052] Here, disk wear parameters refer to parameters used to characterize the degree of disk wear. Specifically, disk wear parameters may include at least one of write / erase cycles, remaining lifetime, and remaining memory. In this embodiment, disk wear parameters can first be determined using disk wear detection tools or discrete statistical algorithms, and then source and target disks can be determined from the cloud storage distributed cluster based on the disk wear parameters. For example, disks with a high number of write / erase cycles, short remaining lifetime, and / or low remaining memory can be identified as source disks, and disks with a low number of write / erase cycles, long remaining lifetime, and / or high remaining memory can be identified as target disks.
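The selection described above can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the `DiskWear` fields, the equal-weight `wear_score`, the `max_cycles` rating, and the `min_free_gb` filter are all assumptions introduced here, and "remaining memory" is interpreted as free capacity.

```python
from dataclasses import dataclass


@dataclass
class DiskWear:
    disk_id: str
    erase_write_cycles: int       # erase/write cycles consumed so far
    remaining_life_pct: float     # estimated remaining lifespan, 0-100
    remaining_capacity_gb: float  # free space ("remaining memory" in the text)


def wear_score(d: DiskWear, max_cycles: int) -> float:
    """Hypothetical score: higher means more worn (equal weights are an assumption)."""
    cycle_part = d.erase_write_cycles / max_cycles
    life_part = 1.0 - d.remaining_life_pct / 100.0
    return 0.5 * cycle_part + 0.5 * life_part


def pick_source_and_target(disks, max_cycles=3000, min_free_gb=50.0):
    """Return (most worn disk, least worn disk that still has enough free space)."""
    source = max(disks, key=lambda d: wear_score(d, max_cycles))
    candidates = [
        d for d in disks
        if d is not source and d.remaining_capacity_gb >= min_free_gb
    ]
    if not candidates:
        return source, None
    target = min(candidates, key=lambda d: wear_score(d, max_cycles))
    return source, target


disks = [
    DiskWear("A", 2700, 12.0, 100.0),
    DiskWear("B", 300, 91.0, 800.0),
    DiskWear("C", 1500, 50.0, 400.0),
]
source, target = pick_source_and_target(disks)
print(source.disk_id, target.disk_id)
```

In a real cluster the wear parameters would come from whatever disk wear detection tool is in use; the scoring function is the part most likely to need tuning.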
[0053] S120: Determine the data to be migrated from the write cache corresponding to the source disk.
[0054] In this context, "data to be migrated" refers to data awaiting migration. In this embodiment, the data to be migrated can be determined based on the data hit rate in the write cache corresponding to the source disk, or based on the data capacity in the write cache corresponding to the source disk. It should be noted that the data hit rate is closely related to the number of times the data has been modified. Given a fixed total number of disk writes, the more times a certain piece of data in the write cache has been modified, the higher its hit rate. Furthermore, larger data volumes occupy more disk memory. Data with a high hit rate is considered "hot data," while larger data volumes are considered "large blocks of data." For example, data with a high hit rate in the write cache (hot data) can be identified as data to be migrated, or data with a large data volume in the write cache (large blocks of data) can be identified as data to be migrated.
[0055] S130: Migrate the data to be migrated from the write cache corresponding to the source disk to the target disk.
[0056] In this embodiment, after determining the source disk, target disk, and data to be migrated, the data to be migrated can be moved from the write cache corresponding to the source disk to the target disk. Taking Figure 2 as an example, suppose there are two disks, A and B, where disk A is the source disk and disk B is the target disk. During data migration, the data to be migrated in disk A's write cache can be moved to disk B directly, without first flushing it to disk A. This saves one flush operation on disk A, which helps slow the reduction of disk A's lifespan and improves data migration efficiency. For data that has been migrated away from the source disk, the target disk can be found through the recorded mapping table, and the migrated data can be located accordingly. It should be noted that the data in the write cache is data that has not yet been written to the RAID array.
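A minimal sketch of the disk A / disk B example, under stated assumptions: the toy `Disk` class, the dictionary used as the write cache, and the dictionary used as the mapping table are hypothetical stand-ins, since the text does not specify these data structures.

```python
class Disk:
    """Toy disk that counts writes (a stand-in for a real block device)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.write_count = 0

    def write(self, key, data):
        self.blocks[key] = data
        self.write_count += 1


def migrate_from_write_cache(write_cache, keys, target, mapping):
    """Move selected dirty entries from the source disk's write cache to the target disk."""
    for key in keys:
        target.write(key, write_cache[key])  # written once, on the target only
        mapping[key] = target.name           # record where the data now lives
        del write_cache[key]                 # release the cache entry after the write


disk_a, disk_b = Disk("A"), Disk("B")
cache_a = {"obj1": b"hot", "obj2": b"cold"}  # dirty data not yet flushed to disk A
mapping = {}
migrate_from_write_cache(cache_a, ["obj1"], disk_b, mapping)
print(disk_a.write_count, disk_b.write_count, mapping)
```

The point of the sketch is that disk A's write counter stays at zero for the migrated entry, whereas a flush-then-read migration would have written it to disk A first.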
[0057] The technical solution of this invention involves determining the source disk for data migration and the target disk for data migration from a cloud storage distributed cluster; determining the data to be migrated from the write cache corresponding to the source disk; and migrating the data to be migrated from the write cache corresponding to the source disk to the target disk. This technical solution enables direct migration of data from the write cache of the source disk to the target disk, eliminating the need to flush the data to be migrated from the write cache to the source disk before data migration. This reduces the number of write operations to the source disk, helps slow down the shortening of the source disk's lifespan, and improves data migration efficiency.
[0058] In this embodiment, optionally, determining the data to be migrated from the write cache corresponding to the source disk includes: determining the number of times the data to be stored in the write cache has been overwritten within a target time period; wherein the target time period runs from after the data to be stored is written to the write cache until before the data to be stored is evicted from the write cache; the data to be stored is data that has not been written to the source disk; and determining the data to be migrated from the data to be stored based on the overwrite count.
[0059] The target time period can refer to a pre-defined period over which the number of overwrites is counted. Specifically, the target time period runs from after the data to be stored is written to the write cache until before the data to be stored is evicted from the write cache. The data to be stored can refer to data waiting to be written to disk, i.e., data that has not yet been written to the source disk. Optionally, the data to be stored is evicted from the write cache when a preset number of array stripes in the write cache are full, or when the memory used for storage reaches a preset memory threshold. The preset number can refer to a pre-defined number of disk array stripes in the full-stripe state, that is, stripes that are completely filled with data. The preset memory threshold can refer to a pre-defined upper limit on the memory used for data.
[0060] In this embodiment, the data to be migrated can be determined based on the overwrite count. Specifically, it is first necessary to determine the number of times the data to be stored has been overwritten after being written to the write cache and before being evicted from it. It should be noted that when write cache data is written to the array and disk, eviction can be triggered when a preset number of array stripes are full or when the storage memory reaches a preset memory threshold. The evicted data is written to the physical disk. For example, the overwrite count can be determined by disk wear detection tools or discrete statistical algorithms. Then, the data to be migrated can be determined from the data to be stored based on the overwrite count. For example, data to be stored with a high overwrite count (hot data) can be identified as data to be migrated, to avoid shortening the disk life through repeated overwriting of hot data.
[0061] This solution allows frequently overwritten data to be migrated, effectively avoiding the problem of shortened disk lifespan caused by repeated data rewriting.
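One way to picture the overwrite counting is a write cache that increments a per-entry counter whenever an entry already in the cache is written again, and drops the counter on eviction. The class name `CountingWriteCache` and the `min_overwrites` threshold are assumptions made for this sketch; the text does not say how "high" an overwrite count must be.

```python
class CountingWriteCache:
    """Toy write cache that tracks overwrites between insertion and eviction."""

    def __init__(self):
        self.entries = {}
        self.overwrites = {}

    def write(self, key, data):
        if key in self.entries:
            self.overwrites[key] += 1  # entry modified while still cached
        else:
            self.overwrites[key] = 0   # first write starts the target time period
        self.entries[key] = data

    def evict(self, key):
        """Eviction ends the target time period, so the counter is discarded."""
        del self.overwrites[key]
        return self.entries.pop(key)

    def hot_keys(self, min_overwrites):
        return [k for k, n in self.overwrites.items() if n >= min_overwrites]


cache = CountingWriteCache()
for payload in (b"v1", b"v2", b"v3"):
    cache.write("log", payload)  # written once, then overwritten twice
cache.write("cfg", b"v1")        # written once, never overwritten
hot = cache.hot_keys(min_overwrites=2)
print(hot)
```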
[0062] In this embodiment, optionally, the overwrite count is recalculated after the write service of the write cache associated device is restarted.
[0063] In this context, the write cache associated device can refer to a node device in a distributed cluster associated with the write cache. Existing technologies typically keep the counts in temporary memory and periodically flush them to non-volatile media, identifying hot data to be migrated by recording the number of times the data has been modified. This method impacts disk performance and incurs additional write overhead. In this embodiment, the overwrite count can be recalculated after the write service of the write cache associated device restarts. It should be noted that triggering eviction of data in the write cache can cause the write service of the write cache associated device to restart. For example, assuming data is written to the write cache after the write service of the write cache associated device restarts, if this data is modified again before being evicted from the write cache to the backend physical disk, only one write cache hit is recorded, i.e., the overwrite count is recorded as 1.
[0064] With this configuration, the overwrite count is recalculated after the write service of the write cache associated device is restarted, so it does not need to be written to non-volatile media and incurs no additional write burden.
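The restart behaviour can be illustrated with counters that live only in process memory. This is a sketch under assumptions: `WriteService`, `record_write`, and `restart` are hypothetical names, and the only property being shown is that nothing is persisted, so a restart begins counting from zero.

```python
class WriteService:
    """Hypothetical write service; counters are kept in memory and never persisted."""

    def __init__(self):
        self.overwrite_counts = {}

    def record_write(self, key, already_cached):
        if already_cached:
            self.overwrite_counts[key] = self.overwrite_counts.get(key, 0) + 1
        else:
            self.overwrite_counts.setdefault(key, 0)

    def restart(self):
        # No load from disk: all counts are simply recalculated from scratch.
        self.overwrite_counts = {}


svc = WriteService()
svc.record_write("x", already_cached=False)
svc.record_write("x", already_cached=True)
svc.record_write("x", already_cached=True)
before_restart = svc.overwrite_counts["x"]

svc.restart()
svc.record_write("x", already_cached=False)  # written after the restart
svc.record_write("x", already_cached=True)   # modified once before eviction
after_restart = svc.overwrite_counts["x"]
print(before_restart, after_restart)
```

The second half mirrors the example in the text: data written after the restart and modified once before eviction ends up with an overwrite count of 1.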
[0065] In this embodiment, optionally, determining the data to be migrated from the write cache corresponding to the source disk includes: determining the capacity of the data to be stored in the write cache; and determining the data to be stored with a capacity greater than a preset capacity threshold as the data to be migrated.
[0066] The preset capacity threshold can refer to a pre-defined upper limit for data storage capacity. In this embodiment, the data to be migrated can be determined based on the capacity of the data to be stored. Specifically, firstly, the capacity of the data to be stored in the write cache is determined, and then this capacity is compared with the preset capacity threshold. Data with a capacity greater than the preset capacity threshold is selected as the data to be migrated, that is, large blocks of data are selected as the data to be migrated.
[0067] This solution allows large blocks of data to be migrated, effectively avoiding the problem of shortened disk lifespan caused by excessive data writing.
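The capacity-based selection amounts to a single filter over the pending entries. In this sketch the pending data is modelled as a dictionary of byte strings and the 64 KiB threshold is an arbitrary illustrative value; neither is specified in the text.

```python
def select_large_blocks(pending, threshold_bytes):
    """Return keys of pending cache entries whose size exceeds the threshold."""
    return [key for key, data in pending.items() if len(data) > threshold_bytes]


pending = {
    "video_chunk": b"\x00" * (256 * 1024),  # 256 KiB
    "index_entry": b"\x00" * 512,           # 512 B
    "snapshot": b"\x00" * (64 * 1024),      # exactly at the threshold
}
large = select_large_blocks(pending, threshold_bytes=64 * 1024)
print(large)
```

Because the text says "greater than" the threshold, an entry exactly at the threshold is not selected.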
[0068] Embodiment 2
[0069] Figure 3 is a flowchart of a disk wear leveling method in a distributed cluster provided in Embodiment 2 of the present invention. This embodiment is an optimization based on the above embodiment. Specifically, before migrating the data to be migrated from the write cache corresponding to the source disk to the target disk, the method further includes: determining the integrity of the data to be migrated; if the data to be migrated is incomplete, reading the corresponding supplementary data to be migrated from the source disk and filling it into the write cache, so that it forms complete array stripe data together with the data to be migrated; correspondingly, migrating the data to be migrated from the write cache corresponding to the source disk to the target disk includes: migrating the array stripe data in the write cache from the write cache corresponding to the source disk to the target disk.
[0070] As shown in Figure 3, the method in this embodiment specifically includes the following steps:
[0071] S310: Determine the source disk for data migration and the target disk for data migration from the cloud storage distributed cluster.
[0072] S320: Determine the data to be migrated from the write cache corresponding to the source disk.
[0073] S330: Determine the integrity of the data to be migrated.
[0074] In this embodiment, before data migration, it is also necessary to determine the integrity of the data to be migrated, that is, to determine whether the data to be migrated in the write cache is complete. Optionally, determining the integrity of the data to be migrated includes: determining the comparison result between the size of the data to be migrated in the write cache and the preset array stripe size; if the data to be migrated is smaller than the preset array stripe size, then it is determined that the data to be migrated is incomplete.
[0075] Here, an array stripe can refer to the smallest storage unit of a disk array in the write cache. The preset array stripe size can refer to the pre-defined capacity corresponding to a full array stripe. In this embodiment, after determining the data to be migrated, the size of the data to be migrated can be compared with the preset array stripe size, and the integrity of the data to be migrated can be determined based on the comparison result. Specifically, if the data to be migrated is smaller than the preset array stripe size, it can be determined that the data to be migrated is incomplete; otherwise, it indicates that the data to be migrated is complete. For example, assuming the size of the data to be migrated is 1KB and the size of the preset array stripe is 4KB, the size of the data to be migrated is smaller than the size of the preset array stripe, therefore it can be determined that the data to be migrated is incomplete, that is, the data to be migrated in the write cache is not completely modified.
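The integrity check is a size comparison against the preset array stripe size. A minimal sketch using the 1KB and 4KB figures from the example; the function name `is_complete` and the constant `STRIPE_SIZE` are assumptions.

```python
STRIPE_SIZE = 4 * 1024  # preset array stripe size from the example (4KB)


def is_complete(data: bytes, stripe_size: int = STRIPE_SIZE) -> bool:
    """Data smaller than one full array stripe is treated as incomplete."""
    return len(data) >= stripe_size


partial = b"\x01" * 1024       # 1KB of modified data
full = b"\x02" * STRIPE_SIZE   # a fully modified stripe
print(is_complete(partial), is_complete(full))
```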
[0076] S340: If the data to be migrated is incomplete, read the corresponding supplementary data to be migrated from the source disk and fill it into the write cache, so that it forms complete array stripe data together with the data to be migrated.
[0077] The supplementary data to be migrated can refer to the source disk data that, together with the incomplete data to be migrated, forms the complete array stripe data. Array stripe data can refer to the data stored within an array stripe. When data is not completely modified but only partially modified, incomplete data to be migrated may occur. In this embodiment, if the data to be migrated is incomplete, the corresponding supplementary data to be migrated needs to be read from the source disk and filled into the write cache to form the complete array stripe data. For example, assuming the complete array stripe data size is 4KB, and only a partial modification has been made to the data, with the modified data size being 1KB, the 1KB of modified data in the write cache is incomplete data to be migrated. In this case, the corresponding unmodified 3KB supplementary data to be migrated needs to be read from the source disk and filled into the write cache to form the complete 4KB array stripe data.
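The 1KB plus 3KB example can be sketched as overlaying the modified bytes onto the stripe read from the source disk. The `offset` parameter is an assumption introduced here (the text does not say where inside the stripe the modified region sits), and `fill_stripe` is a hypothetical helper name.

```python
def fill_stripe(dirty: bytes, offset: int, on_disk_stripe: bytes,
                stripe_size: int = 4096) -> bytes:
    """Combine modified cache data with unmodified source-disk data into one full stripe."""
    if len(on_disk_stripe) != stripe_size:
        raise ValueError("source stripe must be exactly one stripe long")
    if offset < 0 or offset + len(dirty) > stripe_size:
        raise ValueError("modified region falls outside the stripe")
    stripe = bytearray(on_disk_stripe)           # unmodified data read from the source disk
    stripe[offset:offset + len(dirty)] = dirty   # overlay the newer cached bytes
    return bytes(stripe)


dirty = b"N" * 1024    # 1KB modified in the write cache
on_disk = b"O" * 4096  # current 4KB stripe content on the source disk
stripe = fill_stripe(dirty, offset=1024, on_disk_stripe=on_disk)
print(len(stripe), stripe.count(b"O"), stripe.count(b"N"))
```

Note that this step only reads from the source disk; it does not add a write to it.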
[0078] S350: Migrate the array stripe data in the write cache from the write cache corresponding to the source disk to the target disk.
[0079] In this embodiment, after the write cache obtains the complete array stripe data, the array stripe data can be migrated from the write cache corresponding to the source disk to the target disk, and the cache block corresponding to the data to be migrated in the write cache can be released after the data migration is completed.
[0080] S360: If the data to be migrated is complete, migrate it from the write cache corresponding to the source disk to the target disk.
[0081] In this embodiment, if the data to be migrated is complete, there is no need to read the data from the source disk; the data to be migrated can be directly migrated from the write cache corresponding to the source disk to the target disk.
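Putting steps S330 to S360 together, the two branches can be sketched as below. This is an illustrative sketch only: `migrate_entry`, the callback parameters, and the returned path label are hypothetical, and the source disk and target disk are replaced by plain Python callables so that the number of source reads can be observed.

```python
STRIPE_SIZE = 4096


def migrate_entry(dirty, offset, read_stripe_from_source, write_to_target,
                  stripe_size=STRIPE_SIZE):
    """Migrate one cache entry, filling it to a full stripe first if it is incomplete."""
    if len(dirty) < stripe_size:                      # S330: incomplete
        if offset < 0 or offset + len(dirty) > stripe_size:
            raise ValueError("modified region falls outside the stripe")
        stripe = bytearray(read_stripe_from_source()) # S340: read supplementary data
        stripe[offset:offset + len(dirty)] = dirty
        payload, path = bytes(stripe), "filled"       # S350: migrate the full stripe
    else:
        payload, path = dirty, "direct"               # S360: no source read needed
    write_to_target(payload)
    return path


written = []       # stands in for the target disk
source_reads = []  # records each read from the source disk


def read_source():
    source_reads.append(1)
    return b"\x00" * STRIPE_SIZE


path_partial = migrate_entry(b"\x01" * 1024, 0, read_source, written.append)
path_full = migrate_entry(b"\x02" * STRIPE_SIZE, 0, read_source, written.append)
print(path_partial, path_full, len(source_reads))
```

Only the incomplete entry triggers a read from the source disk, and in neither branch is anything written to it.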
[0082] The technical solution of this invention determines the integrity of the data to be migrated before migrating it from the write cache corresponding to the source disk to the target disk. If the data to be migrated is incomplete, corresponding supplementary data to be migrated is read from the source disk and filled into the write cache, forming a complete array stripe data with the data to be migrated. The array stripe data in the write cache is then migrated from the write cache corresponding to the source disk to the target disk. This technical solution, while mitigating disk lifespan reduction and improving data migration efficiency, also effectively avoids data migration errors caused by incomplete data to be migrated.
[0083] Embodiment 3
[0084] Figure 4 is a schematic diagram of a disk wear leveling device in a distributed cluster according to Embodiment 3 of the present invention. This device can execute the disk wear leveling method in a distributed cluster provided in any embodiment of the present invention, and has the corresponding functional modules and beneficial effects for executing the method. As shown in Figure 4, the device includes:
[0085] The disk determination module 410 is used to determine the source disk for data migration and the target disk for data migration from the cloud storage distributed cluster;
[0086] The data determination module 420 is used to determine the data to be migrated from the write cache corresponding to the source disk;
[0087] The data migration module 430 is used to migrate the data to be migrated from the write cache corresponding to the source disk to the target disk.
[0088] Optionally, the data determination module 420 includes:
[0089] The overwrite count determination unit is used to determine the number of times the data to be stored in the write cache is overwritten within a target time period; wherein, the target time period is after the data to be stored is written to the write cache and before the data to be stored is evicted in the write cache; the data to be stored is data that has not been written to the source disk;
[0090] The first data determination unit is used to determine the data to be migrated from the data to be stored based on the number of times it has been rewritten.
[0091] Optionally, the number of times the data has been overwritten is recalculated after the write service of the write cache associated device is restarted.
[0092] Optionally, the device further includes:
[0093] The data integrity determination module is used to determine the integrity of the data to be migrated before migrating the data to be migrated from the write cache corresponding to the source disk to the target disk;
[0094] The data filling module is used to read the corresponding supplementary data to be migrated from the source disk if the data to be migrated is incomplete, and fill the supplementary data to be migrated into the write cache to form a complete array stripe data with the data to be migrated;
[0095] Correspondingly, the data migration module 430 is specifically used for:
[0096] The array stripe data in the write cache is migrated from the write cache corresponding to the source disk to the target disk.
[0097] Optionally, the data integrity determination module is specifically used for:
[0098] Determine the comparison result between the size of the data to be migrated in the write cache and the preset array block size;
[0099] If the size of the data to be migrated is smaller than the preset array block size, then the data to be migrated is determined to be incomplete.
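The integrity check and the filling step can be illustrated by the following sketch. It assumes that the cached data occupies one contiguous byte range inside a single stripe and that the rest of the stripe can be read from the source disk; `STRIPE_SIZE`, `is_complete`, and `fill_stripe` are hypothetical names, and the tiny stripe size is chosen only to keep the example readable.

```python
STRIPE_SIZE = 8  # hypothetical preset array block size in bytes (tiny for the demo)


def is_complete(cached: bytes, stripe_size: int = STRIPE_SIZE) -> bool:
    """Data smaller than the preset array block size is treated as incomplete."""
    return len(cached) >= stripe_size


def fill_stripe(cached: bytes, offset: int, source_stripe: bytes,
                stripe_size: int = STRIPE_SIZE) -> bytes:
    """Pad cached data with supplementary bytes read from the source disk.

    cached        -- bytes held in the write cache for this stripe
    offset        -- position of the cached bytes inside the stripe (assumed known)
    source_stripe -- current content of the stripe as read from the source disk
    """
    if len(source_stripe) != stripe_size:
        raise ValueError("source stripe must be exactly one stripe long")
    if offset < 0 or offset + len(cached) > stripe_size:
        raise ValueError("cached data does not fit inside the stripe")
    if is_complete(cached, stripe_size):
        return cached  # nothing to read from the source disk
    head = source_stripe[:offset]
    tail = source_stripe[offset + len(cached):]
    return head + cached + tail


stripe = fill_stripe(b"xyz", offset=2, source_stripe=b"AAAAAAAA")
print(stripe)  # prints: b'AAxyzAAA'
```

Only reads are issued to the source disk in this step; the completed stripe is then migrated from the write cache to the target disk as a whole.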
[0100] Optionally, the data determination module 420 includes:
[0101] A data capacity determination unit is used to determine the capacity of the data to be stored in the write cache;
[0102] The second data determination unit is used to determine the data to be stored that has a capacity greater than a preset capacity threshold as the data to be migrated.
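For illustration, the capacity-based selection can be sketched as a simple filter. The function name `select_by_capacity`, the dictionary model of the write cache, and the example threshold are assumptions; measuring capacity as the byte length of each cached entry is also an assumption of the sketch.

```python
from typing import Dict, List


def select_by_capacity(pending: Dict[str, bytes], threshold_bytes: int) -> List[str]:
    """Return keys of cached entries strictly larger than the preset threshold."""
    return sorted(key for key, payload in pending.items()
                  if len(payload) > threshold_bytes)


pending = {"small": b"ab", "edge": b"abcd", "large": b"abcdefgh"}
chosen = select_by_capacity(pending, threshold_bytes=4)
print(chosen)  # prints: ['large']
```

The entry whose size equals the threshold is not selected, matching the "greater than" wording of the embodiment.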
[0103] Optionally, the disk determination module 410 is specifically used for:
[0104] The source disk for data migration and the target disk for data migration are determined from the cloud storage distributed cluster based on disk wear parameters; wherein, the disk wear parameters include at least one of the following: number of erase / write cycles, remaining lifespan, and remaining memory.
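One possible way of using the disk wear parameters is sketched below. The embodiments do not prescribe how the parameters are combined, so the ranking rule is an assumption: the most worn disk (most erase / write cycles, ties broken by lower remaining lifespan) is taken as the source, and the least worn disk with enough remaining memory is taken as the target. `DiskWear`, `pick_source_and_target`, and `min_free_gb` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DiskWear:
    """Hypothetical record of the wear parameters named in the embodiment."""
    name: str
    erase_write_cycles: int
    remaining_life_pct: float
    remaining_memory_gb: float


def _wear_key(disk: DiskWear):
    # Higher cycles and lower remaining life both mean "more worn" (assumed rule).
    return (disk.erase_write_cycles, -disk.remaining_life_pct)


def pick_source_and_target(disks: List[DiskWear],
                           min_free_gb: float = 0.0) -> Tuple[DiskWear, DiskWear]:
    """Pick the most worn disk as source and the least worn eligible disk as target."""
    if len(disks) < 2:
        raise ValueError("at least two disks are required")
    source = max(disks, key=_wear_key)
    candidates = [d for d in disks
                  if d is not source and d.remaining_memory_gb >= min_free_gb]
    if not candidates:
        raise ValueError("no target disk has enough remaining memory")
    target = min(candidates, key=_wear_key)
    return source, target


disks = [
    DiskWear("A", erase_write_cycles=9000, remaining_life_pct=10.0, remaining_memory_gb=500.0),
    DiskWear("B", erase_write_cycles=1000, remaining_life_pct=90.0, remaining_memory_gb=50.0),
    DiskWear("C", erase_write_cycles=2000, remaining_life_pct=80.0, remaining_memory_gb=800.0),
]
source, target = pick_source_and_target(disks, min_free_gb=100.0)
print(source.name, target.name)  # prints: A C
```

Disk B is the least worn but is skipped in the example because its remaining memory is below the assumed minimum.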
[0105] The disk wear leveling device in a distributed cluster provided in this embodiment of the invention can execute the disk wear leveling method in a distributed cluster provided in any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the method.
[0106] Embodiment 4
[0107] Figure 5 shows a schematic diagram of an electronic device 10 that can be used to implement embodiments of the present invention. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device can also represent various forms of mobile devices, such as personal digital processors, cellular phones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely illustrative and are not intended to limit the implementation of the invention described and / or claimed herein.
[0108] As shown in Figure 5, the electronic device 10 includes at least one processor 11 and a memory, such as a read-only memory (ROM) 12 or a random access memory (RAM) 13, communicatively connected to the at least one processor 11. The memory stores computer programs executable by the at least one processor. The processor 11 can perform various appropriate actions and processes based on the computer program stored in the ROM 12 or loaded from the storage unit 18 into the RAM 13. The RAM 13 may also store various programs and data required for the operation of the electronic device 10. The processor 11, ROM 12, and RAM 13 are interconnected via a bus 14. An input / output (I / O) interface 15 is also connected to the bus 14.
[0109] Multiple components in electronic device 10 are connected to I / O interface 15, including: input unit 16, such as keyboard, mouse, etc.; output unit 17, such as various types of displays, speakers, etc.; storage unit 18, such as disk, optical disk, etc.; and communication unit 19, such as network card, modem, wireless transceiver, etc. Communication unit 19 allows electronic device 10 to exchange information / data with other devices through computer networks such as the Internet and / or various telecommunications networks.
[0110] Processor 11 can be any of a variety of general-purpose and / or special-purpose processing components with processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various special-purpose artificial intelligence (AI) computing chips, various processors running machine learning model algorithms, digital signal processors (DSPs), and any suitable processor, controller, microcontroller, etc. Processor 11 performs the various methods and processes described above, such as the disk wear leveling method in a distributed cluster.
[0111] In some embodiments, the disk wear leveling method in a distributed cluster can be implemented as a computer program tangibly contained in a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program can be loaded and / or installed on electronic device 10 via ROM 12 and / or communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the disk wear leveling method in a distributed cluster described above can be performed. Alternatively, in other embodiments, processor 11 can be configured to perform the disk wear leveling method in a distributed cluster by any other suitable means (e.g., by means of firmware).
[0112] Various embodiments of the systems and techniques described above herein can be implemented in digital electronic circuit systems, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems-on-a-chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and / or combinations thereof. These various embodiments may include implementations in one or more computer programs that can be executed and / or interpreted on a programmable system including at least one programmable processor, which may be a dedicated or general-purpose programmable processor, capable of receiving data and instructions from a storage system, at least one input device, and at least one output device, and transmitting data and instructions to the storage system, the at least one input device, and the at least one output device.
[0113] Computer programs used to implement the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing device, such that when executed by the processor, the computer programs cause the functions / operations specified in the flowcharts and / or block diagrams to be performed. The computer programs may be executed entirely on a machine, partially on a machine, as a standalone software package partially on a machine and partially on a remote machine, or entirely on a remote machine or server.
[0114] In the context of this invention, a computer-readable storage medium can be a tangible medium that may contain or store a computer program for use by or in conjunction with an instruction execution system, apparatus, or device. A computer-readable storage medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination thereof. Alternatively, a computer-readable storage medium may be a machine-readable signal medium. More specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fibers, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof.
[0115] To provide interaction with a user, the systems and techniques described herein can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and pointing device (e.g., a mouse or trackball) through which the user provides input to the electronic device. Other types of devices can also be used to provide interaction with the user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form (including sound input, voice input, or tactile input).
[0116] The systems and technologies described herein can be implemented in computing systems that include backend components (e.g., as data servers), or computing systems that include middleware components (e.g., application servers), or computing systems that include frontend components (e.g., user computers with graphical user interfaces or web browsers through which users can interact with implementations of the systems and technologies described herein), or any combination of such backend, middleware, or frontend components. The components of the system can be interconnected via digital data communication of any form or medium (e.g., communication networks). Examples of communication networks include local area networks (LANs), wide area networks (WANs), blockchain networks, and the Internet.
[0117] A computing system can include clients and servers. Clients and servers are generally located far apart and typically interact through a communication network. The client-server relationship is created by computer programs running on the respective computers and having a client-server relationship with each other. The server can be a cloud server, also known as a cloud computing server or cloud host, which is a hosting product within the cloud computing service system to address the shortcomings of traditional physical hosts and VPS services, such as high management difficulty and weak business scalability.
[0118] It should be understood that the various forms of processes shown above can be used, with steps reordered, added, or deleted. For example, the steps described in this invention can be executed in parallel, sequentially, or in different orders, as long as the desired result of the technical solution of this invention can be achieved, and this is not limited herein.
[0119] The specific embodiments described above do not constitute a limitation on the scope of protection of this invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of this invention should be included within the scope of protection of this invention.
Claims
1. A disk wear leveling method in a distributed cluster, characterized in that the method comprises: determining a source disk for data migration and a target disk for data migration from a cloud storage distributed cluster; determining data to be migrated from the write cache corresponding to the source disk; determining the integrity of the data to be migrated; if the data to be migrated is incomplete, reading corresponding supplementary data to be migrated from the source disk, and filling the supplementary data to be migrated into the write cache so that it forms complete array stripe data together with the data to be migrated; and migrating the array stripe data in the write cache from the write cache corresponding to the source disk to the target disk; wherein determining the data to be migrated from the write cache corresponding to the source disk comprises: determining the number of times data to be stored in the write cache is overwritten within a target time period, wherein the target time period is after the data to be stored is written to the write cache and before the data to be stored is evicted from the write cache, and the data to be stored is data that has not been written to the source disk; and determining the data to be migrated from the data to be stored based on the number of times it has been overwritten.
2. The method according to claim 1, characterized in that the number of times the data has been overwritten is recalculated after the write service of the device associated with the write cache is restarted.
3. The method according to claim 1, characterized in that determining the integrity of the data to be migrated comprises: determining a comparison result between the size of the data to be migrated in the write cache and a preset array block size; and if the size of the data to be migrated is smaller than the preset array block size, determining that the data to be migrated is incomplete.
4. The method according to claim 1, characterized in that determining the data to be migrated from the write cache corresponding to the source disk comprises: determining the capacity of the data to be stored in the write cache; and determining data to be stored that has a capacity greater than a preset capacity threshold as the data to be migrated.
5. The method according to claim 1, characterized in that determining the source disk for data migration and the target disk for data migration from the cloud storage distributed cluster comprises: determining the source disk for data migration and the target disk for data migration from the cloud storage distributed cluster based on disk wear parameters; wherein the disk wear parameters include at least one of the following: number of erase / write cycles, remaining lifespan, and remaining memory.
6. A disk wear leveling device in a distributed cluster, characterized in that the device comprises: a disk determination module, used to determine a source disk for data migration and a target disk for data migration from a cloud storage distributed cluster; a data determination module, used to determine data to be migrated from the write cache corresponding to the source disk; a data integrity determination module, used to determine the integrity of the data to be migrated; a data filling module, used to read corresponding supplementary data to be migrated from the source disk if the data to be migrated is incomplete, and to fill the supplementary data to be migrated into the write cache so that it forms complete array stripe data together with the data to be migrated; and a data migration module, used to migrate the array stripe data in the write cache from the write cache corresponding to the source disk to the target disk; wherein the data determination module is used to: determine the number of times data to be stored in the write cache is overwritten within a target time period, wherein the target time period is after the data to be stored is written to the write cache and before the data to be stored is evicted from the write cache, and the data to be stored is data that has not been written to the source disk; and determine the data to be migrated from the data to be stored based on the number of times it has been overwritten.
7. A disk wear leveling electronic device in a distributed cluster, characterized in that the electronic device comprises: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores a computer program executable by the at least one processor, and the computer program, when executed by the at least one processor, enables the at least one processor to perform the disk wear leveling method in a distributed cluster as described in any one of claims 1-5.
8. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions that cause a processor to execute the disk wear leveling method in a distributed cluster as described in any one of claims 1-5.