A disk array expansion method and device, electronic equipment and medium

By reading and reconstructing normal disk data before RAID expansion, the problems of topology fragmentation and data consistency caused by disk anomalies are solved, and the continuity and data integrity of online expansion are achieved in abnormal scenarios.

CN122086331B · Active · Publication Date: 2026-06-26 · SHANDONG YUNHAI GUOCHUANG CLOUD COMPUTING EQUIP IND INNOVATION CENT CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHANDONG YUNHAI GUOCHUANG CLOUD COMPUTING EQUIP IND INNOVATION CENT CO LTD
Filing Date
2026-04-22
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

In the current RAID online expansion process, disk anomalies (such as failure or offline) can cause topology fragmentation, leading to I/O routing conflicts and data consistency failures. The lack of atomic rollback and breakpoint resumption mechanisms can result in expansion interruptions and data loss.

Method used

Before expansion, normal disk data (excluding abnormal disks) is read from the disk array. Based on this data, the data is reconstructed and written to the expanded disk array to ensure data consistency and fault tolerance. By rationally allocating and reading/writing striped data, the reliability of the expansion is improved.

Benefits of technology

In scenarios of disk anomalies, ensure the continuity and data consistency of online RAID expansion, improve expansion reliability, and avoid topology inconsistencies and data loss.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122086331B_ABST

Abstract

The application discloses a disk array expansion method and device, electronic equipment and medium, and relates to the technical field of computers. The method comprises the following steps: determining a corresponding stripe set of a to-be-generated stripe in a post-expansion disk array in a pre-expansion disk array according to a preset expansion rule; in the case that there is an abnormal disk in the pre-expansion disk array, reading data corresponding to a first type of data block in the stripe set from the pre-expansion disk array; obtaining data corresponding to a second type of data block in the to-be-generated stripe based on data corresponding to all first type of data blocks; and writing the data corresponding to all second type of data blocks in the to-be-generated stripe into the disks corresponding to the second type of data blocks in the post-expansion disk array respectively, so as to complete the expansion of the pre-expansion disk array. Through the application, the expansion can be continuously performed in the case of disk abnormalities.

Description

Technical Field

[0001] This application relates to the field of computer technology, and in particular to a disk array expansion method, apparatus, electronic device and medium.

Background Technology

[0002] Redundant Array of Independent Disks (RAID) improves storage performance and reliability by adding parity redundancy. RAID technology combines multiple disks to provide a single, larger-capacity logical disk, concurrent input/output (I/O) read/write capability, and data redundancy. As data volumes continue to grow, existing RAID arrays are often expanded online to meet capacity and performance demands. However, in related technologies, if a disk fails or goes offline during online RAID expansion, the expansion terminates. This breaks the topology between the old and new stripes and triggers I/O routing conflicts. Because atomic rollback and breakpoint resumption mechanisms are lacking, the array is highly susceptible to data inconsistency or even data loss. Therefore, ensuring the continuity and data consistency of online RAID expansion when a disk is abnormal is a key concern.

Summary of the Invention

[0003] This application provides a disk array expansion method, apparatus, electronic device, and medium to ensure the continuity and data consistency of RAID online expansion in the event of disk anomalies.

[0004] This application provides a method for expanding a disk array, the method comprising:

[0005] According to the preset expansion rules, determine the stripe set in the disk array before expansion that corresponds to the stripe to be generated in the disk array after expansion. The data in the stripe set includes the data in the stripe to be generated.

[0006] If there are abnormal disks in the disk array before expansion, read the data corresponding to the first type of data block in the stripe set from the disk array before expansion. The disk corresponding to the first type of data block in the disk array before expansion is a disk other than the abnormal disk. The disk array before expansion has the ability to recover data from the abnormal disk.

[0007] Based on the data corresponding to all the first type of data blocks, the data corresponding to the second type of data blocks in the stripe to be generated is obtained. The disks corresponding to the second type of data blocks in the expanded disk array are other disks except for the abnormal disks.

[0008] The data corresponding to each of the second type of data blocks in the stripe to be generated is written to the disks corresponding to each second type of data block in the expanded disk array to complete the expansion of the disk array before expansion.

[0009] This application also provides a disk array expansion device, including:

[0010] The first determining module is used to determine the set of stripes in the disk array before expansion that correspond to the stripes to be generated in the disk array after expansion, according to the preset expansion rules. The data in the stripe set includes the data in the stripes to be generated.

[0011] The read module is used to read data corresponding to the first type of data block in the stripe set from the disk array before expansion when there is an abnormal disk in the disk array before expansion. The disk corresponding to the first type of data block in the disk array before expansion is a disk other than the abnormal disk. The disk array before expansion has the ability to recover data from the abnormal disk.

[0012] The second determining module is used to obtain the data corresponding to the second type of data blocks in the stripe to be generated based on the data corresponding to all the first type of data blocks respectively. The disks corresponding to the second type of data blocks in the expanded disk array are other disks except for the abnormal disks.

[0013] The write module is used to write the data corresponding to each second type of data block in the stripe to be generated to the disk corresponding to each second type of data block in the expanded disk array, so as to complete the expansion of the disk array before expansion.

[0014] This application also provides an electronic device, including: a memory for storing a computer program; and a processor for executing the computer program to implement the steps of any of the disk array expansion methods described above.

[0015] This application also provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of any of the disk array expansion methods described above.

[0016] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the steps of any of the disk array expansion methods described above.

[0017] In this application, when there is an abnormal disk in the disk array before expansion, the data of the first type of data blocks is read from the normal disks. Based on the data corresponding to all the first type of data blocks, the data of the second type of data blocks is reconstructed and written to the corresponding disks. This ensures data consistency during the disk array expansion process and provides fault tolerance. By rationally allocating and reading/writing striped data, the reliability of the expansion is improved, and online expansion of the disk array is achieved even when a disk is abnormal.

Attached Figure Description

[0018] To more clearly illustrate the embodiments of this application, the accompanying drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0019] Figure 1 is a schematic diagram of a scenario in which a disk failure occurs during disk array expansion, as provided in an embodiment of this application;

[0020] Figure 2 is a flowchart of a disk array expansion method provided in an embodiment of this application;

[0021] Figure 3 is a schematic diagram of the data layout in a preset storage area provided in an embodiment of this application;

[0022] Figure 4 is a schematic diagram of the expansion process of stripe No. 2 provided in an embodiment of this application;

[0023] Figure 5 is a schematic diagram of the expansion process of stripe No. 3 provided in an embodiment of this application;

[0024] Figure 6 is a schematic diagram of the expansion process of stripe No. 4 provided in an embodiment of this application;

[0025] Figure 7 is a schematic diagram of the determination of data blocks to be migrated provided in an embodiment of this application;

[0026] Figure 8 is an overall flowchart of online disk array expansion provided in an embodiment of this application;

[0027] Figure 9 is a schematic diagram of a disk array expansion device provided in an embodiment of this application;

[0028] Figure 10 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application.

Detailed Implementation

[0029] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the protection scope of this application.

[0030] It should be noted that, in the description of this application, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. The terms "first," "second," etc., in this application are used to distinguish similar objects and are not used to describe a specific order or sequence.

[0031] To enable those skilled in the art to better understand the present application, the present application will be further described in detail below with reference to the accompanying drawings and specific embodiments.

[0032] First, the application scenarios of the embodiments of this application will be introduced by way of example.

[0033] Redundant Array of Independent Disks (RAID) improves storage performance and reliability by adding parity redundancy. RAID technology logically combines multiple physical disks to achieve storage capacity expansion, improved I/O concurrency, and data redundancy protection. Through striped data distribution, mirroring, or parity redundancy mechanisms, RAID arrays ensure data security without being limited by the capacity and performance of a single disk. RAID 5 introduces a single parity (P parity) block, so that when a single disk fails, the data on the failed disk can be recovered from the remaining business data and parity data. RAID 6 further adds a second parity block (P+Q parity), supporting data recovery when two disks fail simultaneously.

[0034] With the exponential growth of data volumes in data centers and other application scenarios, existing RAID storage capacity becomes insufficient to meet storage demands. For example, a data center initially configured with a RAID 5 array (such as an array with 2TB of usable capacity composed of three 1TB hard drives) may see its storage demand grow to 4TB as business expands, at which point the disk array no longer has enough capacity. In this situation, RAID arrays require flexible and efficient online expansion capabilities. RAID online expansion refers to the process of increasing storage capacity by adding new hard drives without data loss. Online expansion is performed while the RAID array is running: new hard drives are added and data is redistributed across them without affecting existing data and business operations, thereby increasing storage capacity. Furthermore, RAID expansion is used not only for capacity upgrades but also for performance optimization. By increasing the number of member disks, data is distributed across more disks for parallel read and write operations, improving the I/O throughput of RAID 5, RAID 6, and similar levels to meet the storage needs of high-concurrency services.

[0035] In related technologies, RAID online expansion typically includes the following steps. First, new disk integration is performed: a new hard drive is connected to the array using hot-add technology, and the identification and initial configuration of the new disk are completed. Second, striped data reconstruction is carried out: data blocks are read from the original disks stripe by stripe according to the striped data distribution rules of the expanded disk array, and the data distribution logic is reorganized according to those rules. Third, for redundancy levels such as RAID5 and RAID6, dynamic parity recalculation is performed: the parity value is recalculated based on the expanded data block set to ensure that the parity data is consistent with the new data layout. Finally, data-parity collaborative writing is performed: the reorganized data blocks and the updated parity blocks are atomically written to the expanded member disks according to the new RAID geometry, completing the logical mapping update of the physical storage space.

[0036] However, during RAID expansion, a sudden member disk failure or disk offline event forces the expansion process to terminate. At this point, the already expanded storage area (including the newly added member disks) and the original storage area that has not yet been migrated create a severe mismatch between logical volume metadata and physical topology. This topology fragmentation can lead to problems such as I/O path splitting, data consistency collapse, and irreversible expansion. I/O path splitting means that read/write requests cannot simultaneously adapt to the old and new stripe data distribution rules, resulting in routing conflicts and inaccessible data. Data consistency collapse means that the parity chain between migrated data blocks and the uninitialized areas of newly added disks is broken, leading to silent data errors. Irreversible expansion means that, for lack of an atomic metadata update mechanism, the array can neither be rolled back to its stable pre-expansion state nor continue the expansion operation from the remaining intermediate state.

[0037] Figure 1 is a schematic diagram of a scenario in which a disk failure occurs during disk array expansion. Figure 1 illustrates the expansion of a RAID 5 array from 3 disks (disks 0-2) to 6 disks (disks 0-5), focusing on the second stripe to be generated in the expanded array. In Figure 1, this stripe consists of data blocks D5-D9. The stripe set in the disk array before expansion that corresponds to this stripe to be generated includes the third, fourth, and fifth stripes (stripe numbers 2, 3, and 4 respectively). As shown in Figure 1, when disk 0 fails during the expansion process, the expansion operation stops abnormally and becomes irreversible, data consistency fails, and the expansion cannot continue.

[0038] Therefore, ensuring the continuity and data consistency of RAID online expansion in the event of disk anomalies is a key focus at present.

[0039] It should be noted that the disk array expansion method provided in this embodiment of the invention can be executed by a disk array expansion device. This device can be implemented as part or all of an electronic device through software, hardware, or a combination of both. The electronic device can be a server or a terminal. In this embodiment, the server can be a single server or a server cluster composed of multiple servers. The terminal can be a smartphone, personal computer, tablet computer, wearable device, or other intelligent hardware device such as a smart robot. The following method embodiments will use an electronic device as the execution subject for illustration.

[0040] According to an embodiment of the present invention, a disk array expansion method embodiment is provided. It should be noted that the steps shown in the flowchart in the accompanying drawings can be executed in a computer system such as a set of computer-executable instructions. Furthermore, although a logical order is shown in the flowchart, in some cases, the steps shown or described may be executed in a different order than that shown here.

[0041] Figure 2 is a flowchart of a disk array expansion method provided according to an embodiment of the present invention. As shown in Figure 2, the process includes:

[0042] S101. According to the preset expansion rules, determine the stripe set in the disk array before expansion that corresponds to the stripe to be generated in the disk array after expansion.

[0043] The data in the stripe set includes the data in the stripe to be generated.

[0044] Specifically, a stripe is a logical data unit in a RAID disk array: data is split into fixed-size pieces (e.g., 64KB or 128KB) that are stored in parallel across multiple physical disks. It is the core carrier for RAID parallel I/O and capacity aggregation. A stripe to be generated is a logical data unit that needs to be newly created in the expanded disk array. For example, if a disk array is expanded from 3 disks to 5 disks, the stripe numbered 4 in the expanded array is a stripe to be generated.

[0045] A stripe set refers to a combination of stripes in the original disk array that have a data association with the stripe to be generated. In other words, the data in the stripe set includes all the data required to generate the stripe. For example, stripe 4 to be generated integrates the data from stripes 2 and 3 in the original disk array; stripes 2 and 3 constitute the stripe set corresponding to stripe 4.

[0046] Preset expansion rules include, but are not limited to, stripe mapping relationships and data block allocation rules. For example, for an expansion from 3 disks to 5 disks, the preset expansion rule could be that each new stripe after expansion corresponds to two original stripes before expansion, and data blocks are associated in stripe sequence. Another example is that the new stripe sequence number N corresponds to the original stripe sequence number... (rounded down).
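As a non-limiting illustration of a stripe mapping relationship, the following Python sketch computes the stripe set for a stripe to be generated. It is not the elided formula of this paragraph. It assumes, hypothetically, that data blocks are numbered consecutively across stripes, that stripe numbers start at 0, and that each stripe holds (number of disks minus number of parity blocks) data blocks. The function name `stripe_set` and its parameters are illustrative only.

```python
def stripe_set(new_stripe: int, old_disks: int, new_disks: int,
               parity_blocks: int = 1) -> list[int]:
    """Return the pre-expansion stripe numbers that feed one post-expansion stripe.

    Assumption (not from the source): data blocks are numbered consecutively
    across stripes, and every stripe holds (disks - parity_blocks) data blocks.
    """
    old_data = old_disks - parity_blocks  # data blocks per pre-expansion stripe
    new_data = new_disks - parity_blocks  # data blocks per post-expansion stripe
    if old_data <= 0 or new_data < old_data:
        raise ValueError("expansion must keep or add data blocks per stripe")
    first_block = new_stripe * new_data      # first data block of the new stripe
    last_block = first_block + new_data - 1  # last data block of the new stripe
    # Floor division rounds down to the old stripe that holds each block.
    return list(range(first_block // old_data, last_block // old_data + 1))


# Figure 1 scenario: RAID 5 expanded from 3 to 6 disks, second new stripe (number 1).
figure1_set = stripe_set(1, old_disks=3, new_disks=6)
```

Under these assumptions the sketch reproduces the Figure 1 scenario, where the second stripe to be generated (D5-D9) draws on stripe numbers 2, 3, and 4, and it yields two original stripes per new stripe for the 3-disk to 5-disk example.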

[0047] S102, if there is an abnormal disk in the disk array before expansion, read the data corresponding to the first type of data block in the stripe set from the disk array before expansion.

[0048] Here, the disk corresponding to each first type of data block in the disk array before expansion is a disk other than the abnormal disk, and the disk array before expansion has the ability to recover the data of the abnormal disk.

[0049] Specifically, abnormal disks refer to disks in the disk array that experienced issues such as disk damage or offline status before the expansion.

[0050] S103. Based on the data corresponding to all the first type of data blocks, obtain the data corresponding to the second type of data blocks in the stripe to be generated.

[0051] Here, the disk corresponding to each second type of data block in the expanded disk array is a disk other than the abnormal disk.

[0052] Specifically, the first type of data blocks are the data blocks of the stripe set that are stored on the normal disks of the disk array before expansion. The second type of data blocks are the data blocks of the stripe to be generated that are allocated to the normal disks of the disk array after expansion. It can be understood that the data in a second type of data block can be either business data or parity data.

[0053] For example, the data in the first type of data blocks is merged, split, or used for parity calculation, and the second type of data blocks are generated according to the stripe size of the expanded disk array. For instance, data 1 and data 2 of stripe 2 are merged with data 3 and data 4 of stripe 3, and then split and reassembled according to the stripe rule of an expanded disk array containing 5 disks. At the same time, a new P/Q parity block is calculated, forming the second type of data blocks (including the parity block) of the stripe to be generated.

[0054] S104. Write the data corresponding to each second type of data block in the stripe to be generated to the disk corresponding to each second type of data block in the expanded disk array, so as to complete the expansion of the disk array before expansion.

[0055] For example, based on the logical address allocation of the expanded disk, each second type of data block is written to the corresponding normal disk. After the writing is completed, the array metadata is updated, and the expansion of the stripe to be generated is marked as complete. Then, all stripes to be generated are processed in sequence.

[0056] In this embodiment, when there is an abnormal disk in the disk array before expansion, the data of the first type of data blocks is read from the normal disks, and the data of the second type of data blocks is generated based on the data corresponding to all the first type of data blocks and written to the corresponding disks. This not only ensures data consistency during the disk array expansion process, but also provides fault tolerance. By rationally allocating and reading/writing striped data, the reliability of expansion is improved, and online expansion of the disk array is realized even when a disk is abnormal.
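As a non-limiting illustration of steps S101 to S104, the following Python sketch simulates the expansion of a small RAID 5 array in memory while one disk is abnormal. Several simplifications are assumptions rather than part of this application: blocks are single integers, the parity block of stripe s sits on disk s modulo the number of disks, data blocks are numbered consecutively, the expanded array is a separate in-memory structure (so in-place overwrite ordering is not modeled), and all function names are hypothetical.

```python
from functools import reduce
from operator import xor


def build_raid5(data, n_disks):
    """Lay out integer data blocks as RAID 5 rows; parity of stripe s is on disk s % n_disks."""
    per_stripe = n_disks - 1
    rows = []
    for start in range(0, len(data), per_stripe):
        chunk = list(data[start:start + per_stripe])
        chunk += [0] * (per_stripe - len(chunk))  # zero-pad a short final stripe
        row = chunk[:]
        row.insert((start // per_stripe) % n_disks, reduce(xor, chunk))
        rows.append(row)
    return rows


def read_stripe_data(stripe_no, row, abnormal):
    """S102: read only blocks on normal disks; rebuild the abnormal block if it is data."""
    n = len(row)
    parity_disk = stripe_no % n
    survivors = [row[d] for d in range(n) if d != abnormal]  # first type of data blocks
    return [reduce(xor, survivors) if d == abnormal else row[d]
            for d in range(n) if d != parity_disk]


def expand(old_rows, old_disks, new_disks, abnormal):
    """Generate every new stripe without reading from or writing to the abnormal disk."""
    old_per, new_per = old_disks - 1, new_disks - 1
    total = len(old_rows) * old_per
    new_rows = []
    for t in range(-(-total // new_per)):  # ceiling division: number of new stripes
        first, last = t * new_per, min((t + 1) * new_per, total) - 1
        lo = first // old_per
        pool = []
        for s in range(lo, last // old_per + 1):  # S101: stripe set of new stripe t
            pool += read_stripe_data(s, old_rows[s], abnormal)
        chunk = pool[first - lo * old_per:last - lo * old_per + 1]
        chunk += [0] * (new_per - len(chunk))
        row = chunk[:]
        row.insert(t % new_disks, reduce(xor, chunk))  # S103: recompute parity
        row[abnormal] = None  # S104: only second type of data blocks are written
        new_rows.append(row)
    return new_rows


def read_back(new_rows, new_disks, abnormal):
    """Degraded read of the expanded array, used here only to check consistency."""
    out = []
    for t, row in enumerate(new_rows):
        survivors = [b for d, b in enumerate(row) if d != abnormal]
        full = [reduce(xor, survivors) if d == abnormal else b for d, b in enumerate(row)]
        out += [b for d, b in enumerate(full) if d != t % new_disks]
    return out


data = list(range(1, 13))
expanded = expand(build_raid5(data, 3), old_disks=3, new_disks=5, abnormal=0)
```

In this sketch the block that would land on the abnormal disk is simply left unwritten. The expanded stripes remain recoverable because the parity of each new stripe is computed over the complete set of data blocks.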

[0057] In some embodiments, based on the foregoing embodiments, the disk array before expansion includes a parity disk, and the number of abnormal disks in the disk array before expansion is less than or equal to the number of parity disks.

[0058] In this way, the disk array before expansion includes a parity disk, and the number of abnormal disks does not exceed the number of parity disks. This ensures that the disk array is always within the effective redundancy protection range during the expansion process, has data recovery capability, and avoids problems such as unrecoverable data, expansion interruption, and topology inconsistency caused by the number of faulty disks exceeding the redundancy capacity.
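As a non-limiting illustration, the redundancy condition above can be expressed as a simple precondition check. The table `PARITY_BLOCKS` and the function name `can_expand_degraded` are hypothetical. The parity counts follow the description of RAID 5 (single parity) and RAID 6 (double parity) given earlier in this application.

```python
# Number of parity blocks per stripe for each level (RAID 5: P, RAID 6: P+Q).
PARITY_BLOCKS = {"RAID5": 1, "RAID6": 2}


def can_expand_degraded(level: str, abnormal_disks: list[int]) -> bool:
    """Return True if the number of abnormal disks stays within the redundancy of the level."""
    if level not in PARITY_BLOCKS:
        raise ValueError(f"unsupported RAID level: {level}")
    # Count each abnormal disk once, even if it is reported more than once.
    return len(set(abnormal_disks)) <= PARITY_BLOCKS[level]
```

For example, a RAID 5 array with one abnormal disk passes the check, whereas a RAID 5 array with two abnormal disks does not.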

[0059] In addition, when the disk array before expansion is a mirrored disk array, the disk array expansion method provided in this application embodiment can also be used to repair the data in the abnormal disk, thereby realizing the expansion of the disk array.

[0060] In some embodiments, based on any of the foregoing embodiments, the data corresponding to the second type of data blocks in the strip to be generated is obtained based on the data corresponding to all the first type of data blocks, specifically including the following steps:

[0061] a1. Determine the data type of the data in the first type of data block of the first stripe.

[0062] The first stripe is any one of the stripes in the stripe set.

[0063] Specifically, the data types include, but are not limited to, business data and parity data. For example, in a RAID5 stripe, data 1 and data 2 are business data, and P1 is parity data.

[0064] For example, the data type of the data in the data block can be identified by the data type identifier in the data block.

[0065] For example, the data type in the data block can be deduced by using the preset stripe data distribution rules corresponding to the disk array before expansion. For instance, in stripe 2 of a RAID5 array consisting of 3 disks, according to the "parity block cyclic arrangement" rule, the parity block is cyclically distributed on disk 0, disk 1, and disk 2. The parity block corresponding to stripe 2 is located on disk 2. Therefore, the data blocks on disk 0 and disk 1 are user data blocks, and the data blocks on disk 2 are parity blocks.

[0066] a2. Based on the data type of the first type of data block in the first stripe and the preset stripe data distribution rules corresponding to the disk array before expansion, determine the data type of the third type of data block in the first stripe.

[0067] The third type of data block consists of data blocks from the abnormal disk before the expansion.

[0068] Specifically, the preset stripe data distribution rules are the preset data arrangement rules of the disk array before expansion, which define where business data and parity data are placed within a stripe.

[0069] For example, taking a RAID5 array as an example, the disk array before expansion contains 3 disks (two data disks and one parity disk). The default stripe data distribution rule of the RAID5 array is "the stripe size (also called the stripe width) is 3, and the parity blocks are arranged cyclically according to the disk number". For example, the P parity data of stripe 1 is on disk 1, and the P parity data of stripe 2 is on disk 2.

[0070] For example, taking a RAID6 array as an example, the disk array before expansion contains 4 disks. The default stripe data distribution rule of RAID6 is "the stripe size is 4, the P parity data is fixed on the second to last disk, and the Q parity data is on the last disk".
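As a non-limiting illustration of steps a1 and a2, the following Python sketch deduces the data type of every block in a stripe from the two example distribution rules above, and then looks up the type of the block on the abnormal disk. The wrap-around of the RAID5 parity position (stripe number modulo the number of disks) is an assumption that extends the examples given, and the function names are hypothetical.

```python
def block_types(level: str, stripe: int, n_disks: int) -> list[str]:
    """Return the type ("data", "P" or "Q") of the block each disk holds in a stripe."""
    types = ["data"] * n_disks
    if level == "RAID5":
        # Rotating rule from the example: parity of stripe k sits on disk k,
        # assumed here to wrap around as k modulo n_disks.
        types[stripe % n_disks] = "P"
    elif level == "RAID6":
        # Fixed rule from the example: P on the second-to-last disk, Q on the last.
        types[-2], types[-1] = "P", "Q"
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    return types


def abnormal_block_type(level: str, stripe: int, n_disks: int, abnormal_disk: int) -> str:
    """Step a2: type of the third type of data block (the block on the abnormal disk)."""
    return block_types(level, stripe, n_disks)[abnormal_disk]


# Stripe 2 of a 3-disk RAID5 array: disks 0 and 1 hold data, disk 2 holds parity.
stripe2_types = block_types("RAID5", 2, 3)
```

Because the layout is deterministic, the type of the block on the abnormal disk can be decided without reading that disk.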

[0071] a3. After determining the data type of the third type of data block in all stripes in the stripe set, the data corresponding to the second type of data block in the stripe to be generated is obtained based on the data in the data block of the stripe set whose data type is a preset data type.

[0072] For example, the preset data type can be business data or user data.

[0073] In one possible implementation, after determining the data type of the third type of data blocks in all stripes of the stripe set, and before obtaining the data corresponding to the second type of data blocks in the stripe to be generated based on the data in the data blocks of the stripe set whose data type is a preset data type, the method provided in this application embodiment further includes the following:

[0074] If the data type of the data in the third type of data block of the first stripe is the preset data type, the data in the third type of data block of the first stripe is repaired according to the preset repair rules, based on the data in the first type of data blocks of the first stripe, thereby obtaining the data in the third type of data block of the first stripe.

[0075] Specifically, the preset repair rules are used to repair the data in the third type of data blocks on the abnormal disk using the data in the first type of data blocks on the normal disk.

[0076] For example, taking a RAID5 array as an example, the preset repair rule can be the XOR recovery method. In a three-disk array, the data in the third type of data block can be obtained by performing an XOR operation on the data in the other two data blocks of the same stripe.
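As a non-limiting illustration of the XOR recovery method, the following Python sketch computes a parity block and then rebuilds a lost block from the surviving blocks of the same stripe. The helper name `xor_blocks` and the sample byte values are hypothetical.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks byte by byte."""
    if not blocks:
        raise ValueError("at least one block is required")
    if any(len(b) != len(blocks[0]) for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(xor, column) for column in zip(*blocks))


data1 = b"\x12\x34"
data2 = b"\xab\xcd"
parity = xor_blocks([data1, data2])      # P = data1 XOR data2
recovered = xor_blocks([data2, parity])  # data1 = data2 XOR P
```

The same helper works for stripes with more than three blocks, since the lost block equals the XOR of all surviving blocks of the stripe.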

[0077] For example, taking a RAID6 array as an example, the preset repair rule can be the Reed-Solomon recovery method. That is, the data in the third type of data block on the abnormal disk can be recovered from the blocks of the same stripe that remain readable (business data, P parity data, and Q parity data); any n-2 of the n blocks in a stripe are sufficient, where n is the total number of disks in the disk array. For example, the first stripe contains data from four disks (data A from disk 1, data B from disk 2, P parity data from disk 3, and Q parity data from disk 4). Assuming disk 2 is an abnormal disk and data B is business data (i.e., a preset data type), data B can be obtained by solving the Reed-Solomon parity equations using data A, the P parity data, and the Q parity data. The specific implementation process is as follows. First, based on the Reed-Solomon coding principle used in RAID6, a Galois field polynomial is constructed with the data in each disk's data block as coefficients. The known data A, P parity data, and Q parity data are substituted into the corresponding parity equations to form a system of linear equations containing the unknown data B. Then, the system of equations is solved within the finite field: redundant terms are eliminated through matrix elimination or inverse calculation, uniquely determining the value of data B. Finally, the calculated result is used as the data in the corresponding data block of disk 2, completing the reconstruction of business data B on the abnormal disk and thus yielding the complete data to be migrated in the first stripe.
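As a non-limiting illustration of solving the parity equations in a finite field, the following Python sketch recovers one lost data block from the Q parity data. The field parameters are assumptions, not taken from this application: GF(2^8) with the reduction polynomial 0x11d and generator 2, which is one commonly used choice for RAID 6. With these parameters, P is the XOR of the data blocks and Q is the sum of g^i times data block i. All function names are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D  # reduce back into 8 bits
        b >>= 1
    return result


def gf_pow(a: int, n: int) -> int:
    """Raise a to the n-th power in GF(2^8) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse: a^254, because a^255 = 1 for every non-zero a."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    return gf_pow(a, 254)


def compute_pq(data: list[bytes]) -> tuple[bytes, bytes]:
    """P is the XOR of all data blocks; Q is the sum of 2^i * data[i] in GF(2^8)."""
    p, q = bytearray(len(data[0])), bytearray(len(data[0]))
    for i, block in enumerate(data):
        coeff = gf_pow(2, i)
        for k, byte in enumerate(block):
            p[k] ^= byte
            q[k] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_from_q(data: list, lost: int, q: bytes) -> bytes:
    """Solve Q = sum(2^i * data[i]) for data[lost]; data[lost] itself is never read."""
    partial = bytearray(q)
    for i, block in enumerate(data):
        if i == lost:
            continue
        coeff = gf_pow(2, i)
        for k, byte in enumerate(block):
            partial[k] ^= gf_mul(coeff, byte)  # remove the known terms
    inverse = gf_inv(gf_pow(2, lost))
    return bytes(gf_mul(inverse, byte) for byte in partial)


data_a, data_b = b"\x01\x02", b"\x10\x20"
p_parity, q_parity = compute_pq([data_a, data_b])
rebuilt_b = recover_from_q([data_a, None], lost=1, q=q_parity)
```

When only one data block is lost and the P parity data is readable, a plain XOR with P gives the same result. The Q path shown here is what the finite-field solution reduces to for a single unknown.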

[0078] In this way, by determining whether the data type of the third type of data block in the first stripe is the preset data type, repair is triggered only when the data in the third type of data block is business data, which avoids wasting resources. This ensures the integrity of business data during the expansion process and provides a complete and reliable data source for generating the second type of data blocks. It further enhances the fault tolerance of the expansion process, ensures the consistency and integrity of data after expansion, and reduces the risk of expansion failure caused by abnormal disks.

[0079] In one possible implementation, in a3 above, the data corresponding to the second type of data block in the stripe to be generated is obtained in the following way:

[0080] b1. Obtain the first position information of the first data block within its corresponding first stripe in the stripe set, and the second position information of the second data block within its corresponding second stripe in the stripe set.

[0081] The first data block is the first data block in the stripe to be generated whose data type is a preset data type, and the second data block is the last data block in the stripe to be generated whose data type is a preset data type.

[0082] Specifically, the first data block is the first data block in the stripe to be generated with a preset data type, serving as the reference for locating the starting position of the data to be migrated in the disk array before expansion.

[0083] The first stripe refers to the original stripe in the stripe set that has a data association with the first data block, serving as the data source for the first data block. For example, if the stripe set is stripe 2 and stripe 3, and the data of the first data block originates from stripe 2, then stripe 2 is the first stripe. The first location information refers to the specific storage location of the data in the first data block within its corresponding first stripe, including but not limited to the data block number within the stripe and the disk logical address.

[0084] The second data block refers to the last data block in the stripe to be generated whose data type is a preset data type, and is used as a reference to locate the end position of the data to be migrated.

[0085] The second stripe is the stripe in the stripe set that has a data association with the second data block, serving as the data source for the second data block. The second location information refers to the specific storage location of the data within the second data block in the corresponding second stripe.

[0086] b2, based on the preset data type, the first position information and the second position information, determine the data block to be migrated corresponding to the stripe to be generated from the data blocks of the preset data type in the stripe set.

[0087] Optionally, based on a preset data type, first position information, and second position information, the data block to be migrated corresponding to the stripe to be generated is determined from the data blocks of the preset data type in the stripe set. This specifically includes the following steps:

[0088] First, based on the first and second location information, all data blocks corresponding to the first and second data blocks in the stripe set are determined.

[0089] Then, the first data block, the second data block, and all data blocks corresponding to the first data block and the second data block in the stripe set are determined as candidate data blocks corresponding to the stripe to be generated.

[0090] Specifically, the candidate data blocks are all data blocks in the stripe set from the original position of the first data block to the original position of the second data block, both included.

[0091] Finally, candidate data blocks with the preset data type are identified as the data blocks to be migrated corresponding to the stripes to be generated.

[0092] For example, the stripe set is sorted by stripe number, the "first stripe - data block number" corresponding to the first position information is used as the starting point, and the "second stripe - data block number" corresponding to the second position information is used as the ending point. All data blocks of the preset data type within this continuous range are selected as data blocks to be migrated.

[0093] For example, with the range starting at the second data block of stripe 2 and ending at the third data block of stripe 3, all data blocks of the preset data type are selected from the second data block of stripe 2 and the first to third data blocks of stripe 3.

[0094] In this way, the range of candidate data blocks is first determined based on the first and second location information, and then the data blocks to be migrated are selected by pre-defined data types, so as to avoid redundant data blocks such as data blocks storing verification data being mixed in during the expansion process.

[0095] Optionally, the total number of data blocks to be migrated is determined based on the number of second-type data blocks in the stripe to be generated. Then, starting from the first location information, data blocks of the preset data type are selected sequentially in stripe order in the disk array before expansion until the required number is reached. At the same time, the location information of the last selected data block is checked against the second location information to ensure accurate selection.

[0096] For example, the number of second-type data blocks in the stripe to be generated is 4, so the total number of data blocks to be migrated is 4. Starting from the second block of stripe 2, the second block of stripe 2 and the first to third blocks of stripe 3 are selected in sequence, for a total of 4, and the position of the last block matches the second position information.

[0097] b3, based on the data in the data block to be migrated corresponding to the stripe to be generated, obtain the data corresponding to the second type of data block in the stripe to be generated.
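The selection in step b2 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `Block` record, the function name `select_blocks_to_migrate`, the 0-based block positions, and the 3-disk rotating-parity layout are all assumptions introduced for the example.

```python
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    stripe: int  # stripe number in the pre-expansion array
    index: int   # block position within that stripe (0-based in this sketch)
    kind: str    # "data" stands for the preset data type, "parity" for check data
    label: str   # human-readable name, for illustration only


def select_blocks_to_migrate(
    stripe_set: List[Block],
    first_pos: Tuple[int, int],
    second_pos: Tuple[int, int],
    preset_kind: str = "data",
) -> List[Block]:
    """Return blocks of the preset type between first_pos and second_pos (inclusive)."""
    ordered = sorted(stripe_set, key=lambda b: (b.stripe, b.index))
    # Candidate range: every block from the start position to the end position.
    candidates = [b for b in ordered if first_pos <= (b.stripe, b.index) <= second_pos]
    # Keep only the preset data type so parity blocks are not mixed in.
    return [b for b in candidates if b.kind == preset_kind]


# Hypothetical 3-disk layout with rotating parity (assumed for this example).
layout = [
    Block(2, 0, "data", "D4"), Block(2, 1, "data", "D5"), Block(2, 2, "parity", "P"),
    Block(3, 0, "data", "D6"), Block(3, 1, "parity", "P"), Block(3, 2, "data", "D7"),
    Block(4, 0, "parity", "P"), Block(4, 1, "data", "D8"), Block(4, 2, "data", "D9"),
]
selected = select_blocks_to_migrate(layout, first_pos=(2, 1), second_pos=(4, 2))
print([b.label for b in selected])  # ['D5', 'D6', 'D7', 'D8', 'D9']
```

The count check described in [0095] would amount to comparing `len(selected)` with the number of second-type data blocks that hold the preset data type.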

[0098] Optionally, if the capacity of the data blocks to be migrated is the same as the capacity of the second type of data blocks, the data in the second type of data blocks containing preset data types in the stripe to be generated is determined based on the order of each data block to be migrated in the disk array before expansion. Here, the data blocks containing preset data types in the stripe to be generated may include data blocks from abnormal disks. For data blocks containing preset data types from abnormal disks, the data in the corresponding data blocks to be migrated is not written to the expanded disk array.

[0099] For example, the number of data blocks to be migrated matches the number of data blocks to be generated that store data of a preset data type, and the data in each second type of data block is directly assigned according to the order of each data block to be migrated in the disk array before expansion.

[0100] Optionally, if the capacity of the data blocks to be migrated is inconsistent with the capacity of the second type of data blocks, the data in each data block to be migrated is processed according to the capacity corresponding to the second type of data blocks (e.g., splitting or merging) to obtain processed data. The order of each processed data is determined based on the order of the data blocks to be migrated in the disk array before expansion. Further, based on the order of each processed data and each processed data, the data in the second type of data blocks storing data of a preset data type is determined. Similarly, the data blocks storing data of a preset data type in the stripe to be generated may include data blocks from abnormal disks. For data blocks storing data of a preset data type from abnormal disks, the data in the corresponding data blocks to be migrated is not written to the expanded disk array.

[0101] For example, two 128KB data blocks to be migrated need to be converted into four 64KB second-type data blocks storing data of a preset data type. The data in each data block to be migrated is split into two 64KB blocks. Then, based on the order of the data blocks in the disk array before expansion, the order of the split data is determined. Further, based on the order of the split data and the individual split data, the second-type data blocks storing the preset data type are determined.

[0102] For example, two 64KB data blocks to be migrated need to be converted into a 128KB second-type data block that stores data of a preset data type. The two data blocks to be migrated are merged into one 128KB data block. Then, the order of each merged data is determined according to the order of the data blocks to be migrated in the disk array before the expansion. Based on the order of each merged data and each merged data, the second-type data block that stores data of a preset data type is determined.
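The splitting and merging cases above can both be expressed as re-slicing one ordered byte stream. The following is a minimal sketch under that assumption; the function name `resize_blocks` is hypothetical, and the input blocks are assumed to be already arranged in their order in the disk array before expansion.

```python
from typing import List

KB = 1024


def resize_blocks(blocks: List[bytes], target_size: int) -> List[bytes]:
    """Re-slice ordered source blocks into blocks of target_size bytes."""
    # Concatenating in pre-expansion order preserves the data order.
    stream = b"".join(blocks)
    if len(stream) % target_size != 0:
        raise ValueError("total size is not a multiple of the target block size")
    return [stream[i:i + target_size] for i in range(0, len(stream), target_size)]


# Splitting: two 128KB blocks become four 64KB blocks.
split = resize_blocks([b"A" * (128 * KB), b"B" * (128 * KB)], 64 * KB)
# Merging: two 64KB blocks become one 128KB block.
merged = resize_blocks([b"A" * (64 * KB), b"B" * (64 * KB)], 128 * KB)
print(len(split), len(merged))  # 4 1
```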

[0103] In this way, based on the first location information corresponding to the first data block and the second location information corresponding to the second data block, the data block to be migrated is accurately located from the stripe set, ensuring the consistency between the data and metadata in the stripe to be generated.

[0104] Optionally, b3 above includes the following:

[0105] c1, based on the data in the data block to be migrated corresponding to the stripe to be generated, determine the first type of verification data corresponding to the stripe to be generated.

[0106] The first type of verification data is used to repair the data in the stripe to be generated.

[0107] Specifically, the first type of verification data is the redundant parity data in the stripe to be generated, used to ensure the recoverability of the data in the stripe. For example, for a RAID5 array, the stripe to be generated requires one set of P parity data (the first type of verification data) to recover data when any one disk in the stripe fails. Similarly, for a RAID6 array, two sets of parity data, P and Q, are required to support recovery from the failure of two disks.

[0108] For example, taking a RAID5 array as an example, the first type of parity data is P-parity data. Based on the data in the data block to be migrated corresponding to the stripe to be generated, a bitwise XOR operation is performed, and the result is the P-parity data.

[0109] For example, taking a RAID6 array as an example, the first type of parity data includes P-parity data and Q-parity data. Based on the data in the data blocks to be migrated corresponding to the stripe to be generated, an XOR operation is performed to obtain the P-parity data. At the same time, based on polynomial operations in the Galois field, encoding calculations are performed on all data blocks to be migrated to obtain the Q-parity data.
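The two parity computations can be sketched as follows. The text does not specify the Galois field parameters, so this sketch assumes the convention commonly used for RAID6, namely GF(2^8) with the reduction polynomial 0x11D and generator 2, with Q computed as the sum of g^i times D_i. The function names are hypothetical.

```python
from typing import List


def p_parity(blocks: List[bytes]) -> bytes:
    """Bitwise XOR of all equally sized blocks (RAID5/RAID6 P parity)."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        if len(block) != len(out):
            raise ValueError("all blocks must have the same size")
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def gf_mul2(x: int) -> int:
    """Multiply by the generator 2 in GF(2^8), polynomial 0x11D (assumed)."""
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
    return x & 0xFF


def q_parity(blocks: List[bytes]) -> bytes:
    """Q = D0 + 2*D1 + 4*D2 + ... in GF(2^8), evaluated with Horner's rule."""
    out = bytearray(len(blocks[0]))
    for block in reversed(blocks):
        if len(block) != len(out):
            raise ValueError("all blocks must have the same size")
        for i in range(len(out)):
            out[i] = gf_mul2(out[i]) ^ block[i]
    return bytes(out)


data = [b"\x01\x02", b"\x03\x04"]
p = p_parity(data)
print(p.hex())  # 0206
# A single lost data block is recovered by XOR-ing P with the surviving blocks.
print(p_parity([p, data[1]]) == data[0])  # True
```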

[0110] c2, based on the first type of verification data and the data in the data block to be migrated corresponding to the stripe to be generated, obtain the data corresponding to the second type of data block in the stripe to be generated according to the preset stripe data distribution rules corresponding to the expanded disk array.

[0111] Specifically, the preset stripe data distribution rules corresponding to the expanded disk array refer to the location allocation logic of business data and verification data in the expanded disk array (such as the verification data being in a fixed position or a cyclic position, the order in which business data is filled, etc.).

[0112] In one possible scenario, based on the first type of verification data and the data in the data block to be migrated corresponding to the stripe to be generated, the data corresponding to the second type of data block in the stripe to be generated is obtained according to the preset stripe data distribution rules corresponding to the expanded disk array. The specific steps include the following:

[0113] First, based on the preset stripe data distribution rules, the third position information of the data block corresponding to the first type of verification data in the stripe to be generated is determined.

[0114] The third location information refers to the specific storage location of the first type of verification data within the stripe to be generated, including but not limited to the data block sequence number, disk number, and logical address within the stripe. For example, taking a RAID5 array with 5 disks after expansion as an example, the preset stripe data distribution rule is that the parity data is arranged cyclically. The first type of verification data (P4) of the stripe to be generated is allocated to the 5th data block of disk 5, corresponding to the logical address 0x00200000. Then, disk 5 - data block 5 - address 0x00200000 is the third location information.

[0115] Then, based on the position information of the data block to be migrated corresponding to the stripe to be generated in the stripe set, and the third position information, the data corresponding to each second type of data block in the stripe to be generated is determined from the first type of verification data and the data of the data block to be migrated corresponding to the stripe to be generated.

[0116] Specifically, the original storage location identifier of the data block to be migrated in the stripe set before expansion includes the original stripe number, the block number within the original stripe, and the original disk logical address. For example, if the data in the data block to be migrated comes from the second data block of stripe 2 in the stripe set (the logical address of this data block on disk 1 is 0x00100000), the above information is the location information of the data block to be migrated.

[0117] In this way, by accurately deriving the third location information of the first type of verification data through the preset stripe data distribution rules, and then combining it with the original location information of the data block to be migrated, the business data and verification data in the stripe to be generated can be accurately allocated, avoiding data distribution chaos during the expansion process and enhancing the reliability and data accuracy of disk array expansion.

[0118] For example, the capacity of the data block to be migrated is the same as the capacity of the second type of data block. Based on the third location information, the first type of parity data is stored in the corresponding data block of the expanded disk array. Then, the data in the data block to be migrated is sequentially filled into the second type of data block according to the order of each data block to be migrated in the disk array before expansion (i.e., the location information). For example, taking a RAID5 array with 5 disks as an example, the allocation logic for the data 1 in data block 1 to be migrated, the data 2 in data block 2 to be migrated, the data 3 in data block 3 to be migrated, the data 4 in data block 4 to be migrated, and the first type of parity data P4 is as follows: write data 1 to the second type of data block in disk 1, write data 2 to the second type of data block in disk 2, write data 3 to the second type of data block in disk 3, write data 4 to the second type of data block in disk 4, and write P4 to the second type of data block in disk 5.

[0119] It should be noted that when the data block in the stripe to be generated corresponding to the first type of verification data is determined to belong to an abnormal disk based on the preset stripe data distribution rules, the writing of the first type of verification data is skipped. Similarly, when the data block in the stripe to be generated corresponding to the data block to be migrated is determined to belong to an abnormal disk based on the preset stripe data distribution rules, the writing of the data in the data block to be migrated is skipped. Data blocks in the stripe to be generated that store data of a preset data type may include data blocks from abnormal disks. For data blocks on abnormal disks that store data of a preset data type, the data in the corresponding data blocks to be migrated is also not written to the expanded disk array.
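The allocation and skipping behavior described above can be sketched as follows. This is an illustration only: the function name `plan_stripe_writes` is hypothetical, disks are numbered from 0 rather than from 1, and a single parity block per stripe (RAID5) is assumed.

```python
from typing import Dict, List, Set


def plan_stripe_writes(
    data_blocks: List[bytes],
    parity: bytes,
    parity_disk: int,
    num_disks: int,
    abnormal_disks: Set[int],
) -> Dict[int, bytes]:
    """Map disk number -> payload for the writes that should actually be issued."""
    if len(data_blocks) != num_disks - 1:
        raise ValueError("a RAID5 stripe holds num_disks - 1 data blocks")
    writes: Dict[int, bytes] = {}
    remaining = iter(data_blocks)  # data fills the non-parity slots in order
    for disk in range(num_disks):
        # The slot is assigned first so the data order stays intact...
        payload = parity if disk == parity_disk else next(remaining)
        # ...and the write is skipped afterwards if the disk is abnormal.
        if disk in abnormal_disks:
            continue
        writes[disk] = payload
    return writes


blocks = [b"data1", b"data2", b"data3", b"data4"]
plan = plan_stripe_writes(blocks, b"P4", parity_disk=4, num_disks=5, abnormal_disks={1})
print(sorted(plan))  # [0, 2, 3, 4]
```

Because the slot for the abnormal disk is still consumed, the blocks on the remaining disks keep the positions that the distribution rule assigns to them, and the skipped block stays recoverable from the parity.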

[0120] In this way, the first type of verification data is calculated based on the data in the data block to be migrated, and combined with the preset stripe data distribution rules corresponding to the expanded disk array, the expanded disk array is ensured to have fault recovery capability, thereby improving the overall reliability of the expanded disk array.

[0121] In this embodiment, after determining the data type of the data in the first type of data block, the data type of the third type of data block on the abnormal disk is derived based on the preset stripe data distribution rules of the disk array before expansion. Data of the preset data type is then extracted through methods such as data integration, splitting, and data verification and recovery. This solves the problem of the abnormal disk being unreadable, ensures the accuracy of data reconstruction, and allows for the extraction of complete data without relying on the abnormal disk. This avoids the interruption of the expansion process caused by data loss in the abnormal disk in related technologies, ensures the integrity and consistency of data during the expansion process, and further enhances the fault tolerance of the disk array expansion process.

[0122] In some embodiments, based on any of the foregoing embodiments, determining, according to a preset expansion rule, the stripe set in the disk array before expansion that corresponds to the stripe to be generated in the expanded disk array specifically includes the following steps:

[0123] d1, obtain the first stripe number, the first stripe size corresponding to the disk array before expansion, and the second stripe size corresponding to the expanded disk array.

[0124] The first stripe number indicates the stripe to be generated.

[0125] Specifically, the first stripe number is the identifier of the stripe to be generated in the expanded disk array. In this embodiment, the stripe numbers in the disk array start from 0.

[0126] Stripe size, also known as stripe width, refers to the total number of disks in a disk array. For example, if the disk array before expansion contained 3 disks, then the first stripe size would be 3.

[0127] d2, based on the first stripe number, the first stripe size, and the second stripe size, determine the starting stripe and the ending stripe in the stripe set.

[0128] In one possible implementation, the starting and ending stripes in the stripe set are determined based on the first stripe number, the first stripe size, and the second stripe size, respectively, specifically including the following:

[0129] First, based on the first stripe number and the second stripe size, determine the starting logical block number of the stripe to be generated in the expanded disk array.

[0130] Specifically, the starting logical block number refers to the global number of the first data block in the expanded disk array where the stripe to be generated is located, serving as the base address for data addressing.

[0131] For example, the starting logical block number is determined by the following formula (applicable to disk arrays with logical block numbering starting from 0): Starting logical block number = (first stripe number - 1) × second stripe size.

[0132] For example, assuming the first stripe number is 4 and the second stripe size is 5, the starting logical block number is (4-1)×5=15.

[0133] Then, the starting logical block number is divided by the first stripe size to obtain the second stripe number.

[0134] The second stripe number is used to indicate the starting stripe.

[0135] In one possible implementation, if the ratio of the starting logical block number to the first stripe size is an integer, the ratio is determined as the second stripe number.

[0136] For example, if the starting logical block number is 21 and the first stripe size is 3, the ratio is an integer, so 21 / 3 = 7 is used as the second stripe number.

[0137] Alternatively, if the ratio of the starting logical block number to the first stripe size is not an integer, the ratio is rounded down, and the rounded-down result is determined as the second stripe number.

[0138] For example, if the starting logical block number is 22 and the first stripe size is 3, the ratio is not an integer, so the integer part of 22 / 3, namely 7, is determined as the second stripe number.

[0139] In this way, the second stripe number is determined by the ratio of the starting logical block number to the first stripe size, which keeps the stripe mapping before and after expansion accurate.

[0140] Next, the starting logical block number and the second stripe size are summed to obtain the summed starting logical block number.

[0141] Finally, the summed starting logical block number is divided by the first stripe size to obtain the third stripe number.

[0142] The third stripe number is used to indicate the ending stripe.

[0143] In this way, the set of stripes corresponding to the stripe to be generated can be quickly determined based on the stripe number and stripe size, realizing a one-to-one correspondence between stripes and data blocks before and after expansion, ensuring that the data migration range is complete without omissions, overlaps, or misalignments, and avoiding problems such as data distribution switching and address offset errors during the expansion process.

[0144] d3, based on the starting stripe and the ending stripe, obtain the stripe set.

[0145] In this embodiment, the starting and ending stripes in the disk array before expansion are located by the first stripe number, the first stripe size, and the second stripe size, and finally form a stripe set, thereby improving the accuracy of disk array expansion.
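Steps d1 to d3 can be sketched as follows. The sketch transcribes the formulas exactly as stated above, including the (first stripe number - 1) term and the treatment of the third stripe number as the ending stripe; it does not settle how these interact with the stripe numbering base or with whether the stripe size counts parity blocks. The function name `locate_stripe_set` and the choice to round the third stripe number down are assumptions.

```python
from typing import List, Tuple


def locate_stripe_set(
    first_stripe_number: int,
    first_stripe_size: int,
    second_stripe_size: int,
) -> Tuple[int, List[int]]:
    """Return (starting logical block number, stripe numbers in the stripe set)."""
    # Starting logical block number of the stripe to be generated.
    start_lbn = (first_stripe_number - 1) * second_stripe_size
    # Second stripe number: the ratio, rounded down when it is not an integer.
    start_stripe = start_lbn // first_stripe_size
    # Third stripe number: derived from the summed starting logical block number.
    end_stripe = (start_lbn + second_stripe_size) // first_stripe_size
    return start_lbn, list(range(start_stripe, end_stripe + 1))


start_lbn, stripes = locate_stripe_set(
    first_stripe_number=4, first_stripe_size=3, second_stripe_size=5
)
print(start_lbn, stripes)  # 15 [5, 6]
```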

[0146] In some embodiments, based on the foregoing embodiments, the first location information is the block offset of the data block corresponding to the first data block in the first stripe; obtaining the first location information of the first data block in the first stripe corresponding to the stripe set specifically includes the following:

[0147] The first position information is obtained by taking the starting logical block number modulo the first stripe size.

[0148] For example, if the starting logical block number is 20 and the first stripe size is 3, then the first position information is 20 mod 3 = 2.

[0149] In this way, the first location information is determined as the block offset of the data block within the stripe. The first location information is obtained by taking the remainder of the starting logical block number with respect to the size of the first stripe. This achieves accurate mapping of the data block's position within the stripe before and after expansion, avoiding problems such as data misalignment, offset errors, and block address confusion. It provides a stable and reliable location basis for subsequent screening of data blocks to be migrated and construction of stripes to be generated.

[0150] In some embodiments, based on any of the foregoing embodiments, the second location information is the block offset of the data block corresponding to the second data block in the second stripe; obtaining the second location information of the second data block in the second stripe corresponding to the stripe set specifically includes the following:

[0151] The second position information is obtained by taking the summed starting logical block number modulo the first stripe size.

[0152] In this embodiment, the second location information is the block offset of the data block corresponding to the second data block in the second stripe. Combined with the remainder operation of the summed starting logical block number and the size of the first stripe, the position of the second data block in the disk array before expansion is quickly and accurately located.
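The two remainder operations can be sketched together as follows. The function name `block_offsets` is hypothetical, and the second stripe size of 5 in the usage example is an assumed value; only the first offset of 2 comes from the example in the text.

```python
from typing import Tuple


def block_offsets(
    start_lbn: int, first_stripe_size: int, second_stripe_size: int
) -> Tuple[int, int]:
    """Return (first position information, second position information)."""
    # Block offset of the first data block within the first stripe.
    first = start_lbn % first_stripe_size
    # Block offset derived from the summed starting logical block number.
    second = (start_lbn + second_stripe_size) % first_stripe_size
    return first, second


first_pos, second_pos = block_offsets(
    start_lbn=20, first_stripe_size=3, second_stripe_size=5
)
print(first_pos, second_pos)  # 2 1
```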

[0153] In some embodiments, when there are abnormal disks in the disk array before expansion, after reading the data corresponding to the first type of data block in the stripe set from the disk array before expansion, the method provided in this application embodiment further includes the following:

[0154] First, the data corresponding to the data blocks to be migrated in the stripe set is stored in the preset storage area.

[0155] Specifically, the preset storage area refers to the pre-allocated cache space used to temporarily store the read data to be migrated, avoiding repeated disk reads.

[0156] Then, a first mapping relationship is constructed between the address pointer of each data in the preset storage area and the corresponding data block to be migrated, and the first mapping relationship is stored in the preset storage area.

[0157] Specifically, the address pointer is used to indicate the storage location of data in a preset storage area.

[0158] For example, the first mapping relationship is constructed using the address pointer as the key and the identification information of the data block to be migrated (such as the stripe number and data block number) as the value.

[0159] In this way, by temporarily storing the data to be migrated in a preset storage area and establishing the first mapping relationship between address pointers and the data blocks to be migrated, centralized cache management and rapid location of data are achieved during the expansion process. In degraded scenarios with faulty disks, this effectively reduces repeated I/O access to the disks, reduces array load, and avoids data read interruptions caused by disk failures.

[0160] Furthermore, based on the position information of the data block to be migrated corresponding to the stripe to be generated in the stripe set, and the third position information, after determining the data corresponding to each second type of data block in the stripe to be generated from the first type of verification data and the data of the data block to be migrated corresponding to the stripe to be generated, the method provided in this application embodiment further includes the following:

[0161] Based on the data corresponding to each second type of data block in the strip to be generated, the first mapping relationship is adjusted to obtain the second mapping relationship, and the second mapping relationship is stored in the preset storage area.

[0162] In this way, the first mapping relationship is adjusted based on the second type of data blocks to form the second mapping relationship, ensuring a one-to-one correspondence between the data in the preset storage area and its location in the expanded disk array. This avoids secondary addressing calculations after data reorganization when a disk is abnormal during expansion, and achieves unified management of data and mapping relationships.

[0163] Furthermore, before writing the data corresponding to each of the second type of data blocks in the stripe to be generated into the disks corresponding to each of the second type of data blocks in the expanded disk array, the method provided in this application embodiment also includes the following:

[0164] Based on the second mapping relationship, the data corresponding to each second type of data block is obtained from the preset storage area and written to the disks corresponding to each second type of data block in the expanded disk array.

[0165] Figure 3 is a schematic diagram of the data layout in the preset storage area. The entire preset storage area is divided into fixed-size 4KB data pages. The first 4KB data page serves as the mapping relationship storage area, and subsequent 4KB data pages are used to store the actual data. After reading the first type of data blocks in the disk array before expansion and determining the data corresponding to the data blocks to be migrated, the data corresponding to the data blocks to be migrated in the stripe set is stored into the subsequent 4KB data pages respectively. At the same time, the first mapping relationship is constructed and stored in the first 4KB data page. This mapping relationship is represented by a series of 64-bit address pointers (i.e., pointer 0, pointer 1, pointer 2... pointer n), each pointer pointing to a 4KB data page storing the data block to be migrated, thus establishing the association between the cache address and the data block to be migrated.

[0166] After determining the second type of data block to be generated based on the location information and third location information of the data block to be migrated, there is no need to move the actual 4KB data content. The first mapping relationship can be adjusted and the second mapping relationship can be constructed simply by swapping the arrangement order of each pointer in the first 4KB data page. Finally, the second mapping relationship is also stored in the first 4KB data page, thus completing the reconstruction of the data logical layout.

[0167] Before writing to the expanded disk array, based on the second mapping relationship, the pointer sequence in the first 4KB data page is directly traversed. According to the pointing order of each pointer, the data corresponding to each second type of data block is obtained from the corresponding 4KB data page in sequence, and then accurately written to the corresponding disk in the expanded disk array, so as to realize the orderly migration and writing of data.
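The pointer-page idea can be sketched as follows. This is a simplified model under stated assumptions: the class name `StagingArea` and its methods are hypothetical, list indices stand in for the 64-bit address pointers, and the limit of 512 pointers follows only from dividing one 4KB page by 8 bytes per pointer.

```python
from typing import List

PAGE_SIZE = 4096
MAX_POINTERS = PAGE_SIZE // 8  # one 4KB page of 64-bit pointers


class StagingArea:
    """Models a pointer page followed by 4KB data pages."""

    def __init__(self) -> None:
        self.pointers: List[int] = []  # stands in for the first 4KB page
        self.pages: List[bytes] = []   # stands in for the subsequent data pages

    def store(self, data: bytes) -> int:
        """Store one block to be migrated and record it in the first mapping."""
        if len(data) > PAGE_SIZE or len(self.pointers) >= MAX_POINTERS:
            raise ValueError("block too large or pointer page full")
        self.pages.append(data.ljust(PAGE_SIZE, b"\x00"))
        self.pointers.append(len(self.pages) - 1)
        return len(self.pointers) - 1

    def remap(self, new_order: List[int]) -> None:
        """Build the second mapping by permuting pointers; pages are not moved."""
        if sorted(new_order) != list(range(len(self.pointers))):
            raise ValueError("new_order must be a permutation of the slots")
        self.pointers = [self.pointers[slot] for slot in new_order]

    def read_in_order(self) -> List[bytes]:
        """Traverse the pointer page and fetch each data page in turn."""
        return [self.pages[p] for p in self.pointers]


area = StagingArea()
for payload in (b"a", b"b", b"c"):
    area.store(payload)
area.remap([2, 0, 1])
print([page[:1] for page in area.read_in_order()])  # [b'c', b'a', b'b']
```

Only the small pointer list changes during `remap`, which mirrors the point made above that the 4KB data contents do not need to be moved.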

[0168] The following continues to use the disk array expansion scenario of Figure 1 as an example to illustrate the disk array expansion method provided in this application.

[0169] First, as shown in Figure 4, the good disk data (i.e., D4 and D5) in stripe 2 of the disk array before expansion is read into the cache. The bad disk data is found to be parity data P, which is not needed during expansion, so it does not need to be repaired. The location where P would be stored is left as blank data.

[0170] Then, as shown in Figure 5, the good disk data (i.e., D7 and P) of stripe 3 in the disk array before expansion is read into the cache. The damaged data in stripe 3 of the disk array before expansion is D6, which is needed during expansion. Therefore, an XOR operation is performed on D7 and P to obtain D6.

[0171] Next, as shown in Figure 6, the good disk data (i.e., D9 and P) in stripe 4 of the disk array before expansion is read into the cache. The damaged data in stripe 4 of the disk array before expansion is D8, which is needed during expansion. Therefore, an XOR operation is performed on D9 and P to obtain D8.

[0172] It should be noted that in Figures 4-6, P represents the parity data, and the P values differ across different stripes.
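The per-stripe handling in Figures 4-6 can be sketched as follows. The function name `data_blocks_degraded`, the per-disk layout lists, the byte values, and the choice of disk 0 as the abnormal disk are all assumptions made for illustration; a single-parity (RAID5) stripe is assumed.

```python
from functools import reduce
from typing import Dict, List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def data_blocks_degraded(
    kinds: List[str], readable: Dict[int, bytes], abnormal_disk: int
) -> List[bytes]:
    """Return the stripe's data blocks in disk order, repairing only if needed."""
    rebuilt = None
    if kinds[abnormal_disk] == "data":
        # The lost block is business data: XOR all surviving blocks to rebuild it.
        rebuilt = reduce(xor_bytes, readable.values())
    # If the lost block is parity, nothing is repaired.
    result = []
    for disk, kind in enumerate(kinds):
        if kind != "data":
            continue
        result.append(rebuilt if disk == abnormal_disk else readable[disk])
    return result


d4, d5, d6, d7 = b"\x44", b"\x55", b"\x66", b"\x77"
p3 = xor_bytes(d6, d7)  # parity of the hypothetical stripe 3
# Stripe 2: the abnormal disk held parity, so D4 and D5 are used as read.
stripe2 = data_blocks_degraded(["parity", "data", "data"], {1: d4, 2: d5}, 0)
# Stripe 3: the abnormal disk held D6, which is rebuilt from D7 and P.
stripe3 = data_blocks_degraded(["data", "data", "parity"], {1: d7, 2: p3}, 0)
print(stripe2 == [d4, d5], stripe3 == [d6, d7])  # True True
```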

[0173] Finally, for stripe 1 in the expanded disk array, the corresponding stripe set consists of stripe 2, stripe 3, and stripe 4. As shown in Figure 7, D4, D5, D6, D7, D8, and D9 are the data blocks of the preset data type within the stripe set. Expansion requires the data blocks to be migrated, which contain D5, D6, D7, D8, and D9. Based on D5, D6, D7, D8, and D9, the corresponding first type of verification data is calculated, thereby obtaining the data in each of the second-type data blocks.

[0174] Figure 8 is the overall flowchart of online disk array expansion. First, the total number of stripes n in the expanded disk array is calculated, and the current processing stripe number k is initialized to 0. Then, a loop is entered, and the expansion operation is performed on the k-th stripe according to the preset stripe data distribution rules corresponding to the expanded disk array. During the expansion operation, disk failures are detected in real time. If no failure is detected, the good-disk expansion operation is performed (i.e., normal expansion). If a disk failure is detected, the bad-disk expansion operation is performed, i.e., the disk array expansion method provided in this application. After the expansion of the k-th stripe is completed, the stripe number k is incremented by 1, and it is checked whether k is less than the total number of stripes n. If so, the loop continues with the next stripe; otherwise, the expansion process ends, completing the online expansion of the disk array.
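The control flow of Figure 8 can be sketched as follows. The function name `expand_online` and its callback parameters are hypothetical placeholders for the failure detection and the two expansion paths; a `while` loop is used, which behaves the same as the check-after-increment flow for any array with at least one stripe.

```python
from typing import Callable, List, Tuple


def expand_online(
    total_stripes: int,
    disk_failed: Callable[[int], bool],
    expand_normal: Callable[[int], None],
    expand_degraded: Callable[[int], None],
) -> List[Tuple[int, str]]:
    """Process stripes 0..n-1, choosing the path per stripe; return a trace."""
    trace: List[Tuple[int, str]] = []
    k = 0
    while k < total_stripes:
        if disk_failed(k):
            expand_degraded(k)  # bad-disk path: the method described here
            trace.append((k, "degraded"))
        else:
            expand_normal(k)    # good-disk path: normal expansion
            trace.append((k, "normal"))
        k += 1
    return trace


# Simulated run in which a disk fails while stripe 2 is being processed.
trace = expand_online(
    total_stripes=4,
    disk_failed=lambda k: k >= 2,
    expand_normal=lambda k: None,
    expand_degraded=lambda k: None,
)
print(trace)  # [(0, 'normal'), (1, 'normal'), (2, 'degraded'), (3, 'degraded')]
```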

[0175] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods according to the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method.

[0176] This application also provides a disk array expansion device for implementing the above embodiments and preferred embodiments; details already described will not be repeated. As used below, the term "module" can refer to a combination of software and / or hardware that performs a predetermined function. Although the device described in the following embodiments is preferably implemented in software, hardware implementation, or a combination of software and hardware, is also possible and contemplated.

[0177] This embodiment provides a disk array expansion device which, as shown in Figure 9, includes:

[0178] The first determining module 901 is used to determine, according to a preset expansion rule, the stripe set in the disk array before expansion that corresponds to the stripe to be generated in the expanded disk array, wherein the data in the stripe set includes the data in the stripe to be generated;

[0179] The reading module 902 is used to read data corresponding to the first type of data block in the stripe set from the disk array before expansion when there is an abnormal disk in the disk array before expansion. The disk corresponding to the first type of data block in the disk array before expansion is a disk other than the abnormal disk. The disk array before expansion has the ability to recover data from the abnormal disk.

[0180] The second determining module 903 is used to obtain the data corresponding to the second type of data block in the stripe to be generated based on the data corresponding to all the first type of data blocks respectively, wherein the disk corresponding to the second type of data block in the expanded disk array is the disk other than the abnormal disk;

[0181] The writing module 904 is used to write the data corresponding to each of the second type of data blocks in the stripe to be generated to the disks corresponding to each second type of data block in the expanded disk array, so as to complete the expansion of the disk array before expansion.

[0182] In one possible implementation, the second determining module 903 is specifically used to determine the data type of the data in the first type of data block of the first stripe, wherein the first stripe is a stripe in a stripe set;

[0183] Based on the data type of the first type of data block in the first stripe and the preset stripe data distribution rules corresponding to the disk array before expansion, the data type of the third type of data block in the first stripe is determined, wherein the third type of data block is the data block in the abnormal disk before expansion;

[0184] After determining the data type of the third type of data blocks in all stripes in the stripe set, the data corresponding to the second type of data blocks in the stripe to be generated is obtained based on the data in the data blocks of the stripe set whose data type is a preset data type.

[0185] In one possible implementation, the device further includes a repair module. After the data type of the data in the third type of data blocks of all stripes in the stripe set has been determined, and before the data corresponding to the second type of data block in the stripe to be generated is obtained based on the data in the data blocks of the stripe set whose data type is the preset data type, the repair module repairs the data in the third type of data blocks of the first stripe according to a preset repair rule, based on the data in the first type of data blocks of the first stripe, to obtain the data in the third type of data blocks of the first stripe.
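As a non-limiting illustration of one possible preset repair rule, the following sketch assumes a single-parity array in which the parity block is the bytewise XOR of the data blocks of the same stripe, so that the block on the abnormal disk equals the XOR of all surviving blocks. The XOR rule and the function names `xor_blocks` and `repair_missing_block` are assumptions made for illustration; this application does not limit the repair rule to them.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def repair_missing_block(surviving_blocks):
    """Rebuild the block of the abnormal disk from the first-type blocks.

    Assumes single XOR parity: the surviving blocks are every data block
    and the parity block of the stripe except the missing one.
    """
    return xor_blocks(surviving_blocks)


if __name__ == "__main__":
    d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xff\x00"
    parity = xor_blocks([d0, d1, d2])
    # Pretend the disk holding d1 is abnormal and rebuild it.
    print(repair_missing_block([d0, d2, parity]) == d1)
```

Under the same assumption, a missing parity block would be rebuilt by the same call, because XOR parity treats data blocks and the parity block symmetrically.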

[0186] In one possible implementation, the second determining module 903 is specifically used to obtain the first position information of the first data block in the first stripe corresponding to the stripe set, and the second position information of the second data block in the second stripe corresponding to the stripe set, wherein the first data block is the first data block in the stripe to be generated whose data type is a preset data type, and the second data block is the last data block in the stripe to be generated whose data type is a preset data type.

[0187] Based on the preset data type, the first position information, and the second position information, the data block to be migrated corresponding to the stripe to be generated is determined from the data blocks of the preset data type in the stripe set;

[0188] Based on the data in the data block to be migrated corresponding to the stripe to be generated, the data corresponding to the second type of data block in the stripe to be generated is obtained.

[0189] In one possible implementation, the second determining module 903 is specifically used to determine the first type of verification data corresponding to the stripe to be generated based on the data in the data block to be migrated corresponding to the stripe to be generated, wherein the first type of verification data is used to repair the data in the stripe to be generated;

[0190] Based on the first type of verification data and the data in the data block to be migrated corresponding to the stripe to be generated, the data corresponding to the second type of data block in the stripe to be generated is obtained according to the preset stripe data distribution rules corresponding to the expanded disk array.

[0191] In one possible implementation, the second determining module 903 is specifically used to determine the third position information of the data block corresponding to the first type of verification data in the stripe to be generated based on the preset stripe data distribution rules;

[0192] Based on the position information, in the stripe set, of the data block to be migrated corresponding to the stripe to be generated, and the third position information, the data corresponding to each second type of data block in the stripe to be generated is determined from the first type of verification data and the data of the data block to be migrated corresponding to the stripe to be generated.
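The assembly of a stripe to be generated can be sketched as follows, purely as an example. The sketch assumes single XOR parity as the first type of verification data and a rotating parity position as the preset stripe data distribution rule of the expanded disk array; both are assumptions, as are the names `xor_parity` and `build_new_stripe`. Blocks that would land on an abnormal disk are not second type data blocks and are therefore left out of the result.

```python
def xor_parity(blocks):
    """Bytewise XOR of equal-length blocks (assumed verification data)."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        if len(block) != len(parity):
            raise ValueError("all blocks must have the same length")
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def build_new_stripe(stripe_no, migrated_blocks, num_disks, abnormal_disks=frozenset()):
    """Return {disk index: data} for the second-type blocks of one new stripe.

    migrated_blocks: data of the blocks to be migrated, in logical order.
    The parity position (third position information) is assumed to rotate
    by one disk per stripe, starting from the last disk.
    """
    if len(migrated_blocks) != num_disks - 1:
        raise ValueError("expected num_disks - 1 data blocks")
    parity = xor_parity(migrated_blocks)
    parity_disk = (num_disks - 1 - stripe_no) % num_disks
    data_iter = iter(migrated_blocks)
    writes = {}
    for disk in range(num_disks):
        block = parity if disk == parity_disk else next(data_iter)
        if disk not in abnormal_disks:
            # Only blocks on disks other than the abnormal disk are written.
            writes[disk] = block
    return writes


if __name__ == "__main__":
    data = [b"\x01", b"\x02", b"\x04", b"\x08"]
    print(build_new_stripe(0, data, num_disks=5, abnormal_disks={1}))
```

In this sketch the block destined for the abnormal disk is simply skipped; under the single-parity assumption it stays recoverable from the blocks that were written.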

[0193] In one possible implementation, the second determining module 903 is specifically used to determine all data blocks corresponding to the first data block and the second data block in the stripe set based on the first position information and the second position information.

[0194] The first data block, the second data block, and all data blocks corresponding to the first data block and the second data block in the stripe set are identified as candidate data blocks corresponding to the stripe to be generated.

[0195] Candidate data blocks with a preset data type are identified as the data blocks to be migrated corresponding to the stripes to be generated.
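The candidate selection above can be illustrated by the following sketch. It interprets "all data blocks corresponding to the first data block and the second data block" as the inclusive run of blocks from the first position to the second position in stripe-major order, which is an assumption about the intended meaning. Positions are hypothetical `(stripe number, block index)` pairs, and `type_of` is a hypothetical callback standing in for the preset stripe data distribution rule of the disk array before expansion.

```python
DATA, PARITY = "data", "parity"


def blocks_to_migrate(first_pos, second_pos, blocks_per_stripe, type_of, wanted=DATA):
    """Select the candidate blocks whose data type is the preset data type.

    first_pos / second_pos: (stripe number, block index) of the first and
    last block of the stripe to be generated within the stripe set.
    type_of(stripe, index): returns DATA or PARITY for a block position.
    """
    start = first_pos[0] * blocks_per_stripe + first_pos[1]
    end = second_pos[0] * blocks_per_stripe + second_pos[1]
    if end < start:
        raise ValueError("second position precedes first position")
    # Candidate blocks: the first block, the last block, and everything between.
    candidates = [divmod(i, blocks_per_stripe) for i in range(start, end + 1)]
    return [pos for pos in candidates if type_of(*pos) == wanted]


def rotating_parity(stripe, index, blocks_per_stripe=4):
    """Example layout: one parity block per stripe, rotating by one disk."""
    parity_index = (blocks_per_stripe - 1 - stripe) % blocks_per_stripe
    return PARITY if index == parity_index else DATA


if __name__ == "__main__":
    print(blocks_to_migrate((2, 2), (3, 3), 4, rotating_parity))
```

With the example layout, the parity block at position `(3, 0)` is a candidate but is filtered out because its data type is not the preset data type.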

[0196] In one possible implementation, the second determining module 903 is specifically used to obtain the first stripe number, the first stripe size corresponding to the disk array before expansion, and the second stripe size corresponding to the disk array after expansion, wherein the first stripe number is used to indicate the stripe to be generated;

[0197] Based on the first stripe number, the first stripe size, and the second stripe size, determine the starting stripe and the ending stripe in the stripe set, respectively;

[0198] Based on the starting stripe and the ending stripe, the stripe set is obtained.

[0199] In one possible implementation, the second determining module 903 is specifically used to determine the starting logical block number of the stripe to be generated in the expanded disk array based on the first stripe number and the second stripe size.

[0200] Divide the starting logical block number by the first stripe size to obtain the second stripe number, which is used to indicate the starting stripe;

[0201] Sum the starting logical block number and the second stripe size to obtain the summed starting logical block number;

[0202] Divide the summed starting logical block number by the first stripe size to obtain the third stripe number, which is used to indicate the ending stripe.

[0203] In one possible implementation, the second determining module 903 is specifically used to determine the ratio as the second stripe number if the ratio between the starting logical block number and the first stripe size is an integer;

[0204] or,

[0205] If the ratio between the starting logical block number and the first stripe size is not an integer, the ratio is rounded down, and the result obtained after rounding down is determined as the second stripe number.
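By way of example only, the stripe number calculation can be sketched as follows. The sketch assumes that both stripe sizes are counted in logical blocks per stripe and that the starting logical block number is the product of the first stripe number and the second stripe size; the text only states that it is determined "based on" these two values, so the multiplication is an assumption. The function name `stripe_set_bounds` is hypothetical. Floor division covers both the integer and the non-integer ratio cases.

```python
def stripe_set_bounds(first_stripe_no, first_stripe_size, second_stripe_size):
    """Return (second stripe number, third stripe number).

    first_stripe_no:    number of the stripe to be generated
    first_stripe_size:  blocks per stripe before expansion
    second_stripe_size: blocks per stripe after expansion
    """
    if first_stripe_no < 0 or first_stripe_size <= 0 or second_stripe_size <= 0:
        raise ValueError("stripe number must be >= 0 and sizes must be > 0")
    # Assumed rule: new stripes are laid out back to back in logical order.
    start_lbn = first_stripe_no * second_stripe_size
    summed_lbn = start_lbn + second_stripe_size
    # Floor division is the ratio itself when it is an integer and the
    # rounded-down ratio otherwise.
    start_stripe = start_lbn // first_stripe_size
    end_stripe = summed_lbn // first_stripe_size
    return start_stripe, end_stripe


if __name__ == "__main__":
    # 3 blocks per stripe before expansion, 4 after; generate stripe number 2.
    print(stripe_set_bounds(2, 3, 4))  # (2, 4)
```

The sketch follows the summation literally; whether the summed starting logical block number is treated as an inclusive or exclusive bound of the stripe set is not stated in this passage.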

[0206] In one possible implementation, the first position information is the block offset, within the first stripe, of the data block corresponding to the first data block; the second determining module 903 is specifically used to perform a remainder operation between the starting logical block number and the first stripe size to obtain the first position information.

[0207] In one possible implementation, the second position information is the block offset, within the second stripe, of the data block corresponding to the second data block; the second determining module 903 is specifically used to perform a remainder operation between the summed starting logical block number and the first stripe size to obtain the second position information.
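The two remainder operations can be illustrated with a short sketch. It takes the starting logical block number as an input and assumes that both stripe sizes are counted in logical blocks; the function name `block_offsets` is hypothetical.

```python
def block_offsets(start_lbn, first_stripe_size, second_stripe_size):
    """Return (first position information, second position information).

    Both values are block offsets inside a stripe of the disk array
    before expansion, obtained by remainder operations.
    """
    if start_lbn < 0 or first_stripe_size <= 0 or second_stripe_size <= 0:
        raise ValueError("start_lbn must be >= 0 and sizes must be > 0")
    summed_lbn = start_lbn + second_stripe_size
    first_offset = start_lbn % first_stripe_size
    second_offset = summed_lbn % first_stripe_size
    return first_offset, second_offset


if __name__ == "__main__":
    # Starting logical block 8, 3 blocks per old stripe, 4 per new stripe.
    print(block_offsets(8, 3, 4))  # (2, 0)
```

In the example, logical block 8 sits at offset 2 of its old stripe, and the summed starting logical block number 12 falls on offset 0 of a later old stripe.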

[0208] In one possible implementation, the disk array before expansion includes a parity disk, and the number of abnormal disks in the disk array before expansion is less than or equal to the number of parity disks.
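The condition above amounts to a simple precheck before expansion continues in the presence of abnormal disks. The following sketch is illustrative only; the state strings and the function name `can_expand_with_abnormal_disks` are hypothetical.

```python
def can_expand_with_abnormal_disks(disk_states, parity_disk_count):
    """True if the number of abnormal disks does not exceed the number of parity disks.

    disk_states: one state string per disk of the array before expansion;
    any state other than "normal" (e.g. "failed", "offline") counts as abnormal.
    """
    abnormal = sum(1 for state in disk_states if state != "normal")
    return abnormal <= parity_disk_count


if __name__ == "__main__":
    states = ["normal", "offline", "normal", "normal"]
    print(can_expand_with_abnormal_disks(states, parity_disk_count=1))  # True
```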

[0209] In one possible implementation, the device further includes a storage module which, after the data corresponding to the first type of data block in the stripe set has been read from the disk array before expansion in the case that there is an abnormal disk in the disk array before expansion, stores the data corresponding to the data blocks to be migrated in the stripe set into a preset storage area.

[0210] Construct a first mapping relationship between the address pointer of each data in the preset storage area and the corresponding data block to be migrated, and store the first mapping relationship in the preset storage area.

[0211] In one possible implementation, the storage module is further configured to, after the data corresponding to each second type of data block in the stripe to be generated has been determined from the first type of verification data and the data of the data blocks to be migrated corresponding to the stripe to be generated (based on the position information, in the stripe set, of those data blocks to be migrated, and the third position information), adjust the first mapping relationship based on the data corresponding to each second type of data block in the stripe to be generated to obtain a second mapping relationship, and store the second mapping relationship in the preset storage area.

[0212] In one possible implementation, the storage module is further configured to, before the data corresponding to each of the second type of data blocks in the stripe to be generated is written to the disks corresponding to each of the second type of data blocks in the expanded disk array, obtain the data corresponding to each of the second type of data blocks from the preset storage area based on the second mapping relationship.
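One way to picture the preset storage area and the two mapping relationships is the following in-memory sketch. Address pointers are modelled as `(offset, length)` pairs into a byte buffer, the first mapping relationship is keyed by the position of the data block to be migrated, and the second mapping relationship is keyed by the target disk in the expanded disk array. The class `StagingArea` and all of its method names are hypothetical, and persistence of the area is not modelled.

```python
class StagingArea:
    """In-memory stand-in for the preset storage area."""

    def __init__(self):
        self.buffer = bytearray()
        self.first_mapping = {}   # source block position -> (offset, length)
        self.second_mapping = {}  # target disk index -> (offset, length)

    def _append(self, data):
        pointer = (len(self.buffer), len(data))
        self.buffer.extend(data)
        return pointer

    def stage_source_block(self, source_pos, data):
        """Store a block to be migrated and record it in the first mapping."""
        self.first_mapping[source_pos] = self._append(data)

    def remap_to_targets(self, placement, parity_disk, parity):
        """Derive the second mapping from the first.

        placement: target disk index -> source block position.
        The verification data is staged as well and mapped to parity_disk.
        """
        mapping = {disk: self.first_mapping[src] for disk, src in placement.items()}
        mapping[parity_disk] = self._append(parity)
        self.second_mapping = mapping

    def data_for_disk(self, disk):
        """Fetch the data to write to one target disk via the second mapping."""
        offset, length = self.second_mapping[disk]
        return bytes(self.buffer[offset:offset + length])


if __name__ == "__main__":
    area = StagingArea()
    area.stage_source_block((2, 2), b"AA")
    area.stage_source_block((3, 0), b"BB")
    area.remap_to_targets({0: (2, 2), 1: (3, 0)}, parity_disk=2, parity=b"PP")
    print(area.data_for_disk(1))
```

Because the second mapping only re-keys pointers that already exist, the staged data is not copied when the layout of the stripe to be generated is decided; this is a design choice of the sketch rather than a requirement stated in this application.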

[0213] The apparatus provided in this application embodiment reads the data of the first type of data blocks from the normal disks when there are abnormal disks in the disk array before expansion. Based on the data corresponding to all the first type of data blocks, the data of the second type of data blocks is reconstructed and written to the corresponding disks. This ensures data consistency during the disk array expansion process and provides fault tolerance. By rationally allocating and reading / writing striped data, the reliability of expansion is improved, and online expansion of the disk array is realized in the case of disk abnormality.

[0214] For a description of the features in the embodiment corresponding to the disk array expansion device, please refer to the relevant description of the embodiment corresponding to the disk array expansion method, which will not be repeated here.

[0215] Embodiments of this application also provide an electronic device which, as shown in Figure 10, includes a memory 10 and a processor 20. The memory 10 stores a computer program, and the processor 20 is configured to run the computer program to perform the steps in any of the disk array expansion method embodiments described above.

[0216] Embodiments of this application also provide a computer-readable storage medium storing a computer program, wherein the computer program is configured to execute the steps in any of the disk array expansion method embodiments described above when it is run.

[0217] In one exemplary embodiment, the aforementioned computer-readable storage medium may include, but is not limited to, various media capable of storing computer programs, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), portable hard disk, magnetic disk, or optical disk.

[0218] The embodiments of this application also provide a computer program product, which includes a computer program that, when executed by a processor, implements the steps in any of the disk array expansion method embodiments described above.

[0219] Embodiments of this application also provide another computer program product, including a non-volatile computer-readable storage medium storing a computer program, which, when executed by a processor, implements the steps in any of the disk array expansion method embodiments described above.

[0220] Those skilled in the art will further recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0221] The foregoing has provided a detailed description of a disk array expansion method, apparatus, electronic device, and medium provided in this application. Specific examples have been used to illustrate the principles and implementation methods of this application. The descriptions of the embodiments above are merely for the purpose of helping to understand the method and its core ideas. It should be noted that those skilled in the art can make various improvements and modifications to this application without departing from its principles, and these improvements and modifications also fall within the protection scope of the claims of this application.

Claims

1. A method for expanding the capacity of a disk array, characterized in that, The method includes: According to the preset expansion rules, determine the set of stripes in the disk array before expansion that correspond to the stripes to be generated in the disk array after expansion, wherein the data in the stripe set includes the data in the stripes to be generated; If there is an abnormal disk in the disk array before expansion, data corresponding to the first type of data block in the stripe set is read from the disk array before expansion. The disk corresponding to the first type of data block in the disk array before expansion is a disk other than the abnormal disk. The disk array before expansion has the ability to recover data from the abnormal disk. Based on the data corresponding to all the first type of data blocks, the data corresponding to the second type of data blocks in the stripe to be generated is obtained, wherein the disk corresponding to the second type of data blocks in the expanded disk array is a disk other than the abnormal disk; The data corresponding to each of the second type of data blocks in the stripe to be generated is written to the disk corresponding to each of the second type of data blocks in the expanded disk array, so as to complete the expansion of the disk array before expansion.

2. The method according to claim 1, characterized in that, The step of obtaining the data corresponding to the second type of data block in the strip to be generated based on the data corresponding to all the first type of data blocks includes: Determine the data type of the data in the first type of data block of the first stripe, wherein the first stripe is a stripe in the stripe set; Based on the data type of the data in the first type of data block of the first stripe and the preset stripe data distribution rule corresponding to the disk array before expansion, the data type of the data in the third type of data block of the first stripe is determined, wherein the third type of data block is the data block in the abnormal disk before expansion; After determining the data type of the third type of data blocks in all stripes of the strip set, the data corresponding to the second type of data blocks in the stripe to be generated is obtained based on the data in the data blocks of the strip set whose data type is a preset data type.

3. The method according to claim 2, characterized in that, After determining the data type of the data in the third type of data blocks of all stripes in the stripe set, and before obtaining the data corresponding to the second type of data block in the stripe to be generated based on the data in the data blocks of the stripe set whose data type is the preset data type, the method further includes: If the data type of the data in the third type of data block of the first stripe is the preset data type, the data in the third type of data block of the first stripe is repaired according to the preset repair rules based on the data in the first type of data block of the first stripe to obtain the data in the third type of data block of the first stripe.

4. The method according to claim 2, characterized in that, After determining the data type of the third type of data blocks in all stripes of the stripe set, the data corresponding to the second type of data blocks in the stripe to be generated is obtained based on the data in the data blocks of the stripe set whose data type is a preset data type, including: Obtain the first position information of the first data block in the first stripe corresponding to the stripe set, and the second position information of the second data block in the second stripe corresponding to the stripe set, wherein the first data block is the first data block in the stripe to be generated whose data type is the preset data type, and the second data block is the last data block in the stripe to be generated whose data type is the preset data type; Based on the preset data type, the first location information, and the second location information, determine the data block to be migrated corresponding to the strip to be generated from the data blocks of the preset data type in the strip set; Based on the data in the data block to be migrated corresponding to the strip to be generated, the data corresponding to the second type of data block in the strip to be generated is obtained.

5. The method according to claim 4, characterized in that, The step of obtaining the data corresponding to the second type of data block in the strip to be generated based on the data in the data block to be migrated corresponding to the strip to be generated includes: Based on the data in the data block to be migrated corresponding to the strip to be generated, a first type of verification data corresponding to the strip to be generated is determined, wherein the first type of verification data is used to repair the data in the strip to be generated; Based on the first type of verification data and the data in the data block to be migrated corresponding to the stripe to be generated, the data corresponding to the second type of data block in the stripe to be generated is obtained according to the preset stripe data distribution rules corresponding to the expanded disk array.

6. The method according to claim 5, characterized in that, Based on the first type of verification data and the data in the data block to be migrated corresponding to the stripe to be generated, the data corresponding to the second type of data block in the stripe to be generated is obtained according to the preset stripe data distribution rules corresponding to the expanded disk array, including: Based on the preset strip data distribution rules, the third position information of the data block corresponding to the first type of verification data in the strip to be generated is determined; Based on the position information of the data block to be migrated corresponding to the strip to be generated in the strip set, and the third position information, the data corresponding to each second type of data block in the strip to be generated is determined from the first type of verification data and the data of the data block to be migrated corresponding to the strip to be generated.

7. The method according to claim 4, characterized in that, The step of determining the data block to be migrated corresponding to the stripe to be generated from the data blocks of the preset data type in the stripe set, based on the preset data type, the first location information, and the second location information, includes: Based on the first location information and the second location information, determine all data blocks corresponding to the first data block and the second data block in the strip set; The first data block, the second data block, and all data blocks corresponding to the first data block and the second data block in the strip set are determined as candidate data blocks corresponding to the strip to be generated; The candidate data block with the preset data type is determined as the data block to be migrated corresponding to the strip to be generated.

8. The method according to any one of claims 4-7, characterized in that, The step of determining, according to the preset expansion rules, the set of stripes in the disk array before expansion that correspond to the stripes to be generated in the disk array after expansion includes: Obtain the first stripe number, the first stripe size corresponding to the disk array before expansion, and the second stripe size corresponding to the disk array after expansion, wherein the first stripe number is used to indicate the stripe to be generated; Based on the first stripe number, the first stripe size, and the second stripe size, the starting stripe and the ending stripe in the stripe set are determined respectively; The stripe set is obtained based on the starting stripe and the ending stripe.

9. The method according to claim 8, characterized in that, The step of determining the starting stripe and the ending stripe in the stripe set based on the first stripe number, the first stripe size, and the second stripe size includes: Based on the first stripe number and the second stripe size, determine the starting logical block number of the stripe to be generated in the expanded disk array; Divide the starting logical block number by the first stripe size to obtain the second stripe number, wherein the second stripe number is used to indicate the starting stripe; Sum the starting logical block number and the second stripe size to obtain the summed starting logical block number; Divide the summed starting logical block number by the first stripe size to obtain the third stripe number, which is used to indicate the ending stripe.

10. The method according to claim 9, characterized in that, The step of dividing the starting logical block number by the first stripe size to obtain the second stripe number includes: If the ratio between the starting logical block number and the first stripe size is an integer, then the ratio is determined as the second stripe number; or, If the ratio between the starting logical block number and the first stripe size is not an integer, then the ratio is rounded down, and the result obtained after rounding down is determined as the second stripe number.

11. The method according to claim 9, characterized in that, The first position information is the block offset of the data block corresponding to the first data block in the first stripe; obtaining the first position information of the first data block in the first stripe corresponding to the stripe set includes: The first position information is obtained by performing a remainder operation between the starting logical block number and the first stripe size.

12. The method according to claim 9, characterized in that, The second position information is the block offset of the data block corresponding to the second data block in the second stripe; obtaining the second position information of the second data block in the second stripe corresponding to the stripe set includes: The second position information is obtained by performing a remainder operation between the summed starting logical block number and the first stripe size.

13. The method according to any one of claims 1-4, characterized in that, The disk array before expansion includes a parity disk, and the number of abnormal disks in the disk array before expansion is less than or equal to the number of parity disks.

14. The method according to claim 6, characterized in that, In the case where there are abnormal disks in the disk array before expansion, after reading the data corresponding to the first type of data block in the stripe set from the disk array before expansion, the method further includes: The data corresponding to the data blocks to be migrated in the strip set is stored in a preset storage area; Construct a first mapping relationship between the address pointer of each data in the preset storage area and the corresponding data block to be migrated, and store the first mapping relationship in the preset storage area.

15. The method according to claim 14, characterized in that, After determining the data corresponding to each of the second type of data blocks in the stripe to be generated from the first type of verification data and the data of the data blocks to be migrated corresponding to the stripe to be generated, based on the position information, in the stripe set, of the data blocks to be migrated corresponding to the stripe to be generated, and the third position information, the method further includes: Based on the data corresponding to each of the second type of data blocks in the stripe to be generated, the first mapping relationship is adjusted to obtain the second mapping relationship, and the second mapping relationship is stored in the preset storage area.

16. The method according to claim 15, characterized in that, Before writing the data corresponding to each of the second type of data blocks in the stripe to be generated into the disks corresponding to each of the second type of data blocks in the expanded disk array, the method further includes: Based on the second mapping relationship, data corresponding to each of the second type of data blocks is obtained from the preset storage area and written to the disks corresponding to each of the second type of data blocks in the expanded disk array.

17. A disk array expansion device, characterized in that, The device includes: The first determining module is used to determine, according to a preset expansion rule, the set of stripes in the disk array before expansion that correspond to the stripes to be generated in the disk array after expansion, wherein the data in the stripe set includes the data in the stripes to be generated; The reading module is used to read data corresponding to the first type of data block in the stripe set from the disk array before expansion when there is an abnormal disk in the disk array before expansion. The disk corresponding to the first type of data block in the disk array before expansion is a disk other than the abnormal disk. The disk array before expansion has the ability to recover data from the abnormal disk. The second determining module is used to obtain the data corresponding to the second type of data block in the stripe to be generated based on the data corresponding to all the first type of data blocks respectively, wherein the disk corresponding to the second type of data block in the expanded disk array is a disk other than the abnormal disk; The writing module is used to write the data corresponding to each of the second type of data blocks in the stripe to be generated to the disks corresponding to each of the second type of data blocks in the expanded disk array, so as to complete the expansion of the disk array before expansion.

18. An electronic device, characterized in that, The electronic device includes: A memory, used to store a computer program; A processor, configured to implement the steps of the disk array expansion method as described in any one of claims 1-16 when executing the computer program.

19. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores a computer program, wherein when the computer program is executed by a processor, it implements the steps of the disk array expansion method as described in any one of claims 1-16.

20. A computer program product, comprising a computer program, characterized in that, When the computer program is executed by the processor, it implements the steps of the disk array expansion method as described in any one of claims 1-16.