A RAID remapping method, system, device and storage medium
By calculating the parameters and data block locations of the RAID disk array and performing read and write operations according to the actual data page size, the problem of read errors in existing technologies is solved, thus improving the data read and write efficiency and accuracy of RAID disk arrays.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHANDONG YUNHAI GUOCHUANG CLOUD COMPUTING EQUIP IND INNOVATION CENT CO LTD
- Filing Date
- 2023-03-30
- Publication Date
- 2026-06-30
Figure CN116382582B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of storage, and more specifically to a RAID remapping method, system, device, and storage medium.
Background Technology
[0002] With the rapid development of computer network technology, servers have been widely used in various fields. However, the hard drives used to store information have become a bottleneck for system performance because of their slow read/write speeds and poor reliability. RAID (Redundant Array of Independent Disks) technology was developed in response. A RAID is a large-capacity disk group composed of many independent disks, which significantly increases read/write speeds. Based on different implementation principles, RAID has evolved into several levels, and different combinations of disks can reduce errors and improve efficiency.
[0003] In host-driven read/write I/O scenarios, the RAID remapping algorithm aggregates data vertically by stripe, which significantly improves the efficiency of reading and writing data on the disks. When data is read and written using the PRP addressing method of the NVMe protocol, it is processed in units of data pages. In real-world scenarios, however, processing follows the actual data page size produced by the RAID remapping algorithm, which differs from this page-based processing. If data is read and written in units of the actual disk data page size, the pointer indicating the next data page will inevitably be read as well, which can easily lead to read errors in practice.
Summary of the Invention
[0004] In view of this, in order to overcome at least one aspect of the above problems, embodiments of the present invention propose a RAID remapping method, comprising the following steps:
[0005] Obtain the location of each data block on the RAID disk array that makes up the data that needs to be remapped;
[0006] The disk containing each data block is determined based on the location of each data block on the RAID disk array;
[0007] Read the corresponding number of data blocks from each of the disks respectively;
[0008] According to the disk number, the data blocks read from each disk are placed into the page table in memory in sequence.
[0009] In some embodiments, obtaining the location of each data block constituting the data to be remapped on the RAID disk array further includes:
[0010] Obtain the parameters of the RAID disk array and the total length of the data that needs to be remapped;
[0011] The position and length of each data block on the RAID disk array are determined based on the parameters and the total length.
[0012] In some embodiments, obtaining the parameters of the RAID disk array further includes:
[0013] Get the stripe unit size, number of disks, starting disk location, and parity disk location.
[0014] In some embodiments, several data blocks read from each disk are sequentially placed into a page table in memory according to the disk number, further comprising:
[0015] The access address of each data block in the page table is determined based on the length of each data block and the current offset of the page table.
[0016] In some embodiments, it also includes:
[0017] Determine whether the length of the data that needs to be remapped is greater than the remaining space in the page table;
[0018] In response to the length not being greater than the remaining space, the data blocks read from each disk are directly placed into the page table in memory in sequence according to the disk number.
[0019] In some embodiments, it also includes:
[0020] In response to the length of the data to be remapped being greater than the remaining space in the page table, before each placement of the currently read data block into the page table, it is determined whether the access address of the currently read data block is less than the maximum address of the page table and whether the length of the currently read data block multiplied by the width of the page table plus the access address of the currently read data block is not greater than the maximum address of the page table.
[0021] If the access address of the currently read data block is less than the maximum address of the page table, and the length of the currently read data block multiplied by the width of the page table plus the access address of the currently read data block is not greater than the maximum address of the page table, then the currently read data block is directly placed into the page table.
[0022] In some embodiments, it also includes:
[0023] If the access address of the currently read data block is greater than the maximum address of the page table, or if the length of the currently read data block multiplied by the width of the page table plus the access address of the currently read data block is greater than the maximum address of the page table, the currently read data block is divided into a first sub-data block and a second sub-data block.
[0024] The first sub-data block is placed into the page table, and the second sub-data block is placed into the next page table.
[0025] Based on the same inventive concept, according to another aspect of the present invention, embodiments of the present invention also provide a RAID remapping system, comprising:
[0026] The acquisition module is configured to acquire the location of each data block that constitutes the data that needs to be remapped on the RAID disk array;
[0027] The determination module is configured to determine the disk where each data block is located based on the location of each data block on the RAID disk array.
[0028] The reading module is configured to read a corresponding number of data blocks from each of the disks;
[0029] The storage module is configured to sequentially place the data blocks read from each disk into a page table in memory according to the disk number.
[0030] Based on the same inventive concept, according to another aspect of the present invention, embodiments of the present invention also provide a computer device, comprising:
[0031] At least one processor; and
[0032] A memory storing a computer program that can run on the processor, wherein the processor, when executing the program, performs the steps of any of the RAID remapping methods described above.
[0033] Based on the same inventive concept, according to another aspect of the present invention, embodiments of the present invention also provide a computer-readable storage medium storing a computer program that, when executed by a processor, performs the steps of any of the RAID remapping methods described above.
[0034] The present invention has one of the following beneficial technical effects: the proposed solution calculates the actual data page size of each disk and, when reading and writing the data of each RAID disk by stripe, reads and writes in page table address order based on the actual data size within the stripe unit of each RAID disk.
Attached Figure Description
[0035] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other embodiments can be obtained based on these drawings without creative effort.
[0036] Figure 1 is a flowchart of the RAID remapping method provided in an embodiment of the present invention;
[0037] Figure 2 is a schematic diagram of the remapping algorithm provided in an embodiment of the present invention;
[0038] Figure 3 is a schematic diagram of the data blocks of each disk and their access addresses obtained after the remapping algorithm, provided in an embodiment of the present invention;
[0039] Figure 4 is a distribution diagram of data on a page table provided in an embodiment of the present invention;
[0040] Figure 5 is a distribution diagram of a data block that does not cross page tables, provided in an embodiment of the present invention;
[0041] Figure 6 is a distribution diagram of a data block that crosses page tables, provided in an embodiment of the present invention;
[0042] Figure 7 is a schematic diagram of the RAID remapping system provided in an embodiment of the present invention;
[0043] Figure 8 is a schematic diagram of the structure of a computer device provided in an embodiment of the present invention;
[0044] Figure 9 is a schematic diagram of the structure of a computer-readable storage medium provided in an embodiment of the present invention.
Detailed Implementation
[0045] To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention will be further described in detail below with reference to specific examples and the accompanying drawings.
[0046] It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two different entities or parameters that share the same name. "First" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments of the present invention. Subsequent embodiments will not repeat this explanation.
[0047] According to one aspect of the present invention, embodiments of the present invention provide a RAID remapping method which, as shown in Figure 1, may include the following steps:
[0048] S1, obtain the location of each data block that makes up the data that needs to be remapped on the RAID disk array;
[0049] S2, determine the disk where each data block is located based on the position of each data block on the RAID disk array;
[0050] S3, read the corresponding number of data blocks from each of the disks respectively;
[0051] S4, according to the disk number, several data blocks read from each disk are sequentially placed into the page table of memory.
[0052] The proposed solution calculates the actual data page size of each disk and, when reading and writing the data of each RAID disk by stripe, reads and writes in page table address order based on the actual data size within the stripe unit of each RAID disk.
[0053] In some embodiments, obtaining the location of each data block constituting the data to be remapped on the RAID disk array further includes:
[0054] Obtain the parameters of the RAID disk array and the total length of the data that needs to be remapped;
[0055] The position and length of each data block on the RAID disk array are determined based on the parameters and the total length.
[0056] In some embodiments, obtaining the parameters of the RAID disk array further includes:
[0057] Get the stripe unit size, number of disks, starting disk location, and parity disk location.
[0058] In some embodiments, several data blocks read from each disk are sequentially placed into a page table in memory according to the disk number, further comprising:
[0059] The access address of each data block in the page table is determined based on the length of each data block and the current offset of the page table.
[0060] Specifically, as shown in Figure 2 and taking RAID5 as an example, when data on the disks is accessed in stripe order, the access order is disk2>disk0>disk1>disk3>disk0. After remapping, the access order is disk0>disk0>disk1>disk2>disk3. It is therefore necessary to calculate the position and length of each data block on each disk.
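The reordering in this example can be sketched as a stable regrouping by disk number. This is only an illustration of the ordering described above; the tuple layout and variable names are assumptions, not part of the disclosed method.

```python
# Stripe-order access sequence from the RAID5 example (disk numbers).
stripe_order = [2, 0, 1, 3, 0]

# Tag each block with its position in stripe order so the original
# sequence can still be recovered after regrouping.
blocks = [(disk, seq) for seq, disk in enumerate(stripe_order)]

# sorted() is stable, so blocks on the same disk keep their stripe order.
remapped = sorted(blocks, key=lambda b: b[0])

remapped_disks = [disk for disk, _ in remapped]
print(remapped_disks)  # [0, 0, 1, 2, 3]
```

Because the sort is stable, the two blocks on disk0 stay in the order in which they appear in the stripes.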
[0061] For example, suppose the RAID stripe unit size is 4, the number of disks is 4, the total length of data to be remapped is 17, the starting disk number is 2, the starting disk offset is 2, the P (parity) disk number in the stripe containing the starting disk is 3, and the starting address in the page table after remapping is start_addr.
[0062] The size of the data block on the starting disk in the current stripe can be calculated from the stripe unit size `strip_size` and the starting disk offset `strip_offset`: `head_size` = `strip_size` - `strip_offset` = 4 - 2 = 2. The size of the last data block can be calculated from the stripe unit size `strip_size` and the total data length `nlb`: `last_size` = (`nlb` - `head_size`) % `strip_size` = (17 - 2) % 4 = 3. Each intermediate data block has the stripe unit size `strip_size`. The disk number of each data block can then be calculated from the starting disk number `start_drv` and the P disk number `pd_idx`.
[0063] Typically, after mapping, all data resides in a single page table. The remapping algorithm then yields each disk data block and its access address within the page table, as shown in Figure 3.
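The block-size arithmetic above can be sketched as follows. The text does not give the formula for the disk number of each block, so `assign_disks` assumes a layout in which data fills the disks in ascending order, skips the parity disk, and the parity disk moves down by one for each new stripe. That assumption reproduces the disk2>disk0>disk1>disk3>disk0 order of the example, but it is a hypothetical choice rather than the patented calculation. The function names, and the handling of requests shorter than the head block or with no partial tail, are also assumptions.

```python
def split_blocks(strip_size, strip_offset, nlb):
    """Return the block lengths in stripe order: head, middle blocks, tail."""
    # Assumption: a request shorter than the head block is clamped to nlb.
    head_size = min(strip_size - strip_offset, nlb)
    full_blocks, last_size = divmod(nlb - head_size, strip_size)
    sizes = [head_size] + [strip_size] * full_blocks
    if last_size:  # assumption: no tail block when the remainder is zero
        sizes.append(last_size)
    return sizes


def assign_disks(sizes, num_disks, start_drv, pd_idx):
    """Pair each block length with a disk number (assumed parity rotation)."""
    placed = []
    drv = start_drv
    for size in sizes:
        if drv == pd_idx:        # skip the parity disk of this stripe
            drv += 1
        if drv >= num_disks:     # move on to the next stripe
            pd_idx = (pd_idx - 1) % num_disks
            drv = 1 if pd_idx == 0 else 0
        placed.append((drv, size))
        drv += 1
    return placed


sizes = split_blocks(strip_size=4, strip_offset=2, nlb=17)
layout = assign_disks(sizes, num_disks=4, start_drv=2, pd_idx=3)
print(sizes)   # [2, 4, 4, 4, 3]
print(layout)  # [(2, 2), (0, 4), (1, 4), (3, 4), (0, 3)]
```

The block lengths sum to the total data length of 17, and the disk sequence matches the stripe-order access sequence of the RAID5 example.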
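For the single-page-table case, the access addresses can be sketched as a running offset from the starting address. The contents of Figure 3 are not reproduced in the text, so this is a guess at the bookkeeping under stated assumptions: blocks are laid out in disk-number order, each unit of length occupies one entry of `data_width` bytes, and the values `0x1000` and `8` are illustrative only.

```python
def assign_addresses(blocks, start_addr, data_width):
    """blocks: (disk, length) pairs in stripe order.

    Returns (disk, length, address) triples in remapped (disk-number) order.
    """
    ordered = sorted(blocks, key=lambda b: b[0])  # stable sort by disk number
    result = []
    addr = start_addr
    for disk, length in ordered:
        result.append((disk, length, addr))
        addr += length * data_width  # advance the page table offset
    return result


# Blocks of the worked example in stripe order: (disk number, length).
stripe_blocks = [(2, 2), (0, 4), (1, 4), (3, 4), (0, 3)]
table = assign_addresses(stripe_blocks, start_addr=0x1000, data_width=8)
for disk, length, addr in table:
    print(f"disk{disk} len={length} addr={addr:#x}")
```

Each address depends only on the lengths of the blocks placed before it, which is why the block lengths must be known before any block is written.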
[0064] In some embodiments, it also includes:
[0065] Determine whether the length of the data that needs to be remapped is greater than the remaining space in the page table;
[0066] In response to the length not being greater than the remaining space, the data blocks read from each disk are directly placed into the page table in memory in sequence according to the disk number.
[0067] In some embodiments, it also includes:
[0068] In response to the length of the data to be remapped being greater than the remaining space in the page table, before each placement of the currently read data block into the page table, it is determined whether the access address of the currently read data block is less than the maximum address of the page table and whether the length of the currently read data block multiplied by the width of the page table plus the access address of the currently read data block is not greater than the maximum address of the page table.
[0069] If the access address of the currently read data block is less than the maximum address of the page table, and the length of the currently read data block multiplied by the width of the page table plus the access address of the currently read data block is not greater than the maximum address of the page table, then the currently read data block is directly placed into the page table.
[0070] In some embodiments, it further includes:
[0071] If the access address of the currently read data block is greater than the maximum address of the page table, or if the length of the currently read data block multiplied by the width of the page table plus the access address of the currently read data block is greater than the maximum address of the page table, the currently read data block is divided into a first sub-data block and a second sub-data block;
[0072] The first sub-data block is placed into the page table and the second sub-data block is placed into the next page table.
[0073] Specifically, as shown in the data distribution diagram of Figure 4, when the data length nlb is greater than the remaining space remain_len of the current page table, the data crosses page tables. If nlb is not greater than remain_len, the data can be placed directly into the current page table. Therefore, when calculating disk data block addresses from the starting address start_addr, the pointer read at a page table boundary must be taken into account.
[0074] Figure 5 is a schematic diagram of a data block that does not cross pages. This is the most common data distribution: the data block lies within the address range of a single page table. The data blocks and the starting address of each data block are calculated according to the RAID remapping algorithm.
[0075] Let the address of data block data1 be data1_addr and its length be data1_len; let the starting address of the remapped data in the page table be start_addr, the page table size be page_len, the width of a page table entry be data_width, and the starting address of the current page table be page1_addr. The maximum address of the current page table (excluding its last entry) is page_max_addr = page1_addr + page_len * data_width. When data1_addr < page_max_addr and (data1_addr + data1_len * data_width) <= page_max_addr, all of data1 lies within the current page table, so data1_addr and data1_len can be used directly as the address and length with which the current data block accesses the page table.
[0076] As shown in the schematic diagram of a data block crossing pages in Figure 6, a cross-page table situation occurs when the remaining space of the current page table cannot hold a complete data block.
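The in-page condition stated above translates directly into a small predicate. The function name and the numeric values in the usage lines are illustrative assumptions; the two comparisons are the ones given in the text.

```python
def fits_in_page(data_addr, data_len, page_addr, page_len, data_width):
    """Return True when the whole block lies inside the current page table."""
    # Maximum address of the page table, excluding its last (pointer) entry.
    page_max_addr = page_addr + page_len * data_width
    starts_inside = data_addr < page_max_addr
    ends_inside = data_addr + data_len * data_width <= page_max_addr
    return starts_inside and ends_inside


# Illustrative page: 16 entries of 8 bytes starting at 0x1000.
ok = fits_in_page(0x1068, 3, page_addr=0x1000, page_len=16, data_width=8)
crosses = not fits_in_page(0x1068, 4, page_addr=0x1000, page_len=16, data_width=8)
print(ok, crosses)  # True True
```

A block that ends exactly at page_max_addr still fits, which is why the second comparison uses `<=`.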
[0077] The remaining space of the current page table is remain_len, the length of the data block is data_len, and the starting address is data1_addr. When remain_len < data_len, it is necessary to calculate the cross-page situation of the mapped address. The data block is divided into two parts: data1-1 and data1-2. The length of the data block data1-1 is remain_len, and the length of the data1-2 is (data_len – remain_len).
[0078] The pointer address of the current page table pointer_addr = page_max_addr + data_width. By reading the data pointer_data at this address, the starting address page2_addr of the next-level page table is obtained. The starting address of the data block data1-2 is page2_addr, and the starting address data2_addr of the next data block data2 = page2_addr + (data_len – remain_len) * data_width.
[0079] When accessing data, the data block data1 is cut. The pre-cross-page address data1_addr, the pre-cross-page data length remain_len, the post-cross-page address page2_addr, and the post-cross-page data length (data_len – remain_len) are used as the address and length of the current data block to access the page table.
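The cutting step can be sketched as below, following the formulas given above for pointer_addr, page2_addr, and the two sub-block lengths. Memory is modelled as a plain dictionary from address to stored value, and remain_len is derived from the block address; both are assumptions made for illustration, as are the function name and the numeric values.

```python
def split_block(data_addr, data_len, page_addr, page_len, data_width, memory):
    """Return the (address, length) segments used to access the page table.

    Assumes data_addr lies inside the current page table.
    """
    page_max_addr = page_addr + page_len * data_width
    # Assumption: remaining space is counted in entries from data_addr.
    remain_len = (page_max_addr - data_addr) // data_width
    if data_len <= remain_len:
        return [(data_addr, data_len)]          # no page crossing
    pointer_addr = page_max_addr + data_width   # formula as given in the text
    page2_addr = memory[pointer_addr]           # start of the next page table
    return [(data_addr, remain_len), (page2_addr, data_len - remain_len)]


# Illustrative layout: the pointer entry of the first page refers to 0x2000.
memory = {0x1088: 0x2000}
segments = split_block(0x1068, 4, page_addr=0x1000, page_len=16,
                       data_width=8, memory=memory)
# The next block starts right after the second segment in the new page.
data2_addr = segments[-1][0] + segments[-1][1] * 8
print([(hex(a), n) for a, n in segments], hex(data2_addr))
```

With these values the block of length 4 is cut into a part of length 3 in the current page table and a part of length 1 at the start of the next one.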
[0080] It should be noted that the data blocks at the starting position (head_size), the middle position (strip_size), and the last position (last_size) may all have cross-page table situations. If there is a cross-page table situation, the page table address is accessed by dividing the data block.
[0081] The solution proposed by the present invention uses the RAID remapping algorithm module to calculate the data size on each disk from the given disk parameters (stripe unit size, number of disks, starting disk position, parity disk position, data length, etc.). The mapping starting address, disk data size, and other parameters are fed to the calculation logic unit, which computes the length and starting address of the data block on each disk. When a data block crosses page tables, the starting address of the next-level page table must be found in order to calculate the starting address of the next data block. The data is then read and written in the page table using the data block size and the starting address.
[0082] Based on the same inventive concept, according to another aspect of the present invention, an embodiment of the present invention also provides a RAID remapping system 400, as shown in Figure 7, including:
[0083] The acquisition module 401 is configured to acquire the position of each data block that constitutes the data to be remapped on the RAID disk array;
[0084] The determination module 402 is configured to determine the disk where each data block is located based on the position of each data block on the RAID disk array;
[0085] The reading module 403 is configured to read a corresponding plurality of data blocks from each of the disks;
[0086] The storage module 404 is configured to sequentially place the data blocks read from each disk into a page table in memory according to the disk number.
[0087] Based on the same inventive concept, according to another aspect of the present invention, as shown in Figure 8, embodiments of the present invention also provide a computer device 501, comprising:
[0088] At least one processor 520; and
[0089] A memory 510 storing a computer program 511 that can run on the processor, wherein the processor 520, when executing the program, performs the steps of any of the RAID remapping methods described above.
[0090] Based on the same inventive concept, according to another aspect of the present invention, as shown in Figure 9, embodiments of the present invention also provide a computer-readable storage medium 601, which stores a computer program 610. When the computer program 610 is executed by a processor, it performs the steps of any of the RAID remapping methods described above.
[0091] Finally, it should be noted that those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The program can be stored in a computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods.
[0092] Furthermore, it should be understood that the computer-readable storage medium (e.g., memory) described herein may be volatile memory or non-volatile memory, or may include both volatile memory and non-volatile memory.
[0093] Those skilled in the art will also understand that the various exemplary logic blocks, modules, circuits, and algorithm steps described in conjunction with the disclosure herein can be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability between hardware and software, the functionality of various illustrative components, blocks, modules, circuits, and steps has been generally described. Whether this functionality is implemented as software or as hardware depends on the specific application and the design constraints imposed on the system as a whole. Those skilled in the art can implement the functionality in various ways for each specific application, but such implementation decisions should not be construed as departing from the scope of the embodiments disclosed herein.
[0094] The above are exemplary embodiments disclosed in this invention. However, it should be noted that various changes and modifications can be made without departing from the scope of the embodiments of this invention as defined by the claims. The functions, steps, and / or actions of the methods according to the disclosed embodiments described herein do not need to be performed in any particular order. Furthermore, although the elements disclosed in the embodiments of this invention may be described or claimed individually, they may be understood as multiple unless explicitly limited to a singular number.
[0095] It should be understood that, as used herein, the singular form “a” is intended to include the plural form as well, unless the context clearly supports an exception. It should also be understood that, as used herein, “and / or” refers to any and all possible combinations of one or more of the associated listed items.
[0096] The embodiment numbers disclosed in the above embodiments of the present invention are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.
[0097] Those skilled in the art will understand that all or part of the steps of the above embodiments can be implemented by hardware or by a program instructing related hardware. The program can be stored in a computer-readable storage medium, such as a read-only memory, a disk, or an optical disk.
[0098] Those skilled in the art should understand that the discussion of any of the above embodiments is merely exemplary and is not intended to imply that the scope of the invention (including the claims) is limited to these examples. Within the framework of the invention, technical features of the above embodiments or different embodiments can be combined, and many other variations of different aspects of the invention exist, which are not provided in the details for the sake of brevity. Therefore, any omissions, modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the invention should be included within the protection scope of the invention.
Claims
1. A RAID remapping method, characterized in that it includes the following steps: obtaining the location on the RAID disk array of each data block that makes up the data to be remapped; determining the disk containing each data block based on the location of each data block on the RAID disk array; reading the corresponding data blocks from each of the disks respectively; and placing the data blocks read from each disk into the page table in memory in sequence according to the disk number; wherein the step of obtaining the location of each data block constituting the data to be remapped on the RAID disk array further includes: obtaining the parameters of the RAID disk array and the total length of the data to be remapped; and determining the position and length of each data block on the RAID disk array based on the parameters and the total length; the step of obtaining the parameters of the RAID disk array further includes: obtaining the stripe unit size, number of disks, starting disk location, and parity disk location; and the step of placing the data blocks read from each disk into a page table in memory in sequence according to the disk number further includes: determining the access address of each data block in the page table based on the length of each data block and the current offset of the page table.
2. The method as described in claim 1, characterized in that it also includes: determining whether the length of the data to be remapped is greater than the remaining space in the page table; and in response to the length not being greater than the remaining space, directly placing the data blocks read from each disk into the page table in memory in sequence according to the disk number.
3. The method as described in claim 2, characterized in that it also includes: in response to the length of the data to be remapped being greater than the remaining space in the page table, before each placement of the currently read data block into the page table, determining whether the access address of the currently read data block is less than the maximum address of the page table and whether the length of the currently read data block multiplied by the width of the page table plus the access address of the currently read data block is not greater than the maximum address of the page table; and if the access address of the currently read data block is less than the maximum address of the page table, and the length of the currently read data block multiplied by the width of the page table plus the access address of the currently read data block is not greater than the maximum address of the page table, directly placing the currently read data block into the page table.
4. The method as described in claim 3, characterized in that it also includes: if the access address of the currently read data block is greater than the maximum address of the page table, or if the length of the currently read data block multiplied by the width of the page table plus the access address of the currently read data block is greater than the maximum address of the page table, dividing the currently read data block into a first sub-data block and a second sub-data block; and placing the first sub-data block into the page table and the second sub-data block into the next page table.
5. A RAID remapping system, characterized in that it includes: an acquisition module configured to acquire the location on the RAID disk array of each data block that constitutes the data to be remapped; a determination module configured to determine the disk where each data block is located based on the location of each data block on the RAID disk array; a reading module configured to read the corresponding data blocks from each of the disks; and a storage module configured to sequentially place the data blocks read from each disk into a page table in memory according to the disk number; wherein the acquisition module is further configured to: obtain the parameters of the RAID disk array and the total length of the data to be remapped; and determine the position and length of each data block on the RAID disk array based on the parameters and the total length; obtaining the parameters of the RAID disk array further includes obtaining the stripe unit size, number of disks, starting disk location, and parity disk location; and the storage module is further configured to determine the access address of each data block in the page table based on the length of each data block and the current offset of the page table.
6. A computer device, comprising: at least one processor; and a memory storing a computer program executable on the processor, characterized in that the processor, when executing the program, performs the steps of the method as described in any one of claims 1-4.
7. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, it performs the steps of the method as described in any one of claims 1-4.