Data storage method and system
By having a target hard drive in the data storage system change the data redundancy mechanism from mirror backup to verification data, the dependence of RAID conversion on a RAID controller is removed, enabling efficient conversion of the data redundancy mechanism and improving hard drive utilization and storage efficiency.
Patent Information
- Authority / Receiving Office: WO · WO
- Patent Type: Applications
- Current Assignee / Owner: HUAWEI TECH CO LTD
- Filing Date: 2025-12-09
- Publication Date: 2026-06-18
AI Technical Summary
Existing RAID conversion solutions rely on a RAID controller, consume substantial controller resources, and support only conversion from RAID 1 to RAID 5. Their application scenarios are therefore limited, and they cannot meet the conversion needs of other data redundancy mechanisms.
A target hard drive in the data storage system changes the data redundancy mechanism from mirror backup to verification data. This neither consumes host resources nor depends on a RAID controller, supports conversion from mirror backup to any redundancy mechanism based on verification data, and suits a variety of scenarios.
This approach achieves improved hard disk utilization while ensuring data reliability, adapting to new demands of data storage systems, reducing the occupation of host resources, and ensuring the efficient execution of storage tasks.
Abstract
Description
A data storage method and system
[0001] This application claims priority to Chinese Patent Application No. 202411848258.7, filed on December 13, 2024, entitled "A Data Storage Method and System", the entire contents of which are incorporated herein by reference.
Technical Field
[0002] This application relates to the field of storage technology, and in particular to a data storage method and system.
Background Technology
[0003] To provide reliable data services, data storage systems typically use redundant arrays of independent disks (RAID) to store data. There are many RAID types, including RAID 0, RAID 1, RAID 3, RAID 5, RAID 6, RAID 50, and RAID 60, among others. Different RAID types differ in disk utilization, reliability, read/write performance, and redundancy calculation. However, for any RAID type, reliability trades off against disk utilization. The more backup data a RAID type requires, the higher its reliability and the lower its disk utilization (e.g., RAID 1); conversely, the less backup data a RAID type requires, the lower its reliability and the higher its disk utilization (e.g., RAID 0).
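The trade-off can be made concrete with the standard usable-capacity fractions of a few RAID types. The following is a minimal sketch, assuming n equal-sized disks and treating RAID 1 as an n-way mirror; the function name `usable_fraction` is illustrative and does not come from this application.

```python
from fractions import Fraction


def usable_fraction(raid_type: str, n: int) -> Fraction:
    """Return the share of raw capacity left for user data on n equal disks."""
    if raid_type == "RAID 0":
        if n < 1:
            raise ValueError("RAID 0 needs at least 1 disk")
        return Fraction(1)            # striping only, no redundant data
    if raid_type == "RAID 1":
        if n < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return Fraction(1, n)         # assumption: every disk holds a full copy
    if raid_type == "RAID 5":
        if n < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return Fraction(n - 1, n)     # one disk's worth of parity
    if raid_type == "RAID 6":
        if n < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return Fraction(n - 2, n)     # two disks' worth of parity
    raise ValueError(f"unsupported RAID type: {raid_type}")


for raid_type, n in [("RAID 0", 2), ("RAID 1", 2), ("RAID 5", 3), ("RAID 5", 6)]:
    print(raid_type, n, usable_fraction(raid_type, n))
```

A two-way mirror leaves half of the raw capacity usable, while single-parity layouts approach full utilization as n grows.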
[0004] To address changing system requirements, such as increased storage space needs and the need for higher hard drive utilization, a RAID controller can be used to convert the original RAID 1 configuration of the data storage system to RAID 5. This increases hard drive utilization from 50% to (n-1) / n, where n > 2, and n is the number of disks in the data storage system. As the number of disks n increases, the hard drive utilization of RAID 5 also increases. However, this solution is only suitable for scenarios with a RAID controller, and the conversion process consumes a significant amount of the RAID controller's resources.
Summary of the Invention
[0005] To address the aforementioned technical issues, this application provides a data storage method and system that can modify the data redundancy mechanism of stored data without consuming host resources or relying on a RAID controller, making it applicable to a wide range of scenarios.
[0006] In a first aspect, a data storage system is provided, comprising multiple hard disks. The multiple hard disks include N first hard disks. Each of the N first hard disks stores first data. A target hard disk among the multiple hard disks is configured to instruct M of the first hard disks to delete the first data based on first verification data. The first verification data is calculated based on the first data and is used to verify the first data. N ≥ 2 and M < N, where N is an integer and M is a positive integer.
[0007] In the above solution, the data redundancy mechanism in the data storage system is modified using the resources of the target hard disk. Existing conversion solutions support only conversion from RAID 1 to RAID 5 by a RAID controller, so they apply only where a RAID controller is present and consume significant RAID controller resources during conversion. Because this solution performs the modification on the hard disk, it consumes no host resources and does not rely on a RAID controller, and therefore applies to scenarios both with and without a RAID controller. In addition, this solution can be applied where there is only one mirror backup (corresponding to N=2 above) as well as where there is more than one mirror backup (corresponding to N>2 above). It also supports converting the original mirror-based data redundancy mechanism to any data redundancy mechanism based on verification data, so it is not limited to RAID conversion scenarios. By changing the data redundancy mechanism from mirror backup to verification data, this technical solution ensures data reliability while saving storage space, improving hard disk utilization, and addressing new demands that arise while the data storage system is in operation.
[0008] In some possible implementations, the target hard disk is specifically configured to acquire the first data, apply a verification algorithm to the first data to obtain the first verification data, and instruct the M first hard disks to delete the first data based on the first verification data.
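The conversion step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: byte-wise XOR stands in for the verification algorithm, disks are modelled as in-memory dictionaries, and the names `Disk`, `xor_blocks`, `convert_mirror_to_parity`, `peers`, and `parity_disk` are all hypothetical. Where the verification data is kept, and which M disks are chosen, are also assumptions.

```python
from dataclasses import dataclass, field
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (a simple parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


@dataclass
class Disk:
    name: str
    blocks: dict = field(default_factory=dict)   # address -> bytes


def convert_mirror_to_parity(first_disks, peers, parity_disk, address, m):
    """Replace M of the N mirror copies at `address` with verification data.

    `peers` are hypothetical disks holding other data in the same stripe.
    Returns the number of first disks that still hold the data (N - M).
    """
    n = len(first_disks)
    if n < 2 or not 1 <= m < n:
        raise ValueError("requires N >= 2 and 1 <= M < N")
    first_data = first_disks[0].blocks[address]            # acquire the first data
    stripe = [first_data] + [d.blocks[address] for d in peers]
    parity_disk.blocks[address] = xor_blocks(stripe)        # persist verification data
    for disk in first_disks[-m:]:                           # only then drop M copies
        del disk.blocks[address]
    return n - m


mirrors = [Disk(f"first-{i}", {0: b"\x0f\xf0"}) for i in range(3)]
peer = Disk("peer", {0: b"\xff\x00"})
parity = Disk("parity")
remaining = convert_mirror_to_parity(mirrors, [peer], parity, address=0, m=2)
```

Deleting the copies only after the verification data has been written keeps the data recoverable at every point of the conversion.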
[0009] In some possible implementations, the multiple hard disks include a second hard disk. The second hard disk stores second verification data, which is used to verify second data. The second data is the data stored on the first hard disks before the first data is stored, and the first data overwrites the second data. The target hard disk is specifically configured to send a first I/O instruction to the second hard disk, receive a first completion message sent by the second hard disk after it completes the first I/O instruction, and instruct the M first hard disks to delete the first data based on the first completion message. The first I/O instruction instructs the second hard disk to acquire the first data, apply a verification algorithm to the first data to obtain the first verification data, and overwrite the second verification data with the first verification data.
[0010] In some possible implementations, the verification algorithm is a RAID verification algorithm or an erasure coding (EC) algorithm.
[0011] This technical solution provides two ways to calculate the first verification data. In the first, the target hard disk calculates the first verification data itself; in the second, the target hard disk instructs the second hard disk to calculate it. The first way uses the resources of the target hard disk, including computing resources (such as computing power) and storage resources (such as cache). The target hard disk does not need to send the first I/O instruction to the second hard disk and wait for its execution result, so the calculation time excludes the communication time between the target hard disk and the second hard disk, and the first verification data is obtained quickly. The M first hard disks can then delete the first data sooner, which improves the efficiency of converting the data redundancy mechanism in the data storage system. The second way uses the resources of the second hard disk, which reduces the workload of the target hard disk and avoids excessive occupation of its resources.
[0012] In some possible implementations, the multiple hard disks include P third hard disks. Each of the P third hard disks stores third data. The target hard disk is configured to instruct Q of the third hard disks to delete the third data based on third verification data. The third verification data is calculated based on the first data and the third data, or based on the first verification data and the third data. The third verification data overwrites the first verification data and is used to verify the first data or the third data. P ≥ 2 and Q < P, where P is an integer and Q is a positive integer.
[0013] In scenarios where the first and third data share the same verification data, the third verification data can be used to verify both the first and third data. Specifically, when the first data is corrupted, the corrupted first data can be recovered using data from other hard drives (including the third data) and the third verification data; similarly, when the third data is corrupted, the corrupted third data can also be recovered using data from other hard drives (including the first data) and the third verification data. Compared to providing a mirror backup for each piece of data in the data storage system, the data redundancy mechanism based on verification data not only ensures data reliability but also allows multiple pieces of data to share the same verification data, further saving storage space and improving hard drive utilization.
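The recovery property of shared verification data can be illustrated with XOR parity. This is a sketch only: XOR is one possible verification algorithm, and the helper `xor_blocks` and the sample byte strings are assumptions made for the example.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


first_data = b"first-data!"
third_data = b"third-data!"

# One piece of verification data is shared by both pieces of data.
shared_verification = xor_blocks(first_data, third_data)

# If either piece is lost, the other piece plus the verification data restores it.
recovered_first = xor_blocks(shared_verification, third_data)
recovered_third = xor_blocks(shared_verification, first_data)
```

Two pieces of data are protected by one block of verification data, whereas mirroring would need one extra full copy of each.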
[0014] In some possible implementations, the aforementioned data storage system includes a host. The host is used to retrieve first data and instruct N first hard disks to store the first data.
[0015] In the above scheme, the host performs the storage of the first data through mirror backup, ensuring the real-time reliability of the first data, with high execution efficiency and low host resource consumption. Modifying the data redundancy mechanism in the data storage system by the target hard drive does not consume host resources or depend on the RAID controller, making it applicable to a wide range of scenarios. Furthermore, modifying the data redundancy mechanism by the target hard drive ensures that the host's storage task execution process is independent of the data redundancy mechanism conversion process, without affecting the execution efficiency of the storage task.
[0016] In some possible implementations, the host is specifically configured to send a second I/O instruction to each first hard disk and receive a second completion message sent by each first hard disk after it completes the second I/O instruction. The second I/O instruction carries the first data and instructs the first hard disk to store the first data.
[0017] In some possible implementations, the host is specifically configured to send a third I/O instruction to each first hard disk and receive a third completion message sent by each first hard disk after it completes the third I/O instruction. The third I/O instruction carries the host's memory address and instructs the first hard disk to read the first data from the host's memory and store the first data.
[0018] In some possible implementations, the host is specifically configured to send a fourth I/O instruction to the target hard disk and receive a fourth completion message sent by the target hard disk after it completes the fourth I/O instruction. The fourth I/O instruction instructs the target hard disk to notify the N first hard disks to read the first data from the host's memory and store the first data.
[0019] In some possible implementations, the host is specifically used to transmit first data and identification information of each first hard disk to the high-speed interconnect bus, so that the high-speed interconnect bus transmits the first data to each first hard disk according to the identification information.
[0020] This technical solution provides four implementation methods for the host to instruct N first hard disks to store first data, which are used to adapt to the needs of storing first data on N first hard disks in various scenarios, enabling the host to complete the storage task of first data efficiently and quickly.
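Three of the four ways can be contrasted in a small model. This is a hedged sketch: the class `Disk`, its methods `store`, `fetch_and_store`, and `relay`, the dictionary used as host memory, and the string completion messages are all hypothetical names chosen for illustration. The bus-based broadcast is omitted because it depends on interconnect behavior that the text does not detail.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    blocks: dict = field(default_factory=dict)   # disk address -> bytes

    def store(self, address, data):
        """Second-instruction style: the instruction itself carries the data."""
        self.blocks[address] = data
        return "complete"

    def fetch_and_store(self, address, host_memory, memory_address):
        """Third-instruction style: the disk reads the data from host memory."""
        self.blocks[address] = host_memory[memory_address]
        return "complete"

    def relay(self, first_disks, address, host_memory, memory_address):
        """Fourth-instruction style: a target disk notifies the N first disks."""
        acks = [d.fetch_and_store(address, host_memory, memory_address)
                for d in first_disks]
        return "complete" if all(a == "complete" for a in acks) else "failed"


host_memory = {0x1000: b"first data"}

pushed = [Disk() for _ in range(3)]
push_acks = [d.store(7, host_memory[0x1000]) for d in pushed]

pulled = [Disk() for _ in range(3)]
pull_acks = [d.fetch_and_store(7, host_memory, 0x1000) for d in pulled]

relayed = [Disk() for _ in range(3)]
target = Disk()
relay_ack = target.relay(relayed, 7, host_memory, 0x1000)
```

In this model the host sends N messages in the first two styles and only one in the third, which is the kind of difference that could make one style preferable in a given scenario.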
[0021] In some possible implementations, when the first data is in the non-backup phase or the backup phase, the host is allowed to read the second data from the first hard disks; when the first data is in the conversion phase, the host is allowed to read the first data from the first hard disks. Here, the second data is the data stored on the first hard disks before the first data is stored, and the first data overwrites the second data. The non-backup phase is the time period after the host obtains the first data and before it instructs the N first hard disks to store the first data. The backup phase is the time period during which the number of first hard disks storing the first data increases from 0 to N. The conversion phase is the time period during which the number of first hard disks storing the first data decreases from N to (N-M).
[0022] This technical solution addresses scenarios where a data storage system receives multiple business requests, providing a comprehensive solution for request conflicts throughout the entire data storage process. For conflicts between write and read requests, this solution supports executing read requests throughout the entire process of storing the first data. Specifically, if the first data has not been mirrored and backed up, it supports reading the old data (i.e., the second data) from the data storage system; if the first data has been mirrored and backed up, it supports reading the new data (i.e., the first data) from the data storage system. This ensures that the data storage system remains readable while processing the first data, without affecting the execution of read requests, guaranteeing uninterrupted business operations and maintaining the performance of online services.
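The read rule for the three phases can be written as a small lookup. This is a minimal sketch; the enum `Phase`, its member names, and the function `readable_version` are illustrative names and are not defined in this application.

```python
from enum import Enum, auto


class Phase(Enum):
    NON_BACKUP = auto()   # host holds the first data, no disk instructed yet
    BACKUP = auto()       # copies of the first data growing from 0 to N
    CONVERSION = auto()   # copies of the first data shrinking from N to N - M


def readable_version(phase: Phase) -> str:
    """Return which version a read request is served from in the given phase."""
    if phase in (Phase.NON_BACKUP, Phase.BACKUP):
        return "second data"   # old data, the mirror backup is not complete
    return "first data"        # new data, the mirror backup is complete


served = {phase.name: readable_version(phase) for phase in Phase}
```

Because every phase maps to exactly one readable version, a read request never has to wait for the write to finish.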
[0023] In some possible implementations, the host's memory contains the first data and fourth data. The fourth data is stored in the host's memory after the first data, and the fourth data and the first data are stored in the same location on the N first hard disks. When the first data is in the non-backup phase, or when the number of first hard disks storing the first data is (N-M), the host performs a first operation; when the first data is in the backup phase, the host performs the first operation and instructs the target hard disk to stop instructing the M first hard disks to delete the first data based on the first verification data. The first operation involves instructing the N first hard disks to store the fourth data and, once the fourth data is stored on all N first hard disks, instructing the target hard disk to instruct the M first hard disks to delete the fourth data based on fourth verification data. The fourth verification data is calculated based on the fourth data and is used to verify the fourth data.
[0024] To address write request conflicts, this technical solution determines whether to continue the data redundancy mechanism conversion related to the first data based on the current stage of the first data. If the data redundancy mechanism conversion related to the first data has not yet started, it can be skipped. Instead, a mirror backup of the latest data (i.e., the fourth data) and the data redundancy mechanism conversion related to the latest data can be performed directly in the data storage system. This reduces the data storage cost of the data storage system.
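The decision for a newer write to the same location can be sketched as a planning function. This is an illustration under assumptions: the enum `Phase`, the function `plan_newer_write`, and the action strings are hypothetical, the ordering of the actions is a guess, and the case of a conversion that is still in progress is left undefined because the text above does not spell it out.

```python
from enum import Enum, auto


class Phase(Enum):
    NON_BACKUP = auto()
    BACKUP = auto()
    CONVERSION = auto()


FIRST_OPERATION = (
    "instruct N first disks to store the fourth data",
    "after all N copies exist, have the target disk convert the fourth data",
)


def plan_newer_write(phase: Phase, copies_of_first: int, n: int, m: int):
    """Return the actions the host takes when fourth data supersedes first data."""
    if phase is Phase.NON_BACKUP or copies_of_first == n - m:
        return list(FIRST_OPERATION)
    if phase is Phase.BACKUP:
        # The conversion of the first data has not started, so it is skipped.
        return ["tell the target disk to stop the conversion of the first data",
                *FIRST_OPERATION]
    return None   # conversion in progress: not specified in the text


plan_idle = plan_newer_write(Phase.NON_BACKUP, 0, n=3, m=2)
plan_backup = plan_newer_write(Phase.BACKUP, 2, n=3, m=2)
plan_done = plan_newer_write(Phase.CONVERSION, 1, n=3, m=2)
plan_running = plan_newer_write(Phase.CONVERSION, 2, n=3, m=2)
```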
[0025] In a second aspect, a data storage method is provided, applied to a data storage system. The data storage system includes multiple hard disks. The multiple hard disks include N first hard disks. Each of the N first hard disks stores first data. The method includes: a target hard disk instructs M of the first hard disks to delete the first data based on first verification data. The first verification data is calculated based on the first data and is used to verify the first data. N ≥ 2 and M < N, where N is an integer and M is a positive integer.
[0026] In some possible implementations, the target hard disk instructing the M first hard disks to delete the first data based on the first verification data includes: the target hard disk acquiring the first data, applying the verification algorithm to the first data to obtain the first verification data, and instructing the M first hard disks to delete the first data based on the first verification data.
[0027] In some possible implementations, the multiple hard disks include a second hard disk. The second hard disk stores second verification data, which is used to verify second data. The second data is the data stored on the first hard disks before the first data is stored, and the first data overwrites the second data. The target hard disk instructing the M first hard disks to delete the first data based on the first verification data includes: the target hard disk sending a first I/O instruction to the second hard disk, receiving a first completion message sent by the second hard disk after it completes the first I/O instruction, and instructing the M first hard disks to delete the first data based on the first completion message. The first I/O instruction instructs the second hard disk to acquire the first data, apply a verification algorithm to the first data to obtain the first verification data, and overwrite the second verification data with the first verification data.
[0028] In some possible implementations, the verification algorithm is a RAID verification algorithm or an erasure coding (EC) algorithm.
[0029] In some possible implementations, the multiple hard disks include P third hard disks. Each of the P third hard disks stores third data. The method further includes: the target hard disk instructing Q of the third hard disks to delete the third data based on third verification data. The third verification data is calculated based on the first data and the third data, or based on the first verification data and the third data. The third verification data overwrites the first verification data and is used to verify the first data or the third data. P ≥ 2 and Q < P, where P is an integer and Q is a positive integer.
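The two ways of obtaining the third verification data give the same result when the verification algorithm is XOR parity. The sketch below assumes XOR, which is only one possible algorithm; the helper `xor_blocks`, the block named `other_data` (a hypothetical block already covered by the first verification data), and the sample values are introduced for illustration.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


first_data = bytes([0x12, 0x34, 0x56, 0x78])
other_data = bytes([0xAA, 0xBB, 0xCC, 0xDD])
third_data = bytes([0x0F, 0xF0, 0x0F, 0xF0])

first_verification = xor_blocks(first_data, other_data)

# Way 1: recompute from the data blocks themselves.
third_from_data = xor_blocks(first_data, other_data, third_data)

# Way 2: fold the third data into the existing first verification data.
third_from_verification = xor_blocks(first_verification, third_data)
```

The second way reads only the old verification data and the new data, so it avoids re-reading data that is already protected.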
[0030] In some possible implementations, the data storage system includes a host. Before the target hard disk instructs the M first hard disks to delete the first data based on the first verification data, the method further includes: the host acquiring the first data and instructing the N first hard disks to store the first data.
[0031] In some possible implementations, instructing the N first hard disks to store the first data includes: sending a second I/O instruction to each first hard disk and receiving a second completion message sent by each first hard disk after it completes the second I/O instruction. The second I/O instruction carries the first data and instructs the first hard disk to store the first data.
[0032] In some possible implementations, instructing the N first hard disks to store the first data includes: sending a third I/O instruction to each first hard disk and receiving a third completion message sent by each first hard disk after it completes the third I/O instruction. The third I/O instruction carries the host's memory address and instructs the first hard disk to read the first data from the host's memory and store the first data.
[0033] In some possible implementations, instructing the N first hard disks to store the first data includes: sending a fourth I/O instruction to the target hard disk and receiving a fourth completion message sent by the target hard disk after it completes the fourth I/O instruction. The fourth I/O instruction instructs the target hard disk to notify the N first hard disks to read the first data from the host's memory and store the first data.
[0034] In some possible implementations, the above instruction to N first hard disks to store first data includes: transmitting the first data and the identification information of each first hard disk to a high-speed interconnect bus, so that the high-speed interconnect bus transmits the first data to each first hard disk according to the identification information.
[0035] In some possible implementations, the method further includes: allowing the host to read the second data from the first hard disks when the first data is in the non-backup phase or the backup phase; and allowing the host to read the first data from the first hard disks when the first data is in the conversion phase. The second data is the data stored on the first hard disks before the first data is stored, and the first data overwrites the second data. The non-backup phase is the time period after the host obtains the first data and before it instructs the N first hard disks to store the first data. The backup phase is the time period during which the number of first hard disks storing the first data increases from 0 to N. The conversion phase is the time period during which the number of first hard disks storing the first data decreases from N to (N-M).
[0036] In some possible implementations, the host's memory contains the first data and fourth data. The fourth data is stored in the host's memory after the first data, and the fourth data and the first data are stored in the same location on the N first hard disks. The method further includes: when the first data is in the non-backup phase, or when the number of first hard disks storing the first data is (N-M), the host performs a first operation; when the first data is in the backup phase, the host performs the first operation and instructs the target hard disk to stop instructing the M first hard disks to delete the first data based on the first verification data. The first operation involves instructing the N first hard disks to store the fourth data and, once the fourth data is stored on all N first hard disks, instructing the target hard disk to instruct the M first hard disks to delete the fourth data based on fourth verification data. The fourth verification data is calculated based on the fourth data and is used to verify the fourth data.
[0037] In a third aspect, a computer program product containing instructions is provided. When the instructions are executed by a computing device, they cause the computing device to perform the method described in any implementation of the second aspect.
[0038] In a fourth aspect, a computer-readable storage medium is provided. The storage medium includes computer program instructions that, when executed by a computing device, cause the computing device to perform the method described in any implementation of the second aspect.
Attached Figure Description
[0039] Figure 1 is an architecture diagram of a data storage system provided in an embodiment of this application;
[0040] Figure 2 is a structural diagram of a data storage system provided in an embodiment of this application;
[0041] Figure 3 is a schematic diagram of a RAID conversion result provided in an embodiment of this application;
[0042] Figure 4 is a flowchart illustrating a data storage method provided in an embodiment of this application.
Detailed Implementation
[0043] The technical solutions of the embodiments of this application will be described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of this application.
[0044] To address the dependence of RAID conversion on a RAID controller, this application provides a data storage system in which the host stores data by mirror backup and a target hard disk then changes the data redundancy mechanism in the data storage system. This approach neither consumes host resources nor depends on a RAID controller, and data reliability is ensured throughout.
[0045] Referring to Figure 1, Figure 1 is an architecture diagram of a data storage system provided in an embodiment of this application. As shown in Figure 1, the architecture includes a client 10 and a data storage system 20. The client 10 and the data storage system 20 can communicate via wired or wireless means.
[0046] In some possible implementations, client 10 can be deployed on computing devices or terminal devices. Computing devices are electronic devices used for computing, processing, and storing data, including servers, supercomputers, personal computers, workstations, industrial control computers, mobile devices, etc. Terminal devices are electronic devices used to access computing devices, including personal computers, smartphones, handheld devices, tablets, mobile laptops, all-in-one handheld devices, smart conferencing devices, smart advertising devices, etc.
[0047] In some possible implementations, the data storage system 20 can be deployed on a computing device.
[0048] In some possible implementations, the client 10 and the data storage system 20 can be deployed on different computing devices, or on the same computing device. When the client 10 and the data storage system 20 are deployed on different computing devices, they can be deployed on different computing devices within the same computing device cluster, or on different computing devices within different computing clusters. The computing device cluster includes multiple of the aforementioned computing devices. In practical applications, the specific deployment of the client 10 and the data storage system 20 can be determined based on the actual application scenario, and this application does not impose specific limitations.
[0049] The data storage system architecture shown in Figure 1 provides users with read and write capabilities for online business operations. Specifically, users can write and read data within the data storage system.
[0050] The specific process of a user writing data to the data storage system is as follows: The user creates a first business request carrying first data on client 10 according to business needs, and sends the first business request to data storage system 20 through client 10. After receiving the first business request, data storage system 20 extracts the first data from the first business request and stores the first data. Subsequently, data storage system 20 sends a request completion message to client 10, notifying the user that the storage of the first data has been completed. In a specific implementation, the first business request carries not only the first data but also a first address. In this case, after receiving the first business request, data storage system 20 extracts the first data and the first address from the first business request and stores the first data in the storage space indicated by the first address.
[0051] The specific process by which a user reads data from the data storage system is as follows: The user creates a second business request carrying a first address on client 10 according to business needs, and sends the second business request to data storage system 20 through client 10. Upon receiving the second business request, data storage system 20 extracts the first address from the second business request, retrieves the first data from the storage space indicated by the first address, generates a request completion message carrying the first data, and sends the request completion message to client 10, notifying the user that the reading of the first data has been completed. Upon receiving the request completion message, client 10 extracts the first data from the request completion message.
[0052] It should be understood that Figure 1 shows only one client 10. In practical applications, more clients 10 can be connected to the data storage system 20, and this application does not impose specific limitations. When multiple clients 10 are connected to the data storage system 20, the data storage system 20 can provide online read and write functions for multiple users.
[0053] The specific composition and functions of the data storage system 20 will be further described below with reference to Figure 2.
[0054] Referring to Figure 2, which is a structural diagram of a data storage system provided in an embodiment of this application, the data storage system 20 includes a host 21, hard disks 22, and a high-speed interconnect bus 23. The host 21 communicates with the hard disks 22, and the hard disks 22 communicate with one another, through the high-speed interconnect bus 23.
[0055] The host 21 can be, for example, a central processing unit (CPU), a data processing unit (DPU), a graphics processing unit (GPU), a neural network processing unit (NPU), etc., or an application-specific integrated circuit (ASIC), or one or more integrated circuits.
[0056] Hard drive 22 can be either a physical hard drive or a virtual hard drive. A physical hard drive refers to an actual hard drive device, such as a hard disk drive (HDD) or a solid-state drive (SSD). A virtual hard drive refers to a hard drive device simulated by software. A virtual hard drive can be a virtual disk created on a physical hard drive, or a virtual disk created on a network storage device (such as direct-attached storage, network-attached storage, storage area networks, etc.), such as a virtual hard drive in a virtual machine (e.g., VMDK, VHD, VDI formats), or a virtual hard drive in a cloud storage service, etc.
[0057] Since there are various types of hard disk 22, for ease of explanation, the following description takes a solid-state drive as an example of hard disk 22.
[0058] When hard disk 22 is an SSD (see hard disk 22 in Figure 2), hard disk 22 is a storage device built from flash memory chips and includes an SSD controller and storage media. The SSD controller executes input/output (I/O) instructions from the host 21. The SSD controller can be a chip, such as a field-programmable gate array (FPGA) or an ASIC. The storage media consist of several flash memory chips. Each flash memory chip can be divided into several physical blocks of a fixed size, so the physical blocks have a standard capacity, for example, 2 to the power of A, where A is a positive integer. Optionally, the SSD controller and the flash memory chips of the storage media can be mounted on the same printed circuit board (PCB), presented as a disk or card, and communicate with the host 21 or other hard disks 22 over the high-speed interconnect bus 23 through the I/O interface on the PCB. If the I/O instruction received by hard disk 22 is a write instruction, the SSD controller concurrently writes the data received from outside hard disk 22 to one or more physical blocks of the storage media. If the I/O instruction received by hard disk 22 is a read instruction, the SSD controller concurrently reads data from one or more physical blocks of the storage media and then sends the data out of hard disk 22.
[0059] The high-speed interconnect bus 23 refers to the communication bus connecting the host 21, hard disk 22, and other devices. The host 21, hard disk 22, and other devices connected via the high-speed interconnect bus 23 have equal status, thus allowing direct data exchange and communication between these devices. The high-speed interconnect bus 23 can be, for example, a unified bus (UB), an NV-LINK bus, a peripheral component interconnect express (PCIe) bus, etc.
[0060] When the data storage system 20 is deployed on a computing device, the data storage system 20 can be a single computing device or a cluster of computing devices.
[0061] When the data storage system 20 is a computing device, the host 21, hard disk 22, and high-speed interconnect bus 23 are deployed on the same computing device. The specific deployment result is shown in Figure 2.
[0062] When the data storage system 20 is a cluster of computing devices, it can be that the host 21 and at least one hard disk 22 are deployed on the same computing device, while the remaining hard disks 22 are deployed on another computing device; or the host 21 and at least one hard disk 22 are deployed on the same computing device, while the remaining hard disks 22 are distributed across multiple computing devices; or the host 21 is deployed on a single computing device, while all hard disks 22 are deployed on another computing device; or the host 21 is deployed on a single computing device, while multiple hard disks 22 are distributed across multiple computing devices, and so on. In practical applications, the host 21 and hard disks 22 can be deployed as needed according to the actual application scenario, and this application does not impose specific limitations.
[0063] It should be understood that the number of hosts and hard disks in the data storage system is illustrated using the data storage system 20 shown in Figure 2, which includes one host 21 and six hard disks 22 as an example. In practical applications, the number of hosts and hard disks in the data storage system can be more or less, and this application does not impose specific limitations. When the data storage system includes multiple hosts, the deployment method of hosts and hard disks in the data storage system can refer to the deployment method of the data storage system 20 on computing devices (clusters) described above. Alternatively, the hosts and hard disks can be deployed in two separate device clusters, where the device cluster with hosts can be a computing device cluster, and the device cluster with hard disks can be a storage device cluster.
[0064] In some possible implementations, multiple hard disks 22 in the data storage system 20 collectively provide storage resources. In the data storage system 20, each hard disk can contribute a portion or all of the device's storage space. By integrating and managing the storage space of multiple hard disks 22, the data storage system 20 provides unified storage resources for upper-layer services, thereby better leveraging storage performance advantages. This application does not impose any limitations on the number, type, or specifications of the hard disks included in the data storage system 20.
[0065] In some potential application scenarios, data storage systems typically employ data redundancy mechanisms, such as redundant arrays of independent disks (RAID), to provide reliable data services. However, the data redundancy mechanisms configured before the data storage system is put into use do not always support its sustainable operation. As the amount of business data in the data storage system increases, the system's requirements also change, specifically requiring increased storage space and higher disk utilization. At this point, it becomes necessary to modify the data redundancy mechanism to ensure the system's sustainable operation while continuing to provide reliable data services. Data reliability refers to the stability and reliability of data during storage, transmission, and processing. Data reliability requires that data can be correctly accessed, used, and recovered when needed.
[0066] The following example scenario will illustrate this point.
[0067] Referring to Figure 3, Figure 3 is a schematic diagram of the result of a RAID conversion provided in an embodiment of this application.
[0068] As shown in the left figure of Figure 3, the data storage system originally consisted of two hard drives. Before being put into use, the two hard drives were configured as RAID 1, with one hard drive serving as the primary drive and the other as the mirror drive. When a user needs to write target data to the data storage system, the host obtains the target data and writes it to both the primary and mirror drives. Once the target data is successfully written to both hard drives, the host sends a completion message to the client, indicating that the write operation is complete. For example, after obtaining target data A0, the host stores target data A0 on both the primary and mirror drives. When a user needs to read target data from the data storage system, the host reads the target data from either the primary or mirror drive and sends it to the client, indicating that the read operation is complete. If one hard drive fails, the host can read the target data from the other hard drive. Therefore, RAID 1 can provide redundancy protection for data. However, RAID 1 limits the hard drive utilization of the data storage system to only 50%, and RAID 1 can only be applied to two hard drives—no more and no less. Therefore, the amount of business data stored in a data storage system configured with RAID 1 cannot exceed the storage capacity of the primary hard drive.
[0069] As shown in the middle diagram of Figure 3, to store more business data, new hard drives can be added to the data storage system to expand its storage capacity. In the middle diagram of Figure 3, four new hard drives have been added.
[0070] As shown in the right diagram of Figure 3, adding new hard drives gives the data storage system more data redundancy mechanisms to choose from. Figure 3 shows only two schemes: RAID 10 and RAID 3. RAID 10 refers to a combination of RAID 0 and RAID 1.
[0071] In the first approach, the data redundancy mechanism of the data storage system is converted from RAID 1 to RAID 10. Specifically, the original configuration of two hard drives in the data storage system is maintained, and four new hard drives are grouped in pairs, with each pair configured as RAID 1, making the overall data storage system a RAID 10 configuration. After converting the data redundancy mechanism from RAID 1 to RAID 10, when the host obtains the target data, it splits the target data into three parts and stores them in three groups. Each part of the data is written simultaneously to two hard drives in one group, ensuring that each part of the data has a mirror backup. For example, the host splits the target data A0 into data A1, data A2, and data A3, using the first group of hard drives to store two copies of data A1, the second group to store two copies of data A2, and the third group to store two copies of data A3. Although the first approach maintains high data reliability, the hard drive utilization rate of the data storage system is still only 50%. To store more business data in the future, new hard drives must be continuously added.
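The RAID 10 write path described above can be sketched as follows. This is a minimal illustration, not the implementation of this application: the function name `raid10_write`, the in-memory representation of a group as a pair of byte strings, and the equal-size split are assumptions made for the example.

```python
def raid10_write(data: bytes, groups: int = 3) -> list[tuple[bytes, bytes]]:
    """Split data into `groups` parts and mirror each part inside its group."""
    size = -(-len(data) // groups)  # ceiling division: bytes per part
    parts = [data[i * size:(i + 1) * size] for i in range(groups)]
    # Each group holds two identical copies: (primary copy, mirror copy).
    return [(part, part) for part in parts]


target = b"ABCDEFGHI"                 # plays the role of target data A0
layout = raid10_write(target)         # A1, A2, A3, each stored twice
stored = sum(len(a) + len(b) for a, b in layout)
utilization = len(target) / stored    # useful bytes / bytes occupied
```

Because every part is written twice, `utilization` stays at one half no matter how many mirrored groups are added, which matches the 50% figure given above.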
[0072] In the second approach, the data redundancy mechanism of the data storage system is converted from RAID 1 to RAID 3. Specifically, the original primary hard drive, the original mirror hard drive, and three of the new hard drives (five hard drives in total) are all used as data disks for storing data, and the last new hard drive is used as a parity disk for storing parity data. After the data redundancy mechanism is converted from RAID 1 to RAID 3, when the host obtains the target data, the host splits the target data into five parts and stores them on the five data disks respectively. An XOR operation is performed on the five parts to obtain the parity data, which is then stored on the parity disk. For example, the host splits the target data A0 into data A1, A2, A3, A4, and A5, stores them on the five data disks respectively, computes the parity data PA0 = A1 XOR A2 XOR A3 XOR A4 XOR A5, and stores the parity data PA0 on the parity disk. When one of the data disks fails, the host can recover the damaged data using the data on the other data disks and the parity data on the parity disk. Therefore, the second approach still ensures high data reliability while increasing the hard drive utilization of the data storage system from 50% to approximately 83% (five data disks out of six hard drives). Thus, with the same number of hard drives, the second approach can store more business data than the first approach, thereby ensuring the long-term sustainable operation of the data storage system.
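The parity calculation and single-disk recovery described above can be illustrated with a short sketch. The helper names (`xor_bytes`, `split_with_parity`, `recover`) and the zero-padding of the last part are assumptions made for this example; only the relation PA0 = A1 XOR A2 XOR A3 XOR A4 XOR A5 comes from the description.

```python
from functools import reduce
from typing import Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes, data_disks: int = 5) -> tuple[list[bytes], bytes]:
    """Split data into equal parts (zero-padded) and compute their XOR parity."""
    size = -(-len(data) // data_disks)               # ceiling division
    padded = data.ljust(size * data_disks, b"\x00")  # assumption: pad with zeros
    parts = [padded[i * size:(i + 1) * size] for i in range(data_disks)]
    return parts, reduce(xor_bytes, parts)


def recover(parts: list[Optional[bytes]], parity: bytes) -> bytes:
    """Rebuild the single missing part (marked None) from survivors and parity."""
    survivors = [p for p in parts if p is not None]
    return reduce(xor_bytes, survivors, parity)


parts, pa0 = split_with_parity(b"0123456789")  # A1..A5 and PA0
damaged = list(parts)
damaged[2] = None                               # the disk holding A3 fails
rebuilt = recover(damaged, pa0)
utilization = 5 / 6                             # five data disks, one parity disk
```

XOR-ing the parity with the four surviving parts cancels them out and leaves exactly the lost part, which is why one parity disk is enough to tolerate one failed data disk.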
[0073] The above examples demonstrate that, on the one hand, new demands arising during the operation of a data storage system can be addressed by modifying its data redundancy mechanism. On the other hand, compared to mirror backups, using verification data to provide redundancy protection not only ensures data reliability but also saves storage space and improves hard drive utilization.
[0074] It should be understood that the above conversion of RAID1 to RAID3 is merely an example, and this application does not impose any specific limitations. In practical applications, RAID1 can also be converted to any of the RAID types such as RAID5, RAID6, RAID50, and RAID60, which use parity data to provide redundant protection for data. These conversion schemes can all improve hard drive utilization, thereby addressing new demands of data storage systems during operation.
[0075] To ensure data reliability, save storage space, improve hard disk utilization, and minimize the occupation of host resources while ensuring the efficiency of storage task execution, the data storage system 20 provided in this application performs the following operations when executing storage tasks:
[0076] Host 21 is used to obtain the first data and instruct N first hard disks to store the first data. When the first data is stored on all N first hard disks, it indicates that host 21 has completed the storage task with high reliability requirements. The N first hard disks belong to multiple hard disks 22 in the data storage system 20. N ≥ 2, where N is an integer.
[0077] In one possible application scenario, host 21 receives a first service request from client 10, extracts first data from the first service request, and then instructs N first hard disks to store the first data. When all N first hard disks contain the first data, the host can immediately send a request completion message to client 10, notifying the user that the storage of the first data has been completed while ensuring data reliability.
[0078] In another possible application scenario, if host 21 generates first data while running an application, then host 21 instructs N first hard disks to store the first data. When all N first hard disks contain the first data, it indicates that host 21 has completed the storage of the first data while ensuring data reliability.
[0079] It should be understood that the two application scenarios above differ only in whether the host 21 interacts with the client 10. The data storage method in Figure 4 below explains in detail how the data storage system 20 executes the storage task in the scenario where the host 21 interacts with the client 10. Specifically, for the process in which the host 21 obtains the first data and instructs N first hard disks to store the first data, see step S102 of the data storage method in Figure 4. In the scenario where the host 21 generates the first data itself, the process by which the data storage system 20 executes the storage task is similar to the steps described in the data storage method in Figure 4 and, for the sake of brevity, is not repeated.
[0080] The target hard disk is used to instruct M first hard disks to delete the first data, based on the first verification data, when the first data is stored on all N first hard disks. The target hard disk belongs to the multiple hard disks 22. The first verification data is calculated based on the first data and is used to verify the first data. M < N, where M is a positive integer.
[0081] The target hard disk can instruct the M first hard disks to delete the first data based on the first verification data in at least the following two ways:
[0082] Implementation Method 1: If the target hard disk is itself used to store the first verification data, the target hard disk obtains the first data and calculates the first verification data based on the first data. For details, see the description of Implementation Method 1 in step S104 of the data storage method in Figure 4.
[0083] Implementation Method 2: A second hard disk, different from the target hard disk, is used to store the first verification data. The target hard disk sends a first I / O instruction to the second hard disk via the high-speed interconnect bus 23, receives via the high-speed interconnect bus 23 a first completion message that the second hard disk sends after completing the first I / O instruction, and then instructs the M first hard disks to delete the first data based on the first completion message. The first I / O instruction instructs the second hard disk to retrieve the first data and calculate the first verification data based on the first data. For details, see the description of Implementation Method 2 in step S104 of the data storage method in Figure 4.
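The conversion performed by the target hard disk can be sketched as follows, under stated assumptions. The `Disk` class, the `convert` function, the `"verify:"` key prefix, and the `xor_fold` placeholder are all hypothetical names introduced for illustration; in particular, `xor_fold` merely stands in for whatever verification algorithm step S104 uses, and the I / O instruction and completion message of Implementation Method 2 are modelled as a plain function call rather than bus traffic.

```python
from functools import reduce
from typing import Callable


class Disk:
    """Hypothetical stand-in for a hard disk 22: a numbered key/value store."""

    def __init__(self, disk_no: int) -> None:
        self.disk_no = disk_no
        self.blocks: dict[str, bytes] = {}


def convert(target: Disk, first_disks: list[Disk], verify_disk: Disk,
            key: str, m: int, verify: Callable[[bytes], bytes]) -> int:
    """Logic run by the target disk; returns the implementation method used."""
    n = len(first_disks)
    if not 0 < m < n:
        raise ValueError("M must be a positive integer smaller than N")
    if any(key not in disk.blocks for disk in first_disks):
        raise RuntimeError("first data is not yet stored on all N first disks")
    first_data = first_disks[0].blocks[key]
    # Method 1: the target disk stores the verification data itself.
    # Method 2: a second disk stores it (instruction/completion not modelled).
    method = 1 if verify_disk is target else 2
    verify_disk.blocks["verify:" + key] = verify(first_data)
    # Mirror copies are deleted only after the verification data exists.
    for disk in first_disks[n - m:]:
        del disk.blocks[key]
    return method


def xor_fold(data: bytes) -> bytes:
    """Placeholder verification algorithm: XOR of all bytes (illustration only)."""
    return bytes([reduce(lambda a, b: a ^ b, data, 0)])


disks = [Disk(i) for i in range(1, 5)]
for disk in disks[:3]:                       # N = 3 mirror copies of first data
    disk.blocks["A0"] = b"first data"
method = convert(target=disks[3], first_disks=disks[:3], verify_disk=disks[3],
                 key="A0", m=2, verify=xor_fold)
```

The ordering is the point of the sketch: the M copies are removed only once the verification data has been written, so the first data is never left without redundancy protection.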
[0084] In summary, the data storage system 20 provided in this application, which modifies the data redundancy mechanism, has the following advantages:
[0085] Firstly, the host uses mirror backup to store the first data, ensuring its real-time reliability. Existing conversion schemes use a RAID controller to perform the RAID1 to RAID5 conversion. During the entire conversion process, only the primary hard drive of the RAID1 provides online business read / write services, while the mirror hard drive of the RAID1 participates in the conversion. The primary and mirror hard drives are mirrored asynchronously: new business data is written only to the primary hard drive, and it is synchronized to the mirror hard drive for conversion only when doing so will not affect the online services on the primary hard drive. If the primary hard drive fails before the new business data is synchronized to the mirror hard drive and the new business data is corrupted, the new business data cannot be recovered. It can be seen that such a conversion process cannot guarantee the real-time reliability of business data. In this technical solution, for any business data (such as the first data), the host always uses mirror backup to store that business data, thereby ensuring its real-time reliability. Furthermore, mirror backup can provide a high level of reliability; the more mirror backups, the higher the data reliability, thus meeting users' high reliability requirements for stored data.
[0086] Secondly, the host uses mirror backup to store the first data, which gives high execution efficiency and low consumption of host resources (such as computing and storage resources). Specifically, mirror backup is achieved through read and write operations alone, without any computation. The execution time is therefore short, host resources are occupied only briefly and released as quickly as possible, and the execution process consumes very few host resources.
[0087] Thirdly, this technical solution uses the resources of the target hard drive to modify the data redundancy mechanism in the data storage system. This neither consumes host resources nor depends on a RAID controller, making the solution suitable for a wide range of scenarios. Existing conversion solutions only support converting the original RAID1 to RAID5 using a RAID controller. Such solutions apply only to scenarios with a RAID controller, and the conversion process consumes significant RAID controller resources. In contrast, this technical solution modifies the data redundancy mechanism using the hard drive, so it is suitable for scenarios both with and without a RAID controller. Furthermore, existing conversion solutions only support RAID1 to RAID5 conversion, which limits their application scenarios. This technical solution can be applied to scenarios with only one mirror backup (corresponding to RAID1, i.e., the scenario where N=2 above) as well as scenarios with more than one mirror backup (corresponding to the scenario where N>2 above). It also supports converting the original data redundancy mechanism based on mirror backup into any data redundancy mechanism based on parity data. The RAID types that use parity data (such as RAID3, RAID5, RAID6, RAID50, RAID60, etc.) are only some of the data redundancy mechanisms based on parity data. Therefore, this solution is not limited to RAID conversion scenarios and can be applied to a wider range of scenarios. In addition, because the data redundancy mechanism is changed on the target hard drive, the host's execution of storage tasks is independent of the data redundancy mechanism conversion process, and the execution efficiency of storage tasks is not affected.
[0088] Fourthly, this technical solution converts the original data redundancy mechanism based on mirror backup into a data redundancy mechanism based on verification data. This not only ensures data reliability but also saves storage space, improves hard disk utilization, and addresses new demands of the data storage system during operation.
[0089] The data storage system provided by the embodiments of this application has been described above with reference to Figures 1 and 2. Next, a data storage method provided by an embodiment of this application will be introduced. It should be noted that the application scenarios of the data storage method provided by the embodiments of this application are not limited to the data storage systems shown in Figures 1 and 2. All scenarios in which the data storage method provided by the embodiments of this application can be applied are within the protection scope of this application.
[0090] Referring to Figure 4, Figure 4 is a flowchart illustrating a data storage method provided in an embodiment of this application. As shown in Figure 4, the data storage method provided in this application includes:
[0091] S101: The client sends the first service request to the data storage system.
[0092] Accordingly, the data storage system receives the first service request from the client.
[0093] The client can be client 10 as shown in Figure 1 or Figure 2. The data storage system can be data storage system 20 as shown in Figure 1 or Figure 2.
[0094] The first service request carries the first data. This first service request instructs the data storage system to store the first data and guarantee its reliability. Data reliability refers to the stability and reliability of the data during storage, transmission, and processing. Data reliability requires that the data can be correctly accessed, used, and recovered when needed.
[0095] In one specific implementation, the first service request carries not only the first data but also the first address. The first service request is used to instruct the data storage system to store the first data in the storage space indicated by the first address and to ensure the reliability of the first data.
[0096] The way an address is identified depends on the addressing method in the data storage system.
[0097] In some possible implementations, the data storage system uses logical unit numbers (LUNs) to address physical blocks on the hard disk. The specific implementation mainly includes the following configuration and application phases:
[0098] Configuration Phase: A LUN is created using physical blocks from multiple hard drives. Specifically, physical blocks on multiple hard drives are mapped into multiple logical blocks, logical block addresses (LBAs) are assigned to all logical blocks, and a mapping relationship is established between LBAs and physical block addresses (PBAs). When the mapping between physical and logical blocks is one-to-one, one PBA corresponds to one LBA; when it's one-to-many, one PBA corresponds to multiple LBAs; and when it's many-to-one, multiple PBAs correspond to one LBA. The mapping relationship between physical and logical blocks is determined by the user. For ease of explanation, the following descriptions of LBAs will consistently use the example of a one-to-one correspondence between LBAs and PBAs.
[0099] Application phase: LBAs are used to indicate the physical blocks that make up the LUN. Specifically, the PBA corresponding to a given LBA is determined by looking up the correspondence between LBAs and PBAs.
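The two phases can be sketched with a simple lookup table, assuming the one-to-one correspondence between LBAs and PBAs used in this description. The `PBA` tuple layout (hard drive number plus block number), the function names, and the consecutive LBA assignment order are assumptions made for illustration only.

```python
from typing import NamedTuple


class PBA(NamedTuple):
    """Hypothetical physical block address: hard drive number and block index."""
    disk_no: int
    block_no: int


def build_lun(blocks_per_disk: dict[int, int]) -> dict[int, PBA]:
    """Configuration phase: assign consecutive LBAs to physical blocks."""
    table: dict[int, PBA] = {}
    lba = 0
    for disk_no in sorted(blocks_per_disk):
        for block_no in range(blocks_per_disk[disk_no]):
            table[lba] = PBA(disk_no, block_no)
            lba += 1
    return table


def resolve(table: dict[int, PBA], lba: int) -> PBA:
    """Application phase: look up the PBA that corresponds to an LBA."""
    return table[lba]


lun = build_lun({1: 2, 2: 2})  # two hard drives contributing two blocks each
first_block = resolve(lun, 0)  # LBA 0 maps to a block on hard drive 1
```

With a one-to-many or many-to-one mapping, the table values would become lists of addresses instead of a single `PBA`.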
[0100] As can be seen, in a data storage system based on LUN storage, LBA is used to represent the address. Therefore, the first address carried in the first service request specifically refers to the LBA. For example, if the first address is LBA 0, then the first service request instructs the data storage system to store the first data in the physical block indicated by LBA 0. In practical applications, the number of first addresses in the first service request can be one or more; this application does not impose a specific limitation.
[0101] It should be understood that the above-described use of LBA to represent addresses is merely an example. In practical applications, other identifiers can also be used to represent addresses, and this application does not impose specific limitations. It is understood that when other identifiers are used to represent addresses, the content indicated by the first service request is similar to the content indicated by the first service request in the scenario described above where data is stored based on LUNs in a data storage system. For the sake of brevity, this will not be repeated here. Furthermore, for ease of explanation, LBA will be used to represent addresses uniformly in the following description.
[0102] S102: The host in the data storage system extracts the first data from the first service request and instructs N first hard disks to store the first data.
[0103] The host can be host 21 in the data storage system 20 in Figure 2.
[0104] In some possible implementations, hard drive identifiers (IDs), hard drive serial numbers, or hard drive numbers can be used as identification information to distinguish hard drives. Both the hard drive ID and the hard drive serial number are unique identifiers assigned by the hard drive manufacturer during production. However, the hard drive ID is typically a hardware-level identifier, while the hard drive serial number is one of the serial numbers of the hard drive device. The serial number is usually included in the hard drive ID, but the hard drive ID may also include other hardware information or identifiers. The hard drive number can be a sequential number assigned to all hard drives by the data storage system, specifically hard drive number 1, hard drive number 2, ..., hard drive number Z. For ease of explanation, the hard drive number will be used to identify different hard drives in the following text.
[0105] In some possible implementations, after receiving the first service request and extracting the first data from it, the host determines N first hard drives from the multiple hard drives in the data storage system. Here, N ≥ 2, and N is an integer. The method for determining the N first hard drives depends on the content of the first service request. If the first service request carries only the first data and not the first address, the host can arbitrarily select N hard drives as the first hard drives. For example, hard drives 1, 2, ..., and N can be selected as the first hard drives. If the first service request carries both the first data and the first address, the host uses the hard drive that provides the storage space indicated by the first address as one first hard drive, and then selects further hard drives from the remaining hard drives until the number of first hard drives is N. For example, if the first address is LBA 0 and the physical block corresponding to LBA 0 is in hard drive 1, then hard drive 1 is selected as a first hard drive, and hard drives 2, 3, ..., and N are selected as the remaining first hard drives. These hard drives can be hard drives 22 in the data storage system 20 of Figure 2.
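The selection rule above can be sketched as follows. The function name `select_first_disks` and its parameters are hypothetical, and picking the remaining hard drives in ascending number order is only one possible policy, since the description allows the host to choose them arbitrarily.

```python
from typing import Optional


def select_first_disks(disk_numbers: list[int], n: int,
                       address_owner: Optional[int] = None) -> list[int]:
    """Pick N first hard drives; `address_owner` is the drive that provides
    the storage space indicated by the first address, if one was supplied."""
    if n < 2 or n > len(disk_numbers):
        raise ValueError("N must satisfy 2 <= N <= number of hard drives")
    chosen: list[int] = []
    if address_owner is not None:
        if address_owner not in disk_numbers:
            raise ValueError("unknown hard drive number")
        chosen.append(address_owner)      # this drive must be a first drive
    for number in disk_numbers:           # fill up from the remaining drives
        if len(chosen) == n:
            break
        if number not in chosen:
            chosen.append(number)
    return chosen


without_address = select_first_disks([1, 2, 3, 4, 5, 6], n=3)
with_address = select_first_disks([1, 2, 3, 4, 5, 6], n=3, address_owner=4)
```

In the second call the hard drive that owns the addressed block is always included, and the remaining N-1 first hard drives are taken from the other drives.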
[0106] After determining N first hard disks, the host instructs these first hard disks to store the first data. The methods by which the host instructs the N first hard disks to store the first data can be referenced in the following implementation methods 1-4:
[0107] Implementation Method 1: The host sends a second I / O instruction to each of the first hard disks and receives a second completion message from the first hard disk after completing the second I / O instruction. The second I / O instruction carries the first data. The second I / O instruction is used to instruct the first hard disk to store the first data.
[0108] Specifically, after determining N first hard drives, the host generates a second I / O instruction. This second I / O instruction carries the first data and is a write instruction (e.g., carrying a `write` flag). The host then sends the second I / O instruction to each of the first hard drives. Correspondingly, each first hard drive receives the second I / O instruction from the host. Next, the first hard drive executes the second I / O instruction, specifically by extracting the first data from the instruction and writing it to a physical block. After writing the first data to the physical block, the first hard drive has completed the second I / O instruction. Therefore, the first hard drive generates a second completion message indicating successful execution and sends it to the host. Thus, the host receives N second completion messages.
[0109] In one specific implementation, step S102, implementation method 1, is based on a high-speed interconnect bus. Specifically, the host and the first hard disk communicate via the high-speed interconnect bus. The host sends a second I / O instruction to the first hard disk via the high-speed interconnect bus, and the first hard disk sends a second completion message to the host via the high-speed interconnect bus. The high-speed interconnect bus can be the high-speed interconnect bus 23 in the data storage system 20 of Figure 2.
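Implementation Method 1 can be sketched as below. The class names, the message fields, and modelling the high-speed interconnect bus as a direct method call are assumptions for illustration; the description only specifies that the second I / O instruction carries the first data and a write flag, and that each first hard disk answers with a second completion message.

```python
from dataclasses import dataclass, field


@dataclass
class SecondIOInstruction:
    """Write instruction that carries the first data itself."""
    data: bytes
    flag: str = "write"


@dataclass
class CompletionMessage:
    disk_no: int
    ok: bool


@dataclass
class Disk:
    disk_no: int
    blocks: list[bytes] = field(default_factory=list)

    def execute(self, instruction: SecondIOInstruction) -> CompletionMessage:
        """Extract the data from the instruction and write it to a block."""
        self.blocks.append(instruction.data)
        return CompletionMessage(self.disk_no, ok=True)


def host_store(first_disks: list[Disk], first_data: bytes) -> bool:
    """Send the same instruction to every first disk; succeed on N completions."""
    instruction = SecondIOInstruction(first_data)
    completions = [disk.execute(instruction) for disk in first_disks]
    return len(completions) == len(first_disks) and all(c.ok for c in completions)


first_disks = [Disk(1), Disk(2), Disk(3)]
stored = host_store(first_disks, b"first data")
```

In this method the first data travels inside every instruction, so the host transmits it N times.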
[0110] Implementation Method 2: The host sends a third I / O instruction to each first hard disk and receives a third completion message from each first hard disk after completing the third I / O instruction. The third I / O instruction carries the host's memory address. The third I / O instruction instructs the first hard disk to read first data from the host's memory and store the first data.
[0111] In this context, host memory refers to the devices or components in a data storage system used to store programs and data. It temporarily stores the data and instructions needed by the host so that the processor can access and operate on them quickly. Host memory can be directly accessed using Direct Memory Access (DMA) technology. Examples of host memory include Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), flash memory, virtual memory, and so on. Host memory addresses are used to identify storage units within memory.
[0112] Specifically, after determining N first hard disks, the host generates a third I / O instruction. This third I / O instruction carries the host's memory address, and its instruction type is a write instruction (e.g., carrying a write flag). The host then sends the third I / O instruction to each first hard disk. Correspondingly, each first hard disk receives the third I / O instruction from the host. Next, the first hard disk executes the third I / O instruction: it extracts the memory address from the instruction, reads the first data from the host's memory based on the address, and then writes the first data to a physical block. Once the first data has been written to the physical block, the first hard disk has completed the third I / O instruction. Therefore, the first hard disk generates a third completion message indicating successful execution and sends it to the host. Thus, the host receives N third completion messages.
[0113] In one specific implementation, step S102, implementation method 2, is based on a high-speed interconnect bus. Specifically, the host and the first hard disk communicate through the high-speed interconnect bus. The host sends a third I / O instruction to the first hard disk through the high-speed interconnect bus, and the first hard disk sends a third completion message to the host through the high-speed interconnect bus.
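Implementation Method 2 differs from Method 1 in that the instruction carries only a memory address and each first hard disk fetches the first data itself. The sketch below models host memory as a dictionary keyed by address and the DMA read as a dictionary lookup; these modelling choices and all class names are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ThirdIOInstruction:
    """Write instruction that carries a host memory address, not the data."""
    memory_address: int
    flag: str = "write"


@dataclass
class Disk:
    disk_no: int
    blocks: list[bytes] = field(default_factory=list)

    def execute(self, instruction: ThirdIOInstruction,
                host_memory: dict[int, bytes]) -> int:
        """Read the first data from host memory, write it, report completion."""
        first_data = host_memory[instruction.memory_address]  # stands in for DMA
        self.blocks.append(first_data)
        return self.disk_no  # plays the role of the third completion message


host_memory = {0x1000: b"first data"}
first_disks = [Disk(1), Disk(2), Disk(3)]
instruction = ThirdIOInstruction(memory_address=0x1000)
completions = [disk.execute(instruction, host_memory) for disk in first_disks]
```

The instruction stays small regardless of the size of the first data, because the data is pulled by the hard disks rather than pushed by the host.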
[0114] Implementation Method 3: The host sends a fourth I / O instruction to the target hard disk and receives a fourth completion message from the target hard disk after completing the fourth I / O instruction. The fourth I / O instruction carries the host's memory address. The fourth I / O instruction instructs the target hard disk to notify N first hard disks to read and store the first data from the host's memory.
[0115] In some possible implementations, the host can select any one of the N first hard disks as the target hard disk, or it can select the hard disk used to store the first verification data as the target hard disk, or it can select any hard disk other than the first hard disk and the hard disk used to store the first verification data as the target hard disk. The first verification data is used to verify the first data. The first verification data is calculated based on the first data. The calculation process of the first verification data can be found in the description of step S104 below regarding the process of using a verification algorithm to calculate the first data and obtain the first verification data.
[0116] After identifying the target hard drive, the host generates a fourth I / O instruction. This fourth I / O instruction carries the host's memory address and the identification information of all first hard drives (such as hard drive serial numbers), and its instruction type is a write instruction (e.g., carrying a write flag). The host then sends the fourth I / O instruction to the target hard drive. Correspondingly, the target hard drive receives the fourth I / O instruction from the host. Next, the target hard drive executes the fourth I / O instruction; specifically, the target hard drive extracts the memory address and the identification information of all first hard drives from the fourth I / O instruction to generate a fifth I / O instruction. This fifth I / O instruction carries the host's memory address, and its instruction type is a write instruction (e.g., carrying a write flag).
[0117] If the target hard drive is one of the first hard drives, it reads the first data from the host's memory based on the memory address and then writes the first data to a physical block. Based on the identification information, the target hard drive also sends a fifth I / O instruction to each of the first hard drives other than itself. Correspondingly, each of these first hard drives receives the fifth I / O instruction from the target hard drive. Next, the first hard drive executes the fifth I / O instruction; specifically, it extracts the memory address from the fifth I / O instruction, reads the first data from the host's memory based on the memory address, and then writes the first data to a physical block. Once the first data has been written to the physical block, the first hard drive has completed the fifth I / O instruction. Therefore, the first hard drive generates a fifth completion message indicating successful execution and sends it to the target hard drive. After the target hard drive receives (N-1) fifth completion messages, it generates a fourth completion message indicating successful execution and sends it to the host. Thus, the host receives one fourth completion message.
[0118] If the target hard drive is not a first hard drive, the target hard drive sends a fifth I / O instruction to each first hard drive based on the identification information. Correspondingly, each first hard drive receives the fifth I / O instruction from the target hard drive. Next, the first hard drive executes the fifth I / O instruction: specifically, it extracts the memory address from the fifth I / O instruction, reads the first data from the host's memory according to the memory address, and then writes the first data into a physical block. After writing the first data into the physical block, it means the first hard drive has completed the fifth I / O instruction. Therefore, the first hard drive generates a fifth completion message indicating successful execution and sends it to the target hard drive. After the target hard drive receives N fifth completion messages, it generates a fourth completion message indicating successful execution and sends it to the host. Thus, the host receives a fourth completion message.
[0119] In one specific implementation, step S102, implementation method 3, is based on a high-speed interconnect bus. That is, the host and the target hard disk, as well as the target hard disk and the first hard disk, communicate through the high-speed interconnect bus. Specifically, the host sends a fourth IO instruction to the target hard disk through the high-speed interconnect bus, the target hard disk sends a fifth IO instruction to the first hard disk through the high-speed interconnect bus, the first hard disk sends a fifth completion message to the target hard disk through the high-speed interconnect bus, and the target hard disk sends a fourth completion message to the host through the high-speed interconnect bus.
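The completion counting in implementation method 3 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the claimed implementation: the names `expected_completions` and `fan_out_write`, the dictionary that stands in for the hard drives, and the list append that stands in for bus messages are all hypothetical.

```python
def expected_completions(n_first, target_is_first):
    """Number of fifth completion messages the target drive waits for."""
    return n_first - 1 if target_is_first else n_first


def fan_out_write(target_id, first_ids, data, drives):
    """Target drive forwards the write to every first drive except itself.

    `drives` maps a drive ID to a list of stored blocks (hypothetical model).
    Returns True when enough completions arrived to report back to the host.
    """
    target_is_first = target_id in first_ids
    if target_is_first:
        drives[target_id].append(data)  # local write, no message needed
    completions = 0
    for drive_id in first_ids:
        if drive_id == target_id:
            continue
        drives[drive_id].append(data)   # stands in for the fifth I/O instruction
        completions += 1                # stands in for the fifth completion message
    return completions == expected_completions(len(first_ids), target_is_first)
```

With N = 2 first drives, the target waits for one completion message when it is itself a first drive and for two when it is not.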
[0120] Implementation Method 4: The host transmits the first data and the identification information of each first hard disk to the high-speed interconnect bus, so that the high-speed interconnect bus transmits the first data to each first hard disk according to the identification information.
[0121] Specifically, numerous devices are connected to the high-speed interconnect bus. Therefore, the high-speed interconnect bus stores a correspondence between device address information and device identification information. The device address information indicates the device's location on the high-speed interconnect bus, while the device identification information distinguishes different devices. After receiving the first data from the host and the identification information of N first hard drives, the high-speed interconnect bus determines the address information corresponding to the identification information of each first hard drive from the correspondence between address information and identification information. Then, it transmits the first data to the device (i.e., the first hard drive) with that address information, thereby enabling access to each first hard drive. Each of the N first hard drives, upon receiving the first data, writes the first data into a physical block, thus completing the storage of the first data by the N first hard drives.
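The lookup performed by the high-speed interconnect bus in implementation method 4 can be sketched as below. The dictionaries, the serial-number strings, the numeric addresses, and the function name `deliver_via_bus` are assumptions made for illustration; the source does not specify how the correspondence is stored.

```python
def deliver_via_bus(id_to_address, devices, first_drive_ids, data):
    """Resolve each drive ID to its bus address and deliver the data there.

    `id_to_address` models the ID-to-address correspondence kept by the bus;
    `devices` maps a bus address to the blocks stored by the drive at it.
    """
    delivered = []
    for drive_id in first_drive_ids:
        address = id_to_address[drive_id]   # look up the device address by ID
        devices[address].append(data)       # drive at that address stores the data
        delivered.append(address)
    return delivered
```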
[0122] It should be understood that the implementation methods 1-4 in step S102 above are merely examples and are not specifically limited here.
[0123] In summary, this technical solution provides multiple implementation methods for the host to instruct N first hard disks to store first data, which can be adapted to the needs of storing first data on N first hard disks in various scenarios, enabling the host to efficiently and quickly complete the storage task of the first data.
[0124] S103: When the host has stored the first data on all N first hard disks, it sends a request completion message to the client.
[0125] After the first data is stored on N first hard disks, for example, in implementation 1 of step S102, the host receives N second completion messages, or in implementation 2 of step S102, the host receives N third completion messages, or in implementation 3 of step S102, the host receives a fourth completion message, which means that the host has completed the storage task. Therefore, the host generates a request completion message to indicate that the storage was successful, and then sends the request completion message to the client.
[0126] In one specific implementation, after the host sends a request completion message to the client, it deletes the first data in the host's memory, thereby releasing memory resources.
[0127] S104: When the target hard disk in the data storage system determines that the first data is stored on all N first hard disks, the first data is deleted from M first hard disks based on the first check data.
[0128] In some possible implementations, the way in which the target hard disk determines that the first data is stored on all N first hard disks can follow determination methods 1 and 2 below:
[0129] Method 1: The host notifies the target hard disk that the first data is stored on all N first hard disks.
[0130] In one specific implementation, since the host generates the request completion message when the first data is stored on all N first hard disks, the host can directly send the request completion message to the target hard disk through the high-speed interconnect bus, notifying the target hard disk that the storage of the first data on the N first hard disks has been completed.
[0131] It should be understood that in practical applications, the host can also generate a different completion message than the request completion message, and then send this completion message to the target hard drive to notify the target hard drive that the storage of the first data on N first hard drives has been completed. This completion message can be generated earlier than, later than, or simultaneously with the request completion message.
[0132] Method 2: The target hard drive monitors whether the first data is stored on all N first hard drives.
[0133] In one specific implementation, after receiving the first service request, the host sends a control message carrying the first data to the target hard drive, instructing the target hard drive to monitor whether the number of hard drives storing the first data is N. Therefore, after receiving the control message, the target hard drive monitors each hard drive for the first data. Specifically, for itself, the target hard drive periodically or aperiodically searches for the first data. If the first data is found, the target hard drive stores the first data; if the first data is not found, the target hard drive does not store the first data. For the other hard drives, the target hard drive periodically or aperiodically sends an instruction to each hard drive via the high-speed interconnect bus to instruct it to read the first data. If a hard drive executes the instruction successfully, the first data is stored on that hard drive; if the instruction fails, the first data is not stored on that hard drive. The target hard drive determines whether the number of hard drives storing the first data is N based on the monitoring results for each hard drive.
[0134] In another specific implementation, corresponding to implementation method 3 in step S102 above, the target hard disk monitors whether the first data is stored on each first hard disk by sending a fifth IO instruction to that first hard disk and receiving a fifth completion message from it. If the target hard disk is itself a first hard disk and receives (N-1) fifth completion messages, the target hard disk determines that the first data is stored on all N first hard disks. If the target hard disk is not a first hard disk and receives N fifth completion messages, the target hard disk determines that the first data is stored on all N first hard disks.
[0135] It should be understood that determination method 1 and determination method 2 in step S104 above are merely examples and are not specifically limited here.
[0136] After confirming that the first data is stored on all N first hard drives, the target hard drive instructs M first hard drives to delete the first data based on the first checksum. The method by which the target hard drive instructs the M first hard drives to delete the first data based on the first checksum can be implemented using methods 1 and 2 below:
[0137] Implementation Method 1: The target hard drive obtains the first data, calculates the first data based on the verification algorithm to obtain the first verification data, and then instructs M first hard drives to delete the first data based on the first verification data.
[0138] A verification algorithm is a coding method used to verify data integrity. Data integrity refers to the accuracy and completeness of data. It requires that data not be accidentally tampered with, damaged, or lost during storage, transmission, and processing. Verification algorithms include RAID verification algorithms, erasure coding (EC) algorithms, and others.
[0139] RAID parity algorithms are used in redundant-array technology to improve data reliability and fault tolerance. Different RAID types provide different RAID parity algorithms. In RAID 3 or RAID 5, an XOR operation is performed on the data within the same stripe, and the result is used as the parity data for that stripe. The specific calculation process can be found in the second scheme of Figure 3 above, where parity data PA0 is calculated; it will not be repeated here. In RAID 6, two kinds of parity are used: P-check and Q-check. P-check involves performing an XOR operation on the data within the same stripe to obtain the parity data for that stripe; this process likewise follows the calculation of parity data PA0 in the second scheme of Figure 3 above. Q-check involves performing a coefficient-weighted XOR operation on the data on the same hard drive, and the result is used as the parity data for that hard drive. For example, if hard drive (SSD 1) stores data A1, data B1, and data C1, then the parity data Q(SSD 1) is obtained as Q(SSD 1) = y1*A1 XOR y2*B1 XOR y3*C1, where y1, y2, and y3 are coefficients determined by the user. For the sake of brevity, the RAID parity algorithms provided for other RAID types will not be elaborated here.
[0140] The EC algorithm is a data redundancy coding technique used to achieve fault tolerance and data recovery in data storage and communication. EC algorithms typically take the original data as input, encode the original data with a coefficient matrix, and generate check data (i.e., output data). Different EC algorithms use different coefficient matrices and have different encoding processes. The specific process of obtaining check data using the EC algorithm is therefore analogous to encoding original data to generate redundant data using Reed-Solomon (RS) codes, low-density parity-check (LDPC) codes, or Bose-Chaudhuri-Hocquenghem (BCH) codes.
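As a rough illustration of encoding data with a coefficient matrix, the sketch below computes one check block per matrix row. It rests on assumptions the source does not state: coefficient multiplication is taken to be multiplication in the finite field GF(2^8) with the reduction polynomial 0x11D, which is a common choice for Reed-Solomon style codes, and the names `gf_mul` and `encode` are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8); polynomial 0x11D is an assumption."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D      # reduce back into 8 bits
        b >>= 1
    return result


def encode(coefficients, data_blocks):
    """Return one check block per coefficient row.

    Each check byte is the XOR of coefficient * data byte over all blocks.
    """
    length = len(data_blocks[0])
    checks = []
    for row in coefficients:
        block = bytearray(length)
        for coeff, data in zip(row, data_blocks):
            for i, byte in enumerate(data):
                block[i] ^= gf_mul(coeff, byte)
        checks.append(bytes(block))
    return checks
```

A row of all-ones coefficients reduces to plain XOR parity, while a row of distinct coefficients has the same shape as the Q-check expression y1*A1 XOR y2*B1 XOR y3*C1.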
[0141] In some possible implementations, the target hard disk acquires the first data, calculates the first data based on a verification algorithm to obtain the first verification data, and then instructs M first hard disks to delete the first data based on the first verification data, which includes at least the following steps:
[0142] Step 1: Obtain the first data.
[0143] Specifically, if the target hard drive is itself a first hard drive, the target hard drive reads the first data from its own storage. If the target hard drive is not a first hard drive, the target hard drive sends an instruction to a first hard drive via the high-speed interconnect bus to instruct it to read the first data, and receives the first data sent by that first hard drive via the high-speed interconnect bus after the first hard drive completes the instruction.
[0144] Step 2: Calculate the first data using a verification algorithm to obtain the first verification data.
[0145] In some possible implementations, the target hard disk can calculate the first checksum data in the following calculation methods 1 and 2:
[0146] Calculation Method 1: The target hard drive obtains other data from other hard drives, and uses a verification algorithm to calculate the first data and other data to obtain the first verification data.
[0147] Other data refers to data that shares the same verification data as the first data.
[0148] For example, in a scenario where multiple hard drives are striped, assuming the first data belongs to the first stripe, the target hard drive retrieves all data in the first stripe except for the first data, and then uses a verification algorithm to calculate the first verification data from the first data and the other data. The target hard drive can retrieve the other data from other hard drives by sending instructions to those hard drives to instruct them to read the other data.
[0149] When the verification algorithm is a RAID verification algorithm, an XOR operation is performed on the first data and other data, and the result is used as the first verification data. For this process, refer to the calculation of the verification data PA0 in the second scheme of Figure 3 above.
[0150] When the verification algorithm is an EC algorithm, the target hard drive first obtains the coefficient matrix before performing the calculation. Then, taking the first data and other data as input data, it encodes the input data with the coefficient matrix to generate the first verification data as output data. The specific values of the coefficient matrix depend on the type of EC algorithm. Encoding the input data with the coefficient matrix is precisely the calculation performed on the first data and other data. This process is analogous to encoding original data with RS codes, LDPC codes, or BCH codes to generate redundant data.
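The XOR calculation over a stripe can be sketched as follows. The block contents are made-up two-byte values chosen for readability, and the helper name `xor_blocks` is hypothetical; the sketch only shows that the XOR of the stripe's data yields check data from which any single lost block can be rebuilt.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Bytewise XOR of one or more equal-length blocks."""
    if not blocks or len({len(b) for b in blocks}) != 1:
        raise ValueError("need at least one block, all of equal length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Hypothetical stripe contents (first data A1 plus other data A2, A3).
a1, a2, a3 = b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"
pa0 = xor_blocks(a1, a2, a3)             # check data of the stripe
recovered_a1 = xor_blocks(pa0, a2, a3)   # rebuild A1 from the rest
```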
[0151] Calculation Method 2: In the scenario where the first data is used to overwrite the second data, the target hard disk obtains the second data and the second check data, performs an XOR operation on the first data, the second data and the second check data, and uses the result as the first check data.
[0152] The second data refers to the data stored by the first hard drive before storing the first data, and the first and second data are stored in the same location on the first hard drive. The process by which the target hard drive acquires the second data is as follows: If the target hard drive is the first hard drive, it first reads the second data stored therein into its SSD controller before storing the first data, and then writes the first data into the physical blocks. If the target hard drive is not the first hard drive, it first sends a command to the first hard drive via the high-speed interconnect bus to instruct it to read the second data before storing the first data, and then receives the second data sent by the first hard drive via the high-speed interconnect bus after the first hard drive completes the command.
[0153] The second checksum is calculated based on the second data and is used to verify the second data. The first checksum is used to overwrite the second checksum. The process by which the target hard drive obtains the second checksum is as follows: If the target hard drive is the hard drive that stores the second checksum (hereinafter referred to as the second hard drive), then the target hard drive reads the second checksum from this drive. If the target hard drive is not the second hard drive, then the target hard drive sends an instruction to the second hard drive to instruct it to read the second checksum, and receives the second checksum sent by the second hard drive after completing the instruction.
[0154] For example, assuming the first data is data A1, the second data is data A1', and the second check data is check data PA0', the specific calculation process of the first check data PA0 is: PA0 = A1' XOR PA0' XOR A1.
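The update formula PA0 = A1' XOR PA0' XOR A1 can be checked with a short sketch. The block values and the names `xor_blocks` and `updated_check_data` are illustrative assumptions; the point is that the incremental update gives the same result as recomputing the check data over the whole stripe.

```python
def xor_blocks(*blocks):
    """Bytewise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def updated_check_data(new_data, old_data, old_check):
    """Calculation method 2: PA0 = A1' XOR PA0' XOR A1."""
    return xor_blocks(old_data, old_check, new_data)


# Hypothetical stripe: A1' is overwritten by A1, A2 and A3 are unchanged.
a1_old, a2, a3 = b"\x01\x02", b"\x10\x20", b"\x0a\x0b"
pa0_old = xor_blocks(a1_old, a2, a3)
a1_new = b"\xff\x00"
pa0_new = updated_check_data(a1_new, a1_old, pa0_old)
```

Only three blocks are read regardless of how many data blocks the stripe holds, which is why this method suits the overwrite scenario.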
[0155] In one specific implementation, the SSD controller in the hard drive is not only responsible for managing and controlling the overall operation of the hard drive, but also for performing various calculation and processing tasks. In this case, the SSD controller in the target hard drive implements either the calculation method 1 in step S104, which calculates the first data and other data to obtain the first check data, or the calculation method 2, which performs an XOR operation on the first data, the second data, and the second check data to obtain the first check data.
[0156] In another specific implementation, the SSD controller in the hard drive is responsible for managing and controlling the overall operation of the hard drive, while the SSD processor in the hard drive is responsible for executing the instructions issued by the SSD controller and for various calculation and processing tasks. The SSD controller and SSD processor can be independent of each other or integrated on the same PCB. The SSD processor can be, for example, an Advanced RISC Machine (ARM) processor. In this case, for calculation method 1 in step S104, the SSD controller obtains the first data and other data and sends them to the SSD processor. The SSD processor performs the calculation on the first data and other data to obtain the first checksum, and then sends the first checksum to the SSD controller. Alternatively, for calculation method 2, the SSD controller obtains the first data, second data, and second checksum and sends them to the SSD processor. The SSD processor performs an XOR operation on the first data, second data, and second checksum to obtain the first checksum, and then sends the first checksum to the SSD controller.
[0157] After calculating the first checksum, if the target hard drive is the second hard drive, the target hard drive writes the first checksum to a physical block. If the target hard drive is not the second hard drive, the target hard drive sends the first checksum to the second hard drive so that the second hard drive stores the first checksum.
[0158] Step 3: After obtaining the first verification data, instruct the M first hard disks to delete the first data.
[0159] After the first verification data has been stored, the target hard drive generates a deletion command. This deletion command instructs the deletion of the first data.
[0160] In some possible implementations, the determination of the M first hard disks depends on the first service request.
[0161] If the first service request does not carry the first address, the target hard drive can arbitrarily select M hard drives from the N first hard drives and send a deletion command to each of them via the high-speed interconnect bus. Alternatively, if the target hard drive is itself a first hard drive, it can arbitrarily select M hard drives from the (N-1) first hard drives other than itself and send a deletion command to each of them via the high-speed interconnect bus. Here, M < N, and M is a positive integer.
[0162] If the first service request carries a first address, the target hard disk arbitrarily selects M hard disks from the (N-1) first hard disks other than the first hard disk that provides the storage space indicated by the first address, and sends a deletion instruction to each of them through the high-speed interconnect bus.
[0163] Upon receiving the deletion command, the first hard drive deletes the first data in the physical block. After deleting the first data, it generates a deletion completion message to indicate successful execution and then sends the deletion completion message to the target hard drive via the high-speed interconnect bus. Therefore, when the target hard drive has received M deletion completion messages, the M first hard drives have finished deleting the first data.
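The selection of the M first hard drives in step 3 can be sketched as below. The function names, the parameter names, and the choice of simply taking the first M candidates are assumptions; the source allows any arbitrary selection as long as the excluded drive keeps its copy.

```python
def select_drives_to_delete(first_ids, m, target_id=None, address_owner_id=None):
    """Pick M of the N first drives that will delete the first data.

    `target_id` is excluded when the target drive keeps its own copy;
    `address_owner_id` is the drive providing the space at the first address.
    """
    if not 0 < m < len(first_ids):
        raise ValueError("M must satisfy 0 < M < N")
    excluded = {target_id, address_owner_id}
    candidates = [d for d in first_ids if d not in excluded]
    if m > len(candidates):
        raise ValueError("not enough candidate drives")
    return candidates[:m]   # any M candidates would do; order is arbitrary


def deletion_finished(completion_messages, m):
    """True once M deletion completion messages have been received."""
    return len(completion_messages) == m
```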
[0164] In summary, Implementation Method 1 calculates the first verification data using the target hard drive. This method utilizes the resources of the target hard drive, including processing resources (such as computing power) and storage resources (such as cache). This method avoids the target hard drive sending first I / O instructions to the second hard drive and receiving the execution results of the first I / O instructions from the second hard drive in order to complete the calculation of the first verification data. In other words, the calculation time of the first verification data does not include the communication time between the target hard drive and the second hard drive. Therefore, the calculation of the first verification data can be completed quickly, thereby enabling the process of deleting the first data from M first hard drives as soon as possible, and improving the conversion efficiency of the data redundancy mechanism in the data storage system.
[0165] Implementation Method 2: The target hard disk sends a first IO instruction to the second hard disk, receives a first completion message sent by the second hard disk after completing the first IO instruction, and then instructs M first hard disks to delete the first data based on the first completion message.
[0166] The first I / O instruction is used to instruct the second hard disk to obtain the first data, and to calculate the first data based on the verification algorithm to obtain the first verification data.
[0167] Therefore, after receiving the first I / O instruction, the second hard disk acquires the first data, calculates the first data based on the verification algorithm to obtain the first verification data, and writes the first verification data into a physical block to overwrite the second verification data. This process follows the one in implementation method 1 of step S104 above, in which the target hard disk acquires the first data and calculates the first verification data; for details, refer to steps 1 and 2 of implementation method 1. For the sake of brevity, it will not be described again here.
[0168] After calculating the first verification data, the second hard disk generates a first completion message to indicate successful execution, and then sends the first completion message to the target hard disk via the high-speed interconnect bus. Upon receiving the first completion message, the target hard disk instructs the M first hard disks to delete the first data. For the process by which the target hard disk instructs the M first hard disks to delete the first data, refer to step 3 of implementation method 1 of step S104 above; for the sake of brevity, it will not be elaborated here.
[0169] In summary, implementation method 2 involves the target hard drive instructing the second hard drive to calculate the first verification data. This method utilizes the resources of the second hard drive, reducing the workload on the target hard drive and avoiding excessive resource consumption.
[0170] It should be understood that implementation methods 1 and 2 in step S104 above are merely examples and are not specifically limited here.
[0171] It should be understood that in practical applications, the target hard disk can also send a first I / O instruction to other hard disks so that the other hard disks can obtain the first data, calculate the first data based on the verification algorithm, and obtain the first verification data. This application does not make any specific limitations.
[0172] In some possible application scenarios, after step S104 above is completed, the client sends a third service request to the data storage system. This third service request carries third data and instructs the data storage system to store the third data and ensure its reliability. Upon receiving the third service request, the host in the data storage system extracts the third data from it and instructs P third hard drives to store the third data. Once all P third hard drives store the third data, the host sends a request completion message to the client, and the target hard drive instructs Q third hard drives to delete the third data based on the third verification data. Here, P ≥ 2, where P is an integer, and Q < P, where Q is a positive integer. The third verification data is calculated based on the third data and is used to verify the third data. Specifically:
[0173] First, after receiving the third service request, the host extracts the third data from the request. Then, it determines P third hard drives from the multiple hard drives in the data storage system and instructs these P third hard drives to store the third data. The process of determining the P third hard drives can be referenced to the process of determining N first hard drives in step S102 above, and the process of instructing the P third hard drives to store the third data can be referenced to the process of instructing the N first hard drives to store the first data in step S102 above. See any of the implementation methods 1-4 in step S102 for details. After storing the third data on the P third hard drives, all P third hard drives will contain the third data. The third data is stored in a different location than the first data; that is, the third data is not used to overwrite the first data. Therefore, the data storage system stores both the first and third data simultaneously.
[0174] Next, after the third data is stored on all P third hard drives, the host sends a request completion message to the client. For this process, refer to the execution of step S103 above.
[0175] Given that all P third hard drives store third data, the target hard drive instructs Q third hard drives to delete the third data based on the third checksum. In one specific implementation, the target hard drive acquires the third data, calculates the third data using a checksum algorithm to obtain the third checksum, and then instructs the Q third hard drives to delete the third data based on the third checksum. In another specific implementation, the target hard drive sends a sixth I / O instruction to a fourth hard drive used to store the third checksum, receives a sixth completion message from the fourth hard drive after completing the sixth I / O instruction, and then instructs the Q third hard drives to delete the third data based on the sixth completion message. The sixth I / O instruction is used to instruct the fourth hard drive to acquire the third data, calculate the third data using a checksum algorithm, and obtain the third checksum.
[0176] In this process, the target hard drive calculates the third checksum based on a verification algorithm. The data required for this calculation depends on the correspondence between the third checksum and the first checksum. Specifically:
[0177] If the first data and the third data do not share the same checksum, then the data required to calculate the third checksum includes neither the first data nor the first checksum. In this case, the target hard drive obtains other data (excluding the first data) that shares the same checksum with the third data, and uses a checksum algorithm to calculate the third checksum from the third data and the other data. Alternatively, in the scenario where the third data is used to overwrite old data, the target hard drive obtains the old data and the old checksum, performs an XOR operation on the third data, the old data, and the old checksum, and uses the result as the third checksum. For the process of calculating the third checksum from the third data and the other data, refer to calculation method 1 in step 2 of implementation method 1 of step S104 above. For the process of performing an XOR operation on the third data, the old data, and the old checksum, refer to calculation method 2 in step 2 of implementation method 1 of step S104 above.
[0178] For example, in a scenario where multiple hard drives are striped for management, taking the second scheme in Figure 3 above as an example, assuming the first data is data A1, which belongs to the first stripe, and the check data PA0 is the check data of the first stripe (i.e., check data PA0 is the first check data), and the third data is data B1, which belongs to the second stripe (the second stripe is a different stripe from the first stripe), then the check data PB0 is the third check data. It can be seen that the data required to calculate the check data PB0 are data B1, data B2, data B3, data B4, and data B5 from the second stripe, or the data required to calculate the check data PB0 are data B1, data B1' (i.e., old data), and check data PB0' (i.e., old check data). Therefore, the data required to calculate the check data PB0 does not include data A1 or check data PA0.
[0179] If the first data and the third data share the same checksum, meaning the third checksum is used to overwrite the first checksum, then the data required to calculate the third checksum includes either the first data or the first checksum. In this case, the target hard drive acquires the first data and uses a checksum algorithm to calculate the third checksum from the first data and the third data. Alternatively, in the scenario where the third data is used to overwrite old data, the target hard drive acquires the old data and the first checksum, performs an XOR operation on the third data, the old data, and the first checksum, and uses the result as the third checksum. For the process of calculating the third checksum from the first data and the third data, refer to calculation method 1 in step 2 of implementation method 1 of step S104 above. For the process of performing an XOR operation on the third data, the old data, and the first checksum, refer to calculation method 2 in step 2 of implementation method 1 of step S104 above.
[0180] Continuing with the second scheme in Figure 3 above as an example, assume the first data is data A1, which belongs to the first stripe, the check data PA0 is the check data of the first stripe (i.e., check data PA0 is the first check data), and the third data is data A2, which also belongs to the first stripe. It can be seen that calculating the third check data, i.e., the check data PA0'' used to overwrite check data PA0, requires data A1, A2, A3, A4, and A5 from the first stripe; or, calculating check data PA0'' requires data A2, data A2' (i.e., the old data), and check data PA0. Therefore, the data required to calculate check data PA0'' includes either data A1 or check data PA0.
[0181] In scenarios where the first and third data share the same checksum, after calculating the third checksum, it can be used to verify both the first and third data. Specifically, when the first data is corrupted, the corrupted first data can be recovered using data from other hard drives (including the third data) and the third checksum; similarly, when the third data is corrupted, the corrupted third data can also be recovered using data from other hard drives (including the first data) and the third checksum. Compared to providing a mirror backup for each piece of data in the data storage system, the data redundancy mechanism based on checksum not only ensures data reliability but also allows multiple pieces of data to share the same checksum, further saving storage space and improving hard drive utilization.
[0182] In some possible implementations, the storage phase begins from the moment the data storage system receives the first service request until the deletion of the first data by M first hard drives, and the host or target hard drive marks the storage phase of the first data. This storage phase includes a backup-not-started phase, a backup-started phase, a backup-completed phase, a conversion-started phase, and a conversion-completed phase. The backup-not-started phase indicates the time period after the host receives the first data but before instructing N first hard drives to store it; specifically, it corresponds to the execution process of step S101 and the process in step S102 where the host extracts the first data from the first service request. The backup-started phase indicates the time period corresponding to the process of increasing the number of first hard drives storing the first data from 0 to N; specifically, it corresponds to the process in step S102 where the host instructs N first hard drives to store the first data until the storage of the first data by N first hard drives is completed. The backup-completed phase indicates the time period when the number of first hard drives storing the first data is N. The conversion-started phase indicates the time period corresponding to the process of decreasing the number of first hard drives storing the first data from N to (NM); specifically, it corresponds to the execution process of step S104. The completion of the conversion phase is used to indicate a time period in which the number of first hard disks storing the first data is (NM).
[0183] For scenarios where a data storage system receives multiple business requests, this application provides a request conflict resolution solution based on the data storage stage. The specific details of the data storage stage can be found in the aforementioned description of the storage stage of the first data. The resolution process for different request conflicts will be explained in detail below.
[0184] Conflict scenario 1: Write request conflicts with read request.
[0185] In this scenario, the write request is the first business request, and the read request is the second business request. Specifically, the first business request carries first data and a first address. The first business request instructs the data storage system to store the first data in the storage space indicated by the first address and ensures the reliability of the first data. The second business request carries the first address. The second business request instructs the data storage system to retrieve data from the storage space indicated by the first address and return the data.
[0186] The data storage system addresses this scenario as follows: When the first data is in the backup-not-started phase or the backup-started phase, the data storage system allows the host to read the second data from the first hard drive. When the first data is in the backup-completed, conversion-started, or conversion-completed phase, the data storage system allows the host to read the first data from the first hard drive.
[0187] It should be noted that the solution provided by the data storage system described above also applies to scenarios where the first business request carries only the first data and not the first address. As long as the storage space used to store the first data is the same as the storage space indicated by the first address, the situation falls under conflict scenario 1.
[0188] In summary, to address the conflict between write and read requests, this technical solution supports executing read requests throughout the entire process of storing the first data. Specifically, if the first data has not been mirrored and backed up, it supports reading the old data (i.e., the second data) from the data storage system; if the first data has been mirrored and backed up, it supports reading the new data (i.e., the first data) from the data storage system. This ensures that the data storage system remains readable while processing the first data, without affecting the execution of read requests, thus ensuring uninterrupted business operations and maintaining the performance of online services.
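The read rule for conflict scenario 1 can be sketched as a lookup on the storage phase. The names `Phase` and `readable_version` are hypothetical, and the return values are plain labels rather than actual data.

```python
from enum import Enum, auto


class Phase(Enum):
    BACKUP_NOT_STARTED = auto()
    BACKUP_STARTED = auto()
    BACKUP_COMPLETED = auto()
    CONVERSION_STARTED = auto()
    CONVERSION_COMPLETED = auto()


# Phases in which the mirror backup of the first data is not yet complete.
_MIRROR_INCOMPLETE = {Phase.BACKUP_NOT_STARTED, Phase.BACKUP_STARTED}


def readable_version(phase: Phase) -> str:
    """Return which version a conflicting read request is served with."""
    if phase in _MIRROR_INCOMPLETE:
        return "second data"  # old data, not yet overwritten on every drive
    return "first data"       # new data, fully mirrored or already converted
```

In every phase the function returns a readable version, which mirrors the point made above: a read request never has to wait for the write or the conversion to finish.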
[0189] Conflict Scenario 2: Write request conflicts with other write requests.
[0190] In this scenario, one write request is the first business request, and the other is the fourth business request. Specifically, the first business request carries first data and a first address. The first business request instructs the data storage system to store the first data in the storage space indicated by the first address, ensuring the reliability of the first data. The fourth business request carries fourth data and a first address. The fourth business request instructs the data storage system to store the fourth data in the storage space indicated by the first address, ensuring the reliability of the fourth data.
[0191] In a scenario where the host receives the first service request first, followed by the fourth service request, the host first extracts the first data and the first address from the first service request, and then extracts the fourth data and the first address from the fourth service request. Therefore, the host's memory stores both the first and fourth data, with the fourth data being stored later than the first data. Furthermore, by comparing the addresses in the first and fourth service requests, the host can determine that the storage space used to store the first and fourth data on the hard drives is the same space. That is, the fourth data and the first data are stored in the same location on N first hard drives, thus causing a conflict between the first and fourth service requests.
[0192] The data storage system addresses this scenario in the following ways:
[0193] When the first data is in the backup-not-started phase or the conversion-completed phase, the host performs a first operation. Specifically, the first operation involves instructing the N first hard drives to store the fourth data and, once the fourth data is stored on all N first hard drives, instructing the target hard drive to instruct M first hard drives to delete the fourth data based on fourth checksum data. The fourth checksum data is calculated based on the fourth data and is used to verify the fourth data. For the process in which the host instructs the N first hard drives to store the fourth data, refer to step S102 above. Similarly, for the process in which the target hard drive instructs the M first hard drives to delete the fourth data based on the fourth checksum data once the fourth data is stored on all N first hard drives, refer to step S104 above, in which the target hard drive determines that the first data is stored on all N first hard drives and then instructs the M first hard drives to delete the first data based on the first checksum data.
[0194] When the first data is in the backup-started phase or the backup-completed phase, the host performs the first operation and instructs the target hard drive to stop instructing the M first hard drives to delete the first data based on the first checksum data. It should be noted that this application does not limit the order of these two actions: the host can perform the first operation first and then issue the stop instruction, issue the stop instruction first and then perform the first operation, or do both simultaneously. The host can issue the stop instruction by sending a stop command to the target hard drive and receiving a stop completion message that the target hard drive sends after completing the stop command.
[0195] When the first data is in the conversion-started phase, the target hard drive continues to instruct the M first hard drives to delete the first data based on the first checksum data.
[0196] In summary, regarding write request conflicts, this technical solution determines whether to continue the data redundancy mechanism conversion related to the first data based on the stage at which the first data is currently located. If the data redundancy mechanism conversion related to the first data has not yet started, it can be skipped. Instead, a mirror backup of the latest data (i.e., the fourth data) and the data redundancy mechanism conversion related to the latest data can be performed directly in the data storage system. This reduces the data storage cost of the data storage system.
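The handling of conflict scenario 2 can be sketched as a mapping from the storage phase of the first data to the actions taken. The names `Phase`, `Action`, and `resolve_write_conflict` are hypothetical. For the conversion-started phase, the description only states that the ongoing conversion continues, so the sketch returns only that action and makes no assumption about when the fourth data is subsequently handled.

```python
from enum import Enum, auto
from typing import List


class Phase(Enum):
    BACKUP_NOT_STARTED = auto()
    BACKUP_STARTED = auto()
    BACKUP_COMPLETED = auto()
    CONVERSION_STARTED = auto()
    CONVERSION_COMPLETED = auto()


class Action(Enum):
    # Mirror the fourth data to N drives, then have the target drive convert it.
    FIRST_OPERATION = auto()
    # Tell the target drive not to convert the (now stale) first data.
    STOP_FIRST_DATA_CONVERSION = auto()
    # Let the target drive finish deleting the first data from M drives.
    CONTINUE_FIRST_DATA_CONVERSION = auto()


def resolve_write_conflict(phase: Phase) -> List[Action]:
    """Return the actions for a later write to the same address (order not significant)."""
    if phase in (Phase.BACKUP_NOT_STARTED, Phase.CONVERSION_COMPLETED):
        return [Action.FIRST_OPERATION]
    if phase in (Phase.BACKUP_STARTED, Phase.BACKUP_COMPLETED):
        return [Action.FIRST_OPERATION, Action.STOP_FIRST_DATA_CONVERSION]
    return [Action.CONTINUE_FIRST_DATA_CONVERSION]
```

The second branch is where the saving described above comes from: the conversion of the first data is skipped because the fourth data will overwrite it anyway.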
[0197] It should be noted that the arrangement of data and its verification data, including first data and first verification data, second data and second verification data, third data and third verification data, etc., can be determined according to the actual application scenario, and this application does not impose specific limitations. In practical applications, data and its verification data can be arranged according to any RAID type such as RAID3, RAID5, RAID6, RAID50, RAID60, etc., or according to business requirements.
[0198] In summary, the data storage method provided in this application can transform the original data redundancy mechanism based on mirror backup into a data redundancy mechanism based on verification data. This ensures data reliability while saving storage space, improving hard drive utilization, and addressing new requirements of data storage systems during operation. Specifically, the host performs the storage of the first data through mirror backup, ensuring real-time reliability of the first data, with high execution efficiency and low host resource consumption. Changing the data redundancy mechanism in the data storage system via the target hard drive does not consume host resources or depend on the RAID controller, making it applicable to a wide range of scenarios. Furthermore, changing the data redundancy mechanism via the target hard drive ensures that the host's storage task execution process is independent of the data redundancy mechanism conversion process, without affecting the execution efficiency of the storage task.
[0199] This application also provides a computer program product containing instructions. This computer program product may be a software or program product containing instructions, capable of running on a computing device or stored on any available medium. When the computer program product runs on a computing device, it causes the computing device to execute the data storage method described in FIG. 4.
[0200] This application also provides a computer-readable storage medium. The computer-readable storage medium can be any available medium on which a computing device can store data, or a data storage device, such as a data center, containing one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state drive). The computer-readable storage medium includes instructions that instruct the computing device to execute the data storage method described in FIG. 4.
[0201] It should be understood that in the embodiments of this application, "when," "...when," and "if" all refer to the device making corresponding processing under certain objective circumstances, and are not time-limited, nor do they require the device to make a judgment action, nor do they imply any other limitations.
[0202] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the protection scope of the technical solutions of the embodiments of this application.
Claims
1. A data storage system, characterized in that, It includes multiple hard drives, wherein the multiple hard drives include N first hard drives, each of the N first hard drives storing first data, where N≥2 and N is an integer. The target hard drive among the plurality of hard drives is used to instruct M first hard drives to delete the first data based on the first verification data; Wherein, the first verification data is calculated based on the first data, and the first verification data is used to verify the first data, M < N, where M is a positive integer.
2. The system according to claim 1, characterized in that, The target hard drive is specifically used to acquire the first data, calculate the first data based on the verification algorithm to obtain the first verification data, and instruct M first hard drives to delete the first data based on the first verification data.
3. The system according to claim 2, characterized in that, The verification algorithm is either the RAID verification algorithm or the erasure coding (EC) algorithm.
4. The system according to claim 1, characterized in that, The plurality of hard drives includes a second hard drive, which stores second verification data. This second verification data is used to verify the second data, which is data stored on the first hard drive before the first data was stored therein. The first data is used to overwrite the second data. The target hard disk is specifically used to send a first I/O instruction to the second hard disk, receive a first completion message sent by the second hard disk after completing the first I/O instruction, and instruct M first hard disks to delete the first data based on the first completion message; The first I/O instruction is used to instruct the second hard disk to acquire the first data, perform calculation on the first data based on the verification algorithm to obtain the first verification data, and use the first verification data to overwrite the second verification data.
5. The system according to any one of claims 1-4, characterized in that, The plurality of hard disks includes P third hard disks, each of which stores third data, where P ≥ 2 and P is an integer. The target hard disk is used to instruct Q third hard disks to delete the third data based on the third verification data; Wherein, the third verification data is calculated based on the first data and the third data, or the third verification data is calculated based on the first verification data and the third data, the third verification data is used to overwrite the first verification data, the third verification data is used to verify the first data or the third data, Q < P, and Q is a positive integer.
6. The system according to any one of claims 1-5, characterized in that, The data storage system includes a host, The host is used to obtain the first data and instruct N first hard disks to store the first data.
7. The system according to claim 6, characterized in that, The host is specifically used to send a second I/O instruction to each first hard disk and receive a second completion message sent by each first hard disk after completing the second I/O instruction; The second I/O instruction carries the first data and is used to instruct the first hard disk to store the first data.
8. The system according to claim 6, characterized in that, The host is specifically used to send a third I/O instruction to each first hard disk and receive a third completion message sent by each first hard disk after completing the third I/O instruction; The third I/O instruction carries the memory address of the host and is used to instruct the first hard disk to read the first data from the memory of the host and store the first data.
9. The system according to claim 6, characterized in that, The host is specifically used to send a fourth I/O instruction to the target hard disk and receive a fourth completion message sent by the target hard disk after completing the fourth I/O instruction; The fourth I/O instruction is used to instruct the target hard disk to notify N first hard disks to read the first data from the host's memory and store the first data.
10. The system according to claim 6, characterized in that, The host is specifically used to transmit the first data and the identification information of each first hard disk to the high-speed interconnect bus, so that the high-speed interconnect bus transmits the first data to each first hard disk according to the identification information.
11. The system according to any one of claims 6-10, characterized in that, When the first data is in the backup-not-started phase or the backup-started phase, the host is allowed to read the second data from the first hard disk; When the first data is in the conversion-started phase, the host is allowed to read the first data from the first hard disk; Wherein, the backup-not-started phase indicates the time period after the host obtains the first data and before it instructs N first hard disks to store the first data; the backup-started phase indicates the time period during which the number of first hard disks storing the first data increases from 0 to N; the conversion-started phase indicates the time period during which the number of first hard disks storing the first data decreases from N to (N − M); The second data is the data stored on the first hard disk before the first data is stored, and the first data is used to overwrite the second data.
12. The system according to claim 11, characterized in that, The host's memory contains the first data and the fourth data, wherein the fourth data is stored in the host's memory later than the first data, and the fourth data and the first data are stored in the same location on N first hard disks. When the first data is in the backup-not-started phase, or when the number of first hard disks storing the first data is (N − M), the host is used to perform the first operation; When the first data is in the backup-started phase, the host is used to perform the first operation, and to instruct the target hard disk to stop instructing M first hard disks to delete the first data based on the first verification data; The first operation is as follows: instructing N first hard disks to store the fourth data, and instructing the target hard disk to instruct M first hard disks to delete the fourth data based on the fourth verification data when the fourth data is stored on all N first hard disks. The fourth verification data is calculated based on the fourth data and is used to verify the fourth data.
13. A data storage method, characterized in that, The method is applied to a data storage system, which includes multiple hard disks, including N first hard disks, each of which stores first data, where N ≥ 2 and N is an integer. The target hard drive among the plurality of hard drives instructs M first hard drives to delete the first data based on the first verification data; Wherein, the first verification data is calculated based on the first data, and the first verification data is used to verify the first data, M < N, where M is a positive integer.
14. The method according to claim 13, characterized in that, The target hard disk instructs M first hard disks to delete the first data based on the first verification data, including: The target hard disk acquires the first data, calculates the first data based on the verification algorithm to obtain the first verification data, and instructs M first hard disks to delete the first data based on the first verification data.
15. The method according to claim 14, characterized in that, The verification algorithm is either the RAID verification algorithm or the erasure coding (EC) algorithm.
16. The method according to claim 13, characterized in that, The plurality of hard drives includes a second hard drive, which stores second verification data. This second verification data is used to verify the second data, which is data stored on the first hard drive before the first data was stored therein. The first data is used to overwrite the second data. The target hard disk instructs M first hard disks to delete the first data based on the first verification data, including: The target hard disk sends a first I/O instruction to the second hard disk, receives a first completion message sent by the second hard disk after completing the first I/O instruction, and instructs M first hard disks to delete the first data based on the first completion message; The first I/O instruction is used to instruct the second hard disk to acquire the first data, perform calculation on the first data based on the verification algorithm to obtain the first verification data, and use the first verification data to overwrite the second verification data.
17. The method according to any one of claims 13-16, characterized in that, The plurality of hard disks includes P third hard disks, each of which stores third data, where P ≥ 2 and P is an integer. The method further includes: The target hard disk instructs Q third hard disks to delete the third data based on the third verification data; Wherein, the third verification data is calculated based on the first data and the third data, or the third verification data is calculated based on the first verification data and the third data, the third verification data is used to overwrite the first verification data, the third verification data is used to verify the first data or the third data, Q < P, and Q is a positive integer.
18. The method according to claim 13, characterized in that, The data storage system includes a host, Before the target hard disk instructs M first hard disks to delete the first data based on the first verification data, the method further includes: The host obtains the first data and instructs N first hard disks to store the first data.
19. The method according to claim 18, characterized in that, The instructing of N first hard disks to store the first data includes: Sending a second I/O instruction to each first hard disk and receiving a second completion message sent by each first hard disk after completing the second I/O instruction; wherein the second I/O instruction carries the first data and is used to instruct the first hard disk to store the first data.
20. The method according to claim 19, characterized in that, The instructing of N first hard disks to store the first data includes: Sending a third I/O instruction to each first hard disk and receiving a third completion message sent by each first hard disk after completing the third I/O instruction; wherein, the third I/O instruction carries the memory address of the host, and the third I/O instruction is used to instruct the first hard disk to read the first data from the memory of the host and store the first data.
21. The method according to claim 19, characterized in that, The instructing of N first hard disks to store the first data includes: Sending a fourth I/O instruction to the target hard disk and receiving a fourth completion message sent by the target hard disk after completing the fourth I/O instruction; wherein, the fourth I/O instruction is used to instruct the target hard disk to notify N first hard disks to read the first data from the memory of the host and store the first data.
22. The method according to claim 19, characterized in that, The instruction to store the first data on N first hard disks includes: The first data and the identification information of each first hard disk are transmitted to the high-speed interconnect bus, so that the high-speed interconnect bus transmits the first data to each first hard disk according to the identification information.
23. The method according to any one of claims 19-22, characterized in that, The method further includes: When the first data is in the backup-not-started phase or the backup-started phase, the host is allowed to read the second data from the first hard disk; When the first data is in the conversion-started phase, the host is allowed to read the first data from the first hard disk; Wherein, the backup-not-started phase indicates the time period after the host obtains the first data and before it instructs N first hard disks to store the first data; the backup-started phase indicates the time period during which the number of first hard disks storing the first data increases from 0 to N; the conversion-started phase indicates the time period during which the number of first hard disks storing the first data decreases from N to (N − M); the second data is the data stored on the first hard disk before the first data is stored, and the first data is used to overwrite the second data.
24. The method according to claim 23, characterized in that, The host's memory contains the first data and the fourth data, wherein the fourth data is stored in the host's memory later than the first data, and the fourth data and the first data are stored in the same location on N first hard disks. The method further includes: When the first data is in the backup-not-started phase, or when the number of first hard disks storing the first data is (N − M), the host performs the first operation; When the first data is in the backup-started phase, the host performs the first operation and instructs the target hard disk to stop instructing M first hard disks to delete the first data based on the first verification data; The first operation is as follows: instructing N first hard disks to store the fourth data, and instructing the target hard disk to instruct M first hard disks to delete the fourth data based on the fourth verification data when the fourth data is stored on all N first hard disks. The fourth verification data is calculated based on the fourth data and is used to verify the fourth data.
25. A computer program product containing instructions, characterized in that, When the instructions are executed by the computing device, the computing device performs the method as described in any one of claims 13-24.
26. A computer-readable storage medium, characterized in that, It includes computer program instructions that, when run on a computing device, implement the method as described in any one of claims 13-24.