A distributed storage system and method of use
By setting an odd number of first and second nodes in the distributed storage system, and replacing the failed first node with the second node in case of failure, the system downtime problem caused by the failure of the fault isolation domain node in the distributed key-value storage system is solved, and the system achieves high availability and disaster recovery capability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA TELECOM CLOUD TECH CO LTD
- Filing Date
- 2022-12-23
- Publication Date
- 2026-06-12
Abstract
Description
Technical Field
[0001] This application belongs to the technical field of distributed storage systems, and specifically relates to a distributed storage system and its usage method.
Background Technology
[0002] Currently, in the cloud computing field, a common resource deployment approach is to establish multiple availability zones (AZs) within a single region to provide users with highly available storage systems. For the underlying infrastructure deployed in multi-AZ environments, distributed key-value storage systems (such as ETCD) are often used to provide service discovery and configuration distribution. The deployment method, monitoring, disaster recovery, and other technical solutions of the distributed key-value storage system determine the high availability of the underlying system. A highly available, high-performance deployment and disaster recovery solution for the distributed key-value storage system is therefore critical to building the underlying infrastructure.
[0003] However, in related technologies, when a node in a fault isolation domain of a distributed key-value storage system fails, the entire distributed key-value storage system may stop operating normally.
Summary of the Invention
[0004] To address the shortcomings of the prior art, this application provides a distributed storage system comprising at least two fault isolation domains. Each fault isolation domain includes at least one first node and at least one second node. The number of first nodes is the smallest odd number greater than the number of fault isolation domains or, if the number of fault isolation domains is odd, equal to the number of fault isolation domains. When one fault isolation domain fails, a second node in another fault isolation domain is set as a first node, replacing the failed first node and keeping the entire distributed storage system operating normally.
[0005] The technical effect to be achieved in this application is accomplished through the following solution:
[0006] In a first aspect, this application provides a distributed storage system including at least two fault isolation domains, with at least one first node and at least one second node set in each fault isolation domain, wherein the number of first nodes is the smallest odd number greater than the number of fault isolation domains or, when the number of fault isolation domains is odd, equal to the number of fault isolation domains.
[0007] Optionally, the at least two fault isolation domains are set within the same first region.
[0008] Optionally, the number of first nodes is equal to the number of second nodes.
[0009] Optionally, the distributed storage system is connected to a client; the first nodes communicate with one another, and each first node can store data; the client can read the data stored by the first node, and / or the client can write the data stored by the first node.
[0010] Optionally, each of the first nodes stores the same data, and when the data of one first node is modified by the client, the other first nodes will undergo the same change.
[0011] Optionally, the distributed storage system is provided with a first interface; the distributed storage system is provided with a monitoring system, the monitoring system is communicatively connected to the first interface, and the monitoring system is used to call the first interface to detect the operating status of the first node and / or the second node in the distributed storage system.
[0012] Optionally, the second node serves as a backup node, and the second node may refuse requests to read and write the data it stores.
[0013] Optionally, the distributed storage system is provided with a second interface, which is communicatively connected to the second node of each of the fault isolation domains; when the monitoring system detects a fault in one of the fault isolation domains, the monitoring system calls the second interface and sets the second node of another fault isolation domain as the first node.
[0014] In a second aspect, this application provides a method for using a distributed storage system, employing any of the distributed storage systems described in the first aspect, the method comprising:
[0015] detecting the distributed storage system to obtain its operating status; and
[0016] if the operating status indicates that one of the fault isolation domains has failed, setting a second node of another fault isolation domain as a first node.
[0017] Optionally, detecting the distributed storage system to obtain its operating status includes:
[0018] A preset time interval is set, and the distributed storage system is checked at the preset time interval to obtain the operating status of the distributed storage system.
[0019] In a third aspect, this application provides a control device for a distributed storage system, comprising:
[0020] a detection unit configured to detect the distributed storage system in order to obtain the operating status of the distributed storage system; and
[0021] an execution unit configured to, if the operating status indicates that a fault isolation domain has failed, set a second node of another fault isolation domain as a first node.
[0022] Optionally, the detection unit is configured to:
[0023] set a preset time interval and check the distributed storage system at the preset time interval to obtain the operating status of the distributed storage system.
[0024] In a fourth aspect, this application provides a readable medium comprising execution instructions, wherein when a processor of an electronic device executes the execution instructions, the electronic device performs the method described in any implementation of the second aspect.
[0025] In a fifth aspect, this application provides an electronic device including a processor and a memory storing execution instructions, wherein when the processor executes the execution instructions stored in the memory, the processor performs the method described in any implementation of the second aspect.
[0026] This application has the following advantages:
[0027] This application discloses a distributed storage system and its usage method. The distributed storage system includes at least two fault isolation domains. In each fault isolation domain, at least one first node and at least one second node are configured. The number of first nodes is the smallest odd number greater than the number of fault isolation domains or, if the number of fault isolation domains is odd, equal to the number of fault isolation domains. When one fault isolation domain fails, a second node in another fault isolation domain is set as a first node, replacing the failed first node and keeping the entire distributed storage system operating normally.
Attached Figure Description
[0028] To more clearly illustrate the embodiments of this application or the existing technical solutions, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments recorded in this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0029] Figure 1 is a schematic diagram of the structure of a distributed storage system in one embodiment of this application;
[0030] Figure 2 is a flowchart of a method for using a distributed storage system according to an embodiment of this application;
[0031] Figure 3 is a schematic diagram of the structure of the control device of a distributed storage system in one embodiment of this application;
[0032] Figure 4 is a schematic diagram of the structure of an electronic device in one embodiment of this application.
Detailed Implementation
[0033] To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions of this application will be clearly and completely described below in conjunction with specific embodiments and corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of them. Based on the embodiments in this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0034] In the cloud computing field, the deployment methods, monitoring, disaster recovery, and other technical solutions of distributed key-value storage systems determine the high availability of the underlying system. A highly available and high-performance distributed key-value storage system deployment and disaster recovery solution is of paramount importance in building the underlying infrastructure.
[0035] However, in related technologies, when a node in a fault isolation domain of a distributed key-value storage system fails, the entire distributed key-value storage system may stop operating normally.
[0036] To address the aforementioned problems, this application proposes a distributed storage system, which includes a distributed key-value storage system. The distributed storage system includes at least two fault isolation domains. Within each fault isolation domain, at least one first node and at least one second node are configured. The number of first nodes is the smallest odd number greater than the number of fault isolation domains or, if the number of fault isolation domains is odd, equal to the number of fault isolation domains. When one fault isolation domain fails, a second node in another fault isolation domain is set as a first node, replacing the first node of the failed fault isolation domain and keeping the entire distributed storage system operating normally.
[0037] The various non-limiting embodiments of this application will now be described in detail with reference to the accompanying drawings.
[0038] Figure 1 shows a schematic diagram of a distributed storage system according to an embodiment of this application. As shown in Figure 1, the distributed storage system includes at least two availability zones (AZs), which serve as the fault isolation domains. Each availability zone contains at least one first node and at least one second node. The number of first nodes is the smallest odd number greater than the number of availability zones or, when the number of availability zones is odd, equal to the number of availability zones. In the example of Figure 1, there are two fault isolation domains. One fault isolation domain contains two first nodes and one second node, while the other fault isolation domain contains one first node and two second nodes. The smallest odd number greater than the number of fault isolation domains (2) is 3; therefore, the number of first nodes is 3.
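The counting rule above can be written out as a short calculation. The following Python sketch is illustrative only: the function name `required_first_nodes` is hypothetical, and for an odd number of domains it assumes the "equal to the number of domains" branch of the rule is the one applied.

```python
def required_first_nodes(num_domains: int) -> int:
    """Number of first nodes for a given number of fault isolation domains."""
    if num_domains < 2:
        # The described system has at least two fault isolation domains.
        raise ValueError("at least two fault isolation domains are required")
    if num_domains % 2 == 1:
        # Odd domain count: the number of first nodes equals the domain count.
        return num_domains
    # Even domain count: the smallest odd number greater than it is n + 1.
    return num_domains + 1


for domains in (2, 3, 4, 5):
    print(domains, "domains ->", required_first_nodes(domains), "first nodes")
```

For the two-domain example of Figure 1 this gives 3 first nodes, matching the split of two first nodes in one domain and one in the other.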
[0039] In one example, the first node is an ETCD node (a member of an ETCD distributed key-value store), and the second node is a Learner node. When a fault isolation domain fails, a second node in another fault isolation domain is set as a first node, replacing the failed first node of that domain and keeping the entire distributed storage system operating normally. This overcomes the problem that the failure of a node within a fault isolation domain of the distributed key-value storage system could stop the entire system from operating normally.
[0040] In some embodiments, the at least two fault isolation domains are located within the same first region, and the fault isolation domains within the same first region can communicate with each other to facilitate data coordination. In one example, the distributed storage system uses the Raft consensus protocol to keep the states of the first nodes within the first region consistent. The distributed storage system is a distributed system in which multiple first nodes communicate with one another to form a unified storage system that provides services externally. Each first node stores the complete data, and the consensus protocol ensures that the data maintained by each first node is consistent. Each first node in the distributed storage system maintains a state machine, and at any given time one of the first nodes serves as the valid leader node. The leader node handles all write operations from clients, and the consensus protocol ensures that the changes made to the state machine by write operations are reliably synchronized to the other nodes.
[0041] Furthermore, the number of first nodes is equal to the number of second nodes. The above embodiments have explained that when one of the fault isolation domains fails, a second node in another fault isolation domain can be set as a first node to replace the first node in the failed fault isolation domain, ensuring the normal operation of the entire distributed storage system. Therefore, the number of second nodes is made equal to the number of first nodes so that enough second nodes are available to be set as first nodes when a fault isolation domain fails.
[0042] In some embodiments, the distributed storage system is connected to a client. The first nodes communicate with one another, and each first node can store data. The client can read the data stored by the first node, and / or the client can write the data stored by the first node. The first node provides data to the client, or the data stored by the first node supports the operation of the distributed storage system.
[0043] Understandably, each of the first nodes stores the same data. When the data on one first node is modified by the client, the other first nodes undergo the same change. That is, the distributed storage system uses the Raft consensus protocol to keep the state of each first node within the first region consistent. The distributed storage system is a distributed system in which multiple first nodes communicate with one another to form a unified storage system that provides services externally. Each first node stores the complete data, and the consensus protocol ensures that the data maintained by each first node is consistent. Each first node in the distributed storage system maintains a state machine, and at any given time one of the first nodes serves as the valid leader node. The leader node handles all write operations from clients, and the consensus protocol ensures that the changes made to the state machine by write operations are reliably synchronized to the other nodes.
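The replication behaviour described above can be pictured with a toy model. The sketch below is not Raft and not the ETCD implementation: it has no election, log, or failure handling, and the names `FirstNode` and `ToyCluster` are hypothetical. It only shows the stated property that a write accepted by one first node (called the leader here) ends up on every first node.

```python
class FirstNode:
    """A first node holding a complete copy of the data."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # the node's state machine: a plain key-value map

    def apply(self, key, value):
        self.store[key] = value


class ToyCluster:
    """Hypothetical model: writes go through one leader and are copied to all first nodes."""

    def __init__(self, names):
        self.nodes = [FirstNode(n) for n in names]
        self.leader = self.nodes[0]  # assumption: fixed leader, no election

    def put(self, key, value):
        # The leader applies the write, then every other first node applies the same change.
        self.leader.apply(key, value)
        for node in self.nodes:
            if node is not self.leader:
                node.apply(key, value)

    def get(self, node_name, key):
        node = next(n for n in self.nodes if n.name == node_name)
        return node.store.get(key)


cluster = ToyCluster(["first-1", "first-2", "first-3"])
cluster.put("config/timeout", "30s")
print([cluster.get(n.name, "config/timeout") for n in cluster.nodes])
```

In a real consensus protocol a write is acknowledged once a majority of voting nodes have persisted it; the toy model skips that step and copies synchronously.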
[0044] In some embodiments, the distributed storage system includes a first interface, which can be a member interface. A monitoring system is provided for the distributed storage system, and the monitoring system is communicatively connected to the first interface. The monitoring system is used to call the first interface to detect the operating status of the first node and / or the second node in the distributed storage system. The monitoring system is set up to monitor the operating status of the distributed storage system in real time, thus facilitating online real-time monitoring of the distributed storage system.
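As a rough illustration of what the monitoring system might do with the status returned through the first interface, the Python sketch below groups node health by fault isolation domain. The `Member` record, the `failed_domains` function, and the criterion that a domain has failed when none of its nodes is healthy are all assumptions made for this example; the application does not specify the interface's data format or the failure criterion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    domain: str    # fault isolation domain, for example an availability zone
    role: str      # "first" or "second"
    healthy: bool  # status as reported through the (hypothetical) first interface


def failed_domains(members):
    """Return the domains in which no node reports healthy (assumed criterion)."""
    health_by_domain = {}
    for m in members:
        health_by_domain.setdefault(m.domain, []).append(m.healthy)
    return sorted(d for d, flags in health_by_domain.items() if not any(flags))


# Topology of the Figure 1 example, with every node in az1 reported unhealthy.
members = [
    Member("n1", "az1", "first", False),
    Member("n2", "az1", "first", False),
    Member("n3", "az1", "second", False),
    Member("n4", "az2", "first", True),
    Member("n5", "az2", "second", True),
    Member("n6", "az2", "second", True),
]
print(failed_domains(members))
```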
[0045] Understandably, the second node acts as a backup node and can refuse requests to read or write the data it stores. The second node holds all of the data and does not affect the consensus protocol; each second node serves as a disaster recovery node and achieves high consistency through the database's native replication. Each fault isolation domain has a first node and a backup second node, and data is also applied on the second node.
[0046] In some embodiments, the distributed storage system includes a second interface. In one example, the second interface is an "upgrade" interface. The second interface is communicatively connected to the second node of each fault isolation domain. When the monitoring system detects a fault in one of the fault isolation domains, the monitoring system invokes the second interface and sets a second node of another fault isolation domain as a first node. The second node that has been set as a first node takes the place of the failed first node of the faulty fault isolation domain, ensuring the normal operation of the entire distributed storage system.
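The replacement step can be sketched as follows. This is a minimal illustration under stated assumptions: nodes are plain dictionaries, the function name `promote_replacements` is hypothetical, and changing the `role` field stands in for whatever call the second interface actually exposes.

```python
def promote_replacements(members, failed_domain):
    """Set second nodes outside the failed domain as first nodes, one per lost first node.

    members: list of dicts with "name", "domain" and "role" ("first" or "second").
    Returns the names of the nodes that were promoted.
    """
    lost = [m for m in members
            if m["domain"] == failed_domain and m["role"] == "first"]
    spares = [m for m in members
              if m["domain"] != failed_domain and m["role"] == "second"]
    promoted = []
    # zip stops at the shorter list, so at most len(spares) nodes are promoted.
    for _, spare in zip(lost, spares):
        spare["role"] = "first"  # stands in for a call to the second interface
        promoted.append(spare["name"])
    return promoted


# Figure 1 example: az1 holds two first nodes, az2 holds one first and two second nodes.
members = [
    {"name": "n1", "domain": "az1", "role": "first"},
    {"name": "n2", "domain": "az1", "role": "first"},
    {"name": "n3", "domain": "az1", "role": "second"},
    {"name": "n4", "domain": "az2", "role": "first"},
    {"name": "n5", "domain": "az2", "role": "second"},
    {"name": "n6", "domain": "az2", "role": "second"},
]
print(promote_replacements(members, "az1"))
```

After az1 fails in this example, both second nodes in az2 are promoted, so az2 again holds three first nodes.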
[0047] Figure 2 shows a flowchart of a method for using a distributed storage system according to an embodiment of this application. The method uses any of the distributed storage systems described above; for the specific structure of the distributed storage system, refer to the embodiments described above. Since this method employs all the technical solutions of the embodiments described above, it has at least all the beneficial effects brought about by those technical solutions, which will not be repeated here. As shown in Figure 2, the method for using the distributed storage system includes steps S01 and S02.
[0048] Step S01: Detect the distributed storage system to obtain the operating status of the distributed storage system;
[0049] Step S02: If the operating state is that one of the fault isolation domains has failed, then the second node of the other fault isolation domain is set as the first node.
[0050] The operating status of the distributed storage system is detected. When a fault isolation domain fails and the distributed storage system cannot operate normally, a second node of another fault isolation domain can be set as a first node, thereby keeping the distributed storage system in operation.
[0051] In some embodiments, step S01 includes setting a preset time interval and checking the distributed storage system at the preset time interval to obtain the operating status of the distributed storage system. Checking the distributed storage system at preset intervals allows faults in the distributed storage system to be detected promptly, so that a second node can be set as a first node and the distributed storage system keeps operating.
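The periodic detection described above might be sketched as a simple polling loop. The function `watch` and its parameters are hypothetical: `check_status` stands in for the detection of step S01 and returns the names of failed domains, `on_failure` stands in for the promotion of step S02, and `max_checks` exists only so that the example terminates.

```python
import time


def watch(check_status, on_failure, interval_seconds, max_checks):
    """Check the system every interval_seconds and react once to each failed domain."""
    handled = set()
    for i in range(max_checks):
        for domain in check_status():   # step S01: obtain the operating status
            if domain not in handled:
                on_failure(domain)      # step S02: set a second node as a first node
                handled.add(domain)
        if i < max_checks - 1:
            time.sleep(interval_seconds)  # wait for the preset time interval
    return handled


# Example: the second and third checks report that az1 has failed.
reports = iter([[], ["az1"], ["az1"]])
actions = []
watch(lambda: next(reports), actions.append, interval_seconds=0.01, max_checks=3)
print(actions)
```

Tracking the domains already handled keeps the loop from repeating the promotion on every later check while the domain is still down.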
[0052] Figure 3 shows a schematic diagram of the control device for a distributed storage system according to an embodiment of this application. As shown in Figure 3, the control device for the distributed storage system includes a detection unit and an execution unit.
[0053] The detection unit is used to detect the distributed storage system in order to obtain the operating status of the distributed storage system.
[0054] An execution unit is configured to, if the operating state is that a fault isolation domain has failed, set the second node of another fault isolation domain as the first node.
[0055] Optionally, the detection unit is configured to:
[0056] set a preset time interval and check the distributed storage system at the preset time interval to obtain the operating status of the distributed storage system.
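The division of the control device into a detection unit and an execution unit could be modelled as two small classes. This is only a structural sketch: the class names mirror the units named above, while the `probe` and `promote` callables are hypothetical stand-ins for the actual detection and promotion mechanisms.

```python
class DetectionUnit:
    def __init__(self, probe):
        self.probe = probe      # hypothetical callable returning failed domain names

    def detect(self):
        """Obtain the operating status as a list of failed fault isolation domains."""
        return list(self.probe())


class ExecutionUnit:
    def __init__(self, promote):
        self.promote = promote  # hypothetical callable taking a failed domain name

    def execute(self, failed):
        """For each failed domain, trigger the promotion of a second node elsewhere."""
        for domain in failed:
            self.promote(domain)
        return len(failed)


class ControlDevice:
    def __init__(self, detection_unit, execution_unit):
        self.detection_unit = detection_unit
        self.execution_unit = execution_unit

    def run_once(self):
        # Detection feeds execution: one pass of detect, then act on the result.
        return self.execution_unit.execute(self.detection_unit.detect())


handled = []
device = ControlDevice(DetectionUnit(lambda: ["az1"]), ExecutionUnit(handled.append))
device.run_once()
print(handled)
```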
[0057] Figure 4 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application. At the hardware level, the electronic device includes a processor, and optionally also includes an internal bus, a network interface, and a memory. The memory may include internal memory, such as high-speed random-access memory (RAM), or non-volatile memory, such as at least one disk storage device. Of course, the electronic device may also include other hardware required for other services.
[0058] The processor, network interface, and memory can be interconnected via an internal bus, which can be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, or an EISA (Extended Industry Standard Architecture) bus, etc. The bus can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, the bus is drawn in Figure 4 as a single double-headed arrow, but this does not mean that there is only one bus or one type of bus.
[0059] Memory is used to store instructions for execution. Specifically, instructions for execution are computer programs that can be executed. Memory can include main memory and non-volatile memory, and it provides the processor with execution instructions and data.
[0060] In one possible implementation, the processor reads the corresponding execution instructions from non-volatile memory into main memory and then executes them. Alternatively, it may obtain the corresponding execution instructions from other devices to logically form a distributed storage system usage method. The processor executes the execution instructions stored in memory to implement the distributed storage system usage method provided in any embodiment of this application.
[0061] The method executed by the control device of the distributed storage system provided in the embodiment shown in Figure 3 of this application can be applied to a processor, or implemented by a processor. The processor may be an integrated circuit chip with signal processing capabilities. During implementation, each step of the above method can be completed by integrated logic circuits in the processor's hardware or by instructions in software form. The processor can be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc.; it can also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor can be a microprocessor or any conventional processor.
[0062] The steps of the method disclosed in the embodiments of this application can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software modules can reside in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other storage media well established in the art. The storage medium is located in the memory, and the processor reads information from the memory and, in conjunction with its hardware, completes the steps of the above method.
[0063] This application also proposes a readable medium that stores execution instructions. When the stored execution instructions are executed by the processor of an electronic device, the electronic device can perform the distributed storage system usage method provided in any embodiment of this application, and specifically perform the aforementioned distributed storage system usage method.
[0064] The electronic devices described in the foregoing embodiments may be computers.
[0065] Those skilled in the art will understand that the embodiments of this application can be provided as methods or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or a combination of software and hardware.
[0066] The various embodiments in this application are described in a progressive manner. For similar or identical parts, the embodiments can be referred to one another. Each embodiment focuses on describing its differences from the other embodiments. In particular, the device embodiments are basically similar to the method embodiments, so their description is relatively brief; for relevant parts, refer to the descriptions of the method embodiments.
[0067] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0068] The above description is merely an embodiment of this application and is not intended to limit the scope of this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of the claims of this application.
Claims
1. A distributed storage system, characterized in that the distributed storage system includes at least two fault isolation domains, and at least one first node and at least one second node are set in each fault isolation domain; the number of first nodes is the smallest odd number greater than the number of fault isolation domains or, when the number of fault isolation domains is odd, the number of first nodes is equal to the number of fault isolation domains; the distributed storage system includes a distributed key-value storage system, the first node is an ETCD node, and the second node is a Learner node; the second node acts as a backup node and can refuse requests to read or write the data it stores; the distributed storage system is provided with a first interface; the distributed storage system is provided with a monitoring system, the monitoring system is communicatively connected to the first interface, and the monitoring system is used to call the first interface to detect the operating status of the first node and / or the second node in the distributed storage system; the distributed storage system is provided with a second interface, which is communicatively connected to the second node of each of the fault isolation domains; when the monitoring system detects a fault in one of the fault isolation domains, the monitoring system calls the second interface and sets the second node of another fault isolation domain as the first node to fill the first node of the faulty fault isolation domain.
2. The distributed storage system as described in claim 1, characterized in that the at least two fault isolation domains are set in the same first region.
3. The distributed storage system as described in claim 2, characterized in that the number of first nodes and the number of second nodes are equal.
4. The distributed storage system as described in claim 3, characterized in that the distributed storage system is connected to a client; the first nodes communicate with one another, and each first node can store data; the client can read the data stored by the first node, and / or the client can write the data stored by the first node.
5. The distributed storage system as described in claim 4, characterized in that each of the first nodes stores the same data, and when the data of one first node is modified by the client, the other first nodes undergo the same change.
6. A method of using a distributed storage system, characterized in that, using the distributed storage system as described in any one of claims 1-5, the method includes: detecting the distributed storage system to obtain its operating status; and, if the operating status indicates that one of the fault isolation domains has failed, setting the second node of another fault isolation domain as the first node.
7. The method of using the distributed storage system as described in claim 6, characterized in that the step of detecting the distributed storage system to obtain its operating status includes: setting a preset time interval, and checking the distributed storage system at the preset time interval to obtain the operating status of the distributed storage system.