[0050] Embodiment 1
[0051] Referring to Figure 1, which is a flowchart of Embodiment 1 of the data update method in a distributed storage system disclosed in the embodiments of the present invention, the method in this embodiment may include:
[0052] Step 101: The current server node receives the data to be updated sent by the client.
[0053] The client first sends the data to be updated to a server node in the distributed storage system. For example, user information is stored on each server node in the distributed storage system. If the user information needs to be changed, the client sends the new user information to one of the server nodes. The new user information is the data to be updated, and the server node that receives it is the current server node.
[0054] Step 102: The current server node incrementally assigns a unique version number to the data to be updated, and acquires the identifiers of multiple replica server nodes where multiple copies of the data to be updated are located from the metadata information repository.
[0055] The metadata information repository is established in advance and can store the identifier of each server node in the distributed storage system, the distribution information of the replicas across the server nodes, and the status of the replicas. When a server node starts, it can register its replica distribution information and replica status with the metadata information repository through its own data service module, and then maintain a heartbeat with the metadata information repository. The metadata information repository can also maintain the replica distribution information and replica status data, and expose a replica query interface and a replica status change monitoring interface.
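The registration, heartbeat, and replica query behaviour described above can be illustrated with a minimal in-memory sketch. The class name `MetadataRepository`, its method names, the status string "normal", and the rule that a node is skipped once its heartbeat is older than a timeout are all assumptions made for illustration; the embodiment does not specify them.

```python
import time
from dataclasses import dataclass


@dataclass
class NodeRecord:
    replicas: dict          # replica id -> status string (values are illustrative)
    last_heartbeat: float   # timestamp of the most recent heartbeat


class MetadataRepository:
    """Hypothetical in-memory stand-in for the metadata information repository."""

    def __init__(self, heartbeat_timeout=10.0):
        self.heartbeat_timeout = heartbeat_timeout
        self.nodes = {}  # node id -> NodeRecord

    def register(self, node_id, replicas, now=None):
        """Called by a node's data service module when the node starts."""
        now = time.time() if now is None else now
        self.nodes[node_id] = NodeRecord(dict(replicas), now)

    def heartbeat(self, node_id, now=None):
        """Refresh the liveness timestamp of a registered node."""
        now = time.time() if now is None else now
        self.nodes[node_id].last_heartbeat = now

    def query_replica_nodes(self, replica_id, now=None):
        """Replica query interface: ids of live nodes that hold the replica."""
        now = time.time() if now is None else now
        return sorted(
            node_id
            for node_id, rec in self.nodes.items()
            if replica_id in rec.replicas
            and now - rec.last_heartbeat <= self.heartbeat_timeout
        )
```

The `now` parameter exists only so that the sketch can be exercised deterministically; a real repository would more likely be a separate coordination service than a Python object.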
[0056] After receiving the data to be updated sent by the client, the current server node may acquire, from the metadata information repository, the identifiers of the multiple replica server nodes where the multiple replicas of the data to be updated are located. In a distributed storage system, a piece of data is divided into multiple pieces of sub-data, which are stored on multiple server nodes, and each server node stores a replica of a certain piece of sub-data. Therefore, when the data to be updated is received in this step, it is first necessary to determine which server nodes store that data, that is, the multiple replica server nodes where the multiple replicas of the data to be updated are located.
[0057] For example, suppose the data to be updated is user information, and the user information is divided into 10 parts, each stored on a group of servers: the 1st to 5th servers save the 1st to 100th user information, the 6th to 10th servers save the 101st to 200th user information, and so on, up to the 46th to 50th servers, which save the 901st to 1000th user information. Each server node thus holds a replica of the user information it saves. If the data to be updated in this step is the 99th user information, the replicas saved on the 1st to 5th server nodes need to be updated; that is, the 1st to 5th server nodes are the replica server nodes determined in this step.
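The lookup in this example can be sketched as a small function. It assumes, purely for illustration, that every group of 100 consecutive user records is held by 5 consecutive servers; the function name and its parameters are hypothetical, and in the embodiment itself this mapping would be obtained from the metadata information repository rather than computed.

```python
def replica_servers_for_user(user_index, users_per_part=100, servers_per_part=5):
    """Return the 1-based server numbers that hold the given 1-based user record.

    Assumes a fixed range partitioning: records 1..100 on servers 1..5,
    records 101..200 on servers 6..10, and so on.
    """
    if user_index < 1:
        raise ValueError("user_index is 1-based")
    part = (user_index - 1) // users_per_part   # 0-based partition index
    first = part * servers_per_part + 1         # first server of that partition
    return list(range(first, first + servers_per_part))
```

For the 99th user information, the function returns servers 1 to 5, matching the replica server nodes determined in the example.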
[0058] When the current server node assigns a version number to the data to be updated, for the same piece of data, the data to be updated received in the first update can be assigned version number 1, the second update version number 2, and so on. In this way, a unique version number is assigned to the data to be updated on every update. Each time a replica server receives the data to be updated and its version number, it updates the data, records the received version number, and treats it as the current version number of the replica.
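A minimal sketch of incremental version assignment on the current server node follows. The class name `VersionAllocator`, the per-key counter, and the use of a lock are assumptions for illustration; the embodiment only requires that the version numbers for the same data be unique and increasing.

```python
import threading


class VersionAllocator:
    """Hypothetical per-key counter: the first update of a key gets version 1."""

    def __init__(self):
        self._lock = threading.Lock()
        self._versions = {}  # data key -> last assigned version number

    def next_version(self, key):
        # The lock keeps version numbers unique when updates arrive concurrently.
        with self._lock:
            version = self._versions.get(key, 0) + 1
            self._versions[key] = version
            return version
```

Note that the counter here lives in the memory of a single process; how the version sequence is kept consistent when different server nodes receive updates for the same data is not covered by this sketch.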
[0059] Step 103: The current server node sends the data to be updated and its assigned version number to the replica server nodes corresponding to the identifiers of the multiple replica server nodes, so that each of the multiple replica server nodes updates its stored replica and the corresponding version number according to the data to be updated; the version number indicates the number of times the replica has been updated.
[0060] The current server node sends the data to be updated to each of the replica server nodes corresponding to the identifiers of the multiple replica server nodes, and each replica server node then updates its stored replica and the corresponding version number according to the data to be updated, where the version number represents the number of times the replica has been updated. For example, a version number of 1 indicates that the current update is the first update, and a version number of n indicates that the current update is the nth update. When a replica server performs the data update, it also needs to check that the version numbers are consistent, that is, that the replica version number it has recorded is exactly 1 smaller than the received version number; if not, it does not perform the update.
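The version check performed by a replica server can be sketched as follows. The class `Replica`, the method `apply_update`, and the convention that a never-updated replica has version 0 are hypothetical names and choices made for illustration.

```python
class Replica:
    """Hypothetical replica that accepts only the next update in sequence."""

    def __init__(self):
        self.version = 0   # 0 means the replica has never been updated
        self.data = None

    def apply_update(self, data, version):
        """Apply the update only if the recorded version is exactly version - 1."""
        if version != self.version + 1:
            return False   # version gap or repeated update: do not update
        self.data = data
        self.version = version
        return True
```

The boolean return value stands in for the success or failure notification that the replica server node sends back to the current server node.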
[0061] The current server node may send data to the multiple replica server nodes in different ways; for example, the data to be updated may be sent to the multiple replica server nodes in parallel or sequentially.
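The two sending strategies can be contrasted in a short sketch. Here `send` is a hypothetical callable that delivers the data and version number to one replica server node and returns whether that node updated successfully; the thread pool is only one possible way to send in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def send_sequential(send, node_ids, data, version):
    """Send to each replica server node one after another."""
    return {node_id: send(node_id, data, version) for node_id in node_ids}


def send_parallel(send, node_ids, data, version, max_workers=8):
    """Send to all replica server nodes concurrently using a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            node_id: pool.submit(send, node_id, data, version)
            for node_id in node_ids
        }
        # result() blocks until the corresponding send has finished.
        return {node_id: fut.result() for node_id, fut in futures.items()}
```

Both functions return a mapping from node identifier to update result, so the caller can count the successes in the same way regardless of which strategy was used.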
[0062] Step 104: The current server node determines whether at least half of the multiple replica server nodes have successfully updated the data; if so, proceed to step 105.
[0063] After a replica server node successfully updates the replica it stores, it notifies the current server node. If the current server node determines that at least half of the replica server nodes have successfully updated the data, it proceeds to step 105 and returns the data update success message and the updated version number to the client. For example, following the above example with 5 replica server nodes, step 105 is executed once three replica server nodes have updated successfully. If there are 6 replica servers in total, at least 4 replica servers must update successfully.
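The success threshold can be written as a short calculation. The text states the condition as "at least half", while both worked examples (3 of 5 and 4 of 6) correspond to more than half; the sketch below follows the worked examples, which is an interpretation rather than something the text states outright. The function names are hypothetical.

```python
def required_acks(replica_count):
    """Smallest number of successful replicas that is more than half."""
    return replica_count // 2 + 1


def update_succeeded(acks, replica_count):
    """True once enough replica server nodes have reported success."""
    return acks >= required_acks(replica_count)
```

With 5 replica server nodes the threshold is 3, and with 6 it is 4, matching the figures given in the example.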
[0064] Step 105: Return the data update success message and the updated version number to the client.
[0065] In this embodiment, a data update is considered successful once at least half of the replica server nodes have updated successfully, which improves the efficiency of updating data. In addition, because the version number of the data corresponds to the number of updates, a client can request data with the successfully written version number when reading; if a server node's recorded version number indicates that the replica it stores is not the latest, it can reject the read operation, so that the client retries on another server node. This ensures that the client reads the latest data, that is, read-write consistency is achieved. The present embodiment therefore more completely solves the problem of data consistency among multiple replicas and ensures subsequent read performance, and the solutions provided by the embodiments of the present invention can be directly applied to various storage systems.
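The read path described in this paragraph can be sketched as follows, assuming the client passes the version number returned by its last successful write. The names `ReplicaNode`, `StaleReplicaError`, and `read_latest` are hypothetical, and trying the nodes in list order is only one possible retry policy.

```python
class StaleReplicaError(Exception):
    """Raised when a replica is older than the version the client requires."""


class ReplicaNode:
    def __init__(self, data, version):
        self.data = data
        self.version = version

    def read(self, min_version):
        # Reject the read if this replica has not reached the written version.
        if self.version < min_version:
            raise StaleReplicaError(
                "replica at version %d, need %d" % (self.version, min_version)
            )
        return self.data


def read_latest(nodes, min_version):
    """Client-side retry: try each node until one can serve min_version."""
    for node in nodes:
        try:
            return node.read(min_version)
        except StaleReplicaError:
            continue  # this replica is stale; retry on the next node
    raise StaleReplicaError("no replica has reached version %d" % min_version)
```

Because a successful write has reached more than half of the replicas under the threshold used in the examples, a client that retries across the replica server nodes will find one that holds the written version, provided those nodes remain reachable.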