Instance processing method, apparatus, electronic device, and computer-readable storage medium

By updating the hash value and permission expiration time of the permission table in the stateful service instance and managing them according to the preset renewal cycle, the availability problem of stateful service instances during scaling up, down or failure is solved, ensuring the stability and reliability of the service.

CN117081761B · Active · Publication Date: 2026-06-26 · TENCENT TECHNOLOGY (SHENZHEN) CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
TENCENT TECHNOLOGY (SHENZHEN) CO LTD
Filing Date
2022-05-09
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Stateful service instances cannot maintain availability when the server scales up or down or fails, resulting in service unavailability.

Method used

By obtaining the updated local permission table, including the hash value of stateful service instances on the hash ring and the permission expiration time, and updating it according to the preset renewal cycle, the validity of the permission table is ensured. The table is also updated accordingly when new service instances are launched to avoid permission conflicts.

Benefits of technology

This ensures the availability and permission validity of stateful service instances in the event of server scaling up, scaling down, or failure, reducing the risk of service interruption and improving system stability.

✦ Generated by Eureka AI based on patent content.

Abstract

Embodiments of the present application disclose an instance processing method and device, electronic equipment and a computer readable storage medium; the method comprises: obtaining a local authority table updated at a first renewal time, the local authority table comprising a first hash value and a first authority deadline of each stateful service instance on a hash ring; at a second renewal time, based on the update data obtained from the database, performing a first update on the first authority deadline and the first hash value to obtain a first updated local authority table; if the first updated local authority table comprises a second hash value of a newly added stateful service instance on the hash ring, obtaining a target update time of the newly added stateful service instance; according to the target update time, performing a second update on the first updated local authority table to obtain a second updated local authority table, and the second updated local authority table comprises a second authority deadline of the newly added stateful service instance. The embodiments of the present application can guarantee the availability of the cluster.

Description

Technical Field

[0001] This application relates to the field of Internet technology, and more specifically to an instance processing method, apparatus, electronic device, and computer-readable storage medium.

Background Technology

[0002] Stateful services refer to services that store intermediate data generated during operation and depend on context. In other words, services that process the same request may yield different results. For example, a stateful service might be a purchase service. The server, through the purchase service, adds the first item to the shopping cart based on a received add-to-cart request. When the server receives a purchase service request, it responds based on the first item in the shopping cart. The first item in the shopping cart is the intermediate data for the purchase service. A stateful service typically has multiple instances.

[0003] When a server is running a stateful service instance, if the server scales up or down or fails, the stateful service instance may lose availability.

Summary of the Invention

[0004] This application provides an instance processing method, apparatus, electronic device, and computer-readable storage medium, which can solve the technical problem that stateful service instances cannot maintain availability.

[0005] An instance processing method includes:

[0006] Obtain the local permission table updated at the first renewal time. The local permission table includes the first hash value and the first permission expiration time of each stateful service instance on the hash ring. The time interval between the first permission expiration time and the first renewal time is greater than the preset renewal period.

[0007] At the second renewal time, updated data is retrieved from the database, and based on the updated data, the first permission expiration time and the first hash value are updated to obtain the local permission table after the first update. The first time interval between the second renewal time and the first renewal time is the preset renewal period.

[0008] If the local permission table after the first update includes the second hash value of the newly added stateful service instance on the hash ring, then obtain the target update time of the newly added stateful service instance.

[0009] Based on the target update time mentioned above, the local permission table after the first update is updated a second time to obtain the local permission table after the second update. The local permission table after the second update includes the expiration time of the second permission for the newly added stateful service instance.

[0010] Accordingly, embodiments of this application provide an instance processing apparatus, including:

[0011] The first acquisition module is used to acquire the local permission table updated at the first renewal time. The local permission table includes the first hash value of each stateful service instance on the hash ring and the first permission expiration time. The time interval between the first permission expiration time and the first renewal time is greater than the preset renewal period.

[0012] The second acquisition module is used to acquire updated data from the database at the second renewal time, and based on the updated data, perform a first update on the first permission expiration time and the first hash value to obtain the local permission table after the first update. The first time interval between the second renewal time and the first renewal time is the preset renewal period.

[0013] The third acquisition module is used to acquire the target update time of the newly added stateful service instance if the local permission table after the first update includes the second hash value of the newly added stateful service instance on the hash ring.

[0014] The update module is used to perform a second update on the local permission table after the first update based on the target update time, so as to obtain a second updated local permission table. The second updated local permission table includes the second permission expiration time of the newly added stateful service instance.

[0015] Optionally, the third acquisition module is specifically used to execute:

[0016] If the local permission table after the first update includes the second hash value of the newly added stateful service instance on the hash ring, then obtain the preset time and the online time of the newly added stateful service instance.

[0017] The target renewal period is determined based on the aforementioned preset time and the aforementioned preset renewal period;

[0018] Based on the above-mentioned launch time and target renewal period, the target update time for the newly added stateful service instances is determined.

[0019] Optionally, the above-mentioned instance processing apparatus further includes:

[0020] The deletion module is used to execute:

[0021] Receive offline notification;

[0022] Based on the above offline notification, stateful service instances that are offline are selected from the above local permission table.

[0023] Delete the first hash value and first permission expiration time of the above offline stateful service instance in the above local permission table to obtain the third updated local permission table.

[0024] Optionally, the second acquisition module is specifically used to perform:

[0025] The aforementioned preset renewal period is divided to obtain a time slice set, which includes a first number of time slices;

[0026] Filter out the target time slice corresponding to each of the above stateful service instances from the above time slice set;

[0027] Within the target time slice corresponding to the aforementioned stateful service instance, updated data is retrieved from the database at the second renewal time. The second renewal time and the first renewal time are in different preset renewal periods.

[0028] Optionally, the first number mentioned above is greater than the second number of stateful service instances mentioned above in the local permission table.

[0029] Accordingly, the second acquisition module is specifically used to execute:

[0030] The target time slices are obtained by selecting, from the front of the above set of time slices, as many time slices as the second number;

[0031] Assign a target time slice to each of the above stateful service instances to obtain the target time slices corresponding to the stateful service instances;

[0032] The second acquisition module mentioned above is specifically used to execute:

[0033] Based on the updated data, the first permission expiration time and the first hash value are updated for the first time.

[0034] If the first update fails, retry time slices are selected from the remaining time slices. The remaining time slices are the time slices in the set of time slices other than the target time slice.

[0035] Within the retry time slice described above, the first permission expiration time and the first hash value are updated based on the updated data to obtain the local permission table after the first update.

[0036] Optionally, the second acquisition module described above is specifically used to perform:

[0037] If the first update fails, then determine the target number of update failures;

[0038] If the target number is less than or equal to the preset number, then the retry time slice of the stateful service instance is selected from the remaining time slices based on the target time slice corresponding to the stateful service instance and the target number.
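The time-slice scheme of [0025] to [0038] can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names, the equal-width split of the renewal period, and the rule used to pick a retry slice (rotating through the remaining slices by target index plus failure count) are all hypothetical choices made for the example.

```python
def split_period(period_s: float, first_number: int) -> list[tuple[float, float]]:
    """Divide one renewal period into `first_number` equal slices (offsets in seconds)."""
    width = period_s / first_number
    return [(i * width, (i + 1) * width) for i in range(first_number)]


def assign_target_slices(instances: list[str], first_number: int) -> dict[str, int]:
    """Give each instance one of the leading slices; slices must outnumber instances."""
    if first_number <= len(instances):
        raise ValueError("the first number must exceed the number of instances")
    return {name: idx for idx, name in enumerate(instances)}


def pick_retry_slice(target_idx: int, failures: int, n_instances: int,
                     first_number: int, max_failures: int):
    """Return the index of a retry slice, or None once the failure limit is exceeded.

    Assumed rule: rotate through the remaining (unassigned) slices using the
    instance's target slice index and its failure count.
    """
    if failures > max_failures:
        return None
    remaining = list(range(n_instances, first_number))
    return remaining[(target_idx + failures - 1) % len(remaining)]
```

With a 10-second period split into 5 slices and 3 instances, slices 0 to 2 are target slices and slices 3 and 4 remain available for retries.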

[0039] Optionally, the above-mentioned instance processing apparatus further includes:

[0040] The request / response module is used to execute:

[0041] Obtain the service request for the first execution stateful service instance, and perform a hash operation on the object identifier in the service request to obtain the third hash value corresponding to the service request.

[0042] Based on the third hash value and the fourth hash value in the second updated local permission table, the second execution stateful service instance that executes the above service request is selected from the second updated local permission table.

[0043] If the first execution stateful service instance and the second execution stateful service instance are the same, and the current time is before the first permission expiration time of the second execution stateful service instance, then the service request will be responded to through the second execution stateful service instance.
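The request check of [0041] to [0043] can be sketched as follows. This is an illustrative sketch only: the `Entry` type, the ring size of 1024, the use of SHA-256 as the hash operation, and the function names are assumptions that do not come from the application.

```python
import bisect
import hashlib
from dataclasses import dataclass

RING_SIZE = 1024  # assumed ring size, for illustration only


@dataclass(frozen=True)
class Entry:
    instance: str      # stateful service instance identifier
    hash_value: int    # position of the instance on the hash ring
    expires_at: float  # permission expiration time (epoch seconds)


def ring_hash(object_id: str) -> int:
    """Hash an object identifier onto the ring (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256(object_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % RING_SIZE


def owner(table: list[Entry], h: int) -> Entry:
    """Return the first entry at or clockwise after position h, wrapping around."""
    ordered = sorted(table, key=lambda e: e.hash_value)
    keys = [e.hash_value for e in ordered]
    return ordered[bisect.bisect_left(keys, h) % len(ordered)]


def may_respond(table: list[Entry], receiving_instance: str,
                object_id: str, now: float) -> bool:
    """True if the receiving instance owns the object and its permission is unexpired."""
    entry = owner(table, ring_hash(object_id))
    return entry.instance == receiving_instance and now < entry.expires_at
```

A request is answered only when both conditions hold; a request that reaches a non-owning instance, or an owner whose permission has lapsed, is not served by that instance.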

[0044] Furthermore, this application also provides an electronic device, including a processor and a memory, wherein the memory stores a computer program, and the processor is used to run the computer program in the memory to implement the instance processing method provided in this application.

[0045] Furthermore, embodiments of this application also provide a computer-readable storage medium storing a computer program adapted for loading by a processor to execute any of the instance processing methods provided in embodiments of this application.

[0046] Furthermore, this application also provides a computer program product, including a computer program, which, when executed by a processor, implements any of the instance processing methods provided in this application.

[0047] In this embodiment, a local permission table updated with a first renewal time is obtained. This local permission table includes the first hash value of each stateful service instance on the hash ring and a first permission expiration time. The time interval between the first permission expiration time and the first renewal time is greater than a preset renewal period. Then, at the second renewal time, updated data is retrieved from the database, and based on this updated data, the first permission expiration time and the first hash value are updated to obtain a first-updated local permission table. The first time interval between the second renewal time and the first renewal time is the preset renewal period. If the first-updated local permission table includes the second hash value of a newly added stateful service instance on the hash ring, the target update time of the newly added stateful service instance is obtained. Finally, based on the target update time, the first-updated local permission table is updated a second time to obtain a second-updated local permission table, which includes the second permission expiration time of the newly added stateful service instance.

[0048] In this embodiment, since the first time interval between the first renewal time and the second renewal time is a preset renewal period, and the time interval between the first permission expiration time and the first renewal time is greater than the preset renewal period, updating the local permission table at the second renewal time can ensure that the permissions of each stateful service instance on the hash ring in the local permission table will not expire, thus ensuring the availability of each stateful service instance.
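The timing argument of [0048] can be checked with a small simulation. The sketch below assumes that each renewal sets the expiration time to the renewal time plus a fixed `lease`, which is a hypothetical parameter; the application only requires that the gap between expiration time and renewal time exceed the preset renewal period.

```python
def simulate_renewals(period: float, lease: float, rounds: int) -> list[tuple[float, float]]:
    """Return (renewal_time, expiration_time_in_force_at_that_moment) pairs.

    Assumption: every renewal at time t sets the expiration time to t + lease.
    """
    if lease <= period:
        raise ValueError("lease must be greater than the renewal period")
    expiry = 0.0 + lease          # set at the first renewal time, t = 0
    checks = []
    for k in range(1, rounds + 1):
        t = k * period            # next renewal, one period later
        checks.append((t, expiry))
        expiry = t + lease        # renewal pushes the expiration time forward
    return checks


# Every renewal happens strictly before the expiration time then in force.
print(all(t < expiry for t, expiry in simulate_renewals(10.0, 15.0, 5)))
```

Because the lease is longer than the period, each renewal arrives while the previous permission is still valid, so the permission never lapses between renewals.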

[0049] Furthermore, when the local permission table after the first update includes the second hash value of the newly added stateful service instance on the hash ring, it indicates the existence of a newly launched stateful service instance. This means that some permissions of the existing stateful service instance on the hash ring need to be transferred to the newly added stateful service instance. Because the existence of the newly added stateful service instance can only be known at the second renewal time, based on the updated data, a second update is performed on the local permission table after the first update according to the target update time, and the second permission expiration time of the newly added stateful service instance is obtained. This avoids the situation where these permissions are simultaneously held by both the existing stateful service instance and the newly added stateful service instance, thus ensuring the availability of both.

Attached Figure Description

[0050] To more clearly illustrate the technical solutions in the embodiments of this application, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0051] Figure 1 is a schematic diagram of the instance processing procedure provided in the embodiments of this application;

[0052] Figure 2 is a flowchart illustrating the instance processing method provided in the embodiments of this application;

[0053] Figure 3 is a schematic diagram illustrating the distribution of data provided in the embodiments of this application;

[0054] Figure 4 is a schematic diagram of cluster scaling down provided in an embodiment of this application;

[0055] Figure 5 is a schematic diagram of another cluster scaling-down method provided in an embodiment of this application;

[0056] Figure 6 is a schematic diagram of a hash ring provided in an embodiment of this application;

[0057] Figure 7 is a schematic diagram of the renewal process provided in an embodiment of this application;

[0058] Figure 8 is a schematic diagram of retry time slices provided in an embodiment of this application;

[0059] Figure 9 is a schematic diagram of another hash ring provided in an embodiment of this application;

[0060] Figure 10 is a schematic diagram of another renewal process provided in an embodiment of this application;

[0061] Figure 11 is a schematic diagram of another renewal process provided in an embodiment of this application;

[0062] Figure 12 is a schematic diagram of another renewal process provided in an embodiment of this application;

[0063] Figure 13 is a flowchart illustrating another instance processing method provided in an embodiment of this application;

[0064] Figure 14 is a schematic diagram of the framework of the instance processing method provided in the embodiments of this application;

[0065] Figure 15 is a schematic diagram illustrating the online process of a stateful service instance provided in the embodiments of this application;

[0066] Figure 16 is a schematic diagram of the structure of the instance processing device provided in the embodiments of this application;

[0067] Figure 17 is a schematic diagram of the structure of the electronic device provided in the embodiments of this application.

Detailed Implementation

[0068] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0069] This application provides an instance processing method, apparatus, electronic device, and computer-readable storage medium. The instance processing apparatus can be integrated into an electronic device, which may be a server or a terminal, etc.

[0070] The server can be a standalone physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, network acceleration services (Content Delivery Network, CDN), as well as big data and artificial intelligence platforms.

[0071] Furthermore, multiple servers can form a blockchain, with the servers being nodes on the blockchain.

[0072] The terminal can be a smartphone, tablet, laptop, desktop computer, smart speaker, smartwatch, etc., but is not limited to these. The terminal and the server can be connected directly or indirectly through wired or wireless communication, which is not limited herein.

[0073] For example, as shown in Figure 1, the server cluster includes a first server and a second server. The first server first obtains the local permission table updated at the first renewal time. The local permission table includes the first hash value of each stateful service instance on the hash ring and the first permission expiration time. The time interval between the first permission expiration time and the first renewal time is greater than a preset renewal period. Then, when the second server obtains a newly added stateful service instance, it updates its own target local permission table so that the updated target local permission table includes the second hash value of the newly added stateful service instance, and stores the second hash value of the newly added stateful service instance in the main permission table of the database.

[0074] The first server retrieves updated data from the database at the second renewal time. The updated data includes the main permission table in the database. Based on the main permission table in the database, the first server updates the local permission table updated at the first renewal time to obtain the first updated local permission table. The first updated local permission table includes the second hash value of the newly added stateful service instance. The first time interval between the second renewal time and the first renewal time is a preset renewal period. The first permission expiration time in the first updated local permission table is sent to the main permission table in the database for storage.

[0075] The second server determines the target update time based on the online time and preset renewal period of the newly added stateful service instance. Then, based on the target update time, it retrieves the target update data from the database, updates the target local permission table according to the target update data, obtains the second permission expiration time of the newly added stateful service instance, and stores the second permission expiration time of the newly added stateful service instance in the main permission table of the database, thus obtaining the updated main permission table in the database.

[0076] At the third renewal time, the first server retrieves updated data from the database. The updated data includes the updated master permission table in the database. Based on the updated master permission table in the database, the first updated local permission table is updated a second time to obtain the second updated local permission table. The second updated local permission table includes the second permission expiration time of the newly added stateful service instance. The second time interval between the third renewal time and the second renewal time is the preset renewal period.

[0077] Furthermore, in the embodiments of this application, "multiple" refers to two or more. The terms "first" and "second," etc., in the embodiments of this application are used for distinguishing descriptions and should not be construed as implying relative importance.

[0078] The following sections provide detailed descriptions of each example. It should be noted that the order in which the embodiments are described is not intended to limit the preferred order of the embodiments.

[0079] In this embodiment, the description will be from the perspective of an instance processing device, which can be integrated into a server or terminal. To facilitate the explanation of the instance processing method of this application, the following will describe the instance processing device integrated into a first server in detail, that is, the first server will be used as the execution subject in the detailed explanation.

[0080] Refer to Figure 2, which is a flowchart illustrating an embodiment of an instance processing method provided in this application. The instance processing method may include:

[0081] S201. Obtain the local permission table updated at the first renewal time. The local permission table includes the first hash value and the first permission expiration time of each stateful service instance on the hash ring. The time interval between the first permission expiration time and the first renewal time is greater than the preset renewal period.

[0082] An instance refers to a collection of code that implements a specific function, such as a collection of code that implements the login function for game players. A stateful service instance refers to an instance corresponding to a stateful service. A stateful service is a service that stores intermediate data generated during service operation and depends on its context. For example, the server, through a purchase service, adds the first item to a shopping cart based on a received add-to-cart request. When the server receives a purchase service request, it responds to the purchase service request based on the first item in the shopping cart. The first item in the shopping cart is the intermediate data for the purchase service.

[0083] Each service request contains an associated object. For example, if service request 1 is to obtain data about game player A, then the associated object of service request 1 is game player A. In other words, service request 1 contains the identifier of game player A.

[0084] Typically, there are multiple objects and multiple stateful service instances. To improve processing speed, the data corresponding to the objects is stored in the cache of the stateful service instance. That is, the data corresponding to the object identifier is stored in the cache of the stateful service instance. To ensure data consistency, the data corresponding to the same object identifier cannot be processed by two stateful service instances at the same time. Furthermore, to achieve load balancing, multiple stateful service instances can evenly process the data corresponding to multiple object identifiers.

[0085] For example, there exist M object identifiers corresponding to data and N stateful service instances, where 0, 1, 2, ..., M-1 represent object identifiers, and 0, 1, 2, ..., N-1 represent identifiers of stateful service instances (both object identifiers and stateful service instance identifiers can exist in the form of numbers). The M object identifiers and N stateful service instances satisfy the following mapping relationship:

[0086] f: {0, 1, 2..., M-1} → {0, 1, 2..., N-1}

[0087] The type of f can be selected according to the actual situation. For example, f can be a modulo operator, i.e.:

[0088] f(m)=m%N, m∈{0, 1, 2..., M-1}

[0089] However, in practical applications, the object identifier and the identifier of the stateful service instance may not be regular. For example, the object identifier may be any 32-bit integer, and the object identifier and the identifier of the stateful service instance may not be continuous. This may result in multiple stateful service instances not processing the data corresponding to multiple object identifiers evenly, that is, the distribution of the data corresponding to multiple object identifiers is not uniform.

[0090] Furthermore, when the cluster expands, shrinks, or experiences a failure, the data corresponding to the object identifier needs to be reallocated, causing most of the cached data corresponding to the object identifier to become invalid, resulting in a cascading failure.

[0091] For example, when there are 6 sets of data corresponding to object identifiers and 3 stateful service instances, the distribution of the data corresponding to the 6 object identifiers can be as shown at 301 in Figure 3. When the cluster shrinks, i.e., when only two stateful service instances remain, the data corresponding to the object identifiers is reallocated. This causes most of the cached data corresponding to the object identifiers to become invalid. The invalidated cached data is indicated by the dashed lines at 302 in Figure 3.
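The effect described in [0091] can be reproduced numerically, assuming the modulo mapping f(m) = m % N from [0088]. The function names below are hypothetical, and the exact objects highlighted in Figure 3 are not recoverable from the text, so the result illustrates the principle rather than the figure.

```python
def modulo_owner(object_id: int, n_instances: int) -> int:
    """Map an object identifier to an instance identifier with f(m) = m % N."""
    return object_id % n_instances


def remapped(object_ids, before: int, after: int) -> list[int]:
    """Object identifiers whose owning instance changes when N goes from before to after."""
    return [m for m in object_ids
            if modulo_owner(m, before) != modulo_owner(m, after)]


# Shrinking from 3 instances to 2 with 6 object identifiers.
print(remapped(range(6), 3, 2))  # [2, 3, 4, 5]
```

Four of the six object identifiers change owner, so their cached data is invalidated even though only one instance was removed.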

[0092] To solve this technical problem, a consistent hashing function can be used as f in this instance, that is, a consistent hashing function can be used for the above mapping relationship. In this case, the process of mapping objects to stateful service instances can be as follows:

[0093] First, the space containing the hash value is treated as a hash ring, and the hash ring is divided into scales. Then, by taking the remainder of the object identifier with respect to the size of the hash ring, the hash value of the object identifier on the hash ring is obtained (i.e., the scale of the object identifier on the hash ring is obtained). By taking the remainder of the stateful service instance identifier with respect to the size of the hash ring, the first hash value of the stateful service instance identifier on the hash ring is obtained (i.e., the scale of the stateful service instance identifier on the hash ring is obtained). In other words, the object identifier and the identifier of the stateful service instance are mapped to the hash ring, thereby finding the correspondence between the object identifier and the identifier of the stateful service instance.

[0094] For example, the positions of the object identifiers and the stateful service instance identifiers on the hash ring can be as shown at 401 in Figure 4. Moving clockwise from the hash value of an object identifier, the first stateful service instance encountered is the stateful service instance corresponding to that object identifier. For example, the stateful service instance corresponding to object {0, 3} (object {0, 3} represents object identifier 0 and object identifier 3) is stateful service instance 0, and the stateful service instance corresponding to object {2, 5} is stateful service instance 2.

[0095] When the cluster is downsized, stateful service instance 2 is removed from the hash ring. At this time, as shown at 402 in Figure 4, the stateful service instance corresponding to object {2, 5} changes from stateful service instance 2 to stateful service instance 0. Only the cached data corresponding to object {2, 5} becomes invalid, and the data corresponding to other object identifiers does not need to be changed, thereby reducing the amount of invalid data.

[0096] Although using a consistent hashing function as the mapping relationship reduces the amount of cached data that is invalidated, the data distribution can still become uneven as scaling up and down proceeds. Therefore, to further address the issue of data distribution uniformity, in some embodiments, the process of mapping objects to stateful service instances can be as follows:

[0097] The identifier of a stateful service instance is mapped to multiple virtual identifiers, and then the virtual identifiers are mapped to a hash ring. When an object identifier corresponds to a certain virtual identifier, the object identifier also corresponds to the identifier of the stateful service instance corresponding to that virtual identifier.

[0098] For example, as shown at 501 in Figure 5, stateful service instance identifier A is mapped to virtual identifiers A1, A2, A3, and A4. The hash value of virtual identifier A1 is 0, the hash value of virtual identifier A2 is 90, the hash value of virtual identifier A3 is 505, and the hash value of virtual identifier A4 is 701. Stateful service instance identifier B is mapped to virtual identifiers B1, B2, B3, and B4. The hash value of virtual identifier B1 is 251, the hash value of virtual identifier B2 is 480, the hash value of virtual identifier B3 is 800, and the hash value of virtual identifier B4 is 850.

[0099] If the hash value corresponding to the object identifier is 270, then the virtual identifier corresponding to the object identifier is B2, and the stateful service instance corresponding to the object identifier is stateful service instance B.

[0100] If a stateful service instance goes offline, all virtual identifiers corresponding to that stateful service instance are deleted. For example, if stateful service instance B goes offline, all virtual identifiers of stateful service instance B on the hash ring are deleted, as shown at 502 in Figure 5. At this time, if the hash value corresponding to the object identifier is 270, then the virtual identifier corresponding to the object identifier is A3, and the stateful service instance corresponding to the object identifier is stateful service instance A.
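The lookups in [0098] to [0100] can be reproduced with a sorted ring and a binary search. The sketch below uses the ring positions given for Figure 5, with position 850 taken to belong to virtual identifier B4; the dictionary layout and the function names are assumptions made for illustration.

```python
import bisect

# Ring position -> (virtual identifier, stateful service instance identifier).
RING = {
    0: ("A1", "A"), 90: ("A2", "A"), 505: ("A3", "A"), 701: ("A4", "A"),
    251: ("B1", "B"), 480: ("B2", "B"), 800: ("B3", "B"), 850: ("B4", "B"),
}


def lookup(ring: dict, object_hash: int) -> tuple:
    """Return the first virtual identifier at or clockwise after object_hash."""
    positions = sorted(ring)
    idx = bisect.bisect_left(positions, object_hash) % len(positions)  # wrap around
    return ring[positions[idx]]


def take_offline(ring: dict, instance: str) -> dict:
    """Delete every virtual identifier that belongs to the given instance."""
    return {pos: owner for pos, owner in ring.items() if owner[1] != instance}


print(lookup(RING, 270))                     # ('B2', 'B')
print(lookup(take_offline(RING, "B"), 270))  # ('A3', 'A')
```

Only the hash ranges previously served by B's virtual identifiers move to A; objects that already resolved to A1 through A4 keep their owner.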

[0101] As the number of virtual identifiers increases, stateful service instances can be more evenly distributed on the hash ring, thereby enabling data to be evenly distributed to each stateful service instance, thus achieving a better load balancing effect. In this embodiment, virtual identifiers are introduced to ensure dispersion while reducing the impact range when stateful service instances change.

[0102] Therefore, in this embodiment, after the first server obtains the stateful service instance, it first maps the identifier of the stateful service instance to multiple virtual identifiers, and then performs a hash operation on each virtual identifier to obtain the first hash value of each virtual identifier on the hash ring, which is also the first hash value of the stateful service instance on the hash ring. Then, the identifier (or virtual identifier) of the stateful service instance and the first hash value of the virtual identifier on the hash ring are stored in the local permission table. That is, the identifier of a stateful service instance has multiple first hash values on the hash ring.

[0103] It should be understood that since there are multiple servers in the cluster (the first server is one of the servers in the cluster), in order to keep the local permission tables of each server consistent, after the first server stores the virtual identifier and the first hash value of the virtual identifier on the hash ring in its local permission table, it can store the virtual identifier and the first hash value of the virtual identifier on the hash ring in the database. This allows the second server to retrieve the virtual identifier and the first hash value of the virtual identifier on the hash ring from the database and update the target local permission table in the second server. The second server is a server in the cluster other than the first server.

[0104] Furthermore, in this embodiment, the local permission table also includes a first permission expiration time for the stateful service instance, so that when the first server receives a service request, it can determine whether the stateful service instance has permission to process the service request based on the first permission expiration time.

[0105] At this point, a stateful service instance can be represented on the hash ring as shown in Figure 6, where {A,0,t1} indicates that the first hash value of stateful service instance A is 0 and its first permission expiration time is t1.

[0106] Since each stateful service instance has a corresponding first permission expiration time, if a stateful service instance needs to maintain its permissions on the hash ring, its first permission expiration time needs to be updated.

[0107] Therefore, in this embodiment, a preset renewal period is set, and the first permission expiration time is updated according to the preset renewal period so as to extend it. Each time an update is performed, the first server can first retrieve, from its local storage space, the local permission table updated at the first renewal time. That is, the first renewal time is the time when the first server last updated the local permission table.

[0108] Furthermore, the time interval between the first permission expiration time and the first renewal time is greater than the preset renewal period, so that when the first permission expiration time is updated according to the preset renewal period, the permissions of the stateful service instance on the hash ring are still in a valid state.

[0109] However, if a stateful service instance goes offline before its first permission expiration time has elapsed, the instance still retains its permissions on the hash ring until that time. Therefore, an overly long first permission expiration time can negatively impact cluster availability.

[0110] If the first permission expiration time is too short, a brief network failure may cause stateful service instances to lose their permissions on the hash ring because the expiration time is not updated in time. Therefore, the first permission expiration time should be neither too long nor too short.

[0111] Therefore, in some embodiments, the time interval between the first permission expiration time and the first renewal time can be twice the preset renewal period. That is, each time a stateful service instance is renewed, the validity period of its permissions is twice the preset renewal period. Thus, even if the stateful service instance goes offline abnormally, the permissions it occupies on the hash ring are retained for at most twice the preset renewal period, and twice the preset renewal period is sufficient to ensure normal updates.

[0112] It should be noted that if a stateful service instance is not updated before the first permission expiration time, then the first permission expiration time of the stateful service instance will no longer be allowed to be updated, that is, the stateful service instance will no longer be allowed to be renewed.

[0113] For example, as shown in Figure 7, the first renewal time is t0, which is the Nth renewal time, and T is the preset renewal period. The second renewal time is (t0+T), which is the (N+1)th renewal time; with a validity period of twice the preset renewal period, the first permission expiration time is then (t0+3T). If the (N+2)th renewal of the local permission table is attempted after (t0+3T), it is invalid because the stateful service instance has already lost its permissions on the hash ring.
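The renewal rule in the preceding paragraphs can be sketched as follows, assuming the embodiment in which each renewal extends the permission to twice the preset renewal period. The `Lease` class, the `renew` function, and the concrete values of t0 and T are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Lease:
    instance: str
    expires_at: float  # first permission expiration time

def renew(lease, now, period):
    """Extend the lease to now + 2 * period; refuse if it has already expired."""
    if now > lease.expires_at:
        return False  # permission lost; renewal is no longer allowed
    lease.expires_at = now + 2 * period
    return True

T = 10.0    # preset renewal period (assumed value)
t0 = 100.0  # Nth renewal time (assumed value)
lease = Lease("A", expires_at=t0 + 2 * T)  # state after the Nth renewal at t0
ok_n1 = renew(lease, t0 + T, T)            # (N+1)th renewal, on time
late = renew(lease, t0 + 3 * T + 1, T)     # (N+2)th renewal, after t0 + 3T
print(ok_n1, lease.expires_at, late)       # True 130.0 False
```

The late renewal is rejected and the expiration time stays at t0 + 3T, matching the case in which the instance has lost its permissions on the hash ring.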

[0114] S202. At the second renewal time, retrieve updated data from the database and, based on the updated data, perform a first update on the first permission expiration time and the first hash value to obtain the local permission table after the first update. The first time interval between the second renewal time and the first renewal time is the preset renewal period.

[0115] After obtaining the local permission table updated at the first renewal time, when the time reaches the second renewal time of the first deployed stateful service instance, the first server updates the first permission expiration time of the first deployed stateful service instance. The first deployed stateful service instance refers to the stateful service instance in the local permission table that is deployed on the first server.

[0116] However, the second server also updates the first permission expiration time of the second deployed stateful service instance and stores the updated first permission expiration time in the database. The second deployed stateful service instance refers to the stateful service instance in the local permission table that is deployed on the second server.

[0117] In other words, if the expiration time of the first permission for the second deployed stateful service instance also changes, then the expiration time of the first permission for the second deployed stateful service instance in the local permission table needs to be updated.

[0118] Furthermore, between the first renewal time and the second renewal time, a new stateful service instance may have been launched, in which case the second hash value of the new stateful service instance is also stored in the database.

[0119] Therefore, at the second renewal time, the first server can obtain updated data from the database. The updated data includes the updated first permission expiration time of the second deployed stateful service instance, and/or the second hash value of the newly added stateful service instance.

[0120] Then, based on the updated data, the first permission expiration time and the first hash value of the stateful service instance in the local permission table are updated.

[0121] Optionally, if the updated data includes the second hash value of the newly added stateful service instance, the first permission expiration time in the local permission table can be modified based on the updated data to obtain the first permission expiration time of the stateful service instance as of the second renewal time, and the second hash value can be inserted into the local permission table to update the permissions associated with the first hash value, thereby updating both the first permission expiration time and the first hash value.

[0122] If the updated data does not include the second hash value of the newly added stateful service instance, then based on the updated data, only the expiration time of the first permission in the local permission table can be modified. That is, in this case, based on the updated data, only the expiration time of the first permission is updated.
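The two cases of the first update can be sketched as a single merge step. The table layout (a dictionary keyed by instance identifier) and the shape of the update data (`expirations` and `new_hashes` keys) are assumptions made for this illustration; the source does not specify a storage format.

```python
import copy

def first_update(local_table, update_data):
    """Return a copy of the local permission table with the update data applied."""
    updated = copy.deepcopy(local_table)
    # Refresh expiration times that other servers renewed and stored in the database.
    for instance, expires_at in update_data.get("expirations", {}).items():
        if instance in updated:
            updated[instance]["expires_at"] = expires_at
    # Insert hash values of newly added instances; they have no expiration time yet.
    for instance, hashes in update_data.get("new_hashes", {}).items():
        entry = updated.setdefault(instance, {"hashes": [], "expires_at": None})
        entry["hashes"] = sorted(set(entry["hashes"]) | set(hashes))
    return updated

local = {"A": {"hashes": [0, 505], "expires_at": 120},
         "B": {"hashes": [251, 480], "expires_at": 125}}
data = {"expirations": {"B": 135}, "new_hashes": {"C": [270, 600]}}
result = first_update(local, data)
print(result["B"]["expires_at"], result["C"])
```

When `new_hashes` is absent or empty, only the expiration times change, which corresponds to the case in which the updated data contains no second hash value.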

[0123] It should be noted that the local permissions table may include the first hash value and the first permission expiration time of all stateful service instances in the cluster where the first server is located on the hash ring. Then, when the time reaches the second renewal time of the first deployed stateful service instance, the first server retrieves the updated data from the database.

[0124] For example, the cluster includes stateful service instance A and stateful service instance B. That is, the local permission table includes the first permission expiration time and the first hash value on the hash ring of stateful service instance A, as well as the first permission expiration time and the first hash value on the hash ring of stateful service instance B.

[0125] Stateful service instance A is the first deployed stateful service instance, and stateful service instance B is the second deployed stateful service instance. The first renewal time for stateful service instance A is t0, and the first renewal time for stateful service instance B is (t0+e). The second server will store the updated target local permission table in the database at (t0+e).

[0126] The first server retrieves updated data from the database at (t0+T) to update its local permission table, and stores the updated local permission table in the database at (t0+T). The updated data includes the target local permission table updated by the second server at (t0+e).

[0127] The second server retrieves the target update data from the database at (t0+e+T). The target update data includes the local permission table updated by the first server at (t0+T), so as to update the target local permission table in the second server according to the updated permission table of the first server.

[0128] When there are a large number of stateful service instances in the cluster, conflicts can easily arise when each server in the cluster retrieves updated data from the database to update the expiration time of the first permission of the stateful service instances and stores the updated data in the database, resulting in renewal failure and affecting the availability of the cluster.

[0129] Therefore, in some embodiments, during the second renewal time, retrieving updated data from the database includes:

[0130] Divide the preset renewal period to obtain a time shard set, which includes a first number of time shards;

[0131] Select the target time shard corresponding to each stateful service instance from the time shard set;

[0132] At the second renewal time within the target time shard corresponding to the stateful service instance, retrieve updated data from the database. The second renewal time and the first renewal time are in different preset renewal periods.

[0133] For example, both the first and second preset renewal periods are divided to obtain time shards corresponding to the first and second preset renewal periods. Time shard 1 is the time shard corresponding to the stateful service instance A. That is, the first renewal time is in time shard 1 of the first preset renewal period, and the second renewal time is in time shard 1 of the second preset renewal period. The interval between the first and second renewal times is the preset renewal period.

[0134] The length of each time shard is sufficient for the server to complete the renewal of stateful service instances, that is, sufficient for the server to complete the update of the first permission expiration time of stateful service instances. Furthermore, typically within a preset renewal cycle, after a second number of time shards, the renewal of all stateful service instances in the cluster can be completed. Therefore, at this point, the first number can be equal to the second number, where the second number is the number of stateful service instances in the cluster.

[0135] In this embodiment, a time shard is allocated to each stateful service instance, so that the first server retrieves updated data from the database during the second renewal time in the time shard corresponding to the stateful service instance, updates the first permission expiration time of the stateful service instance according to the updated data, and finally stores the updated first permission expiration time in the database. This ensures that there are no conflicts when the servers in the cluster access the database, thereby guaranteeing the normal maintenance of permissions.
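A minimal sketch of the division and allocation described above, for the case in which the first number equals the number of stateful service instances. Equal-width shards and in-order assignment are assumptions, as are the function names and the 18-second period used in the usage lines.

```python
def split_period(period_start, period, count):
    """Divide one renewal period into `count` equal, half-open time shards."""
    width = period / count
    return [(period_start + i * width, period_start + (i + 1) * width)
            for i in range(count)]

def assign_shards(shards, instances):
    """Give each instance its own shard, in order (assumed assignment rule)."""
    if len(instances) > len(shards):
        raise ValueError("need at least one shard per instance")
    return dict(zip(instances, shards))

shards = split_period(0.0, 18.0, 2)
targets = assign_shards(shards, ["A", "B"])
print(shards)   # [(0.0, 9.0), (9.0, 18.0)]
print(targets)  # {'A': (0.0, 9.0), 'B': (9.0, 18.0)}
```

Because each server reads from and writes to the database only inside its instance's shard, two servers do not access the database in the same window.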

[0136] It should be understood that any server in the cluster can divide the preset renewal period to obtain a set of time shards. The set of time shards includes a first number of time shards. The target time shard corresponding to each stateful service instance is selected from the set of time shards, and then the target time shard corresponding to each stateful service instance is sent to the database so that the server where the stateful service instance is located can retrieve it from the database.

[0137] For example, the first server can divide the preset renewal period to obtain a time shard set, which includes a first number of time shards. The target time shard corresponding to each stateful service instance can be selected from the time shard set.

[0138] It should be noted that because the local permission table is updated based on a preset renewal period, when the first renewal time of a stateful service instance within the first preset renewal period is determined, subsequent renewal times (such as the second or third renewal time) are also determined. Therefore, if the second renewal time is the first renewal time within the first preset renewal period (in this case, the local permission table updated at the first renewal time can refer to the local permission table retrieved by the first server from the database at the first renewal time), the first server can perform the following steps: dividing the preset renewal period to obtain a set of time shards, selecting the target time shard corresponding to each stateful service instance from the set of time shards, and then retrieving the updated data from the database at the second renewal time of the target time shard corresponding to the stateful service instance.

[0139] If the second renewal time is not within the first preset renewal period, the first server can directly execute the step of retrieving updated data from the database at the second renewal time.

[0140] However, when factors such as network jitter occur, a congestion situation may occur. In this case, some stateful service instances may fail to renew due to database access conflicts, making it impossible to renew all stateful service instances in the cluster after a second number of time shards.

[0141] Therefore, in some embodiments, the first number is greater than the second number of stateful service instances in the local permissions table. Accordingly, the target time shard corresponding to each stateful service instance is selected from the time shard set, including:

[0142] Select the first time shards of the time shard set, up to the second number, to obtain the target time shards;

[0143] Assign a target time shard to each stateful service instance to obtain the target time shard corresponding to the stateful service instance;

[0144] Based on the updated data, the first permission expiration time and the first hash value are updated to obtain the local permission table after the first update, including:

[0145] Based on the updated data, the first permission expiration time and the first hash value are updated for the first time;

[0146] If the first update fails, retry time slices are selected from the remaining time slices. The remaining time slices are the time slices in the time slice set other than the target time slice.

[0147] In the retry time sharding, based on the updated data, the first permission expiration time and the first hash value are updated to obtain the local permission table after the first update.

[0148] In this embodiment, the first quantity is greater than the second quantity, so that when the first update of the first permission expiration time and the first hash value based on the updated data fails, a retry time shard can be selected from the remaining time shards. The remaining time shards are the time shards in the time shard set other than the target time shard. Then, in the retry time shard, the first permission expiration time and the first hash value are updated again based on the updated data to obtain the local permission table after the first update. This allows stateful service instances to be renewed again when renewal fails due to database access conflicts, thereby maintaining the availability of the cluster.

[0149] For example, the preset renewal period is 18 seconds, the first quantity is 3, the cluster includes stateful service instance A and stateful service instance B, stateful service instance A is on the first server, and stateful service instance B is on the second server, that is, the second quantity is 2.

[0150] In the first preset renewal period, that is, seconds 1-18, the first time slice is seconds 1-6, the second time slice is seconds 7-12, and the third time slice is seconds 13-18. That is, the first two time slices (the target time slices, two being the second quantity) are seconds 1-6 and seconds 7-12, respectively.

[0151] The time slice corresponding to seconds 1-6 is the target time slice for stateful service instance A. Second 2.5 is the first renewal time for stateful service instance A. That is, the first server retrieves updated data from the database at second 2.5 and updates the first permission expiration time in the local permission table according to the updated data, thus obtaining the first permission expiration time for stateful service instance A at second 2.5 (that is, obtaining the local permission table updated at the first renewal time), and stores the first permission expiration time for stateful service instance A at second 2.5 in the database (i.e., writes it to the database).

[0152] The time slice corresponding to seconds 7-12 is the target time slice for stateful service instance B. Second 8.5 is the first renewal time for stateful service instance B. That is, the second server retrieves the target update data from the database at second 8.5, updates the first permission expiration time of stateful service instance B according to the target update data, obtains the first permission expiration time of stateful service instance B at second 8.5, and stores the first permission expiration time of stateful service instance B at second 8.5 in the database (i.e., writes it to the database).

[0153] If the first server fails to renew the local permissions table at second 2.5, since only one time slice remains, seconds 13-18 can be used as the retry time slice.

[0154] In the second preset renewal period, that is, in the period from 19 to 36 seconds, the first time slice is from 19 to 24 seconds, the second time slice is from 25 to 30 seconds, and the third time slice is from 31 to 36 seconds. That is, the first two time slices (target time slices) are from 19 to 24 seconds and from 25 to 30 seconds, respectively.

[0155] The time slice corresponding to seconds 19-24 is the target time slice for stateful service instance A, and second 20.5 is the second renewal time for stateful service instance A. Similarly, the time slice corresponding to seconds 25-30 is the target time slice for stateful service instance B, and second 26.5 is the second renewal time for stateful service instance B.

[0156] If the first server fails to renew the local permissions table at 20.5 seconds, since only one time slice remains, seconds 31-36 can be used as the retry time slice.
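The worked example above (an 18-second period, three time slices, instances A and B) can be traced with a short sketch. The only slice not assigned as a target is the third one, so it serves as the retry slice in each period. The function names, the 1-based slice numbering, and the simulated conflict are assumptions made for illustration.

```python
def plan_shards(shard_count, instances):
    """Split 1-based shard indices into per-instance targets and the remaining retry pool."""
    targets = {inst: i + 1 for i, inst in enumerate(instances)}
    remaining = [s for s in range(1, shard_count + 1) if s not in targets.values()]
    return targets, remaining

def shard_window(shard, period_index, period=18, shard_count=3):
    """Return the (first second, last second) of a shard, numbered as in the example."""
    width = period // shard_count
    start = period_index * period + (shard - 1) * width + 1
    return start, start + width - 1

def renew_with_retry(instance, targets, remaining, attempt):
    """Try the target shard first, then each remaining shard; return the shard that succeeded."""
    for shard in [targets[instance]] + remaining:
        if attempt(shard):
            return shard
    return None

targets, remaining = plan_shards(3, ["A", "B"])
# Simulate a database conflict in A's target shard only.
used = renew_with_retry("A", targets, remaining, attempt=lambda shard: shard != 1)
print(targets, remaining)     # {'A': 1, 'B': 2} [3]
print(shard_window(used, 0))  # (13, 18)
print(shard_window(used, 1))  # (31, 36)
```

Keeping the first number larger than the number of instances is what leaves a free slice for the retry, so the retry does not collide with another instance's target slice.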

[0157] It should be noted that the failure to update the first permission expiration time and the first hash value based on the updated data may be due to the database being accessed by a second server, preventing the first server from obtaining the updated data (i.e., a database read operation), thus causing the first update to the first permission expiration time and the first hash value to fail.

[0158] Alternatively, the failure to update the first permission expiration time and the first hash value based on the updated data could be due to the database being accessed by a second server, preventing the first server from storing the updated first permission expiration time and the updated first hash value in the database (i.e., a database write operation), thus causing the first update to the first permission expiration time and the first hash value to fail.

[0159] If the first update of the first permission expiration time and the first hash value fails based on the updated data, then retry time slices are selected from the remaining time slices, including:

[0160] If the first update to the first permission expiration time and the first hash value fails based on the updated data, then the target number of update failures is determined.

[0161] If the target number of attempts is less than or equal to the preset number of attempts, then the retry time shards of the stateful service instance are selected from the remaining time shards based on the target time shards and the target number of attempts corresponding to the stateful service instance.

[0162] Among them, the target time shard and target number of times corresponding to the stateful service instance satisfy the following relationship:

[0163] $$w = \sum_{i=0}^{n-1} \frac{M}{2^{i}} + \frac{k}{2^{n}}$$

[0164] w represents the retry time slice, n represents the target number of times, k represents the target time slice, and M represents the second quantity.

[0165] For example, as shown in Figure 8, M represents the second quantity. The target time shard corresponding to stateful service instance A is the k-th time shard in the preset renewal period. If the first server fails to update the local permission table in the k-th time shard, the (M+k/2)-th time shard is selected from the remaining time shards as the retry time shard. If the update fails again, the (M+M/2+k/4)-th time shard is selected for the retry, and so on, until the target number of times is greater than the preset number of times.
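The two retry positions given in the example, M + k/2 and M + M/2 + k/4, suggest the recurrence w(n) = M + w(n-1)/2 with w(0) = k. The sketch below computes positions under that assumed general pattern. The function name `retry_position` and the sample values of M and k are hypothetical, and the source does not say how a non-integer position is rounded to a shard index, so exact fractions are returned.

```python
from fractions import Fraction

def retry_position(n, k, m):
    """Position after the n-th failure: M + M/2 + ... + M/2**(n-1) + k/2**n."""
    total = sum(Fraction(m, 2 ** i) for i in range(n))
    return total + Fraction(k, 2 ** n)

M, k = 4, 2  # assumed: 4 instances, target shard is the 2nd one
print(retry_position(1, k, M))  # 5     = M + k/2
print(retry_position(2, k, M))  # 13/2  = M + M/2 + k/4
```

Each retry position lies beyond the M target shards, so under this reading a retry never lands in another instance's target shard.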

[0166] If the target number of attempts exceeds the preset number of attempts, the first server can continue to attempt the first update of the first permission expiration time and the first hash value in the next preset renewal cycle, until the first permission expiration time is reached. If the update has still not succeeded by then, the stateful service instance loses its permissions on the hash ring.

[0167] The preset number of attempts can be set according to actual needs; for example, the preset number of attempts can be set to 5. This embodiment does not impose any limitations on this.

[0168] In some embodiments, a notification that the newly added stateful service instance has come online can also be sent to the first server. Upon receiving the notification, the first server retrieves updated data from the database, including the second hash value of the newly added stateful service instance, and then stores this second hash value in its local permission table. In this case, the first server does not need to wait until the second renewal time to determine that the permissions of the stateful service instance on the hash ring have been granted to the newly added stateful service instance, thereby accelerating permission convergence during deployment and reducing the permission convergence time.

[0169] S203. If the local permission table after the first update includes the second hash value of the newly added stateful service instance on the hash ring, then obtain the target update time of the newly added stateful service instance.

[0170] Newly added stateful service instances refer to stateful service instances that were not included in the local permission table updated at the first renewal time; that is, stateful service instances newly added to the cluster. Newly added stateful service instances and stateful service instances in the local permission table updated at the first renewal time can be of the same type.

[0171] For example, if the local permission table updated during the first renewal period includes stateful service instance A, which is used to implement game player login, then the newly added stateful service instance can also be used to implement game player login.

[0172] Cluster scaling up and down can be viewed as the process of bringing stateful service instances online and offline, while stateful service instance failures can be viewed as a special kind of offline process. Therefore, when the logic of bringing stateful service instances online and offline is handled properly, the availability of the cluster can be guaranteed.

[0173] If the cluster includes only the first and second servers, and a stateful service instance is deployed on a third server, then adding the third server to the cluster is equivalent to scaling up the cluster. If a stateful service instance is deployed on the first or second server, then taking that server offline is equivalent to scaling down the cluster.

[0174] When a new stateful service instance is launched, it is equivalent to inserting several virtual identifiers of the new stateful service instance into the hash ring. These virtual identifiers will preempt the permissions of the next virtual identifier clockwise on the hash ring.

[0175] Since updating the expiration time of the first permission according to the preset renewal period is a way for stateful service instances to maintain their permissions on the hash ring, it means that the first server's update of the expiration time of the first permission and the first hash value of the second deployed stateful service instance is delayed. That is, the first server needs to wait until the next renewal time of the first deployed stateful service instance to discover the change in the expiration time of the first permission and the first hash value of the second deployed stateful service instance. In abnormal cases, it may even take more than one preset renewal period to discover, which may lead to the phenomenon that the permissions on the hash ring are simultaneously possessed by multiple stateful service instances.

[0176] Likewise, the second server's update of the first permission expiration time and the first hash value of the first deployed stateful service instance is delayed. That is, the second server needs to wait until the next renewal time of the second deployed stateful service instance to discover the change in the first permission expiration time and the first hash value of the first deployed stateful service instance. In abnormal cases, it may take more than one preset renewal period to discover the change, which may lead to permissions on the hash ring being held simultaneously by multiple stateful service instances.

[0177] For example, the local permission table includes the first permission expiration time and the first hash value on the hash ring for stateful service instance A. Stateful service instance A is deployed on the first server, and stateful service instance A corresponds to virtual identifiers A1 and A2. The first hash value of stateful service instance A can be as shown at 901 in Figure 9. At this point, the permissions for hash value 270 on the hash ring belong to stateful service instance A.

[0178] If the second server acquires the newly added stateful service instance B at time t0 (that is, the newly added stateful service instance B is the second deployed stateful service instance), the newly added stateful service instance B corresponds to virtual identifiers B1 and B2, and the second hash value of the newly added stateful service instance B can be as shown at 902 in Figure 9. After the second server obtains the newly added stateful service instance B and updates its own target local permission table, the permission of hash value 270 on the hash ring in the updated target local permission table belongs to the newly added stateful service instance B.

[0179] However, since the first server only updates its local permission table at time (t0-e+T), before that time the permissions of hash value 270 on the hash ring, as seen by the first server, still belong to stateful service instance A rather than the newly added stateful service instance B. As a result, the permissions for hash value 270 on the hash ring are held simultaneously by stateful service instance A and the newly added stateful service instance B.

[0180] To avoid the aforementioned technical issues, if the newly added stateful service instance is deployed on the second server, when the second server obtains the newly added stateful service instance, it only determines the second hash value of the newly added stateful service instance on the hash ring. That is, although the second server updates its target local permission table after obtaining the newly added stateful service instance (i.e., the updated target local permission table includes the second hash value), the table does not yet include the second permission expiration time of the newly added stateful service instance. In other words, the permissions of the newly added stateful service instance on the hash ring are not yet effective; at this point, the permissions on the hash ring still belong to the existing stateful service instance. Then, the second server sends the second hash value to the database.

[0181] Then, at the second renewal time, the first server retrieves updated data from the database. The updated data includes the second hash value. At this time, although the local permission table after the first update includes the second hash value of the newly added stateful service instance on the hash ring, it does not yet include the second permission expiration time of the newly added stateful service instance, so the permissions of the newly added stateful service instance on the hash ring have not yet taken effect.

[0182] If the cluster also includes a fourth server, time can likewise be reserved for the fourth server to update, during which the permissions of the newly added stateful service instance on the hash ring are not effective on any server in the cluster, thus ensuring that the permissions on the hash ring are not simultaneously held by the existing stateful service instance and the newly added stateful service instance.
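The idea that a newly inserted hash value does not carry effective permissions until it has an expiration time can be sketched as a lookup that skips such entries. The table layout, the function name `effective_owner`, the hash values, and the choice to also skip expired entries are all assumptions made for illustration.

```python
import bisect

def effective_owner(table, object_hash, now):
    """Walk clockwise from object_hash and return the first entry whose permission is in effect.

    `table` maps hash value -> (instance, expiration time or None).
    An entry with no expiration time is on the ring but not yet effective.
    """
    points = sorted(table)
    start = bisect.bisect_left(points, object_hash)
    for step in range(len(points)):
        instance, expires_at = table[points[(start + step) % len(points)]]
        if expires_at is not None and now <= expires_at:
            return instance
    return None

# Existing instance A, plus a new instance B that has only published its hash value.
table = {0: ("A", 120), 505: ("A", 120), 300: ("B", None)}
before = effective_owner(table, 270, now=100)
table[300] = ("B", 130)  # B's first renewal sets its expiration time
after = effective_owner(table, 270, now=100)
print(before, after)  # A B
```

Until the expiration time is set, every server that has seen the new hash value still resolves the object to the existing instance, so the two instances do not hold the same permission at once.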

[0183] Since the local permission table after the first update has not yet included the second permission expiration time of the newly added stateful service instance, the first server obtains the target update time of the newly added stateful service instance in order to determine the second permission expiration time of the newly added stateful service instance based on the target update time. At this time, in the first server, the target update time can be the third renewal time in the first server, and the time interval between the third renewal time and the second renewal time is the preset renewal period.

[0184] For example, as shown in Figure 10, the newly added stateful service instance is stateful service instance A, and stateful service instance B is deployed on the first server. At time t0 (the time stateful service instance A goes online), the second server updates the target local permission table in the second server. That is, the updated target local permission table in the second server includes the second hash value. In other words, at time t0, stateful service instance A is written into the hash ring permissions for the first time, but its permission has not yet taken effect, and the second hash value is stored in the main permission table in the database.

[0185] At time (t0+T-e), the first server retrieves updated data from the database (time (t0+T-e) is the second renewal time). The updated data includes the second hash value. Based on the updated data, the local permission table in the first server is updated (that is, stateful service instance B is renewed), resulting in the first updated local permission table (in which the first permission expiration time of stateful service instance B is (t0+3T-e)). At this time, the first server discovers the existence of stateful service instance A and transfers the hash permissions of stateful service instance B to stateful service instance A.

[0186] At time (t0+T), the second server also retrieves the target update data from the database. The target update data includes the local table after the first update. Based on the target update data, the target local permission table obtained by the second server after the update at time t0 is updated (that is, the stateful service instance A is renewed for the first time), resulting in the target local permission table at time (t0+T). The target local permission table at time (t0+T) includes the second permission expiration time of stateful service instance A. That is, the permissions of stateful service instance A on the hash ring begin to take effect, and the second permission expiration time of stateful service instance A is sent to the database for storage.

[0187] At this point, the permissions on the hash ring belong to stateful service instance B before time (t0+T-e) and to stateful service instance A after time (t0+T).

[0188] The first server then determines the time (t0+2T-e) ((t0+2T-e) is the target update time) to retrieve updated data from the database again at (t0+2T-e). The updated data includes the second permission expiration time of stateful service instance A, so that when the local table after the first update is updated according to the updated data, and the second updated local permission table is obtained, the second updated local permission table includes the second permission expiration time of stateful service instance A.

[0189] If a newly added stateful service instance is deployed on the first server, the first server will also update its local permission table when it receives the new stateful service instance. That is, the local permission table on the first server will be updated at the time the new stateful service instance goes online. Therefore, after obtaining the local permission table updated at the first renewal time, if the first server receives the newly added stateful service instance, it can update the local permission table based on the newly added stateful service instance, thus obtaining the updated local permission table.

[0190] Then, at the second renewal time, updated data is retrieved from the database. Based on the updated data, a first update is performed on the updated local permission table to obtain the first updated local permission table. That is, at this time, both the updated local permission table and the first updated local permission table include the second hash value of the newly added stateful service instance.

[0191] At this point, the target update time for newly added stateful service instances is determined based on the instance's launch time and preset renewal period. The launch time and preset renewal period can be added together to obtain the target update time.

[0192] For example, if the launch time of a newly added stateful service instance is t0 and the preset renewal period is T, then the target update time of the newly added stateful service instance is (t0+T).

[0193] However, in practical applications, the first server may fail to update the permission table at the second renewal time. For example, the stateful service instance might be blocked for an extended period, or the first server might be unable to access the database, preventing it from updating its local permission table at the second renewal time. If the target update time is still determined based on the online time of the newly added stateful service instance and the preset renewal period, the permissions on the hash ring will be simultaneously held by both the existing stateful service instance and the newly added stateful service instance: the newly added stateful service instance acquires the permissions on the hash ring, while the existing stateful service instance is late in relinquishing its permissions.

[0194] To further prevent the phenomenon that permissions on the hash ring may be simultaneously held by both stateful service instances and newly added stateful service instances, in some embodiments, if the local permission table after the first update includes the second hash value of the newly added stateful service instance on the hash ring, then the target update time of the newly added stateful service instance is obtained, including:

[0195] If the local permission table after the first update includes the second hash value of the newly added stateful service instance on the hash ring, then obtain the preset time and the online time of the newly added stateful service instance.

[0196] The target renewal period is determined based on the preset time and preset renewal period;

[0197] Determine the target update time for newly added stateful service instances based on their launch time and target renewal period.

[0198] In this embodiment, a preset time is set, and then the first server adds the preset time and the preset renewal period to obtain the target renewal period. Finally, the online time and the target renewal period are added to obtain the target update time of the newly added stateful service instance. The target update time is after the first permission expiration time of the stateful service instance in the local permission table.

[0199] If the first server cannot renew the stateful service instance normally before the first permission expiration time, the stateful service instance loses its permissions on the hash ring after the first permission expiration time. Because the target update time is after the first permission expiration time, the permissions of the newly added stateful service instance on the hash ring take effect only after the permissions of the existing stateful service instance have expired. This further avoids the phenomenon that the permissions on the hash ring are simultaneously held by the stateful service instance and the newly added stateful service instance.

[0200] Furthermore, the time window for granting the newly added stateful service instance the permission to write to the hash ring for the first time can be determined based on the preset renewal period and preset time. Specifically, the time window for granting the newly added stateful service instance the permission to write to the hash ring for the first time can be obtained by adding twice the preset renewal period to the preset time.

[0201] For example, as shown in Figure 11, the preset time can be set equal to one preset renewal period T. The target renewal period is then twice the preset renewal period (2T), and the time window for granting stateful service instance A the permission to write to the hash ring for the first time is 3T. At this time, even if the first server cannot update the local permission table at the second renewal time (t0+2T-e), the permission of stateful service instance B on the hash ring will become invalid after (t0+2T-e). Furthermore, after the second server updates the target permission table at the target update time (t0+2T), the permission of stateful service instance A on the hash ring will become effective. This further avoids the phenomenon that the permission on the hash ring is simultaneously held by stateful service instance A and stateful service instance B.
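The arithmetic described in paragraphs [0191] to [0201] can be illustrated with a minimal Python sketch. The function names, the numeric values of T, t0 and e, and the use of plain floating-point timestamps are assumptions made for illustration only; they do not appear in the application.

```python
def target_renewal_period(preset_renewal_period: float, preset_time: float) -> float:
    """Target renewal period = preset time + preset renewal period ([0198])."""
    return preset_time + preset_renewal_period


def target_update_time(online_time: float, preset_renewal_period: float,
                       preset_time: float = 0.0) -> float:
    """Target update time = online time + target renewal period.

    With preset_time == 0 this reduces to the simple case of [0191]/[0192].
    """
    return online_time + target_renewal_period(preset_renewal_period, preset_time)


def first_grant_window(preset_renewal_period: float, preset_time: float) -> float:
    """Window for the first write grant = 2 * preset renewal period + preset time ([0200])."""
    return 2 * preset_renewal_period + preset_time


# Illustrative values only (assumed): T = 10, t0 = 100, e = 1.
T, t0, e = 10.0, 100.0, 1.0

baseline = target_update_time(t0, T)                 # t0 + T
guarded = target_update_time(t0, T, preset_time=T)   # t0 + 2T
window = first_grant_window(T, preset_time=T)        # 3T

# In the Figure 11 example, instance B's permission lapses at (t0 + 2T - e),
# which is before the guarded target update time (t0 + 2T).
b_lapse = t0 + 2 * T - e
print(baseline, guarded, window, guarded > b_lapse)
```

With the preset time equal to T, the guarded target update time falls after the moment at which instance B's permission lapses, which is the non-overlap property the embodiment relies on.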

[0202] It should be noted that if the newly added stateful service instance is deployed on the second server, the second server can also determine the target update time using the above method, which will not be elaborated here.

[0203] Just as new stateful service instances may be added to the cluster, stateful service instances in the local permission table may be taken offline. When a stateful service instance goes offline, its permissions on the hash ring are passed on to the next stateful service instance in the clockwise direction on the hash ring.

[0204] Since each stateful service instance has a first permission expiration time, if the stateful service instance is not updated according to the preset renewal period, the offline stateful service instance will drop its permissions on the hash ring after the first permission expiration time. The server where the online stateful service instance is located in the local permission table will only know that the offline stateful service instance has gone offline after the first permission expiration time.

[0205] For example, as shown in Figure 12, the time interval between the first permission expiration time and the first renewal time is twice the preset renewal period. If the stateful service instance A in the second server is an offline stateful service instance and goes offline at time t0, then the first server will only know that the permissions of the stateful service instance A on the hash ring have expired at (t0-e+3T).

[0206] If the first server receives a service request for stateful service instance B after t0 and before (t0-e+3T), the permissions of stateful service instance A have not yet expired, so the first server fails to verify the permissions of stateful service instance B. The request therefore cannot be processed through stateful service instance B, which affects the availability of the cluster.

[0207] Therefore, in some other embodiments, before retrieving updated data from the database at the second renewal time, the process further includes:

[0208] Receive offline notification;

[0209] Based on the offline notification, filter out the offline stateful service instances from the local permission table;

[0210] Delete the first hash value and the first permission expiration time of the offline stateful service instance in the local permission table to obtain the third updated local permission table.

[0211] In this embodiment, if a stateful service instance goes offline, all servers in the cluster are notified, so that every server updates its own permission table based on the offline notification. Specifically, the servers in the cluster update their own permission tables not only according to the preset renewal cycle but also upon receiving the offline notification. This accelerates permission convergence during cluster scaling down, reduces permission convergence time, enables the provision of lossy services, reduces cluster downtime, and thus ensures cluster availability.

[0212] For example, if the second server detects that stateful service instance A, one of its own instances, has gone offline, it generates an offline notification. Based on the offline notification, it deletes the first hash value and first permission expiration time of stateful service instance A from the target permission table in the second server and sends the offline notification to the first server. Upon receiving the offline notification, the first server filters out the offline stateful service instance from its local permission table and deletes the first hash value and first permission expiration time of that instance from the local permission table, resulting in the third updated local permission table.
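The handling of an offline notification in paragraphs [0208] to [0212] can be sketched as follows. This is only an illustrative sketch: the `PermissionEntry` type, the dictionary keyed by instance identifier, and the function name are hypothetical, and the notification is assumed to carry the identifiers of the offline instances.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class PermissionEntry:
    hash_value: int     # first hash value of the instance on the hash ring
    expires_at: float   # first permission expiration time


def apply_offline_notification(local_table: Dict[str, PermissionEntry],
                               offline_ids: Iterable[str]) -> Dict[str, PermissionEntry]:
    """Return a new table without the offline instances.

    Dropping an entry removes both its hash value and its expiration time,
    which corresponds to the "third updated local permission table".
    """
    offline = set(offline_ids)
    return {iid: entry for iid, entry in local_table.items() if iid not in offline}


# Illustrative values only (assumed).
local_table = {
    "A": PermissionEntry(hash_value=100, expires_at=130.0),
    "B": PermissionEntry(hash_value=200, expires_at=129.0),
}
third_updated = apply_offline_notification(local_table, ["A"])
print(sorted(third_updated))
```

Returning a new dictionary rather than mutating the existing one is a design choice of this sketch; it keeps the previous table available to requests that are still being verified against it.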

[0213] It should be understood that the offline notification may also be received after the second renewal time. In this case, the offline stateful service instances are filtered out from the first updated local permission table based on the offline notification. Even though the first updated local permission table includes the second hash value, the second hash value has not yet taken effect, so a newly added stateful service instance cannot be the one going offline. That is, the offline stateful service instances are existing stateful service instances.

[0214] Alternatively, the offline notification can be obtained after the second updated local table is retrieved. In this case, the offline stateful service instances are selected from the second updated local permission table based on the offline notification. Since the second updated local permission table includes not only stateful service instances, but also the second hash value of newly added stateful service instances, the offline stateful service instances can be either existing stateful service instances or newly added stateful service instances.

[0215] S204. Based on the target update time, perform a second update on the local permission table after the first update to obtain the local permission table after the second update. The local permission table after the second update includes the expiration time of the second permission for the newly added stateful service instance.

[0216] After determining the target update time, the first server retrieves the updated data from the database again when the target update time arrives. Then, based on the updated data, it performs a second update on the local permission table after the first update to obtain the second permission expiration time for the newly added stateful service instance.

[0217] For example, both stateful service instance B and stateful service instance A are used to implement game player login. Stateful service instance B is deployed on the first server, and the first server uses stateful service instance B to implement game player AH's login. Stateful service instance A comes online and is deployed on the second server. When the target update time arrives, the updated data is retrieved from the database again, and then the local permission table after the first update is updated according to the updated data. After obtaining the second permission expiration time of stateful service instance A, stateful service instance A has the permission to implement game player DH's login. That is, the permission of stateful service instance B to log in to game player DH is transferred to stateful service instance A.

[0218] It should be noted that, because the stateful service instance is an instance of a stateful service, the first server caches the data of the service requests that the stateful service instance has already responded to. Therefore, when the stateful service instance delegates its permissions on the hash ring to a newly added stateful service instance, that is, when it delegates some of the processing permissions of the first service request to the newly added stateful service instance, the server where the newly added stateful service instance resides can obtain the data of the first service request that the stateful service instance has already responded to from the first server through a preset interface and cache it. This allows the server to respond to new first service requests in the future by using the newly added stateful service instance based on the data of the first service request that has already been responded to.

[0219] Alternatively, when both the newly added stateful service instance and the stateful service instance are on the first server, the first server can retrieve the data of the first service request that has been responded to from the cache corresponding to the stateful service instance through a preset interface, and store the data of the first service request that has been responded to into the cache corresponding to the newly added stateful service instance.

[0220] In this embodiment, when the local permission table after the first update includes the second hash value of the newly added stateful service instance on the hash ring, it indicates the existence of a newly launched stateful service instance. This means that some permissions of the existing stateful service instance on the hash ring need to be transferred to the newly added stateful service instance. Because the existence of the newly added stateful service instance can only be known after the second renewal time, an update based on the updated data is required. Therefore, by performing a second update on the local permission table after the first update according to the target update time, the second expiration time of the newly added stateful service instance's permissions is obtained. This avoids the situation where these permissions are simultaneously held by both the existing stateful service instance and the newly added stateful service instance, thereby ensuring the availability of both the existing and newly added stateful service instances. This allows the cluster to provide lossy service during scaling, reducing the cluster's downtime.

[0221] In some embodiments, a routing table is set up on the service caller (i.e., the service caller is the source, or the terminal). The routing table stores the first hash value of the stateful service instance. When a new stateful service instance is added, the routing table is updated to store the second hash value of the new stateful service instance on the routing table. When a stateful service instance is taken offline, the routing table is updated to delete the first hash value of the offline stateful service instance from the routing table.

[0222] When a service caller receives a user instruction and generates a service request based on the user instruction, it first performs a hash operation on the object identifier in the service request to obtain a fifth hash value. Then, based on the fifth hash value and the routing table, it selects from the routing table the first execution stateful service instance to handle the service request. This allows the service caller to ensure correctness while also taking into account the load pressure of the stateful service instances, thereby keeping the processing load of each stateful service instance within a healthy range.
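The caller-side selection described in paragraphs [0221] and [0222] can be sketched as a lookup on a hash ring. The application does not specify the hash function or the lookup rule; the sketch below assumes MD5 truncated to 32 bits and assumes that a request is served by the first instance found clockwise from the request's hash value, wrapping around at the end of the ring. All names are hypothetical.

```python
import bisect
import hashlib
from typing import Dict


def ring_hash(key: str) -> int:
    """Map a key to a position on a 32-bit ring (hash function is an assumption)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def select_by_hash(routing_table: Dict[str, int], request_hash: int) -> str:
    """Pick the first instance clockwise from request_hash, wrapping around."""
    points = sorted((h, iid) for iid, h in routing_table.items())
    hashes = [h for h, _ in points]
    idx = bisect.bisect_left(hashes, request_hash) % len(points)
    return points[idx][1]


def select_instance(routing_table: Dict[str, int], object_id: str) -> str:
    """Hash the object identifier, then look the result up on the ring."""
    return select_by_hash(routing_table, ring_hash(object_id))


# Illustrative routing table (assumed values): instance id -> hash value.
routing_table = {"A": 100, "B": 200}
print(select_by_hash(routing_table, 150))   # falls between A and B, so B serves it
print(select_by_hash(routing_table, 250))   # past the last point, wraps to A
```

Adding or removing an entry in `routing_table` corresponds to the routing table update performed when an instance comes online or goes offline.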

[0223] However, in a distributed network environment, it cannot be guaranteed that the routing tables on each service caller are consistent. Therefore, when the first server receives a service request for the first execution stateful service instance, it performs permission verification on the first execution stateful service instance to ensure consistency. For example, when the cluster is being expanded, some routing tables may not have been updated due to faults, and service requests may still be sent to stateful service instance A. However, at this time, the stateful service instance handling these service requests in the cluster is stateful service instance B. If permission verification is not performed, stateful service instance A and stateful service instance B will simultaneously handle these service requests, leading to errors.

[0224] Therefore, in some other embodiments, after performing a second update on the first updated local permission table based on the target update time to obtain a second updated local permission table, the second updated local permission table includes the second permission expiration time of the newly added stateful service instance, and further includes:

[0225] Obtain the service request for the first execution stateful service instance, and perform a hash operation on the object identifier in the service request to obtain the third hash value corresponding to the service request;

[0226] Based on the third hash value and the fourth hash value in the second updated local permission table, select the second execution stateful service instance that executes the service request from the second updated local permission table;

[0227] If the first executing stateful service instance and the second executing stateful service instance are the same, and the current time is before the first permission expiration time of the second executing stateful service instance, then the service request will be responded to through the second executing stateful service instance.

[0228] The hash value included in the second updated local permission table is the fourth hash value. The fourth hash value may include the target first hash value and the second hash value. The target first hash value may be the first hash value in the local permission table, or the target first hash value may be the first hash value in the first updated local permission table.
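The server-side check of paragraphs [0225] to [0227] can be sketched as follows. This is a sketch under stated assumptions, not the claimed implementation: the table layout, the clockwise lookup rule, and the convention that an entry whose expiration time is `None` has not yet taken effect are all hypothetical.

```python
import bisect
from typing import Dict, Optional, Tuple

# instance id -> (hash value on the ring, permission expiration time or None if not yet effective)
PermissionTable = Dict[str, Tuple[int, Optional[float]]]


def owner_of(table: PermissionTable, request_hash: int) -> Optional[str]:
    """Select the second execution instance from the effective entries only."""
    points = sorted((h, iid) for iid, (h, expiry) in table.items() if expiry is not None)
    if not points:
        return None
    hashes = [h for h, _ in points]
    idx = bisect.bisect_left(hashes, request_hash) % len(points)
    return points[idx][1]


def verify(table: PermissionTable, first_instance: str,
           request_hash: int, now: float) -> bool:
    """Accept only if the caller's choice matches and the permission is unexpired."""
    second_instance = owner_of(table, request_hash)
    if second_instance is None or second_instance != first_instance:
        return False
    expiry = table[second_instance][1]
    return expiry is not None and now < expiry


# Illustrative tables (assumed values).
effective = {"A": (100, 50.0), "B": (200, 50.0)}
pending = {"A": (100, None), "B": (200, 50.0)}   # A is known but not yet effective

print(verify(effective, "A", 80, now=10.0))   # caller and server agree on A
print(verify(effective, "B", 80, now=10.0))   # stale routing table on the caller
print(verify(pending, "B", 80, now=10.0))     # A not effective yet, B still owns the range
```

A request routed by a stale routing table is rejected rather than served by two instances, which is the consistency property discussed in paragraph [0223].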

[0229] In this embodiment, the service caller determines the first execution stateful service instance that responds to the service request through the routing table. On the server side, the first permission expiration time of the stateful service instance in the local permission table is used to determine whether the stateful service instance has permission to process the service request. This ensures that the first execution stateful service instance determined by the routing table and the second execution stateful service instance determined by the local permission table are consistent, so that data consistency can still be guaranteed during cluster expansion, contraction, and disaster recovery. This achieves the goal of ensuring the legality of data processing without broadcasting the running status of the stateful service instance.

[0230] Furthermore, it makes the framework transparent to the business layer, eliminating the need for business logic to concern itself with how to handle erroneous service requests, thus facilitating business logic implementation, making it easier to understand, and improving the framework's usability.

[0231] Furthermore, in this embodiment, the stateful service instance responding to the service request can be determined by the hash value of the object identifier in the service request, without the need to pre-define the service scenario for the stateful service instance (for example, in the ride-hailing service scenario, stateful service instances are divided according to geographical location), making it more universal.

[0232] It should be noted that the above steps can be executed whenever a service request is received, whether by the service caller or by the first server. It is not necessary to first perform the second update on the first updated local permission table based on the target update time and to execute the above steps only for service requests received after the second updated local permission table has been obtained.

[0233] For example, after updating the first permission expiration time and the first hash value to obtain the updated local permission table, if a service request is received, then after obtaining the service request for the first execution stateful service instance and performing a hash operation on the object identifier in the service request to obtain the third hash value corresponding to the service request, the second execution stateful service instance that executes the service request can be selected from the updated local permission table based on the third hash value and the first hash value in the updated local permission table.

[0234] At this point, even if the updated local permission table includes the second hash value of the newly added stateful service instance, since the second hash value has not yet taken effect, the second stateful service instance that executes the service request is selected from the updated local permission table based on the third hash value and the first hash value in the updated local permission table.

[0235] For example, after obtaining the local permission table updated with the first renewal time, if a service request is received, then after obtaining the service request for the first execution stateful service instance and performing a hash operation on the object identifier in the service request to obtain the third hash value corresponding to the service request, the second execution stateful service instance that executes the service request can be selected from the local permission table based on the third hash value and the first hash value in the local permission table.

[0236] For example, after obtaining the third updated local permission table and receiving a service request, after obtaining the service request for the first execution stateful service instance and performing a hash operation on the object identifier in the service request to obtain the third hash value corresponding to the service request, the second execution stateful service instance that executes the service request can be selected from the third updated local permission table based on the third hash value and the hash value in the third updated local permission table.

[0237] As described above, in this embodiment, a local permission table updated with a first renewal time is obtained. This local permission table includes the first hash value and first permission expiration time of each stateful service instance on the hash ring. The time interval between the first permission expiration time and the first renewal time is greater than a preset renewal period. Then, at the second renewal time, updated data is retrieved from the database, and based on this updated data, the first permission expiration time and the first hash value are updated to obtain a first-updated local permission table. The first time interval between the second renewal time and the first renewal time is the preset renewal period. If the first-updated local permission table includes the second hash value of a newly added stateful service instance on the hash ring, the target update time of the newly added stateful service instance is obtained. Finally, based on the target update time, the first-updated local permission table is updated a second time to obtain a second-updated local permission table, which includes the second permission expiration time of the newly added stateful service instance.

[0238] In this embodiment, since the first time interval between the first renewal time and the second renewal time is a preset renewal period, and the time interval between the first permission expiration time and the first renewal time is greater than the preset renewal period, updating the local permission table at the second renewal time can ensure that the permissions of each stateful service instance on the hash ring in the local permission table will not expire, thus ensuring the availability of each stateful service instance.

[0239] Furthermore, when the first updated local permission table includes the second hash value of the newly added stateful service instance on the hash ring, it indicates the existence of a newly launched stateful service instance. This means that some permissions of the existing stateful service instance on the hash ring need to be transferred to the newly added stateful service instance. Because the existence of the newly added stateful service instance can only be learned from the updated data retrieved at the second renewal time, a second update is performed on the first updated local permission table according to the target update time to obtain the second permission expiration time of the newly added stateful service instance. This avoids the situation where these permissions are simultaneously held by both the existing stateful service instance and the newly added stateful service instance, thus ensuring the availability of both.

[0240] Based on the methods described in the above embodiments, the following examples will provide further detailed explanations.

[0241] Figure 13 is a flowchart illustrating an instance processing method provided in an embodiment of this application. The instance processing method may include:

[0242] S1301. The first server obtains the preset renewal period and divides the preset renewal period to obtain a time shard set. The time shard set includes a first number of time shards, and the first number is greater than the second number of stateful service instances in the cluster where the first server is located.

[0243] S1302. The first server selects the second number of time shards from the time shard set to obtain the target time shards, and assigns one target time shard to each stateful service instance to obtain the target time shard corresponding to that stateful service instance.

[0244] S1303. In the first time segment corresponding to the first deployed stateful service instance, the first server retrieves the main permission table from the database, uses the main permission table as the local permission table, stores the first hash value of the first deployed stateful service instance on the hash ring in the local permission table, and sends the first hash value of the first deployed stateful service instance to the database for storage. The first deployed stateful service instance is the stateful service instance deployed on the first server in the cluster.

[0245] S1304. If sending the first hash value of the first deployed stateful service instance to the database for storage fails, the first server determines the target number of update failures.

[0246] If the first hash value of the first deployed stateful service instance is successfully sent to the database for storage in the target time shard, the first server will directly execute step S1307.

[0247] S1305. If the target number of times is less than or equal to the preset number of times, the first server will select the retry time shard of the stateful service instance from the remaining time shards according to the target time shard and the target number of times corresponding to the first deployed stateful service instance. The remaining time shards are the time shards in the time shard set other than the target time shard.

[0248] S1306. In the retry time shard, the first server sends the first hash value of the first deployed stateful service instance to the database for storage.

[0249] S1307. The first server obtains the preset time and determines the target renewal period based on the preset time and the preset renewal period.

[0250] S1308. The first server determines the write time of the first deployed stateful service instance, and determines the first renewal time of the first deployed stateful service instance based on the target renewal period and the write time.

[0251] If the first hash value of the first deployed stateful service instance is successfully sent to the database for storage within the target time shard, then the write time of the first deployed stateful service instance is the first time (that is, the write time is the online time of the first deployed stateful service instance). If the first hash value of the first deployed stateful service instance is successfully sent to the database for storage within the retry time shard, then the write time of the first deployed stateful service instance is the time when the first hash value of the first deployed stateful service instance is successfully sent to the database for storage.

[0252] S1309. The first server updates the local permission table according to the first renewal time of the first deployed stateful service instance, so that the local permission table for the first renewal time includes the first permission expiration time of the first deployed stateful service instance, and sends the first permission expiration time of the first renewal time to the main permission table of the database for storage. The time interval between the first permission expiration time and the first renewal time is twice the preset renewal period.

[0253] S1310. The second server obtains the second deployed stateful service instance at the second time, retrieves the main permission table from the database, and uses the main permission table as the target local permission table. The main permission table includes the first hash value and the first permission expiration time of the first deployed stateful service instance.

[0254] S1311. The second server stores the second hash value of the second deployed stateful service instance on the hash ring in the target local permission table, and sends the second hash value of the second deployed stateful service instance to the main permission table of the database for storage.

[0255] It should be noted that when the second deployed stateful instance comes online, the number of stateful service instances in the cluster increases, that is, the second number becomes larger. Therefore, the target time shard corresponding to the second deployed stateful instance can be selected from the remaining time shards. Then, in the target time shard corresponding to the second deployed stateful instance, the second hash value of the second deployed stateful service instance is sent to the main permission table of the database for storage.

[0256] Alternatively, when multiple stateful instances of the second deployment are online, the preset renewal period can be re-divided to obtain a new set of time shards. Then, the target time shard corresponding to each stateful instance of the second deployment can be selected from the new set of time shards. In the target time shard corresponding to each stateful instance of the second deployment, the second hash value of each stateful service instance of the second deployment can be sent to the main permission table of the database for storage.

[0257] At this point, due to the re-division of the preset renewal period, the target time shard corresponding to the first deployed stateful service instance changes. However, since the first renewal time of the first deployed stateful service instance has been determined, the second renewal time and subsequent renewal times of the first deployed stateful service instance will not change.

[0258] For example, before the preset renewal period T is re-divided, the target time shard corresponding to the first deployed stateful service instance A in the first preset renewal period is time shard 1, the first renewal time of the first deployed stateful service instance A is time a, and time a is in time shard 1 of the first preset renewal period, then the second renewal time of the first deployed stateful service instance A is time (a+T), and time (a+T) is in time shard 1 of the second preset renewal period.

[0259] After the preset renewal period is re-divided, the target time shard corresponding to the first deployed stateful service instance A is time shard 2, and the second renewal time is still time (a+T), but time (a+T) is in time shard 2 in the second preset renewal period.
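The example in the two preceding paragraphs can be checked with a minimal sketch. It assumes (the source does not state this) that each preset renewal period is divided into equal-length time shards, that periods start at time 0, and that times are integer milliseconds. The helper `shard_number` and all numeric values are hypothetical.

```python
def shard_number(t_ms: int, period_ms: int, num_shards: int) -> int:
    """Return the 1-based time shard that time t_ms falls in.

    Assumes equal-length shards, periods starting at time 0, and a period
    that is evenly divisible by the number of shards.
    """
    shard_len = period_ms // num_shards
    offset = t_ms % period_ms          # position inside the current period
    return offset // shard_len + 1


T = 10_000        # preset renewal period (illustrative value)
a = 1_500         # first renewal time of instance A (illustrative value)

# Before re-division: 4 shards per period.
first_before = shard_number(a, T, 4)
second_before = shard_number(a + T, T, 4)

# After re-division: 8 shards per period; the renewal time a + T is unchanged.
second_after = shard_number(a + T, T, 8)
```

With these values, time a and time (a+T) both fall in time shard 1 under the original division, while time (a+T) falls in time shard 2 after re-division, even though the renewal time itself has not moved.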

[0260] S1312. The first server retrieves the main permission table from the database based on the second renewal time of the first deployed stateful service instance. The main permission table includes the second hash value of the second deployed stateful service instance. The time interval between the second renewal time and the first renewal time is a preset renewal period.

[0261] Alternatively, the online notification of the second deployed stateful service instance can be sent to the first server via the name service communication mesh. Upon receiving the notification, the first server retrieves the second hash value of the second deployed stateful service instance from the database and stores it in its local permission table. In this case, the first server does not need to wait for the second renewal time to determine whether permissions of the first deployed stateful service instance on the hash ring have been granted to the second deployed stateful service instance. See, for example, Figure 14.

[0262] S1313. The first server stores the second hash value of the second deployed stateful service instance in the local permission table of the first renewal time, and updates the first permission expiration time of the first deployed stateful service instance in that table, so as to obtain the first permission expiration time of the first deployed stateful service instance at the second renewal time.

[0263] That is, the first server stores the second hash value of the second deployed stateful service instance in the local permission table of the first renewal time, and after updating the first permission expiration time of the first deployed stateful service instance in the local permission table of the first renewal time, it obtains the local permission table of the second renewal time.

[0264] S1314. The first server sends the first permission expiration time of the first deployed stateful service instance at the second renewal time to the main permission table of the database for storage.

[0265] Each time a renewal occurs, the first server reads the main permission table from the database, updates its local permission table accordingly, obtains the first permission expiration time corresponding to that renewal time, and sends this first permission expiration time to the database for storage. In other words, it writes the first permission expiration time corresponding to the renewal time into the database. See, for example, Figure 14.
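A minimal sketch of one renewal, under stated assumptions: the main permission table is modeled as an in-memory dictionary rather than a real database, and every renewal sets the expiration time two preset renewal periods after the renewal time (the source states this interval only for the first renewal). The names `Permission` and `renew` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Permission:
    hash_value: int          # position of the instance on the hash ring
    expires_at: int = 0      # permission expiration time (0 = not yet granted)


def renew(instance_id: str, renewal_time: int, period: int,
          local: Dict[str, Permission], master: Dict[str, Permission]) -> None:
    """One renewal by the server that hosts instance_id.

    `master` stands in for the main permission table in the database.
    """
    # 1. Read the main table and refresh the local copy.
    for other_id, perm in master.items():
        local[other_id] = Permission(perm.hash_value, perm.expires_at)
    # 2. Extend this instance's own permission (assumed: two periods ahead).
    expires_at = renewal_time + 2 * period
    local[instance_id].expires_at = expires_at
    # 3. Write the new expiration time back to the main table.
    master[instance_id].expires_at = expires_at


# Instance A renewed first at time 0; instance B has just registered its hash.
master = {"A": Permission(100, 20), "B": Permission(200)}
local_a = {"A": Permission(100, 20)}
renew("A", renewal_time=10, period=10, local=local_a, master=master)
```

After the call, the local table of the first server also contains the hash value of B, and the expiration time of A has moved from 20 to 30 in both tables.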

[0266] S1315. The second server obtains the preset time and determines the target renewal period based on the preset time and the preset renewal period.

[0267] S1316. The second server determines the first renewal time of the second deployed stateful service instance based on the target renewal period and the second time.

[0268] It should be understood that if the second hash value of the second deployed stateful service instance is successfully sent to the database for storage within the target time shard, the first renewal time of the second deployed stateful service instance is determined based on the target renewal period and the second time. If the second hash value of the second deployed stateful service instance is successfully sent to the database for storage within the retry time shard, the first renewal time of the second deployed stateful service instance is determined based on the target renewal period and the time when the second hash value of the second deployed stateful service instance was successfully sent to the database for storage.

[0269] S1317. At the first renewal time of the second deployed stateful service instance, the second server retrieves the main permission table from the database. The main permission table includes the first permission expiration time of the first deployed stateful service instance at the second renewal time. The second server then updates the target local permission table based on the main permission table, so that the target local permission table at the first renewal time includes the first permission expiration time of the first deployed stateful service instance at the second renewal time, and also includes the first permission expiration time of the second deployed stateful service instance at the first renewal time.

[0270] S1318. The second server sends the first permission expiration time of the second deployed stateful service instance at the first renewal time to the main permission table of the database for storage.

[0271] S1319. The first server retrieves the master permission table from the database based on the third renewal time of the first deployed stateful service instance. The master permission table includes the first permission expiration time of the second deployed stateful service instance during the first renewal time.

[0272] S1320. The first server stores the first permission expiration time of the second deployed stateful service instance at the first renewal time in the local permission table of the second renewal time, and updates the first permission expiration time of the first deployed stateful service instance in that table, so as to obtain the first permission expiration time of the first deployed stateful service instance at the third renewal time.

[0273] S1321. The first server sends the first permission expiration time of the first deployed stateful service instance at the third renewal time to the main permission table of the database for storage.

[0274] The process of updating the renewal time of the first stateful service instance by the first server and the process of updating the renewal time of the second stateful service instance by the second server can refer to the above steps, and will not be repeated here in this embodiment.

[0275] S1322. The service caller obtains the routing table, which includes the first hash value of the first deployed stateful service instance and the second hash value of the second deployed stateful service instance.

[0276] It should be understood that when a service caller receives notifications of a stateful service instance going online or going offline, it will update its routing table to add the hash value of the stateful service instance going online or delete the hash value of the stateful service instance going offline.

[0277] For example, as shown in Figure 14, the online and offline notifications of stateful service instances are sent to the service caller through the name service communication mesh, so that the service caller is aware of stateful service instances going online or offline.
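The routing table update described above can be sketched as follows. This is an illustration only: the routing table is modeled as a dictionary from instance identifier to hash value, and the event names `"online"` and `"offline"` as well as the function `apply_notification` are hypothetical, not part of the name service communication mesh.

```python
from typing import Dict


def apply_notification(routing_table: Dict[str, int], event: str,
                       instance_id: str, hash_value: int = 0) -> None:
    """Update the caller-side routing table from one notification."""
    if event == "online":
        routing_table[instance_id] = hash_value
    elif event == "offline":
        routing_table.pop(instance_id, None)   # ignore unknown instances
    else:
        raise ValueError(f"unknown event: {event}")


routing_table = {"instance-1": 1200}
apply_notification(routing_table, "online", "instance-2", 48000)
apply_notification(routing_table, "offline", "instance-1")
```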

[0278] S1323. When the service caller receives a service request, it performs a hash operation on the object identifier in the service request to obtain a fourth hash value, and, based on the fourth hash value, selects from the routing table the first execution stateful service instance to handle the service request.

[0279] S1324. The service caller sends the service request to the server where the first stateful instance resides.

[0280] If the first executed stateful instance is the first deployed stateful service instance, then the server where the first executed stateful instance resides is the first server. If the first executed stateful instance is the second deployed stateful service instance, then the server where the first executed stateful instance resides is the second server.

[0281] S1325. After receiving the service request, the server where the first execution stateful service instance is located performs a hash operation on the object identifier in the service request to obtain the third hash value corresponding to the service request.

[0282] S1326. The server where the first execution stateful service instance is located selects, from the local permission table, the second execution stateful service instance to execute the service request, based on the third hash value and the first and second hash values in the local permission table.

[0283] S1327. If the first execution stateful service instance and the second execution stateful service instance are the same, and the current time is before the first permission expiration time of the second execution stateful service instance, then the service request is responded to through the second execution stateful service instance.

[0284] For example, as shown in Figure 14, both the first execution stateful service instance and the second execution stateful service instance are stateful service instance 1. Stateful service instance 1 is on the first server. The first server verifies the permissions of stateful service instance 1 through the local permission table to determine whether stateful service instance 1 has the permission to process the service call request.
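Steps S1325 to S1327 can be sketched as below. Several details are assumptions, because the source does not specify them: the hash function (MD5 truncated to 32 bits), the ring selection rule (the first instance hash at or after the request hash, wrapping around), and the table layout (instance identifier mapped to a pair of hash value and permission expiration time). The function names are hypothetical.

```python
import bisect
import hashlib
from typing import Dict, Tuple


def ring_hash(key: str) -> int:
    """Map a key onto a 32-bit ring (MD5 is an illustrative choice)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")


def owner_of(request_hash: int, table: Dict[str, Tuple[int, int]]) -> str:
    """Return the instance whose hash value is the first at or after
    request_hash, wrapping around the ring.

    table maps instance id -> (hash value, permission expiration time).
    """
    points = sorted((h, inst) for inst, (h, _) in table.items())
    idx = bisect.bisect_left(points, (request_hash, ""))
    return points[idx % len(points)][1]


def may_serve(receiver: str, object_id: str,
              local_table: Dict[str, Tuple[int, int]], now: int) -> bool:
    """Server-side check: the receiving instance responds only if the local
    permission table selects the same instance and its permission is valid."""
    selected = owner_of(ring_hash(object_id), local_table)
    return selected == receiver and now < local_table[selected][1]


local_table = {"instance-1": (1000, 50), "instance-2": (3000, 50)}
```

For example, `owner_of(2000, local_table)` selects `instance-2`, and a request hash of 3500 wraps around to `instance-1`. A request is served only when the instance chosen by the caller matches the instance chosen from the local permission table and the current time is before that instance's permission expiration time.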

[0285] S1328. The first server obtains the offline notification and, based on the offline notification, selects the offline stateful service instances from the first deployed stateful service instances.

[0286] S1329. The first server deletes the first hash value and the first permission expiration time of the offline stateful service instance in the local permission table.

[0287] In this embodiment, a smooth routing transition is achieved through the routing awareness capability of the service caller (by setting up a routing table on the service caller), and server-side permission management is implemented using distributed storage. This ensures that stateful service instances in the cluster can maintain availability and consistency when the cluster is expanded, scaled down, or when a failure occurs. It also allows for adjustments to the preset renewal period based on the actual deployment of the cluster, thereby shortening the time of service loss.

[0288] The following describes the test data for routing convergence time and permission convergence time (the shorter the routing convergence time and permission convergence time, the stronger the data consistency and availability of the cluster) in the embodiments of this application.

[0289] In the test, 100 stateful service instances were tested, and the online and offline processes of stateful service instances were examined. Route convergence time refers to the time elapsed from the first stateful service instance in the cluster to perceive a route change until all stateful service instances in the cluster perceive the route change. Permission convergence time includes the data convergence time and the permission activation convergence time. The data convergence time refers to the time elapsed from the first stateful service instance in the cluster to perceive a permission change until all stateful service instances in the cluster perceive the permission change. The permission activation convergence time refers to the time elapsed from the first stateful service instance in the cluster activating permissions until all stateful service instances in the cluster activate permissions.
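The convergence times defined above are all of the form "time from the first instance perceiving a change until the last instance perceives it". A minimal sketch, with hypothetical timestamps and a hypothetical helper name:

```python
from typing import Iterable


def convergence_time(perceived_at_ms: Iterable[int]) -> int:
    """Time from the first instance perceiving a change until all have."""
    times = list(perceived_at_ms)
    return max(times) - min(times)


# Hypothetical timestamps (ms) at which four instances perceived a route change.
route_seen = [1_000, 1_040, 1_015, 1_120]
route_convergence = convergence_time(route_seen)
```

The same helper applies to the data convergence time and the permission activation convergence time, given the corresponding timestamps.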

[0290] The following takes the deployment of stateful service instances as an example. When stateful service instances are deployed in batches through horizontal scaling in a container management system (Kubernetes), the startup time of the processes corresponding to these instances is uncontrollable, which affects the measurement of route convergence time and permission convergence time. Therefore, it is necessary to control for the startup time of the processes corresponding to these stateful service instances.

[0291] Optionally, when starting the process corresponding to a stateful service instance in a container (Pod), it will block and wait for a certain period of time before starting the processes corresponding to each stateful service instance simultaneously. However, even if the processes corresponding to each stateful service instance can be started simultaneously, the initialization process of each process corresponding to a stateful service instance is still uncontrollable. For example, the processes corresponding to some stateful service instances may register earlier than the processes corresponding to other stateful service instances. Theoretically, this situation is unavoidable, but usually the time difference is not too large. Therefore, in subsequent test data, adjustments will be made based on the fluctuations in the registration and online time of the processes corresponding to stateful service instances. Similarly, adjustments will be made for the fluctuations in the offline time of stateful service instances.

[0292] Table 1 shows the convergence of source routing when 10 new stateful service instances are deployed in a cluster containing 90 stateful service instances, i.e., the routing convergence time.

[0293]

[0294] Table 1

[0295] As can be seen from Table 1, if the cluster as a whole converges, that is, the routes of the original stateful service instances and the newly launched stateful service instances are consistent, it takes 395 milliseconds. The routes of the original 90 stateful service instances are consistent, which takes 310 milliseconds. Since we need to wait for all the newly launched stateful service instances to start up before we can examine the route convergence of all stateful service instances in the cluster, the time is relatively long.

[0296] For example, Figure 15 shows the deployment process of stateful service instances. The top timeline represents the registration process of the new stateful service instances, while the bottom timeline represents the process of the other stateful service instances receiving route change notifications. As can be seen from Figure 15, although the 10 newly launched stateful service instances were started simultaneously, their registration times fluctuated within a range of 170 milliseconds. As shown in Figure 13, these 170 milliseconds are included in the cluster's routing convergence time. Additionally, for each stateful service instance, there is an initialization process from the start of the corresponding process until it can output routing data. For the stateful service instance that starts latest, it takes 85 milliseconds to output routing data.

[0297] Therefore, once the fluctuation range is corrected for, the routing convergence time for deploying 10 new stateful service instances in a cluster containing 90 stateful service instances is 140 milliseconds. It mainly consists of the network latency of each stateful service instance, the latency of processing requests, and the latency of the name service communication mesh in processing the deployment (deploying a stateful service instance involves a two-phase commit, which requires at least two round-trip times; in the test environment, one round-trip time is 14 milliseconds).
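One reading of the figures above, which the text does not spell out, is that the corrected value is obtained by subtracting the registration fluctuation and the initialization time of the latest-starting instance from the overall convergence time. The arithmetic under that assumption:

```python
total_convergence_ms = 395      # whole-cluster route convergence (Table 1)
registration_jitter_ms = 170    # spread of registration times (Figure 15)
slowest_init_ms = 85            # initialization of the latest-starting instance

# Assumed correction: remove both startup effects from the measured total.
corrected_ms = total_convergence_ms - registration_jitter_ms - slowest_init_ms

round_trip_ms = 14
# A two-phase commit needs at least two round trips.
two_phase_commit_floor_ms = 2 * round_trip_ms
```

Under this reading, the corrected convergence time is 140 milliseconds, of which at least 28 milliseconds is attributable to the two round trips of the commit.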

[0298] Table 2 shows the convergence of source routes when 10 stateful service instances are taken offline in a cluster containing 100 stateful service instances. Unlike the online process, route convergence in the offline process is measured over a fixed set of instances, namely the remaining 90 stateful service instances. Similarly, considering the fluctuations in the actual offline time of stateful service instances, the route convergence time for offline instances is also adjusted.

[0299]

[0300] Table 2

[0301] Table 3 shows the permission convergence time when 10 new stateful service instances were deployed in a cluster containing 90 existing stateful service instances. As can be seen from Table 3, the permission convergence time is significantly longer than the routing convergence time, mainly for two reasons. First, when stateful service instances are deployed, new permissions need to be inserted into the database (i.e., the hash value of the new stateful service instance is stored in the database). Because multiple stateful service instances are deployed simultaneously, database write conflicts can occur, causing fluctuations in the time it takes each stateful service instance to write its permissions, thus affecting the permission convergence time. Second, other stateful service instances receive permission updates through the attribute synchronization function of the name service communication mesh, and after receiving the attribute synchronization they also need to perform a database read operation, which further affects the permission convergence time.

[0302]

[0303] Table 3

[0304] Modifying permissions for a stateful service instance on a hash ring requires one database read operation and one database write operation. If permissions for a stateful service instance are modified at time T0, and a conflict occurs during the write operation, two strategies can be employed: one is to continuously retry until a preset number of retries is reached; the other is to check whether the stateful service instance has successfully come online within the remaining time shard. If it has not, then the permissions for the stateful service instance are modified again. Theoretically, the maximum time required to write permissions to a stateful service instance is:

[0305]

[0306] Where n represents the number of times, M represents the number of stateful service instances in the cluster (theoretically, an idle time slice can be found by spanning half of the time slice), and T1 is the length of the time slice.

[0307] In the test environment, the average time for a database read operation is 15 milliseconds, and the average time for a database write operation is 35 milliseconds, so the average renewal time is 50 milliseconds. Even under ideal conditions (no write conflicts), it would take at least 5 seconds for 100 stateful service instances to complete the renewal process. However, in real-world applications, write conflicts can extend the renewal time. Therefore, a preset renewal period of 10 seconds and each time shard of 50 milliseconds can be set. Thus, the theoretical maximum time for writing permissions to stateful service instances is 5000 milliseconds. However, in practice, the renewal time is not that long, and it often succeeds after several consecutive retries (consecutive retries are only used for going online and offline).
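The figures in the preceding paragraph follow from simple arithmetic, reproduced here as a sketch. The only added assumption is that renewals are performed one after another without overlap.

```python
read_ms, write_ms = 15, 35
renewal_ms = read_ms + write_ms                 # one read plus one write

instances = 100
serial_renewal_ms = instances * renewal_ms      # renewals one after another

period_ms, shard_ms = 10_000, 50
shards_per_period = period_ms // shard_ms       # time shards in one period
```

This gives 50 milliseconds per renewal, 5000 milliseconds for 100 instances, and 200 time shards per 10-second period, so there are more time shards than stateful service instances.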

[0308] In the test environment, the last two stateful service instances retried five times before successfully writing permissions, which explains the slow convergence of permissions during deployment. Theoretically, writing permissions to 10 stateful service instances simultaneously would cause conflicts, and even five retries might not be enough to complete the write. However, due to time fluctuations in the initialization of the processes corresponding to the stateful service instances, the timing of concentrated permission writing was staggered, which explains the 696-second convergence time for the original 90 stateful service instances.

[0309] In addition, when a new stateful service instance is launched, in order to prevent permission conflicts from occurring even if an old stateful service instance is not notified, the permissions of the new stateful service instance only take effect after the first renewal. Therefore, when a stateful service instance is launched, the convergence time for the effective permissions is slower than the data convergence time for permissions by two preset renewal cycles. Moreover, the preset renewal cycle can be set according to the cluster size and network conditions.

[0310] Table 4 shows the permission convergence when 10 stateful service instances are taken offline in a cluster consisting of 100 stateful service instances. When a stateful service instance is taken offline, there will be no permission conflict when updating permissions on other stateful service instances because the stateful service instance is already offline. Therefore, the permission effective convergence time and the permission data convergence time are consistent.

[0311]

[0312] Table 4

[0313] Similar to when a stateful service instance is launched, write conflicts can also occur when a stateful service instance is taken offline. However, unlike the launch phase, even if the permissions of the offline stateful service instance are not deleted, the first permission expiration time of the offline stateful service instance will expire due to the lack of renewal. Therefore, if write conflicts continue, the stateful service instance will be taken offline directly after a preset number of retries, instead of waiting indefinitely.

[0314] Therefore, if a large number of stateful service instances are taken offline (e.g., 30 stateful service instances are taken offline in a batch), resulting in severe write conflicts, the system will give up resolving write conflicts for some stateful service instances and take them offline directly. In this case, the permission acceleration convergence mechanism (that is, notifying other servers when a stateful service instance is taken offline so that they can update their permission tables) will fail. The offline stateful service instance will only be detected when its permission expires without renewal, so permission convergence will take at least two preset renewal periods to complete. However, in practice this situation does not occur when a cluster is scaled down, so it can be avoided.

[0315] It should be noted that when a stateful service instance exits abnormally, the permission convergence time can be as long as two preset renewal cycles.

[0316] For other possible implementations and the corresponding beneficial effects of this embodiment, refer to the instance processing method embodiments above; they will not be repeated here.

[0317] To facilitate better implementation of the instance processing method provided in the embodiments of this application, the embodiments of this application also provide an apparatus based on the above instance processing method. The meanings of the terms used are the same as in the above instance processing method, and specific implementation details can be found in the descriptions in the method embodiments.

[0318] For example, as shown in Figure 16, the instance processing apparatus may include:

[0319] The first acquisition module 1601 is used to acquire the local permission table updated by the first renewal time. The local permission table includes the first hash value of each stateful service instance on the hash ring and the first permission expiration time. The time interval between the first permission expiration time and the first renewal time is greater than the preset renewal period.

[0320] The second acquisition module 1602 is used to acquire updated data from the database during the second renewal time, and based on the updated data, perform a first update on the first permission expiration time and the first hash value to obtain the local permission table after the first update. The first time interval between the second renewal time and the first renewal time is a preset renewal period.

[0321] The third acquisition module 1603 is used to acquire the target update time of the newly added stateful service instance if the local permission table after the first update includes the second hash value of the newly added stateful service instance on the hash ring.

[0322] The update module 1604 is used to perform a second update on the local permission table after the first update based on the target update time, so as to obtain the second updated local permission table. The second updated local permission table includes the second permission expiration time of the newly added stateful service instance.

[0323] Optionally, the third acquisition module 1603 is specifically used to perform:

[0324] If the local permission table after the first update includes the second hash value of the newly added stateful service instance on the hash ring, then obtain the preset time and the online time of the newly added stateful service instance.

[0325] The target renewal period is determined based on the preset time and preset renewal period;

[0326] Determine the target update time of the newly added stateful service instance based on its online time and the target renewal period.

[0327] Optionally, the instance processing apparatus further includes:

[0328] A deletion module, used to perform:

[0329] Receive offline notification;

[0330] Based on the offline notification, filter out the offline stateful service instances from the local permission table;

[0331] Delete the first hash value and the first permission expiration time of the offline stateful service instance in the local permission table to obtain the third updated local permission table.

[0332] Optionally, the second acquisition module 1602 is specifically used to perform:

[0333] Divide the preset renewal period to obtain a time shard set, which includes a first number of time shards;

[0334] Filter the target time shard corresponding to each stateful service instance from the time shard set;

[0335] In the second renewal time of the target time shard corresponding to the stateful service instance, updated data is retrieved from the database. The second renewal time and the first renewal time are in different preset renewal periods.

[0336] Optionally, the first number is greater than the second number of stateful service instances in the local permissions table.

[0337] Accordingly, the second acquisition module 1602 is specifically used to perform:

[0338] Select the second number of time shards from the time shard set to obtain the target time shards;

[0339] Assign a target time shard to each stateful service instance to obtain the target time shard corresponding to the stateful service instance.
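The division and assignment steps above can be sketched as follows. The source does not say how the second number of time shards is chosen from the set; spreading them evenly over the period, and using equal-length shards, are assumptions made here for illustration. The function names are hypothetical.

```python
from typing import Dict, List, Tuple


def divide_period(period_ms: int, num_shards: int) -> List[Tuple[int, int]]:
    """Split one renewal period into equal [start, end) offsets in ms."""
    length = period_ms // num_shards
    return [(i * length, (i + 1) * length) for i in range(num_shards)]


def assign_target_shards(instance_ids: List[str],
                         num_shards: int) -> Dict[str, int]:
    """Give each instance its own target shard index.

    Assumed policy: indices are spread evenly over the period.
    """
    if not instance_ids:
        return {}
    if num_shards <= len(instance_ids):
        raise ValueError("the first number must exceed the second number")
    step = num_shards // len(instance_ids)
    return {inst: i * step for i, inst in enumerate(sorted(instance_ids))}


shards = divide_period(10_000, 200)
targets = assign_target_shards(["inst-a", "inst-b", "inst-c"], 200)
```

Because the first number is greater than the second number, some time shards are always left unassigned; these remaining time shards are the ones available for retries.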

[0340] The instance processing device also includes:

[0341] The second acquisition module 1602 is specifically used to perform:

[0342] Based on the updated data, the first permission expiration time and the first hash value are updated for the first time;

[0343] If the first update fails, the retry time slices are selected from the remaining time slices. The remaining time slices are the time slices in the time slice set other than the target time slice.

[0344] In the retry time sharding, based on the updated data, the first permission expiration time and the first hash value are updated to obtain the local permission table after the first update.

[0345] Optionally, the second acquisition module 1602 is specifically used to perform:

[0346] If the first update fails, determine the target number of update failures;

[0347] If the target number of attempts is less than or equal to the preset number of attempts, then the retry time shards of the stateful service instance are selected from the remaining time shards based on the target time shards and the target number of attempts corresponding to the stateful service instance.
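A minimal sketch of retry shard selection. The source only says that the retry time shard is chosen from the remaining time shards based on the target time shard and the number of failures; the concrete rule used here (walk forward from the target shard through the remaining shards, wrapping around the period) is an assumption, and `pick_retry_shard` is a hypothetical name.

```python
from typing import Optional, Set


def pick_retry_shard(target_shard: int, failures: int, max_retries: int,
                     num_shards: int, target_shards: Set[int]) -> Optional[int]:
    """Pick a retry shard from the remaining (non-target) shards.

    Returns None when the failure count exceeds the preset number of
    retries or when no shard remains. failures is expected to be >= 1.
    """
    if failures > max_retries:
        return None
    remaining = [s for s in range(num_shards) if s not in target_shards]
    if not remaining:
        return None
    # Assumed rule: the k-th failure uses the k-th remaining shard after
    # the target shard, wrapping around the period.
    later = [s for s in remaining if s > target_shard]
    earlier = [s for s in remaining if s < target_shard]
    ordered = later + earlier
    return ordered[(failures - 1) % len(ordered)]


# Six shards, of which 0, 2 and 4 are target shards of three instances.
first_retry = pick_retry_shard(2, 1, 3, 6, {0, 2, 4})
second_retry = pick_retry_shard(2, 2, 3, 6, {0, 2, 4})
gave_up = pick_retry_shard(2, 4, 3, 6, {0, 2, 4})
```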

[0348] Optionally, the instance processing apparatus further includes:

[0349] The request / response module is used to execute:

[0350] Obtain the service request for the first execution stateful service instance, and perform a hash operation on the object identifier in the service request to obtain the third hash value corresponding to the service request;

[0351] Based on the third hash value and the fourth hash value in the second updated local permission table, select the second stateful service instance that executes the service request from the second updated local permission table;

[0352] If the first executing stateful service instance and the second executing stateful service instance are the same, and the current time is before the first permission expiration time of the second executing stateful service instance, then the service request is responded to through the second executing stateful service instance.

[0353] In practice, each of the above modules can be implemented as an independent entity or can be combined arbitrarily to be implemented as the same or several entities. For the specific implementation methods and corresponding beneficial effects of each of the above modules, please refer to the previous method embodiments, which will not be repeated here.

[0354] This application also provides an electronic device, which may be a server, a terminal, or the like. Figure 17 is a structural schematic diagram of the electronic device involved in the embodiments of this application. Specifically:

[0355] The electronic device may include components such as a processor 1701 with one or more processing cores, a memory 1702 with one or more computer-readable storage media, a power supply 1703, and an input unit 1704. Those skilled in the art will understand that the electronic device structure shown in Figure 17 does not constitute a limitation on the electronic device, which may include more or fewer components than shown, combine certain components, or have a different component arrangement. Wherein:

[0356] Processor 1701 is the control center of the electronic device, connecting various parts of the device via various interfaces and lines. It executes computer programs and / or modules stored in memory 1702, and calls data stored in memory 1702, to perform various functions and process data. Optionally, processor 1701 may include one or more processing cores; preferably, processor 1701 may integrate an application processor and a modem processor, wherein the application processor mainly handles the operating system, user interface, and applications, and the modem processor mainly handles wireless communication. It is understood that the modem processor may not be integrated into processor 1701.

[0357] The memory 1702 can be used to store computer programs and modules. The processor 1701 executes various functional applications and data processing by running the computer programs and modules stored in the memory 1702. The memory 1702 may mainly include a program storage area and a data storage area. The program storage area may store the operating system, computer programs required for at least one function (such as sound playback function, image playback function, etc.), etc.; the data storage area may store data created according to the use of the electronic device, etc. In addition, the memory 1702 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory 1702 may also include a memory controller to provide the processor 1701 with access to the memory 1702.

[0358] The electronic device also includes a power supply 1703 that supplies power to the various components. Preferably, the power supply 1703 can be logically connected to the processor 1701 through a power management system, thereby enabling functions such as charging, discharging, and power consumption management through the power management system. The power supply 1703 may also include one or more DC or AC power supplies, recharging systems, power fault detection circuits, power converters or inverters, power status indicators, and other arbitrary components.

[0359] The electronic device may also include an input unit 1704, which can be used to receive input digital or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.

[0360] Although not shown, the electronic device may also include a display unit, etc., which will not be described in detail here. Specifically, in this embodiment, the processor 1701 in the electronic device loads the executable files corresponding to the processes of one or more computer programs into the memory 1702 according to the following instructions, and the processor 1701 runs the computer programs stored in the memory 1702 to realize various functions, such as:

[0361] Obtain the local permission table updated at the first renewal time. The local permission table includes the first hash value of each stateful service instance on the hash ring and the first permission expiration time. The time interval between the first permission expiration time and the first renewal time is greater than the preset renewal period.

[0362] At the second renewal time, updated data is retrieved from the database, and based on the updated data, the first permission expiration time and the first hash value are updated to obtain the local permission table after the first update. The first time interval between the second renewal time and the first renewal time is the preset renewal period.

[0363] If the local permission table after the first update includes the second hash value of the newly added stateful service instance on the hash ring, then obtain the target update time of the newly added stateful service instance.

[0364] Based on the target update time, the local permission table after the first update is updated a second time to obtain the local permission table after the second update. The local permission table after the second update includes the second permission expiration time of the newly added stateful service instance.

[0365] For details on the specific implementation methods and corresponding beneficial effects of the above operations, please refer to the detailed description of the instance processing method above, which will not be repeated here.
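The four operations above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of the embodiments: the names `Entry`, `first_update`, `second_update`, `target_update_time`, and `fetch_row` are hypothetical, and the sketch assumes that the second update simply re-reads the newly added instance's row from the database at its target update time.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Entry:
    hash_value: int     # position of the instance on the hash ring
    expires_at: float   # permission expiration time, in seconds


Table = Dict[str, Entry]
Row = Tuple[int, float]  # (hash value, permission expiration time)


def first_update(table: Table, rows: Dict[str, Row]) -> List[str]:
    """Apply the rows fetched at the second renewal time.

    Returns the names of instances that were not in the table before,
    i.e. the newly added stateful service instances.
    """
    added = [name for name in rows if name not in table]
    for name, (hash_value, expires_at) in rows.items():
        table[name] = Entry(hash_value, expires_at)
    return added


def second_update(
    table: Table,
    added: List[str],
    target_update_time: Callable[[str], float],
    fetch_row: Callable[[str, float], Row],
) -> None:
    """Refresh each newly added instance at its own target update time.

    Assumption: fetch_row(name, t) returns the row as it stands in the
    database at time t; a real system would wait until t before reading.
    """
    for name in added:
        hash_value, expires_at = fetch_row(name, target_update_time(name))
        table[name] = Entry(hash_value, expires_at)
```

With a table holding instance `a` and a database that now also lists instance `b`, `first_update` reports `b` as newly added, and `second_update` then records the second permission expiration time of `b`.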

[0366] Those skilled in the art will understand that all or part of the steps in the various methods of the above embodiments can be performed by a computer program, or by a computer program controlling related hardware. The computer program can be stored in a computer-readable storage medium and loaded and executed by a processor.

[0367] Therefore, embodiments of this application provide a computer-readable storage medium storing a computer program that can be loaded by a processor to execute the steps in any of the instance processing methods provided in the embodiments of this application. For example, the computer program can execute the following steps:

[0368] Obtain the local permission table updated at the first renewal time. The local permission table includes the first hash value of each stateful service instance on the hash ring and the first permission expiration time. The time interval between the first permission expiration time and the first renewal time is greater than the preset renewal period.

[0369] At the second renewal time, updated data is retrieved from the database, and based on the updated data, the first permission expiration time and the first hash value are updated to obtain the local permission table after the first update. The first time interval between the second renewal time and the first renewal time is the preset renewal period.

[0370] If the local permission table after the first update includes the second hash value of the newly added stateful service instance on the hash ring, then obtain the target update time of the newly added stateful service instance.

[0371] Based on the target update time, the local permission table after the first update is updated a second time to obtain the local permission table after the second update. The local permission table after the second update includes the second permission expiration time of the newly added stateful service instance.

[0372] For details on the specific implementation methods and corresponding beneficial effects of the above operations, please refer to the previous embodiments, which will not be repeated here.

[0373] The computer-readable storage medium may include: read-only memory (ROM), random access memory (RAM), disk or optical disk, etc.

[0374] Since the computer program stored in the computer-readable storage medium can execute the steps in any of the instance processing methods provided in the embodiments of this application, the beneficial effects that any of the instance processing methods provided in the embodiments of this application can achieve can be realized. For details, please refer to the previous embodiments, which will not be repeated here.

[0375] According to one aspect of this application, a computer program product or computer program is provided, comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions, causing the computer device to perform the above-described instance processing method.

[0376] The foregoing has provided a detailed description of the instance processing method, apparatus, electronic device, and computer-readable storage medium provided in the embodiments of this application. Specific examples have been used to illustrate the principles and implementation of this application, and the descriptions of the above embodiments are intended only to help in understanding the methods and core ideas of this application. For those skilled in the art, the specific implementation and the scope of application may vary in accordance with the ideas of this application. Therefore, the content of this specification should not be construed as a limitation of this application.

Claims

1. An instance processing method, characterized in that it comprises: obtaining a local permission table updated at a first renewal time, wherein the local permission table includes a first hash value of each stateful service instance on a hash ring and a first permission expiration time, and the time interval between the first permission expiration time and the first renewal time is greater than a preset renewal period; at a second renewal time, obtaining updated data from a database, and based on the updated data, performing a first update on the first permission expiration time and the first hash value to obtain a first updated local permission table, wherein a first time interval between the second renewal time and the first renewal time is the preset renewal period; if the first updated local permission table includes a second hash value of a newly added stateful service instance on the hash ring, obtaining a target update time of the newly added stateful service instance; and performing, according to the target update time, a second update on the first updated local permission table to obtain a second updated local permission table, wherein the second updated local permission table includes a second permission expiration time of the newly added stateful service instance.

2. The instance processing method according to claim 1, characterized in that obtaining the target update time of the newly added stateful service instance if the first updated local permission table includes the second hash value of the newly added stateful service instance on the hash ring comprises: if the first updated local permission table includes the second hash value of the newly added stateful service instance on the hash ring, obtaining a preset time and an online time of the newly added stateful service instance; determining a target renewal period based on the preset time and the preset renewal period; and determining the target update time of the newly added stateful service instance based on the online time and the target renewal period.
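As a purely illustrative sketch of the computation in claim 2: the claim does not state how the preset time and the preset renewal period are combined, nor how the online time and the target renewal period yield the target update time. The Python below assumes simple addition in both places; that choice and the function names are assumptions, not the claimed method.

```python
def target_renewal_period(preset_time: float, renewal_period: float) -> float:
    """Target renewal period for a newly added instance.

    Assumption: the preset time is added to the preset renewal period.
    """
    return preset_time + renewal_period


def target_update_time(
    online_time: float, preset_time: float, renewal_period: float
) -> float:
    """Target update time, counted from the moment the instance came online.

    Assumption: the target renewal period is added to the online time.
    """
    return online_time + target_renewal_period(preset_time, renewal_period)
```

Under these assumptions, an instance that came online at t = 100 s, with a preset time of 2 s and a preset renewal period of 10 s, would have a target update time of 112 s.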

3. The instance processing method according to claim 1, characterized in that, before obtaining the updated data from the database at the second renewal time, the method further comprises: receiving an offline notification; selecting, based on the offline notification, an offline stateful service instance from the local permission table; and deleting the first hash value and the first permission expiration time of the offline stateful service instance from the local permission table to obtain a third updated local permission table.
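The deletion step of claim 3 can be sketched as follows. The table layout (instance name mapped to a pair of hash value and permission expiration time) and the function name `remove_offline` are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable, Tuple

# instance name -> (hash value on the ring, permission expiration time)
Table = Dict[str, Tuple[int, float]]


def remove_offline(table: Table, offline_names: Iterable[str]) -> Table:
    """Return a copy of the table without the instances named in the
    offline notification; names absent from the table are ignored."""
    offline = set(offline_names)
    return {name: row for name, row in table.items() if name not in offline}
```

Returning a new table instead of mutating the old one is a design choice of this sketch: readers that still hold the previous table are not affected while the third updated table is being built.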

4. The instance processing method according to claim 1, characterized in that obtaining the updated data from the database at the second renewal time comprises: dividing the preset renewal period to obtain a time slice set, the time slice set including a first number of time slices; selecting, from the time slice set, the target time slice corresponding to each stateful service instance; and obtaining the updated data from the database at the second renewal time within the target time slice corresponding to the stateful service instance, wherein the second renewal time and the first renewal time are in different preset renewal periods.
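A minimal sketch of the time slicing in claim 4, assuming equal-width slices, at least as many slices as instances, and assignment in sorted-name order. None of these details are specified by the claim, and the function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Slice = Tuple[float, float]  # (start offset, end offset) within one period


def split_period(period: float, count: int) -> List[Slice]:
    """Divide one renewal period into `count` equal-width time slices."""
    width = period / count
    return [(i * width, (i + 1) * width) for i in range(count)]


def assign_slices(instances: Sequence[str], slices: List[Slice]) -> Dict[str, Slice]:
    """Give each instance its own target slice (assumed: sorted-name order)."""
    if len(instances) > len(slices):
        raise ValueError("need at least one time slice per instance")
    return {name: slices[i] for i, name in enumerate(sorted(instances))}


def renewal_time(period_start: float, target_slice: Slice) -> float:
    """Renewal time of an instance: the start of its slice in the given period."""
    return period_start + target_slice[0]
```

Spreading the instances over distinct slices means they do not all query the database at the same moment within a renewal period.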

5. The instance processing method according to claim 4, characterized in that the first number is greater than a second number of stateful service instances in the local permission table; selecting, from the time slice set, the target time slice corresponding to each stateful service instance comprises: selecting, from the time slice set, the leading time slices equal in count to the second number to obtain the target time slices; and assigning one target time slice to each stateful service instance to obtain the target time slice corresponding to the stateful service instance; and performing the first update on the first permission expiration time and the first hash value based on the updated data to obtain the first updated local permission table comprises: performing the first update on the first permission expiration time and the first hash value based on the updated data; if the first update fails, selecting a retry time slice from the remaining time slices, wherein the remaining time slices are the time slices in the time slice set other than the target time slices; and in the retry time slice, updating the first permission expiration time and the first hash value according to the updated data to obtain the first updated local permission table.

6. The instance processing method according to claim 5, characterized in that selecting the retry time slice from the remaining time slices if the first update fails comprises: if the first update fails, determining a target number of update failures; and if the target number is less than or equal to a preset number, selecting the retry time slice of the stateful service instance from the remaining time slices based on the target time slice corresponding to the stateful service instance and the target number.
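The retry selection of claims 5 and 6 can be illustrated with the sketch below. The claims say only that the retry time slice depends on the instance's target time slice and on the number of failures; the modular formula used here is an assumption made for the example, as is the function name `pick_retry_slice`.

```python
from typing import Optional


def pick_retry_slice(
    total_slices: int,
    instance_count: int,
    target_index: int,
    failures: int,
    max_failures: int,
) -> Optional[int]:
    """Return the index of the retry time slice, or None if no retry is allowed.

    The first `instance_count` slices are the target slices; the slices
    after them are the remaining slices available for retries.
    Assumed formula: rotate through the remaining slices, offset by the
    instance's target slice index and its failure count.
    """
    remaining = list(range(instance_count, total_slices))
    if failures > max_failures or not remaining:
        return None
    return remaining[(target_index + failures - 1) % len(remaining)]
```

Because the remaining slices come after the target slices, a retry chosen this way still falls inside the same renewal period, and offsetting by the target slice index tends to send instances that fail together to different retry slices.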

7. The instance processing method according to any one of claims 1-6, characterized in that, after performing the second update on the first updated local permission table according to the target update time to obtain the second updated local permission table, the method further comprises: obtaining a service request for a first execution stateful service instance, and performing a hash operation on an object identifier in the service request to obtain a third hash value corresponding to the service request; selecting, based on the third hash value and a fourth hash value in the second updated local permission table, a second execution stateful service instance that executes the service request from the second updated local permission table; and if the first execution stateful service instance and the second execution stateful service instance are the same, and the current time is before the first permission expiration time of the second execution stateful service instance, responding to the service request through the second execution stateful service instance.
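The request check of claim 7 can be sketched as follows. The sketch assumes a conventional consistent-hashing rule (the owner is the first instance whose hash value is at or after the request hash, wrapping around the ring) and a 32-bit hash derived from SHA-256. Both are assumptions, as are the names `ring_hash`, `select_owner`, and `may_respond`.

```python
import bisect
import hashlib
from typing import Dict, Tuple

# instance name -> (hash value on the ring, permission expiration time)
Table = Dict[str, Tuple[int, float]]


def ring_hash(key: str) -> int:
    """Map an object identifier to a 32-bit position on the ring (assumed hash)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")


def select_owner(table: Table, request_hash: int) -> str:
    """Pick the instance responsible for request_hash (clockwise successor)."""
    if not table:
        raise ValueError("permission table is empty")
    ring = sorted((hash_value, name) for name, (hash_value, _) in table.items())
    positions = [hash_value for hash_value, _ in ring]
    index = bisect.bisect_left(positions, request_hash) % len(ring)
    return ring[index][1]


def may_respond(receiver: str, table: Table, object_id: str, now: float) -> bool:
    """True only if the receiving instance owns the object and its permission
    has not yet expired at time `now`."""
    owner = select_owner(table, ring_hash(object_id))
    _, expires_at = table[owner]
    return owner == receiver and now < expires_at
```

An instance that receives a request for an object it does not own, or whose permission has already expired, gets `False` and should not serve the request.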

8. An instance processing apparatus, characterized in that it comprises: a first acquisition module, used to acquire a local permission table updated at a first renewal time, wherein the local permission table includes a first hash value of each stateful service instance on a hash ring and a first permission expiration time, and the time interval between the first permission expiration time and the first renewal time is greater than a preset renewal period; a second acquisition module, used to acquire updated data from a database at a second renewal time, and based on the updated data, perform a first update on the first permission expiration time and the first hash value to obtain a first updated local permission table, wherein a first time interval between the second renewal time and the first renewal time is the preset renewal period; a third acquisition module, used to acquire a target update time of a newly added stateful service instance if the first updated local permission table includes a second hash value of the newly added stateful service instance on the hash ring; and an update module, used to perform a second update on the first updated local permission table according to the target update time to obtain a second updated local permission table, wherein the second updated local permission table includes a second permission expiration time of the newly added stateful service instance.

9. An electronic device, characterized in that it includes a processor and a memory, the memory storing a computer program, and the processor running the computer program in the memory to perform the instance processing method according to any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program adapted for loading by a processor to execute the instance processing method according to any one of claims 1 to 7.

11. A computer program product, characterized in that the computer program product stores a computer program adapted for loading by a processor to execute the instance processing method according to any one of claims 1 to 7.