Probe method and device of server, access device and computer readable storage medium

By establishing probe sessions through access devices and adjusting probe cycles and traffic weights, user access issues caused by server failures were resolved, enabling rapid restoration of server usage and business continuity, and improving user experience.

CN122247897A · Pending · Publication Date: 2026-06-19 · STATE GRID HENAN ELECTRIC POWER COMPANY ZHENGZHOU POWER SUPPLY CO

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
STATE GRID HENAN ELECTRIC POWER COMPANY ZHENGZHOU POWER SUPPLY CO
Filing Date
2026-04-24
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

In port-based network access control and Web authentication/mandatory portal scenarios, access devices cannot switch over in a timely manner when the server fails, resulting in new users being denied access and online users being unable to log off, creating a silent failure and impacting user experience.

Method used

Access devices establish probe sessions and periodically send probe messages. Based on the server's response, they adjust the probe cycle and the weight of new user traffic, increase the probe frequency, and reduce the allocation of new user traffic. This ensures that the server can quickly restore traffic capacity when its status fluctuates, thus avoiding load pressure.

Benefits of technology

When server status fluctuates, quickly restore server usage, reduce interference from new users, ensure uninterrupted service for online users, and improve user experience.

✦ Generated by Eureka AI based on patent content.


Abstract

This disclosure provides a method, apparatus, access device, and computer-readable storage medium for server detection. The method includes: in response to a target event being triggered, establishing a detection session between the access device and the target server based on the source parameters with which the access device sends real authentication messages; in each of multiple detection cycles, sending detection messages corresponding to various authentication scenarios to the target server based on the detection session to obtain a detection result; if the detection result indicates that the state of the target server has changed from available to unavailable, adjusting the duration of the detection cycle from a first duration to a second duration, and adjusting the new user traffic weight of the target server to a first preset weight value; if the detection result indicates that the state of the target server has changed from unavailable to available, adjusting the duration of the detection cycle from the second duration to a third duration, and adjusting the new user traffic weight of the target server from the first preset weight value to a second preset weight value.

Description

Technical Field

[0001] This disclosure relates to the field of computer communication technology, and more specifically, to a method, apparatus, access device, and computer-readable storage medium for detecting a server.

Background Technology

[0002] In port-based network access control scenarios such as 802.1X and web authentication / mandatory portals, the access switch acts as the "authentication enforcement point." The end user first sends identity credentials, such as username, password, and certificate, to the access switch. The access switch encapsulates these credentials into a Remote Authentication Dial-In User Service (RADIUS) message and forwards it to the backend RADIUS server. Only when the server returns an Access-Accept message does the access switch open the port or issue a Virtual Local Area Network (VLAN) or Internet Protocol (IP) message to allow user access; if an Access-Reject message or timeout is returned, access is denied.

[0003] In the above authentication process, the server is a key node in the authentication chain. If a failure occurs, such as protocol stack deadlock, database hang, shared key drift, or link interruption, an access device in the current authentication mode can only wait for a real user authentication to time out before switching to another server. During this period, from the terminal's point of view, new users are rejected and online users cannot log off, resulting in a silent failure.

Summary of the Invention

[0004] In view of this, the present disclosure provides a method, apparatus, access device, and computer-readable storage medium for detecting servers to solve the above problems.

[0005] Specifically, this disclosure is achieved through the following technical solutions. In a first aspect, embodiments of this disclosure provide a server detection method applied to an access device. The method includes: in response to a target event being triggered, establishing a detection session between the access device and a target server based on the source parameters with which the access device sends real authentication messages; in each of multiple detection cycles, sending detection messages corresponding to various authentication scenarios to the target server based on the detection session to obtain a detection result for the target server, wherein the detection messages have the same message format as the service messages of the Remote Authentication Dial-In User Service (RADIUS); when the detection result indicates that the state of the target server has changed from available to unavailable, adjusting the duration of the detection cycle from a first duration to a second duration, adjusting the new user traffic weight of the target server to a first preset weight value, and entering the next detection cycle, wherein the first duration is longer than the second duration; and when the detection result indicates that the state of the target server has changed from unavailable to available, adjusting the duration of the detection cycle from the second duration to a third duration, adjusting the new user traffic weight of the target server from the first preset weight value to a second preset weight value, and entering the next detection cycle, wherein the second preset weight value is greater than the first preset weight value and less than the target traffic weight value of the target server, and the third duration is longer than the second duration and shorter than the first duration.
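The two state transitions described in the first aspect can be sketched as follows. This is a minimal illustration, not the claimed implementation: the names `ServerState` and `apply_probe_result` and every numeric value are assumptions chosen only to satisfy the stated orderings (second duration < third duration < first duration; first weight < second weight < target weight).

```python
from dataclasses import dataclass

# All numeric values are illustrative assumptions, not values from the disclosure.
FIRST_DURATION = 30.0   # normal detection cycle, in seconds
SECOND_DURATION = 5.0   # short cycle used while the server is unavailable
THIRD_DURATION = 10.0   # intermediate cycle used right after recovery
TARGET_WEIGHT = 100     # configured (target) new user traffic weight
FIRST_WEIGHT = 0        # weight applied when the server becomes unavailable
SECOND_WEIGHT = 20      # reduced weight applied right after recovery


@dataclass
class ServerState:
    available: bool = True
    cycle_duration: float = FIRST_DURATION
    new_user_weight: int = TARGET_WEIGHT


def apply_probe_result(state: ServerState, probe_ok: bool) -> ServerState:
    """Adjust cycle duration and weight only when availability changes."""
    if state.available and not probe_ok:
        # available -> unavailable: probe faster, cut new user traffic
        state.cycle_duration = SECOND_DURATION
        state.new_user_weight = FIRST_WEIGHT
    elif not state.available and probe_ok:
        # unavailable -> available: probe a bit slower, restore traffic partially
        state.cycle_duration = THIRD_DURATION
        state.new_user_weight = SECOND_WEIGHT
    state.available = probe_ok
    return state
```

Calling `apply_probe_result` once per detection cycle keeps the state unchanged while consecutive results agree, so only the two transitions named in the paragraph alter the cycle duration and weight.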

[0006] Optionally, the method further includes: determining the number of consecutive periods in which the target server is available within a plurality of detection cycles after the detection result indicates that the state of the target server has changed from unavailable to available; determining a target value for the new user traffic weight corresponding to the target server based on the number of consecutive periods in which the target server is available; adjusting the user traffic weight of the target server to the target value; and adjusting the duration of the detection cycle from the third duration to the first duration when the number of consecutive periods in which the target server is available reaches a first preset threshold.
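One possible way to derive the weight from the number of consecutive available cycles is a linear ramp, sketched below. The disclosure only states that the target value depends on that count; the linear rule, the function names, and all default values here are assumptions.

```python
def ramp_weight(consecutive_ok: int, start_weight: int = 20,
                target_weight: int = 100, step: int = 20) -> int:
    """Assumed linear ramp: raise the weight by `step` per available cycle,
    never exceeding the configured target weight."""
    return min(target_weight, start_weight + step * consecutive_ok)


def cycle_after_recovery(consecutive_ok: int, restore_threshold: int = 4,
                         third_duration: float = 10.0,
                         first_duration: float = 30.0) -> float:
    """Return to the first (normal) duration once the count of consecutive
    available cycles reaches the first preset threshold."""
    if consecutive_ok >= restore_threshold:
        return first_duration
    return third_duration
```

With these assumed defaults the weight reaches the target after four consecutive available cycles, which is also when the cycle duration returns to the first duration.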

[0007] Optionally, sending probe messages corresponding to multiple authentication scenarios to the target server based on the probe session to obtain the probe result of the target server includes: within the same probe cycle, sending the probe messages corresponding to the multiple authentication scenarios to the target server in succession based on the probe session; if a response message from the target server to any of the probe messages is received within a first preset time, determining that the probe result of the target server is available; if no response message to any of the probe messages is received from the target server within the first preset time, determining that the probe result of the target server is unavailable.
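The "any response within the window" rule can be illustrated without real network I/O by evaluating observed response delays. The function name, the scenario labels, and the dictionary representation are hypothetical.

```python
from typing import Mapping, Optional


def probe_all_scenarios(response_delays: Mapping[str, Optional[float]],
                        first_preset_time: float) -> bool:
    """Return True (available) if at least one probe was answered in time.

    `response_delays` maps a scenario label to the observed response delay
    in seconds, or None when no response arrived at all.
    """
    return any(delay is not None and delay <= first_preset_time
               for delay in response_delays.values())
```

For example, `probe_all_scenarios({"access": None, "chap": 0.4}, 1.0)` evaluates to `True`, because one of the two probes was answered inside the window.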

[0008] Optionally, the step of sending probe packets corresponding to multiple authentication scenarios to the server based on the probe session to obtain the probe result of the target server includes: sending a first probe packet corresponding to a first authentication scenario among the multiple authentication scenarios to the target server based on the probe session; if a first response packet from the target server for the first probe packet is received within a second preset time, then the probe result of the target server is determined to be available; if the first response packet is not received within the second preset time, then sending a second probe packet corresponding to a second authentication scenario among the multiple authentication scenarios to the target server based on the probe session; if a second response packet from the target server for the second probe packet is received within a third preset time, then the probe result of the target server is determined to be available; if the second response packet is not received, then the probe result of the target server is determined to be unavailable.
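The sequential variant, where the second probe is sent only if the first goes unanswered, might look like the sketch below. `send_probe` is a hypothetical callable standing in for the real send-and-wait logic; the scenario labels and timeout defaults are assumptions.

```python
from enum import Enum
from typing import Callable


class ProbeOutcome(Enum):
    FIRST_ANSWERED = "first"
    SECOND_ANSWERED = "second"
    NO_ANSWER = "none"


def sequential_probe(send_probe: Callable[[str, float], bool],
                     second_preset_time: float = 2.0,
                     third_preset_time: float = 2.0) -> ProbeOutcome:
    """Try the first scenario; fall back to the second only on timeout.

    `send_probe(scenario, timeout)` must return True when a response
    arrives within `timeout` seconds.
    """
    if send_probe("scenario-1", second_preset_time):
        return ProbeOutcome.FIRST_ANSWERED
    if send_probe("scenario-2", third_preset_time):
        return ProbeOutcome.SECOND_ANSWERED
    return ProbeOutcome.NO_ANSWER


def is_available(outcome: ProbeOutcome) -> bool:
    """Either answered probe counts as available."""
    return outcome is not ProbeOutcome.NO_ANSWER
```

Returning which probe was answered, rather than a plain boolean, keeps enough information for later steps that treat a second-probe answer differently from a first-probe answer.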

[0009] Optionally, the method further includes: if a second response message from the target server to the second probe message is received within a third preset time period, marking the target server as a minor fault state; if the number of consecutive probe cycles in which the target server is marked as a minor fault state reaches a second preset number threshold, adjusting the user traffic weight of the target server to a third preset weight value; the third preset weight value is less than the target traffic weight value.
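A rough sketch of the minor-fault bookkeeping follows. The class name, the threshold, and the weight values are assumptions; the disclosure does not say what happens to the weight when the streak is broken, so this sketch simply resets the counter and leaves the weight untouched.

```python
class MinorFaultTracker:
    """Counts consecutive cycles answered only by the second probe."""

    def __init__(self, threshold: int = 3, target_weight: int = 100,
                 third_weight: int = 50) -> None:
        if third_weight >= target_weight:
            raise ValueError("third weight must be below the target weight")
        self.threshold = threshold
        self.third_weight = third_weight
        self.weight = target_weight
        self.consecutive_minor = 0

    def record_cycle(self, answered_by_second_probe: bool) -> int:
        """Record one probe cycle and return the resulting weight."""
        if answered_by_second_probe:
            self.consecutive_minor += 1
            if self.consecutive_minor >= self.threshold:
                self.weight = self.third_weight
        else:
            # Assumption: a cycle without a minor fault only breaks the streak.
            self.consecutive_minor = 0
        return self.weight
```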

[0010] Optionally, the method further includes: maintaining a disconnection counter for the target server; the initial value of the disconnection counter is a first value; if no first response message from the target server to the first probe message is received within a second preset time, the value of the disconnection counter is increased by a second value; and if the first response message from the target server to the first probe message is received within the second preset time, the value of the disconnection counter is set back to the initial value; if a second response message from the target server to the second probe message is received within a third preset time, the value of the disconnection counter is increased by a third value; and if the value of the disconnection counter is the sum of the first value, the second value, and the third value, the status of the target server is marked as faulty.

[0011] Optionally, the method further includes: marking the state of the target server as faulty when the number of consecutive periods indicated by the detection result as the state of the target server being unavailable reaches a third quantity threshold.
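The consecutive-unavailable rule reduces to counting the trailing run of failed cycles. In this sketch the function names and the default threshold of 3 are assumptions.

```python
from typing import Sequence


def trailing_unavailable(results: Sequence[bool]) -> int:
    """Length of the most recent run of unavailable (False) results."""
    streak = 0
    for ok in reversed(results):
        if ok:
            break
        streak += 1
    return streak


def is_faulty(results: Sequence[bool], third_threshold: int = 3) -> bool:
    """Mark the server faulty once the trailing run reaches the threshold."""
    return trailing_unavailable(results) >= third_threshold
```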

[0012] Optionally, the method further includes: when the state of the target server is marked as faulty, retaining the sessions of online users and caching the billing update messages of the online users; after the state of the target server recovers to availability, sending the cached billing update messages to the target server; and when the cached billing update messages expire, sending a billing stop message to the target server, so that the target server can perform billing processing on the online users corresponding to the cached billing update messages after the state recovers to availability.
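The caching behaviour could be sketched as a small queue with a time-to-live. This is one reading of the paragraph, under the assumption that expiry is evaluated when the cache is flushed after recovery; the class, the field names, the message labels, and the 300-second TTL are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Tuple


@dataclass
class CachedUpdate:
    user: str
    payload: str
    cached_at: float  # timestamp in seconds


class AccountingCache:
    """Holds billing update messages while the server is marked faulty."""

    def __init__(self, ttl: float = 300.0) -> None:
        self.ttl = ttl
        self.pending: Deque[CachedUpdate] = deque()

    def cache(self, user: str, payload: str, now: float) -> None:
        self.pending.append(CachedUpdate(user, payload, now))

    def flush(self, now: float) -> List[Tuple[str, str, str]]:
        """Drain the cache once the server is available again.

        Fresh entries are replayed as updates; expired entries are
        converted into stop messages for the same user.
        """
        out: List[Tuple[str, str, str]] = []
        while self.pending:
            item = self.pending.popleft()
            expired = now - item.cached_at > self.ttl
            kind = "Accounting-Stop" if expired else "Interim-Update"
            out.append((kind, item.user, item.payload))
        return out
```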

[0013] Optionally, the method further includes: when the probe result indicates that the state of the target server has changed from available to unavailable, sending probe messages corresponding to various authentication scenarios to the backup server through a probe session with the backup server to obtain the probe result of the backup server; when the probe result of the backup server indicates that the backup server is available, marking the backup server as hot standby; and after marking the state of the target server as faulty, connecting new user traffic to the backup server marked as hot standby.
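The hot-standby handling can be sketched as two flags and a routing decision. The class and method names are hypothetical, and the case where the target server is faulty but no backup is hot standby is not covered by the paragraph, so this sketch returns `None` for it.

```python
from typing import Optional


class Failover:
    """Tracks backup readiness and decides where new users go."""

    def __init__(self) -> None:
        self.primary_faulty = False
        self.backup_hot_standby = False

    def on_primary_unavailable(self, backup_probe_ok: bool) -> None:
        # Probe the backup as soon as the target server turns unavailable.
        self.backup_hot_standby = backup_probe_ok

    def on_primary_faulty(self) -> None:
        self.primary_faulty = True

    def route_new_user(self) -> Optional[str]:
        if not self.primary_faulty:
            return "primary"
        if self.backup_hot_standby:
            return "backup"
        return None  # assumption: behaviour is unspecified in this case
```

Probing the backup before the target server is marked faulty means the routing decision needs no extra probe at the moment of switchover.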

[0014] In a second aspect, this disclosure also provides a server detection device, comprising: a session establishment module, configured to, in response to a target event being triggered, establish a detection session between the access device and the target server based on the source parameters with which the access device sends real authentication messages; a detection module, configured to send, in each of multiple detection cycles and based on the detection session, detection messages corresponding to various authentication scenarios to the target server to obtain a detection result of the target server, wherein the detection messages have the same message format as the service messages of the Remote Authentication Dial-In User Service (RADIUS); and a processing module, configured to: when the detection result indicates that the state of the target server has changed from available to unavailable, adjust the duration of the detection cycle from the first duration to the second duration, adjust the new user traffic weight of the target server to the first preset weight value, and enter the next detection cycle, wherein the first duration is longer than the second duration; and when the detection result indicates that the state of the target server has changed from unavailable to available, adjust the duration of the detection cycle from the second duration to the third duration, adjust the new user traffic weight of the target server from the first preset weight value to the second preset weight value, and enter the next detection cycle, wherein the second preset weight value is greater than the first preset weight value and less than the target traffic weight value of the target server, and the third duration is longer than the second duration and shorter than the first duration.

[0015] Optionally, the processing module is further configured to: determine the number of consecutive periods in which the target server is available within a plurality of detection cycles after the detection result indicates that the state of the target server has changed from unavailable to available; determine a target value for the new user traffic weight corresponding to the target server based on the number of consecutive periods in which the target server is available; adjust the user traffic weight of the target server to the target value; and, if the number of consecutive periods in which the target server is available reaches a first preset threshold, adjust the duration of the detection cycle from the third duration to the first duration.

[0016] Optionally, when the detection module sends detection packets corresponding to multiple authentication scenarios to the target server based on the detection session to obtain the detection result of the target server, it is configured to: within the same detection cycle, send the detection packets corresponding to the multiple authentication scenarios to the target server in succession based on the detection session; if a response packet from the target server to any of the detection packets is received within a first preset time, determine that the detection result of the target server is available; if no response packet to any of the detection packets is received from the target server within the first preset time, determine that the detection result of the target server is unavailable.

[0017] Optionally, when the detection module sends detection packets corresponding to multiple authentication scenarios to the server based on the detection session to obtain the detection result of the target server, it is configured to: send a first detection packet corresponding to a first authentication scenario among the multiple authentication scenarios to the target server based on the detection session; if a first response packet from the target server for the first detection packet is received within a second preset time, then the detection result of the target server is determined to be available; if the first response packet is not received within the second preset time, then send a second detection packet corresponding to a second authentication scenario among the multiple authentication scenarios to the target server based on the detection session; if a second response packet from the target server for the second detection packet is received within a third preset time, then the detection result of the target server is determined to be available; if the second response packet is not received, then the detection result of the target server is determined to be unavailable.

[0018] Optionally, the processing module is further configured to: mark the target server as having a minor fault state when a second response message from the target server to the second probe message is received within a third preset time period; and adjust the user traffic weight of the target server to a third preset weight value when the number of consecutive probe cycles in which the target server is marked as having a minor fault state reaches a second preset number threshold; wherein the third preset weight value is less than the target traffic weight value.

[0019] Optionally, the processing module is further configured to: maintain a disconnection counter for the target server; the initial value of the disconnection counter is a first value; if no first response message from the target server to the first probe message is received within a second preset time, the value of the disconnection counter is increased by a second value; and if the first response message from the target server to the first probe message is received within the second preset time, the value of the disconnection counter is set back to the initial value; if a second response message from the target server to the second probe message is received within a third preset time, the value of the disconnection counter is increased by a third value; and if the value of the disconnection counter is the sum of the first value, the second value, and the third value, the status of the target server is marked as faulty.

[0020] Optionally, the processing module is further configured to: mark the state of the target server as faulty if the number of consecutive cycles in which the detection result indicates that the state of the target server is unavailable reaches a third quantity threshold.

[0021] Optionally, the processing module is further configured to: retain the sessions of online users and cache the billing update messages of the online users when the state of the target server is marked as faulty; send the cached billing update messages to the target server after the state of the target server recovers to availability; and send a billing stop message to the target server when the cached billing update messages expire, so that the target server can perform billing processing on the online users corresponding to the cached billing update messages after the state recovers to availability.

[0022] Optionally, the processing module is further configured to: when the detection result indicates that the state of the target server has changed from available to unavailable, send detection messages corresponding to various authentication scenarios to the backup server through a detection session with the backup server to obtain the detection result of the backup server; when the detection result of the backup server indicates that the backup server is available, mark the backup server as hot standby; and after marking the state of the target server as faulty, connect new user traffic to the backup server marked as hot standby.

[0023] Thirdly, an optional implementation of this disclosure also provides an access device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the steps of the first aspect above, or any possible implementation of the first aspect.

[0024] Fourthly, an optional implementation of this disclosure also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the first aspect or any possible implementation of the first aspect.

[0025] Fifthly, an optional implementation of this disclosure also provides a computer program product carrying program code, the program code including instructions that can be used to perform the steps of the method described in the first aspect or any possible implementation of the first aspect.

[0026] For a description of the effects of the aforementioned server detection device, access device, and computer-readable storage medium, please refer to the description of the server detection method above; it will not be repeated here.

[0027] It should be understood that the above general description and the following detailed description are exemplary and explanatory only. In order to make the above-mentioned objects, features and advantages of this disclosure more apparent and understandable, preferred embodiments are described in detail below with reference to the accompanying drawings.

[0028] The server detection method provided in this disclosure pre-establishes a detection session for the server based on the source parameters with which the access device sends real authentication messages. It periodically sends detection messages to the server over this detection session and determines the detection result based on the server's responses. When the detection result indicates that the target server's status has changed from available to unavailable, the duration of the detection cycle is adjusted from a first duration to a second duration. This increases the detection frequency while the server is unavailable, so that use of the server can be restored quickly once it becomes available again. At the same time, the new user traffic weight of the server is adjusted to a first preset weight value, so that less new user traffic, or none at all, is allocated to the server. When the detection result indicates that the target server's status has changed from unavailable to available, the duration of the detection cycle is adjusted from the second duration to a third duration, and the new user traffic weight of the target server is adjusted from the first preset weight value to a second preset weight value, which is less than the target traffic weight value of the target server. In this way, when the server has only just been detected as available, its traffic capacity is first restored partially. This avoids the pressure that a full restoration would place on the server, reduces state fluctuations caused by load pressure, and ensures that most new users are not disturbed when the server's state fluctuates.

Description of the Drawings

[0029] Figure 1 is a specific example of message interaction for 802.1X authentication, illustrated in an exemplary embodiment of this disclosure;
Figure 2 is a specific example of a process for performing server probing based on a user authentication or accounting request, illustrated in an exemplary embodiment of this disclosure;
Figure 3 is a flowchart of a server detection method according to an exemplary embodiment of this disclosure;
Figure 4 is a schematic diagram of an access device according to an exemplary embodiment of this disclosure;
Figure 5 is a schematic diagram of a server detection device according to an exemplary embodiment of this disclosure.

Detailed Description

[0030] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this disclosure as detailed in the appended claims.

[0031] The terminology used in this disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The singular forms “a,” “an,” and “the” as used in this disclosure and the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise. It should also be understood that the term “and / or” as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.

[0032] It should be understood that although the terms first, second, third, etc., may be used in this disclosure to describe various information, such information should not be limited by these terms. These terms are used only to distinguish information of the same type from one another. For example, without departing from the scope of this disclosure, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when," "upon," or "in response to determining."

[0033] Figure 1 shows a specific example of message exchange for 802.1X authentication. The process involves the client, the access device, and the RADIUS server, and the message exchange proceeds as follows: 1. Client-initiated authentication phase: ① The client actively sends an EAPOL-Start message to the access device, triggering the 802.1X authentication process.

[0034] ② The access device sends an EAP-Request / Identity to the client to request the client's identity information (username).

[0035] ③ The client sends an EAP-Response / Identity message to the access device to reply with its own identity information. The access device then obtains the identity information.

[0036] 2. CHAP Challenge - Response Phase: ① After receiving the username sent by the client, the access device sends an EAP-Request / challenge to the client, carrying a CHAP challenge random number in it.

[0037] ② The client uses the password to perform a hash calculation on the challenge random number, encapsulates the result into a CHAP-Response, and sends the CHAP-Response in EAP-Response / challenge to the access device.

[0038] 3. RADIUS Server Authentication Phase: ①: The access device encapsulates the client's CHAP response into a RADIUS message, carries it in the RADIUS Access-Request (CHAP-Response), and sends it to the RADIUS server to initiate an authentication request.

[0039] ②: After the RADIUS server verifies the validity of the hash result, it returns a challenge response message indicating that the authentication has been successful, which is to send a RADIUS Access-Challenge (CHAP-Success) message to the access device.

[0040] 4. Port Authorization and Billing Phase: ①: The access device sends an authentication success message EAP-Success to the client, switches the port to authorized status, and allows the client to access the network.

[0041] ②: The access device sends a RADIUS Accounting-Request to the RADIUS server to record user online information.

[0042] ③: The RADIUS server confirms the billing request and completes the billing session establishment by sending a RADIUS Accounting-Response to the access device.
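The hash calculation in step ② of the CHAP challenge-response phase is not spelled out in the text. Assuming standard CHAP as defined in RFC 1994, the response value is the MD5 digest of the identifier octet, the shared secret, and the challenge, concatenated in that order. The function name and the sample inputs below are illustrative only.

```python
import hashlib


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """CHAP response per RFC 1994: MD5(identifier || secret || challenge)."""
    if not 0 <= identifier <= 255:
        raise ValueError("identifier must fit in one octet")
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


# Example with made-up values; the result is always 16 bytes long.
digest = chap_response(1, b"example-password", bytes(16))
```

Because the challenge is random per attempt, the same password yields a different response each time, which is what lets the verifier check the password without it crossing the link in clear text.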

[0043] In the above process, after receiving the terminal credentials, the access device constructs an Access-Request (carrying attributes such as User-Name, CHAP-Password, or EAP-Message) and sends it to the RADIUS server. After verification, the RADIUS server returns to the access device an Access-Accept (carrying authorization attributes such as Filter-ID and Tunnel-Private-Group-ID), an Access-Reject, or an Access-Challenge. If an Access-Challenge is received, the access device completes a further handshake with the terminal (EAP-Request / Response or CHAP-Challenge / Response) and resends the Access-Request until the request is accepted or rejected. While the user is online, the access device periodically sends Accounting-Requests (Start / Interim-Update / Stop) for billing.
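For reference, the on-wire layout of an Access-Request under RFC 2865 is a 20-byte header (code, identifier, length, 16-byte request authenticator) followed by type-length-value attributes. The sketch below encodes only a User-Name attribute; a real request would also carry credential attributes such as CHAP-Password or EAP-Message, and the user name `probe-user` is a hypothetical placeholder.

```python
import os
import struct
from typing import Optional

ACCESS_REQUEST = 1   # RADIUS code for Access-Request (RFC 2865)
ATTR_USER_NAME = 1   # attribute type for User-Name (RFC 2865)


def encode_attribute(attr_type: int, value: bytes) -> bytes:
    """Encode one attribute as type (1 byte), length (1 byte), value."""
    if len(value) > 253:
        raise ValueError("attribute value too long")
    return struct.pack("!BB", attr_type, len(value) + 2) + value


def build_access_request(identifier: int, user_name: str,
                         authenticator: Optional[bytes] = None) -> bytes:
    """Build a minimal Access-Request carrying only User-Name."""
    if authenticator is None:
        authenticator = os.urandom(16)  # random request authenticator
    if len(authenticator) != 16:
        raise ValueError("authenticator must be 16 bytes")
    attrs = encode_attribute(ATTR_USER_NAME, user_name.encode("utf-8"))
    length = 20 + len(attrs)  # header plus attributes
    header = struct.pack("!BBH", ACCESS_REQUEST, identifier, length)
    return header + authenticator + attrs


packet = build_access_request(7, "probe-user")
```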

[0044] Therefore, the RADIUS server becomes a single choke point in the authentication chain. New users must complete a request-to-response exchange before going online; the session persistence, billing updates, and logout of existing online users also rely on this link. In scenarios such as large campuses and carrier wireless local area networks (WLANs), a single access device typically connects to multiple RADIUS servers (primary / backup / weighted) and selects among them based on priority and weight. RADIUS, an application-layer protocol based on UDP, inherently lacks a reliable end-to-end confirmation mechanism. If a server experiences a failure such as protocol stack deadlock, database hang, shared key drift, or link interruption, a traditional access device can only wait for a real user authentication to time out (20–60 seconds) before switching over. During this period, new users are rejected and existing users cannot log off, creating a silent failure that directly impacts user experience.

[0045] Figure 2 illustrates a specific process for server probing based on user authentication or accounting requests. After sending a request, the switch starts a timer. If a response is received from the server within the specified time, the server is considered to be working normally; if no response is received before the timeout, the request is retransmitted. If no response is received after the preset maximum number of retransmissions, the switch marks the RADIUS server's status as "Down" or "Abnormal".
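The timer-and-retransmit process just described can be sketched as a simple loop. `send_and_wait` is a hypothetical callable that sends the request and reports whether a response arrived within the timeout; the "Up" label and the default values are assumptions.

```python
from typing import Callable


def probe_with_retries(send_and_wait: Callable[[float], bool],
                       timeout: float = 3.0,
                       max_retransmissions: int = 3) -> str:
    """Send once, then retransmit up to `max_retransmissions` times.

    Returns "Up" on the first response, or "Down" if every attempt
    times out.
    """
    for _ in range(1 + max_retransmissions):
        if send_and_wait(timeout):
            return "Up"
    return "Down"
```

With these assumed defaults a dead server is only detected after four timeouts, roughly 12 seconds, which illustrates why relying on real request timeouts delays switchover.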

[0046] Current server probing methods send probe messages to the server. If a response is received within a certain time after a probe message is sent, the server is considered available; otherwise, it is considered unavailable. However, this probing method has the following problems. First, current detection methods resume allocating user traffic to a server as soon as it is detected as available. However, a server can suffer from various problems, and its status may fluctuate over time, for example by repeatedly switching between available and unavailable. This fluctuation forces clients that access the server during this period to re-authenticate on other servers, degrading the user experience.

[0047] Secondly, the current detection method, after detecting that the server is unavailable, will switch new traffic to other servers. For user devices that are already connected to the server, the access device will send a disconnect message to the user device according to the original logic. After receiving the message, the user device will disconnect from the original server and re-initiate the authentication request, thereby degrading the user experience.

[0048] To address the aforementioned issues, this disclosure provides a server detection method. This method pre-establishes a detection session for the server based on the source parameters with which the access device sends real authentication messages. It periodically sends detection messages to the server over this detection session and determines the detection result based on the server's responses. When the detection result indicates that the target server's status has changed from available to unavailable, the duration of the detection cycle is adjusted from a first duration to a second duration. This increases the detection frequency while the server is unavailable, so that use of the server can be restored quickly once it becomes available again. At the same time, the new user traffic weight of the server is adjusted to a first preset weight value, so that the allocation of new user traffic to the server is reduced or stopped. When the detection result indicates that the target server's status has changed from unavailable to available, the duration of the detection cycle is adjusted from the second duration to a third duration, and the new user traffic weight of the target server is adjusted from the first preset weight value to a second preset weight value, which is less than the target traffic weight value of the target server. In this way, when the server has only just been detected as available, its traffic capacity is first restored partially. This avoids the pressure that a full restoration would place on the server, reduces state fluctuations caused by load pressure, and ensures that most new users are not disturbed when the server's state fluctuates.

[0049] In addition, in another embodiment of this disclosure, a keep-alive mechanism is adopted for online users when the server is unavailable, ensuring that the services of online users are not interrupted during server failure and improving the user experience.

[0050] The shortcomings of the above solutions are the result of the inventor's practical experience and careful research. Therefore, the discovery process of the above problems and the solutions proposed in this disclosure below should be considered as the inventor's contribution to this disclosure.

[0051] To facilitate understanding of the technical solutions disclosed herein, the technical terms used in the embodiments of this disclosure are first explained:
EAP: Extensible Authentication Protocol.

[0052] EAPOL-Start: EAP over LAN – Start.

[0053] LAN: Local Area Network.

[0054] Request / Identity: Identity information request.

[0055] Response / Identity: Identity response or identity reply.

[0056] CHAP: Challenge-Handshake Authentication Protocol.

[0057] RADIUS: Remote Authentication Dial-In User Service.

[0058] SW: Access Switch, responsible for forwarding user authentication requests to the RADIUS server.

[0059] Probe-Challenge: CHAP-Challenge messages used for probing, without carrying real user data.

[0060] Probe-Access: Uses RADIUS Access-Request messages for probing, without carrying real user data.

[0061] To facilitate understanding of this embodiment, a server detection method disclosed in this disclosure will first be described in detail. The execution entity of the server detection method provided in this disclosure is generally an access device, such as a switch, wireless access point, wireless access controller (AC / WAC), broadband access or gateway device, broadband remote access server, enterprise-level gateway or router, etc. These access devices have port control capabilities and can interface with a RADIUS server for AAA authentication. In some possible implementations, the server detection method can be implemented by a processor calling computer-readable instructions stored in memory.

[0062] The server detection method provided in the embodiments of this disclosure will be described below.

[0063] Figure 3 shows a flowchart of a server detection method provided in this embodiment of the present disclosure. The method is applied to an access device and includes steps S301 to S304, wherein:
S301: In response to the target event being triggered, a probe session is established between the access device and the target server based on the source parameters that the access device uses when sending real authentication messages.

[0064] S302: In each of multiple probing cycles, probing messages corresponding to various authentication scenarios are sent to the target server over the probing session to obtain the probing result of the target server; the probing messages have the same message format as the service messages of the Remote Authentication Dial-In User Service (RADIUS).
S303: If the detection result indicates that the state of the target server has changed from available to unavailable, the duration of the detection cycle is adjusted from a first duration to a second duration, the new user traffic weight of the target server is adjusted to a first preset weight value, and the next detection cycle is entered; the first duration is longer than the second duration.
S304: If the detection result indicates that the state of the target server has changed from unavailable to available, the duration of the detection cycle is adjusted from the second duration to a third duration, the new user traffic weight of the target server is adjusted from the first preset weight value to a second preset weight value, and the next detection cycle is entered; the second preset weight value is greater than the first preset weight value and less than the target traffic weight value of the target server; the third duration is longer than the second duration and shorter than the first duration.
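The orderings that S303 and S304 impose on the three durations and the three weight values can be checked mechanically. The following is a minimal sketch only: the class name `ProbeConfig`, its field names, and the weight defaults are hypothetical, and the duration defaults are borrowed from the example values given later in this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeConfig:
    # Hypothetical container; only the orderings below come from S303/S304.
    first_duration_s: float = 30.0   # cycle length while the server is available
    second_duration_s: float = 3.0   # cycle length while the server is unavailable
    third_duration_s: float = 15.0   # cycle length right after recovery
    first_weight: float = 0.0        # new-user weight while unavailable (illustrative)
    second_weight: float = 0.1       # new-user weight right after recovery (illustrative)
    target_weight: float = 1.0       # new-user weight in normal operation (illustrative)

    def validate(self) -> None:
        """Raise ValueError if the orderings required by S303/S304 do not hold."""
        if not (self.second_duration_s < self.third_duration_s < self.first_duration_s):
            raise ValueError("need second < third < first duration")
        if not (self.first_weight < self.second_weight < self.target_weight):
            raise ValueError("need first < second < target weight")
```

Calling `ProbeConfig().validate()` passes with the defaults; a configuration whose third duration exceeds the first duration is rejected.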

[0065] The following provides a detailed explanation of S301 to S304.

[0066] Regarding the above S301: In practice, target events include, for example, status change events corresponding to access devices and status change events corresponding to servers. Status change events corresponding to access devices include, for example, the startup of relevant systems deployed on the access device, or a change in the access device's status, such as a change in any of the access device's source parameters (local source IP address, configured VLAN interface, or outgoing interface). Since server probing requires establishing a probing session with the server based on these source parameters, if these source parameters change, a new probing session needs to be established based on the changed source parameters.

[0067] Server status change events include, for example, server startup, or the addition, deletion, or modification of configurations on a server that is already started.

[0068] When the target event is triggered, the access device will establish a probe session with the server based on the source parameters used when sending the real authentication message.

[0069] Here, source parameters include, for example, the source IP address, VLAN interface, and outgoing interface of the access device used when performing genuine 802.1X / Portal authentication. This ensures that probe packets and genuine service packets have the same source IP, same route, and same firewall policy, completely eliminating false positives caused by "probe success but authentication failure".

[0070] Specifically, when establishing a probe session with the server, for example, the following method can be used: based on the source parameters, establish a Transmission Control Protocol (TCP) connection or a User Datagram Protocol (UDP) connection with the server. This probe session includes either a TCP connection or a UDP connection established with the server.
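As an illustration of binding the probe session to the same source address as real authentication traffic, the following minimal sketch opens a UDP socket with a fixed source IP. The function name `open_probe_session` is hypothetical, port 1812 is the standard RADIUS authentication port, and the VLAN interface and outgoing interface parameters are not handled here because they depend on the platform.

```python
import socket


def open_probe_session(source_ip: str, server_ip: str, server_port: int = 1812) -> socket.socket:
    """Open a UDP socket bound to the source IP that real authentication traffic uses."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((source_ip, 0))               # port 0: let the OS pick an ephemeral source port
    sock.connect((server_ip, server_port))  # fixes the peer address; UDP sends no packet here
    return sock
```

If a source parameter changes, the old socket would be closed and a new session opened with the new source address, mirroring the re-establishment described for the target event.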

[0071] Regarding the above S302: In practice, after establishing a probe session with the target server, the access device periodically sends probe packets to the target server. Different probe packets simulate different authentication scenarios, achieving heterogeneous redundancy coverage of the protocol path.

[0072] For example, the authentication scenarios and corresponding probe messages may include: (1) Probe-Access: Simulates Access-Request scenarios such as PAP, MS-CHAP, and EAP-MD5. The constructed probe message carries forged User-Name, NAS-IP-Address, NAS-Port, Called-Station-Id (called station identifier, i.e., the identifier of the access device, such as its MAC address), and Calling-Station-Id (calling station identifier, i.e., the identifier of the client, such as the MAC address of the terminal) attributes. However, the CHAP-Password field (the password value obtained after the client encrypts the Challenge) or the EAP-Message field (the message carrier of the EAP authentication protocol) in the probe message is filled with random numbers and does not contain any real user information.

[0073] (2) Probe-Challenge: Simulates the CHAP two-way handshake interaction scenario. The constructed probe message includes: a forged CHAP-Challenge attribute. The Challenge value and Identifier are randomly generated and do not carry real user data.

[0074] In addition, there may be other authentication scenarios, but the specific embodiments disclosed herein are not limited to these.
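To make the Probe-Access construction concrete, the following minimal sketch assembles a packet in the standard RADIUS layout (code, identifier, length, 16-byte authenticator, then type-length-value attributes). The function names, the placeholder user name, and the placeholder MAC strings are hypothetical; the attribute numbers are the standard RADIUS ones. A real deployment would also need whatever integrity attributes the server requires, which are omitted here.

```python
import os
import socket
import struct

ACCESS_REQUEST = 1  # RADIUS packet code for Access-Request
USER_NAME, CHAP_PASSWORD, NAS_IP_ADDRESS, NAS_PORT = 1, 3, 4, 5
CALLED_STATION_ID, CALLING_STATION_ID = 30, 31


def attr(attr_type: int, value: bytes) -> bytes:
    """Encode one RADIUS attribute as type, length, value."""
    return struct.pack("!BB", attr_type, len(value) + 2) + value


def build_probe_access(nas_ip: str, nas_port: int = 0) -> bytes:
    """Build an Access-Request-shaped probe whose credential field is random bytes."""
    attrs = b"".join([
        attr(USER_NAME, b"probe-user"),                  # placeholder, not a real account
        attr(CHAP_PASSWORD, os.urandom(17)),             # 1-byte CHAP ident + 16-byte response, all random
        attr(NAS_IP_ADDRESS, socket.inet_aton(nas_ip)),
        attr(NAS_PORT, struct.pack("!I", nas_port)),
        attr(CALLED_STATION_ID, b"00-00-5E-00-53-01"),   # placeholder access-device MAC
        attr(CALLING_STATION_ID, b"00-00-5E-00-53-02"),  # placeholder client MAC
    ])
    identifier = os.urandom(1)[0]
    authenticator = os.urandom(16)                       # random Request Authenticator
    header = struct.pack("!BBH", ACCESS_REQUEST, identifier, 20 + len(attrs))
    return header + authenticator + attrs
```

Because the credential bytes are random, the server is expected to answer with a rejection, and any such answer still shows that the decoding path is alive.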

[0075] When sending probe messages, either method a1 or a2 can be used:
a1: Within the same detection period, probe messages corresponding to the various authentication scenarios are sent to the target server in succession over the detection session.

[0076] If a response message from the target server to any of the probe messages is received within a first preset time, the target server is determined to be available; if no response message to any of the probe messages is received within the first preset time, the target server is determined to be unavailable.

[0077] For example, with the two authentication scenarios above, the two corresponding probe messages are sent 10 milliseconds (ms) apart. Within one round-trip time (RTT), this covers the two code paths most prone to failure in mainstream authentication processes such as 802.1X and Portal, namely "Access-Request decoding" and the "CHAP-Challenge state machine", thus achieving protocol-level deep liveness detection.
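The decision rule of method a1 reduces to "any response in time means available". A minimal sketch, assuming a hypothetical function `a1_result` that receives the response delay of each probe in seconds, or `None` when no response arrived:

```python
from typing import Mapping, Optional


def a1_result(response_delays_s: Mapping[str, Optional[float]],
              first_preset_time_s: float) -> bool:
    """Return True (available) if any probe was answered within the first preset time."""
    return any(
        delay is not None and delay <= first_preset_time_s
        for delay in response_delays_s.values()
    )
```

For instance, `a1_result({"Probe-Access": None, "Probe-Challenge": 0.2}, 3.0)` evaluates to `True`, since one of the two probes was answered in time.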

[0078] a2: Within the same probing period, a first probing message corresponding to a first authentication scenario among the multiple authentication scenarios is sent to the target server over the probing session. If a first response message from the target server to the first probing message is received within a second preset time, the target server is determined to be available. If the first response message is not received within the second preset time, a second probing message corresponding to a second authentication scenario among the multiple authentication scenarios is sent to the target server over the probing session. If a second response message from the target server to the second probing message is received within a third preset time, the target server is determined to be available; if the second response message is not received within the third preset time, the target server is determined to be unavailable.

[0079] In this way, within the same detection cycle, the detection message for the first authentication scenario is sent first. The detection message for the second authentication scenario is sent only if no response is received from the target server before the timeout. While the server is operating stably, a single detection message therefore suffices to determine the server status, saving about 50% of the detection messages. The second detection message is triggered only when a fault is suspected, achieving "light load in normal operation, deep detection on fault".

[0080] For example, taking Probe-Access as the first authentication scenario and Probe-Challenge as the second authentication scenario, the process of sending probe packets corresponding to various authentication scenarios to the target server based on the probe session is explained as follows: Step 1: Forge a Probe-Access message as a probe message, send this probe message to the target server, and start the T1 timeout timer (default 3s). This probe message simulates Access-Request scenarios such as PAP, MS-CHAP, and EAP-MD5, carrying forged standard attributes such as User-Name, NAS-IP-Address, NAS-Port, Called-Station-Id, and Calling-Station-Id, but the CHAP-Password or EAP-Message field is filled with random numbers and does not contain any real user privacy information.

[0081] Step 2: Decision based on the target server's response to the first probe message. Within the T1 timeout window, if any RADIUS response (Access-Accept / Reject / Challenge) is received: the server protocol stack is deemed normal, the disconnection counter is immediately cleared, step 3 is skipped, and the next probe cycle is directly entered; if T1 times out (no response is received within 3 seconds), step 3 is triggered, and the second probe message is sent.

[0082] Step 3: Forge a Probe-Challenge message as a second probe message, send this probe message to the target server, and start the T2 timeout timer (default 3s). This probe message carries a forged CHAP-Challenge attribute; the Challenge value and Identifier are randomly generated, and it also does not carry real user data. The Probe-Challenge simulates the CHAP two-way handshake scenario and belongs to a different RADIUS message type than the Probe-Access message. They respectively cover the "Access-Request decoding path" and the "CHAP-Challenge state machine path." Even if the server does not respond to the first message due to a specific policy, the second heterogeneous message may trigger a response, reducing single point of failure.

[0083] Step 4: Within the T2 timeout window, determine whether a response to the second probe message (or the first probe message) has been received from the target server. If any RADIUS response is received within T2, the server's CHAP path is considered normal, and the probe result of the target server is determined to be available. If no response is received within T2, the probe result of the target server is determined to be unavailable.
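Steps 1 to 4 can be sketched as a short sequential routine. This is an illustration under assumptions: `send_and_wait` is a hypothetical callback that sends the named probe and reports whether any RADIUS response arrived before the timeout, and the sketch does not model a late response to the first probe arriving during the T2 window.

```python
from typing import Callable, Tuple

# Hypothetical callback: send_and_wait(probe_name, timeout_s) -> True if a response arrived in time.
SendAndWait = Callable[[str, float], bool]


def a2_probe(send_and_wait: SendAndWait, t1_s: float = 3.0, t2_s: float = 3.0) -> Tuple[bool, int]:
    """Run one a2 detection cycle; return (available, number_of_probes_sent)."""
    if send_and_wait("Probe-Access", t1_s):
        return True, 1     # steps 1-2: first probe answered, step 3 is skipped
    if send_and_wait("Probe-Challenge", t2_s):
        return True, 2     # steps 3-4: second probe answered
    return False, 2        # neither probe answered: unavailable
```

The second element of the result shows the saving in the stable case: a healthy server costs one probe per cycle instead of two.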

[0084] With either of the two methods described above, the detection result for the target server in each detection cycle can be obtained. The two methods can also be combined. For example, while the recorded status of the target server is available, method a2 can be used; while the recorded status is unavailable, method a1 can be used, which detects a change in the target server's status faster. Other detection methods are also possible. For example, in different detection cycles, only selected scenarios from the multiple authentication scenarios may be probed; if the target server is detected as unavailable in a given detection cycle, all authentication scenarios are probed in the next detection cycle. The embodiments of this disclosure are not limited in this respect.

[0085] In another embodiment of this disclosure, the failure of the target server may also gradually accumulate and intensify over time until it transitions from an available state to an unavailable state. During this gradual accumulation of failure, there may be instances where the target server can only respond to a portion of the probe packets in a timely manner. In this case, if excessive new user traffic continues to be allocated to the target server, it will exacerbate and accelerate the failure process, causing the target server to enter an unavailable state more quickly. Therefore, in another embodiment of this disclosure, by reducing the new user traffic allocated to the target server during this failure accumulation process, the load pressure on the target server is alleviated. This not only reduces the probability of the target server failure but also reduces the maintenance overhead for users by the access device after the target server enters a failure state. The maintenance process for users by the access device after the target server enters a failure state can be found in the following embodiments and will not be repeated here. Specifically, the following methods can be used to reduce the new user traffic allocated to the target server during the failure accumulation process: If the first response message of the target server to the first probe message is not received within the second preset time, but the second response message of the target server to the second probe message is received within the third preset time, the target server is marked as a minor fault state. If the number of consecutive detection cycles in which the target server is marked as having a minor fault reaches a second preset threshold, the user traffic weight of the target server is adjusted to a third preset weight value; the third preset weight value is less than the target traffic weight value.

[0086] Here, for example, if the first response message from the target server to the first probe message is not received within the second preset time, but the second response message from the target server to the second probe message is received within the third preset time, this indicates that some authentication scenarios may be unavailable. In this case, the target server can be marked as being in a minor fault state: although the target server is available, it may have a minor fault that prevents it from responding to messages for certain authentication scenarios or causes it to respond slowly.

[0087] If the number of consecutive detection cycles in which the target server is marked as having a minor fault reaches a second preset threshold, the user traffic weight of the target server will be adjusted to a third preset weight value. This third preset weight value is less than the target traffic weight value of the target server. In this way, the new user traffic allocated to the target server will be reduced, so as to gradually reduce the new user traffic allocated to the target server during the process of fault accumulation.
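The minor-fault rule amounts to counting consecutive cycles in which only the second probe was answered. A minimal sketch, with the class name `MinorFaultTracker` and its method names being hypothetical; it covers only the minor-fault path and leaves the fully unavailable case to the S303 handling.

```python
class MinorFaultTracker:
    """Count consecutive cycles in which only the second probe was answered."""

    def __init__(self, second_preset_threshold: int):
        self.second_preset_threshold = second_preset_threshold
        self.streak = 0

    def update(self, first_answered: bool, second_answered: bool) -> bool:
        """Record one cycle; return True when the third preset weight should be applied."""
        if not first_answered and second_answered:
            self.streak += 1   # minor fault: only the second probe got a response
        else:
            self.streak = 0    # any other outcome breaks the consecutive run
        return self.streak >= self.second_preset_threshold
```

With a threshold of 2, the first minor-fault cycle returns `False` and the second returns `True`, at which point the new user traffic weight would be lowered to the third preset weight value.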

[0088] In another embodiment of this disclosure, to reduce the transition from the target server's fault accumulation process to a fault state caused by excessive user traffic, the target server's state can be directly switched to unavailable after the fault accumulation process reaches a certain level. At this time, the target server has not actually entered a fault state and can still provide services to the already connected user traffic. However, the access device will consider the target server unavailable and perform a detection process until the target server returns to normal. To implement the above solution, in another embodiment of this disclosure, a disconnection counter is maintained for the target server; the initial value of the disconnection counter is a first value. If no first response message is received from the target server in response to the first probe message within the second preset time, the value of the disconnection counter is increased by a second value; and if a first response message is received from the target server in response to the first probe message within the second preset time, the value of the disconnection counter is set back to the initial value. If a second response message is received from the target server in response to the second probe message within the third preset time, the value of the disconnection counter is increased by a third value. If the value of the disconnection counter is the sum of the first, second, and third values, the status of the target server will be switched to unavailable.

[0089] For example, a disconnection counter can be maintained for the target server. The initial value of this disconnection counter is, for example, a first value, such as 0, 1, 2, etc., which can be set according to actual needs, and this embodiment does not specify the value. When the count value of the disconnection counter is the first value, it indicates that the target server is in a normal working state, such as state S0.

[0090] If no first response message from the target server is received in response to the first probe message within the second preset time, the value of the disconnection counter is increased by a second value; and if a first response message from the target server is received in response to the first probe message within the second preset time, the value of the disconnection counter is set back to the initial value.

[0091] If a second response message is received from the target server in response to the second probe message within the third preset time, the value of the disconnection counter is increased by a third value. If the value of the disconnection counter is the sum of the first, second, and third values, the status of the target server will be switched to unavailable.

[0092] For example, the disconnection counter is used to count the number of consecutive probe cycles in which the target server is marked as having a minor fault state, and the initial value can be, for example, 0.

[0093] Within each probing cycle, if a first probe message is sent to the target server and no first response message is received from the target server, the disconnection counter is incremented by 1. In this situation the access device also sends a second probe message to the target server. If no second response message is received from the target server within the third preset time, the disconnection counter is incremented by 2. The disconnection counter then reads 3, which equals the sum of the first, second, and third values (0 + 1 + 2), indicating that the target server is unavailable.

[0094] In addition, this disconnection counter can also be used to count the number of consecutive probe cycles in which the target server is marked as having a minor fault state. When using the count value of the disconnection counter to indicate the number of consecutive probe cycles in which the target server is marked as having a minor fault state, a second preset quantity threshold is, for example, the sum of the first, second, and third values of the counter.

[0095] If, in multiple consecutive probing cycles, a second response message is received but the first response message is not received, the disconnection counter will be incremented by a first value in each probing cycle. When the disconnection counter reaches the sum of the first, second, and third values, indicating that the number of consecutive probing cycles in which the target server is marked as having a minor fault has reached a second preset threshold, the target server's state will be switched to unavailable, and the probing process for the target server in the unavailable state will be executed.

[0096] During this process, the counter can be continuously incremented. That is, as the detection cycle increases, if there are instances where the first response message is not received but the second response message is received, the disconnection counter can be continuously incremented. When the first response message is received in a certain detection cycle, or when both the first and second response messages are received, the disconnection counter value can be reset to the initial value, meaning the target server's status switches to available.
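The counter behaviour can be sketched with the values from the worked example (initial value 0, plus 1 for a missed first response, plus 2 for a missed second response). This is one possible reading only: the class and method names are hypothetical, and treating "reaches or exceeds the sum" as the switching condition is an assumption made so that accumulated minor-fault cycles and a full miss are handled by the same comparison.

```python
class DisconnectionCounter:
    """Disconnection counter using the example values 0, 1 and 2."""

    def __init__(self, first_value: int = 0, second_value: int = 1, third_value: int = 2):
        self.first_value = first_value
        self.second_value = second_value
        self.third_value = third_value
        self.value = first_value

    def record_cycle(self, first_answered: bool, second_answered: bool) -> bool:
        """Apply one cycle's outcome; return True if the server should be marked unavailable."""
        if first_answered:
            self.value = self.first_value     # a first response resets the counter
            return False
        self.value += self.second_value       # first probe timed out
        if not second_answered:
            self.value += self.third_value    # second probe timed out as well
        # Assumption: switch once the counter reaches or exceeds the sum of the three values.
        return self.value >= self.first_value + self.second_value + self.third_value
```

Under these assumptions, a single cycle in which both probes time out switches the server to unavailable immediately, while three consecutive minor-fault cycles reach the same total more gradually.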

[0097] Regarding S303 and S304 above: In specific implementations, when the target server is in an available state, it responds to at least one of the multiple probe messages within the same probe period. If the target server responds to all probe messages, it is considered to be in a normal available state, which can be marked as S0. If the target server responds to only some of the multiple probe messages, it is considered to be in a state with minor faults but still usable. Alternatively, if the target server was unavailable in the previous probe period and is available in the current probe period, then although the server has recovered, its available status may be unstable and may change back to unavailable at any time. In this case, the target server's state can also be considered a state with minor faults but still usable. This state with minor faults but still usable can be marked as S1.

[0098] If the target server is in an unavailable state, it does not respond to any of the multiple probe packets within the same probe period. In this case, the target server's state is marked as S2.

[0099] The access device records the current state of the target server. After obtaining the detection result of the target server through the above process, the following situations may occur:
b1: If the probe result is available and the recorded current status of the target server is also available, the target server is in normal use. The access device waits for the current probe cycle to end and then starts the next probe cycle, performing the same probe process. In this case, the probe cycle duration is the first duration.

[0100] b2: If the probe result is unavailable but the recorded current status of the target server is available, the target server has gone from normal use into a fault state. In this case, the following two operations can be performed:
1. Adjust the duration of the detection cycle from the first duration to the second duration. The first duration is longer than the second duration. When the target server transitions from a normal operating state to a fault state, the detection cycle is shortened to increase the detection frequency, so that the target server can be put back into use as soon as possible after it recovers.

[0101] For example, the switch maintains multiple probing cycles for each server: (1) Normal cycle X: Default 30 seconds (s), used when the target server is in the state of S0 (available); that is, the duration of the probe cycle is the first duration, which is 30 seconds.

[0102] (2) Acceleration cycle Y: Default 3s, used when the status is marked as S2 / S3 (fault / unavailable).

[0103] This variable frequency detection strategy keeps the normal bandwidth / CPU usage at <0.1%, while once it enters fault acceleration mode, it can complete three protocol-level detections within 9 seconds, achieving second-level fault confirmation.

[0104] 2. Adjust the new user traffic weight of the target server to the first preset weight value.

[0105] Here, under normal server conditions, the access device allocates new user traffic to each target server according to a certain traffic strategy, while maintaining the original user traffic for each server. Typically, when allocating new user traffic to different servers, a new user traffic allocation weight can be set for each server. When a new user connects, the user traffic can be allocated to different servers according to this allocation weight. In this embodiment of the disclosure, the new user traffic allocation weight value set for the server under normal server conditions is the target traffic weight value.

[0106] When the target server's status changes from available to unavailable, its new user traffic weight will be adjusted to a first preset weight value. This first preset weight value is the maximum weight value of new user traffic allocated to the target server after its status changes to unavailable. For example, it can be 0%, meaning that no new user traffic will be allocated to the target server after its status changes to unavailable. Furthermore, in some cases, such as when other servers are under high load and allocating more new user traffic to them could lead to their failure, the first preset weight value can be set to a relatively low value instead of 0%. This distributes the pressure on other servers, preventing them from failing due to excessive load before the target server becomes available again.
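One simple way to allocate new users according to per-server weights is weighted random selection. The sketch below is an illustration, not the allocation strategy of this disclosure; the function name `pick_server` and the choice of random selection are assumptions.

```python
import random
from typing import Mapping, Optional


def pick_server(weights: Mapping[str, float],
                rng: Optional[random.Random] = None) -> Optional[str]:
    """Pick a server for a new user in proportion to its new-user traffic weight."""
    rng = rng or random.Random()
    candidates = [(name, w) for name, w in weights.items() if w > 0]
    if not candidates:
        return None   # no server currently accepts new users
    names, ws = zip(*candidates)
    return rng.choices(names, weights=ws, k=1)[0]
```

A server whose weight has been set to 0 is never selected, while a server with a small non-zero first preset weight still receives a correspondingly small share of new users.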

[0107] Furthermore, in another embodiment of this disclosure, if the number of consecutive cycles in which the detection results indicate that the state of the target server is unavailable reaches a third quantity threshold, the state of the target server is marked as faulty.

[0108] Specifically, since the target server may experience state fluctuations, there might be a situation where the detection result for a certain period is unavailable, while the detection results for other adjacent detection periods are available. To avoid the computational overhead caused by frequent adjustments to user traffic, for detection periods where the detection result is unavailable but the recorded current state of the target server is available, instead of directly adjusting the new user traffic weight of the target server to 0%, the duration of the detection period is reduced, and the detection frequency is increased. If the detection result indicates that the target server is unavailable after N consecutive detection periods, the target server's state can be marked as faulty. Here, N represents a third quantity threshold and is an integer greater than 0. The value of N can be set according to actual needs, and this embodiment does not impose any limitations. After marking the target server's state as faulty, its new user traffic weight is then adjusted to 0%.

[0109] Alternatively, the weight of new user traffic to the target server can be gradually reduced. For example, the weight of new user traffic can be determined based on the number of consecutive periods during which the target server is detected as unavailable. This weight is negatively correlated with the number of consecutive periods; that is, the larger the number of consecutive periods during which the target server is detected as unavailable, the smaller the weight of new user traffic. Once the number of consecutive periods reaches a certain threshold, the weight of new user traffic is adjusted to 0%.
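The negative correlation between the new user traffic weight and the number of consecutive unavailable periods can take many forms. The following minimal sketch assumes a linear decay, which the text does not prescribe; the function name `degraded_weight` and the parameter `zero_at` (the count at which the weight reaches 0%) are hypothetical.

```python
def degraded_weight(target_weight: float, consecutive_unavailable: int, zero_at: int) -> float:
    """Shrink the new-user weight linearly; it reaches 0 after zero_at unavailable cycles."""
    if zero_at <= 0:
        raise ValueError("zero_at must be a positive integer")
    remaining = max(0, zero_at - consecutive_unavailable)
    return target_weight * remaining / zero_at
```

With `zero_at=4`, the weight falls from the full target value to half after two consecutive unavailable periods and to zero from the fourth onward.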

[0110] b3: If the probe result is unavailable, and the recorded current state of the target server is also unavailable, it indicates that the target server is in a faulty state in both the current and previous probe cycles. In this case, the current probe cycle can be allowed to continue until its duration ends, before starting the next probe cycle and performing the same probe process. In this case, the duration of the probe cycle is the second duration. This increases the probe frequency when the target server is unavailable, enabling more efficient load balancing between different servers and improving service quality when the server recovers its availability.

[0111] b4: If the probe result is available, and the recorded current state of the target server is unavailable, it indicates that the target server has transitioned from a faulty state to an available state. At this time, the server may experience state fluctuations, switching between available and unavailable states. The state of the target server in this case is marked as S1.

[0112] In this situation, the following two operations can be performed: 1. Adjust the duration of the detection cycle from the second duration to the third duration.

[0113] Here, the third duration is longer than the second duration, but shorter than the first duration.

[0114] For example, in addition to the two detection cycle levels maintained for each server as described in the above embodiments, the access device also maintains a third detection cycle level for each server, namely a half-acceleration cycle Z: default 15s, used when the target server's state is marked as S1.

[0115] 2. Adjust the new user traffic weight of the target server from the first preset weight value to the second preset weight value.

[0116] Here, the second preset weight value is greater than the first preset weight value, and the second preset weight value is less than the target traffic weight value of the target server. In this way, the limited new user traffic is allocated to the target server, which can avoid the target server's state changing again due to excessive traffic allocation, and can also avoid the interference of new user traffic caused by fluctuations in the target server's state, thus preventing business disruptions caused by fluctuations.
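Cases b1 to b4 can be summarised as a small transition function over the recorded state. This is a sketch under assumptions: the names `ServerRecord` and `apply_result` are hypothetical, weights are expressed as a fraction of the target traffic weight, and the default weight values are illustrative. The cycle lengths use the default values X, Y and Z given above.

```python
from dataclasses import dataclass

X_S, Y_S, Z_S = 30.0, 3.0, 15.0   # default normal, accelerated, half-accelerated cycles


@dataclass
class ServerRecord:
    available: bool = True
    cycle_s: float = X_S
    weight: float = 1.0            # fraction of the target traffic weight


def apply_result(rec: ServerRecord, probe_available: bool,
                 first_weight: float = 0.0, second_weight: float = 0.1) -> ServerRecord:
    """Apply cases b1-b4 to the recorded state after one detection cycle."""
    if rec.available and not probe_available:      # b2: available -> unavailable
        rec.available, rec.cycle_s, rec.weight = False, Y_S, first_weight
    elif not rec.available and probe_available:    # b4: unavailable -> available
        rec.available, rec.cycle_s, rec.weight = True, Z_S, second_weight
    # b1 and b3: no transition, keep the current cycle length and weight
    return rec
```

Only the two transitions change anything; in the steady cases b1 and b3 the access device simply waits out the current cycle.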

[0117] In another embodiment of this disclosure, within a plurality of detection periods after the detection result indicates that the state of the target server has changed from unavailable to available, the number of consecutive periods in which the target server is available is determined; the target value of the new user traffic weight for the target server is determined based on this number of consecutive available periods; and the new user traffic weight of the target server is adjusted to the target value. If the number of consecutive available periods reaches a first preset threshold, the duration of the detection cycle is adjusted from the third duration to the first duration.

[0118] In practice, after the target server recovers from an unavailable state to an available state, there will be a period of instability. In order to reduce the load on the target server during this period and reduce the probability of it failing again, in this embodiment of the disclosure, after the target server recovers to an available state, excessive traffic will not be allocated to the target server immediately. Instead, the load on the target server will be gradually restored as the detection period increases after the target server recovers to an available state.

[0119] To achieve this, when the target server recovers from unavailable to available in any probing cycle, the number of consecutive probing cycles in which it has remained available since recovery is recorded. Based on this number, the target value of the new user traffic weight of the target server is determined. Specifically, the target value is positively correlated with the number of consecutive available probing cycles since recovery.

[0120] For example, in the first to third probing cycles, the target value for the new user traffic weight determined for the target server is 10% of the target traffic weight value; here, the first probing cycle refers to the current probing cycle in which the target server recovers from being unavailable to being available.

[0121] During the 4th to 6th probing cycles, the target value for the new user traffic weight determined for the target server is 50% of the target traffic weight value.

[0122] When the number of consecutive available probing cycles since the target server recovered reaches 7, that is, starting from the 7th probing cycle, the target value for the new user traffic weight determined for the target server is 100% of the target traffic weight value.

[0123] If, during this process, the detection result of the target server is unavailable in any detection cycle, the gradual traffic recovery process for the target server is interrupted, and the process proceeds to step 303.
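The stepwise recovery in the example above (10% for cycles 1 to 3, 50% for cycles 4 to 6, 100% from cycle 7) can be sketched as a lookup on the count of consecutive available cycles. This is a minimal sketch: the function names are hypothetical, and the weight of 0 applied when a cycle is unavailable stands in for the first preset weight, whose value this section does not give.

```python
def ramp_percent(consecutive_available: int) -> int:
    """Percent of the target traffic weight for a given streak of available cycles."""
    if consecutive_available <= 0:
        return 0    # assumed first preset weight (not given in the text)
    if consecutive_available <= 3:
        return 10   # cycles 1-3 after recovery
    if consecutive_available <= 6:
        return 50   # cycles 4-6 after recovery
    return 100      # cycle 7 onward: full target weight


def replay(results, target_weight):
    """Return the new user weight after each probe result, starting at recovery."""
    streak = 0
    weights = []
    for available in results:
        # An unavailable result interrupts the ramp and resets the streak.
        streak = streak + 1 if available else 0
        weights.append(target_weight * ramp_percent(streak) / 100)
    return weights
```

For instance, `replay([True, True, False, True], 200)` drops back to the lowest step after the failed cycle, so the ramp restarts from 10% rather than resuming where it left off.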

[0124] In another embodiment of this disclosure, an online user keep-alive mechanism is also used so that the services of users who are already online are not interrupted while the target server is faulty.

[0125] Specifically, if the target server is marked as faulty, the sessions of online users are retained and the billing update messages of those users are cached. After the target server's state is restored to available, the cached billing update messages are sent to the target server. If a cached billing update message expires, a billing stop message is sent to the target server, so that once the target server is restored to an available state it can perform billing processing for the online users corresponding to the cached billing update message.

[0126] Here, when the target server is marked as faulty, the access device initiates session protection for online users on the target server: ①: Preserve the sessions of users who are already online. This means that an Accounting-Stop message will not be sent immediately after the target server is marked as faulty, and user sessions will not be forcibly disconnected.

[0127] ②: By marking the target server's status as "faulty", the fault status of the target server is recorded at the access device level, and subsequent billing requests no longer attempt to reach the server, which reduces invalid interactions.

[0128] ③: The service data traffic (L2 / L3 layer) of online users continues to be forwarded normally; only the billing process is affected.

[0129] ④: Temporarily store billing update messages to be sent to avoid loss of billing data due to server unavailability; if the server is still unreachable after the cache expires, the corresponding fallback strategy (such as discarding / local records) may be triggered.

[0130] ⑤: When a previously cached billing update request exceeds its validity period (e.g., 30 minutes, which can be adjusted as needed), if the billing server still has not recovered, the device will send an Accounting-Stop message to the server and terminate the user session in a "gentle offline" manner.
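Steps ④ and ⑤ above can be sketched as a time-bounded queue of billing update messages. This is a minimal sketch under assumptions: the class and method names are hypothetical, the clock is injectable for testing, and the caller is assumed to send the Accounting-Stop message for each expired session. Only the 30-minute default validity period comes from the text.

```python
import time
from collections import deque


class AccountingCache:
    """Caches billing update messages while the target server is marked faulty."""

    def __init__(self, ttl_s=30 * 60, clock=time.monotonic):
        self.ttl_s = ttl_s      # validity period (30 minutes in the text's example)
        self.clock = clock      # injectable clock, assumed for testability
        self._queue = deque()   # entries: (enqueue_time, session_id, payload)

    def cache_update(self, session_id, payload):
        self._queue.append((self.clock(), session_id, payload))

    def expired_sessions(self):
        """Drop expired entries; return session ids that need an Accounting-Stop."""
        now = self.clock()
        expired = []
        while self._queue and now - self._queue[0][0] > self.ttl_s:
            _, session_id, _ = self._queue.popleft()
            if session_id not in expired:
                expired.append(session_id)
        if expired:
            # A session going offline no longer needs its newer cached updates.
            self._queue = deque(e for e in self._queue if e[1] not in expired)
        return expired

    def flush(self, send):
        """After the server recovers, send cached updates in arrival order."""
        sent = 0
        while self._queue:
            _, session_id, payload = self._queue.popleft()
            send(session_id, payload)
            sent += 1
        return sent
```

A first-in, first-out queue is used here because entries expire in the same order in which they were cached, so only the head of the queue needs to be checked.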

[0131] Furthermore, in another embodiment of this disclosure, if the detection result indicates that the state of the target server has changed from available to unavailable, the access device will send detection messages corresponding to various authentication scenarios to the backup server through a detection session with the backup server to obtain the detection result of the backup server; if the detection result of the backup server indicates that the backup server is available, the backup server will be marked as hot standby. After the target server is marked as faulty, new user traffic is routed to the standby server marked as hot standby.

[0132] In this way, when the target server is detected to be unavailable, the state of each backup server can be probed immediately, and an available backup server can be designated as the hot standby. Once the target server is marked as faulty, new user traffic can be routed to the hot standby immediately. Because the backup server is pre-warmed before the switchover, it is ready in advance and can be put into use as soon as the target server fails, achieving seamless switching between primary and backup servers and improving the quality of service for users.
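The pre-warming and switchover logic can be sketched as two small functions. This is a minimal sketch: the function names are hypothetical, and `probe` is an assumed callable that returns whether a backup server answered its probe messages. How a backup is chosen when several are available is not specified in the text; this sketch simply takes the first.

```python
def prewarm_standby(backups, probe):
    """Probe the backups once the target looks unavailable; return the hot standby or None."""
    for server in backups:
        if probe(server):
            return server  # first available backup becomes the hot standby (assumed policy)
    return None


def route_new_user(primary, primary_faulty, hot_standby):
    """Pick the server for a new user: the hot standby only after the primary is marked faulty."""
    if primary_faulty and hot_standby is not None:
        return hot_standby
    return primary
```

Separating the two steps mirrors the text: the standby is selected when the target first becomes unavailable, but traffic moves only once the target is marked as faulty.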

[0133] With the above server detection method, this disclosure operates under three constraints: zero modification (no change is made to the server), zero dependence (the timeliness of fault detection does not depend on service traffic), and zero awareness (users do not notice). Under these constraints, it achieves second-level discovery of, and zero-wait switchover from, protocol-level faults of a RADIUS server. It addresses the long silent period of traditional detection schemes, the false positives of shallow ping-based detection, and the incomplete detection type coverage of Status-Server.

[0134] The probe logic runs entirely within the access device; no agent, plug-in, or protocol extension needs to be installed on the RADIUS server. Probe packets strictly adhere to standard formats such as RFC 2865, RFC 2866, and RFC 3576, and can interoperate with mainstream RADIUS / AAA server implementations, including FreeRADIUS, Windows NPS, Cisco ISE, and Juniper SBR. The method also supports complex networking scenarios such as IPv4 / IPv6 dual-stack, VRF multi-instance, NAT traversal, and Option 82, making it suitable for telecommunications, finance, government and enterprise, education, and healthcare.

[0135] The various embodiments of the server detection method provided in this disclosure have the following advantages: 1. Same-source address binding: The probe packet reuses the source IP of the real authentication service to ensure that the probe path and the service path are the same, and prevents erroneous probe results caused by firewall and policy routing drift, which may result in "probe success but authentication failure".

[0136] 2. Heterogeneous dual-message sequential probing: A first probing message is sent first. If it times out, a second probing message is sent. The two messages are of different types and cover different protocol paths. Normally, only one message is sent, which saves probing overhead.

[0137] 3. Variable-speed cycle adjudication: Different detection cycles are used for "available" and "unavailable" servers, balancing low overhead during normal operation against fault detection within seconds.

[0138] 4. Progressive fault recovery: Multi-state finite state machine (S0 / S1 / S3). After the server recovers, it first enters an observation period to carry low user traffic. After several consecutive probe cycles to verify that the state is stable, the server's traffic capacity is gradually increased to prevent jitter.

[0139] 5. Online User Keep-Alive: Protects online user sessions, caches Accounting-Update, achieves zero session interruption, and improves user experience.

[0140] 6. Zero-wait failover: When the detection result indicates that the target server is unavailable, a backup server is pre-warmed as the hot standby. Once the target server is marked as faulty, traffic is switched to the hot standby immediately, achieving zero-wait primary / standby failover without relying on user authentication timeouts.

[0141] 7. Zero-modification compatibility: The probe packets adopt the standard RADIUS format without relying on proprietary extensions. No agent needs to be installed on the server side and no server upgrade is required, making the method suitable for the existing networks of operators, financial institutions, and government and enterprise users.

[0142] Corresponding to the aforementioned embodiments of server detection methods, this disclosure also provides embodiments of server detection devices.

[0143] The embodiments of the detection device disclosed herein can be applied to access devices. The device embodiments can be implemented in software, hardware, or a combination of both. Taking software implementation as an example, the device, as a logical device, is formed by the processor of the access device loading the corresponding computer program instructions from non-volatile memory into memory for execution. From a hardware perspective, Figure 4 shows a hardware structure diagram of the access device in which the server detection device of this disclosure is located. In addition to the processor, memory, network interface, and non-volatile memory shown in Figure 4, the access device in the embodiment may also include other hardware depending on the actual function of the server detection device, which will not be described in detail here.

[0144] Referring to Figure 5, the server detection device provided in this embodiment includes: a session establishment module 51, configured to establish, in response to a target event being triggered, a probe session between the access device and the target server based on the source parameters of the real authentication messages sent by the access device; a detection module 52, configured to send, in each of multiple detection cycles and based on the probe session, detection packets corresponding to multiple authentication scenarios to the target server to obtain the detection result of the target server, wherein the detection packets have the same message format as the service packets of the Remote Authentication Dial-In User Service (RADIUS); and a processing module 53, configured to: when the detection result indicates that the state of the target server has changed from available to unavailable, adjust the duration of the detection cycle from a first duration to a second duration, adjust the new user traffic weight of the target server to a first preset weight value, and enter the next detection cycle, wherein the first duration is longer than the second duration; and when the detection result indicates that the state of the target server has changed from unavailable to available, adjust the duration of the detection cycle from the second duration to a third duration, adjust the new user traffic weight of the target server from the first preset weight value to a second preset weight value, and enter the next detection cycle, wherein the second preset weight value is greater than the first preset weight value and less than the target traffic weight value of the target server, and the third duration is greater than the second duration and less than the first duration.

[0145] Optionally, the processing module 53 is further configured to: after the detection result indicates that the state of the target server has changed from unavailable to available, determine the number of consecutive detection cycles in which the target server is available over the subsequent detection cycles; determine a target value of the new user traffic weight of the target server based on that number of consecutive available cycles; adjust the new user traffic weight of the target server to the target value; and if the number of consecutive available cycles reaches a first preset threshold, adjust the duration of the detection cycle from the third duration to the first duration.

[0146] Optionally, when sending detection messages corresponding to multiple authentication scenarios to the target server based on the detection session to obtain the detection result of the target server, the detection module 52 is configured to: within the same detection cycle, send, based on the detection session, the detection messages corresponding to the multiple authentication scenarios to the target server in succession; if a response message to any of the detection messages is received from the target server within a first preset time, determine that the detection result of the target server is available; and if no response message to any of the detection messages is received from the target server within the first preset time, determine that the detection result of the target server is unavailable.
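The decision rule in this variant (any response within the first preset time means available) can be sketched as a pure function over observed response latencies. This is a minimal sketch: the function name, the scenario labels, and the latency mapping are hypothetical, with `None` standing for "no response received".

```python
from typing import Mapping, Optional


def parallel_probe_result(latencies_s: Mapping[str, Optional[float]],
                          first_preset_time_s: float) -> bool:
    """Return True (available) if any probe was answered within the first preset time."""
    return any(
        latency is not None and latency <= first_preset_time_s
        for latency in latencies_s.values()
    )
```

For example, `parallel_probe_result({"auth": None, "accounting": 0.2}, 1.0)` treats the server as available because one of the two probe types was answered in time.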

[0147] Optionally, the detection module 52, when sending detection messages corresponding to various authentication scenarios to the server based on the detection session to obtain the detection results of the target server, is used to: Based on the probe session, a first probe message corresponding to the first authentication scenario among the multiple authentication scenarios is sent to the target server; if a first response message from the target server to the first probe message is received within a second preset time, then the probe result of the target server is determined to be available; If the first response message is not received within the second preset time, a second probe message corresponding to the second authentication scenario among the multiple authentication scenarios is sent to the target server based on the probe session; if a second response message from the target server for the second probe message is received within the third preset time, the probe result of the target server is determined to be available; if the second response message is not received, the probe result of the target server is determined to be unavailable.
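The sequential variant, in which the second probe message is sent only after the first one times out, can be sketched as follows. This is a minimal sketch: `send` is an assumed callable that sends one probe of the given kind and returns whether a response arrived within the given timeout. The message kinds and timeout values are hypothetical, and the separate outcome for "answered only by the second probe" is kept so that a caller could apply the minor fault handling described for the processing module.

```python
from enum import Enum
from typing import Callable


class ProbeOutcome(Enum):
    AVAILABLE = "available"                        # first probe answered
    AVAILABLE_VIA_SECOND = "available_via_second"  # only the second probe answered
    UNAVAILABLE = "unavailable"                    # neither probe answered


def sequential_probe(send: Callable[[str, float], bool],
                     first_kind: str = "first-scenario",
                     second_kind: str = "second-scenario",
                     second_preset_time_s: float = 1.0,
                     third_preset_time_s: float = 1.0) -> ProbeOutcome:
    """Send the first probe; fall back to the second probe only on timeout."""
    if send(first_kind, second_preset_time_s):
        return ProbeOutcome.AVAILABLE
    if send(second_kind, third_preset_time_s):
        return ProbeOutcome.AVAILABLE_VIA_SECOND
    return ProbeOutcome.UNAVAILABLE
```

In the normal case only one message is sent per cycle, which is the overhead saving this variant is meant to provide.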

[0148] Optionally, the processing module 53 is further configured to: mark the target server as having a minor fault state if a second response message from the target server to the second probe message is received within a third preset time. If the number of consecutive detection cycles in which the target server is marked as having a minor fault reaches a second preset threshold, the user traffic weight of the target server is adjusted to a third preset weight value; the third preset weight value is less than the target traffic weight value.

[0149] Optionally, processing module 53 is also used for: Maintain a disconnection counter for the target server; the initial value of the disconnection counter is a first value; If no first response message is received from the target server in response to the first probe message within the second preset time, the value of the disconnection counter is increased by a second value; and if a first response message is received from the target server in response to the first probe message within the second preset time, the value of the disconnection counter is set back to the initial value. If a second response message is received from the target server in response to the second probe message within the third preset time, the value of the disconnection counter is increased by a third value. If the value of the disconnection counter is the sum of the first value, the second value, and the third value, the target server is marked as faulty.

[0150] Optionally, the processing module 53 is further configured to: mark the state of the target server as faulty if the number of consecutive cycles in which the detection result indicates that the state of the target server is unavailable reaches a third quantity threshold.
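Marking a server as faulty after a run of unavailable cycles can be sketched with a simple streak counter. This is a minimal sketch: the class name is hypothetical and the threshold of 3 is an assumed value, since the text does not give a number for the third quantity threshold.

```python
class FaultTracker:
    """Counts consecutive unavailable detection cycles for one target server."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold      # third quantity threshold (assumed value)
        self.unavailable_streak = 0

    def record(self, available: bool) -> bool:
        """Record one cycle's result; return True if the server should be marked faulty."""
        # Any available cycle breaks the streak, so isolated timeouts do not accumulate.
        self.unavailable_streak = 0 if available else self.unavailable_streak + 1
        return self.unavailable_streak >= self.threshold
```

Requiring consecutive failures rather than a cumulative count keeps a single lost probe from triggering a failover.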

[0151] Optionally, processing module 53 is also used for: If the target server is marked as faulty, the sessions of online users are preserved, and the billing update messages of the online users are cached. After the target server's status is restored to availability, a cached billing update message is sent to the target server; If the cached billing update message expires, a billing stop message is sent to the target server so that the target server can perform billing processing on the online users corresponding to the cached billing update message after the target server is restored to an available state.

[0152] Optionally, processing module 53 is also used for: When the detection result indicates that the state of the target server has changed from available to unavailable, probe messages corresponding to various authentication scenarios are sent to the backup server through a probe session with the backup server to obtain the detection result of the backup server. If the detection results of the backup server indicate that the backup server is available, the backup server is marked as a hot standby; After the target server is marked as faulty, new user traffic is routed to the standby server marked as hot standby.

[0153] The specific implementation process of the functions and roles of each unit in the above device can be found in the implementation process of the corresponding steps in the above method, and will not be repeated here.

[0154] For the device embodiments, since they basically correspond to the method embodiments, the relevant parts can be referred to in the description of the method embodiments. The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this disclosure according to actual needs. Those skilled in the art can understand and implement this without creative effort.

[0155] This disclosure also provides a computer-readable storage medium storing a computer program, which, when executed by a processor, performs the steps of the server detection method described in the above method embodiments. The storage medium can be a volatile or non-volatile computer-readable storage medium.

[0156] This disclosure also provides a computer program product carrying program code. The program code includes instructions that can be used to execute the steps of the server detection method described in the above method embodiments. For details, please refer to the above method embodiments, which will not be repeated here.

[0157] The aforementioned computer program product can be implemented through hardware, software, or a combination thereof. In one optional embodiment, the computer program product is specifically embodied in a computer storage medium; in another optional embodiment, the computer program product is specifically embodied in a software product, such as a software development kit (SDK), etc.

[0158] The computer program or instructions may be stored in a computer-readable storage medium or transferred from one computer-readable storage medium to another. For example, the computer program or instructions may be transferred from one website, computer, server, or data center to another website, computer, server, or data center via wired or wireless means. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium, such as a floppy disk, hard disk, or magnetic tape; an optical medium, such as a digital video disc (DVD); or a semiconductor medium, such as a solid-state drive. The computer-readable storage medium may be a volatile or non-volatile storage medium, or may include both volatile and non-volatile types of storage media.

[0159] The embodiments of the subject matter and functional operation described in this specification can be implemented in the following ways: digital electronic circuits, tangibly embodied computer software or firmware, computer hardware including the structures disclosed in this specification and their structural equivalents, or combinations thereof. Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible, non-transitory program carrier for execution by a data processing apparatus or for controlling the operation of a data processing apparatus. Alternatively or additionally, the program instructions may be encoded on artificially generated propagation signals, such as machine-generated electrical, optical, or electromagnetic signals, which are generated to encode information and transmit it to a suitable receiving device for execution by the data processing apparatus. The computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or combinations thereof.

[0160] The processing and logic flow described in this specification can be executed by one or more programmable computers that execute one or more computer programs to perform corresponding functions by operating on input data and generating output. The processing and logic flow can also be executed by dedicated logic circuitry, such as FPGAs (Field-Programmable Gate Arrays) or ASICs (Application-Specific Integrated Circuits), and the device can also be implemented as dedicated logic circuitry.

[0161] Suitable computers for executing computer programs include, for example, general-purpose and / or special-purpose microprocessors, or any other type of central processing unit. Typically, the central processing unit receives instructions and data from read-only memory and / or random access memory. The basic components of a computer include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data. Typically, a computer will also include one or more mass storage devices for storing data, such as disks, magneto-optical disks, or optical disks, or the computer will be operatively coupled to such mass storage devices to receive data from or transfer data to them, or both. However, a computer is not required to have such devices. Furthermore, a computer can be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a global positioning system (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name a few.

[0162] Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, such as semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks. Processors and memory may be supplemented by or incorporated into dedicated logic circuitry.

[0163] While this specification contains numerous specific implementation details, these should not be construed as limiting the scope of any invention or the scope of the claims, but rather are primarily intended to describe features of specific embodiments of a particular invention. Certain features described in the various embodiments herein may also be implemented in combination in a single embodiment. Conversely, various features described in a single embodiment may also be implemented separately in various embodiments or in any suitable sub-combination. Furthermore, while features may function in certain combinations as described above and even initially claimed in this way, one or more features from a claimed combination may be removed from that combination in some cases, and a claimed combination may refer to a sub-combination or a variation thereof.

[0164] Similarly, although the operations are depicted in a specific order in the accompanying drawings, this should not be construed as requiring these operations to be performed in the specific order shown or sequentially, or requiring all illustrated operations to be performed to achieve the desired result. In some cases, multitasking and parallel processing may be advantageous. Furthermore, the separation of various system modules and components in the above embodiments should not be construed as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.

[0165] Thus, specific embodiments of the subject matter have been described. Other embodiments are within the scope of the appended claims. In some cases, the actions recited in the claims may be performed in a different order and still achieve the desired result. Furthermore, the processes depicted in the drawings are not necessarily shown in a specific order or sequence to achieve the desired result. In some implementations, multitasking and parallel processing may be advantageous.

[0166] The above description is merely a preferred embodiment of this disclosure and is not intended to limit this disclosure. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this disclosure should be included within the scope of protection of this disclosure.

Claims

1. A method for detecting a server, characterized in that the method is applied to an access device and includes: in response to a target event being triggered, establishing a probe session between the access device and a target server based on the source parameters of the real authentication messages sent by the access device; in each of multiple probing cycles, sending, based on the probe session, probing messages corresponding to multiple authentication scenarios to the target server to obtain the probing result of the target server, wherein the probing messages have the same message format as the service messages of the Remote Authentication Dial-In User Service (RADIUS); if the probing result indicates that the state of the target server has changed from available to unavailable, adjusting the duration of the probing cycle from a first duration to a second duration, adjusting the new user traffic weight of the target server to a first preset weight value, and entering the next probing cycle, wherein the first duration is longer than the second duration; and if the probing result indicates that the state of the target server has changed from unavailable to available, adjusting the duration of the probing cycle from the second duration to a third duration, adjusting the new user traffic weight of the target server from the first preset weight value to a second preset weight value, and entering the next probing cycle, wherein the second preset weight value is greater than the first preset weight value and less than the target traffic weight value of the target server, and the third duration is greater than the second duration and less than the first duration.

2. The method according to claim 1, characterized in that the method further includes: after the detection result indicates that the state of the target server has changed from unavailable to available, determining the number of consecutive detection cycles in which the target server is available over the subsequent detection cycles; determining a target value of the new user traffic weight of the target server based on that number of consecutive available cycles; adjusting the new user traffic weight of the target server to the target value; and if the number of consecutive available cycles reaches a first preset threshold, adjusting the duration of the detection cycle from the third duration to the first duration.

3. The method according to claim 1, characterized in that the step of sending probe messages corresponding to multiple authentication scenarios to the target server based on the probe session to obtain the probe result of the target server includes: within the same detection cycle, sending, based on the probe session, the probe messages corresponding to the multiple authentication scenarios to the target server in succession; if a response message to any of the probe messages is received from the target server within a first preset time, determining that the probe result of the target server is available; and if no response message to any of the probe messages is received from the target server within the first preset time, determining that the probe result of the target server is unavailable.

4. The method according to claim 1, characterized in that, The step of sending probe messages corresponding to various authentication scenarios to the server based on the probe session to obtain the probe results of the target server includes: Based on the probe session, a first probe message corresponding to the first authentication scenario among the multiple authentication scenarios is sent to the target server; if a first response message from the target server to the first probe message is received within a second preset time, then the probe result of the target server is determined to be available; If the first response message is not received within the second preset time, a second probe message corresponding to the second authentication scenario among the multiple authentication scenarios is sent to the target server based on the probe session; if a second response message from the target server for the second probe message is received within the third preset time, the probe result of the target server is determined to be available; if the second response message is not received, the probe result of the target server is determined to be unavailable.

5. The method according to claim 4, characterized in that, The method further includes: if a second response message from the target server to the second probe message is received within a third preset time, the target server is marked as being in a minor fault state; If the number of consecutive detection cycles in which the target server is marked as having a minor fault reaches a second preset threshold, the user traffic weight of the target server is adjusted to a third preset weight value; the third preset weight value is less than the target traffic weight value.

6. The method according to claim 4, characterized in that, The method further includes: Maintain a disconnection counter for the target server; the initial value of the disconnection counter is a first value; If no first response message is received from the target server in response to the first probe message within the second preset time, the value of the disconnection counter is increased by a second value; and if a first response message is received from the target server in response to the first probe message within the second preset time, the value of the disconnection counter is set back to the initial value. If a second response message is received from the target server in response to the second probe message within the third preset time, the value of the disconnection counter is increased by a third value. If the value of the disconnection counter is the sum of the first value, the second value, and the third value, the target server is marked as faulty.

7. The method according to claim 1, characterized in that, The method further includes: marking the state of the target server as faulty when the number of consecutive cycles in which the detection result indicates that the state of the target server is unavailable reaches a third quantity threshold.

8. The method according to claim 6 or 7, characterized in that the method further includes: if the target server is marked as faulty, preserving the sessions of online users and caching the billing update messages of the online users; after the target server is restored to an available state, sending the cached billing update messages to the target server; and if a cached billing update message has expired, sending a billing stop message to the target server, so that the target server performs billing processing for the online user corresponding to that cached billing update message after the target server is restored to an available state.
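
The cache-and-replay behaviour of claim 8 might look like the sketch below. The expiry rule (a fixed `max_age` in seconds), the message kinds, and the `send` callback are assumptions. The claim does not define how expiry is determined.

```python
import time
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Optional


@dataclass
class CachedUpdate:
    user: str
    payload: dict
    cached_at: float


class BillingCache:
    def __init__(self, max_age: float = 300.0) -> None:  # assumed expiry window
        self.max_age = max_age
        self._queue: Deque[CachedUpdate] = deque()

    def cache(self, user: str, payload: dict,
              now: Optional[float] = None) -> None:
        """Store a billing update while the target server is marked faulty."""
        now = time.monotonic() if now is None else now
        self._queue.append(CachedUpdate(user, payload, now))

    def flush(self, send: Callable[[str, str, dict], None],
              now: Optional[float] = None) -> int:
        """Replay the cache once the server is available; return messages sent."""
        now = time.monotonic() if now is None else now
        sent = 0
        while self._queue:
            item = self._queue.popleft()
            # An expired update is replaced by a billing stop message.
            expired = now - item.cached_at > self.max_age
            kind = "billing_stop" if expired else "billing_update"
            send(kind, item.user, item.payload)
            sent += 1
        return sent
```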

9. The method according to claim 1, characterized in that the method further includes: when the detection result indicates that the state of the target server has changed from available to unavailable, sending probe messages corresponding to various authentication scenarios to a backup server through a probe session with the backup server to obtain a detection result of the backup server; if the detection result of the backup server indicates that the backup server is available, marking the backup server as a hot standby; and after the target server is marked as faulty, routing new user traffic to the backup server marked as hot standby.
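
The hot-standby handover of claim 9 can be sketched as a small state holder. The `Failover` class, the `probe_backup` callback, and the `"target"` / `"backup"` labels are hypothetical. Falling back to the target when no hot standby exists is a choice of this sketch, not something the claim states.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Failover:
    backup_hot_standby: bool = False
    target_faulty: bool = False

    def on_target_unavailable(self, probe_backup: Callable[[], str]) -> None:
        """Probe the backup server as soon as the target turns unavailable."""
        self.backup_hot_standby = probe_backup() == "available"

    def mark_target_faulty(self) -> None:
        self.target_faulty = True

    def route_new_user(self) -> str:
        """Pick the server that should receive a new user's traffic."""
        if self.target_faulty and self.backup_hot_standby:
            return "backup"
        return "target"
```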

10. A server detection device, characterized in that the device is applied to an access device and includes: a session establishment module, configured to establish, in response to a target event being triggered, a probe session between the access device and a target server based on source parameters used by the access device to send real authentication messages; a detection module, configured to send, in each of multiple detection cycles and based on the probe session, probe messages corresponding to various authentication scenarios to the target server to obtain a detection result of the target server, wherein the probe messages have the same message format as service messages of the Remote Authentication Dial-In User Service (RADIUS); and a processing module, configured to: when the detection result indicates that the state of the target server has changed from available to unavailable, adjust the duration of the detection cycle from a first duration to a second duration, adjust the new user traffic weight of the target server to a first preset weight value, and enter the next detection cycle, wherein the first duration is longer than the second duration; and when the detection result indicates that the state of the target server has changed from unavailable to available, adjust the duration of the detection cycle from the second duration to a third duration, adjust the new user traffic weight of the target server from the first preset weight value to a second preset weight value, and enter the next detection cycle, wherein the second preset weight value is greater than the first preset weight value and less than the target traffic weight value of the target server, and the third duration is longer than the second duration and shorter than the first duration.
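
The processing module's adjustments can be sketched as a pure function over state transitions. The concrete durations and weights are assumed values chosen only to satisfy the orderings stated in the claim (second < third < first duration; first < second preset weight < target weight). `ProbePolicy` and `on_transition` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ProbePolicy:
    first_duration: float = 30.0    # normal detection cycle, seconds (assumed)
    second_duration: float = 5.0    # fast cycle while unavailable (assumed)
    third_duration: float = 15.0    # recovery cycle (assumed)
    first_preset_weight: int = 0    # weight while unavailable (assumed)
    second_preset_weight: int = 30  # weight during recovery (assumed)
    target_weight: int = 100        # target traffic weight (assumed)

    def __post_init__(self) -> None:
        # Enforce the orderings recited in the claim.
        if not self.second_duration < self.third_duration < self.first_duration:
            raise ValueError("need second < third < first duration")
        if not (self.first_preset_weight < self.second_preset_weight
                < self.target_weight):
            raise ValueError("need first < second preset weight < target weight")


def on_transition(policy: ProbePolicy, previous: str, current: str,
                  cycle: float, weight: int) -> Tuple[float, int]:
    """Return the (cycle duration, new user traffic weight) for the next cycle."""
    if previous == "available" and current == "unavailable":
        return policy.second_duration, policy.first_preset_weight
    if previous == "unavailable" and current == "available":
        return policy.third_duration, policy.second_preset_weight
    return cycle, weight  # no state change: keep the current settings
```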

11. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, the steps of the method according to any one of claims 1 to 9 are implemented.

12. An access device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the program, the steps of the method according to any one of claims 1 to 9 are implemented.

13. A computer program product, characterized in that the computer program product carries program code, the program code including instructions for performing the steps of the method according to any one of claims 1 to 9.