Operation and maintenance operation auditing system, method, device, equipment, medium and program product

By connecting the agent and audit SCS units in the bastion host system through the federated communication bus, independent operation and asynchronous message interaction are achieved, which solves the system instability problem caused by the failure of centralized components and improves the stability and high availability of the system.

CN122204580A · Pending · Publication Date: 2026-06-12 · BEIJING QIYI CENTURY SCI & TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING QIYI CENTURY SCI & TECH CO LTD
Filing Date
2026-02-26
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

In existing bastion host systems running in active-active / multi-active mode, a failure or performance fluctuation in a centralized component widens the scope of the fault and degrades the stability of the whole system.

Method used

Multiple agent self-contained system (SCS) units and at least one audit SCS unit are connected by a federated communication bus. Each SCS unit operates independently and exchanges messages asynchronously and reliably over the federated communication bus. Audit event messages generated by the agent SCS units are stored in persistent storage by the bus before delivery, avoiding reliance on a shared database.

Benefits of technology

This improves system stability, prevents fault propagation between SCS units, ensures that audit event messages are not lost, and achieves high availability and high reliability.



Abstract

Embodiments of the present application provide an operation and maintenance auditing system, method, device, equipment, medium and program product. The operation and maintenance auditing system comprises a federated communication bus, and a plurality of proxy SCS units and at least one audit SCS unit connected to the federated communication bus. A target proxy SCS unit acquires a copy of the session data flowing through a session proxy channel and writes the copy into a local first buffer. An audit event message is published to the federated communication bus according to an audit topic. The federated communication bus delivers the audit event message to a target audit SCS unit after storing the audit event message in persistent storage. The target audit SCS unit processes the audit event message to obtain an audit log, and the audit log is stored in a long-term storage system. The technical solution provided by the embodiments of the present application helps to improve system stability.

Description

Technical Field

[0001] This application relates to the field of information security technology, and in particular to an operation and maintenance auditing system, method, device, medium and program product.

Background Technology

[0002] In an enterprise information security system, a bastion host serves as a security portal for server assets in the network environment. It provides a unified access point for server assets, centrally implementing functions such as identity authentication, access control, and operation auditing, thereby improving the security and compliance of enterprise operations and maintenance. Therefore, the high availability of the bastion host itself is crucial.

[0003] Currently, one approach to achieve high availability of bastion hosts is through an active-active / multi-active mode. This involves deploying multiple active nodes that are available simultaneously for business processing. These active nodes rely on centralized components such as a shared database to ensure consistency in business processing.

[0004] If the centralized component fails or experiences performance fluctuations, all active nodes will fail simultaneously, drastically amplifying the scope of the failure and resulting in poor overall system stability.

Summary of the Invention

[0005] The purpose of this application is to provide an operation and maintenance auditing system, method, apparatus, equipment, medium, and program product to improve system stability. The specific technical solution is as follows:

In a first aspect, an operation and maintenance auditing system is provided, comprising a federated communication bus, multiple agent (proxy) self-contained system (SCS) units connected to the federated communication bus, and at least one audit SCS unit, wherein the at least one audit SCS unit is a subscriber to audit topic messages; wherein:

The target proxy SCS unit is configured to receive an operation and maintenance session connection request for a target asset from a terminal device; establish a session proxy channel between the terminal device and the target asset based on the operation and maintenance session connection request; obtain a copy of the session data flowing through the session proxy channel; write the copy of the session data into a local first buffer; generate an audit event message based on the copy of the session data in the local first buffer; and publish the audit event message to the federated communication bus according to the audit topic. The target proxy SCS unit is one of the plurality of proxy SCS units.

The federated communication bus is used to receive the audit event message from the target proxy SCS unit and, after storing the audit event message in persistent storage, deliver the audit event message to the target audit SCS unit. The target audit SCS unit is one of the at least one audit SCS unit.

The target audit SCS unit is used to obtain audit event messages from the federated communication bus; process the audit event messages to obtain audit logs; and store the audit logs in a long-term storage system.

[0006] In a second aspect, a method for auditing operations and maintenance is provided, applied to a target agent SCS unit in an operation and maintenance auditing system. The operation and maintenance auditing system includes a federated communication bus, and multiple agent SCS units and at least one audit SCS unit connected to the federated communication bus. The at least one audit SCS unit is a subscriber to audit topic messages, and the target agent SCS unit is one of the multiple agent SCS units. The method includes:
receiving an operation and maintenance session connection request for the target asset from the terminal device;
establishing, based on the operation and maintenance session connection request, a session proxy channel between the terminal device and the target asset;
obtaining a copy of the session data flowing through the session proxy channel;
writing the session data copy into the local first buffer;
generating an audit event message based on the session data copy in the local first buffer;
publishing the audit event message to the federated communication bus according to the audit topic, so that the federated communication bus, after storing the audit event message in persistent storage, delivers the audit event message to the target audit SCS unit, and the target audit SCS unit processes the audit event message, obtains the audit log, and stores the audit log in the long-term storage system.
The target audit SCS unit is one of the at least one audit SCS unit.

[0007] In a third aspect, an operation and maintenance auditing device is provided, applied to a target agent SCS unit of an operation and maintenance auditing system. The operation and maintenance auditing system includes a federated communication bus, and multiple agent SCS units and at least one audit SCS unit connected to the federated communication bus. The at least one audit SCS unit is a subscriber to audit topic messages, and the target agent SCS unit is one of the multiple agent SCS units. The device includes:
a receiving module, used to receive an operation and maintenance session connection request for the target asset from the terminal device;
an establishing module, used to establish a session proxy channel between the terminal device and the target asset based on the operation and maintenance session connection request;
an acquisition module, used to acquire a copy of the session data flowing through the session proxy channel;
a write module, used to write the session data copy into the local first buffer;
a generation module, used to generate audit event messages based on the session data copy in the local first buffer;
a publishing module, used to publish the audit event message to the federated communication bus according to the audit topic, so that the federated communication bus, after storing the audit event message in persistent storage, delivers the audit event message to the target audit SCS unit, and the target audit SCS unit processes the audit event message, obtains the audit log, and stores the audit log in the long-term storage system.
The target audit SCS unit is one of the at least one audit SCS unit.

[0008] In a fourth aspect, an electronic device is provided, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus; the memory is used to store a computer program; and the processor, when executing the program stored in the memory, implements the steps of the operation and maintenance auditing method described in the second aspect.

[0009] In a fifth aspect, a computer-readable storage medium is provided, on which a computer program is stored; when the program is executed by a processor, it implements the steps of the operation and maintenance auditing method described in the second aspect.

[0010] In a sixth aspect, a computer program product is provided, the computer program product including computer instructions stored in a computer-readable storage medium and adapted to be read and executed by a processor to cause an electronic device having the processor to perform the steps of the operation and maintenance auditing method described in the second aspect.

[0011] Using the technical solution provided in this application, multiple agent SCS units and at least one audit SCS unit are connected via a federated communication bus. The target agent SCS unit processes maintenance session connection requests, obtains a copy of the session data, writes it to a local first buffer, and then publishes an audit event message generated based on the session data copy in the local first buffer to the federated communication bus. The federated communication bus stores the audit event message in persistent storage and then delivers it to the target audit SCS unit. The target audit SCS unit processes the audit event message, obtains audit logs, and stores the audit logs in a long-term storage system. Each agent SCS unit and each audit SCS unit operates independently, without relying on centralized components such as a shared database. Faults in each agent SCS unit and each audit SCS unit are confined within their own scope and will not be propagated to other SCS units through centralized components such as a shared database, thus improving system stability.

Attached Figure Description

[0012] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the accompanying drawings used in the description of the embodiments or the prior art will be briefly introduced below.

[0013] Figure 1 is a schematic diagram of the structure of the operation and maintenance audit system in one embodiment of this application;
Figure 2 is a schematic diagram of another structure of the operation and maintenance audit system in the embodiments of this application;
Figure 3 is a flowchart illustrating the implementation of an operation and maintenance auditing method according to an embodiment of this application;
Figure 4 is a schematic diagram of the structure of an operation and maintenance auditing device according to an embodiment of this application;
Figure 5 is a schematic diagram of the structure of an electronic device according to an embodiment of this application.

Detailed Implementation

[0014] The technical solutions in the embodiments of this application will now be described with reference to the accompanying drawings.

[0015] Figure 1 is a structural schematic of an operation and maintenance audit system provided in an embodiment of this application. This operation and maintenance audit system can also be called a bastion host system or a distributed bastion host system. It includes a federated communication bus, multiple agent self-contained system (SCS) units connected to the federated communication bus, and at least one audit SCS unit. The at least one audit SCS unit is a subscriber to audit topic messages.

The target agent SCS unit is used to receive maintenance session connection requests for the target asset from the terminal device; establish a session proxy channel between the terminal device and the target asset based on the maintenance session connection requests; obtain a copy of the session data flowing through the session proxy channel; write the copy of the session data into a local first buffer; generate audit event messages based on the copy of the session data in the local first buffer; and publish the audit event messages to the federated communication bus according to the audit topic. The target agent SCS unit is one of the multiple agent SCS units.

The federated communication bus is used to receive audit event messages from the target agent SCS unit and, after storing the audit event messages in persistent storage, deliver them to the target audit SCS unit. The target audit SCS unit is one of the at least one audit SCS unit.

The target audit SCS unit is used to obtain audit event messages from the federated communication bus; process the audit event messages to obtain audit logs; and store the audit logs in the long-term storage system.

[0016] In this embodiment, each agent SCS unit and each audit SCS unit are functionally cohesive and independently operating autonomous units. That is, each agent SCS unit and each audit SCS unit integrates all the core components required for its operation, has independent local private storage space for storing operation-related data, and can start, run, and complete core tasks without relying on any external shared databases or other centralized components. Each agent SCS unit and each audit SCS unit are connected to a federated communication bus, which is configured as a highly available, persistent message middleware. Asynchronous and reliable message interaction is conducted between the agent SCS units and the audit SCS units in an event-driven manner.

[0017] Each proxy SCS unit can contain an access gateway subunit, a protocol proxy subunit, a session recording subunit, and a local data storage subunit. The access gateway subunit is responsible for handling operation and maintenance session connection requests, the protocol proxy subunit is responsible for protocol proxying, the session recording subunit is responsible for recording sessions, and the local data storage subunit is responsible for storing audit event messages.

[0018] The federated communication bus is responsible for the transmission of audit event messages between the agent SCS units and the audit SCS units.

[0019] Each audit SCS unit is responsible for asynchronously consuming audit event messages from the federated communication bus, obtaining audit logs, and storing them in a long-term storage system. The long-term storage system can be understood as a system capable of storing data permanently, outside of the operation and maintenance audit system. Audit logs may include session metadata, operation data, etc. Session metadata may include session identifier, operation time, user identifier, terminal device identifier, target asset identifier, and the protocol used. Operation data may include user input information and information returned by the target asset.
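As an illustration of the fields listed above, an audit log record could be modelled as follows. This is only a sketch: the class and field names are assumptions chosen for the example and are not defined by the application.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class SessionMetadata:
    # Field names are hypothetical; the text only lists the kinds of metadata.
    session_id: str
    operation_time: float      # seconds since the epoch
    user_id: str
    terminal_id: str
    target_asset_id: str
    protocol: str              # e.g. "ssh"


@dataclass
class OperationRecord:
    user_input: str            # what the user sent to the target asset
    asset_output: str          # what the target asset returned


@dataclass
class AuditLog:
    metadata: SessionMetadata
    operations: List[OperationRecord] = field(default_factory=list)


log = AuditLog(
    metadata=SessionMetadata("s-1", 0.0, "u-42", "t-7", "db-01", "ssh"),
    operations=[OperationRecord("ls", "file.txt")],
)
record = asdict(log)  # nested plain dict, convenient for handing to a storage layer
```

Keeping session metadata separate from the operation data mirrors the two groups named in the paragraph above.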

[0020] During operation of the operation and maintenance auditing system, one of the multiple proxy SCS units, namely the target proxy SCS unit, can receive an operation and maintenance session connection request for the target asset from the terminal device. Based on the operation and maintenance session connection request, a session proxy channel is established between the terminal device and the target asset. This session proxy channel can also be called a bidirectional data proxy channel. Optionally, the target proxy SCS unit can receive the operation and maintenance session connection request from the terminal device through a network interface. The operation and maintenance session connection request can be a Transmission Control Protocol (TCP) connection request.

[0021] The target proxy SCS unit can be any one of the multiple proxy SCS units, or the proxy SCS unit with the lowest load among the multiple proxy SCS units. Each of the multiple proxy SCS units can perform similar operations to the target proxy SCS unit when it receives an operation and maintenance session connection request.

[0022] After establishing a session proxy channel between the terminal device and the target asset, the target agent SCS unit can obtain a copy of the session data flowing through the session proxy channel. In other words, all session data flowing through the session proxy channel (including data from the terminal device to the target asset and data from the target asset to the terminal device) is copied in real time. The target asset refers to authorized and managed Information Technology (IT) infrastructure, which may include servers, network devices, databases, or middleware. Data from the terminal device to the target asset can be understood as user commands issued by the terminal device to the target asset, and data from the target asset to the terminal device can be understood as feedback data from the target asset's execution of operations.
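One direction of such a channel can be sketched as a relay loop that hands a copy of every chunk to a recorder while forwarding the original bytes. The function name `relay` and the `tap` callback are assumptions made for this example; a bidirectional channel would run one such loop per direction.

```python
import socket
from typing import Callable


def relay(src: socket.socket, dst: socket.socket,
          tap: Callable[[bytes], None], bufsize: int = 4096) -> None:
    """Forward bytes from src to dst, handing a copy of each chunk to tap."""
    while True:
        chunk = src.recv(bufsize)
        if not chunk:          # empty read: the peer closed the connection
            break
        tap(chunk)             # copy kept for auditing
        dst.sendall(chunk)     # original traffic continues to its destination
```

In this sketch the copy is taken inline; how the real system obtains the copy without slowing the session is not specified in the text.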

[0023] The target agent SCS unit can write the acquired session data copy into its local first buffer. Each agent SCS unit has its own local first buffer, located on the local disk, with a rolling and capacity-limiting mechanism. The rolling mechanism can be understood as a management strategy that makes room for new data within the limited local first buffer: when the data in the buffer reaches a preset size or has been retained for a preset duration, the oldest data is rolled out from one end of the buffer as new data is written.
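The rolling and capacity-limiting behaviour can be sketched as below. This is an in-memory stand-in for the disk-backed buffer, and the class name, the byte-count limit, and the injectable clock are all assumptions made for the example.

```python
import time
from collections import deque
from typing import Callable, Deque, List, Tuple


class RollingBuffer:
    """In-memory stand-in for the disk-backed local first buffer."""

    def __init__(self, max_bytes: int, max_age_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_bytes = max_bytes
        self.max_age_s = max_age_s
        self._clock = clock
        self._items: Deque[Tuple[float, bytes]] = deque()
        self._size = 0

    def _roll(self) -> None:
        now = self._clock()
        # Drop from the oldest end while over capacity or past the age limit.
        while self._items and (
            self._size > self.max_bytes
            or now - self._items[0][0] > self.max_age_s
        ):
            _, old = self._items.popleft()
            self._size -= len(old)

    def write(self, chunk: bytes) -> None:
        self._items.append((self._clock(), chunk))
        self._size += len(chunk)
        self._roll()

    def contents(self) -> List[bytes]:
        self._roll()
        return [chunk for _, chunk in self._items]
```

A real implementation would presumably also have to decide what happens to data that is rolled out before it has been published; the text does not cover that case and neither does this sketch.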

[0024] The target agent SCS unit writes the acquired session data copy to its local first buffer (local persistence) before any reporting action; this is a "write-ahead logging" mechanism. Because the target agent SCS unit has already saved the session data copy, the integrity of the copy is preserved: even if the federated communication bus or the audit SCS unit fails, the session data copy is not lost and the proxy function is not blocked.
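The ordering described here, local persistence first and reporting second, can be sketched as follows. The function name and the `report` callback are hypothetical, and a plain append-only file stands in for the local first buffer.

```python
import os
from typing import Callable


def record_then_report(path: str, chunk: bytes,
                       report: Callable[[bytes], None]) -> bool:
    """Persist chunk locally first; a reporting failure never undoes the write."""
    with open(path, "ab") as f:
        f.write(chunk)
        f.flush()
        os.fsync(f.fileno())   # data reaches local disk before any reporting
    try:
        report(chunk)
        return True
    except Exception:
        # Bus or audit unit unavailable: the local copy remains for a later retry.
        return False
```

The return value only tells the caller whether reporting succeeded; in both cases the chunk is already on disk.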

[0025] After the target agent SCS unit writes the acquired session data copy into its local first buffer, it can generate audit event messages based on the session data copy in the local first buffer. Then, it publishes the audit event messages to the federated communication bus according to the audit topic, performing asynchronous publication of audit event messages. That is, the target agent SCS unit uses the locally persistent session data copy as an event to notify the federated communication bus in a loosely coupled manner.

[0026] It should be noted that the federated communication bus has multiple topics, such as audit topics, configuration update topics, operation instruction topics, and alarm and event topics. Publishing audit event messages to the federated communication bus according to the audit topic can be understood as publishing audit event messages to the audit topic on the federated communication bus, so that the audit SCS units that subscribe to audit topic messages can process the audit event messages. An audit topic refers to a topic related to operation and maintenance auditing, and audit topic messages refer to messages related to operation and maintenance auditing.

[0027] Optionally, the target agent SCS unit is also used to: upon receiving an operation and maintenance session connection request, allocate a session proxy thread for the request, the session proxy thread being used to perform the following steps based on the request: establishing a session proxy channel between the terminal device and the target asset, obtaining a copy of the session data flowing through the session proxy channel, and writing the copy of the session data into the local first buffer; and monitor the local first buffer using a publishing thread independent of the session proxy thread, wherein, if the publishing thread detects a new copy of session data in the local first buffer, the publishing thread performs the following steps: generating an audit event message based on the copy of session data in the local first buffer, and publishing the audit event message to the federated communication bus according to the audit topic.

[0028] The target agent SCS unit can process operation and maintenance session connection requests through the session agent thread. The session agent thread establishes a session agent channel between the terminal device and the target asset, obtains a copy of the session data flowing through the session agent channel, and writes the copy of the session data into the local first buffer.

[0029] Optionally, a listening service can be started on the target proxy SCS unit. This listening service is bound to a preset Internet Protocol (IP) address and port. When a new TCP connection request is accepted, the target proxy SCS unit allocates a separate session proxy thread for this TCP connection request. A session proxy thread can also be understood as an execution thread or coroutine. This allows different TCP connection requests to be processed in isolation, preventing a problem in one TCP connection request from affecting other TCP connection requests.

[0030] The target agent SCS unit can monitor the first buffer through the publishing thread. The publishing thread is independent of the session agent thread. When the publishing thread detects a new copy of session data in the first buffer, it can read the new copy of session data, generate an audit event message, and then publish the audit event message to the federated communication bus according to the audit topic.

[0031] The actions executed by the publishing thread and the actions executed by the session proxy thread are asynchronous. Whether the operation executed by the publishing thread is successful or not does not affect the operation executed by the session proxy thread, thus decoupling the proxy function from the audit function.
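The decoupling between the two threads can be sketched with a queue standing in for the local first buffer. All names here are assumptions made for the example, JSON is used only as a placeholder message format, and `publish` is a hypothetical callback rather than the API of any particular message bus.

```python
import json
import queue
import threading
from typing import Callable, Iterable, List

_STOP = object()  # sentinel telling the publishing thread to exit


def session_proxy_thread(chunks: Iterable[bytes], buffer: "queue.Queue") -> None:
    # Stand-in for the proxy path: it only writes copies into the local buffer.
    for chunk in chunks:
        buffer.put(chunk)


def publishing_thread(buffer: "queue.Queue",
                      publish: Callable[[str, bytes], None],
                      failed: List[bytes]) -> None:
    while True:
        chunk = buffer.get()
        if chunk is _STOP:
            break
        message = json.dumps({"data": chunk.decode("utf-8", "replace")}).encode()
        try:
            publish("audit", message)
        except Exception:
            failed.append(chunk)  # kept for retry; the session side is unaffected


def run_demo(chunks: List[bytes],
             publish: Callable[[str, bytes], None]) -> List[bytes]:
    buffer: "queue.Queue" = queue.Queue()
    failed: List[bytes] = []
    pub = threading.Thread(target=publishing_thread, args=(buffer, publish, failed))
    sess = threading.Thread(target=session_proxy_thread, args=(chunks, buffer))
    pub.start()
    sess.start()
    sess.join()          # the session side finishes regardless of publish results
    buffer.put(_STOP)
    pub.join()
    return failed
```

Calling `run_demo` with a `publish` function that always raises still lets the session thread complete; the unpublished chunks are simply returned for a later retry.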

[0032] The federated communication bus, acting as the asynchronous nervous system of the operation and maintenance auditing system, connects all agent SCS units and audit SCS units, providing reliable and persistent message delivery. It is the core of achieving service decoupling and architectural resilience. Audit event messages flow in from agent SCS units, are persisted and routed through the federated communication bus, and finally flow to one or more audit SCS units.

[0033] Specifically, after receiving an audit event message from the target agent SCS unit, the federated communication bus can first store the audit event message in persistent storage, and then deliver the audit event message to the target audit SCS unit. The target audit SCS unit is one of at least one audit SCS units that subscribes to audit topic messages. That is, after receiving the audit event message, the federated communication bus does not deliver it immediately, but performs a persistence operation first. This ensures the integrity of the audit event message. If the target audit SCS unit fails, the federated communication bus can still deliver the audit event message to other audit SCS units.

[0034] Optionally, the federated communication bus includes multiple bus nodes, each of which is equipped with persistent storage. Specifically, the federated communication bus is used to store audit event messages in the persistent storage of a preset first number of bus nodes.

[0035] The federated communication bus is a cluster of multiple bus nodes, which can be implemented based on NATS JetStream (a high-performance streaming message engine) or Apache Pulsar (an open-source distributed messaging platform). The first quantity can be understood as a quorum. The federated communication bus stores audit event messages in persistent storage on the first number of bus nodes, ensuring that audit event messages are not lost even if some bus nodes fail.
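The quorum idea can be sketched independently of any particular middleware. The function below is an assumption-level illustration, not how NATS JetStream or Apache Pulsar replicate internally: each entry in `node_writers` is a hypothetical callable that persists the message on one bus node and raises `OSError` on failure.

```python
from typing import Callable, List


def store_with_quorum(message: bytes,
                      node_writers: List[Callable[[bytes], None]],
                      quorum: int) -> bool:
    """Return True once at least `quorum` bus nodes have persisted the message."""
    acks = 0
    for write in node_writers:
        try:
            write(message)
            acks += 1
        except OSError:
            continue  # a failed node does not count toward the quorum
        if acks >= quorum:
            return True
    return False
```

With three nodes and a quorum of two, one node can fail and the message is still considered stored, which matches the claim that messages survive partial node failure.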

[0036] Optionally, the federated communication bus can maintain a mapping between topics and a list of subscribers. Based on the audit topic of the audit event message, it searches for all subscribers to the audit topic message, i.e., at least one audit SCS unit. Then, it identifies the target audit SCS unit among these at least one audit SCS unit and delivers the audit event message to it. The audit SCS unit is the consumer of the audit event message. Multiple audit SCS units can subscribe to audit topic messages in a "Queue Group" mode to enhance the load balancing and failover capabilities of the audit SCS units. The target audit SCS unit can be a healthy audit SCS unit among the at least one audit SCS unit, or the audit SCS unit with the lowest load.
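The topic-to-subscriber mapping and the choice of a target can be sketched as follows. The class names and the integer `load` metric are assumptions made for the example; the selection rule combines the two options mentioned above by picking the least-loaded healthy subscriber.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Subscriber:
    name: str
    healthy: bool
    load: int  # e.g. number of in-flight messages; the metric is an assumption


class TopicRouter:
    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Subscriber]] = {}

    def subscribe(self, topic: str, sub: Subscriber) -> None:
        self._subscribers.setdefault(topic, []).append(sub)

    def pick_target(self, topic: str) -> Optional[Subscriber]:
        # Only one member of the group receives each message.
        candidates = [s for s in self._subscribers.get(topic, []) if s.healthy]
        return min(candidates, key=lambda s: s.load) if candidates else None
```

Returning `None` when no healthy subscriber exists leaves the message in persistent storage for a later delivery attempt, which is the behaviour a caller would be expected to implement.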

[0037] It should be noted that, for any SCS unit (such as any agent SCS unit or any audit SCS unit), the health status of the SCS unit can be determined through its health status information, which may include at least one of the following:
the liveness status of the core processes of this SCS unit;
the network reachability of the external service ports of this SCS unit;
the communication status of this SCS unit with the federated communication bus.
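One way to combine these signals is sketched below. The rule used (every collected signal must be good, and at least one must have been collected) is an assumption for illustration; the text does not say how the signals are combined.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HealthInfo:
    # None means the signal was not collected for this SCS unit.
    core_processes_alive: Optional[bool] = None
    service_port_reachable: Optional[bool] = None
    bus_connected: Optional[bool] = None


def is_healthy(info: HealthInfo) -> bool:
    signals = [info.core_processes_alive,
               info.service_port_reachable,
               info.bus_connected]
    known = [s for s in signals if s is not None]
    # Assumed rule: at least one signal is present and all present signals are good.
    return bool(known) and all(known)
```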

[0038] The target audit SCS unit obtains audit event messages from the federated communication bus, processes these messages to obtain audit logs, and then stores the audit logs in the long-term storage system. The audit SCS unit has a single data source: it subscribes to audit topic messages from the federated communication bus. As a consumer of audit event messages delivered by the federated communication bus, it can asynchronously process audit event messages to obtain audit logs, which ultimately flow to the external long-term storage system.

[0039] The system provided in this application's embodiments processes maintenance session connection requests, obtains session data copies, writes them to a local first buffer, and then publishes audit event messages generated based on the session data copies in the local first buffer to the federated communication bus. The federated communication bus stores the audit event messages in persistent storage and then delivers them to the target audit SCS unit. The target audit SCS unit processes the audit event messages, obtains audit logs, and stores the audit logs in a long-term storage system. Each agent SCS unit and each audit SCS unit operate independently, without relying on centralized components such as shared databases. Faults in each agent SCS unit and each audit SCS unit are confined within their own scope and will not be propagated to other SCS units through centralized components such as shared databases, thus improving system stability.

[0040] In some embodiments of this application, the target proxy SCS unit is specifically used to serialize a newly written copy of session data in a local first buffer using a preset data format to generate an audit event message; and the target audit SCS unit is specifically used to deserialize audit event messages to obtain audit logs.

[0041] In this embodiment, when there is a newly written copy of session data in the local first buffer, the target proxy SCS unit can use a preset data format (such as Protobuf (Protocol Buffer, a data exchange format)) to serialize the newly written copy of session data in the local first buffer, generate an audit event message, and then publish the audit event message to the federated communication bus. After storing the audit event message in persistent storage, the federated communication bus delivers the audit event message to the target audit SCS unit. The audit event message obtained by the target audit SCS unit from the federated communication bus is a byte stream serialized by the proxy SCS unit. The target audit SCS unit can deserialize the audit event message to restore it into a structured data object, obtain the audit log, and then store the audit log in the long-term storage system.

[0042] Serializing session data copies gives the generated audit event messages a standardized format and compact structure, reducing network transmission overhead and parsing complexity, and helping to improve message delivery efficiency.
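The serialize-then-deserialize round trip can be sketched as below. JSON is used here purely as a stand-in so that the example is self-contained; the text names Protobuf as an example of the preset format, which would require a schema and generated classes. The function and field names are assumptions.

```python
import json
from typing import Any, Dict


def serialize_event(session_id: str, direction: str, payload: bytes) -> bytes:
    """Proxy side: turn one session data copy into an audit event message."""
    event = {
        "session_id": session_id,
        "direction": direction,        # "to_asset" or "to_terminal" (assumed labels)
        "payload_hex": payload.hex(),  # hex keeps arbitrary bytes JSON-safe
    }
    return json.dumps(event, sort_keys=True).encode("utf-8")


def deserialize_event(message: bytes) -> Dict[str, Any]:
    """Audit side: restore the byte stream into a structured object."""
    event = json.loads(message.decode("utf-8"))
    event["payload"] = bytes.fromhex(event.pop("payload_hex"))
    return event
```

A binary format such as Protobuf would produce more compact messages than this JSON placeholder, which is the efficiency benefit the paragraph above refers to.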

[0043] In some embodiments of this application, the federated communication bus is also used to select a first audit SCS unit from at least one audit SCS unit based on a preset selection strategy, and determine the first audit SCS unit as the target audit SCS unit.

[0044] In this embodiment, after receiving an audit event message from the target agent SCS unit and storing it in persistent storage, the federated communication bus can select a first audit SCS unit from at least one audit SCS unit based on a preset selection strategy. The selection strategy can be a load balancing strategy or a health selection strategy. For example, the audit SCS unit with the lowest load can be selected from at least one audit SCS unit, or a healthy audit SCS unit can be selected from at least one audit SCS unit. After selecting the first audit SCS unit, it is determined as the target audit SCS unit, and the audit event message is delivered to the target audit SCS unit.

[0045] Optionally, the federated communication bus is also configured to, after delivering the audit event message to the target audit SCS unit, if the target audit SCS unit disconnects before returning an acknowledgment message, or if no acknowledgment message is received from the target audit SCS unit within a preset first time period, select a second audit SCS unit from the at least one audit SCS unit, determine the second audit SCS unit as the target audit SCS unit, and repeat the step of delivering the audit event message to the target audit SCS unit.

[0046] After the federated communication bus delivers the audit event message to the target audit SCS unit, the target audit SCS unit processes the audit event message and returns an acknowledgment message upon successful processing. If the federated communication bus detects that the target audit SCS unit disconnects before returning an acknowledgment message, or if the federated communication bus detects that it has not received an acknowledgment message from the target audit SCS unit within a preset first time period, it can be considered that the target audit SCS unit has malfunctioned or its processing is abnormal. In this case, a second audit SCS unit can be selected from at least one audit SCS unit and designated as the target audit SCS unit. The step of delivering the audit event message to the target audit SCS unit is then repeated so that the target audit SCS unit can process the audit event message.

[0047] By re-delivering audit event messages, it can be ensured that audit event messages are not lost when the audit SCS unit fails or processes abnormally, ensuring that audit event messages are processed successfully "at least once" and achieving high availability of the audit SCS unit.

[0048] The second audit SCS unit is an audit SCS unit, among the at least one audit SCS unit, other than the first audit SCS unit.
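The re-delivery behaviour described above can be sketched as a simple failover loop. The `deliver` callback is a hypothetical stand-in for the bus's delivery step: it is assumed to return `True` when an acknowledgment arrives within the first time period and `False` on a timeout or disconnect.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical signature: (unit_id, message) -> acknowledged in time?
Deliver = Callable[[str, bytes], bool]


def deliver_with_failover(
    message: bytes,
    audit_units: Sequence[str],
    deliver: Deliver,
) -> Optional[str]:
    """Try each audit unit in turn until one acknowledges the message.

    Returns the identifier of the acknowledging unit, or None if every
    unit failed (the message would then remain in persistent storage).
    """
    for unit_id in audit_units:
        if deliver(unit_id, message):
            return unit_id
    return None


attempts: List[str] = []


def fake_deliver(unit_id: str, message: bytes) -> bool:
    # Simulated bus delivery: unit 102a never acknowledges.
    attempts.append(unit_id)
    return unit_id != "102a"


acked_by = deliver_with_failover(b"audit-event", ["102a", "102b"], fake_deliver)
```

Because a unit may have processed the message before its acknowledgment was lost, this scheme delivers at least once rather than exactly once, so a consumer may see the same message twice.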

[0049] In some embodiments of this application, the target audit SCS unit is specifically used to store the audit logs into a local second buffer after obtaining the audit logs; and to write the audit logs in the local second buffer into a long-term storage system in batches.

[0050] In this embodiment, each audit SCS unit has its own local second buffer. After processing the audit event message and obtaining the audit log, the target audit SCS unit can first store the audit log in its local second buffer, and then write the audit log in the local second buffer in batches to the long-term storage system.

[0051] Optionally, the target audit SCS unit may perform a batch write operation to the long-term storage system when the size of the audit log in the local second buffer is greater than or equal to a preset threshold. Alternatively, the target audit SCS unit may trigger a batch write operation to the long-term storage system based on the duration of a timer. The duration of the timer can be a preset second duration.

[0052] This batch processing mechanism helps improve the system's throughput, enabling audit logs to be efficiently and persistently stored in the final long-term storage system.
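A minimal sketch of the two flush triggers follows. The class name, the use of an item count as the size threshold, and the injectable `clock` are assumptions for illustration; the source only specifies a preset threshold and a preset second duration.

```python
import time
from typing import Callable, List


class BatchBuffer:
    """Local second buffer that flushes on a size threshold or elapsed time."""

    def __init__(
        self,
        flush: Callable[[List[str]], None],
        max_items: int,
        max_age_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush = flush
        self._max_items = max_items
        self._max_age = max_age_seconds
        self._clock = clock
        self._items: List[str] = []
        self._last_flush = clock()

    def __len__(self) -> int:
        return len(self._items)

    def add(self, audit_log: str) -> None:
        self._items.append(audit_log)
        self.maybe_flush()

    def maybe_flush(self) -> None:
        """Write the buffered logs as one batch if either trigger has fired."""
        too_big = len(self._items) >= self._max_items
        too_old = self._clock() - self._last_flush >= self._max_age
        if self._items and (too_big or too_old):
            batch, self._items = self._items, []
            self._last_flush = self._clock()
            self._flush(batch)


written: List[List[str]] = []  # stand-in for the long-term storage system
buf = BatchBuffer(written.append, max_items=3, max_age_seconds=60.0)
for i in range(7):
    buf.add(f"log-{i}")
```

In a real unit, `maybe_flush` would also be called periodically by a timer so that a quiet buffer is still written out after the second duration elapses.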

[0053] In some embodiments of this application, the target audit SCS unit is also used to determine and record the context information associated with the audit log after obtaining the audit log.

[0054] After obtaining the audit logs, the target audit SCS unit can further determine and record the context information associated with the audit logs. The context information associated with the audit logs can include user context information, asset context information, and audit enhancement information. User context information can include the user's real name, department/role, and user permissions. For example, the user's real name, department information, and permission information can be queried based on the user identifier corresponding to the audit log. Asset context information can include the business ownership of the target asset and the person responsible for the asset. Audit enhancement information can include the security policy associated with the operation and the operation risk level marker.

[0055] Recording contextual information associated with audit logs can enrich the content of audit logs and facilitate further querying and processing of audit logs, such as compliance review, behavior analysis, and security incident tracing.
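As an illustration of the enrichment step, the sketch below joins an audit log with user and asset context by identifier. The field names (`user_id`, `asset_id`, `real_name`, `owner`, and so on) and the in-memory lookup tables are hypothetical; the source does not specify where the context information is queried from.

```python
from typing import Any, Dict

Record = Dict[str, Any]


def enrich(audit_log: Record, users: Dict[str, Record],
           assets: Dict[str, Record]) -> Record:
    """Return a copy of the audit log with user and asset context attached."""
    enriched = dict(audit_log)
    enriched["user_context"] = users.get(audit_log["user_id"], {})
    enriched["asset_context"] = assets.get(audit_log["asset_id"], {})
    return enriched


users = {"u-17": {"real_name": "A", "department": "ops", "permissions": ["ssh"]}}
assets = {"S1": {"business": "payments", "owner": "team-db"}}
log = {"user_id": "u-17", "asset_id": "S1", "command": "systemctl status"}
enriched_log = enrich(log, users, assets)
```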

[0056] In some embodiments of this application, the target agent SCS unit is further configured to: before establishing a session proxy channel between the terminal device and the target asset, obtain the target authentication credentials from the local private database; authenticate, based on the target authentication credentials, the identity of the user who initiates the operation and maintenance session connection request through the terminal device; and, if identity authentication succeeds, proceed with the step of establishing a session proxy channel between the terminal device and the target asset.

[0057] In this embodiment, each agent SCS unit has its own corresponding local private database, in which authentication credentials, such as Secure Shell (SSH) public keys or password hashes, can be pre-synchronized and cached. The local private database can be an embedded database coexisting with the agent SCS unit.

[0058] After receiving a maintenance session connection request for a target asset from the terminal device, the target agent SCS unit can first retrieve the target authentication credential from its local private database. Optionally, the target agent SCS unit can retrieve the target authentication credential from its local private database based on the terminal device information, target asset information, user identification information, etc., associated with the received maintenance session connection request. Different terminal devices, different assets, and different user identifiers can correspond to the same or different authentication credentials.

[0059] The target proxy SCS unit, based on the target authentication credentials, can authenticate users who initiate maintenance session connection requests through terminal devices to determine the legitimacy of their access to the target asset. If authentication is successful, the target proxy SCS unit can establish a session proxy channel between the terminal device and the target asset and continue with subsequent operations. If authentication fails, the target proxy SCS unit can return feedback information indicating authentication failure.

[0060] Each agent SCS unit stores authentication credentials in its local private database. Each agent SCS unit can authenticate users who initiate maintenance session connection requests through terminal devices based on its local private database, without relying on an external database. This avoids the situation where all agent SCS units are unable to authenticate users because an external database has failed.
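Local authentication against an embedded database can be sketched as below. SQLite (in memory here) stands in for the local private database, and the table layout, PBKDF2 parameters, and function names are assumptions for the example; the source only states that credentials such as SSH public keys or password hashes are cached locally.

```python
import hashlib
import hmac
import os
import sqlite3

ITERATIONS = 100_000  # assumed work factor for this sketch


def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


def open_local_db() -> sqlite3.Connection:
    """Create the embedded credential store (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE credentials ("
        "user_id TEXT, asset_id TEXT, salt BLOB, pw_hash BLOB, "
        "PRIMARY KEY (user_id, asset_id))"
    )
    return db


def store_credential(db: sqlite3.Connection, user_id: str,
                     asset_id: str, password: str) -> None:
    salt = os.urandom(16)
    db.execute(
        "INSERT INTO credentials VALUES (?, ?, ?, ?)",
        (user_id, asset_id, salt, hash_password(password, salt)),
    )


def authenticate(db: sqlite3.Connection, user_id: str,
                 asset_id: str, password: str) -> bool:
    """Check a password using only the local store, with no network call."""
    row = db.execute(
        "SELECT salt, pw_hash FROM credentials "
        "WHERE user_id = ? AND asset_id = ?",
        (user_id, asset_id),
    ).fetchone()
    if row is None:
        return False
    salt, expected = row
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(hash_password(password, salt), expected)


db = open_local_db()
store_credential(db, "userA", "S1", "s3cret")
```

Since the lookup never leaves the process, a failure of any external service does not affect the result of `authenticate`.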

[0061] In some embodiments of this application, as shown in Figure 2, a consensus subunit is deployed in each proxy SCS unit and each audit SCS unit. Any consensus subunit is used to periodically select one or more consensus subunits other than itself and to exchange with the selected consensus subunits the identification information and health status information of the proxy SCS units or audit SCS units that each has perceived, so that operation and maintenance session connection requests are routed to healthy proxy SCS units and/or audit event messages are delivered to healthy audit SCS units.

[0062] In this embodiment, a consensus subunit is deployed in each proxy SCS unit and each audit SCS unit. Each consensus subunit can perceive the identification information and health status information of the proxy SCS unit or audit SCS unit. Any consensus subunit can periodically select (e.g., randomly select) one or more consensus subunits other than itself, and then exchange the identification information and health status information of the proxy SCS unit or audit SCS unit that it has perceived with the selected consensus subunit. Optionally, the consensus subunit can perform service discovery and health monitoring operations through the Gossip protocol (an information propagation protocol).

[0063] By using a decentralized approach and consensus subunits with minimal network overhead, each proxy SCS unit or audit SCS unit can become aware of the existence and health status of other SCS units. This enables each proxy SCS unit and each audit SCS unit to obtain a consistent cluster view. The cluster view can be synchronized to the load balancer at the front end of the operation and maintenance audit system and to the federated communication bus. This allows the load balancer to route operation and maintenance session connection requests to healthy proxy SCS units based on the cluster view, and allows the federated communication bus to deliver audit event messages to healthy audit SCS units based on the cluster view.

[0064] The cluster view can include the identifier and health status of each agent SCS unit and each audit SCS unit in the cluster, as well as metadata such as role, version number, and load metric for each agent SCS unit and each audit SCS unit.
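The periodic exchange can be illustrated with a simplified push-pull gossip round. This sketch assumes each observation carries a `version` counter so that the newer of two observations wins; the source does not describe how conflicting observations are reconciled, and a production Gossip implementation would also handle failure suspicion and timeouts.

```python
import random
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class NodeState:
    healthy: bool
    version: int  # assumed: a higher version is a more recent observation


View = Dict[str, NodeState]


def merge_views(mine: View, theirs: View) -> View:
    """Keep, for every unit, whichever observation is more recent."""
    merged = dict(mine)
    for node_id, state in theirs.items():
        current = merged.get(node_id)
        if current is None or state.version > current.version:
            merged[node_id] = state
    return merged


def gossip_round(views: Dict[str, View], rng: random.Random) -> None:
    """Each unit picks one random peer and both adopt the merged view."""
    for node_id in list(views):
        peer = rng.choice([p for p in views if p != node_id])
        merged = merge_views(views[node_id], views[peer])
        views[node_id] = merged
        views[peer] = dict(merged)


# Initially every unit knows only about itself.
views: Dict[str, View] = {
    name: {name: NodeState(healthy=True, version=1)}
    for name in ("101a", "101b", "102a")
}
rng = random.Random(0)
for _ in range(5):
    gossip_round(views, rng)
```

Repeated rounds spread each unit's observations through the cluster, so the views tend toward a common cluster view without any central registry.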

[0065] In some embodiments of this application, the first consensus subunit is used for: upon receiving an update request for the target data, writing the updated target data as a log entry to the local log; sending the log entry in parallel to multiple second consensus subunits; and, after receiving success response messages from a preset second number of the multiple second consensus subunits, marking the log entry as committed, so that each consensus subunit updates its local state machine based on the log entry. The first consensus subunit is the leader elected by the consensus subunits deployed in each agent SCS unit and each audit SCS unit. The multiple second consensus subunits are the consensus subunits other than the first consensus subunit.

[0066] In this embodiment, each proxy SCS unit and each audit SCS unit deploys a consensus subunit. The consensus subunit can perform strong consistency replication of the target data using the Raft protocol (a log-replication-based consistency algorithm) to ensure the consistency of the target data across all proxy SCS units and audit SCS units, reducing the risk of "split-brain" scenarios. This avoids data conflicts and service state splits caused by multiple nodes writing simultaneously under abnormal conditions. The target data can be understood as critical data or strongly consistent data, such as authentication credentials and global security policies.

[0067] Each consensus subunit can elect a leader, which is the first consensus subunit. Optionally, the Raft component of each consensus subunit can elect a unique leader within the cluster. All update requests for the target data must first be routed to the first consensus subunit. Consensus subunits other than the first consensus subunit, i.e., second consensus subunits, can be called followers.

[0068] Upon receiving an update request for target data, the first consensus subunit can first write the updated target data as a log entry to its local log, and then send the log entry in parallel to the second consensus subunit. Each second consensus subunit, upon receiving the log entry, can write it to its local log and then return a success response message to the first consensus subunit. Updates to the target data can include adding or modifying the target data.

[0069] If the first consensus subunit receives a successful acknowledgment message from a predetermined second number of second consensus subunits (among multiple second consensus subunits) indicating that a log entry has been written to its local log, then that log entry is marked as committed. This allows each consensus subunit to update its local state machine based on that log entry, applying the instructions from that log entry sequentially to its local state machine, thereby updating the data state in memory. This signifies that updating the target data has become an established fact for the cluster. This effectively reduces the risk of split-brain. The second number can be understood as a quorum.
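The commit rule can be sketched as follows. This is a deliberately reduced model, not the Raft protocol itself: there are no terms, elections, or log repair, followers are contacted sequentially rather than in parallel, and each follower is represented by a hypothetical callback that returns `True` once it has written the entry. Counting the leader's own write toward the majority follows the usual Raft convention and is an assumption relative to the source, which counts follower responses.

```python
from typing import Callable, List, Sequence

AppendEntry = Callable[[str], bool]  # hypothetical follower interface


def quorum_size(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


class Leader:
    def __init__(self, followers: Sequence[AppendEntry]) -> None:
        self.followers = followers
        self.log: List[str] = []
        self.committed: List[str] = []

    def propose(self, entry: str) -> bool:
        """Append locally, replicate, and commit only with a majority."""
        self.log.append(entry)
        acks = 1  # the leader's own log write
        for append_entry in self.followers:
            if append_entry(entry):
                acks += 1
        if acks >= quorum_size(len(self.followers) + 1):
            self.committed.append(entry)
            return True
        return False


def ok(entry: str) -> bool:
    return True


def down(entry: str) -> bool:
    return False


# Five nodes in total: the leader plus four followers, two of them unreachable.
leader = Leader([ok, ok, down, down])
committed = leader.propose("credential-v2")
```

With five nodes the quorum is three, so the entry commits even though two followers are down. A partitioned minority can never reach three acknowledgments, which is what limits the split-brain risk.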

[0070] Optionally, after the first consensus subunit marks the log entry as committed, it can publish a change message for the log entry to the federated communication bus under a configuration update notification topic. Each agent SCS unit that subscribes to the configuration update notification topic can, after receiving the change notification, obtain the updated target data from its local state machine and apply it in subsequent session processing.

[0071] The consensus subunit embedded within each SCS unit (including the proxy SCS unit and the audit SCS unit) solves the problems of service discovery and critical configuration data consistency by using a hybrid consensus mechanism, which is the foundation of system dynamism. Through the consensus subunit, health status data (heartbeat) can flow freely among all SCS units in a gossip manner, while "write" requests for target data flow unidirectionally to the leader, and then the leader replicates it to all followers.

[0072] To facilitate understanding, the technical solutions provided in the embodiments of this application will be further described below through specific examples.

[0073] A possible process for establishing an operations and maintenance session connection and transferring audit data is as follows: Operation and maintenance session connection request -> Layer 4 load balancer -> Target agent SCS unit -> Target agent SCS unit establishes session agent channel and writes session data copy (session recording) to its local first buffer -> Target agent SCS unit publishes audit event message (generated based on session data copy) to the federated communication bus -> Audit SCS unit subscribes to and consumes the audit event message from the federated communication bus -> Stores it in the long-term storage system (long-term database).

[0074] Take the following business scenario as an example: An internet company needs to provide its operations and maintenance team with a 24/7 uninterrupted bastion host service for accessing tens of thousands of core servers. Auditing requirements are stringent; no operation records can be lost. The operations and maintenance auditing system deploys three proxy SCS units (101a, 101b, 101c) and two audit SCS units (102a, 102b).

[0075] Operating steps and working principle: Connection Distribution: Operations personnel A requests a connection to target server S1 via an SSH client. This connection request first reaches a Layer 4 load balancer (L4 LB) at the front end of the operations audit system. The L4 LB forwards this connection request to one of the healthy proxy SCS units, such as proxy SCS unit 101a, using a round-robin or least-connections strategy.

[0076] Condition: L4 LB determines the health status of all agent SCS units (101a, 101b, 101c) by performing health checks on their service ports.

[0077] Session proxy channel establishment and local authentication credential verification: When the proxy SCS unit 101a receives the operation and maintenance session connection request, it loads the synchronized static SSH authentication credentials from its local private database (local private data storage) (or triggers other authentication processes). After successful verification, the proxy SCS unit 101a establishes an SSH connection with the target server S1 on behalf of user A.

[0078] Technical principle: The localization of authentication credentials makes this step independent of any external network or database service. Even if other SCS units (such as 101b, 101c, 102a, 102b) or the federated communication bus fail at this time, the proxy SCS unit 101a can still independently complete the establishment of the session proxy channel.

[0079] Session proxy and local log recording: Proxy SCS unit 101a establishes a session proxy channel (bidirectional data flow pipeline) between user A and server S1. All session data flowing through this channel (such as operation commands issued by the user through the terminal device, and feedback data of asset operation) is copied in real time to obtain a session data copy, and written in a structured format (e.g., {session_id, timestamp, data_chunk}) to the rolling log file of the local first buffer of proxy SCS unit 101a.

[0080] Technical principle (write-before-send): This is a crucial step in ensuring that session data copies (audit data) are not lost. Before sending a copy of the session data to any network component, it is first persisted to the local disk. This creates a reliable buffer that can withstand the failure of any subsequent components.
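The write-before-send ordering can be sketched as below. JSON lines stand in for the structured record format, and `publish` is a hypothetical callback representing the hand-off to the bus; in the described system the hand-off is done by a separate publishing thread, which this sketch collapses into a single call for brevity.

```python
import json
import os
import tempfile
import time
from typing import Callable


def record_then_publish(log_path: str, session_id: str, data_chunk: bytes,
                        publish: Callable[[bytes], None]) -> None:
    """Persist the record to local disk first, then attempt to publish it."""
    record = {
        "session_id": session_id,
        "timestamp": time.time(),
        "data_chunk": data_chunk.decode("utf-8", errors="replace"),
    }
    line = (json.dumps(record) + "\n").encode("utf-8")
    with open(log_path, "ab") as f:
        f.write(line)
        f.flush()
        os.fsync(f.fileno())  # force the bytes to disk before any network I/O
    try:
        publish(line)
    except ConnectionError:
        pass  # the local copy is durable, so publishing can be retried later


def failing_publish(_: bytes) -> None:
    raise ConnectionError("bus unreachable")


log_path = os.path.join(tempfile.mkdtemp(), "session.log")
record_then_publish(log_path, "sess-1", b"ls -l\n", failing_publish)
with open(log_path, "rb") as f:
    stored = [json.loads(line) for line in f]
```

Even though the publish call fails, the record is already in the local file, which is the property the paragraph above relies on.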

[0081] Asynchronous event publishing: An independent thread (or coroutine) within the agent SCS unit 101a monitors the aforementioned local log files. When new content is written to the local log files, this thread reads the new log data blocks, encapsulates them into a Protobuf-formatted audit event message, and then publishes it to the federated communication bus, such as audit.logs.ssh, according to the audit topic.

[0082] Parameter: The message delivery guarantee level is set to "At-Least-Once".

[0083] Technical Principle (Asynchronous Decoupling): The publish operation is a non-blocking "fire-and-forget" operation. The core proxy performance of the proxy SCS unit 101a is not affected by the processing speed or availability of the downstream audit SCS unit.
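The decoupling between the session path and the publishing path can be illustrated with two threads and a queue. The in-memory `queue.Queue` is an assumed stand-in for the rolling log file of the local first buffer, and `publish` is a hypothetical callback for the bus client.

```python
import queue
import threading
from typing import Callable, List

_STOP = object()  # sentinel used to shut the publisher down


def start_publisher(buffer: queue.Queue,
                    publish: Callable[[bytes], None]) -> threading.Thread:
    """Run a background thread that drains the buffer and publishes each item."""

    def run() -> None:
        while True:
            item = buffer.get()
            if item is _STOP:
                break
            publish(item)

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


published: List[bytes] = []
buffer: queue.Queue = queue.Queue()
worker = start_publisher(buffer, published.append)

# The session thread only enqueues; it never waits for the bus.
for chunk in (b"cmd-1", b"cmd-2", b"cmd-3"):
    buffer.put(chunk)

buffer.put(_STOP)
worker.join(timeout=5)
```

Because `put` on an unbounded queue returns immediately, a slow or unavailable downstream consumer delays only the publisher thread, not the session proxy.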

[0084] Audit event message consumption and persistence: Both audit SCS units 102a and 102b subscribe to audit topic messages, such as the audit.logs.ssh topic message. The federated communication bus, based on its load balancing strategy (e.g., QueueGroup mode), delivers audit event messages published by proxy SCS unit 101a to one of the consumers, such as audit SCS unit 102a. Upon receiving the audit topic message, audit SCS unit 102a parses the content and writes it to the backend long-term storage system (long-term audit database) (e.g., ClickHouse).

[0085] Result: The operations performed by maintenance personnel A were recorded securely and reliably.

[0086] Fault-tolerant scenario: Suppose that during a session, both audit SCS unit 102a and audit SCS unit 102b fail simultaneously.

[0087] System Behavior: Agent SCS unit 101a continues to operate normally, and sessions are unaffected. Session data copies are continuously written to the local first buffer, and the unit keeps publishing audit event messages to the federated communication bus. Because the federated communication bus is configured as a persistent cluster, these unconsumed audit event messages are stored durably by the federated communication bus.

[0088] Recovery: Once the audit SCS unit 102a or audit SCS unit 102b returns to normal, they will reconnect to the federated communication bus and continue consuming the backlog of audit event messages from where they left off.
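The fault-tolerant scenario can be modelled with a persisted topic that tracks how far a consumer group has acknowledged. The class below is an in-memory stand-in with hypothetical names; a real bus would keep both the messages and the offsets on disk.

```python
from typing import Dict, List


class PersistentTopic:
    """Stand-in for a persisted topic with per-consumer-group offsets."""

    def __init__(self) -> None:
        self.messages: List[bytes] = []
        self.acked_offset: Dict[str, int] = {}

    def publish(self, message: bytes) -> None:
        self.messages.append(message)

    def pending(self, group: str) -> List[bytes]:
        """Messages the group has not yet acknowledged, oldest first."""
        return self.messages[self.acked_offset.get(group, 0):]

    def ack(self, group: str, count: int) -> None:
        self.acked_offset[group] = self.acked_offset.get(group, 0) + count


topic = PersistentTopic()
topic.publish(b"e1")
topic.publish(b"e2")
topic.ack("audit", len(topic.pending("audit")))  # consumers healthy so far

# Both audit units fail; the proxy unit keeps publishing.
for event in (b"e3", b"e4", b"e5"):
    topic.publish(event)

# After recovery the group resumes from its last acknowledged position.
backlog = topic.pending("audit")
```

Nothing published during the outage is lost, and nothing already acknowledged is replayed.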

[0089] In one related technology, the bastion host is deployed in an active-standby mode. Audit event messages from all nodes are transmitted in real time to a dedicated audit server via the network. When the audit server fails and becomes unreachable, the bastion host master node will block session operations due to the failure to transmit audit event messages, resulting in connection interruption. Audit event messages that were not successfully transmitted will also be lost because they are not persistently stored. After the audit server recovers, it is impossible to retrieve the audit event messages lost during the failure period, causing an audit gap. However, in this embodiment, the core function of the proxy SCS unit is not affected by the failure of the audit SCS unit. Audit event messages are first written to a local first buffer and then asynchronously published. The federated communication bus persists the message queue to ensure that audit event messages are reliably stored. The audit SCS unit continues to consume the backlog of audit event messages from where it was last interrupted, achieving zero loss of audit event messages and zero service interruption.

[0090] One possible static SSH key update data flow process is as follows: The administrator initiates a key update request by calling the Application Programming Interface (API) -> the request is routed to the Raft Leader (located within a certain agent SCS unit) -> the Raft consensus sub-unit synchronizes the encrypted key data -> the Raft Leader publishes a configuration update event message to the federated communication bus -> all agent SCS units subscribe to the configuration update event message -> the new data is read from the local Raft consensus sub-unit and the local authentication credential cache is updated.

[0091] Take the following business scenarios as examples: The security administrator needs to add a new SSH private key for a new server cluster and ensure that the private key can be used by all proxy SCS units in future sessions, and the entire update process should not interrupt existing services.

[0092] Operating steps and working principle: Initiating an update request: The security administrator submits a new SSH private key via an administrative API. This API request is routed to the Raft Leader node elected by the current cluster consensus sub-unit (assuming the Leader role is currently on proxy SCS unit 101b).

[0093] Distributed Consensus: The Raft consensus subunit on proxy SCS unit 101b receives a write request. First, it encrypts the credential data using envelope encryption, then writes the "proposed change" to its Raft log. Next, this log entry is replicated in parallel to the Raft consensus subunits of all other Follower nodes (proxy SCS unit 101a, proxy SCS unit 101c, audit SCS unit 102a, and audit SCS unit 102b).

[0094] Technical principle (strong consistency): The Raft protocol guarantees that the agent SCS unit 101b will only "commit" the change and apply it to its own state machine (i.e., update the encrypted authentication credential data) after a majority (quorum) of nodes (at least 3 of the 5 nodes in this case) have confirmed that the log entry has been successfully written. This avoids the data inconsistency problem caused by "split-brain".

[0095] Publish Change Notification: After a change is successfully committed, the Raft consensus subunit of the proxy SCS unit 101b triggers a callback. This callback function is responsible for constructing a "Configuration Updated" event message and publishing it to the configuration update notification topic on the federated communication bus, such as the config.update.notify topic.

[0096] Message content: can be very lightweight, such as {config_type: "static_credential", version: "v2.1.0"}, serving only as a notification and not containing sensitive data.

[0097] Each node pulls updates: All proxy SCS units (including proxy SCS unit 101a, proxy SCS unit 101b, and proxy SCS unit 101c) subscribe to the config.update.notify topic. Upon receiving this notification, they do not directly use the data in the message, but instead trigger an internal action: reading the latest version of the encrypted authentication credential data from their respective local Raft consensus subunit's state machine.

[0098] Update local cache: After each agent SCS unit reads new authentication credential data, it uses it to update its own runtime authentication credential cache in memory.

[0099] Result: At this point, all agent SCS units have the latest copy of the SSH private key. All new sessions initiated from this point onward will be able to use this new key.

[0100] Beneficial effects: The entire process is dynamic, asynchronous, and secure. Authentication credentials themselves are not transmitted through the message bus; only update signals are delivered. Each SCS unit pulls data from a local, consensus-guaranteed copy, achieving high security and fault tolerance. This mechanism makes authentication credential management both centralized (through a unified Raft Leader entry point) and highly available in a distributed manner (data is replicated to each consensus subunit).
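The notify-then-pull pattern can be sketched as follows. The `AgentUnit` class and its attribute names are hypothetical, and a plain dictionary stands in for the local state machine maintained by the consensus subunit.

```python
from typing import Dict


class AgentUnit:
    """Agent unit that refreshes its cache from its own replicated state."""

    def __init__(self, state_machine: Dict[str, bytes]) -> None:
        self.state_machine = state_machine  # local, consensus-replicated copy
        self.credential_cache: Dict[str, bytes] = {}

    def on_config_update(self, notification: Dict[str, str]) -> None:
        # The notification is only a signal; the data is read locally.
        key = notification["config_type"]
        self.credential_cache[key] = self.state_machine[key]


replicated = {"static_credential": b"<encrypted-key-material>"}
unit = AgentUnit(replicated)
notice = {"config_type": "static_credential", "version": "v2.1.0"}
unit.on_config_update(notice)
```

The message on the bus carries only a type and a version, so a party that can read the bus traffic learns that a credential changed but not what it changed to.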

[0101] From the above description of the technical solutions provided in the embodiments of this application, the embodiments of this application have the following characteristics: Uniqueness in system architecture: A federated system architecture composed of Shared-Nothing, autonomous SCS units; Uniqueness in communication mechanism: An event-driven communication paradigm based entirely on an asynchronous federated communication bus, used to achieve loose coupling between core functions and auxiliary functions; Uniqueness in data management: A data management approach that combines runtime local data privatization with global configuration data distributed consensus.

[0102] The core idea of this application's embodiments is "federated instead of centralized, asynchronous instead of synchronous, and dynamic instead of static," which solves the problems of related technologies through the following features: Fault isolation is achieved through a self-contained system (SCS) federation: This application's embodiments decompose a monolithic bastion host into multiple autonomous SCS units. Each SCS unit (specifically, the proxy SCS unit handling user sessions) is an independent, shared-nothing entity. Faults in a single SCS unit (such as downtime or software defects) are confined within it and do not propagate to other SCS units through centralized components such as shared databases, thereby minimizing the fault domain and solving the problem of fault domain expansion in traditional active-active models.

[0103] Service decoupling is achieved through asynchronous event communication: This application's embodiments employ a highly available federated communication bus for communication between SCS units. For example, after a proxy SCS unit completes session operation auditing, it only needs to publish the audit event message to the federated communication bus to complete its task. The auditing SCS unit, acting as a subscriber, asynchronously consumes these audit event messages. This design completely decouples the core proxy functionality from the non-core auditing functionality in time. Even if the auditing system is unavailable for an extended period, it does not affect the establishment of new sessions or ongoing sessions, thus solving the problem in related technologies where auditing system failures may block core business operations.

[0104] Achieving dynamic high availability through hybrid consensus and credential replication: This application's embodiment forms a hybrid consensus mechanism by embedding the Gossip protocol and Raft consensus algorithm within the SCS unit. Gossip is used for efficient service discovery and health checks, while Raft is used for strong consistency synchronization of a few critical global configurations (such as static credential master data). Authentication credentials are securely replicated to the local cache of each proxy SCS unit. This avoids real-time dependence on an external credential center, making session establishment faster and more reliable, while the Raft protocol helps prevent the "split-brain" problem.

[0105] As can be seen from the above, the embodiments of this application can have high availability, fault isolation capability and data reliability without the need for centralized dependence.

[0106] Corresponding to the above system embodiments, this application embodiment also provides an operation and maintenance auditing method, applied to the target agent self-contained system (SCS) unit of the operation and maintenance auditing system. The operation and maintenance auditing system includes a federated communication bus, and multiple agent SCS units and at least one audit SCS unit connected to the federated communication bus. The at least one audit SCS unit is a subscriber to audit topic messages, and the target agent SCS unit is one of the multiple agent SCS units. The operation and maintenance auditing method described below and the operation and maintenance auditing system described above can be read with reference to each other.

[0107] As shown in Figure 3, the method may include the following steps: S310: Receive an operation and maintenance session connection request for the target asset from the terminal device; S320: Based on the operation and maintenance session connection request, establish a session proxy channel between the terminal device and the target asset; S330: Obtain a copy of the session data flowing through the session proxy channel; S340: Write the copy of the session data to the local first buffer; S350: Generate an audit event message based on the session data copy in the local first buffer; S360: Publish the audit event message to the federated communication bus according to the audit topic, so that the federated communication bus, after storing the audit event message in persistent storage, delivers it to the target audit SCS unit, and the target audit SCS unit processes the audit event message, obtains the audit log, and stores the audit log in the long-term storage system. The target audit SCS unit is one of the at least one audit SCS unit.

[0108] Using the method provided in this application, multiple agent SCS units and at least one audit SCS unit are connected via a federated communication bus. The target agent SCS unit processes the operation and maintenance session connection request, obtains a copy of the session data, writes it to a local first buffer, and then publishes an audit event message generated based on the session data copy in the local first buffer to the federated communication bus. The federated communication bus stores the audit event message in persistent storage and then delivers it to the target audit SCS unit. The target audit SCS unit processes the audit event message, obtains audit logs, and stores the audit logs in a long-term storage system. Each agent SCS unit and each audit SCS unit operates independently, without relying on centralized components such as a shared database. Faults in each agent SCS unit and each audit SCS unit are confined within their own scope and will not be propagated to other SCS units through centralized components such as a shared database, thus improving system stability.

[0109] In some embodiments of this application, the method further includes the following steps: Upon receiving an operation and maintenance session connection request, a session proxy thread is allocated for the operation and maintenance session connection request. The session proxy thread is used to perform the following steps based on the operation and maintenance session connection request: establishing a session proxy channel between the terminal device and the target asset, obtaining a copy of the session data flowing through the session proxy channel, and writing the copy of the session data into the local first buffer. The local first buffer is monitored using a publishing thread independent of the session proxy thread. If the publishing thread detects a new copy of session data in the local first buffer, the publishing thread performs the following steps: generating an audit event message based on the copy of session data in the local first buffer, and publishing the audit event message to the federated communication bus according to the audit topic.

[0110] In some embodiments of this application, step S350 generates an audit event message based on a copy of session data in a local first buffer, including: The newly written session data copy in the local first buffer is serialized using a preset data format to generate an audit event message.
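Serialization into a preset data format can be sketched as below. The example mentioned elsewhere in the description uses Protobuf; JSON with Base64-encoded payloads is swapped in here purely so the sketch runs with the standard library, and the field names and the `topic` field are assumptions.

```python
import base64
import json
from dataclasses import dataclass


@dataclass
class SessionChunk:
    session_id: str
    timestamp: float
    data_chunk: bytes


def to_audit_event(chunk: SessionChunk, topic: str = "audit.logs.ssh") -> bytes:
    """Serialize one newly written session data copy into an event payload."""
    body = {
        "topic": topic,
        "session_id": chunk.session_id,
        "timestamp": chunk.timestamp,
        # Raw terminal bytes are Base64-encoded so they survive JSON intact.
        "data_chunk": base64.b64encode(chunk.data_chunk).decode("ascii"),
    }
    return json.dumps(body, sort_keys=True).encode("utf-8")


def from_audit_event(payload: bytes) -> SessionChunk:
    """Inverse of to_audit_event, as the audit SCS unit would apply it."""
    body = json.loads(payload)
    return SessionChunk(
        session_id=body["session_id"],
        timestamp=body["timestamp"],
        data_chunk=base64.b64decode(body["data_chunk"]),
    )


original = SessionChunk("sess-1", 1700000000.5, b"\x1b[0mls -l\r\n")
payload = to_audit_event(original)
restored = from_audit_event(payload)
```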

[0111] In some embodiments of this application, the method further includes: Before establishing a session proxy channel between the terminal device and the target asset, obtain the target authentication credentials from the local private database; Based on the target authentication credentials, the identity of the user who initiates the operation and maintenance session connection request through the terminal device is authenticated. If identity authentication is successful, proceed with the steps to establish a session proxy channel between the terminal device and the target asset.

[0112] Regarding the methods in the above embodiments, the specific manner in which each step is performed has been described in detail in the embodiments of the system, and will not be elaborated here.

[0113] Corresponding to the above method embodiments, this application embodiment also provides an operation and maintenance audit device, applied to the target agent self-contained system (SCS) unit of the operation and maintenance audit system. The operation and maintenance audit system includes a federated communication bus, and multiple agent SCS units and at least one audit SCS unit connected to the federated communication bus. The at least one audit SCS unit is a subscriber to audit topic messages, and the target agent SCS unit is one of the multiple agent SCS units. The operation and maintenance audit device described below and the operation and maintenance audit method described above can be read with reference to each other.

[0114] As shown in Figure 4, the operation and maintenance audit device 400 may include the following modules: The receiving module 410 is used to receive an operation and maintenance session connection request for the target asset from the terminal device; The establishing module 420 is used to establish a session proxy channel between the terminal device and the target asset based on the operation and maintenance session connection request; The acquisition module 430 is used to acquire a copy of the session data flowing through the session proxy channel; The write module 440 is used to write the copy of the session data into the local first buffer; The generation module 450 is used to generate an audit event message based on the session data copy in the local first buffer; The publishing module 460 is used to publish the audit event message to the federated communication bus according to the audit topic, so that the federated communication bus, after storing the audit event message in persistent storage, delivers it to the target audit SCS unit, and the target audit SCS unit processes the audit event message, obtains the audit log, and stores the audit log in the long-term storage system. The target audit SCS unit is one of the at least one audit SCS unit.

[0115] The apparatus provided in this application handles an operation and maintenance session connection request, obtains a copy of the session data, writes the copy to a local first buffer, and then publishes an audit event message generated from the session data copy in the local first buffer to the federated communication bus. The federated communication bus stores the audit event message in persistent storage and then delivers it to the target audit SCS unit. The target audit SCS unit processes the audit event message, obtains an audit log, and stores the audit log in a long-term storage system. Each proxy SCS unit and each audit SCS unit operates independently, without relying on centralized components such as a shared database. A fault in any proxy SCS unit or audit SCS unit is confined to that unit and is not propagated to other SCS units through centralized components such as a shared database, which improves system stability.
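The flow summarized above can be illustrated with a minimal Python sketch. It is not the implementation of this application: the class names FederatedBus, ProxyUnit and AuditUnit, the in-memory lists standing in for persistent and long-term storage, the JSON encoding, and the rule that the first subscriber is the target audit SCS unit are all assumptions made for illustration.

```python
import json


class FederatedBus:
    """In-memory stand-in for the federated communication bus (hypothetical)."""

    def __init__(self):
        self.persistent_storage = []   # stands in for durable storage
        self.subscribers = {}          # audit topic -> list of handlers

    def subscribe(self, topic, handler):
        self.subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, message):
        # Persist first, then deliver, matching the order in the text.
        self.persistent_storage.append((topic, message))
        handlers = self.subscribers.get(topic, [])
        if handlers:
            handlers[0](message)  # assumption: first subscriber is the target


class AuditUnit:
    """Processes audit event messages into audit logs and stores them."""

    def __init__(self, long_term_storage):
        self.long_term_storage = long_term_storage

    def on_audit_event(self, message):
        audit_log = json.loads(message)
        self.long_term_storage.append(audit_log)


class ProxyUnit:
    """Buffers session data copies locally and publishes them as events."""

    def __init__(self, bus, audit_topic="audit"):
        self.bus = bus
        self.audit_topic = audit_topic
        self.first_buffer = []         # local first buffer

    def capture(self, session_id, data):
        self.first_buffer.append({"session_id": session_id, "data": data})

    def publish_pending(self):
        while self.first_buffer:
            copy = self.first_buffer.pop(0)
            self.bus.publish(self.audit_topic, json.dumps(copy))
```

In this sketch the proxy unit and the audit unit share no state other than the bus, which is the property the paragraph above relies on for fault isolation.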

[0116] In some embodiments of this application, the device further includes an allocation module and a monitoring module. The allocation module is used to allocate a session proxy thread for the operation and maintenance session connection request after the request is received. The session proxy thread is used to perform the steps of establishing a session proxy channel between the terminal device and the target asset based on the operation and maintenance session connection request, obtaining a copy of the session data flowing through the session proxy channel, and writing the copy of the session data into the local first buffer. The monitoring module is used to monitor the local first buffer using a publishing thread that is independent of the session proxy thread. When the publishing thread detects a new copy of session data in the local first buffer, the publishing thread performs the steps of generating an audit event message based on the copy of session data in the local first buffer and publishing the audit event message to the federated communication bus according to the audit topic.
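The separation between the session proxy thread and the publishing thread can be sketched in Python as follows. This is an illustrative assumption only: a queue.Queue stands in for the local first buffer, a plain list stands in for the federated communication bus, and the STOP sentinel and all function names are hypothetical.

```python
import queue
import threading

STOP = object()  # sentinel that tells the publishing thread to exit


def session_proxy_worker(first_buffer, session_id, chunks):
    """Session proxy thread: forwarding is omitted; only the copy is buffered."""
    for chunk in chunks:
        first_buffer.put((session_id, chunk))


def publishing_worker(first_buffer, bus_outbox, audit_topic):
    """Publishing thread: blocks on the buffer and publishes each new copy."""
    while True:
        item = first_buffer.get()
        if item is STOP:
            break
        session_id, chunk = item
        bus_outbox.append(
            {"topic": audit_topic, "session_id": session_id, "data": chunk}
        )


def run_demo():
    first_buffer = queue.Queue()   # local first buffer (thread-safe FIFO)
    bus_outbox = []                # stands in for the bus
    publisher = threading.Thread(
        target=publishing_worker, args=(first_buffer, bus_outbox, "audit")
    )
    proxy = threading.Thread(
        target=session_proxy_worker,
        args=(first_buffer, "s1", ["whoami", "uptime"]),
    )
    publisher.start()
    proxy.start()
    proxy.join()
    first_buffer.put(STOP)         # no more session data will arrive
    publisher.join()
    return bus_outbox
```

Because the session proxy thread only writes to the buffer, a slow or unavailable bus delays the publishing thread without blocking the session traffic itself.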

[0117] In some embodiments of this application, the generation module 450 is specifically used to serialize the newly written session data copy in the local first buffer using a preset data format to generate an audit event message.
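As a hedged illustration of serialization with a preset data format, the sketch below uses JSON with a Base64-encoded payload. The application does not name a format; JSON, the field names session_id, seq and payload_b64, and both function names are assumptions.

```python
import base64
import json


def serialize_audit_event(session_id, seq, payload: bytes) -> bytes:
    """Serialize a session data copy into an audit event message."""
    event = {
        "session_id": session_id,
        "seq": seq,
        # Raw session bytes are Base64-encoded so they survive JSON.
        "payload_b64": base64.b64encode(payload).decode("ascii"),
    }
    return json.dumps(event, sort_keys=True).encode("utf-8")


def deserialize_audit_event(message: bytes) -> dict:
    """Inverse operation, as an audit SCS unit might apply on receipt."""
    event = json.loads(message.decode("utf-8"))
    event["payload"] = base64.b64decode(event.pop("payload_b64"))
    return event
```

A fixed, self-describing format lets proxy and audit units evolve independently as long as both sides agree on the fields.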

[0118] In some embodiments of this application, the device further includes an authentication module, used to: before the session proxy channel between the terminal device and the target asset is established, obtain target authentication credentials from a local private database; authenticate, based on the target authentication credentials, the identity of the user who initiates the operation and maintenance session connection request through the terminal device; and, if the identity authentication succeeds, trigger the establishment module 420 to execute the step of establishing a session proxy channel between the terminal device and the target asset.
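The authentication step can be sketched as follows, assuming the local private database is an SQLite database holding salted PBKDF2 password hashes. The application does not specify the credential type or storage engine, so the table layout, the hashing scheme and all function names here are hypothetical.

```python
import hashlib
import hmac
import os
import sqlite3


def open_private_db():
    """Create an in-memory database standing in for the local private database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE credentials (username TEXT PRIMARY KEY, salt BLOB, pw_hash BLOB)"
    )
    return db


def _hash(password, salt):
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def add_user(db, username, password):
    salt = os.urandom(16)
    db.execute(
        "INSERT INTO credentials VALUES (?, ?, ?)",
        (username, salt, _hash(password, salt)),
    )


def authenticate(db, username, password):
    """Look up the stored credential locally and compare in constant time."""
    row = db.execute(
        "SELECT salt, pw_hash FROM credentials WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return False
    salt, stored = row
    return hmac.compare_digest(stored, _hash(password, salt))


def handle_connection_request(db, username, password, establish_channel):
    """Establish the session proxy channel only after authentication succeeds."""
    if not authenticate(db, username, password):
        return None
    return establish_channel()
```

Keeping the credentials in a database private to the proxy unit means authentication does not depend on any shared component.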

[0119] Regarding the apparatus in the above embodiments, the specific manner in which each module performs its operation has been described in detail in the embodiments related to the method, and will not be elaborated upon here.

[0120] This application also provides an electronic device which, as shown in Figure 5, includes a processor 501, a communication interface 502, a memory 503, and a communication bus 504, wherein the processor 501, the communication interface 502, and the memory 503 communicate with each other through the communication bus 504. The memory 503 is used to store a computer program. When the processor 501 executes the program stored in the memory 503, it performs the following steps: receiving an operation and maintenance session connection request for the target asset from the terminal device; establishing a session proxy channel between the terminal device and the target asset based on the operation and maintenance session connection request; obtaining a copy of the session data flowing through the session proxy channel; writing the copy of the session data to the local first buffer; generating an audit event message based on the session data copy in the local first buffer; and publishing the audit event message to the federated communication bus according to the audit topic, so that the federated communication bus, after storing the audit event message in persistent storage, delivers it to the target audit SCS unit, and the target audit SCS unit processes the audit event message, obtains an audit log, and stores the audit log in the long-term storage system. The target audit SCS unit is one of the at least one audit SCS unit.

[0121] The communication bus 504 mentioned in the aforementioned electronic device can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. This communication bus 504 can be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, it is represented by only one thick line in the figure, but this does not indicate that there is only one bus or one type of bus.

[0122] The communication interface 502 is used for communication between the aforementioned electronic device and other devices.

[0123] The memory 503 may include random access memory (RAM) or non-volatile memory, such as at least one disk storage device. Optionally, the memory 503 may also be at least one storage device located remotely from the aforementioned processor.

[0124] The processor 501 mentioned above can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.

[0125] In another embodiment provided in this application, a computer-readable storage medium is also provided, which stores instructions that, when executed on a computer, cause the computer to perform the steps of any of the operation and maintenance auditing methods described in the above embodiments.

[0126] In another embodiment provided in this application, a computer program product containing instructions is also provided, which, when run on a computer, causes the computer to perform the steps of any of the operation and maintenance auditing methods in the above embodiments.

[0127] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, it can be implemented, in whole or in part, as a computer program product. A computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the flow or function according to the embodiments of this application is generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means. The computer-readable storage medium can be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., a solid-state disk (SSD)).

[0128] It should be noted that, in this document, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes the element.

[0129] The various embodiments in this specification are described in a related manner. Similar or identical parts between embodiments can be referred to mutually. Each embodiment focuses on describing the differences from other embodiments. In particular, the system embodiments are basically similar to the method embodiments, so the description is relatively simple; relevant parts can be referred to the descriptions of the method embodiments.

[0130] The above are merely preferred embodiments of this application and are not intended to limit the scope of protection of this application. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application are included within the scope of protection of this application.

Claims

1. An operation and maintenance auditing system, characterized in that it comprises a federated communication bus, and multiple proxy self-contained system (SCS) units and at least one audit SCS unit connected to the federated communication bus, wherein the at least one audit SCS unit is a subscriber to audit topic messages; wherein: a target proxy SCS unit is configured to receive an operation and maintenance session connection request for a target asset from a terminal device; establish a session proxy channel between the terminal device and the target asset based on the operation and maintenance session connection request; obtain a copy of the session data flowing through the session proxy channel; write the copy of the session data into a local first buffer; generate an audit event message based on the copy of the session data in the local first buffer; and publish the audit event message to the federated communication bus according to the audit topic, the target proxy SCS unit being one of the multiple proxy SCS units; the federated communication bus is configured to receive the audit event message from the target proxy SCS unit and, after storing the audit event message in persistent storage, deliver the audit event message to a target audit SCS unit, the target audit SCS unit being one of the at least one audit SCS unit; and the target audit SCS unit is configured to obtain the audit event message from the federated communication bus, process the audit event message to obtain an audit log, and store the audit log in a long-term storage system.

2. The system according to claim 1, characterized in that the target proxy SCS unit is further configured to: upon receiving the operation and maintenance session connection request, allocate a session proxy thread for the operation and maintenance session connection request, the session proxy thread being used to execute the steps of establishing a session proxy channel between the terminal device and the target asset based on the operation and maintenance session connection request, obtaining a copy of the session data flowing through the session proxy channel, and writing the copy of the session data into the local first buffer; and monitor the local first buffer using a publishing thread independent of the session proxy thread, so that if the publishing thread detects a new copy of session data in the local first buffer, the publishing thread performs the steps of generating an audit event message based on the copy of session data in the local first buffer and publishing the audit event message to the federated communication bus according to the audit topic.

3. The system according to claim 1, characterized in that the target proxy SCS unit is specifically used to serialize the newly written session data copy in the local first buffer using a preset data format to generate the audit event message; and the target audit SCS unit is specifically used to deserialize the audit event message to obtain the audit log.

4. The system according to claim 1, characterized in that, The federated communication bus includes multiple bus nodes, and persistent storage is deployed on each bus node; The federated communication bus is specifically used to store the audit event message in a persistent storage on a preset first number of bus nodes among the plurality of bus nodes.
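By way of illustration only, storing a message on a preset first number of bus nodes might be sketched in Python as below. The BusNode class, the in-order node choice and the handling of unavailable nodes are assumptions; the claim does not prescribe how the nodes are chosen.

```python
class BusNode:
    """A bus node with its own persistent storage (modeled as a list)."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.persistent_store = []

    def persist(self, message):
        if not self.available:
            return False
        self.persistent_store.append(message)
        return True


def store_on_nodes(message, nodes, first_number):
    """Persist the message on `first_number` nodes, skipping unavailable ones.

    Returns the names of the nodes that stored the message, or None if fewer
    than `first_number` nodes accepted it. Copies already written are left in
    place in that case; cleanup is outside the scope of this sketch.
    """
    stored_on = []
    for node in nodes:
        if len(stored_on) == first_number:
            break
        if node.persist(message):
            stored_on.append(node.name)
    return stored_on if len(stored_on) == first_number else None
```

Writing to more than one node before delivery is what allows the message to survive the loss of a single bus node.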

5. The system according to claim 1, characterized in that, The federated communication bus is also used to select a first audit SCS unit from the at least one audit SCS unit based on a preset selection strategy, and determine the first audit SCS unit as the target audit SCS unit.

6. The system according to claim 5, characterized in that the federated communication bus is further configured to: after delivering the audit event message to the target audit SCS unit, if the connection with the target audit SCS unit is disconnected before the target audit SCS unit returns an acknowledgment message, or if no acknowledgment message is received from the target audit SCS unit within a preset first time period, select a second audit SCS unit from the at least one audit SCS unit, determine the second audit SCS unit as the target audit SCS unit, and repeat the step of delivering the audit event message to the target audit SCS unit.
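By way of illustration only, the selection and redelivery behavior of claims 5 and 6 might be sketched in Python as below. The stub class, the in-order selection strategy and the use of ConnectionError and TimeoutError to model a disconnect and a missing acknowledgment are assumptions.

```python
class AuditUnitStub:
    """Test double for an audit SCS unit; `behavior` controls its response."""

    def __init__(self, name, behavior="ack"):
        self.name = name
        self.behavior = behavior   # "ack", "timeout" or "disconnect"
        self.received = []

    def deliver(self, message, timeout_s):
        if self.behavior == "disconnect":
            raise ConnectionError(f"{self.name} disconnected before acknowledging")
        if self.behavior == "timeout":
            raise TimeoutError(f"{self.name} sent no acknowledgment in {timeout_s}s")
        self.received.append(message)
        return True                # acknowledgment


def deliver_with_failover(message, audit_units, timeout_s=1.0):
    """Try audit units in order until one acknowledges the message.

    Returns the name of the acknowledging unit, or None if every unit failed.
    The message stays in persistent storage on the bus, so it can be retried.
    """
    for unit in audit_units:       # assumed selection strategy: list order
        try:
            if unit.deliver(message, timeout_s):
                return unit.name
        except (ConnectionError, TimeoutError):
            continue               # pick the next unit and redeliver
    return None
```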

7. The system according to claim 1, characterized in that, The target audit SCS unit is specifically used to store the audit logs in a local second buffer after obtaining the audit logs; and to write the audit logs in the local second buffer in batches into the long-term storage system.
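By way of illustration only, buffering audit logs in a local second buffer and writing them in batches might look like the following Python sketch. The BatchWriter class, the size-based flush trigger and the list used as the long-term storage system are assumptions; a time-based trigger would work equally well.

```python
class BatchWriter:
    """Collects audit logs in a local second buffer and writes them in batches."""

    def __init__(self, long_term_storage, batch_size=3):
        self.long_term_storage = long_term_storage  # list of written batches
        self.batch_size = batch_size
        self.second_buffer = []

    def add(self, audit_log):
        self.second_buffer.append(audit_log)
        if len(self.second_buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Write everything currently buffered as one batch."""
        if self.second_buffer:
            self.long_term_storage.append(list(self.second_buffer))
            self.second_buffer.clear()
```

Batching reduces the number of write operations against the long-term storage system at the cost of a short delay before a log becomes durable there.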

8. The system according to claim 1, characterized in that, The target audit SCS unit is also used to determine and record the context information associated with the audit log after obtaining the audit log.

9. The system according to claim 1, characterized in that the target proxy SCS unit is further configured to: before establishing the session proxy channel between the terminal device and the target asset, obtain target authentication credentials from a local private database; authenticate, based on the target authentication credentials, the identity of the user who initiates the operation and maintenance session connection request through the terminal device; and, if the identity authentication succeeds, execute the step of establishing the session proxy channel between the terminal device and the target asset.

10. The system according to any one of claims 1 to 9, characterized in that each of the proxy SCS units and each of the audit SCS units is equipped with a consensus subunit; any consensus subunit is used to periodically select one or more consensus subunits other than itself, and to exchange with each selected consensus subunit the identification information and health status information of the proxy SCS units or audit SCS units that each of them has perceived, so that the operation and maintenance session connection request is routed to a healthy proxy SCS unit, and / or the audit event message is delivered to a healthy audit SCS unit.
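By way of illustration only, the periodic exchange of identification and health status information might be sketched as a simple gossip round in Python. The version counter used to decide which record is newer, the fanout parameter and all names are assumptions that the claim does not specify.

```python
import random


class ConsensusSubunit:
    """Holds this unit's view of which SCS units exist and whether they are healthy."""

    def __init__(self, unit_id, healthy=True):
        self.unit_id = unit_id
        # view: unit id -> (version, healthy); version is a hypothetical counter
        self.view = {unit_id: (1, healthy)}

    def exchange(self, peer):
        """Merge both views, keeping the higher-versioned record for each unit."""
        merged = dict(self.view)
        for uid, entry in peer.view.items():
            if uid not in merged or entry[0] > merged[uid][0]:
                merged[uid] = entry
        self.view = dict(merged)
        peer.view = dict(merged)

    def healthy_units(self):
        return sorted(uid for uid, (_, ok) in self.view.items() if ok)


def gossip_round(subunits, rng, fanout=1):
    """Each subunit picks `fanout` other subunits at random and exchanges views."""
    for subunit in subunits:
        others = [p for p in subunits if p is not subunit]
        for peer in rng.sample(others, fanout):
            subunit.exchange(peer)
```

A router could then consult healthy_units() on any subunit to avoid sending requests or audit event messages to a unit reported as unhealthy.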

11. The system according to claim 10, characterized in that a first consensus subunit is used to: upon receiving an update request for target data, write the updated target data as a log entry to a local log; send the log entry in parallel to a plurality of second consensus subunits; and, after receiving success response messages from a preset second number of second consensus subunits among the plurality of second consensus subunits, mark the log entry as committed, so that each consensus subunit updates its local state machine based on the log entry; wherein the first consensus subunit is a leader elected by the consensus subunits deployed in each proxy SCS unit and each audit SCS unit, and the plurality of second consensus subunits include the consensus subunits other than the first consensus subunit.
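By way of illustration only, the leader-side commit rule might be sketched in Python as below. Leader election is omitted, the thread pool is merely one way to send entries in parallel, and the Follower and Leader classes and the required_acks parameter (standing in for the preset second number) are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class Follower:
    """A second consensus subunit that appends entries when reachable."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.log = []

    def append_entry(self, entry):
        if not self.reachable:
            return False
        self.log.append(entry)
        return True                # success response


class Leader:
    """The first consensus subunit (elected leader)."""

    def __init__(self, followers, required_acks):
        self.followers = followers
        self.required_acks = required_acks
        self.log = []
        self.state = {}            # local state machine

    def update(self, key, value):
        entry = {"key": key, "value": value, "committed": False}
        self.log.append(entry)     # write to the local log first
        with ThreadPoolExecutor(max_workers=len(self.followers)) as pool:
            futures = [
                pool.submit(f.append_entry, (key, value)) for f in self.followers
            ]
            # This sketch waits for all replies; a real system could stop
            # as soon as the required number of successes has arrived.
            acks = sum(1 for fut in as_completed(futures) if fut.result())
        if acks >= self.required_acks:
            entry["committed"] = True
            self.state[key] = value  # apply the committed entry
        return entry["committed"]
```

In a full system the followers would also apply the entry to their own state machines once they learn it is committed; that notification step is left out here.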

12. An operation and maintenance auditing method, characterized in that it is applied to a target proxy self-contained system (SCS) unit of an operation and maintenance auditing system, wherein the operation and maintenance auditing system includes a federated communication bus, and multiple proxy SCS units and at least one audit SCS unit connected to the federated communication bus, the at least one audit SCS unit is a subscriber to audit topic messages, and the target proxy SCS unit is one of the multiple proxy SCS units; the method includes: receiving an operation and maintenance session connection request for a target asset from a terminal device; establishing a session proxy channel between the terminal device and the target asset based on the operation and maintenance session connection request; obtaining a copy of the session data flowing through the session proxy channel; writing the session data copy into a local first buffer; generating an audit event message based on the session data copy in the local first buffer; and publishing the audit event message to the federated communication bus according to the audit topic, so that the federated communication bus, after storing the audit event message in persistent storage, delivers the audit event message to a target audit SCS unit, and the target audit SCS unit processes the audit event message, obtains an audit log, and stores the audit log in a long-term storage system, the target audit SCS unit being one of the at least one audit SCS unit.

13. An operation and maintenance auditing device, characterized in that it is applied to a target proxy self-contained system (SCS) unit of an operation and maintenance auditing system, wherein the operation and maintenance auditing system includes a federated communication bus, and multiple proxy SCS units and at least one audit SCS unit connected to the federated communication bus, the at least one audit SCS unit is a subscriber to audit topic messages, and the target proxy SCS unit is one of the multiple proxy SCS units; the device includes: a receiving module, used to receive an operation and maintenance session connection request for a target asset from a terminal device; an establishment module, used to establish a session proxy channel between the terminal device and the target asset based on the operation and maintenance session connection request; an acquisition module, used to acquire a copy of the session data flowing through the session proxy channel; a write module, used to write the session data copy into a local first buffer; a generation module, used to generate an audit event message based on the session data copy in the local first buffer; and a publishing module, used to publish the audit event message to the federated communication bus according to the audit topic, so that the federated communication bus, after storing the audit event message in persistent storage, delivers the audit event message to a target audit SCS unit, and the target audit SCS unit processes the audit event message, obtains an audit log, and stores the audit log in a long-term storage system, the target audit SCS unit being one of the at least one audit SCS unit.

14. An electronic device, characterized in that it includes a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus; the memory is used to store a computer program; and the processor, when executing the program stored in the memory, implements the steps of the operation and maintenance auditing method as described in claim 12.

15. A computer-readable storage medium having a computer program stored thereon, characterized in that, When the program is executed by the processor, it implements the steps of the operation and maintenance auditing method as described in claim 12.

16. A computer program product, characterized in that, The computer program product includes computer instructions stored in a computer-readable storage medium and adapted to be read and executed by a processor to cause an electronic device having the processor to perform the steps of the operation and maintenance auditing method as described in claim 12.