Method and system for optimizing a dedicated financial network for large AI models

By collecting AI business requirements and network status in real time, generating differentiated path strategies using SRv6's SID programming capabilities, dynamically adjusting paths based on computing node load information, embedding a path authentication mechanism, and employing flow detection and proactive early warning mechanisms, this approach addresses the low network resource utilization, wasted computing resources, and insufficient security found in existing financial application scenarios for large AI models. It meets business requirements efficiently and securely and provides visualized operation and maintenance.

CN121924006B · Active · Publication Date: 2026-06-23 · SHANDONG CITY COMMERCIAL BANK COOP ALLIANCE CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHANDONG CITY COMMERCIAL BANK COOP ALLIANCE CO LTD
Filing Date
2026-03-23
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing technologies cannot accurately match the dynamic traffic characteristics of large AI models in financial application scenarios, resulting in low network resource utilization, wasted computing resources, insufficient path security, and a lack of end-to-end visualization capabilities in the operation and maintenance system, making it difficult to meet the needs of efficient and secure business.

Method used

By collecting AI business needs and network status in real time, generating differentiated path strategies using SRv6's SID programming capabilities, dynamically adjusting paths by combining computing node load information, embedding a path authentication mechanism, and adopting flow detection and proactive early warning mechanisms, a closed-loop optimization system is formed.

Benefits of technology

It has achieved deep adaptation of network resources to AI business needs, improved computing power collaboration efficiency and path security, shortened troubleshooting time, ensured business continuity and data security, and improved network transmission efficiency and visualization capabilities.


Abstract

The application discloses a method and system for optimizing a dedicated financial network for large AI models, and relates to the technical field of communication networks. The method collects AI business requirements and network status and uses the SID programming capability of SRv6 to generate exclusive path strategies for AI training, inference, and sensitive data flows; monitors computing power node load in real time and links computing power status with network scheduling to achieve collaborative "network-computing power" optimization; introduces a hop-by-hop path authentication mechanism based on a dynamic authentication segment identifier (Authen-SID) to strengthen the security of the data transmission path; and uses flow detection technology to achieve end-to-end visualized operation and maintenance and closed-loop feedback for service flows. The application solves the prior-art problems of mismatched traffic scheduling, insufficient collaboration between network and computing power, insufficient path security guarantees, and weak operation and maintenance visualization. It achieves deep adaptation of network resources to AI business requirements and improves computing power collaboration efficiency, path security, and the intelligence of operation and maintenance.

Description

Technical Field

[0001] This invention relates to the field of communication network technology, and more specifically, to a method and system for optimizing a financial dedicated network for large AI models.

Background Technology

[0002] With the deep penetration of artificial intelligence (AI) and large-scale modeling technologies in the financial sector, core businesses such as intelligent risk control, quantitative investment research, and intelligent customer service are posing unprecedented challenges to the performance of dedicated financial networks. These applications not only require networks with extremely low latency, ultra-high bandwidth, and high reliability, but also impose stringent requirements on intelligent network scheduling. SRv6 (Segment Routing over IPv6) technology, with its flexible path programming capabilities and simplified network architecture, is considered a key enabling technology for building the next-generation intelligent financial backbone network.

[0003] In existing technologies, solutions combining SRv6 technology with financial scenarios have emerged. Among them, patent CN120915738B (hereinafter referred to as "Prior Document 1") discloses a "Method and System for Traffic Scheduling of Financial Intelligent Backbone Network Based on SRv6," which to some extent achieves differentiated services and dynamic optimization of financial businesses, and has positive significance for improving the intelligence level of the financial backbone network. However, the inventors have found that this solution still has the following insurmountable technical defects when facing the emerging and complex financial application scenario of AI large-scale models:

[0004] First, the traffic scheduling mechanism is mismatched with the dynamic traffic characteristics of large-scale AI models. The business classification granularity (transaction, monitoring, etc.) in Prior Document 1 is designed around traditional financial business and cannot accurately match the dynamic traffic characteristics within large-scale AI models. Specifically, the AI model training phase generates massive, bursty data streams, requiring the network to provide high bandwidth and high throughput, while the AI inference phase is characterized by high-frequency, low-latency exchanges of small data packets, making it extremely sensitive to network latency and jitter. Prior Document 1 cannot finely adapt to these two drastically different AI traffic types within the same network. For example, it cannot plan high-bandwidth, multi-path aggregated transmission channels for training traffic while reserving the shortest path and highest priority for inference traffic. As a result, training traffic may crowd out core business bandwidth, or inference traffic may miss millisecond-level response requirements because of path redundancy.

[0005] Second, there is insufficient coordination between network scheduling and AI computing resources. Distributed training of large AI models requires cross-data center computing power coordination, and the quality of network scheduling directly affects the utilization rate of computing resources. The path calculation in Prior Document 1 is based mainly on network status (link latency, bandwidth utilization) and does not incorporate the real-time load of AI computing nodes (such as GPU utilization and CPU utilization) into the scheduling decision. This can lead to a contradictory situation where "idle computing nodes cannot be effectively utilized, while overloaded nodes still bear a large amount of network traffic," resulting in wasted computing resources and decreased training efficiency. Furthermore, when new AI computing nodes are deployed, cumbersome network configuration is still required, making "plug-and-play" rapid deployment impossible and failing to meet the demands of rapid iteration in AI business.

[0006] Third, the path security guarantee mechanism is lacking. Financial AI large-scale models involve massive amounts of sensitive data, placing extremely high demands on the security of transmission paths. The path scheduling mechanism in Prior Document 1 focuses primarily on performance indicators (latency, bandwidth) and lacks dynamic security guarantees for the SRv6 path itself. When network attacks such as path hijacking or packet tampering occur, existing solutions cannot verify the legitimacy of the path in real time, so AI data streams may be redirected to malicious nodes or illegally tampered with, posing serious security risks. Furthermore, because the security mechanism and path scheduling are independent, rapid scheduling adjustments cannot be triggered when path anomalies are detected, which affects the continuity of AI services and data security.

[0007] Fourth, the operations and maintenance system lacks end-to-end visualization capabilities geared towards AI business needs. The operations and maintenance monitoring in Prior Document 1 focuses primarily on the underlying network infrastructure (link utilization, device status) and lacks end-to-end visualization from an AI business perspective. When network performance deteriorates, operations and maintenance personnel struggle to distinguish quickly between network congestion and computing overload and cannot accurately pinpoint the fault to specific nodes or links. This results in lengthy troubleshooting cycles, severely impacting the continuity of AI business operations.

[0008] In summary, existing technologies, represented by Prior Document 1, have not yet provided an integrated solution that deeply integrates SRv6 technology with the dynamic business characteristics, real-time computing power requirements, and path security guarantees of large-scale financial AI models. Therefore, how to construct a dedicated financial network system that coordinates and optimizes "network-computing power-security" for large-scale AI models, so as to accurately match AI business needs, improve resource utilization, and ensure path security, has become a pressing technical challenge in this field.

Summary of the Invention

[0009] To address the problems existing in the aforementioned background technology, this invention provides a financial dedicated network optimization method and system for AI large-scale models, so as to achieve deep adaptation of network resources and AI business needs, improve computing power collaboration efficiency and path security, and enhance operation and maintenance visualization capabilities.

[0010] To achieve the above objectives, the present invention adopts the following technical solution:

[0011] In a first aspect, the present invention provides a financial-specific network optimization method for large AI models, comprising the following steps:

[0012] Step 1, AI Business Awareness and Network Status Acquisition: Acquire in real time the business requirement parameters and traffic data of the financial AI large model, together with real-time status information of the SRv6 network layer; wherein, the business requirement parameters include AI business type, data sensitivity, and latency requirements; the traffic data includes burst peaks and transmission duration of training traffic, and packet size and interaction frequency of inference traffic; and the network status information includes link bandwidth utilization, latency, jitter, and fault status.

[0013] Step 2: Business-Aware Differentiated SRv6 Path Calculation: Based on the data collected in Step 1, utilize the SID programming capabilities of SRv6 to calculate and generate exclusive SRv6 path strategies for different types of AI businesses, generating corresponding SID lists; specifically including:

[0014] For AI training traffic, which is characterized by burstiness and massive data, multi-path load balancing paths are calculated. By aggregating multiple physical links into a logical high-bandwidth pipeline, a first SID list (training-specific SID list) is generated, and redundant bandwidth is reserved to meet bursty transmission needs.

[0015] For AI inference traffic, which is characterized by high frequency, low latency and small data packets, the shortest path between the source node and the destination node is calculated, a second SID list (inference-specific SID list) is generated, and the highest forwarding priority is set for this traffic to ensure that it is processed first when the network node is queuing and scheduling.

[0016] For sensitive data traffic involving financial user privacy information, a path bound to a security policy is calculated and deeply coupled with security capabilities such as quantum encryption and data desensitization, and a third SID list (security-specific SID list) is generated that only allows nodes that have passed security certification to join the path;
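As a rough illustration of how the three traffic types in Step 2 might be told apart and mapped to a path strategy, the following Python sketch classifies a flow from the parameters collected in Step 1. The field names, the numeric thresholds, and the strategy labels are all assumptions made for this example; the text does not specify them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TrafficClass(Enum):
    TRAINING = "training"
    INFERENCE = "inference"
    SENSITIVE = "sensitive"


@dataclass(frozen=True)
class FlowFeatures:
    burst_peak_gbps: float      # burst peak of the flow
    duration_s: float           # transmission duration
    avg_packet_bytes: int       # typical packet size
    interactions_per_s: float   # interaction frequency
    sensitive: bool             # carries financial user privacy data


# Hypothetical strategy labels, one per SID list described in Step 2.
STRATEGY = {
    TrafficClass.TRAINING: "multipath-load-balance",
    TrafficClass.INFERENCE: "shortest-path-highest-priority",
    TrafficClass.SENSITIVE: "security-bound-path",
}


def classify(flow: FlowFeatures) -> Optional[TrafficClass]:
    """Classify a flow; every threshold below is an illustrative assumption."""
    if flow.sensitive:
        return TrafficClass.SENSITIVE
    if flow.avg_packet_bytes <= 512 and flow.interactions_per_s >= 100:
        return TrafficClass.INFERENCE   # small, frequent packets
    if flow.burst_peak_gbps >= 10 and flow.duration_s >= 60:
        return TrafficClass.TRAINING    # large, long-lived bursts
    return None  # handling of other flows is not described in the text
```

Sensitivity is checked first in this sketch so that a privacy-bearing inference request still lands on the security-bound path; that ordering is a design choice of the example rather than something the text states.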

[0017] Step 3, Dynamic Path Adjustment Based on Computing Power Awareness: Real-time collection of AI computing power node load status, linking computing power load with the SRv6 path strategy generated in Step 2; when a computing power node load exceeds a preset threshold, identifying affected service flows passing through the overloaded node and filtering candidate target computing power nodes with lower loads; for each candidate node, recalculating the optimal network path with the constraint of minimizing end-to-end latency and avoiding congestion along the path; comprehensively evaluating path quality and target node idle computing power, selecting the optimal node and path combination as the new scheduling strategy, and dynamically adjusting AI traffic paths through SRv6 SID programming, scheduling traffic from overloaded nodes to target computing power nodes with lower loads, while updating the corresponding SID list; during traffic switching, a smooth switching mechanism of "build first, then disconnect" is adopted. After the new path is successfully established and connectivity is confirmed to be normal, traffic is gradually migrated from the old path to the new path to ensure zero service interruption;
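The joint choice of target node and path in Step 3 can be pictured with a small Python sketch. It assumes a graph of link latencies from which congested links have already been removed, uses Dijkstra's algorithm for the latency-minimizing path, and combines normalized latency with node load through a weighted sum. The 0.7 overload threshold, the weighting, and all names are assumptions; the text only speaks of a "preset threshold" and a "comprehensive evaluation".

```python
import heapq


def shortest_latency(graph, src, dst):
    """Dijkstra over link latencies (ms); returns (latency, path) or (inf, [])."""
    best = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == dst:
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return dist, path[::-1]
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, latency_ms in graph.get(node, {}).items():
            cand = dist + latency_ms
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                prev[nxt] = node
                heapq.heappush(heap, (cand, nxt))
    return float("inf"), []


def pick_target(graph, src, loads, overload=0.7, latency_weight=0.5):
    """Pick the (node, path) pair with the lowest combined latency/load score."""
    candidates = []
    for node, load in loads.items():
        if load >= overload:
            continue  # overloaded nodes are not candidates
        latency, path = shortest_latency(graph, src, node)
        if path:
            candidates.append((node, load, latency, path))
    if not candidates:
        return None
    worst = max(c[2] for c in candidates) or 1.0  # normalizer for latency

    def score(c):
        return latency_weight * c[2] / worst + (1 - latency_weight) * c[1]

    node, _, _, path = min(candidates, key=score)
    return node, path
```

The returned path would then be translated into an updated SID list; that translation and the gradual traffic migration are outside this sketch.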

[0018] Step 4, SRv6 Path Dynamic Authentication and Packet Forwarding: The SRv6 controller generates a dynamically changing Authen-SID for each path at a specified period, and embeds it as authentication metadata into the SegmentList carrying the path identifier and sequence number. After forming a complete list of authentication segments, it is sent to all SRv6 nodes on the path. The head node encapsulates the packet based on the final SID list determined in Step 3, and uses the SegmentList and Authen-SID to calculate and generate the first SRH authentication code embedded in the packet using a hash algorithm. After receiving the packet, the intermediate and tail nodes recalculate and generate the second authentication code using the same algorithm and parameters, and compare it with the first authentication code carried in the packet in real time. If they match, the packet is forwarded according to the SID list; otherwise, the path is deemed illegal or the packet has been tampered with and is discarded, thereby achieving secure and reliable transmission of AI business data.
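A minimal sketch of the authentication code check in Step 4 follows. The text names only "a hash algorithm"; this example substitutes HMAC-SHA256 keyed with the current Authen-SID, which is an assumption, as are the byte encoding of the segment list and the function names.

```python
import hashlib
import hmac


def srh_auth_code(segment_list, path_id, seq, authen_sid: bytes) -> bytes:
    """Authentication code over path id, sequence number and SID list.

    HMAC-SHA256 is an assumed stand-in for the unspecified hash algorithm.
    """
    msg = b"|".join(
        [str(path_id).encode(), str(seq).encode()]
        + [sid.encode() for sid in segment_list]
    )
    return hmac.new(authen_sid, msg, hashlib.sha256).digest()


def verify(packet_code: bytes, segment_list, path_id, seq, authen_sid: bytes) -> bool:
    """Recompute the code at an intermediate or tail node and compare."""
    expected = srh_auth_code(segment_list, path_id, seq, authen_sid)
    return hmac.compare_digest(packet_code, expected)  # constant-time compare
```

In this sketch the head node would call `srh_auth_code` at encapsulation time and each later node would call `verify`, forwarding on `True` and discarding on `False`. Because the Authen-SID is the key, a code computed in one period stops verifying once the controller rotates to the next Authen-SID.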

[0019] Step 5, End-to-End Visualized Operation and Closed-Loop Feedback: Using flow detection technology, flow measurement identifiers are embedded when AI service data packets enter the network. SRv6 nodes along the route collect and report latency, jitter, and packet loss information for each hop in real time based on this identifier. The operation and maintenance monitoring layer reconstructs the complete transmission path and segmented performance indicators of each AI service flow from the source node to the destination node based on the reported data, and displays the transmission status, path status, and computing node load of AI traffic in real time. When network anomalies such as end-to-end latency exceeding the threshold, path interruption, or security authentication failure are detected, the fault node and cause are quickly located using the SID positioning capability of SRv6, and the fault information is fed back to Steps 2 and 3, triggering path recalculation or adjustment, forming a complete optimization closed loop.
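The path reconstruction and fault location in Step 5 might look roughly like the Python sketch below. The report fields, the latency budget, and the rule of blaming the hop with the most loss and then the highest latency are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class HopReport:
    flow_id: str        # flow measurement identifier
    hop_index: int      # position of the node on the path
    node_sid: str       # SID of the reporting node
    latency_ms: float   # latency measured at this hop
    lost_packets: int   # packets lost at this hop


def rebuild_paths(reports):
    """Group per-hop reports by flow and order them from source to destination."""
    flows = defaultdict(list)
    for report in reports:
        flows[report.flow_id].append(report)
    return {fid: sorted(hops, key=lambda h: h.hop_index) for fid, hops in flows.items()}


def locate_fault(hops, latency_budget_ms):
    """Return the SID of the most suspicious hop, or None if the flow is healthy."""
    total = sum(h.latency_ms for h in hops)
    if total <= latency_budget_ms and not any(h.lost_packets for h in hops):
        return None
    worst = max(hops, key=lambda h: (h.lost_packets, h.latency_ms))
    return worst.node_sid
```

The SID returned by `locate_fault` is what would be fed back to Steps 2 and 3 to trigger path recalculation or adjustment.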

[0020] Before SRv6 packets are forwarded, Step 4 further includes a packet encapsulation optimization step: using G-SRv6 compressed frame header technology, non-core fields in the IPv6 base header and the SRH are removed, and common prefix extraction and aggregation encoding are applied to the SIDs in the segment list to reduce packet header overhead; for AI inference traffic, a simplified SRH structure is further adopted to minimize single-packet encapsulation latency; for AI training traffic, batch SID offset encoding is used to improve the encapsulation and forwarding efficiency of large data blocks.
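The idea of common prefix extraction can be shown with a simplified Python sketch. It is not the G-SRv6 wire format: it only assumes that all SIDs on a path share a fixed-length prefix block (64 bits here), stores that block once, and keeps the per-SID remainder. The function names and the block length are assumptions.

```python
import ipaddress


def compress_sids(sids, block_bits=64):
    """Split SIDs into one shared prefix block and per-SID suffixes."""
    if not sids:
        raise ValueError("empty SID list")
    if block_bits % 8 or not 0 < block_bits < 128:
        raise ValueError("block_bits must be a multiple of 8 between 8 and 120")
    packed = [ipaddress.IPv6Address(s).packed for s in sids]  # 16 bytes each
    cut = block_bits // 8
    prefix = packed[0][:cut]
    if any(p[:cut] != prefix for p in packed):
        raise ValueError("SIDs do not share a common prefix block")
    return prefix, [p[cut:] for p in packed]


def expand_sids(prefix, suffixes):
    """Rebuild the full 128-bit SIDs from the shared prefix and the suffixes."""
    return [str(ipaddress.IPv6Address(prefix + s)) for s in suffixes]
```

With three SIDs and a 64-bit shared block, the uncompressed list occupies 3 × 16 = 48 bytes, while the prefix plus three 8-byte suffixes occupy 8 + 3 × 8 = 32 bytes; the saving grows with the length of the segment list.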

[0021] Step 5 is followed by proactive early warning and pre-tuning steps: Based on historical monitoring data, the operation and maintenance monitoring layer uses a long short-term memory recurrent neural network model to not only predict the future bandwidth utilization of core links in a time series, but also predict the load change trend of AI computing nodes; when the prediction results show that a critical path will exceed the early warning threshold in the future, the SRv6 controller automatically triggers the pre-tuning process to pre-calculate and cache alternative paths that meet the SLA requirements for the affected service flows; when the actual utilization exceeds the tuning threshold, the cached alternative path strategy is immediately activated to achieve millisecond-level fast switching, upgrading the network operation and maintenance mode from passive response to proactive defense.
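The two-threshold behaviour of the pre-tuning step can be sketched in Python as below. The predictor is deliberately a stand-in: it extrapolates the most recent slope and is not the long short-term memory model the text describes. The 75% warning and 80% tuning thresholds are the example values given later in paragraph [0052]; the class and callback names are assumptions.

```python
def forecast_next(history, horizon=1):
    """Stand-in predictor: extrapolate the last slope (not an LSTM)."""
    if len(history) < 2:
        return history[-1]
    slope = history[-1] - history[-2]
    return min(1.0, max(0.0, history[-1] + slope * horizon))


class PreTuner:
    """Pre-compute a backup path on a predicted breach, activate it on a real one."""

    def __init__(self, compute_backup, warn=0.75, tune=0.80):
        self.compute_backup = compute_backup  # callable returning an alternative path
        self.warn = warn                      # early warning threshold
        self.tune = tune                      # tuning threshold
        self.cached = None                    # cached alternative path, if any

    def observe(self, history):
        """Process the utilization history and return (action, path)."""
        current = history[-1]
        if current > self.tune and self.cached is not None:
            path, self.cached = self.cached, None
            return ("activate", path)         # switch using the cached path
        if forecast_next(history) > self.warn and self.cached is None:
            self.cached = self.compute_backup()
            return ("precompute", self.cached)
        return ("none", None)
```

Because the backup path is computed while utilization is still below the tuning threshold, the activation branch only has to hand over a cached result, which is what makes a fast switchover plausible.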

[0022] In a second aspect, the present invention provides a financial dedicated network optimization system for large AI models, used to implement the method described in the first aspect, comprising:

[0023] The AI business perception module is deployed on the financial business side to collect business requirement parameters and traffic data of various financial AI large models in real time. The business requirement parameters include AI business type, data sensitivity, and latency requirements. The traffic data includes burst peaks of training traffic, transmission duration, and data packet size and interaction frequency of inference traffic. The collected data is then sent to the SRv6 control module.

[0024] The SRv6 control module, as the centralized management and control unit of the system, is used to manage and distribute SRv6 policies across the entire network, realizing AI service type identification, differentiated path calculation, computing power collaborative scheduling, and policy distribution. Specifically, it includes:

[0025] The business type identification unit is used to classify AI business into training business, inference business and sensitive data transmission business based on the data reported by the AI business perception module.

[0026] The differentiated path calculation unit is used to leverage the SID programming capabilities of SRv6 to generate exclusive SRv6 path strategies for different types of AI services, including multi-path load balancing strategies for training traffic, shortest path priority strategies for inference traffic, and secure path binding strategies for sensitive data traffic.

[0027] The computing power collaborative scheduling unit is used to receive the real-time load status of computing power nodes reported by the computing power resource monitoring module. When the load of a computing power node exceeds the preset threshold, it recalculates and adjusts the AI traffic path, schedules the traffic to computing power nodes with lower load, and adopts a smooth switching mechanism of "build first and then disconnect" during the switching process.

[0028] The policy distribution unit is used to distribute the generated SID list to the SRv6 forwarding module via the control protocol.
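The "build first and then disconnect" switchover applied by the computing power collaborative scheduling unit can be pictured with a short Python sketch. The three callbacks and the number of migration steps are hypothetical; they stand in for whatever connectivity check, weight update, and resource release the controller actually performs.

```python
def migrate(check_new_path, set_weights, release_old, steps=4):
    """Shift traffic to the new path only after it has been verified.

    check_new_path: returns True once the new path is established and reachable
    set_weights:    applies the traffic split between old and new path
    release_old:    frees the old path's resources
    """
    if not check_new_path():
        return False  # keep all traffic on the old path
    for i in range(1, steps + 1):
        new_share = i / steps
        set_weights(old=1.0 - new_share, new=new_share)  # gradual migration
    release_old()  # the old path carries no traffic at this point
    return True
```

Releasing the old path only after its share reaches zero is what keeps the change invisible to the application; if the connectivity check fails, nothing is touched.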

[0029] The computing power resource monitoring module, deployed on AI computing power nodes in various data centers, is used to collect and report the load status of all computing power nodes in the network in real time, specifically including:

[0030] The load acquisition unit is used to collect the CPU utilization, GPU utilization, memory usage and task execution progress of the computing nodes in real time.

[0031] The status reporting unit is used to report the collected load data to the computing power collaborative scheduling unit of the SRv6 control module in real time according to a preset cycle.

[0032] The SRv6 forwarding module, deployed at various forwarding nodes of the financial backbone network, is used to execute the forwarding and path authentication of AI business packets according to the policies issued by the SRv6 control module. Specifically, it includes:

[0033] The message encapsulation optimization unit is used to encapsulate and optimize SRv6 messages before forwarding. It adopts G-SRv6 compressed frame header technology to reduce message header overhead, uses a simplified SRH structure for AI inference traffic to reduce encapsulation latency, and uses batch SID offset encoding for AI training traffic to improve encapsulation efficiency.

[0034] The message forwarding unit is used to forward AI service messages hop-by-hop according to the received SID list;

[0035] The path authentication execution unit is used to verify the authentication code of the received message during the message forwarding process. If the authentication is successful, the message continues to be forwarded; if the authentication fails, the message is discarded.

[0036] The security protection module is used to build a multi-layered end-to-end security protection system for financial AI data; specifically, it includes:

[0037] The path dynamic authentication unit is used to work in conjunction with the SRv6 control module to generate a dynamically changing authentication segment identifier Authen-SID according to a specified period, and send it along with the segment list carrying the path identifier and sequence number to all SRv6 forwarding modules on the path for real-time authentication of packets.

[0038] The quantum encryption unit is used to establish an end-to-end quantum encryption channel for sensitive data traffic, and uses "SRv6 + quantum key + national cryptographic algorithm" technology to ensure the confidentiality and integrity of data transmission;

[0039] The data desensitization unit is used to desensitize sensitive fields before AI data enters the SRv6 network. Combined with federated learning technology, it enables sensitive data to participate in joint modeling only as model parameter updates without leaving the local network, thus achieving effective protection of the original data.

[0040] The operation and maintenance monitoring module is used for end-to-end visual monitoring, fault location, and data feedback of AI service traffic, specifically including:

[0041] The flow detection unit is used to embed a flow measurement identifier when AI service data packets enter the network using iFIT flow detection technology. The SRv6 forwarding modules along the way collect and report the latency, jitter and packet loss information of each hop in real time based on the identifier.

[0042] The visualization unit is used to reconstruct the complete transmission path and segmented performance indicators of each AI business flow based on the reported data, and to display the transmission status, path status and computing node load of AI traffic in real time in the form of charts.

[0043] The fault location unit is used to quickly locate the fault node and cause using the SID location capability of SRv6 when a network anomaly is detected, and to feed back the fault information to the SRv6 control module to trigger the recalculation or adjustment of the path.

[0044] The proactive early warning module, deployed on the operations and maintenance monitoring side, is used to predict future network status changes based on historical monitoring data, enabling proactive early warning and avoidance of network faults; specifically, it includes:

[0045] The predictive analysis unit is used to perform time-series predictions on the future bandwidth utilization of the core link and the load change trend of AI computing nodes using a long short-term memory recurrent neural network model.

[0046] The pre-tuning trigger unit is used to send a pre-tuning trigger command to the SRv6 control module when the prediction result exceeds the warning threshold. It pre-calculates and caches alternative paths, and activates the switchover immediately when actual congestion occurs.

[0047] This application adopts an integrated design of "business perception - computing network collaboration - path authentication - flow feedback," organically combining multiple technologies such as AI business feature recognition, computing load monitoring, dynamic path authentication, and flow detection to form a complete closed loop from business recognition to path calculation, and from security protection to operation and maintenance monitoring. The various technologies work together to produce the following beneficial effects:

[0048] First, this application establishes a precise mapping relationship between network scheduling and AI service characteristics, resolving the mismatch between traffic scheduling and AI service requirements in existing solutions. Existing SRv6 scheduling schemes classify services based on a five-tuple, but this classification granularity remains at the service source level, failing to perceive the differentiated network resource requirements at different stages within the same service category. This application, by collecting real-time traffic characteristic parameters of AI services, including burst peaks and transmission durations of training traffic, and packet size and interaction frequency of inference traffic, determines whether the current transmission is of training data or an inference request. For training traffic, multi-path load balancing and redundant bandwidth reservation are employed, aggregating multiple physical links into a logical high-bandwidth pipeline to meet its massive data burst transmission needs. For inference traffic, shortest path priority and highest priority scheduling are used to reduce forwarding hops and queuing, meeting its low-latency and fast-response requirements. Through this differentiated processing, the transmission throughput of training data is improved, and the end-to-end latency of inference requests is controlled, avoiding the problems of training traffic crowding out inference bandwidth or inference traffic exceeding latency limits due to path redundancy.

[0049] Second, this application incorporates the load of computing nodes into path scheduling decisions, achieving joint optimization of network transmission efficiency and computing resource utilization. In existing solutions, network optimization and computing scheduling belong to different systems, and path calculation is based solely on network status information without considering the load status of the destination computing nodes, easily leading to contradictions such as "smooth network but queuing computing power" or "idle computing power but network congestion." This application collects computing node load data such as CPU utilization and GPU utilization in real time. When the load of computing nodes exceeds a preset threshold, it identifies the affected service flows and filters candidate target nodes with lower loads. The path is recalculated with the constraint of minimizing end-to-end latency and avoiding congestion along the route, and traffic is scheduled to idle computing nodes. This mechanism integrates network path selection and computing task allocation into a unified decision-making framework, avoiding task queuing caused by computing overload. Tests show that when GPU utilization exceeds 70%, the average queuing latency and processing latency jitter of tasks increase sharply. This application, by pre-scheduling traffic to idle nodes, increases the effective working time of computing resources such as GPUs by more than 30%. Meanwhile, by distributing traffic to computing nodes in multiple data centers, the processing bottleneck caused by single-point computing power overload is avoided, and the overall computing output is improved without changing the hardware scale.

[0050] Third, this application embeds a hop-by-hop path authentication mechanism into the SRv6 forwarding process, solving the problem that existing solutions cannot verify transmission paths in real time. Existing SRv6 scheduling schemes focus mainly on path calculation itself and cannot verify whether the path has been hijacked or the packet has been tampered with. Security protection usually relies on encryption protocols such as IPsec, but these protocols protect the data content rather than the path itself. In this application, the controller generates a dynamically changing authentication segment identifier Authen-SID at a specified period (e.g., 1 second), which is sent to all nodes on the path along with a segment list carrying the path identifier and sequence number. When the head node encapsulates the packet, it uses this information to generate a first authentication code and embeds it in the packet. After receiving the packet, the intermediate and tail nodes recalculate a second authentication code and compare it with the first. If the two match, forwarding continues; if they do not match, the path is deemed invalid and the packet is discarded. This mechanism verifies path legitimacy in real time during forwarding. The 1-second authentication period is much shorter than the session timeout (more than 30 seconds) commonly seen in financial transactions, which effectively prevents replay attacks and prevents data from flowing to unauthenticated nodes, complementing the protection provided by encryption of the data content.

[0051] Fourth, this application establishes an end-to-end monitoring and closed-loop feedback mechanism from a business perspective through flow-following detection technology, shortening the fault response cycle. Existing monitoring technologies mainly focus on network infrastructure indicators such as link utilization and latency. When service performance deteriorates, maintenance personnel see abnormal link indicators but find it difficult to correlate these phenomena with specific service flows, resulting in fault investigation cycles typically lasting several hours. This application employs iFIT flow-following detection technology, embedding flow measurement identifiers when AI service data packets enter the network. Nodes along the route collect and report latency, jitter, and packet loss information for each hop in real time based on these identifiers. The maintenance platform reconstructs the complete transmission path and segment-by-segment performance indicators for each service flow based on the reported data. When anomalies such as excessive end-to-end latency, path interruption, or security authentication failure are detected, the SID positioning capability of SRv6 is used to quickly determine the location of the faulty node, improving the positioning accuracy from "a certain link" to "the outgoing port of a certain node," reducing fault investigation time to minutes. Simultaneously, fault information is fed back to the path calculation module to trigger path adjustment, forming an automated closed loop from monitoring to decision-making to execution.

[0052] Fifth, this application proactively mitigates risks and improves the efficiency of data transmission before congestion occurs by optimizing message encapsulation and proactively issuing early warnings. Existing technologies typically trigger optimization processes only after link utilization exceeds an 80% threshold, a reactive approach where services are already affected during the brief congestion period before the threshold is triggered. This application, based on historical monitoring data and employing a long short-term memory recurrent neural network model, not only predicts the future bandwidth utilization of core links but also forecasts the load change trends of AI computing nodes. When the prediction indicates that a critical path will exceed the warning threshold (e.g., 75%) in the future, a pre-optimization process is triggered in advance. Alternative paths that meet SLA requirements are pre-calculated and cached for affected service flows. Once the actual utilization exceeds the optimization threshold (80%), the cached path strategy is immediately activated, achieving millisecond-level rapid switching and reducing the impact of congestion on services. Meanwhile, this application uses G-SRv6 compressed frame header technology to reduce packet header overhead, adopts a simplified SRH structure for AI inference traffic, improves packet carrying efficiency by 40%, and reduces forwarding latency by more than 30%; for AI training traffic, it adopts batch SID offset encoding, improves packet carrying efficiency by 20%, increases single-link transmission throughput by 25%, and can shorten the transmission cycle by 20%-30% for PB-level training data.

[0053] Sixth, this application ensures business continuity through a smooth "build-then-disconnect" switching mechanism, guaranteeing that upper-layer applications are unaware of changes to the underlying path. When traffic needs to be switched from the original path to the new path, a new policy is first issued to the ingress node for pre-installation but not immediately activated. After the new path is successfully established and connectivity is confirmed, traffic is gradually migrated from the old path to the new path. The resources occupied by the original path are released only after the traffic on the original path reaches zero. This approach avoids potential business interruptions and packet loss during the switching process, with fault recovery time controlled within 50ms, meeting the requirements of Ethernet linear protection switching in the ITU-T G.8031 standard. This is particularly important for interruption-sensitive services such as real-time risk control and high-frequency trading.

Attached Figure Description

[0054] Figure 1 is a schematic diagram of the process for optimizing a dedicated financial network for large AI models;

[0055] Figure 2 is a schematic diagram of the business-aware differentiated SRv6 path calculation process;

[0056] Figure 3 is a schematic diagram of the computing-power-aware dynamic path adjustment process;

[0057] Figure 4 is a schematic diagram of the SRv6 path dynamic authentication and packet forwarding process;

[0058] Figure 5 is a schematic diagram of the end-to-end visualized operation and maintenance and closed-loop feedback process;

[0059] Figure 6 is a system architecture diagram for optimizing a dedicated financial network for large AI models.

Detailed Implementation

[0060] The technical solution of this application will be further described in detail below with reference to the accompanying drawings and specific embodiments. Those skilled in the art should understand that the embodiments described herein are for illustrative purposes only and are not intended to limit the scope of protection of this application.

[0061] Definitions:

[0062] 1. Multi-path load balancing: refers to the technology of distributing AI training traffic to multiple physical links in the SRv6 network for parallel transmission. It aims to aggregate multiple links into a logical high-bandwidth pipeline to meet the massive and sudden transmission needs of AI training data.

[0063] 2. Shortest Path First: This refers to an algorithm that selects forwarding paths for AI inference traffic with the goal of minimizing the number of hops or link latency between the source and destination nodes. It aims to reduce forwarding hops and transmission latency to meet the millisecond-level response requirements of inference tasks.

[0064] 3. Computing Power Co-scheduling: This refers to a mechanism that incorporates the real-time load status of AI computing power nodes (such as GPU servers) into the SRv6 network path scheduling decision. By adjusting the network path, traffic is scheduled from high-load computing power nodes to idle nodes, achieving joint optimization of computing and network resources.

[0065] 4. Build First, Disconnect Later: A smooth traffic switching mechanism. During path change, the connectivity of the new path is first established and verified. Once the new path is ready, traffic is gradually migrated from the old path to the new path. Finally, the resources of the old path are released, aiming to ensure zero interruption and zero packet loss during the switching process.

[0066] 5. Dynamic Path Authentication: This invention proposes a security mechanism for real-time legitimacy verification of SRv6 transmission paths. By periodically issuing dynamically changing authentication segment identifiers (Authen-SID) by the controller, and with the head node generating authentication codes and intermediate nodes verifying them hop-by-hop, path hijacking and packet tampering are prevented.

[0067] 6. Authen-SID (Authenticated Segment Identifier): A special type of SRv6 segment identifier defined in this invention. It is dynamically generated by the controller at specified intervals and contains authentication information such as timestamps and path identifiers. It is used to generate an authentication code together with the segment list to achieve real-time authentication of SRv6 paths.

[0068] 7. SegmentList: In this invention, it specifically refers to the composite SID sequence structure used for path authentication. It not only contains the basic SID sequence used to guide packet forwarding, but also embeds metadata such as Authen-SID, sequence number, and validity period used for security authentication.

[0069] 8. Flow-based detection: A performance monitoring technology that operates alongside business data streams. By embedding flow measurement identifiers into AI service packets, network nodes along the path collect and report latency, jitter, and packet loss information for each hop, thereby achieving accurate visualization and reconstruction of the transmission quality of each AI service stream.

[0070] 9. Pre-tuning:

[0071] A proactive network operation and maintenance mechanism. Based on AI algorithms such as LSTM, it predicts future network load. When a potential congestion risk is predicted, it calculates and caches alternative paths for the affected services in advance. When congestion actually occurs, it immediately activates the alternative path, shifting operation from passive response to proactive defense.

[0072] 10. G-SRv6: A general term for an SRv6 packet header compression technology. In this invention, it specifically refers to an encapsulation optimization technology that reduces SRv6 packet header overhead and improves payload transmission efficiency through methods such as field pruning, common prefix extraction, and SID aggregation encoding.

[0073] Example 1

[0074] This embodiment provides a method for optimizing a dedicated financial network for large AI models, applied to a "three-site, four-center" backbone network scenario for financial institutions. In this scenario, the network devices support the SRv6 protocol and are equipped with a centralized SRv6 controller.

[0075] The core logic of this method is: taking "AI business needs driving network optimization" as the core, leveraging the advantages of SRv6 such as SID programming and precise path scheduling, and combining the traffic characteristics, security requirements and computing power collaboration requirements of financial AI big data models, to build an integrated network optimization system of "perception-scheduling-security-collaboration-operation and maintenance".

[0076] As shown in Figure 1, the method includes the following steps:

[0077] Step 1: AI business perception and network status collection.

[0078] The SRv6 controller interacts with AI business systems via interfaces to collect real-time business requirement parameters and traffic data from various large-scale financial AI models, while also collecting real-time status information from each node in the SRv6 network layer. Specific data collected includes:

[0079] Business requirement parameters: AI business type (training task or inference task), data sensitivity (sensitive or public; for example, user credit data is sensitive, whereas public market data is public), and latency requirements (for example, real-time risk control inference requires end-to-end latency ≤10ms, and batch training requires latency ≤50ms).

[0080] Traffic data: For training traffic, collect its burst peak bandwidth (e.g., peak ≥ 10Gbps) and transmission duration; for inference traffic, collect its packet size (e.g., ≤ 1KB) and interaction frequency (e.g., ≥ 100 times / second).

[0081] Network status information: The SRv6 controller collects link bandwidth utilization, end-to-end latency, jitter, link fault status, etc. from each node of the SRv6 network layer.

[0082] Data is collected once every 100ms to keep it current and to avoid delayed scheduling caused by perception lag. The collected data is processed centrally by the SRv6 controller, giving it a comprehensive view of AI requirements and network status.

[0083] Step 2: Calculate differentiated SRv6 paths based on business awareness.

[0084] As shown in Figure 2, the SRv6 controller receives the data collected in step 1 and, based on the AI service type and traffic characteristics, classifies each flow into one of three core scenarios: AI training services, AI inference services, and sensitive data transmission services. The classification uses rule-based matching with preset characteristic thresholds for each type of service. For example, a burst traffic peak ≥10Gbps is classified as training traffic, and a data packet size ≤1KB combined with an interaction frequency ≥100 times per second is classified as inference traffic, which keeps the classification accurate.
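
The rule-based matching described above can be sketched as follows. This is only an illustrative sketch: the thresholds come from the embodiment, but the record fields, the function name, and the order in which the rules are checked are assumptions, since the text does not state a precedence between the three scenarios.

```python
from dataclasses import dataclass

# Thresholds quoted in the embodiment; all names here are hypothetical.
TRAINING_PEAK_GBPS = 10.0
INFERENCE_MAX_PACKET_BYTES = 1024
INFERENCE_MIN_REQ_PER_SEC = 100


@dataclass
class FlowSample:
    peak_gbps: float         # burst peak bandwidth
    packet_bytes: int        # typical packet size
    requests_per_sec: float  # interaction frequency
    sensitive: bool          # data sensitivity reported by the AI system


def classify(flow: FlowSample) -> str:
    """Return 'sensitive', 'training', 'inference', or 'unclassified'."""
    # Assumption: sensitivity is checked first so that sensitive data
    # always receives the security-bound path.
    if flow.sensitive:
        return "sensitive"
    if flow.peak_gbps >= TRAINING_PEAK_GBPS:
        return "training"
    if (flow.packet_bytes <= INFERENCE_MAX_PACKET_BYTES
            and flow.requests_per_sec >= INFERENCE_MIN_REQ_PER_SEC):
        return "inference"
    return "unclassified"
```

A flow that matches none of the preset thresholds is returned as "unclassified" here; how such flows are handled is not specified in the text.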

[0085] Based on the judgment results, the SRv6 controller utilizes the SID programming capability of SRv6, combined with the "three-site, four-center" topology architecture of the financial dedicated network, to calculate and generate exclusive SRv6 path strategies for different types of AI services, as follows:

[0086] (1) For AI training traffic, which is characterized by burstiness and massive data volume, SRv6 bandwidth and path pre-allocation is performed. Based on the traffic peak and combined with the multi-link reachability information between the source node and the destination node in the "three-site four-center" topology, the SRv6 controller allocates dedicated SRv6 segment routing paths for the training traffic. It generates the first SID list (training-dedicated SID list) (such as SID-T-01~SID-T-05) through programming and reserves redundant bandwidth (the reservation ratio is 20% of the traffic peak) to avoid burst traffic occupying core service bandwidth. At the same time, the SRv6 multi-path load balancing mechanism is adopted to distribute the training traffic to multiple idle links to improve transmission efficiency. Path selection prioritizes avoiding links with high current load.

[0087] (2) For AI inference traffic, which is characterized by high frequency, low latency, and small data packets, SRv6 shortest path priority scheduling is implemented. The SRv6 controller calculates the shortest SRv6 path between AI inference nodes and computing power nodes through the global network view and the hop count information between nodes in the "three-site four-center" topology. It then generates a second SID list (inference-specific SID list) (such as SID-I-01~SID-I-03), deletes redundant path nodes, reduces the number of forwarding hops, and sets the priority of inference traffic to the highest (higher than ordinary financial business traffic) to ensure millisecond-level response to inference requests. If the shortest path experiences slight latency fluctuations, it automatically switches to the backup shortest path (the latency difference between the backup path and the main path is ≤0.5ms).

[0088] (3) For sensitive data traffic involving financial user privacy information, security policies are implemented and bound to the SRv6 path. Combining the security authentication capabilities of nodes in the "three-site four-center" topology, the SRv6 path is deeply bound to security policies such as quantum encryption and data anonymization to generate a third SID list (security-specific SID list) (such as SID-S-01~SID-S-04). Only nodes that have passed security authentication are allowed to access the path. At the same time, a latency threshold for sensitive data transmission is preset to ensure that security protection does not affect transmission efficiency.
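
As a rough illustration of strategy (1), the sketch below reserves the traffic peak plus 20% headroom and splits it across lightly loaded links. The reading of "reservation ratio" as peak plus 20%, the busy-link cutoff, and the proportional split are all assumptions made for the example, not details given in the text.

```python
def plan_training_bandwidth(peak_gbps, links, reserve_ratio=0.20,
                            busy_threshold=0.80):
    """Split a training flow across lightly loaded links.

    links maps a link name to (capacity_gbps, utilization in [0, 1]).
    Returns {link: allocated_gbps}, or None if idle capacity is insufficient.
    """
    # Assumption: reserved bandwidth = peak plus 20% of the peak.
    required = peak_gbps * (1.0 + reserve_ratio)
    # Avoid links with high current load (cutoff value is an assumption).
    free = {name: cap * (1.0 - util)
            for name, (cap, util) in links.items()
            if util < busy_threshold}
    total_free = sum(free.values())
    if total_free < required:
        return None
    # Assumption: share the demand in proportion to each link's free capacity.
    return {name: required * f / total_free for name, f in free.items()}
```

For example, a 10Gbps peak over two eligible links with 50Gbps and 25Gbps free would be split roughly 8Gbps and 4Gbps under these assumptions.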

[0089] Step 3: Dynamic adjustment of the computing power perception path.

[0090] As shown in Figure 3, the SRv6 controller collects the load status of AI computing nodes in each data center in real time, specifically: CPU utilization, GPU utilization, memory usage, and task execution progress. The collection frequency is the same as in step 1, once every 100ms. The SRv6 controller establishes a linkage mechanism between computing load and network scheduling, binding the load status of computing nodes to SRv6 paths and bandwidth allocation to achieve collaborative "computing power-network" optimization.

[0091] After receiving the computing load data, the SRv6 controller determines the load status of each computing node. In this embodiment, the preset load threshold is 70% (which can be adjusted according to the needs of financial AI business). When the GPU utilization exceeds this threshold three times consecutively, the node is determined to be in an overload state.

[0092] When an overload of a computing node is detected, the SRv6 controller dynamically adjusts its execution path according to the following sub-steps:

[0093] Sub-step 3-1, Overload event triggering. The SRv6 controller receives and updates the computing resource status table at a fixed period (100ms). When it detects that the GPU utilization of any AI computing node exceeds the preset overload threshold (70%) three times consecutively, it determines that the node has entered an overload state and identifies the list of affected service flows passing through the node.
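
The "three consecutive samples above 70%" rule can be sketched as a small per-node counter. The class and method names are hypothetical; only the threshold and the count of three come from the text.

```python
from collections import defaultdict


class OverloadDetector:
    """Flags a node once its GPU utilization exceeds the threshold
    on `consecutive` successive samples (one sample per 100 ms cycle)."""

    def __init__(self, threshold=0.70, consecutive=3):
        self.threshold = threshold
        self.consecutive = consecutive
        self._streak = defaultdict(int)  # node name -> current streak length

    def observe(self, node, gpu_util):
        """Record one sample; return True if the node is now overloaded."""
        if gpu_util > self.threshold:
            self._streak[node] += 1
        else:
            self._streak[node] = 0  # any normal sample resets the streak
        return self._streak[node] >= self.consecutive
```

Requiring consecutive samples keeps a single short spike from triggering path recalculation.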

[0094] Sub-step 3-2, Candidate Target Node Screening. For each affected service flow, initiate the path recalculation process. First, analyze the service flow's requirements, particularly its required computing power type (e.g., GPU model) and computing power size. Then, query the current computing power resource status table and screen all other data center nodes to select nodes that meet the following conditions as candidate target nodes: the node has idle computing power resources of the type required by the service, and the node's current overall load (e.g., weighted average GPU utilization) is below the acceptance threshold (50%). This screening process excludes equally busy nodes, ensuring that traffic is scheduled to truly idle computing power resources.
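
A minimal sketch of this screening step, assuming the computing resource status table is available as a list of dictionaries. The dictionary keys and the function name are hypothetical; the 50% acceptance threshold is the value given above.

```python
def screen_candidates(nodes, overloaded, gpu_model, gpus_needed,
                      accept_threshold=0.50):
    """Return names of nodes that can take over a flow from `overloaded`.

    Each node is a dict with hypothetical keys:
    'name', 'gpu_model', 'idle_gpus', 'load' (overall load in [0, 1]).
    """
    return [
        n["name"] for n in nodes
        if n["name"] != overloaded          # never pick the overloaded node
        and n["gpu_model"] == gpu_model     # required computing power type
        and n["idle_gpus"] >= gpus_needed   # required computing power size
        and n["load"] < accept_threshold    # excludes equally busy nodes
    ]
```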

[0095] Sub-step 3-3: Joint Path Calculation and Optimal Solution Selection. For each selected candidate target node, an improved multi-constraint shortest path algorithm is invoked. Starting from the source node and ending at the candidate node, an optimal network path that satisfies the service flow SLA constraints is calculated. The goal of this algorithm is to minimize end-to-end latency while ensuring that the bandwidth utilization of all links on the path does not exceed a preset safety threshold (80%) after the addition of this service flow. After the algorithm is executed, for each (candidate node, optimal path) combination, a corresponding comprehensive score is calculated. The comprehensive score function is designed as: Score = w1 × (1 - maximum bandwidth utilization of the path) + w2 × (1 - target node load), where w1 and w2 are weight coefficients that can be dynamically adjusted according to the network and computing power constraints (e.g., increasing w2 when computing power resources are scarce). The (node, path) combination with the highest comprehensive score is selected as the new scheduling strategy.
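
The comprehensive score and the selection of the best (node, path) combination can be sketched as below. The score formula is the one stated above; the default weights, the dictionary keys, and the helper names are illustrative assumptions, and the multi-constraint path calculation itself is taken as already done.

```python
def composite_score(max_link_util, node_load, w1=0.5, w2=0.5):
    """Score = w1 * (1 - max path bandwidth utilization) + w2 * (1 - node load)."""
    return w1 * (1.0 - max_link_util) + w2 * (1.0 - node_load)


def pick_best(candidates, w1=0.5, w2=0.5, link_safety_threshold=0.80):
    """Pick the highest-scoring (node, path) combination.

    Each candidate is a dict with hypothetical keys 'node',
    'max_link_util' (after adding the flow) and 'node_load'.
    """
    # Paths whose busiest link would exceed the safety threshold are infeasible.
    feasible = [c for c in candidates
                if c["max_link_util"] <= link_safety_threshold]
    if not feasible:
        return None
    return max(feasible,
               key=lambda c: composite_score(c["max_link_util"],
                                             c["node_load"], w1, w2))
```

Raising w2 when computing power is scarce shifts the choice toward the least loaded node even if its path is busier.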

[0096] Sub-step 3-4: Task Priority Coordination. If the load of all candidate computing nodes exceeds the acceptance threshold, indicating a general shortage of computing resources across the network, the SRv6 controller automatically triggers AI task priority scheduling, suspends non-core AI training tasks, releases computing resources, and prioritizes computing and network resources for core inference tasks (such as real-time risk control and high-frequency trading analysis) to ensure the continuity of core financial AI business.

[0097] Sub-step 3-5: Smooth Policy Switching and Execution. Based on the selected optimal path, a new SRv6 policy, i.e., a new SID list, is generated. To ensure uninterrupted service during the switchover process, a smooth "build first, disconnect later" switchover mechanism is adopted. Specifically, the new policy (including the new SID list) is first sent to the ingress node, but no immediate switchover command is issued. After receiving the new policy, the ingress node establishes forwarding entries for the new path but still uses the old path to forward service flows. After the new path is successfully established (e.g., path connectivity is confirmed through mechanisms such as BFD), a traffic switching command is issued, instructing the ingress node to migrate traffic from the old path to the new path gradually. For example, 10% of the traffic can be forwarded through the new path first, and after the network and computing node status is observed to be normal, the proportion can be gradually increased until 100% of the traffic is switched to the new path. Finally, the resources of the old path are released.
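
The gradual migration can be sketched as a simple loop over traffic shares. The step sizes after the initial 10%, the callback names, and the rollback to the old path on a failed health check are assumptions added for the example; the text only describes the forward progression.

```python
def gradual_switch(path_ok, health_ok, steps=(10, 25, 50, 100)):
    """Return the sequence of new-path traffic percentages that were applied.

    path_ok():      connectivity check of the pre-installed new path
                    (for example, a BFD-style probe).
    health_ok(pct): network and computing node status after moving `pct`
                    percent of the traffic to the new path.
    """
    applied = []
    if not path_ok():
        return applied  # new path not ready: all traffic stays on the old path
    for pct in steps:
        applied.append(pct)
        if not health_ok(pct):
            # Assumption: fall back to the old path, which is still installed.
            applied.append(0)
            return applied
    # At 100% the old path carries no traffic and its resources can be released.
    return applied
```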

[0098] In addition, this method also includes a rapid deployment mechanism for computing power nodes: combined with ZTP zero-configuration start-up technology, newly launched AI computing power nodes can automatically connect to the SRv6 network. The SRv6 controller automatically identifies the new nodes and assigns them a dedicated SID and path, realizing "plug and play". This reduces the deployment cycle of new nodes from several weeks to minutes, adapting to the needs of rapid iteration in financial AI business.

[0099] Step 4: SRv6 path dynamic authentication and message forwarding.

[0100] As shown in Figure 4, to meet the security requirements of sensitive financial AI data, a three-layer security protection system of "SRv6 path authentication + quantum encryption + data anonymization" is constructed, implemented as follows:

[0101] (1) SRv6 Path Dynamic Authentication. A time-period-based SRv6 transmission path authentication method is adopted. The SRv6 controller sends an Authen-SID specific to the path to all SRv6 nodes on the preset path according to a specified period (the preset period is 1 second), and at the same time sends a SegmentList carrying the PathID and Sequence. When the SRv6 head node sends a message, it uses the SegmentList and Authen-SID to calculate and generate the first SRH authentication code. When the intermediate nodes and tail nodes forward the message, they recalculate and generate the second authentication code and compare it with the first authentication code. If they match, the message is forwarded; otherwise, it is discarded to prevent path tampering and replay attacks.
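
The generate-then-verify flow can be illustrated as follows. The text does not name the algorithm used for the authentication code, so HMAC-SHA256 keyed with the Authen-SID is used here purely as a stand-in; the function names are hypothetical.

```python
import hashlib
import hmac


def auth_code(segment_list: bytes, authen_sid: bytes) -> bytes:
    """Head node: derive the authentication code carried in the SRH.

    Stand-in construction: HMAC-SHA256 keyed with the current Authen-SID.
    """
    return hmac.new(authen_sid, segment_list, hashlib.sha256).digest()


def verify_hop(segment_list: bytes, carried_code: bytes,
               local_authen_sid: bytes) -> bool:
    """Intermediate or tail node: recompute the code and compare.

    Returns True if the packet should be forwarded, False if discarded.
    """
    expected = auth_code(segment_list, local_authen_sid)
    return hmac.compare_digest(expected, carried_code)
```

Because every node recomputes the code with the Authen-SID of the current period, a packet with a modified segment list or one replayed after the Authen-SID has rotated fails the comparison.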

[0102] The Authen-SID is uniformly generated by the SRv6 controller and distributed to all SRv6 nodes along a preset path at a preset interval (1 second, configurable). The generation cycle is completely synchronized with the path authentication cycle. It automatically expires upon timeout and a new Authen-SID is generated. The generation uses a hybrid hash algorithm of "timestamp + path-unique ID + node public key hash" to ensure it cannot be illegally forged. The core generation formula is:

[0103] Authen-SID = SHA256(T + PID + H(PubKey)) mod 2^128

[0104] The parameters are defined as follows:

[0105] T: Millisecond-level precise timestamp (e.g., 1690000000000) to ensure the timeliness of Authen-SID;

[0106] PID: A unique path ID within the financial private network (assigned according to the "three-site, four-center" topology; for example, the PID of the path from Head Office IDC1 to East China IDC2 is 001-002, which is globally unique);

[0107] H(PubKey): The SM3 hash of the public keys of all legitimate SRv6 nodes on this path (generated using the SM2 algorithm), ensuring that only legitimate nodes can resolve it;

[0108] mod 2^128: Maps the calculation result to 128 bits, matching the standard length of an SRv6 SID.
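
The generation formula can be sketched with the standard library as below. Two points are assumptions: "+" is read as byte concatenation, and SHA-256 stands in for the SM3 hash of the node public keys because Python's hashlib does not guarantee SM3 support. The function name and argument encoding are hypothetical.

```python
import hashlib
from typing import Sequence


def authen_sid(timestamp_ms: int, path_id: str,
               node_pubkeys: Sequence[bytes]) -> int:
    """Authen-SID = SHA256(T + PID + H(PubKey)) mod 2^128, as a 128-bit int."""
    # Stand-in for the SM3 hash over the public keys of all path nodes.
    h_pub = hashlib.sha256(b"".join(node_pubkeys)).digest()
    # Assumption: the three inputs are concatenated as bytes.
    material = str(timestamp_ms).encode() + path_id.encode() + h_pub
    digest = hashlib.sha256(material).digest()
    return int.from_bytes(digest, "big") % (1 << 128)
```

Because T changes with every period, the value differs for each issuance even when the path and node set stay the same.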

[0109] Prerequisites for generation: The SRv6 controller must first verify the online status and identity legitimacy of all nodes on the path. If there are unauthenticated or offline nodes, the path will be replanned immediately and a new PID will be generated before the Authen-SID generation process is executed.

[0110] The Authen-SID uses a standard 128-bit binary structure, functionally segmented according to the authentication and expansion requirements of the financial AI private network. Each field is a fixed 32 bits with no redundancy and can be directly embedded into the SegmentList field of the SRH header, compatible with the basic forwarding SID. Specifically, the segments are as follows: bits 0-31 store the lower 32 bits of a millisecond-level timestamp, used by nodes to verify whether the Authen-SID has timed out; bits 32-63 store the encoded value of the path-unique PID, used to bind the authentication identifier to the specific transmission path; bits 64-95 store the lower 32 bits of the node's public key SM3 hash value, used for initial node identity verification; and bits 96-127 are reserved fields to accommodate future security expansions in the financial industry.
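
The four 32-bit fields can be packed and parsed with the struct module, as in this sketch. It assumes that bit 0 is the most significant bit (network byte order) and that the "lower 32 bits" of the hash are its last four bytes; the function names are hypothetical.

```python
import struct


def pack_authen_sid(timestamp_ms: int, pid_code: int,
                    pubkey_hash: bytes, reserved: int = 0) -> bytes:
    """Build the 16-byte Authen-SID from its four 32-bit fields."""
    ts_low = timestamp_ms & 0xFFFFFFFF                  # bits 0-31
    pid = pid_code & 0xFFFFFFFF                         # bits 32-63
    hash_low = int.from_bytes(pubkey_hash[-4:], "big")  # bits 64-95
    return struct.pack("!IIII", ts_low, pid, hash_low, reserved)


def unpack_authen_sid(sid: bytes):
    """Return (timestamp_low32, pid_code, pubkey_hash_low32, reserved)."""
    return struct.unpack("!IIII", sid)
```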

[0111] SegmentList is a composite SID sequence structure generated by the SRv6 controller for the authentication path. It contains the basic forwarding SID sequence, Authen-SID, and authentication metadata. It serves as the dual basis for SRv6 packet forwarding and path authentication. It is embedded in the SRH field of the SRv6 header and is transmitted between path nodes along with the packet, ensuring that each node can complete path validity and packet integrity verification based on this structure.

[0112] The SegmentList is generated by the SRv6 controller based on a pre-configured authentication path on the financial AI private network. The generation is synchronized with the Authen-SID and is only distributed to legitimate nodes along that path. The core generation rules are as follows:

[0113] Basic SID Sequence Generation: Based on the financial "three-site, four-center" topology, generate the basic SRv6 forwarding SID sequence for this path (such as head node SID, intermediate node SID, tail node SID) to ensure the basic forwarding capability of the packet.

[0114] Add authentication metadata: Add a sequence number and validity period to the base SID sequence. The sequence number is an auto-incrementing unique integer (to prevent replay attacks), and the validity period is the same as the Authen-SID (1 second).

[0115] Authen-SID Fusion: The generated Authen-SID is embedded at the end of the basic SID sequence as an authentication-specific SID, forming a complete authentication SegmentList with the forwarding SID;

[0116] Validity verification: After generation, the SRv6 controller sends a verification command to all nodes in the path. The SegmentList will only become effective if all nodes report that they have successfully received and parsed the command.

[0117] The SegmentList is a variable-length binary structure whose total length depends on the number of nodes in the path (base SID sequence length = 128 bits × number of nodes). It is embedded within the SegmentList field of the SRH header and fully complies with the SRH field specification of RFC 8986. Its structure, from high to low bits, consists of: header identifier (8 bits, identifying it as an "authentication-type SegmentList"), number of path nodes (8 bits, storing the number of SRv6 nodes in the path), base forwarding SID sequence (128 bits × N, where N is the number of path nodes), Authen-SID (128 bits), sequence number (32 bits, an auto-incrementing unique integer), and expiration time (32 bits, storing the expiration timestamp of the Authen-SID).
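
The layout described above can be serialized as in the following sketch. The header identifier value, the function names, and the use of a seconds-based 32-bit expiration timestamp are assumptions; the field order and widths follow the text.

```python
import ipaddress
import struct

AUTH_SEGLIST_TAG = 0xA5  # hypothetical value of the 8-bit header identifier


def pack_segment_list(sids, authen_sid: bytes, seq: int, expires_at: int) -> bytes:
    """Serialize: tag(8) | node count(8) | N x SID(128) | Authen-SID(128) | seq(32) | expiry(32)."""
    if not 0 < len(sids) <= 255:
        raise ValueError("node count must fit in 8 bits")
    if len(authen_sid) != 16:
        raise ValueError("Authen-SID must be 128 bits")
    body = b"".join(ipaddress.IPv6Address(s).packed for s in sids)
    return (struct.pack("!BB", AUTH_SEGLIST_TAG, len(sids)) + body
            + authen_sid + struct.pack("!II", seq, expires_at))


def unpack_segment_list(blob: bytes):
    """Parse the structure back into (tag, sids, authen_sid, seq, expires_at)."""
    tag, n = struct.unpack_from("!BB", blob, 0)
    off = 2
    sids = [str(ipaddress.IPv6Address(blob[off + 16 * i: off + 16 * (i + 1)]))
            for i in range(n)]
    off += 16 * n
    authen = blob[off:off + 16]
    seq, expires_at = struct.unpack_from("!II", blob, off + 16)
    return tag, sids, authen, seq, expires_at
```

For a three-node path the serialized structure is 2 + 48 + 16 + 8 = 74 bytes under this layout.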

[0118] (2) Quantum encryption protection. For sensitive data traffic, a quantum encryption strategy is bound to the SRv6 path. The "SRv6 + quantum key + national cryptographic algorithm" technology is adopted. Quantum security U devices are deployed at the headquarters and branches to establish an end-to-end quantum encryption dedicated line channel to achieve encryption protection for sensitive data transmission.

[0119] (3) Data anonymization protection. Before AI data enters the SRv6 network, sensitive data is anonymized (e.g., hiding the middle digits of user ID card number and bank card number). Combined with federated learning technology, sensitive data can participate in joint modeling only as model parameter updates without leaving the local network, thus effectively protecting the original data and preventing the leakage of sensitive data.

[0120] (4) Security protection and scheduling linkage. The security protection module and the SRv6 controller are linked in real time. If abnormal traffic is detected (such as unauthorized access, data tampering, authentication failure), the SRv6 path dynamic switching is triggered immediately to cut off the abnormal connection and send the alarm information to the operation and maintenance platform to realize the integrated processing of "security alarm - path switching - fault diagnosis".

[0121] In addition, before forwarding SRv6 packets, a packet encapsulation optimization step is included: Addressing the low carrying efficiency of existing SRv6 packets, G-SRv6 compressed frame header technology is used to optimize the encapsulation of SRv6 packets. The G-SRv6 compression algorithm employs a two-stage execution logic of "field pruning + SID aggregation," and is customized to suit the training / inference traffic characteristics of financial AI scenarios, as detailed below:

[0122] Phase 1: Field Pruning Algorithm. Fields in the standard SRv6 header that are not essential for financial AI scenarios are selectively removed, retaining only the core fields necessary for routing and forwarding. IPv6 Basic Header Pruning: Non-essential extended fields such as Hop-by-Hop Options and Destination Options are removed, retaining eight core fields: Version, Traffic Class, Flow Label, Payload Length, Next Header, Hop Limit, Source Address, and Destination Address. This reduces the IPv6 basic header length from 40 bytes to 32 bytes (inference traffic) or retains 40 bytes (training traffic, where flow labels must be retained for batch data transmission identification). SRH Pruning: Non-essential fields such as Reserved and HMAC Key ID are removed from the SRH, retaining only three core fields: SegmentCount, LastSegmentIndex, and SegmentList. This compresses the SRH length from the standard minimum of 32 bytes to 8 bytes (inference traffic) or 16 bytes (training traffic).

[0123] Phase 2: SID Aggregation and Compression Algorithm. Aggregation encoding is performed on the SegmentList (composed of multiple 128-bit SIDs), which accounts for the largest share of the SRH. SID storage overhead is reduced by exploiting two characteristics of the financial private network: relatively fixed path nodes and highly repetitive SID prefixes. SID Prefix Extraction: The common prefix of all SIDs in the same financial private network path is extracted (e.g., in the "three-site, four-center" topology of financial institutions, SIDs in the same area have consistent prefixes). This common prefix is stored in the local dictionary of the SRv6 controller, and only the prefix index (4-bit / 8-bit) is carried in the header. Repeated SID Encoding: For consecutively repeated SIDs in the path (e.g., intermediate node SIDs in multi-hop forwarding), an encoding method of "repetition count + base SID" is used (e.g., three consecutive identical SIDs are encoded as "0011 + base SID"), replacing the original three complete 128-bit SIDs.

[0124] Adaptation for Financial AI Scenarios: For AI inference traffic (small data packets, high-frequency interactions), a "prefix index + single SID encoding" approach is adopted to compress multiple SIDs in the SegmentList into "1-byte prefix index + 2-byte SID suffix," reducing the storage overhead of a single SID from 16 bytes to 3 bytes. For AI training traffic (large data packets, burst transmissions), a "batch SID offset encoding" approach is adopted to sort multiple consecutive SIDs by offset, storing only the starting SID and the offset list, achieving a compression ratio of 1:5 to 1:8.
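
The "1-byte prefix index + 2-byte SID suffix" encoding for inference traffic can be sketched as below. It assumes that every SID on the path shares a 14-byte prefix registered in a dictionary known to both ends; the dictionary layout and function names are hypothetical.

```python
import ipaddress


def compress_sid(sid: str, prefix_table: dict) -> bytes:
    """Encode a 16-byte SID as 1-byte prefix index + 2-byte suffix."""
    raw = ipaddress.IPv6Address(sid).packed
    prefix, suffix = raw[:14], raw[14:]
    index = prefix_table[prefix]  # KeyError if the prefix is not registered
    return bytes([index]) + suffix


def decompress_sid(blob: bytes, prefix_table: dict) -> str:
    """Restore the full SID from the 3-byte compressed form."""
    inverse = {idx: prefix for prefix, idx in prefix_table.items()}
    return str(ipaddress.IPv6Address(inverse[blob[0]] + blob[1:3]))
```

With this encoding a three-SID list shrinks from 48 bytes to 9 bytes, which matches the 16-byte to 3-byte per-SID reduction stated above.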

[0125] Algorithm execution flow: When a packet enters the network, the SRv6 edge router receives the compression strategy issued by the SRv6 controller (dynamically adjusted based on the AI service type). It executes the field pruning algorithm to remove unnecessary fields and generate a simplified IPv6+SRH header. It then executes the SID aggregation compression algorithm to optimize the encoding of the SegmentList, generates the final compressed frame header, and adds a 1-bit "compression identifier" (identifying the compression type) to the SRH. The packet is forwarded in the financial private network according to the compressed frame header, and all SRv6 nodes identify the compression type through the "compression identifier" and call the corresponding parsing logic. When the packet leaves the network, the edge router performs the reverse decompression process to restore the standard SRv6 frame header, ensuring protocol compatibility with the computing power nodes.

[0126] The compressed frame header data structures are compared below:

[0127]

[0128] According to the measured data in the "G-SRv6 Network Technology White Paper (2022 Edition)", in the financial private network scenario: the packet carrying efficiency of AI inference traffic is improved by 44.4%, and the forwarding latency is reduced by more than 30%, fully meeting the requirement of "end-to-end latency ≤10ms"; the packet carrying efficiency of AI training traffic is improved by 22.2%, the single-link transmission throughput is improved by 25%, and the PB-level training data transmission cycle is shortened by 20%-30%.

[0129] Step 5: End-to-end visualized operation and maintenance and closed-loop feedback.

[0130] The SRv6 controller employs flow-following detection technology to monitor the transmission status of AI traffic (bandwidth utilization, packet loss rate), SRv6 path status (latency, jitter, node load), computing node status (load, task execution progress), and security protection status (authentication results, anomaly alarms) in real time. Using iFIT flow-following detection technology, it loads a flow measurement identifier onto each AI data packet, enabling end-to-end visual monitoring from AI service nodes to computing nodes. Simultaneously, the monitoring data is displayed in chart format (such as latency change curves and bandwidth utilization bar charts), facilitating real-time monitoring of the network and AI service operation status by operations and maintenance personnel.

[0131] As shown in Figure 5, the proactive early warning and AI algorithm prediction function of step 5 is implemented as follows:

[0132] First, a predictive model is established. A separate data analysis and prediction platform is deployed, employing a Long Short-Term Memory (LSTM) recurrent neural network model to perform time-series predictions of bandwidth utilization on the core link. The model's input feature vector includes: time-series data of the link's utilization over the past 60 minutes (at 1-minute granularity), the number of currently active cross-datacenter AI training tasks and their estimated remaining time, whether the current day is a financial trading day or a special settlement day (represented by Boolean values), and hourly time features.
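
The input feature vector can be assembled as in the following sketch. The 60-sample history, task count, remaining time, and day-type Booleans come from the text; the sine/cosine encoding of the hour and the function name are assumptions, since the form of the "hourly time features" is not specified.

```python
import math


def build_features(util_history, active_tasks, est_remaining_min,
                   is_trading_day, is_settlement_day, hour):
    """Return one flat feature vector for the prediction model.

    util_history: 60 link-utilization samples at 1-minute granularity.
    """
    if len(util_history) != 60:
        raise ValueError("expected 60 one-minute utilization samples")
    # Assumption: the hour of day is encoded cyclically as sine and cosine.
    angle = 2.0 * math.pi * hour / 24.0
    return list(util_history) + [
        float(active_tasks),
        float(est_remaining_min),
        float(is_trading_day),      # Boolean as 0.0 / 1.0
        float(is_settlement_day),   # Boolean as 0.0 / 1.0
        math.sin(angle),
        math.cos(angle),
    ]
```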

[0133] Secondly, model training and updates. The prediction platform constructs training and validation sets using historically collected network monitoring data, and performs supervised training on the LSTM model using the average link utilization over the next 5 minutes as the prediction target. The training process uses root mean square error as the loss function and the Adam optimizer for parameter updates. After training, the model is encapsulated as a prediction service, interacting with the SRv6 controller via a REST API. The model is retrained weekly using the latest data to ensure its adaptability to dynamic network changes.

[0134] Finally, the application of prediction results and proactive tuning. Before each routine path calculation, the SRv6 controller calls this prediction service to obtain the predicted utilization rate of each core link within the next 5 minutes. When the prediction results show that the utilization rate of a critical path will exceed the preset warning threshold (75%) within the next 5 minutes, it is judged as a potential congestion risk. Subsequently, the controller automatically triggers a pre-tuning process: for the low-priority service flows carried on this path (such as non-real-time AI training data backup), alternative paths that meet their SLA requirements are calculated in advance, and these alternative paths are cached in the policy library in a "standby" state. If the actual network utilization rate continues to climb and exceeds the formal tuning threshold (80%), the controller immediately activates these pre-calculated standby policies to achieve millisecond-level rapid switching, thereby effectively avoiding congestion and upgrading the network operation and maintenance mode from "passive response" to "proactive defense".
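The two-threshold logic described above (cache a standby path at the 75% warning threshold, activate it at the 80% tuning threshold) can be sketched as follows. The `PreTuner` class and its method names are hypothetical, and the SLA-aware path computation is abstracted into a callable supplied by the caller.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

WARNING_THRESHOLD = 0.75  # predicted utilization that triggers pre-tuning
TUNING_THRESHOLD = 0.80   # actual utilization that activates the standby path


@dataclass
class PreTuner:
    # path id -> cached alternative SID list kept in "standby" state
    standby: Dict[str, List[str]] = field(default_factory=dict)

    def on_prediction(
        self,
        path_id: str,
        predicted_util: float,
        compute_alternative: Callable[[], List[str]],
    ) -> bool:
        """Pre-compute and cache an alternative path if congestion is predicted."""
        if predicted_util > WARNING_THRESHOLD:
            self.standby[path_id] = compute_alternative()
            return True
        return False

    def on_measurement(self, path_id: str, actual_util: float) -> Optional[List[str]]:
        """Return the cached SID list to activate, or None if no switch is needed."""
        if actual_util > TUNING_THRESHOLD and path_id in self.standby:
            return self.standby.pop(path_id)
        return None
```

Because the alternative path is already cached when the tuning threshold is crossed, activation only needs a lookup rather than a new path calculation.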

[0135] The SRv6 controller determines whether there are network anomalies based on monitoring data. The anomaly judgment criteria include: end-to-end latency exceeding a preset threshold, jitter exceeding a preset threshold, path interruption, packet loss rate ≥1%, and security authentication failure.
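The anomaly criteria above can be expressed as a simple check. This is a sketch: the `PathSample` type and the reason labels are hypothetical, and the latency and jitter limits are passed in as parameters because the description only calls them "preset".

```python
from dataclasses import dataclass
from typing import List

PACKET_LOSS_LIMIT = 0.01  # packet loss rate >= 1% counts as an anomaly


@dataclass
class PathSample:
    """Hypothetical monitoring sample for one SRv6 path."""
    latency_ms: float
    jitter_ms: float
    packet_loss: float  # fraction in [0, 1]
    path_up: bool
    auth_ok: bool


def detect_anomalies(
    s: PathSample, max_latency_ms: float, max_jitter_ms: float
) -> List[str]:
    """Return the list of criteria that the sample violates (empty means normal)."""
    reasons = []
    if s.latency_ms > max_latency_ms:
        reasons.append("latency")
    if s.jitter_ms > max_jitter_ms:
        reasons.append("jitter")
    if not s.path_up:
        reasons.append("path_interruption")
    if s.packet_loss >= PACKET_LOSS_LIMIT:
        reasons.append("packet_loss")
    if not s.auth_ok:
        reasons.append("auth_failure")
    return reasons
```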

[0136] If the network is normal, the monitoring task will continue to be executed, and the process will loop back to step 2 to adapt to AI business needs and traffic changes in real time.

[0137] If a network anomaly occurs, SRv6 rapid fault location and repair is performed: Utilizing SRv6's SID location capability, combined with monitoring data, faulty nodes (such as SRv6 router failures, link interruptions, and computing node failures) are quickly located, and the fault location and cause are marked on the visualization platform. For path interruption faults, the SRv6 controller automatically triggers redundant path switching (switching time ≤ 50ms) to restore AI traffic transmission; for node failures, traffic is automatically scheduled to backup nodes. After fault repair, the system automatically switches back to the original path (or the optimal path) and records fault information for subsequent operation and maintenance optimization.

[0138] If the AI service is not terminated, the optimization process continues to cycle through steps 2 to 5, adapting in real time to the dynamic changes of the AI service and the fluctuations of the network status, ensuring that the network is always in the optimal operating state; if the AI service is terminated (such as when AI training is completed or the inference task ends), the entire optimization process is terminated, SRv6 path and bandwidth resources are released, and monitoring data and optimization logs are archived to provide a reference for network optimization of subsequent AI services.

[0139] The key thresholds and parameters involved in this technical solution are all set based on the characteristics of financial AI business, hardware performance benchmark tests, and industry best practices, and have dynamic configurability to adapt to the needs of financial institutions of different sizes and types:

[0140] Business Feature Threshold Setting: The traffic feature thresholds used in Step 2 to distinguish between AI training and inference services (e.g., burst traffic peaks ≥10Gbps are judged as training traffic) are determined as follows: In the initial stage of system deployment, all AI service traffic for one consecutive month is collected through the SRv6 visualization monitoring platform. Statistical analysis is performed on the collected traffic data to extract features such as peak bandwidth, average packet length, and request frequency for each service flow. A clustering algorithm automatically categorizes service flows into training and inference types, and the 95th percentile of each type of service traffic feature is used as the default judgment threshold. After the system goes live, the controller will automatically recalculate and fine-tune these thresholds monthly based on real-time collected traffic data, achieving adaptive optimization of the thresholds.
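A minimal sketch of the threshold derivation, reduced to a single feature (peak bandwidth): flows are split into two clusters and the 95th percentile of each cluster is taken. The description does not name the clustering algorithm or the percentile method, so the one-dimensional two-means clustering and the nearest-rank percentile used here are assumptions.

```python
from typing import List, Sequence, Tuple


def two_means_1d(
    values: Sequence[float], iterations: int = 20
) -> Tuple[List[float], List[float]]:
    """Split values into a low and a high cluster with 1-D k-means (k = 2)."""
    lo_c, hi_c = min(values), max(values)  # initial centers: the extremes
    low: List[float] = []
    high: List[float] = []
    for _ in range(iterations):
        low = [v for v in values if abs(v - lo_c) <= abs(v - hi_c)]
        high = [v for v in values if abs(v - lo_c) > abs(v - hi_c)]
        if not high:  # all values identical, nothing to split
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo_c, hi_c):  # converged
            break
        lo_c, hi_c = new_lo, new_hi
    return low, high


def percentile_nearest_rank(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile; integer arithmetic avoids float rounding."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # ceil(pct * n / 100)
    return ordered[rank - 1]
```

Under the assumption that the high-peak cluster corresponds to training traffic, `percentile_nearest_rank(high, 95)` would give the default training threshold, and the monthly recalculation would rerun the same two steps on fresh data.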

[0141] Computing Node Load Threshold Setting: The threshold (70%) used in step 3 to determine computing node overload is set based on performance inflection point tests of mainstream AI acceleration chips (such as NVIDIA V100, A100, and Huawei Ascend 910). Test results show that when GPU utilization consistently exceeds 70%, the average queuing latency and task processing latency jitter increase sharply, severely impacting the real-time performance of model inference. Therefore, 70% is used as the warning line to trigger computing power scheduling. This threshold is not fixed; system administrators can flexibly adjust it through the controller's northbound interface according to the specific business's SLA requirements and hardware model. For example, for core transaction risk control businesses that are extremely sensitive to latency, the computing node threshold can be lowered to 50%.
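The configurable overload threshold can be illustrated as follows. The service names, the override table, and the rule that the last three samples must all exceed the limit (one reading of "consistently exceeds") are assumptions made for the sketch.

```python
from typing import Dict, Sequence

DEFAULT_OVERLOAD_THRESHOLD = 0.70  # default warning line from the description

# Hypothetical per-service overrides set through the northbound interface.
SERVICE_THRESHOLDS: Dict[str, float] = {
    "transaction_risk_control": 0.50,  # latency-sensitive core service
}


def overload_threshold(service: str) -> float:
    """Return the configured threshold for a service, or the default."""
    return SERVICE_THRESHOLDS.get(service, DEFAULT_OVERLOAD_THRESHOLD)


def is_overloaded(gpu_samples: Sequence[float], service: str, sustained: int = 3) -> bool:
    """True when the last `sustained` GPU utilization samples all exceed the threshold."""
    if len(gpu_samples) < sustained:
        return False
    limit = overload_threshold(service)
    return all(u > limit for u in gpu_samples[-sustained:])
```

Requiring several consecutive samples avoids triggering computing power scheduling on a single short spike.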

[0142] Time parameter settings: The SRv6 path dynamic authentication cycle (1 second) in step 4 balances the timeliness of security protection against the performance overhead on the controller. A 1-second cycle is much shorter than the session timeout commonly used in financial transactions (usually over 30 seconds), which effectively prevents replay attacks. The cycle also aligns with the frequency at which the controller collects network status (once every 100 ms) and computing power status (once every 100 ms), so path status can be verified in real time without placing excessive pressure on the control plane. The fault switching time target (≤50 ms) complies with the requirements for Ethernet linear protection switching in the ITU-T G.8031 standard, ensuring that upper-layer AI applications (such as real-time risk control) are unaware of underlying network faults. Achieving this target relies on SRv6's fast fault detection mechanism (such as BFD) and the smooth "build first, then disconnect" switching process.

[0143] Example 2

[0144] This embodiment provides a financial-specific network optimization system for large AI models, used to implement the optimization method described in Example 1. As shown in Figure 6, the system is deployed in the "three-site, four-center" backbone network environment of a financial institution. The network adopts a three-layer structure: core layer, aggregation layer, and access layer. Each layer uses a dual-plane architecture, with the production backbone network using a dual-plane design and the non-production backbone network using a single-plane deployment. The network equipment supports the SRv6 protocol, and the control layer has an SDN controller cluster (primary site, backup site, and arbitration server) deployed in a geographically distributed disaster recovery architecture to achieve high availability of the control layer.

[0145] The system includes the following modules:

[0146] The AI business perception module, deployed on the financial business side, is used to collect in real time the business requirement parameters and traffic data of various financial AI large models, as well as the real-time status information of each node in the SRv6 network layer. The specific parameters, collection frequency, and judgment logic are the same as in step 1 of Example 1. The collected data is uniformly sent to the SRv6 control module.

[0147] The SRv6 control module, serving as the system's centralized management and control unit, is deployed within the SDN controller cluster. It manages and distributes SRv6 policies across the entire network, enabling AI service type identification, differentiated path calculation, collaborative computing power scheduling, and policy distribution. This module specifically includes:

[0148] The business type identification unit is used to classify and determine the AI business type and traffic characteristics based on the data reported by the AI business perception module, dividing them into AI training business, AI inference business, and sensitive data transmission business. The judgment logic and feature thresholds are the same as step 2 in Example 1.

[0149] The differentiated path calculation unit utilizes the SID programming capabilities of SRv6, combined with the "three-site, four-center" topology of the financial dedicated network, to calculate and generate exclusive SRv6 path strategies for different types of AI services. The specific processing methods for training traffic, inference traffic, and sensitive data traffic are the same as step 2 in Example 1.

[0150] The computing power collaborative scheduling unit is used to receive the real-time load status of computing power nodes reported by the computing power resource monitoring module and establish a linkage mechanism between computing power load and network scheduling. When the load of a computing power node is detected to exceed a preset threshold, dynamic path adjustment is performed, including candidate node screening, joint path calculation, comprehensive scoring and optimization, and task priority coordination. The specific process is the same as step 3 in Example 1.
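Candidate screening and target selection can be sketched under the constraint stated in claim 1 (minimize end-to-end latency without causing congestion along the way). The `Candidate` fields, the 80% congestion limit, and the tie-break on load are assumptions; the comprehensive scoring of step 3 in Example 1 is not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical view of one candidate computing node and its best path."""
    node: str
    load: float                 # current computing load, fraction in [0, 1]
    path_latency_ms: float      # end-to-end latency of the best path to this node
    max_link_util_after: float  # highest link utilization on that path after adding the flow


def pick_target(
    candidates: List[Candidate],
    load_limit: float = 0.70,        # overload threshold from the description
    congestion_limit: float = 0.80,  # assumed limit for "no congestion along the way"
) -> Optional[Candidate]:
    """Screen out overloaded or congesting candidates, then minimize latency."""
    feasible = [
        c for c in candidates
        if c.load < load_limit and c.max_link_util_after < congestion_limit
    ]
    if not feasible:
        return None
    # Lowest latency wins; the lighter load breaks ties (assumption).
    return min(feasible, key=lambda c: (c.path_latency_ms, c.load))
```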

[0151] The policy distribution unit is used to distribute the generated SID list to the SRv6 forwarding module via control protocols such as NETCONF / YANG. During traffic switching, a smooth switching mechanism of "build first, then disconnect" is adopted, and the specific switching process is the same as step 3 of Example 1.
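The "build first, then disconnect" switch can be sketched as a schedule of traffic splits that is only produced once the new path is installed and its connectivity is confirmed, following the wording of claim 1. The callables and the four-step migration are hypothetical; a real controller would push the SID list via its southbound protocol at the marked points.

```python
from typing import Callable, List, Tuple


def make_before_break(
    install_new: Callable[[], bool],  # e.g. push the new SID list; True on success
    probe_new: Callable[[], bool],    # e.g. connectivity check on the new path
    steps: int = 4,                   # number of migration steps (assumption)
) -> List[Tuple[int, int]]:
    """Return the (old %, new %) traffic split per step.

    If the new path cannot be installed or verified, all traffic
    stays on the old path and the schedule is [(100, 0)].
    """
    if not install_new() or not probe_new():
        return [(100, 0)]
    schedule = []
    for i in range(1, steps + 1):
        new_share = 100 * i // steps
        schedule.append((100 - new_share, new_share))
    return schedule  # the old path is torn down only after the last step
```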

[0152] The computing power resource monitoring module is deployed on AI computing power nodes in various data centers to collect and report the load status of all computing power nodes across the network in real time. This module specifically includes:

[0153] The load acquisition unit is used to collect the CPU utilization, GPU utilization, memory usage, and task execution progress of the computing nodes in real time. The acquisition frequency is the same as in step 3 of Example 1.

[0154] The status reporting unit is used to report the collected load data to the computing power collaborative scheduling unit of the SRv6 control module in real time according to a preset cycle.

[0155] In addition, the module supports zero-touch provisioning (ZTP). Newly launched AI computing power nodes can automatically connect to the SRv6 network; the SRv6 control module automatically identifies the new nodes and assigns them a unique SID and path, achieving "plug and play".

[0156] The SRv6 forwarding module, deployed at each forwarding node (PE / P router) of the financial backbone network, is used to perform forwarding and path authentication of AI service packets according to the policies issued by the SRv6 control module. This module specifically includes:

[0157] The packet encapsulation optimization unit encapsulates and optimizes SRv6 packets before forwarding, using G-SRv6 header compression technology to reduce packet header overhead. This unit executes field pruning algorithms and SID aggregation compression algorithms, applying different compression strategies to inference traffic and training traffic. The specific compression methods and compression ratios are the same as in step 4 of Example 1. At ingress, a packet is compressed according to the compression strategy and a compression flag is added; at egress, the reverse decompression process restores the standard SRv6 header.

[0158] The message forwarding unit is used to forward AI service messages hop-by-hop according to the received SID list.

[0159] The path authentication execution unit verifies the authentication code of received packets during forwarding. This unit receives the Authen-SID and SegmentList from the SRv6 control module. The head node generates an authentication code and embeds it into the packet; the intermediate and tail nodes recalculate the code and compare it with the embedded one. If authentication passes, forwarding continues; if authentication fails, the packet is discarded. The specific implementation of the authentication mechanism is the same as step 4 in Example 1.
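The generate-and-compare flow can be sketched as follows. Claim 1 states only that a hash algorithm is applied to the SegmentList and the Authen-SID; the choice of SHA-256, the byte serialization of the segment list, and the function names are assumptions.

```python
import hashlib
import hmac
from typing import Sequence


def auth_code(segment_list: Sequence[str], authen_sid: bytes) -> bytes:
    """Hash the SegmentList together with the current Authen-SID."""
    h = hashlib.sha256()  # hash choice is an assumption
    for sid in segment_list:
        h.update(sid.encode("ascii"))
        h.update(b"\x00")  # separator so adjacent SIDs cannot be confused
    h.update(authen_sid)
    return h.digest()


def verify(segment_list: Sequence[str], authen_sid: bytes, received_code: bytes) -> bool:
    """Recalculate the code at an intermediate or tail node and compare.

    True means the packet may be forwarded; False means it is discarded.
    """
    expected = auth_code(segment_list, authen_sid)
    return hmac.compare_digest(expected, received_code)  # constant-time comparison
```

Because the Authen-SID changes every cycle, a code captured in one cycle fails verification once the nodes hold the next Authen-SID.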

[0160] The security protection module is used to build a multi-layered, end-to-end security protection system for financial AI data. This module specifically includes:

[0161] The path dynamic authentication unit is used in conjunction with the SRv6 control module to generate a dynamically changing authentication segment identifier (Authen-SID) at a specified period. This Authen-SID, along with a segment list (SegmentList) carrying the path identifier and sequence number, is then sent to all SRv6 forwarding modules along the path for real-time packet authentication. The Authen-SID generation algorithm, data structure, and SegmentList structure are all the same as in step 4 of Example 1.
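The generation rule recited in claim 1, Authen-SID = SHA256(T + PID + H(PubKey)) mod 2^128, can be sketched as follows. Several points are assumptions: "+" is read as byte concatenation, T and PID are given fixed widths of 8 and 4 bytes, the public keys are sorted before hashing, and SHA-256 stands in for the SM3 hash named in the claim because Python's `hashlib` does not guarantee SM3 support.

```python
import hashlib
from typing import Sequence


def pubkey_digest(pubkeys: Sequence[bytes]) -> bytes:
    """H(PubKey) over all valid nodes on the path.

    The claim specifies SM3; SHA-256 is used here only as a stand-in.
    """
    h = hashlib.sha256()
    for key in sorted(pubkeys):  # sorting makes the digest order-independent (assumption)
        h.update(key)
    return h.digest()


def authen_sid(timestamp_ms: int, path_id: int, pubkeys: Sequence[bytes]) -> int:
    """Compute SHA256(T + PID + H(PubKey)) mod 2^128 as a 128-bit integer."""
    material = (
        timestamp_ms.to_bytes(8, "big")  # T: millisecond-level timestamp
        + path_id.to_bytes(4, "big")     # PID: unique path ID (width is an assumption)
        + pubkey_digest(pubkeys)
    )
    digest = hashlib.sha256(material).digest()
    return int.from_bytes(digest, "big") % (1 << 128)
```

The 128-bit result has the same width as an IPv6 address, so it can occupy a SID-sized field in the SegmentList.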

[0162] The quantum encryption unit is used to establish an end-to-end quantum encryption channel for sensitive data traffic. It adopts the "SRv6 + quantum key + national cryptographic algorithm" technology and deploys quantum-secure U devices in headquarters and branches to ensure the confidentiality and integrity of data transmission.

[0163] The data desensitization unit is used to desensitize sensitive fields before AI data enters the SRv6 network. Combined with federated learning technology, it enables sensitive data to participate in joint modeling only as model parameter updates without leaving the local network, thus achieving effective protection of the original data.

[0164] The security linkage unit is used to link with the SRv6 control module in real time. When abnormal traffic is detected, it immediately triggers dynamic switching of the SRv6 path, cuts off the abnormal connection, and sends alarm information to the operation and maintenance monitoring module.

[0165] The operations and maintenance monitoring module is used for end-to-end visual monitoring, fault location, and data feedback of AI service traffic. This module specifically includes:

[0166] The flow detection unit uses iFIT flow detection technology to embed a flow measurement identifier when AI service data packets enter the network. SRv6 forwarding modules along the route collect and report latency, jitter, and packet loss information for each hop in real time based on this identifier. The collection frequency is the same as in step 5 of Example 1.

[0167] The visualization unit is used to reconstruct the complete transmission path and segmented performance indicators of each AI business flow based on the reported data, and to display the transmission status, path status and computing node load of AI traffic in real time in the form of charts.
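Reconstructing the path and the segmented indicators from per-hop reports can be sketched as follows. The `HopReport` fields are hypothetical, and ordering hops by receive time assumes the nodes' clocks are synchronized; actual iFIT report formats are not described here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class HopReport:
    """Hypothetical per-hop report keyed by the flow measurement identifier."""
    flow_id: int
    node: str
    rx_time_us: int  # time the node observed the marked packets
    rx_packets: int  # number of marked packets counted at this node


Segment = Tuple[str, str, int, int]  # (from node, to node, latency in us, packets lost)


def reconstruct(reports: List[HopReport]) -> Dict[int, List[Segment]]:
    """Rebuild each flow's hop order and per-segment latency and loss."""
    by_flow: Dict[int, List[HopReport]] = {}
    for r in reports:
        by_flow.setdefault(r.flow_id, []).append(r)
    result: Dict[int, List[Segment]] = {}
    for flow_id, hops in by_flow.items():
        hops.sort(key=lambda r: r.rx_time_us)  # assumes synchronized clocks
        result[flow_id] = [
            (a.node, b.node, b.rx_time_us - a.rx_time_us, a.rx_packets - b.rx_packets)
            for a, b in zip(hops, hops[1:])
        ]
    return result
```

The per-segment tuples are what a chart layer would plot, and a segment with unusually high latency or loss points at the hop to inspect first.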

[0168] The fault location unit is used to quickly locate the faulty node and cause using the SID location capability of SRv6 when a network anomaly is detected, and to feed back the fault information to the SRv6 control module to trigger path recalculation or adjustment. The fault switching time and the mechanism for switching back after fault repair are the same as in step 5 of Example 1.

[0169] The proactive early warning module, deployed on the operations and maintenance monitoring side, is used to predict future network status changes based on historical monitoring data, enabling proactive early warning and avoidance of network faults. This module specifically includes:

[0170] The predictive analysis unit is used to perform time-series prediction of the future bandwidth utilization of the core link using an LSTM model. The input features, prediction target, and training and update mechanism of the model are the same as those in step 5 of Example 1.

[0171] The pre-tuning trigger unit sends a pre-tuning trigger command to the SRv6 control module when the prediction result exceeds the warning threshold. Upon receiving the command, the SRv6 control module pre-calculates and caches alternative paths for the affected service flows, and immediately activates the switchover when the actual utilization exceeds the tuning threshold. The warning threshold and tuning threshold are the same as in step 5 of Example 1.

[0172] In this embodiment, the modules work together to form a complete closed loop from business awareness to path calculation, and from security protection to operation and maintenance monitoring. The basis and configuration methods for the key thresholds and parameters involved in the system (including business characteristic thresholds, computing power node load thresholds, authentication cycles, and fault switching times) are the same as those described in Example 1 and are not repeated here.

[0173] It should be noted that the specific embodiments described in this application are merely illustrative examples of the technical solutions of this invention, and not limitations on the scope of protection. The core of this invention lies in achieving deep integration of AI business feature recognition, computing load monitoring, dynamic path authentication, and flow-following feedback through an integrated design of "business perception—network collaboration—path authentication—flow feedback." Any modifications, equivalent substitutions, or improvements made based on the technical solutions described in this application, or any application of the aforementioned core technical features to other scenarios (including but not limited to AI large-scale model network optimization in other industries), within the spirit and principles of this invention, as long as their substance is the same as the technical solutions described in this application, fall within the protection scope of this invention.

Claims

1. A financial network optimization method for large AI models, characterized in that it includes the following steps:
Collecting business requirement parameters and traffic data of the financial AI large model, and real-time status information of the SRv6 network layer; wherein the business requirement parameters include AI business type, data sensitivity, and latency requirements; the traffic data includes the burst peaks and transmission duration of training traffic, and the packet size and interaction frequency of inference traffic; and the network status information includes link bandwidth utilization, latency, jitter, and fault status.
Based on the collected data, using the segment identifier (SID) programming capability of SRv6 to generate dedicated SRv6 path policies for different types of AI services and to generate the corresponding SID lists; specifically, for AI training traffic, a first SID list corresponding to a multi-path load balancing policy is generated; for AI inference traffic, a second SID list corresponding to the shortest path policy between the source node and the destination node is generated; and for sensitive data traffic, a third SID list corresponding to a path bound to a security policy is generated.
Collecting the load status of AI computing nodes and linking it with the SRv6 path policy; when the load of a computing node is detected to exceed a preset threshold, the optimal network path is recalculated under the constraints of minimizing end-to-end latency and not causing congestion along the way, and AI traffic is scheduled from the overloaded node to the target computing node through SRv6 SID programming.
The controller generates a dynamically changing Authen-SID for each path at a specified period and distributes it to all SRv6 nodes on that path. The head node encapsulates the message based on the determined SID list and uses the authentication information to generate a first authentication code that is embedded in the message.
After receiving the message, the intermediate and tail nodes recalculate a second authentication code and compare it with the first authentication code. If they match, the message is forwarded according to the SID list; otherwise, the message is discarded.
When AI service data packets enter the network, flow measurement identifiers are embedded using flow detection technology. SRv6 nodes along the route collect and report performance information for each hop based on the identifiers. The operation and maintenance monitoring layer reconstructs the complete transmission path and segment-by-segment performance indicators of each AI service flow based on the reported data. When a network anomaly is detected, the faulty node is located using the SID location capability of SRv6, and the fault information is fed back to the path calculation or adjustment steps to trigger recalculation or adjustment of the path.
In the step of generating dedicated SRv6 path policies for different types of AI services: for AI training traffic, redundant bandwidth is reserved for the training traffic while the first SID list is generated; for AI inference traffic, the highest forwarding priority is configured for the inference traffic while the second SID list is generated; for sensitive data traffic, when the third SID list is generated, quantum encryption and data desensitization security capabilities are deeply coupled with the path, and access is restricted to only those nodes that have passed security authentication.
In the process of scheduling AI traffic from overloaded nodes to target computing power nodes, a smooth switching mechanism of "build first, then disconnect" is adopted. Specifically, after the new path is successfully established and its connectivity is confirmed to be normal, traffic is gradually migrated from the old path to the new path.
In the step where the controller generates a dynamically changing authentication segment identifier (Authen-SID) for each path at a specified period, the Authen-SID is generated as follows: Authen-SID = SHA256(T + PID + H(PubKey)) mod 2^128, where T is a millisecond-level timestamp, PID is a unique path ID within the financial private network, and H(PubKey) is the SM3 hash value of the public keys of all valid SRv6 nodes on the path. The Authen-SID is embedded into the SegmentList, which carries the path identifier and sequence number, to form a complete authenticated segment list, which is then sent to all SRv6 nodes on the path.
The head node uses the authentication information to generate the first authentication code embedded in the message, specifically by applying a hash algorithm to the SegmentList and the Authen-SID to calculate the first SRH authentication code embedded in the message.
Before SRv6 packets are forwarded, a packet encapsulation optimization step is also included: G-SRv6 header compression technology is used to remove non-core fields in the IPv6 base header and the SRH, and common prefix extraction and aggregation encoding are performed on the SIDs of the segment list; for AI inference traffic, a simplified SRH structure is used to minimize single-packet encapsulation latency; for AI training traffic, batch SID offset encoding is used to improve the encapsulation and forwarding efficiency of large data blocks.
Following the end-to-end visualized operation and maintenance steps, there are also proactive early warning and pre-tuning steps: The operation and maintenance monitoring layer uses a long short-term memory recurrent neural network model based on historical monitoring data to make time-series predictions of the future bandwidth utilization of core links; when the prediction results show that a critical path will exceed the early warning threshold in the future, the controller automatically triggers the pre-tuning process to pre-calculate and cache alternative paths that meet the service level agreement requirements for the affected service flows; when the actual utilization exceeds the tuning threshold, the cached alternative path strategy is immediately activated.

2. A financial network optimization system for large AI models, characterized in that the system is used to implement the method of claim 1 and comprises:
An AI business perception module, deployed on the financial business side, used to collect business requirement parameters and traffic data of the financial AI large model and send the collected data to the SRv6 control module;
An SRv6 control module, serving as the centralized management and control unit of the system, used to manage and distribute SRv6 policies across the entire network, realizing AI service type identification, differentiated path calculation, computing power collaborative scheduling, and policy distribution; the SRv6 control module includes a service type identification unit, a differentiated path calculation unit, a computing power collaborative scheduling unit, and a policy distribution unit;
A computing power resource monitoring module, deployed on the AI computing power nodes of each data center, used to collect and report the load status of the computing power nodes across the network, including a load collection unit and a status reporting unit;
An SRv6 forwarding module, deployed at each forwarding node of the financial backbone network, used to perform forwarding and path authentication of AI business messages according to the policies distributed by the SRv6 control module, including a message forwarding unit and a path authentication execution unit;
A security protection module, used to build a multi-layered end-to-end security protection system for financial AI data, including a path dynamic authentication unit;
An operation and maintenance monitoring module, used for end-to-end visual monitoring, fault location, and data feedback of AI business traffic, including a flow detection unit, a visualization unit, and a fault location unit.

3. The financial network optimization system for large AI models according to claim 2, characterized in that the SRv6 forwarding module also includes a message encapsulation optimization unit, which is used to encapsulate and optimize SRv6 messages before forwarding. It uses G-SRv6 header compression technology to reduce message header overhead, adopts a simplified SRH structure to reduce encapsulation latency, and uses batch SID offset encoding to improve encapsulation efficiency.

4. The financial network optimization system for large AI models according to claim 2, characterized in that the security protection module also includes a quantum encryption unit and a data desensitization unit; the quantum encryption unit is used to establish an end-to-end quantum encryption channel for sensitive data traffic, and uses "SRv6 + quantum key + national cryptographic algorithm" technology to ensure the confidentiality and integrity of data transmission; the data desensitization unit is used to desensitize sensitive fields before AI data enters the SRv6 network and, combined with federated learning technology, enables sensitive data to participate in joint modeling in the form of model parameter updates without leaving the local network.

5. The financial network optimization system for large AI models according to claim 2, characterized in that the system also includes a proactive early warning module, deployed on the operation and maintenance monitoring side, used to predict future network status changes based on historical monitoring data; the proactive early warning module includes a predictive analysis unit and a pre-tuning trigger unit. The predictive analysis unit is used to perform time-series prediction of the future bandwidth utilization of the core link using a long short-term memory recurrent neural network model. The pre-tuning trigger unit is used to send a pre-tuning trigger instruction to the SRv6 control module when the prediction result exceeds the early warning threshold, so that alternative paths are pre-calculated and cached, and to activate the switch when actual congestion occurs.