Network Performance Management Engine

By deploying packet interceptors and custom filters, the network management engine addresses the lack of real-time visibility in complex networks, enhancing troubleshooting and optimization by providing detailed metrics on packet latency and routing, thus improving network reliability.

JP2026109601A · Pending · Publication Date: 2026-07-01 · EBAY INC

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
EBAY INC
Filing Date
2025-12-18
Publication Date
2026-07-01

AI Technical Summary

Technical Problem

Conventional network management systems lack detailed real-time visibility into packet-level latency and traffic behavior in complex and dynamic environments, making it difficult to accurately detect subtle anomalies, correlate them with specific network paths, and identify root causes in large-scale evolving network infrastructures.

Method used

Deploying packet interceptors (e.g., extended Berkeley Packet Filter (eBPF) programs) and custom Envoy filters to measure and record packet processing time at various points within the network, providing detailed metrics on request routing and latency, and using a network management engine with a layered approach to monitor and optimize network performance.

Benefits of technology

Enhances visibility into network performance, enabling efficient troubleshooting and optimization by pinpointing latency issues at specific nodes, improving network reliability and responsiveness.



Abstract

This invention provides a method, system, and computer storage medium for providing a network management engine in a cloud computing system. [Solution] A method for managing a network in a network management system via a network performance management engine, in which a client communicates a packet associated with an application gateway and a network packet management extension engine (502A), receives a response packet associated with the packet based on communicating the packet (504A), extracts packet latency data associated with the response packet (506A), and transmits the packet latency data (508A).

Description

Technical Field

[0001] This disclosure relates to a network performance management engine.

Background Art

[0002] Users can interact with a cloud mesh network through different types of applications and services to accomplish network tasks. A cloud mesh network refers to a distributed interconnection system of multiple cloud environments that cooperate to provide scalable, flexible, and elastic network services across diverse geographical locations. The cloud mesh network enables seamless communication and data sharing between various cloud platforms, allowing users to optimize resource allocation, improve data transfer speeds, and enhance fault tolerance. By utilizing a mesh architecture, cloud resources can be dynamically interconnected to ensure high availability and redundancy. This type of network can support applications and services in various domains such as load balancing, disaster recovery, and global content delivery. Through advanced routing protocols and automated resource management, the cloud mesh network improves the overall performance and reliability of cloud-based systems, making it well suited to enterprises and service providers with complex, distributed infrastructure needs.

Summary of the Invention

[0003] The various aspects of the technology described herein generally concern systems, methods, and computer storage media for providing a network management engine in cloud computing systems. A network management engine is an end-to-end system that oversees, monitors, and optimizes the entire lifecycle of network traffic, from the initial client request to the final data response. It ensures seamless data flow by capturing, analyzing, and managing network packets, and identifies and resolves latency or congestion issues by tracking key performance metrics. The network management engine leverages a layered approach to network management, monitoring, and fault detection to ensure efficient data flow, metric capture, and performance analysis.

[0004] The network management engine includes a network packet management extension engine and a network performance management engine. The network packet management extension engine is a dedicated engine designed to calculate and analyze packet latency at various stages of the network path. The network packet management extension engine improves network packet processing by capturing detailed information at critical points in the flow. For example, during client request processing and ingress at a Transport Layer Balancer (TLB), and during data inspection and logging at an ingress gateway (GW), the network packet management extension engine supports connectivity tracking by enabling measurement of packet transit time, capturing metadata, and logging latency within the network. This network packet management extension engine enables custom processing and data logging that provides deep insights into latency dynamics across the network infrastructure.

[0005] The network performance management engine is a dedicated engine that identifies and manages deviations from expected network behavior, particularly in areas where latency can occur. By monitoring network traffic, analyzing latency patterns, and visualizing data, the network performance management engine detects performance problems and supports automated remediation. Key features include client-side latency calculation and graph generation that visualizes latency data via path analysis graphs, and automated analysis and remediation that leverages tools to correlate latency with system resources and reroute traffic when necessary. The network performance management engine provides a holistic view of network performance through tools such as fault analyzers, supporting proactive problem solving.
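By way of a non-limiting illustration, the path analysis graph mentioned above can be sketched as a mapping from each observed edge (a pair of consecutive nodes) to the number of requests that used it and the mean latency recorded at the receiving node. The trace layout, the node names, and the function name `build_path_graph` are assumptions made for this sketch and do not come from the disclosure.

```python
from collections import defaultdict


def build_path_graph(traces):
    """Aggregate request traces into per-edge latency statistics.

    Each trace is an ordered list of (node, latency_ms) tuples, where
    latency_ms is the latency recorded at that node (hypothetical layout).
    Returns {(src, dst): (request_count, mean_latency_ms_at_dst)}.
    """
    totals = defaultdict(lambda: [0, 0.0])  # edge -> [count, summed latency]
    for trace in traces:
        # Pair every node with its successor to form the edges of the path.
        for (src, _), (dst, latency_ms) in zip(trace, trace[1:]):
            entry = totals[(src, dst)]
            entry[0] += 1
            entry[1] += latency_ms
    return {edge: (count, total / count) for edge, (count, total) in totals.items()}


traces = [
    [("client", 0.0), ("tlb-1", 2.0), ("ingress-gw-1", 4.0)],
    [("client", 0.0), ("tlb-1", 4.0), ("ingress-gw-2", 10.0)],
]
graph = build_path_graph(traces)
for edge, (count, mean_ms) in graph.items():
    print(edge, count, mean_ms)
```

An edge whose mean latency is far above that of its siblings (here, the edge into `ingress-gw-2`) is a candidate for closer inspection or rerouting.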

[0006] Operationally, a client initiates a request to the application gateway to receive a response containing network latency data captured via the response header. The application gateway (App GW) receives and validates the incoming request, which is then forwarded to the transport layer balancer (TLB) based on established routing rules. The TLB distributes client requests across available backend resources, thus distributing the load. Packet interceptors (e.g., extended Berkeley Packet Filter (eBPF) programs) are employed to monitor packet flow, adding timestamps and capturing ingress and egress times. A first packet interceptor adds an ingress timestamp to packets as they enter the TLB, while a second packet interceptor measures the egress time of packets leaving the TLB and calculates the total processing time within the TLB.
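As a minimal sketch of the calculation described above, the processing time within the TLB can be derived as the difference between the egress and ingress timestamps recorded by the two packet interceptors. The sketch assumes both timestamps are in nanoseconds and come from the same clock; the class name `HopTiming` and the node label `tlb-1` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HopTiming:
    """Timestamps recorded for one packet at one node (hypothetical record)."""

    node: str        # illustrative node label, e.g. "tlb-1"
    ingress_ns: int  # set by the first interceptor when the packet enters
    egress_ns: int   # set by the second interceptor when the packet leaves

    def processing_ns(self) -> int:
        """Return the time the packet spent inside the node."""
        if self.egress_ns < self.ingress_ns:
            # Negative durations indicate clock or ordering problems.
            raise ValueError("egress timestamp precedes ingress timestamp")
        return self.egress_ns - self.ingress_ns


timing = HopTiming("tlb-1", ingress_ns=1_000_000, egress_ns=1_250_000)
print(timing.processing_ns())  # 250000
```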

[0007] Packets are encapsulated by a tunnel for secure transport between the TLB and the ingress gateway (GW). The ingress GW routes traffic from the TLB to the appropriate backend service, works with a third packet interceptor to log packet transit time, and captures relevant metadata for latency tracking. The third packet interceptor records the ingress timestamp of packets arriving at the ingress GW, monitors the egress time of packets leaving, and enables detailed latency measurement.

[0008] The Application Gateway Envoy (App GW Envoy) passes packets to backend servers, adding tracking headers with latency and node metadata for network visibility and end-to-end latency tracking before forwarding the annotated packets. The server processes client requests received via Server Envoy by performing backend operations and routes the response back through Server Envoy for header tracking. The server application processes client requests based on business logic, generates responses, and sends them back through Server Envoy. App GW Envoy then adds the final tracking header to the outgoing packet, completing the latency monitoring chain, and passes the packet to the client.
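The header-based tracking chain described above can be illustrated with a short sketch in which each node appends its own latency record to a single header and the client later parses the accumulated records. The header name `x-hop-latency` and the `node;ns=value` entry format are assumptions chosen for this sketch; the disclosure does not specify a header name or encoding, and this is not Envoy filter code.

```python
TRACE_HEADER = "x-hop-latency"  # hypothetical header name


def append_hop(headers, node, latency_ns):
    """Return a copy of headers with one more hop record appended."""
    entry = f"{node};ns={latency_ns}"
    updated = dict(headers)
    existing = updated.get(TRACE_HEADER)
    updated[TRACE_HEADER] = f"{existing},{entry}" if existing else entry
    return updated


def parse_hops(headers):
    """Parse the tracking header into an ordered list of (node, latency_ns)."""
    hops = []
    for item in headers.get(TRACE_HEADER, "").split(","):
        if not item:
            continue  # header absent or empty
        node, _, value = item.partition(";ns=")
        hops.append((node, int(value)))
    return hops


headers = append_hop({}, "tlb-1", 250_000)
headers = append_hop(headers, "ingress-gw-2", 400_000)
print(headers[TRACE_HEADER])  # tlb-1;ns=250000,ingress-gw-2;ns=400000
print(parse_hops(headers))    # [('tlb-1', 250000), ('ingress-gw-2', 400000)]
```

Appending to one header keeps the hop order explicit, so the client can reconstruct the path as well as the per-node latency.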

[0009] The client extracts and calculates network latency data. The client can update logs (e.g., log database) using routing analysis graphs to support alert generation. The logs store routing analysis graphs generated by the client. The Time Series Database (TSDB) stores traffic metrics (e.g., latency and performance metrics over time) from the TLB and ingress GW to support analysis (e.g., through integration with routing analysis data) and enable efficient data retrieval for fault analysis and trend visualization. The fault analyzer analyzes log and time series database data to identify network anomalies and generates fault alerts when deviations in network performance or latency are detected. These operations, centered on packet flow capture, monitoring, and analysis, are performed via the network management engine, providing end-to-end network performance management and fault detection.
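One simple way a fault analyzer could flag deviations in time-series latency data is to compare each sample with the mean and standard deviation of a trailing window. The sketch below assumes this rule; the disclosure does not state which detection method is used, and the function name, window size, and threshold are illustrative only.

```python
from statistics import mean, pstdev


def find_latency_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate strongly from recent history.

    A sample is flagged when it lies more than `threshold` standard
    deviations from the mean of the preceding `window` samples.
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change as a deviation.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


latency_ms = [10, 11, 10, 12, 11, 10, 95, 11]
print(find_latency_anomalies(latency_ms))  # [6]
```

The spike of 95 ms at index 6 is flagged; the sample that follows is not, because the spike itself inflates the trailing window's standard deviation.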

[0010] This “Summary of the Invention” is provided to introduce the concept in a simplified form, which is further explained in the “Modes for Carrying Out the Invention” below. This “Summary of the Invention” is not intended to identify the main or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

Brief Description of the Drawings

[0011] The technology described herein is described in detail below with reference to the accompanying drawings.
[Figure 1] This is a block diagram of a cloud mesh network architecture according to an aspect of the technology described herein.
[Figure 2A] This is a block diagram of a network management system for providing network management according to an aspect of the technology described herein.
[Figure 2B] This is a block diagram of a network management system for providing network management according to an aspect of the technology described herein.
[Figure 2C] This is a block diagram of a network management system for providing network management according to an aspect of the technology described herein.
[Figure 2D] This is a block diagram of a network management system for providing network management according to an aspect of the technology described herein.
[Figure 3] This is a block diagram of a network management system for providing network management according to an aspect of the technology described herein.
[Figure 4A] This specification provides a first set of exemplary methods for providing network management in a network management system via a network packet management extension engine, according to aspects of the technology described herein.
[Figure 4B] This specification provides a first set of exemplary methods for providing network management in a network management system via a network packet management extension engine, according to aspects of the technology described herein.
[Figure 4C] This specification provides a first set of exemplary methods for providing network management in a network management system via a network packet management extension engine, according to aspects of the technology described herein.
[Figure 5A] This specification provides a second set of exemplary methods for providing network management in a network management system via a network performance management engine, according to aspects of the technology described herein.
[Figure 5B] This specification provides a second set of exemplary methods for providing network management in a network management system via a network performance management engine, according to aspects of the technology described herein.
[Figure 5C] This specification provides a second set of exemplary methods for providing network management in a network management system via a network performance management engine, according to aspects of the technology described herein.
[Figure 6] This specification provides a block diagram of an exemplary artificial intelligence system computing environment suitable for use in implementing aspects of the technology described herein.
[Figure 7] This specification provides a block diagram of an exemplary distributed computing environment suitable for use in implementing aspects of the technology described herein.
[Figure 8] This is a block diagram of an exemplary computing environment suitable for use in implementing aspects of the technology described herein.

Modes for Carrying Out the Invention

Overview

[0012] Cloud computing systems provide a distributed network of remote servers hosted over the internet for storing, managing, and processing data, rather than relying on local servers or personal computers. In cloud computing systems, resources such as storage, processing power, and applications are delivered to users as a service over the internet. These services can scale on demand and are often designed to support multi-tenant architectures, where multiple clients securely share the same infrastructure. Cloud computing systems act as a backbone for providing seamless connectivity and distributed data processing, enabling client requests to be managed and processed across a network of interconnected resources.

[0013] A cloud mesh network provides a networking framework that enables seamless interconnection and communication between multiple cloud computing resources, often distributed across different physical and virtual environments. In a cloud mesh network, various nodes and gateways (such as application gateways, transport layer balancers, ingress gateways, and other network components) work together to create an integrated network fabric that routes and balances traffic between distributed services. A cloud mesh network can be a tessellated mesh network (i.e., a Tess cloud mesh), which can be used in geographically distributed sensor networks where data points (or "nodes") form a repeating spatial pattern across specific areas. By establishing secure and efficient routing paths, a cloud mesh network ensures that data flows between nodes with minimal latency, supporting optimal performance and reliability.

[0014] In this way, the cloud computing system provides the underlying infrastructure to support distributed applications and storage, and the cloud mesh network manages the real-time flow of data between these distributed resources. Through this network, data is directed across various processing layers (from client requests to backend services via load balancers and application gateways) with precise control over packet processing, timestamping, and latency tracking. Together, the cloud computing system and the cloud mesh network ensure that each request is processed efficiently, end-to-end latency is minimized, and anomalies in network behavior can be rapidly detected and addressed through real-time monitoring and fault analysis.

[0015] Conventionally, network management systems have been limited in providing detailed real-time visibility into packet-level latency and traffic behavior, especially in complex and dynamic environments. This limitation makes it difficult for the system to accurately detect subtle anomalies, correlate them with specific network paths, and identify root causes in large-scale evolving network infrastructures. For example, conventional network packet management systems often have limited ability to accurately calculate and quantify packet latency, especially in complex and dynamic network topologies. In such networks, packets may move along different paths that experience various levels of congestion, routing changes, or Quality of Service (QoS) configurations, all of which may result in significant delays. For example, if a packet experiences congestion on one route, its latency may increase, while a packet taking an alternative route may encounter minimal delay.

[0016] These factors create a moving target for network management tools and make it difficult to provide precise visibility into the latency of every packet in real time. These tools can provide valuable insights into a wide range of network performance trends, such as identifying common congestion points or large-scale latency problems, but are often insufficient to provide the detailed packet-level visibility required for precise troubleshooting. As a result, achieving accurate real-time monitoring of packet latency in complex and changing environments remains an important challenge. For example, large enterprise networks can have multiple redundant links between data centers. A network packet management system can report overall latency trends, but cannot accurately indicate whether a particular packet has taken a longer route due to a temporary routing change or congestion on one of the links. This lack of granularity makes it difficult for network administrators to understand the root cause of latency problems on a per-packet basis.

[0017] Similarly, conventional network performance management systems have limitations in detecting anomalies and correlating them with specific node-level traffic metrics. Modern networks are inherently dynamic, with constantly changing traffic patterns, adaptive routing protocols, and fluctuating QoS settings, all of which add layers of complexity. These dynamics make it difficult to maintain precise visibility into the behavior of network traffic at the packet level. For example, if a particular link begins to experience higher latency than expected, the system may take some time to detect the anomaly, especially if it relies on sampled data or aggregated metrics that do not capture the fine detail needed to identify subtle problems.

[0018] Furthermore, conventional performance management systems that only monitor aggregated data from routers or switches may miss important nuances such as intermittent spikes in latency caused by overloaded network nodes or undetected network path failures. Detecting such anomalies becomes even more difficult when visibility into specific network segments or paths is limited. In essence, without more advanced monitoring tools that can provide deep packet-level insights and real-time correlation, identifying and addressing performance issues in complex modern network infrastructures is a significant operational challenge. For example, a sudden drop in application performance due to latency may be wrongly attributed to a wide area network (WAN) bottleneck, but in reality, it may be the result of a temporary misconfiguration in the routing table of a specific network node. Without precise per-packet visibility, such anomalies may not be detected, leading to delays in troubleshooting and resolution. Therefore, a more comprehensive network management system with an alternative basis for performing network management operations can improve the computing operations and interfaces for providing network management.

Description of the Technical Solution

[0019] At a high level, the network management engine within the cloud computing system oversees the entire lifecycle of network traffic, from client request to data response, by capturing, analyzing, and optimizing network packets to ensure seamless data flow. The network management engine includes two main components: a network packet management extension engine that calculates and analyzes packet latency at key stages (such as during client request processing and data inspection), and a network performance management engine that monitors traffic, detects latency deviations, visualizes data, addresses performance issues, and supports automated remediation to optimize network behavior.

[0020] Cloud computing networks can be based on distributed cloud architectures (e.g., cloud mesh networks or Tess cloud mesh) that connect various services and resources in a mesh configuration, improving flexibility and scalability. In the context of cloud computing networks, this enables seamless communication and resource sharing between different components, allowing for dynamic load balancing, improved fault tolerance, and optimization of resource utilization across multiple cloud environments.

[0021] As an example, a distributed cloud architecture for a cloud computing network might consist of several Tess App Gateway (App GW) groups. Each App GW instance consists of an Ingress GW and a TLB (Transport Layer Balancer). The Ingress GW functions as an L7 load balancer, and the TLB functions as an L4 load balancer. An L4 load balancer operates at the transport layer, directing traffic based on IP addresses and ports, making it efficient for high-throughput applications. In contrast, an L7 load balancer operates at the application layer, enabling more sophisticated routing decisions based on the content of the request, such as HTTP headers and URLs.
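The difference between the two load-balancing layers can be shown with a brief sketch: the L4 selection sees only addresses and ports, while the L7 selection inspects the request path. Hashing the connection tuple and longest-prefix path matching are common techniques assumed here for illustration; the disclosure does not state how the TLB or the Ingress GW selects a target, and all names below are hypothetical.

```python
import hashlib


def l4_pick(backends, src_ip, src_port, dst_ip, dst_port):
    """L4-style choice: hash the connection tuple, ignore request content."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # The same connection tuple always maps to the same backend.
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def l7_pick(routes, path, default):
    """L7-style choice: route on the URL path, longest matching prefix wins."""
    best = max((p for p in routes if path.startswith(p)), key=len, default=None)
    return routes[best] if best is not None else default


ingress_gws = ["ingress-gw-1", "ingress-gw-2", "ingress-gw-3"]
print(l4_pick(ingress_gws, "203.0.113.7", 51514, "198.51.100.10", 443))

routes = {"/api": "api-svc", "/api/v2": "api-v2-svc"}
print(l7_pick(routes, "/api/v2/items", "web-svc"))  # api-v2-svc
print(l7_pick(routes, "/index.html", "web-svc"))    # web-svc
```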

[0022] An App GW instance can be equivalent to a single hardware load balancer. Each App GW instance consists of M Ingress GW instances and N TLB nodes. Multiple VIPs can be hosted by an App GW instance. Requests targeting any VIP may come in through one of the TLB nodes and then be forwarded through an IP-in-IP tunnel to one of the Ingress GW nodes before being routed to the service endpoint. Responses are sent back directly from the Ingress GW node to the client. Incoming TLS connections to the VIP terminate at one of the Ingress GW nodes. New connections to service backend pods are established from the Ingress GW nodes, and these are persistent connections. Connections terminate on the Envoy proxy instances within the pod, which are also mesh components.

[0023] Figure 1 shows a cloud computing system 100 with a cloud mesh network 110 architecture (e.g., Tess App GW architecture) designed to efficiently manage incoming requests while providing secure and reliable access to backend services. At the heart of this architecture are modular components that work together to enable seamless communication from clients to service endpoints. At the forefront of this architecture is a client 120, which represents a user or system initiating a service request. Client 120 works with a Border Gateway Protocol component (e.g., BGP 130) used to route traffic between different networks. Client 120 is the source of requests aimed at accessing services hosted behind the App GW 140. The client targets a specific virtual IP (VIP) hosted by the App GW 140, which marks the entry point to the cloud mesh network 110. When a request is made, it first encounters the App GW 140, which acts as a central hub for managing these incoming requests, distributing them to the appropriate services and providing visibility and routing.

[0024] Each App GW instance includes several Ingress GW 146 instances and Transport Layer Balancer (TLB) 142 nodes. The Ingress GW 146 is the gateway node at which incoming TLS connections terminate and from which requests are routed to backend service pods. The TLB nodes distribute incoming requests across available Ingress GWs, balancing load and preventing any single node from being overwhelmed. Requests targeting the VIP enter through one of these TLB nodes, which then forwards the request to the Ingress GW node via an IP-in-IP tunnel (e.g., tunnel 144). This tunneling mechanism provides a secure and efficient means of transporting data between components.

[0025] When requests reach the Ingress GW node, they encounter a TLS termination process where the incoming secure connection is decrypted. Here, a persistent connection to the backend service pod is established; specifically, App GW Envoy 148 connects to server envoy 152. Server envoy 152 may be an Envoy proxy instance running within the application pod, managing communication between Ingress GW 146 and the backend services represented by server 150 and server application 154.

[0026] The server application 154 is where the core business logic resides. It processes requests, generates responses, and these responses are relayed back through established channels. The responses retrace the same path in reverse, from the server application through server envoy 152 back to Ingress GW 146, but from Ingress GW 146 they are sent directly to client 120. While this architecture handles traffic and secures communications, it faces significant visibility challenges. Currently, request duration metrics are captured at Ingress GW 146 and Envoy proxy 148, but more detailed insights into latency per hop and the specific paths taken by requests through the mesh remain difficult to grasp. This lack of visibility complicates the troubleshooting process and makes it difficult for teams to identify hotspots or latency issues within the mesh.

[0027] Understanding how data packets travel through the cloud mesh network architecture 110 and the time they take at each step can be challenging. The cloud mesh network architecture faces significant challenges due to hop-by-hop latency and a lack of visibility into the path requests take. This issue complicates the process of diagnosing network performance problems, optimizing resource allocation, and ensuring reliable service delivery.

[0028] For example, when a request is made to a service VIP (virtual IP address), it passes through multiple nodes in the network, including TLB nodes and ingress GW nodes. However, without detailed insight into the time taken at each hop, it becomes difficult to identify where the delay is occurring. This lack of granular data hinders the ability to accurately pinpoint performance bottlenecks, making troubleshooting a complex and time-consuming task.

[0029] Furthermore, the inability to track the precise path of requests through the mesh network means that hotspots—areas experiencing higher-than-usual traffic or latency—cannot be easily identified and addressed. This can lead to performance degradation and impact the overall reliability of the network. For a network to be reliable, it must deliver requests in a timely and consistent manner, and unexpected delays can undermine this reliability.

[0030] As an example, a web application hosted on a Tess cloud mesh is accessed by a user. The request is sent from the user's browser to a service VIP managed by the Tess cloud mesh. This request passes through several components before reaching the service endpoint responsible for processing it. First, the request is directed to a TLB node, where load balancing is handled at the L4 level. From there, the request is forwarded via an IP-in-IP tunnel to one of the ingress GW nodes. At the ingress GW node, which acts as an L7 load balancer, the request is processed and sent to the appropriate service endpoint. A connection to the service backend pod is established, and the Envoy proxy within the pod takes over processing the request. At the service endpoint, the request is processed, a response is generated, and then sent back via the same path to the user's browser. This scenario has several potential latency points: the time it takes for the request to travel from the TLB node to the ingress GW node, the processing time at the ingress GW node, the time required for the request to reach the service endpoint and generate a response, and the time required for the response to travel through the ingress GW node and TLB node back to the user's browser.

[0031] Without visibility into latency at each of these hops, it becomes difficult to pinpoint the exact source of the delay. For example, if a user experiences a slow response time, potential causes could include prolonged processing time at the TLB node, delays occurring within the IP-in-IP tunnel between the TLB and the ingress GW node, a bottleneck at the ingress GW node, or slow processing at the service endpoint.

[0032] The lack of visibility into per-hop latency and the paths taken by requests traversing the Tess cloud mesh presents significant challenges in diagnosing problems, optimizing performance, and ensuring reliability. The proposed solution attempts to address these challenges by providing detailed insights into network performance, thereby improving its overall efficiency and reliability. Specifically, the proposed solution involves deploying packet interceptors (e.g., extended Berkeley Packet Filter (eBPF) programs) and custom Envoy filters to measure and record packet processing time at various points within the network. By capturing detailed metrics on request routing and latency, the system aims to provide visibility into the path from client to service endpoint. This visibility enhances the ability to diagnose problems, optimize performance, and ensure the network operates reliably.

[0033] By implementing the proposed solution, which includes deploying packet interceptors (e.g., extended Berkeley Packet Filter (eBPF) programs) and custom Envoy filters, detailed metrics regarding request routing and latency can be captured. For example, an eBPF program attached to a TLB node can measure the time taken for packet processing and forwarding. Similarly, an additional eBPF program on an ingress GW node can track the time spent on request processing and forwarding to service endpoints.

[0034] These metrics allow network administrators to quickly identify whether a delay reported by a user occurred at a TLB node, an ingress GW node, or a service endpoint. This enhanced visibility facilitates more efficient troubleshooting and optimization, ensuring reliable and efficient network operation.

[0035] As an example, in complex networks, data packets can take various paths based on routing protocols, congestion, and configuration. Consider, for instance, a request made by a user in New York to a service hosted in a data center in California. The packet may pass through several nodes (such as routers and gateways) before reaching its destination. Without clear visibility into how long each hop takes, it becomes difficult to pinpoint where the delay is occurring. For example, if the first hop from the user's device to the local router takes 5ms, but the next hop to the regional data center takes 50ms due to congestion, without tracking this, the network team might assume the problem lies at the service endpoint rather than realizing the delay occurred earlier in the route.
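The hop-by-hop reasoning in this example can be made concrete with a short sketch that turns cumulative arrival times into per-hop latencies and identifies the slowest hop. The 5 ms and 50 ms values follow the example above; the final 5 ms hop to the service endpoint, the node names, and the function name are assumptions added for illustration.

```python
def per_hop_latency(arrivals):
    """Convert ordered (node, arrival_ms) pairs into (src, dst, delta_ms) hops."""
    return [
        (src, dst, dst_ms - src_ms)
        for (src, src_ms), (dst, dst_ms) in zip(arrivals, arrivals[1:])
    ]


# Cumulative arrival times in milliseconds since the request left the client.
arrivals = [
    ("client", 0),
    ("local-router", 5),   # first hop: 5 ms
    ("regional-dc", 55),   # second hop: 50 ms due to congestion
    ("service", 60),       # assumed final hop: 5 ms
]

hops = per_hop_latency(arrivals)
slowest = max(hops, key=lambda hop: hop[2])
print(hops)     # [('client', 'local-router', 5), ('local-router', 'regional-dc', 50), ('regional-dc', 'service', 5)]
print(slowest)  # ('local-router', 'regional-dc', 50)
```

With only the 60 ms end-to-end figure, the service endpoint would be a plausible suspect; the per-hop breakdown shows that 50 ms of it was spent before the request reached the regional data center.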

[0036] To optimize performance, it's necessary to examine the exact path a request takes from start to finish. This means knowing which nodes it passes through and how much time it spends at each. Using a request tracking system, network administrators can verify that a request is sent from the client to the ingress GW, then to the load balancer, and finally to the TLB node before reaching the service endpoint. If the load balancer introduces a 40ms latency, this information allows the team to focus on optimizing that particular component.

[0037] Detailed metrics are crucial for diagnosing problems and improving network performance. These metrics should not only show average latency but also identify patterns and anomalies. If the metrics reveal that latency on a TLB node consistently spikes to 100ms during peak hours, the network team can investigate further. They might discover that this node is overloaded and needs scaling up, or that there is a configuration issue affecting performance.
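A recurring peak-hour pattern of the kind described above can be surfaced by grouping latency samples by hour of day and comparing each hour's median with a threshold. The sketch below assumes samples are already labeled with an hour of day; the 100 ms threshold follows the example in the text, while the function names and sample data are hypothetical.

```python
from collections import defaultdict
from statistics import median


def hourly_median(samples):
    """Group (hour_of_day, latency_ms) samples and return the median per hour."""
    buckets = defaultdict(list)
    for hour, latency_ms in samples:
        buckets[hour].append(latency_ms)
    return {hour: median(values) for hour, values in sorted(buckets.items())}


def peak_hours(samples, threshold_ms=100.0):
    """Return the hours whose median latency reaches the threshold."""
    return [h for h, m in hourly_median(samples).items() if m >= threshold_ms]


tlb_samples = [(9, 12), (9, 14), (18, 100), (18, 120), (18, 105), (3, 8)]
print(hourly_median(tlb_samples))  # {3: 8, 9: 13.0, 18: 105}
print(peak_hours(tlb_samples))     # [18]
```

Using the median rather than the mean keeps a single outlier from marking an hour as a peak, which matches the emphasis on latency that spikes consistently.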

Exemplary Systems and Resources

[0038] The embodiments of the technical solution can be illustrated by referring, for example, to Figures 2A, 2B, 2C, and 3. Figure 2A shows a cloud computing system 100, which includes a cloud mesh network 110, a network packet management extension engine 110A, a network performance management engine 110B, a client 120, a Border Gateway Protocol (BGP) 130, an Application Gateway (App GW) 140, a Transport Layer Balancer (TLB) 142, a packet interceptor 142A, a packet interceptor 142B, a tunnel 144, an ingress gateway (Ingress GW) 146, a packet interceptor 146A, an Application Gateway Envoy (App GW envoy) 148, a server 150, a server envoy 152, a server application 154, a log 160, a time-series database (TSDB) 170, and a fault analyzer 180. The cloud computing system 100 corresponds to the cloud computing system associated with the item listing system 600, which is described below with reference to Figure 6.

[0039] The Network Packet Management Extension Engine 110A and the Network Performance Management Engine 110B are collectively referred to as the Network Management Engine 110. The Network Management Engine 110 is an end-to-end system that oversees, monitors, and optimizes the entire lifecycle of network traffic from the initial client request to the final data response. By capturing, analyzing, and managing network packets, it ensures seamless data flow and tracks key performance metrics to identify and resolve latency or congestion issues.

[0040] The Network Packet Management Extension Engine 110A is a dedicated engine designed to calculate and analyze packet latency at various stages of the network path. The Network Packet Management Extension Engine 110A improves network packet processing by capturing detailed information at critical points in the flow. For example, during client request processing and ingress at the TLB, as well as during data inspection and recording at the ingress gateway (GW), it enables measurement of packet transit time, captures metadata, and supports connectivity tracking by recording latency within the network. The Network Packet Management Extension Engine 110A enables custom processing and data recording that provides deep insights into latency dynamics across the network infrastructure.

[0041] The Network Performance Management Engine 110B is a dedicated engine that identifies and manages deviations from expected network behavior, particularly in areas where latency is likely to occur. By monitoring network traffic, analyzing latency patterns, and visualizing data, the Network Performance Management Engine 110B detects performance problems and supports automated remediation using AI. Key features include latency calculation and graph generation on the client side, which visualizes latency data using path analysis graphs, correlates latency with system resources, and automates analysis and remediation by leveraging tools to reroute traffic when necessary. The Network Performance Management Engine 110B provides a holistic view of network performance and supports proactive problem solving through tools such as the Fault Analyzer 180.

[0042] The network management engine 110 leverages a layered approach to network management, monitoring, and fault detection, with each component playing a role in ensuring efficient data flow, capturing metrics, and analyzing performance. The following is a detailed breakdown of each component, including its role, the data it processes, its interfaces, and its key operations.

[0043] Client 120 initiates a network request to access a service or application. Client 120 can represent either an end user or an automated system making a request to the backend. Client 120 communicates an outgoing network request including a header, source / destination IP, and payload. Client 120 sends the request to a network entry point (such as App GW140). Client 120 initiates a connection request, waits for a response, and optionally captures the response header and latency data for analysis.

[0044] BGP130 (Border Gateway Protocol) determines the optimal route for packets traversing a network. This architecture ensures that packets reach backend services via efficient and reliable routes. BGP130 may include routing tables associated with network routing metrics. BGP130 connects with routers and network gateways to share routing information to improve route selection. BGP130 dynamically updates routing paths based on network conditions, minimizing latency and rerouting traffic as needed to avoid congestion.

[0045] App GW140 provides the entry point for client traffic. App GW140 filters and manages traffic and directs it to internal services such as the Transport Layer Balancer (TLB) 142. App GW140 accesses and processes incoming requests from clients, including the IP header and payload. App GW140 interfaces with clients, the TLB, and other internal services. App GW140 further validates incoming requests, applies security policies, and forwards traffic to the TLB for load balancing.

[0046] TLB142 is responsible for distributing incoming traffic across multiple service instances. This ensures balanced workload distribution and minimizes latency. TLB142 processes incoming packets with headers, timestamps, and metadata. TLB142 connects to the ingress gateway (GW) 146 via tunnel 144. TLB142 distributes requests to backend services based on load, availability, and other performance factors. TLB142 hosts packet interceptors 142A and 142B, timestamping packets and monitoring latency.

[0047] Packet interceptors 142A and 142B provide monitoring and data manipulation within the kernel at the TLB level, enabling high-performance packet inspection and latency tracking. The packet interceptors track network packet headers, timestamps, and processing metrics. They attach to the TLB's ingress and egress points to monitor packet flow. Packet interceptor 142A timestamps incoming packets at ingress, and packet interceptor 142B records the exit time at egress. Together, these calculate processing time within the TLB.
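The pairing of an ingress timestamp with an egress timestamp can be modeled in user space as follows. This is a minimal Python sketch of the logic only, not the kernel-level interceptors themselves. The class name `TlbTimer`, the nanosecond unit, and the use of a (source IP, source port) tuple as the flow identifier are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Assumed flow identifier: (source IP, source port).
FlowKey = Tuple[str, int]


class TlbTimer:
    """Models interceptor 142A (ingress) and 142B (egress) as two callbacks."""

    def __init__(self) -> None:
        self._ingress_ns: Dict[FlowKey, int] = {}

    def on_ingress(self, flow: FlowKey, now_ns: int) -> None:
        # Ingress side: remember when the packet entered the TLB.
        self._ingress_ns[flow] = now_ns

    def on_egress(self, flow: FlowKey, now_ns: int) -> Optional[int]:
        # Egress side: processing time is exit time minus entry time.
        start = self._ingress_ns.pop(flow, None)
        if start is None:
            return None  # no ingress timestamp was recorded for this flow
        return now_ns - start


timer = TlbTimer()
timer.on_ingress(("10.0.0.1", 4242), now_ns=1_000)
tlb_duration_ns = timer.on_egress(("10.0.0.1", 4242), now_ns=1_750)
```

Removing the entry on egress keeps the table bounded to flows that are currently inside the TLB.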

[0048] Tunnel 144 securely transports packets between TLB 142 and Ingress GW 146. This encapsulated path prevents interference and maintains data integrity. Tunnel 144 processes encrypted packet payloads with headers and connects the TLB to the Ingress GW. Tunnel 144 provides a secure path for network traffic, maintaining isolation and data protection as packets travel through the infrastructure.

[0049] The Ingress Gateway (Ingress GW) 146 manages incoming traffic from TLB 142 and forwards it to the appropriate service instance. The Ingress GW 146 processes network packets with encapsulated metadata and communicates with the App GW Envoy 148 and TLB 142. The Ingress GW 146 directs packets to a specific application instance or proxy, attaches a packet interceptor 146A to further monitor processing time, and extracts relevant data for latency analysis.

[0050] Packet interceptor 146A operates within the ingress GW 146 to log timestamps as packets pass through and collect metadata. Packet interceptor 146A processes packet metadata such as timestamps and source IP, and monitors ingress and egress traffic within the ingress GW 146. Packet interceptor 146A tracks the time it takes for packets to pass through the ingress GW 146 from entry to exit, calculates latency, and captures relevant metrics for analysis.

[0051] The Application Gateway Envoy (App GW Envoy) 148 is an Application Gateway-level Envoy proxy that attaches metadata to packets and creates tracking headers for end-to-end latency monitoring. App GW Envoy 148 processes packets annotated with tracking headers and latency metrics, communicates with the Ingress GW, and appends data to response packets passed from backend servers. App GW Envoy 148 inserts tracking headers into each packet, providing hop-by-hop visibility into network latency and time at each node.

[0052] Packets are associated with packet latency data, which refers to collected metrics and contextual details that quantify the time it takes for a packet to travel through specific network elements and, ultimately, the entire network path. Beyond kernel-level representation, these metrics can be surfaced at the application layer using HTTP headers inserted by an intermediate proxy such as Envoy. In other words, the same latency data initially captured in low-level data structures is transformed into a human-readable format within HTTP headers attached to response packets, thereby providing clients and downstream systems with end-to-end visibility into per-hop latency and network performance bottlenecks.

[0053] For example, the measured TLB duration, along with other timing metrics recorded at different hops, may be injected into the HTTP header before the response packet is sent back to the client. These headers, which may include fields such as X-CORP-MESH-TLB-DURATION or similar identifiers, allow the receiver to gain a clear hop-by-hop understanding of the network latency being encountered. As a result, latency data collected at the map level can be transformed into meaningful application-layer insights, enabling clients or observability tools to analyze and respond to network performance issues with greater accuracy.
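On the receiving side, such headers could be read with a few lines of code. The Python sketch below is illustrative only. It assumes that every latency header follows the pattern X-CORP-MESH-<hop>-DURATION and carries a decimal number. Neither the pattern nor the value format is specified above, and the header X-CORP-MESH-INGRESS-GW-DURATION is a hypothetical name.

```python
from typing import Dict, Mapping

PREFIX = "x-corp-mesh-"   # assumed common prefix of the latency headers
SUFFIX = "-duration"      # assumed common suffix of the latency headers


def extract_hop_durations(headers: Mapping[str, str]) -> Dict[str, float]:
    """Collect per-hop durations from response headers, keyed by hop name."""
    durations: Dict[str, float] = {}
    for name, value in headers.items():
        lowered = name.lower()  # HTTP header names are case-insensitive
        if not (lowered.startswith(PREFIX) and lowered.endswith(SUFFIX)):
            continue
        hop = lowered[len(PREFIX):-len(SUFFIX)]
        if not hop:
            continue
        try:
            durations[hop] = float(value)
        except ValueError:
            continue  # ignore malformed values instead of failing the request
    return durations


response_headers = {
    "Content-Type": "application/json",
    "X-CORP-MESH-TLB-DURATION": "1.8",
    "X-CORP-MESH-INGRESS-GW-DURATION": "0.6",  # hypothetical header name
    "X-CORP-MESH-TRACE-ID": "abc",             # hypothetical, not a duration
}
hops = extract_hop_durations(response_headers)
```

Skipping malformed values rather than raising keeps latency reporting from interfering with normal response handling.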

[0054] Packet latency data can be stored in a kernel-level data structure, specifically a map that associates the identification information of a particular network flow with key latency measurements recorded at the Transport Layer Balancer (TLB). This data structure operates using a key-value paradigm. The key is constructed from the source IP address and source port of the internal packet, uniquely identifying a given flow or connection attempt. The corresponding value includes the tunnel source IP and the measured TLB duration. The tunnel source IP identifies the specific TLB host that forwarded the packet, and the TLB duration quantifies the time the packet spent traversing the TLB node. By pairing these elements, packet latency data not only records how long a packet remained within a critical network component (TLB), but also correlates this timing with a specific source endpoint and the TLB node it traversed.
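The key-value layout described above can be sketched in Python as follows. This is a user-space model of the structure, not the kernel map itself. The type names `FlowKey`, `TlbRecord`, and `LatencyMap`, and the microsecond unit of the duration field, are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, NamedTuple, Optional


class FlowKey(NamedTuple):
    """Key: source IP and source port of the internal packet."""
    src_ip: str
    src_port: int


@dataclass(frozen=True)
class TlbRecord:
    """Value: which TLB host forwarded the packet and how long it took."""
    tunnel_src_ip: str      # identifies the TLB host
    tlb_duration_us: int    # time spent traversing the TLB (unit assumed)


class LatencyMap:
    """In-memory stand-in for the kernel-level latency map."""

    def __init__(self) -> None:
        self._entries: Dict[FlowKey, TlbRecord] = {}

    def record(self, key: FlowKey, value: TlbRecord) -> None:
        self._entries[key] = value

    def lookup(self, key: FlowKey) -> Optional[TlbRecord]:
        return self._entries.get(key)


latency_map = LatencyMap()
key = FlowKey("192.0.2.10", 51234)
latency_map.record(key, TlbRecord(tunnel_src_ip="198.51.100.7", tlb_duration_us=420))
record = latency_map.lookup(key)
```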

[0055] Packet latency data provides detailed insights into hop-by-hop delay and network behavior. This allows downstream processing components, such as Envoy proxies or other monitoring tools, to extract, analyze, and annotate network flows with precise latency information. This enables operators and automated systems to isolate slow network segments, identify performance bottlenecks, and gain a detailed hop-level understanding of packet movement through the cloud mesh.

[0056] Server 150 processes client requests and executes backend functions based on the received data. Server 150 processes incoming requests, including the packet payload, and then connects with Server Envoy 152 and Server Application 154. Server 150 provides backend services and data processing in response to client requests. Server Envoy 152 is the Envoy proxy at the server level. Server Application 154 is at the application layer, where client requests are processed according to business logic. Server Application 154 processes requests and response payloads and communicates with Server 150 and Server Envoy 152. Server Application 154 performs core operations and generates a response that is sent back to Client 120. Along the path to the client, App GW Envoy 148 adds a final tracking header and timestamp to the outgoing packet, providing latency data and full visibility into the network path.

[0057] Log 160 stores detailed records of network events, metrics, and errors for later analysis. Log 160 data includes timestamped logs that capture network activity, packet metadata, and latency measurements. Log 160 is accessible by the fault analyzer 180. Log 160 collects and organizes logs, providing a historical record of network performance and events. The time-series database (TSDB) 170 stores timestamped network data (e.g., traffic metrics from TLB 142 and ingress GW 146) and provides structured datasets for latency and performance analysis. Data in TSDB 170 includes time-series metrics, including latency, packet flow, and error rates. TSDB 170 aggregates metrics over time, supporting efficient searches for analysis, visualization, and historical comparisons. TSDB 170 works with the fault analyzer 180 to support alert generation.

[0058] The fault analyzer 180 identifies performance issues by analyzing logs and metrics from the TSDB. The fault analyzer 180 processes time-series data, log entries, and fault metrics. Specifically, the fault analyzer 180 receives data from log 160 and TSDB 170, generates alerts based on detecting network performance anomalies, issues alerts when latency exceeds a threshold, and enables proactive management and troubleshooting.
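The threshold-based alerting described above can be sketched as a simple filter over time-series samples. The Python below is a minimal illustration. The record types, the function name `find_threshold_alerts`, and the 100ms threshold in the usage lines are assumptions, and a real analyzer would read its samples from the log and the TSDB rather than from a list.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class LatencySample:
    timestamp_s: float
    node: str
    latency_ms: float


@dataclass(frozen=True)
class Alert:
    node: str
    timestamp_s: float
    latency_ms: float
    threshold_ms: float


def find_threshold_alerts(samples: Iterable[LatencySample],
                          threshold_ms: float) -> List[Alert]:
    """Emit one alert for every sample whose latency exceeds the threshold."""
    return [
        Alert(s.node, s.timestamp_s, s.latency_ms, threshold_ms)
        for s in samples
        if s.latency_ms > threshold_ms
    ]


samples = [
    LatencySample(0.0, "tlb", 12.0),
    LatencySample(1.0, "tlb", 130.0),
    LatencySample(1.0, "ingress_gw", 8.0),
]
alerts = find_threshold_alerts(samples, threshold_ms=100.0)
```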

[0059] This layered setup creates a framework for monitoring, analyzing, and optimizing network performance across a complex service infrastructure. From initial client requests to backend processing and performance analysis, each component plays a distinct role in ensuring reliable, high-performance network operation.

[0060] Figure 2B shows a schematic diagram 100B relating to providing a network management engine according to the embodiments described herein. This process is initiated by a client 120 that sends a network request addressed to a specific service. This request passes through multiple layers and components, each carefully configured to optimize network performance and enhance visibility into the network state. Each component of this architecture is integrated to enable data collection, latency analysis, and fault detection using advanced mechanisms such as BGP 130 and packet interceptors across different nodes.

[0061] Initially, client 120's request is directed to App GW140, which processes the incoming traffic and balances it across multiple available servers in the system. App GW140 acts as an entry point for directing packets to the appropriate backend service. The request is then forwarded to TLB142, which is responsible for distributing network traffic across multiple application instances to ensure optimal load balancing and low latency. For example, when an HTTP packet enters App GW140, it first travels through the IP layer and is encapsulated in TCP (Transmission Control Protocol) for reliable delivery. The packet then passes through XDP (the eBPF eXpress Data Path), where it may be filtered or redirected for high-performance packet processing. From there, it is passed to a Traffic Control (TC) system, which can implement policies such as bandwidth shaping or prioritization before forwarding it to an IPVS (IP Virtual Server) load balancer. Based on the configured load balancing method, IPVS distributes the traffic to the appropriate backend servers. Throughout this entire path, the payload (application data) remains intact, and the overall processing time is minimized through efficient processing at each network layer.

[0062] Within TLB142, eBPF programs 142A and 142B are deployed to monitor packet flow at both ingress and egress points. These eBPF programs are kernel-based technologies designed for packet filtering and monitoring in high-speed environments. eBPF142A attaches to the ingress traffic path to mark each incoming packet with a timestamp and create a baseline for measuring latency. eBPF142B then attaches to the egress traffic path to measure the time packets spend within the TLB node, which provides an initial measurement of network latency.

[0063] From the TLB, traffic flows through tunnel 144, which links the TLB to the Ingress Gateway (Ingress GW) 146. This tunnel is designed to secure data transport between the TLB and the Ingress Gateway, ensuring that traffic remains isolated and optimized as it travels through the network infrastructure. Packets traveling through the tunnel to the Ingress GW 146 are encapsulated within multiple layers. Each packet begins with an outer IP header, followed by an inner IP header, then a TCP segment with duration (DU) and the corresponding payload (application data). The entire packet is forwarded through the tunnel. The outer IP header handles routing to the Ingress Gateway, which decapsulates the tunneled packet and passes it on toward its destination.

[0064] Upon reaching the ingress gateway 146, another eBPF program, eBPF146A, is attached to track packets as they enter and leave the gateway. This eBPF program is configured to monitor key metrics such as packet arrival time, source IP, and transport duration, adding another layer of latency monitoring within the gateway node. Packets traveling through the ingress gateway 146 begin by being processed at the XDP (eBPF eXpress data path) level, where packets are filtered or redirected for high-performance processing before reaching the kernel's networking stack. From there, it passes through Traffic Control (TC), which manages the flow and ensures Quality of Service (QoS) by applying policies such as traffic shaping or prioritization. The packets are then routed at the IP layer, where their destination is determined and, if necessary, encapsulated for tunneling or forwarding to the appropriate backend.

[0065] Finally, Envoy (i.e., the application gateway Envoy148), as a service proxy, can process packets at the application layer and perform tasks such as load balancing, routing, and applying any additional network policies before the packets reach their intended service or application. Envoy works to record the source IP address, source port, and destination IP address associated with each new TCP connection. To achieve this, an Envoy network filter is introduced and executed whenever a new TCP connection is established.

[0066] During initialization, the network filter queries the kernel to retrieve the TLB IP address and TLB duration associated with the {source IP, source port} key. Once this information is obtained, it is stored in memory for reference during response processing. After the lookup, the corresponding kernel entry is immediately removed to maintain a clean state.

[0067] To prevent the accumulation of old data, the garbage collector within Envoy periodically removes outdated map entries. For example, it can run every two minutes and remove entries older than 90 seconds. This approach ensures efficient memory usage and prevents potential memory leaks.
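The age-based cleanup can be sketched as follows. The two constants take the example values given above (a two-minute interval and a 90-second maximum age). The function name, the dictionary layout, and the use of seconds as the time unit are assumptions for illustration, and the scheduling of the periodic run is left out.

```python
from typing import Dict, Tuple

GC_INTERVAL_S = 120.0    # example value from the text: run every two minutes
MAX_ENTRY_AGE_S = 90.0   # example value from the text: drop entries older than 90 s

# Assumed layout: flow key (source IP, source port) -> insertion time in seconds.
FlowKey = Tuple[str, int]


def collect_garbage(inserted_at: Dict[FlowKey, float],
                    now_s: float,
                    max_age_s: float = MAX_ENTRY_AGE_S) -> int:
    """Delete entries older than max_age_s and return how many were removed."""
    # Build the list first so the dictionary is not mutated while iterating.
    stale = [key for key, t in inserted_at.items() if now_s - t > max_age_s]
    for key in stale:
        del inserted_at[key]
    return len(stale)


entries = {("10.0.0.1", 1111): 0.0, ("10.0.0.2", 2222): 50.0}
removed = collect_garbage(entries, now_s=GC_INTERVAL_S)
```

In this run the first entry is 120 seconds old and is removed, while the second is 70 seconds old and is kept.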

[0068] Referring to Figure 2A, upon arrival at server Envoy 152, the Envoy proxy inspects the packet before it reaches server application 154. Here, server application 154 processes the client's request and handles the actions specific to the requested service. After processing, the server application's response is routed back through server Envoy 152 and retraces the same path to client 120 via App GW Envoy 148 and ingress GW 146.

[0069] The client extracts and calculates network latency data. The client can update the log database with a routing analysis graph to support alert generation. A routing analysis graph is a visual representation that maps the distinct paths a packet takes across the network, providing detailed insights into latency and performance metrics at each hop. This type of graph is particularly valuable in complex network environments where a packet can traverse multiple routers, switches, and gateways, each potentially impacting overall latency based on congestion, routing decisions, or quality of service (QoS) settings.

[0070] When constructing a routing analysis graph, data is collected at key network nodes, including entry points, intermediate routers, gateways, and destinations. Each node in the graph represents a point where metrics such as latency, packet drop rate, and jitter are measured. As packets travel along their routes, these nodes help to pinpoint the precise location of delays or performance degradations. For example, if packets consistently experience high latency at a particular router, the routing analysis graph highlights this, allowing network administrators to focus on resolving the issue at that node rather than in a wider network segment.
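One possible way to assemble such a graph from per-request traces is sketched below in Python. The trace format (an ordered list of node name and latency pairs), the node names, and the choice of mean latency as the per-node summary are all assumptions for illustration. Packet drop rate and jitter could be aggregated in the same way.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]
Trace = List[Tuple[str, float]]  # ordered (node, latency_ms measured at node)


def build_graph(traces: Iterable[Trace]) -> Tuple[Dict[Edge, int], Dict[str, float]]:
    """Return (edge traversal counts, mean latency per node)."""
    edge_counts: Dict[Edge, int] = defaultdict(int)
    node_latencies: Dict[str, List[float]] = defaultdict(list)
    for trace in traces:
        for node, latency_ms in trace:
            node_latencies[node].append(latency_ms)
        # Consecutive nodes in a trace form a directed edge of the graph.
        for (a, _), (b, _) in zip(trace, trace[1:]):
            edge_counts[(a, b)] += 1
    node_mean = {node: mean(vals) for node, vals in node_latencies.items()}
    return dict(edge_counts), node_mean


def hottest_node(node_mean: Dict[str, float]) -> str:
    """Node with the highest mean latency."""
    return max(node_mean, key=node_mean.get)


# Two hypothetical requests that traverse different TLB nodes.
traces = [
    [("app_gw", 2.0), ("tlb_a", 4.0), ("ingress_gw", 3.0)],
    [("app_gw", 2.0), ("tlb_b", 30.0), ("ingress_gw", 5.0)],
]
edges, node_mean = build_graph(traces)
worst = hottest_node(node_mean)
```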

[0071] Routing analysis graphs also incorporate data on dynamic routing changes and fluctuating traffic patterns, showing how packets might traverse different routes under different conditions. This real-time tracking enables accurate latency calculations across each route, giving administrators a clear view of where congestion or anomalies are impacting network performance. Whereas other tools rely on aggregated data, routing analysis graphs offer finer granularity by capturing detailed metrics at each network hop, so that subtle anomalies that might otherwise go unnoticed can be detected and addressed.

[0072] In network management, routing graphs provide visibility into network operational health by correlating traffic flow with node-level performance data. By identifying potential problem areas at each stage of packet traversal, they support more efficient troubleshooting, faster resolution times, and optimized routing paths, thereby improving overall network reliability and responsiveness.

[0073] Turning to node-level traffic metrics, these refer to a set of specific measurements and performance indicators collected at each network node, which together provide high-granularity insights into data flow, resource utilization, and latency behavior. These metrics provide a localized view of the health, efficiency, and operational status of nodes within a broader distributed network environment. By collecting these metrics at the node level, the system enables precise, real-time visibility into traffic patterns and resource constraints that impact end-to-end network performance.

[0074] In the network management engine, each node (such as a transport layer balancer, ingress gateway, or application gateway) generates and tracks metrics related to its own activity, resource usage, and traffic processing. For example, packet drops in a node's NIC or within its kernel highlight the possibility that data packets may not be able to be transmitted successfully, providing clues to potential congestion or capacity issues. CPU and memory utilization metrics provide a view to processing demand and availability, helping to effectively manage resource allocation. These metrics directly support the solution's goal of achieving real-time latency analysis and enabling timely adjustments for optimized packet routing and processing.

[0075] Furthermore, metrics such as round-trip time (RTT) histograms, congestion-limited connection counts, and open TCP / UDP port counts contribute to detailed visibility into connection health and stability. By correlating these node-level traffic metrics with data stored in components such as time-series databases, the solution can perform advanced analytics to detect anomalies, accurately indicate deviations from expected performance, and improve the accuracy and responsiveness of network management. Ultimately, node-level traffic metrics serve as foundational data to enhance latency monitoring, fault detection, and automated network tuning across cloud mesh networks, supporting this solution's focus on precise path-specific traffic control and fault resolution.

[0076] For example, node-level traffic metrics provide insights into the health and performance of each network node by focusing on factors such as resource utilization, connectivity stability, and data flow efficiency. Each metric is essential for diagnosing network issues and optimizing traffic management, especially in complex systems where latency and resource constraints can impact overall performance.

[0077] Packet drops within the network interface card (NIC) and kernel reveal potential bottlenecks or interruptions in data transmission. For example, if a node exhibits a high packet drop rate on its NIC, it may indicate a problem with buffer capacity or an overloaded link, requiring a closer examination of traffic routing or hardware capacity. Within the kernel, packet drops can result from processing limitations that cause packets to be discarded before they are forwarded, impacting the reliability of data transmission at a fundamental level.

[0078] The number of CPU cores exceeding a predetermined utilization rate (e.g., 75% utilization) within a one-second window is another crucial metric. This measurement provides real-time insights into processing demand across nodes and highlights when specific tasks or traffic spikes are overloading resources. For example, if multiple cores consistently exceed 75% utilization during peak periods, the node may require load balancing or further resource provisioning to avoid performance degradation.
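The metric itself reduces to a count over per-core utilization values for one window. The short Python sketch below assumes the utilization of each core over the one-second window is already available as a fraction between 0 and 1. How those fractions are sampled is outside the sketch, and the function name is hypothetical.

```python
from typing import Sequence


def cores_over_threshold(per_core_utilization: Sequence[float],
                         threshold: float = 0.75) -> int:
    """Count cores whose utilization strictly exceeds the threshold."""
    return sum(1 for u in per_core_utilization if u > threshold)


# Hypothetical utilization of four cores during one one-second window.
window = [0.20, 0.80, 0.75, 0.91]
busy_cores = cores_over_threshold(window)
```

A core at exactly 75% is not counted here because the text speaks of cores exceeding the rate. Whether the boundary is inclusive is a design choice.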

[0079] Congestion-limited connections refer to connections that cannot proceed at their full capacity due to network congestion. Tracking the number of these connections reveals when and where data traffic encounters bandwidth constraints, providing a basis for adjusting QoS settings or rerouting traffic to less congested paths.

[0080] Round-trip time (RTT) histograms are useful for identifying latency patterns and anomalies. For example, a normal distribution of RTT may indicate stable performance, while spikes or shifts in the histogram suggest fluctuating delays that potentially point to problems with routing paths or transient congestion. Monitoring RTT histograms allows for the rapid identification of latency changes, which is invaluable for maintaining seamless, low-latency connectivity.
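A histogram of this kind can be built by sorting RTT samples into fixed buckets, as in the Python sketch below. The bucket edges, the sample values, and the helper `tail_fraction` are assumptions chosen for illustration. A shift of mass toward the upper buckets is the kind of change the paragraph above describes.

```python
from bisect import bisect_right
from typing import List, Sequence


def rtt_histogram(rtts_ms: Sequence[float],
                  bucket_edges_ms: Sequence[float]) -> List[int]:
    """Count samples per bucket; edges must be sorted in ascending order.

    Bucket 0 holds values below the first edge, bucket i holds values in
    [edges[i-1], edges[i]), and the last bucket holds values >= the last edge.
    """
    counts = [0] * (len(bucket_edges_ms) + 1)
    for rtt in rtts_ms:
        counts[bisect_right(bucket_edges_ms, rtt)] += 1
    return counts


def tail_fraction(counts: Sequence[int], first_tail_bucket: int) -> float:
    """Share of samples that fall in the given bucket or any higher one."""
    total = sum(counts)
    return sum(counts[first_tail_bucket:]) / total if total else 0.0


edges_ms = [1.0, 5.0, 20.0, 100.0]                      # hypothetical bucket edges
samples_ms = [0.4, 2.0, 3.5, 7.0, 25.0, 150.0, 4.0, 1.0]  # hypothetical RTTs
counts = rtt_histogram(samples_ms, edges_ms)
slow_share = tail_fraction(counts, first_tail_bucket=3)  # samples >= 20 ms
```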

[0081] Metrics counting the connections whose receive memory (RMEM) or write memory (WMEM) usage exceeds the provisioned thresholds help identify situations where data buffering exceeds what the node's memory can handle. Connections that frequently exceed the RMEM or WMEM threshold are a sign of insufficient buffering or abnormally high data rates and require optimization to maintain data integrity and prevent connection slowdowns.

[0082] Packet and bitrate measurements allow administrators to track overall throughput and identify unusual spikes or drops in data flow by measuring the amount of data passing through a node over time. For example, if a node experiences a sudden drop in packet rate, it may indicate a routing problem, packet filtering, or an upstream application failure, prompting immediate investigation.

[0083] The count of open TCP and UDP ports on a node indicates which services are active and accessible. Unmonitored open ports can expose the system to unauthorized access or increased load from external sources, making this information essential for maintaining network security and efficiency.

[0084] Node memory utilization provides a view of available and consumed memory, helping to assess whether memory resources are sufficient for the current task. High memory utilization without proper management can lead to paging, ultimately slowing down data and packet processing.

[0085] Memory bandwidth utilization indicates how much memory access capacity is being used, especially in relation to data-intensive applications. If memory bandwidth is fully utilized, access to critical data will be slow, potentially leading to a bottleneck even if CPU and network resources are not being fully utilized.

[0086] Finally, tracking the percentage of CPU consumed by eBPF on a host highlights the processing demands of eBPF programs that manage packet processing and monitoring. For example, if an eBPF program is using a significant share of the CPU, it may be limiting resources for other processes on the node, indicating that the efficiency of these monitoring functions needs to be optimized.

[0087] These node-level traffic metrics collectively provide a comprehensive view of the network's health and performance at each node. Through consistent monitoring and analysis of these metrics, network administrators gain the tools to diagnose, tune, and optimize node performance across a variety of traffic conditions and operational demands.

[0088] As an example relating to bandwidth utilization, determining that available bandwidth has been exhausted involves correlating an increase in packet latency with an increase in traffic load, as observed through multiple packet interceptors deployed across the network. As more data flows through the network, traffic patterns become denser, and packets begin to compete for the same resources, such as queues and transmit buffers. Packet interceptors record this increasing competition as a gradual increase in latency. For example, once the traffic volume exceeds a certain threshold, packets may begin to queue at the transport layer balancer or experience slower forwarding rates at the ingress gateway. These conditions are reflected in timestamped packet data, where previously negligible latency increases significantly, both at individual nodes and cumulatively along the packet's multi-hop path. By continuously comparing current latency readings with an established performance baseline, the system can detect subtle shifts indicating congestion. When interceptors report a persistent increase in latency across multiple points in the network, particularly points known to be capacity-limited segments, the system infers that available bandwidth has been effectively consumed.
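The baseline comparison can be sketched as follows. This Python example is one possible reading of the inference and not the described system's actual rule. It treats an increase as persistent when every recent sample at a node exceeds a multiple of that node's baseline, and it infers saturation when at least two capacity-limited nodes are flagged. The ratio, the sample count, the node count, and all names are assumptions.

```python
from typing import Dict, Iterable, List, Sequence


def congested_nodes(baseline_ms: Dict[str, float],
                    recent_ms: Dict[str, Sequence[float]],
                    ratio: float = 2.0,
                    min_samples: int = 3) -> List[str]:
    """Nodes whose recent latency samples all exceed ratio * baseline."""
    flagged = []
    for node, samples in recent_ms.items():
        base = baseline_ms.get(node)
        if base is None or len(samples) < min_samples:
            continue  # not enough information to judge this node
        if all(s > ratio * base for s in samples):
            flagged.append(node)
    return sorted(flagged)


def bandwidth_saturated(flagged: Iterable[str],
                        capacity_limited: Iterable[str],
                        min_nodes: int = 2) -> bool:
    """Infer saturation when enough capacity-limited nodes are flagged."""
    return len(set(flagged) & set(capacity_limited)) >= min_nodes


baseline = {"tlb": 2.0, "ingress_gw": 1.0, "app_gw": 1.5}
recent = {
    "tlb": [5.0, 6.0, 7.5],
    "ingress_gw": [2.5, 3.0, 2.2],
    "app_gw": [1.4, 4.0, 1.6],   # a single spike, not a persistent increase
}
flagged = congested_nodes(baseline, recent)
saturated = bandwidth_saturated(flagged, capacity_limited={"tlb", "ingress_gw"})
```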

[0089] Throughout the network management process, Log 160 is continuously updated with information on packet flow, latency measurements, and processing times at each network node. These logs serve as a persistent repository of network activity, which is invaluable for troubleshooting and retrospective analysis. To manage and analyze large amounts of real-time data, TSDB 170 is used. This dedicated database is optimized for timestamped data and captures metrics from each hop in the request path, including timestamps, hop-specific latency, and error rates. TSDB aggregates this information to provide an end-to-end view of network performance.

[0090] Finally, the fault analyzer 180 processes data from both logs and TSDB 170 to detect deviations from normal performance benchmarks. If latency spikes or packet loss exceeding acceptable thresholds are detected, the fault analyzer generates an alert prompting network administrators to investigate and correct the problem if necessary. By analyzing trends and identifying anomalies, the fault analyzer acts as a proactive layer of monitoring and control, protecting network reliability and efficiency.

[0091] In this way, the network management engine 110 integrates advanced components and technologies to provide a comprehensive solution for network management, performance monitoring, and fault detection. This architecture enables precise latency measurement at each network node, provides end-to-end visibility, allows for rapid response to network irregularities, and ensures optimal performance and reliability of client-server interactions.

[0092] Figure 2C shows a flowchart relating to a network packet management extension engine that enhances mesh visibility by tracking request paths and hop-level latency, enabling potentially automated recovery for rapid troubleshooting and performance optimization. The network packet management extension engine provides trackable paths for request processing, monitors latency at each node, and enables automated real-time responses to optimize network performance and maintain service availability. Path identification and TLB duration measurement can be described by referring to the following steps.

[0093] In Step 201C - Client Request Processing and Ingress in the TLB: The client request process begins when a client or synthetic traffic generator initiates a request targeting a virtual IP (VIP) associated with a particular service. This VIP is a unique IP that acts as a single entry point, guiding the request through a mesh of network components that facilitates routing, monitoring, and latency tracking. The first component to receive the request is the Transport Layer Balancer (TLB), which handles routing at the network edge to direct traffic to the appropriate service.

[0094] Deployment of eBPF programs on TLB hosts: Several eBPF programs are deployed on TLB hosts to enhance tracking and optimize latency. eBPF allows small programs to run within the kernel to monitor and analyze packet data, minimizing performance overhead by running only when specific traffic conditions are met.

[0095] eBPF1 is attached to a Traffic Control (TC) hook at the ingress (TLB entry point), and its function is to timestamp incoming SYN packets for IPs associated with the TLB's VIP. This timestamp helps track the exact entry time of a packet when it reaches the TLB, providing a baseline for latency analysis. To optimize performance, eBPF1 uses a per-CPU sampling method, ensuring that only one SYN packet per CPU core per second is processed, reducing the load on the system.
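The per-CPU sampling rule (at most one SYN packet per CPU core per second) can be modeled as a small rate limiter. The Python sketch below illustrates the logic only. An actual eBPF program would be written in restricted C and would typically keep this state in a per-CPU map. The class name and the nanosecond clock are assumptions.

```python
from typing import Dict

NS_PER_SECOND = 1_000_000_000


class PerCpuSynSampler:
    """Allows at most one sampled SYN per CPU within each interval."""

    def __init__(self, interval_ns: int = NS_PER_SECOND) -> None:
        self._interval_ns = interval_ns
        self._last_sample_ns: Dict[int, int] = {}  # CPU id -> last sample time

    def should_sample(self, cpu: int, now_ns: int) -> bool:
        last = self._last_sample_ns.get(cpu)
        if last is not None and now_ns - last < self._interval_ns:
            return False  # this CPU already sampled a SYN in the current interval
        self._last_sample_ns[cpu] = now_ns
        return True


sampler = PerCpuSynSampler()
first = sampler.should_sample(cpu=0, now_ns=0)
second = sampler.should_sample(cpu=0, now_ns=500_000_000)      # same CPU, 0.5 s later
other_cpu = sampler.should_sample(cpu=1, now_ns=500_000_000)   # different CPU
later = sampler.should_sample(cpu=0, now_ns=NS_PER_SECOND)     # same CPU, 1 s later
```

Keeping the state per CPU avoids contention between cores, which is one reason such a scheme keeps overhead low.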

[0096] eBPF2 is also attached to a TC hook, but at the egress point (the TLB exit point). eBPF2 measures the time taken from packet ingress to egress within the TLB, which includes any time spent creating the network tunnel. The time spent within the TLB is calculated and written as the TLB duration within the packet's internal IP options, giving a precise measure of TLB processing latency.
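The duration calculation performed at the egress hook amounts to a subtraction of the ingress timestamp from the egress time. The following sketch shows that arithmetic only. The function names, the dictionary used as a stand-in for the packet marker, and the nanosecond unit are assumptions and are not specified in this description.

```python
def tlb_duration_ns(ingress_ts_ns: int, egress_ts_ns: int) -> int:
    """Time spent inside the TLB: egress time minus ingress timestamp."""
    if egress_ts_ns < ingress_ts_ns:
        raise ValueError("egress timestamp precedes ingress timestamp")
    return egress_ts_ns - ingress_ts_ns


def finalize_marker(marker: dict, egress_ts_ns: int) -> dict:
    """Return a new marker carrying the TLB duration instead of the ingress time."""
    duration = tlb_duration_ns(marker["ingress_ts_ns"], egress_ts_ns)
    return {"tlb_duration_ns": duration}
```

For example, a marker stamped at 1,000 ns that reaches the egress hook at 4,500 ns yields a TLB duration of 3,500 ns.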

[0097] TLB Data Inspection and Logging: eBPF3 inspects tunneled packets and records relevant metadata such as source IP, source port, TLB IP address, and TLB duration. This data is essential for understanding where latency may be introduced in packet processing. The program attaches to the TC hook within the ingress gateway (GW) pod and logs metadata for each packet that will be referenced in subsequent connectivity analysis.

[0098] In Step 202C - Processing and Connection Tracking at the Ingress Gateway (GW): Once a request reaches the Ingress GW, it is processed by App GW Envoy, a high-performance proxy service that routes and monitors network traffic. To track and report latency, network filters in Envoy collect essential metadata.

[0099] App GW Envoy Integration for Connection Tracking: The network filter in App GW Envoy captures source IP, source port, and TLB information (e.g., TLB processing time, TLB host IP). This information is first logged to the TLB by the eBPF program and becomes accessible for each connection, allowing Envoy to evaluate request processing time as the request travels through each layer of the network. A garbage collector function clears old entries from memory every two minutes to prevent data overflow, ensuring that only relevant recent data is stored.
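The connection-tracking store and its garbage collector can be sketched as follows. This is an illustrative model, not Envoy filter code. The names ConnectionTable and TlbInfo are hypothetical, and using 120 seconds as the maximum entry age is an assumption: the description states only that old entries are cleared every two minutes.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class TlbInfo(NamedTuple):
    tlb_host_ip: str
    tlb_duration_us: int
    stored_at_s: float


class ConnectionTable:
    """Per-connection TLB metadata keyed by (source IP, source port)."""

    def __init__(self, max_age_s: float = 120.0) -> None:
        self.max_age_s = max_age_s
        self._entries: Dict[Tuple[str, int], TlbInfo] = {}

    def put(self, src_ip: str, src_port: int, tlb_host_ip: str,
            tlb_duration_us: int, now_s: float) -> None:
        self._entries[(src_ip, src_port)] = TlbInfo(tlb_host_ip, tlb_duration_us, now_s)

    def get(self, src_ip: str, src_port: int) -> Optional[TlbInfo]:
        return self._entries.get((src_ip, src_port))

    def collect_garbage(self, now_s: float) -> int:
        """Drop entries older than max_age_s and return how many were removed."""
        stale = [key for key, info in self._entries.items()
                 if now_s - info.stored_at_s > self.max_age_s]
        for key in stale:
            del self._entries[key]
        return len(stale)
```

A caller would invoke collect_garbage on a timer; the sweep is a linear scan, which is acceptable when the table holds only sampled connections.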

[0100] Custom response filters for App GW Envoy - App GW Envoy also adds specific tracking headers to each HTTP response to capture detailed processing information. This response filter is deployed within the ingress GW pod, logging request latency at several points and attaching metadata for each layer the packet passes through.

[0101] X-CORP-MESH-PROXY-DURATION: Represents the processing time within the Envoy proxy. X-CORP-MESH-PROXY-POD: Identifies proxy pods by IP or FQDN, enabling pod-level tracking.

[0102] Additional headers: X-CORP-MESH-INGRSS-GW-DURATION: Indicates the total processing time within the ingress gateway.

[0103] X-CORP-MESH-INGRSS-GW-POD: Identifies the Ingress GW pod. X-CORP-MESH-TLB-HOST: Captures source IP from the TLB retrieved from local memory.

[0104] X-CORP-MESH-TLB-DURATION: Reflects the time it took for the TLB to process the request. In Step 203C - Latency calculation and graph generation on the client: Upon receiving an HTTP response, the client extracts the tracking headers added by the App GW Envoy within the Ingress GW. These headers provide a complete description of the latency across various network points.

[0105] Each client captures headers such as X-CORP-MESH-PROXY-DURATION and X-CORP-MESH-TLB-DURATION, which represent the latency at each key node (e.g., TLB, Ingress GW, Envoy proxy).

[0106] The client logs this data as a JSON object, enabling a structured representation of network paths and latency per node.
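The client-side extraction and JSON logging can be sketched as below. The header names are those listed in this description. The function name latency_record and the flat JSON layout are assumptions, and the layout is not intended to reproduce the table referenced in this section.

```python
import json
from typing import Mapping

# Tracking headers named in this description.
TRACKING_HEADERS = (
    "X-CORP-MESH-PROXY-DURATION",
    "X-CORP-MESH-PROXY-POD",
    "X-CORP-MESH-INGRSS-GW-DURATION",
    "X-CORP-MESH-INGRSS-GW-POD",
    "X-CORP-MESH-TLB-HOST",
    "X-CORP-MESH-TLB-DURATION",
)


def latency_record(response_headers: Mapping[str, str]) -> str:
    """Extract the tracking headers and serialize them as a JSON object."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in response_headers.items()}
    record = {name: lowered[name.lower()]
              for name in TRACKING_HEADERS if name.lower() in lowered}
    return json.dumps(record, sort_keys=True)
```

Headers that are absent from a response are simply omitted from the record, so a partially instrumented path still produces a valid JSON object.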

[0107] [Table 1]

[0108] In Step 204C - Automated Analysis and Remediation: A tool (e.g., a fault analyzer) analyzes JSON latency data and correlates it with kernel metrics to identify any bottlenecks. If network congestion, resource exhaustion, or abnormal latency spikes are detected, the tool can initiate adjustments via the VIP scheduler, which reroutes traffic and dynamically allocates resources to improve network performance and ensure reliability. To operate efficiently within the network, root access may be enabled for App GW Envoy to support BPF capabilities. If security protocols restrict root access, alternative solutions include implementing a Remote Procedure Call (RPC) proxy. However, this may add significant latency due to additional routing overhead.

[0109] To measure network latency without compromising packet flow, the time taken is included within the IP option using the timestamp IP option, a feature specified in the IP standard for experimental purposes. This option is rarely used in typical network configurations and was chosen for its suitability with latency measurement. By attaching it only to TCP SYN packets, any additional performance overhead is minimized. This approach avoids the costly adjustments required for packet headroom, which can slow down high-rate packet processing.
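For reference, the Internet Timestamp option defined in RFC 791 has option type 68 and a four-octet header (type, length, pointer, overflow and flags) followed by 32-bit timestamp slots. The sketch below builds and parses such an option with a single slot. Carrying an arbitrary 32-bit value such as a duration in that slot, and the function names used, are assumptions for illustration; this description does not specify the exact byte layout it uses.

```python
import struct

IPOPT_TIMESTAMP = 68  # Internet Timestamp option type (RFC 791)


def build_timestamp_option(value: int) -> bytes:
    """Build a timestamp option holding one 32-bit value."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value must fit in 32 bits")
    length = 8       # 4-octet option header plus one 4-octet slot
    pointer = 9      # the single slot is filled, so the pointer is past the end
    oflw_flags = 0   # overflow count 0, flag 0 (timestamps only)
    return struct.pack("!BBBBI", IPOPT_TIMESTAMP, length, pointer, oflw_flags, value)


def parse_timestamp_option(option: bytes) -> int:
    """Return the 32-bit value from a one-slot timestamp option."""
    opt_type, length, _pointer, _oflw_flags, value = struct.unpack("!BBBBI", option[:8])
    if opt_type != IPOPT_TIMESTAMP or length != 8:
        raise ValueError("not a one-slot timestamp option")
    return value
```

An eight-octet option keeps the IP header length a multiple of four octets, so no padding option is needed in this one-slot case.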

[0110] Timestamps are applied only to traffic destined for virtual IPs (VIPs) traveling between the transport layer balancer (TLB) and the ingress gateway (GW). This selective approach ensures that the IP option remains hidden from other network devices along its path, protecting it from unnecessary exposure while keeping it optimized. Including timestamps introduces some lookup costs within the kernel, but these costs are minimized by using per-CPU hash maps that enable lock-free lookups. This configuration reduces bottlenecks and allows for efficient latency tracking.

[0111] Within this framework, the design measures packet latency from the host to the guest namespace, via an IPVS load balancer, up to the egress point in the Linux® kernel. The timestamp IP option is then removed at the lowest level of the Linux networking stack within the ingress GW to avoid its appearance in the response to the client. Without this removal step, the IP option would be visible across all devices in the packet's network path, potentially adding latency as some devices would process the IPv4 IP option on a slower path. The timestamp can also increase packet size, risking fragmentation and further delay. To prevent this, the timestamp is removed, particularly in the traffic control (TC) hook within the guest network namespace, which is an efficient choice to avoid the heavy performance cost of attaching the eXpress data path (XDP) to the virtual Ethernet (veth) interface within the guest namespace.

[0112] Further performance optimization is achieved through per-CPU sampling, which reduces the frequency of timestamp updates between the TLB and ingress GW to 1 / 16th. This embodiment significantly reduces the system's processing load without compromising latency tracking accuracy. Through this combination of targeted timestamp use, efficient data processing, and strategic processing points, the design achieves detailed latency measurement while maintaining high network performance.

[0113] Referring to Figure 2D, Figure 2D shows a flowchart relating to an exemplary embodiment of network management in which hop-level visibility is achieved using eBPF, Envoy filters, and header metadata to measure hop-by-hop latency and enable detailed analysis and automated problem solving. During operation, requests to access VIP services are initiated by a client (or synthetic traffic generator). Requests are routed through various components within the Tess cloud mesh, hop-by-hop latency is measured, and metadata is collected to aid in troubleshooting and visibility.

[0114] In Step 201D - Request Processing on TLB: A client request is directed to the virtual IP (VIP) address associated with the service. The request first reaches one of the TLB (Transport Layer Balancer) nodes. Multiple eBPF (extended Berkeley Packet Filter) programs are deployed on the TLB host to monitor the request. These programs are only activated for traffic flowing from the TLB to the ingress GW (Gateway) node, thus minimizing overhead.

[0115] First eBPF Program: In the TLB, an eBPF program attached to the ingress's TC (Traffic Control) hook is invoked. The timestamp IP option is added to incoming SYN packets destined for the TLB VIP. This timestamp records the precise time the packet entered the TLB host. To reduce processing costs, time stamping is limited to one SYN packet per CPU core per second.

[0116] Second eBPF program: A second eBPF program, also attached to the TC hook, measures the time a packet spends in the TLB host kernel from entry to exit, including tunnel creation time. The calculated TLB duration is then written to the timestamp field in the packet's internal IP option, thus providing a TLB-level latency measurement.

[0117] In step 202D - Data inspection and recording at the Ingress GW: After leaving the TLB node, the request is forwarded to the Ingress GW node. At the Ingress GW, a third eBPF program inspects the packet for data tracking. This program stores the packet's metadata, including the source IP, source port, TLB IP address, and TLB processing time. A Berkeley Packet Filter (BPF) map is used for this data. The key consists of the source IP and port of the internal packet, and the value includes the TLB processing time and TLB IP. The network filter within the Envoy proxy retrieves the TLB metadata and stores it in memory so that it can be referenced for additional latency metrics in response processing.
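The key and value layout of the map described above can be illustrated with fixed-width binary records of the kind a BPF map stores. The field widths (32-bit IPv4 address, 16-bit port, 32-bit duration), the network byte order, and the function names are assumptions; this description specifies only which fields form the key and which form the value.

```python
import ipaddress
import struct
from typing import Tuple


def pack_key(src_ip: str, src_port: int) -> bytes:
    """Key: source IP and source port of the internal packet."""
    return struct.pack("!IH", int(ipaddress.IPv4Address(src_ip)), src_port)


def pack_value(tlb_duration: int, tlb_ip: str) -> bytes:
    """Value: TLB processing time and TLB IP address."""
    return struct.pack("!II", tlb_duration, int(ipaddress.IPv4Address(tlb_ip)))


def unpack_value(raw: bytes) -> Tuple[int, str]:
    """Decode a value record back into (TLB duration, TLB IP)."""
    tlb_duration, tlb_ip = struct.unpack("!II", raw)
    return tlb_duration, str(ipaddress.IPv4Address(tlb_ip))
```

A dictionary keyed by pack_key output then behaves like the hash map the network filter reads from: the filter looks up the connection's source IP and port and decodes the stored TLB metadata.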

[0118] In step 203D - Latency measurement at the Envoy proxy: The request is processed by the App GW Envoy proxy within the service's application pod, where further latency information is measured. The App GW Envoy adds an HTTP header (X-CORP-MESH-PROXY-DURATION) to capture the time it takes for the endpoint service to process the request. Another HTTP header (X-CORP-MESH-PROXY-POD) is added to indicate the IP address or fully qualified domain name (FQDN) of the App GW Envoy proxy for traceability.

[0119] In step 204D - Custom response filter in Ingress GW: In the Ingress GW, the App GW Envoy response filter records the request duration. The filter uses TLB metadata stored in memory to collect and add the following HTTP headers to the response.

[0120] X-CORP-MESH-INGRSS-GW-DURATION: Total duration within the Ingress GW. X-CORP-MESH-INGRSS-GW-POD: The IP address or FQDN of the Ingress Gateway Pod.

[0121] X-CORP-MESH-TLB-HOST: Source IP address from the TLB. X-CORP-MESH-TLB-DURATION: Processing time in TLB. To prevent memory leaks, the collected data is periodically aged out, with older entries stored in the BPF map being removed.
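The assembly of the four response headers listed above can be sketched as follows. This is an illustrative model rather than Envoy filter code. The function names, the millisecond unit, and the choice to leave already-present headers untouched are assumptions.

```python
from typing import Dict


def build_tracking_headers(gw_duration_ms: float, gw_pod: str,
                           tlb_host: str, tlb_duration_ms: float) -> Dict[str, str]:
    """Build the Ingress GW tracking headers from measured and stored values."""
    return {
        "X-CORP-MESH-INGRSS-GW-DURATION": str(gw_duration_ms),
        "X-CORP-MESH-INGRSS-GW-POD": gw_pod,
        "X-CORP-MESH-TLB-HOST": tlb_host,
        "X-CORP-MESH-TLB-DURATION": str(tlb_duration_ms),
    }


def add_to_response(response_headers: Dict[str, str],
                    tracking: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the response headers with tracking headers added."""
    merged = dict(response_headers)
    for name, value in tracking.items():
        merged.setdefault(name, value)  # do not overwrite an existing header
    return merged
```

Headers set upstream, such as X-CORP-MESH-PROXY-DURATION from the application pod's proxy, pass through unchanged in this sketch.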

[0122] In step 205D - Client-side response processing: Upon receiving a response, the client extracts and logs the relevant headers and calculates the latency per hop. X-CORP-MESH-PROXY-DURATION: Time spent by the service endpoint.

[0123] X-CORP-MESH-INGRSS-GW-DURATION: Time from Ingress Gateway to service. X-CORP-MESH-INGRSS-GW-POD, X-CORP-MESH-PROXY-POD, X-CORP-MESH-TLB-HOST, and X-CORP-MESH-TLB-DURATION for tracing purposes.

[0124] The latency per hop is calculated by analyzing these headers, which reveals the latency of each hop and any delay along the path. In Step 206D - Visualization and Tool-Driven Analysis: Recorded metrics are aggregated, for example, into a JSON document.

[0125] [Table 2]

[0126] This JSON document can be parsed by an AI tool, which then combines it with kernel-level metrics to identify bottlenecks or resource constraints. When a resource exhaustion or latency spike is detected within the mesh, the tool (e.g., an AI tool) can trigger an automated remediation action via the VIP scheduler, ensuring high availability and optimized performance.

[0127] Referring to Figure 3, Figure 3 shows an end-to-end embodiment of a network management engine that provides several key capabilities to improve network performance and visibility. One of its main features is end-to-end latency tracking, which adds a timestamp option to the IP header, enabling precise measurement of each segment in the packet's path. This capability ensures that every hop can be accurately monitored for latency. Another key feature is optimized sampling and filtering. By using per-CPU sampling, the network management engine effectively reduces the performance impact associated with high-rate packet flows, ensuring that network efficiency is maintained even under heavy traffic conditions.

[0128] In addition, custom response filtering and analysis are facilitated through Envoy network filters at each hop. These filters enhance packet headers with latency data, enabling hop-by-hop latency analysis, which is particularly useful for applications. The network management engine continuously logs metrics to a time-series database and utilizes a fault analyzer to identify network problems in near real-time, issue alerts, and enable rapid intervention when problems occur.

[0129] During operation, client 310 sends a request (request 1) to TLB 320. Upon arrival at TLB 320, the network management engine marks each packet with an IP option timestamp. This timestamp acts as a fundamental latency metric that packets will follow throughout each step of the route, providing an accurate view of the time intervals between network nodes. The timestamp may be applied only to SYN packets destined for the VIP, which minimizes performance overhead.

[0130] When the request (request 2) reaches the Ingress GW 330, additional latency data is captured. Here, the network management engine applies an eBPF (extended Berkeley Packet Filter) program in the egress to measure latency and packet transit time across the kernel of the Ingress GW 330. The Ingress GW 330 also includes a custom Envoy network filter that captures relevant metadata such as TLB processing time, TLB host IP, and client source IP. To effectively manage the data and prevent memory overload, the Ingress GW 330 implements per-CPU sampling techniques to ensure that only a controlled rate of SYN packets per core per second is processed, minimizing processing overhead. The Ingress GW 330 uses App GW Envoy (not shown) to add a tracking header to the packets and capture hop-specific latency metrics visible to client 310 upon reception.

[0131] In application pod 340, the packet reaches server Envoy proxy 342 (request 3), and the network management engine enables a custom response filter. This filter is configured to include important latency metrics such as X-CORP-MESH-TLB-DURATION and X-CORP-MESH-INGRSS-GW-DURATION in the outgoing response header. These headers enable precise tracking of the path of each packet returning from TLB 320 through ingress GW 330. Server Envoy proxy 342 then forwards the packet to server application 344 (request 4), where the packet is processed and a response (response 5) is sent back to server Envoy proxy 342 for the reverse path. In the return process, server Envoy proxy 342 sends a response back to ingress GW 330 via application pod 340 (response 6), and then sends a response (response 7) to client 310.

[0132] Upon reception at client 310, the enhanced response header provides the data necessary to calculate the latency per hop. Using these data points, the client generates a detailed analytical path graph that maps each network hop, showing the latency between client 310, TLB 320, ingress GW 330, application pod 340, server Envoy proxy 342, and server application 344. This path graph is logged to log 350 for historical analysis.
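The path graph generated by the client can be modeled as an ordered list of nodes with reported durations plus the edges between consecutive hops. The node identifiers, the function name build_path_graph, and the millisecond unit are assumptions. Because this description does not state how the reported durations nest within one another, the sketch attaches each duration to its node as reported and performs no subtraction.

```python
from typing import Dict, List

# Hop order taken from the Figure 3 walk-through; identifiers are illustrative.
PATH_NODES: List[str] = [
    "client", "tlb", "ingress_gw",
    "application_pod", "server_envoy_proxy", "server_application",
]


def build_path_graph(durations_ms: Dict[str, float]) -> Dict[str, list]:
    """Return nodes annotated with reported durations and hop-to-hop edges."""
    nodes = [{"id": name, "duration_ms": durations_ms.get(name)}
             for name in PATH_NODES]
    edges = [{"from": a, "to": b}
             for a, b in zip(PATH_NODES, PATH_NODES[1:])]
    return {"nodes": nodes, "edges": edges}
```

Nodes for which no header reported a duration carry a null value, which keeps the path shape intact for logging even when instrumentation is partial.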

[0133] Throughout each interaction, the TLB 320 and Ingress GW 330 continuously send traffic metrics to a time-series database (TSDB 360), where network management engine metrics regarding packet transit rate, hop-by-hop latency, and overall network performance are stored. This TSDB ensures accurate, timestamped records that enable long-term trend analysis.

[0134] The fault analyzer 370 operates as part of the network management engine's pre-monitoring component. It receives data streams from both log 350 and TSDB 360 and inspects them for signs of latency deviations, packet loss, or other indicators of network degradation. When an anomaly is detected, the fault analyzer 370 generates an alert 380, enabling rapid troubleshooting and adjustments to routing paths or load balancing configurations.
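One simple way to detect a latency deviation is to compare each sample against the mean and standard deviation of the preceding window. The sketch below uses that rule. The detection method, the window size, the multiplier k, and the floor on the standard deviation are all assumptions; this description does not specify the fault analyzer's algorithm.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def find_latency_anomalies(samples_ms: Sequence[float], window: int = 5,
                           k: float = 3.0, min_sigma_ms: float = 1.0) -> List[int]:
    """Return indices whose latency exceeds mean + k * sigma of the prior window."""
    anomalies: List[int] = []
    for i in range(window, len(samples_ms)):
        baseline = samples_ms[i - window:i]
        # Floor sigma so a perfectly flat baseline does not flag tiny jitter.
        sigma = max(pstdev(baseline), min_sigma_ms)
        if samples_ms[i] > mean(baseline) + k * sigma:
            anomalies.append(i)
    return anomalies
```

Each returned index could then be turned into an alert; a spike that enters the baseline window raises the threshold for the samples that follow it, which limits repeated alerts for one event.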

[0135] Embodiments of the technical solution can be described, for example, with reference to Figures 2A, 2B, 2C, 2D, and 3, where Figure 2A is a block diagram of an exemplary technical solution environment, based on the exemplary environment described with reference to Figures 6, 7, and 8, for use in implementing embodiments of the technical solution shown. Generally, the technical solution environment includes a technical solution system suitable for providing an exemplary cloud computing system 100 in which the method of this disclosure may be employed. In particular, Figure 2A shows a high-level architecture of the cloud computing system 100 according to embodiments of this disclosure. Among other engines, managers, generators, selectors, or components (collectively referred to herein as "components") not shown, the cloud computing system 100 in Figure 2 corresponds to Figure 1.

[0136] Exemplary Methods: Referring to Figures 4A, 4B, 4C, 5A, 5B, and 5C, flowcharts illustrating methods for providing network management using a network management engine are shown. The methods may be performed using a cloud computing system as described herein. In embodiments, one or more computer storage media embody computer-executable or computer-usable instructions that, when executed by one or more processors, cause the one or more processors to execute a method (e.g., a computer-implemented method) in a cloud computing system (e.g., a computerized system or computer system).

[0137] Referring to Figure 4A, a flowchart is provided showing method 400A for providing network management using a network management engine. In block 402A, a first packet interceptor adds a marker to the packet that enables the calculation of packet latency. In block 404A, a second packet interceptor uses the marker to calculate the TLB duration, which indicates the packet latency. In block 406A, a third packet interceptor inspects the packet for the TLB duration. In block 408A, the network packet management extension engine stores the TLB duration.

[0138] Referring to Figure 4B, a flowchart is provided showing method 400B for providing network management using a network management engine. In block 402B, the network packet management extension engine tracks a packet latency metric using transmitted packets; in block 404B, it adds markers to the packets that enable the calculation of packet latency; in block 406B, it calculates the TLB duration, which indicates the packet latency associated with the TLB; and in block 408B, it inspects the tunneled packets for packet latency data.

[0139] Referring to Figure 4C, a flowchart is provided showing method 400C for providing network management using a network management engine. In block 402C, the network packet management extension engine generates transport layer balancer (TLB) traffic metrics associated with the first and second packet interceptors; in block 404C, it generates ingress gateway (Ingress GW) traffic metrics associated with the third packet interceptor; in block 406C, it stores the TLB traffic metrics and Ingress GW traffic metrics in a time-series database (TSDB); and in block 408C, it sends the TLB traffic metrics and Ingress GW traffic metrics from the TSDB to a fault analyzer to cause the fault analyzer to generate alerts.

[0140] Referring to Figure 5A, a flowchart is provided illustrating method 500A for providing network management using a network management engine. In block 502A, the client communicates packets associated with the application gateway and the network packet management extension engine; in block 504A, the client receives a response packet associated with the packet based on the communication of the packet; in block 506A, the client extracts packet latency data associated with the response packet; and in block 508A, the client transmits the packet latency data.

[0141] Referring to Figure 5B, a flowchart is provided illustrating method 500B for providing network management using a network management engine. In block 502B, the fault analyzer accesses network performance data associated with the network, which includes packet latency data generated using multiple packet interceptors associated with the network packet management extension engine; in block 504B, it identifies bandwidth utilization associated with a predetermined threshold of the network; and in block 506B, it sends alerts associated with the bandwidth utilization.
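The threshold check in blocks 504B and 506B can be sketched as a comparison of per-node bandwidth utilization against a predetermined threshold. The function name, the representation of utilization as a fraction between 0 and 1, the 0.8 default, and the alert text are assumptions for illustration.

```python
from typing import List, Mapping


def bandwidth_alerts(utilization: Mapping[str, float],
                     threshold: float = 0.8) -> List[str]:
    """Return one alert message per node whose utilization exceeds the threshold."""
    return [
        f"node {node}: utilization {value:.0%} exceeds threshold {threshold:.0%}"
        for node, value in sorted(utilization.items())
        if value > threshold
    ]
```

Sorting by node name makes the alert order deterministic, which simplifies deduplication when the check runs repeatedly over the same nodes.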

[0142] Referring to Figure 5C, a flowchart is provided illustrating method 500C for providing network management using a network management engine. In block 502C, the network filter accesses packet latency data associated with the network packet management extension engine, in block 504C, uses the packet latency data to update the response packet for the packet associated with the client, and in block 506C, sends the response packet to the client.

[0143] Network Packet Management Extension Engine: The network packet management extension engine presented in this solution addresses the complex challenge of calculating and quantifying packet latency in real time across diverse network topologies. Unlike conventional systems, this engine is designed to capture precise packet-level latency metrics by using a set of strategically distributed packet interceptors across the network. These interceptors are located at critical junctions, namely transport layer balancers (TLBs) and ingress gateways (Ingress GWs), enabling them to mark, inspect, and process packet data in a way that reveals granular insights into the latency of each packet as it traverses the network.

[0144] Upon entering the network, packets first encounter a packet interceptor located in the TLB's traffic control hook. At this point, the first packet interceptor adds a timestamp marker to each packet. This marker serves as a crucial reference for calculating latency and captures the arrival time of the packet at the TLB host. By embedding this incoming timestamp, the engine establishes an initial data point for latency calculations and sets the basis for precise tracking of packets as they travel through the network.

[0145] As the packet progresses, it reaches a second packet interceptor located on the egress network interface of the TLB's traffic control hook. This second interceptor is tasked with calculating the TLB duration, a measure of latency particularly relevant to the TLB's internal processing time. The TLB duration is derived by calculating the difference between the packet's arrival timestamp (recorded by the first interceptor) and the current time the packet reaches the second interceptor. This calculated TLB duration replaces the original timestamp in the packet's data. By including this duration, the system can accurately quantify the latency occurring during TLB processing and capture any delays that may arise from load balancing, routing, or other TLB functions.

[0146] Next, the packet proceeds to a third packet interceptor located at the ingress gateway, where the engine inspects the packet and extracts latency data, including TLB duration. At this point, the packet's latency data is carefully analyzed and cataloged. The extracted latency information, including the internal packet source IP, port, tunnel source IP, and TLB duration, is stored in a map structure. This data structure is organized into key-value pairs, where the key represents a unique combination of the internal packet source IP and port, and the value stores the tunnel source IP and TLB duration. By structuring the data in this way, the engine achieves an efficient organization for retrieving latency metrics, enabling both rapid analysis and minimal storage overhead.

[0147] A network filter (or network traffic filter) oversees the storage and management of packet latency data within the map. This filter operates to update the latency information associated with each packet as needed, ensuring that response packets sent back to the client carry accurate and up-to-date latency metrics. Through this update mechanism, the filter enables the engine to reflect the latest network conditions and packet transit times, resulting in a responsive and adaptive latency tracking system.

[0148] To support critical synchronization packet processing during connection establishment, the engine restricts latency tracking to synchronization packets. This selective approach minimizes unnecessary processing overhead while maintaining high accuracy in latency calculations for new connections. By focusing on synchronization packets, the engine optimizes resource usage and improves overall network performance without compromising latency visibility.

[0149] The integration of the tunnel engine between the TLB and the ingress gateway further enhances its effectiveness in tracking latency. As packets travel through this tunnel, marker timestamps provide a means to calculate tunnel-related latency and capture the total transmission time of packets across this path. In addition, the tunnel structure accommodates unique latency data for each packet, preventing loss of latency metrics even in high-throughput environments.

[0150] Latency data for each packet is stored and processed in the network's time-series database, making it accessible for detailed analysis and fault detection. By maintaining a continuous record of latency information, the time-series database enables long-term trend analysis and historical performance monitoring. Furthermore, this data is fed into a fault analyzer that leverages latency trends and inconsistencies to detect potential network failures. Through real-time latency tracking and historical data analysis, the fault analyzer can identify deviations from expected network behavior and issue alerts to help network administrators proactively address potential problems.

[0151] This embodiment of the Network Packet Management Extension Engine provides a robust and scalable solution for accurately tracking and quantifying packet latency. By strategically positioning packet interceptors to timestamp, compute, and store latency data, the engine achieves a high level of visibility into packet flow, enabling real-time insights into network performance and latency characteristics across complex topologies. Network filters, time-series databases, and fault analyzer components ensure that latency data is not only tracked but also stored and analyzed for the proactive identification of network problems, representing a substantial advance in network latency monitoring and management.

[0152] Network Performance Management Engine: The network performance management engine described in this technical solution provides an advanced, highly responsive framework for identifying and addressing deviations in network behavior, including packet latency across complex topologies. This enables both active clients and synthetic traffic generators to produce packets that reveal fine-grained latency metrics. This allows network operators to dynamically respond to shifting network conditions and detect anomalies across node-level metrics.

[0153] Within the core of the network performance management engine, a series of strategically placed packet interceptors, forming part of the network packet management extension engine, operate within the network to capture packet latency metrics at specific intervals. A first packet interceptor, located at the traffic control hook of the transport layer balancer (TLB), timestamps each incoming packet. This timestamp, added as a placeholder within the packet, captures the exact moment the packet enters the TLB and forms the basis for calculating packet-specific latency metrics. As the packet progresses, a second packet interceptor calculates TLB duration by comparing the incoming timestamp to the current time and measuring the latency occurring during packet processing within the TLB. This calculated TLB duration is then embedded within the packet, enabling accurate latency tracking and ensuring consistency across different paths and traffic flows. A third packet interceptor, located at the application gateway's ingress gateway, inspects the packet for any additional latency data before forwarding it to its destination, thereby ensuring that the latency metrics remain comprehensive and up-to-date.

[0154] The network performance management engine's network filter enables the storage and management of latency data via a map structure, and latency metrics are efficiently organized using a unique key derived from the internal source IP, port, and associated TLB duration of each packet. This approach allows the network performance management engine to accurately update and manage packet latency metrics, supporting real-time response packet updates based on evolving network conditions. As each client packet traverses the network, the network filter updates the corresponding response packet with relevant latency data extracted from the map, providing a complete latency profile when returned to the client. This process enables continuous monitoring of network performance at the packet level, revealing fine-grained latency information crucial for real-time anomaly detection and resolution.

[0155] In addition, the network performance management engine's management includes generating and updating dynamic visual representations of routing analysis graphs, response packets, and packet latency data associated with their routes. These graphs provide detailed insights into latency across different network segments, enabling network administrators to observe latency patterns, identify deviations from normal performance, and take preventative action as needed. By leveraging these visual analyses, administrators can not only see the overall network performance but also pinpoint specific areas where latency exceeds acceptable thresholds. This graphical data is further complemented by node-level traffic metrics that reflect bandwidth utilization across different network nodes. The network performance management engine uses this data to set predetermined bandwidth thresholds, and when these are exceeded, alerts are triggered to notify administrators of potential network congestion or performance degradation, enabling rapid intervention.

[0156] When packet latency data reveals suboptimal path performance, the network performance management engine dynamically selects alternative paths to optimize packet delivery. By using latency data generated from packet interceptors, the network performance management engine identifies potential alternative paths that reduce latency and improve network performance. This dynamic rerouting mechanism not only ensures optimal path selection but also helps prevent network congestion, improving the overall efficiency and responsiveness of the network in real time.
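The path selection described above can be sketched as choosing the candidate path with the lowest total measured latency and switching only when the improvement is large enough. The function name, the use of summed per-hop latency as the selection metric, and the min_gain_ms margin are assumptions; this description does not specify the selection rule.

```python
from typing import Mapping, Sequence


def select_path(path_hop_latencies_ms: Mapping[str, Sequence[float]],
                current: str, min_gain_ms: float = 0.0) -> str:
    """Return the path to use, switching only if the gain exceeds min_gain_ms."""
    totals = {name: sum(hops) for name, hops in path_hop_latencies_ms.items()}
    best = min(totals, key=totals.get)
    if totals[current] - totals[best] > min_gain_ms:
        return best
    return current
```

A nonzero margin avoids flapping between two paths whose measured latencies differ only by noise.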

[0157] The network performance management engine is designed to seamlessly adapt to both active network environments and synthetic traffic conditions. By supporting synthetic traffic generators, the network performance management engine simulates various network patterns and conditions, tests the network's response to dynamic traffic flows, and identifies latency issues before they impact live traffic. This allows administrators to evaluate network performance under diverse conditions, providing valuable insights into how the network infrastructure responds to variable loads and identifying potential latency bottlenecks that might otherwise go undetected.

[0158] This technical solution provides a framework for real-time packet-level network management through its integrated packet interceptor, traffic filter, and performance monitoring capabilities. By capturing latency data at critical points in the network, dynamically updating response packets, and generating detailed routing analysis, the engine facilitates comprehensive monitoring of packet flow, enabling proactive anomaly detection and efficient network optimization across dynamic and complex topologies.

[0159] Technical Improvements: Embodiments of the present invention have been described with reference to several inventive features (e.g., operation, system, engine, and components) related to an item listing system. The inventive features described include the operation, interface, data structure, and configuration of computing resources associated with providing the functions described herein with respect to a network management engine associated with a cloud computing system.

[0160] Embodiments of the present invention relate to the field of computing, and more specifically to artificial intelligence systems. The exemplary embodiments described below provide, among other things, systems, methods, and program products for performing operations that provide network management. Thus, these embodiments improve the technical field of cloud mesh network technology by providing improved network management. For example, the network management engine enhances monitoring and performance management across complex network topologies. Packet latency can be calculated and quantified in real time, enabling accurate identification of deviations from expected network behavior. The network management engine captures granular traffic metrics that provide a detailed view of network performance at the packet level, without relying solely on sampled data. By leveraging advanced telemetry and real-time analytics, it supports a proactive approach to performance monitoring and detects subtle anomalies that conventional systems might miss. In addition, the network management engine enables correlation between node-level traffic metrics and observed network behavior, which helps improve troubleshooting accuracy and accelerates responses to performance issues across dynamically routed network paths. Through these capabilities, network visibility is enhanced, enabling more efficient resource utilization and optimized network operation.

[0161] The functions of the embodiments of the present invention are further described by embodiments and illustrative examples to demonstrate the operation of providing network management using a network management engine in a cloud computing system as a solution to specific problems in cloud mesh network technology, in order to improve computing operations in a cloud computing system.

Additional Support for the Detailed Description of the Invention

Exemplary Item Listing System Environment

[0162] Figure 6 shows a computing environment for an exemplary item listing system 600 in which embodiments of the present disclosure may be employed. In particular, Figure 6 shows a high-level architecture of an exemplary item listing platform 610 that may host a technical solution environment or a part thereof. It should be understood that this and other configurations described herein are provided as examples. For example, as stated above, many of the elements described herein may be implemented as individual components or distributed components, or together with other components, in any appropriate combination and location. Other configurations and elements (e.g., machines, interfaces, functions, sequences, and groupings of functions) may be used in addition to or instead of those shown.

[0163] The item listing system 600 may be a cloud computing environment that provides computing resources for functions associated with the item listing platform 610. For example, the item listing system 600 supports the delivery of computing components and services, including servers, storage, databases, networking, applications, and machine learning, associated with the item listing platform 610 and client devices 620. Multiple client devices (e.g., client devices 620) include hardware or software that accesses resources on the item listing system 600. Client devices 620 may include applications (e.g., client application 622) and interface data (e.g., client application interface data 624) that support client-side functions associated with the item listing system. Multiple client devices can access the computing components of the item listing system 600 via a network (e.g., network 630) to perform computing operations.

[0164] The Item Listing Platform 610 is responsible for providing a computing environment or architecture that includes infrastructure to support the provision of Item Listing Platform functions (e.g., e-commerce functions). The Item Listing Platform supports storing items in an item database and providing a search system for receiving queries and identifying search results based on those queries. The Item Listing Platform may also provide a computing environment with features for managing, selling, buying, and recommending different types of items. Specifically, the Item Listing Platform 610 may be for a content platform such as the eBay Content Platform or e-commerce platform developed by eBay Inc. in San Jose, California.

[0165] The item listing platform 610 can provide item listing operations 630 and item listing interfaces 640. The item listing operations 630 may include service operations, communication operations, resource management operations, security operations, and fault tolerance operations that support specific tasks or functions within the item listing platform 610. The item listing interfaces 640 may include service interfaces, communication interfaces, resource interfaces, security interfaces, and management and monitoring interfaces that support functions between item listing platform components. The item listing operations 630 and item listing interfaces 640 can enable communication, coordination, and seamless functionality of the item listing system 600.

[0166] For example, the functions associated with the Item Listing Platform 610 may include shopping operations (e.g., product search and browsing, product selection and shopping cart, checkout and payment, and order tracking), user account operations (e.g., user registration and authentication, and user profiles), seller and product management operations (e.g., seller registration, product listing, and inventory management), payment and financial operations (e.g., payment processing, refunds, and reimbursements), order fulfillment operations (e.g., order processing and fulfillment, and inventory management), customer support and communication interfaces (e.g., customer support chat / email and notifications), security and privacy interfaces (e.g., authentication and authorization, payment security), recommendation and personalization interfaces (e.g., product recommendations, and customer reviews and ratings), analytics and reporting interfaces (e.g., sales and inventory reporting, and user behavior analysis), and API and integration interfaces (e.g., APIs for third-party integrations).

[0167] The item listing platform 610 can provide an item listing platform database (e.g., item listing platform database 650) for efficiently managing and storing different types of data. The item listing platform database 650 may include relational databases, NoSQL databases, search databases, cache databases, content management systems, analytics databases, payment gateway databases, customer relationship management databases, log and error databases, inventory and supply chain databases, and multi-channel databases used in combination to efficiently manage data and provide users with an e-commerce experience.

[0168] The item listing platform 610 supports applications (e.g., application 660), which are computer programs, software components, or services that serve specific functions or sets of functions to meet specific item listing platform requirements or user requirements. Applications can be client-side (user-facing) or server-side (backend). Applications can also include applications that have no AI support (e.g., application 662), applications supported by conventional AI models (e.g., application 664), and applications supported by generative AI models (e.g., application 666). As an example, applications can include online point-of-sale applications, mobile shopping apps, administrator and management consoles, payment gateway integrations, user account and authentication applications, search and recommendation engines, inventory management applications, order processing and fulfillment applications, customer support and communication tools, content management systems, analytics and reporting applications, marketing and promotion applications, multi-channel integration applications, log and error tracking applications, customer relationship management (CRM) applications, security applications, and APIs and web services used in combination to efficiently provide users with an e-commerce experience.

[0169] The item listing platform 610 may include a machine learning engine (e.g., machine learning engine 670). Machine learning engine 670 refers to a machine learning framework or platform that provides the infrastructure and tools for designing, training, evaluating, and deploying machine learning models. Machine learning engine 670 can function as the backbone for developing and deploying machine learning applications and solutions. Machine learning engine 670 can also provide tools for visualizing data and model results, as well as tools for interpreting model decisions to gain insights into how the models are making predictions.

[0170] The Machine Learning Engine 670 can provide the libraries, algorithms, and utilities necessary to perform various tasks within a machine learning workflow. A machine learning workflow can include data processing, model selection, model training, model evaluation, hyperparameter tuning, scalability, model deployment, inference, integration, customization, and data visualization. The Machine Learning Engine 670 can include pre-trained models for various tasks, simplifying the development process. In this way, the Machine Learning Engine 670 can streamline the entire machine learning process, from data preparation and model training to deployment and inference, thereby making it accessible and efficient for different types of users working in a wide range of machine learning applications (e.g., customers, data scientists, machine learning engineers, and developers).

[0171] The machine learning engine 670 can be implemented within the item listing system 600 as a component that leverages machine learning algorithms and techniques (e.g., machine learning algorithm 672) to enhance various aspects of the item listing system's functionality. The machine learning engine 670 can provide a selection of machine learning algorithms and techniques used to teach a computer to learn from data and make predictions or decisions without explicit programming. These techniques are widely used in a variety of applications across different industries and include, for example: supervised learning (e.g., linear regression, classification, and support vector machines (SVM)), unsupervised learning (e.g., clustering, principal component analysis (PCA), and association rules (e.g., Apriori)), reinforcement learning (e.g., Q-learning and deep Q-networks (DQN)), deep learning (e.g., neural networks, convolutional neural networks (CNN), and recurrent neural networks (RNN)), and ensemble learning (e.g., random forests).

[0172] The machine learning training data 120 supports the process of building, training, and fine-tuning a machine learning model. The machine learning training data 120 consists of labeled datasets used to teach a machine learning model to recognize patterns, make predictions, or perform specific tasks. The training data typically includes two main components: input features (X) and labels or target values (Y). Input features can include variables, attributes, or characteristics used as input to the machine learning model. Depending on the nature of the problem, input features (X) may be numerical, categorical, or textual. For example, in a model for predicting house prices, input features might include the number of bedrooms, square footage, and neighborhood. Labels or target values (Y) contain the values that the model is intended to predict or classify. Labels represent the desired output or ground truth for each corresponding set of input features. For example, in a spam email classifier, labels indicate whether each email is spam or not (i.e., binary classification). The training process involves presenting the model with training data, from which the model learns to make predictions or decisions by identifying patterns and relationships between the input features (X) and the target values (Y). The machine learning algorithm adjusts the model's internal parameters during training to minimize the difference between its predictions and the actual labels in the training data. The machine learning engine 670 can use historical and real-time data to train models, make predictions, and continuously improve performance and user experience.
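As a minimal worked illustration of the training process described in paragraph [0172], the following Python sketch fits a one-feature linear model by gradient descent on the mean squared error. The data values, the learning rate, and the function name `train_linear` are assumptions chosen for the example; the disclosure does not tie the machine learning engine 670 to this or any other specific algorithm.

```python
def train_linear(xs, ys, lr=0.05, epochs=5000):
    """Fit y ~ w * x + b by gradient descent on the mean squared error.

    w and b are the model's internal parameters; each epoch nudges them
    in the direction that reduces the gap between predictions and labels.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical training data: input feature X (number of bedrooms) and
# label Y (price in thousands), constructed so that Y = 50 * X + 100.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [150.0, 200.0, 250.0, 300.0]

w, b = train_linear(xs, ys)
print(round(w, 3), round(b, 3))
```

With these hypothetical values the learned parameters approach the slope and intercept used to construct the labels, which is what minimizing the difference between predictions and labels means in this simple case.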

[0173] The machine learning engine 670 may include machine learning models (e.g., machine learning model 676) generated using the machine learning engine workflow. Machine learning model 676 may include both generative AI models and conventional AI models, both of which may be employed in the item listing system 600. Generative AI models are designed to generate new data, often in the form of text, images, or other media, based on patterns and knowledge learned from existing data. Generative AI models can be employed in a variety of ways, including content generation, product image generation, personalized product recommendations, natural language chatbots, and content summarization. Conventional AI models encompass a wide range of algorithms and techniques and can be used in a variety of ways, including recommendation systems, predictive analytics, search algorithms, fraud detection, customer segmentation, image classification, natural language processing (NLP), and A / B testing and optimization. Often, a combination of both generative and conventional AI models can be used to combine data-driven insights and creativity to deliver a well-rounded and effective e-commerce experience.

[0174] The machine learning engine 670 can be used to analyze data, make predictions, automate processes, and provide a more personalized and efficient shopping experience for users. Examples include product recommendations, search and filtering, pricing optimization, inventory management, customer segmentation, churn prediction and retention, fraud detection, sentiment analysis, customer support and chatbots, image and video analysis, and advertising targeting and marketing. Specific applications of machine learning within the item listing platform 610 may vary depending on specific goals, available data, and resources.

[0175] The item listing system 600 provides item listing system data that informs customer service interactions and therefore works in conjunction with a customer service management system to address any issues or questions arising from item listings. The customer service management system can be a software solution designed to streamline and automate the processing of customer inquiries and support requests across various communication channels. The customer service management system centralizes customer interactions, enabling service teams to efficiently categorize, prioritize, and resolve issues while tracking and managing each case throughout its lifecycle. Integrated tools such as ticketing systems, knowledge bases, and automation features like AI-driven chatbots improve response times, reduce manual effort, and ensure consistent, high-quality customer service. The item listing system and the customer service management system can be integrated to ensure seamless communication and efficient resolution of customer concerns.

Exemplary Distributed Computing System Environment

[0176] Figure 7 shows an exemplary distributed computing environment 700 in which embodiments of the present disclosure may be employed. In particular, Figure 7 shows a high-level architecture of an exemplary cloud computing platform 710 that can host a technical solution environment or a part thereof (e.g., a data trustee environment). It should be understood that this and other configurations described herein are provided for illustrative purposes only. For example, as stated above, many of the elements described herein may be implemented as individual components or distributed components, or together with other components, in any appropriate combination and location. Other configurations and elements (e.g., machines, interfaces, functions, sequences, and groupings of functions) may be used in addition to or instead of those shown.

[0177] A data center can support a distributed computing environment 700, which includes a cloud computing platform 710, racks 720, and nodes 730 (e.g., computing devices, processing units, or blades) within the racks 720. The technical solution environment may be implemented using a cloud computing platform 710 that runs cloud services across different data centers and geographical areas. The cloud computing platform 710 can implement a fabric controller 740 component for provisioning and managing resource allocation, deployment, upgrade, and management of cloud services. Typically, the cloud computing platform 710 functions to store data or run service applications in a distributed manner. The cloud computing platform 710 within the data center may be configured to host and support the operation of endpoints for specific service applications. The cloud computing platform 710 may be a public cloud, a private cloud, or a dedicated cloud.

[0178] Node 730 may be provisioned with host 750 (e.g., an operating system or runtime environment) that runs a defined software stack on node 730. Node 730 may also be configured to perform a dedicated function (e.g., as a compute node or a storage node) within the cloud computing platform 710. Node 730 is assigned to run one or more parts of a tenant's service application. A tenant can refer to a customer that utilizes the resources of the cloud computing platform 710. The service application components of the cloud computing platform 710 that support a particular tenant may be referred to as a multi-tenant infrastructure or tenancy. The terms service application, application, and service are used interchangeably herein and broadly refer to any software, or part of software, that runs on, or accesses storage and compute device locations within, a data center.

[0179] If multiple separate service applications are supported by multiple nodes 730, the multiple nodes 730 may be divided into multiple virtual machines (e.g., virtual machines 752 and 754). Multiple physical machines can also run separate service applications concurrently. Multiple virtual machines or physical machines may be configured as individualized computing environments supported by multiple resources 760 (e.g., hardware resources and software resources) within the cloud computing platform 710. The resources are intended to be configured for specific service applications. Furthermore, each service application may be divided into functional parts so that each functional part can run on a separate virtual machine. The cloud computing platform 710 can use multiple servers to run service applications and perform data storage operations in a cluster. In particular, multiple servers can perform data operations independently but are provided as a single device called a cluster. Each server in the cluster may be implemented as a node.

[0180] The client device 780 may be linked to a service application within the cloud computing platform 710. The client device 780 may be any type of computing device and may correspond, for example, to the computing device 800 described with reference to Figure 8. The client device 780 may be configured to issue commands to the cloud computing platform 710. In embodiments, the client device 780 may communicate with the service application via a virtual Internet Protocol (IP) address and a load balancer, or other means of directing communication requests to a designated endpoint within the cloud computing platform 710. Multiple components of the cloud computing platform 710 may communicate with each other via a network (not shown), which may include, but is not limited to, one or more local area networks (LANs) and / or wide area networks (WANs).

Exemplary Computing Environment

[0181] Having briefly described an overview of embodiments of the present invention, an exemplary operating environment in which embodiments of the present invention can be carried out is described below to provide a general context for various aspects of the present invention. Referring initially to Figure 8 in particular, an exemplary operating environment for carrying out embodiments of the present invention is shown and is designated generally as computing device 800. Computing device 800 is merely an example of a suitable computing environment and is not intended to imply any limitation on the scope of use or functionality of the present invention. Furthermore, computing device 800 should not be construed as having any dependencies or requirements relating to one or a combination of the components shown.

[0182] The present invention can be described in the general context of computer code or machine-usable instructions, including computer-executable instructions such as program modules, which are executed by a computer or other machine such as a portable information terminal or other handheld device. Generally, a program module, which includes routines, programs, objects, components, data structures, etc., refers to code that performs a specific task or implements a specific abstract data type. The present invention can be implemented in a variety of system configurations, including handheld devices, consumer electronics, general-purpose computers, and more specialized computing devices. The present invention can also be implemented in a distributed computing environment in which tasks are performed by remote processing devices linked via a communication network.

[0183] Referring to Figure 8, the computing device 800 includes a bus 810 that directly or indirectly couples the following devices: memory 812, one or more processors 814, one or more presentation components 816, input / output (I / O) ports 818, I / O components 820, and an illustrative power supply 822. Bus 810 represents what may be one or more buses (such as an address bus, a data bus, or a combination thereof). The various blocks of Figure 8 are shown with lines for conceptual clarity, and other arrangements of the described components and / or component functions are also contemplated. For example, a presentation component such as a display device can be considered an I / O component. Also, processors have memory. Recognizing that such is the nature of the art, it is reiterated that the diagram of Figure 8 is merely illustrative of an exemplary computing device that can be used in connection with one or more embodiments of the present invention. No distinction is made between categories such as “workstation,” “server,” “laptop,” and “handheld device,” as all are contemplated within the scope of Figure 8 and are referred to as a “computing device.”

[0184] The computing device 800 typically includes various computer-readable media. Computer-readable media can be any available media that can be accessed by the computing device 800, and include both volatile and non-volatile media, and removable and non-removable media. By way of example, and not limitation, computer-readable media may include computer storage media and communication media.

[0185] Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile disk (DVD) or other optical disk storage devices, magnetic cassettes, magnetic tapes, magnetic disk storage devices or other magnetic storage devices, or any other media that can be used to store desired information and can be accessed by computing device 800. Computer storage media exclude signals per se.

[0186] Communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media. The term “modulated data signal” means a signal having one or more of its characteristics set or modified in a manner that encodes information within the signal. By way of example, and not limitation, communication media include wired media such as wired networks or direct wired connections, as well as wireless media such as acoustic, RF, infrared, and other wireless media. Any combination of the above may also be included within the scope of computer-readable media.

[0187] Memory 812 includes computer storage media in the form of volatile and / or non-volatile memory. Memory may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid-state memory, hard drives, optical disc drives, etc. Computing device 800 includes one or more processors that read data from various entities such as memory 812 or I / O components 820. Presentation component(s) 816 present data indications to a user or other device. Exemplary presentation components include display devices, speakers, printing components, vibration components, etc.

[0188] I / O port 818 allows computing device 800 to be logically coupled to other devices, including I / O components 820, some of which may be integrated. Exemplary components include microphones, joysticks, gamepads, satellite receivers, scanners, printers, and wireless devices.

Additional Structural and Functional Features of Embodiments of the Technical Solution

[0189] While various components used herein have been identified, it should be understood that any number of components and configurations may be employed to achieve the desired functionality within the scope of this disclosure. For example, components in the embodiments shown in the figures are indicated by lines for conceptual clarity. Other configurations of these components and other components may also be implemented. For example, while some components are shown as single components, many of the elements described herein may be implemented as individual components, distributed components, or together with other components in any appropriate combination and location. Some elements may be omitted entirely. Furthermore, various functions described herein as being performed by one or more entities may be performed by hardware, firmware, and / or software, as described below. For example, various functions may be performed by a processor that executes instructions stored in memory. Thus, other configurations and elements (e.g., machines, interfaces, functions, sequences, and groupings of functions) may be used in addition to or instead of those shown.

[0190] The embodiments described in the following paragraphs may be combined with one or more of the specifically described alternatives. In particular, a claimed embodiment may contain a reference, in the alternative, to more than one other embodiment. A claimed embodiment may specify a further limitation of the claimed subject matter.

[0191] The subject matter of embodiments of the present invention is described herein in detail to satisfy legal requirements. However, the description itself is not intended to limit the scope of this patent. Rather, the inventors intend that the claimed subject matter may also be embodied in other ways, in conjunction with other current or future technologies, to include different steps or combinations of steps similar to those described herein. Furthermore, although the terms “step” and / or “block” may be used herein to mean different elements of the method used, these terms should not be construed as implying any particular order among the various steps disclosed herein unless the order of the individual steps is explicitly described.

[0192] For the purposes of this disclosure, the word “includes” has the same broad meaning as the word “comprises,” and the word “access” includes “receive,” “reference,” or “retrieve.” Furthermore, the word “communicate” has the same broad meaning as the words “receive” or “transmit” as facilitated by a software or hardware-based bus, receiver, or transmitter using the communication media described herein. Furthermore, words such as “a” and “an” include the plural as well as the singular unless otherwise indicated. Thus, for example, the constraint of “a feature” is satisfied if one or more features exist. Also, the term “or” includes the conjunctive, the disjunctive, and both (thus, a or b includes either a or b, as well as a and b).

[0193] For the purposes of the above detailed description, embodiments of the present invention are described with reference to a distributed computing environment. However, the distributed computing environment described herein is merely illustrative. Components can be configured to perform novel aspects of the embodiments, and the term “configured for” can mean “programmed” to perform a particular task using code or to implement a particular abstract data type. Furthermore, while embodiments of the present invention can generally refer to the technical solution environments and outlines described herein, it should be understood that the techniques described may be extended to other embodiment contexts.

[0194] The embodiments of the present invention are described in relation to specific embodiments that are intended to be illustrative rather than restrictive in all respects. Alternative embodiments will become apparent to those skilled in the art to which the invention belongs without departing from the scope of the invention.

[0195] From the above, it will be clear that the present invention is well adapted to achieve all of the above-mentioned objectives, along with other advantages that are obvious and inherent to the structure. It will be understood that certain features and subcombinations are useful and can be used without reference to other features or subcombinations. This is contemplated by and is within the scope of the claims.

Claims

1. A computerized system comprising: one or more computer processors; and computer memory storing computer-usable instructions that, when used by the one or more computer processors, cause the one or more computer processors to perform a plurality of operations, the plurality of operations comprising: communicating, by a client, a packet associated with an application gateway and a network packet management extension engine, wherein the network packet management extension engine includes a plurality of packet interceptors that support tracking a plurality of packet latency metrics using a plurality of packets transmitted through the application gateway; receiving, based on communicating the packet, a response packet associated with the packet, wherein the packet is associated with the plurality of packet latency metrics tracked using the plurality of packet interceptors; extracting packet latency data associated with the response packet and the plurality of packet latency metrics; and transmitting the packet latency data.

2. The system according to claim 1, wherein the plurality of operations further comprise generating a hop-by-hop latency associated with the packet using the packet latency data.

3. The system according to claim 1, wherein the plurality of operations further comprise communicating the packet latency data to update a packet latency log associated with a route analysis graph, wherein the route analysis graph is a visual representation of packet latency data associated with a plurality of response packets and a plurality of corresponding routes for the plurality of response packets.

4. The system according to claim 1, wherein the plurality of operations further comprise dynamically selecting an alternative path for a plurality of subsequent packets based on the packet latency data.

5. The system according to claim 1, wherein the network filter updates the response packet based on the packet latency data.

6. The system according to claim 1, wherein the network filter supports lookup and delete operations that can be performed on a map data structure storing the packet latency data in order to support updating the response packets.

7. The system according to claim 1, wherein the client is a synthetic traffic generator that simulates multiple network traffic patterns and multiple conditions for testing and evaluating network performance.

8. One or more computer storage media having computer-executable instructions that, when executed by a computing system having a processor and memory, cause the processor to perform a plurality of operations, the plurality of operations comprising: accessing network performance data associated with a network, wherein the network performance data includes packet latency data generated using a plurality of packet interceptors; determining, based on the packet latency data tracked using the plurality of packet interceptors within the network, that a bandwidth utilization rate has met a preset threshold for the network; and sending an alert associated with the bandwidth utilization rate.

9. The one or more computer storage media according to claim 8, wherein the network performance data includes a routing analysis graph based on a plurality of response packets from one or more clients, the routing analysis graph being a visual representation of packet latency data associated with the plurality of response packets and a plurality of corresponding routes for the plurality of response packets.

10. The one or more computer storage media according to claim 8, wherein the network performance data includes traffic metrics associated with a transport layer balancer (TLB) and an ingress gateway.

11. The one or more computer storage media according to claim 8, wherein the network comprises an application gateway and a transport layer balancer (TLB) operably connected to an ingress gateway via a tunnel.

12. The one or more computer storage media according to claim 8, wherein the plurality of packet interceptors include a first packet interceptor configured to add an incoming time to a timestamp option in a plurality of incoming packets, the first packet interceptor being operably connected to a traffic control hook of a transport layer balancer.

13. The one or more computer storage media according to claim 8, wherein the plurality of packet interceptors include a second packet interceptor configured to calculate a transport layer balancer (TLB) time indicating packet latency, the second packet interceptor being operably connected to an egress network interface of a traffic control hook of a transport layer balancer.

14. The one or more computer storage media according to claim 8, wherein the plurality of packet interceptors include a third packet interceptor configured to inspect tunneled packets for packet latency data, the third packet interceptor being operably connected to an ingress network interface of a traffic control hook of an ingress gateway.

15. The one or more computer storage media according to claim 8, wherein the alert is associated with node-level traffic associated with the network.

16. The one or more computer storage media according to claim 8, wherein determining that the bandwidth utilization rate has met the preset threshold for the network is based on correlating an increase in packet latency with an increase in traffic load on the network using the packet latency data.

17. A computer-implemented method comprising: accessing, within a network filter, a response packet associated with packet latency data determined by a plurality of packet interceptors that support tracking a plurality of packet latency metrics using a plurality of transmitted packets; updating the response packet associated with a client using the packet latency data; and sending the response packet to the client.

18. The method according to claim 17, wherein the network filter supports lookup and delete operations that can be performed on a map data structure storing the packet latency data in order to support the generation of the response packet.

19. The method according to claim 17, wherein the packet latency data is stored in a map data structure associated with keys and values, the key being based on the internal packet source IP and port, and the value being based on the tunnel source IP and transport layer balancer (TLB) time.

20. The method according to claim 17, wherein the network filter is operably connected to a server application and to an ingress gateway of an application gateway, the application gateway having a transport layer balancer (TLB) operably connected to the ingress gateway via a tunnel.
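
For illustration only, the map data structure recited in claims 18 and 19 and the response update recited in claim 17 can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the claimed implementation: the names `LatencyMap`, `LatencyEntry`, and `annotate_response`, the header names `x-tlb-time-us` and `x-tunnel-src-ip`, and the use of microseconds for the TLB time are all hypothetical and do not appear in the specification. An in-process dictionary stands in for whatever shared map the packet interceptors and the network filter would actually use.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Key per claim 19: (inner packet source IP, inner packet source port).
FlowKey = Tuple[str, int]


@dataclass(frozen=True)
class LatencyEntry:
    """Value per claim 19: tunnel source IP and TLB time."""
    tunnel_src_ip: str
    tlb_time_us: int  # assumed unit (microseconds); not stated in the source


class LatencyMap:
    """Hypothetical stand-in for the map storing packet latency data."""

    def __init__(self) -> None:
        self._entries: Dict[FlowKey, LatencyEntry] = {}

    def record(self, key: FlowKey, entry: LatencyEntry) -> None:
        # Written by a packet interceptor once latency has been measured.
        self._entries[key] = entry

    def lookup(self, key: FlowKey) -> Optional[LatencyEntry]:
        # Lookup operation used by the network filter (claim 18).
        return self._entries.get(key)

    def delete(self, key: FlowKey) -> bool:
        # Delete operation used by the network filter (claim 18).
        return self._entries.pop(key, None) is not None


def annotate_response(headers: Dict[str, str], key: FlowKey,
                      latency_map: LatencyMap) -> Dict[str, str]:
    """Return a copy of the response headers updated with latency data.

    The entry is deleted after use so the map does not grow without bound.
    """
    updated = dict(headers)
    entry = latency_map.lookup(key)
    if entry is None:
        return updated  # no latency data recorded for this flow
    latency_map.delete(key)
    updated["x-tlb-time-us"] = str(entry.tlb_time_us)  # hypothetical header
    updated["x-tunnel-src-ip"] = entry.tunnel_src_ip   # hypothetical header
    return updated
```

In this sketch the filter consumes each entry exactly once: a second call for the same flow key finds nothing and returns the headers unchanged, which mirrors the lookup-then-delete pattern the claims describe.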