Web page access control method, electronic device, storage medium, and program product
By associating cached fragments with the current fragment in the gateway device when extracting the Server Name Indication (SNI) field, the method addresses missed interceptions caused by packet fragmentation and out-of-order delivery, helping ensure the integrity and accuracy of the field and improving the efficiency and reliability of access control.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- TP-LINK INT SHENZHEN CO LTD
- Filing Date
- 2026-02-05
- Publication Date
- 2026-06-19
AI Technical Summary
When faced with packet fragmentation or out-of-order delivery, gateway devices may be unable to fully extract and identify the Server Name Indication (SNI) field, potentially leading to missed interceptions in access control.
The SNI field is extracted by associating the cached fragments of the same target message with the current fragment. The integrity and accuracy of the field are ensured by relying on the encapsulation and caching mechanisms of the transport layer protocol.
It improves the accuracy of access control interception, avoids interception omissions caused by incomplete field extraction, reduces system resource consumption, and enhances the adaptability and reliability of web page access control.
Figure CN122247650A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of network security technology, and in particular to a web page access control method, electronic device, computer-readable storage medium, and computer program product.
Background Technology
[0002] When performing web page access control, related technologies typically extract the SNI (Server Name Indication) field from HTTPS traffic at the gateway device to achieve access control. However, gateway devices can usually only analyze single packets in isolation. When complex network situations such as packet fragmentation or out-of-order delivery occur, they often cannot completely extract and identify the SNI field, which may lead to missed interceptions in access control.
Summary of the Invention
[0003] This application provides a web page access control method, an electronic device, a computer-readable storage medium, and a computer program product.
[0004] This application provides a web page access control method, the method being used in a gateway device, the method comprising:
[0005] Obtain the current message segment sent by the target terminal; Based on the received cached message fragments and the current message fragment, the server indication name field of the current message fragment is extracted. The cached message fragment and the current message fragment are fragments of the same target message and are both encapsulated by the transport layer protocol before being transmitted to the gateway device. If the complete server indication name field is extracted, the target operation for the target message corresponding to the current message fragment is determined based on the complete server indication name field and the preset rule base.
[0006] In this way, by associating the cached fragments of the same target message with the current fragment for field extraction, the method addresses a limitation of traditional schemes that analyze each packet independently and therefore cannot handle packet fragmentation and out-of-order delivery. To a certain extent, this ensures complete extraction of the Server Name Indication (SNI) field, improves the accuracy of access control interception, and avoids missed interceptions caused by incomplete field extraction.
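As an illustrative sketch only, and not part of the application's disclosure, the following Python shows how a gateway could tell a complete SNI field from a truncated one when parsing a TLS ClientHello. The names `Reader`, `Incomplete`, `extract_sni`, and `build_client_hello` are hypothetical, and the sketch assumes the ClientHello fits in a single TLS record.

```python
import struct
from typing import Optional


class Incomplete(Exception):
    """Raised when the buffered bytes end before the field being read."""


class Reader:
    """Cursor over a byte buffer that refuses to read past the end."""

    def __init__(self, data: bytes) -> None:
        self.data, self.pos = data, 0

    def take(self, n: int) -> bytes:
        if self.pos + n > len(self.data):
            raise Incomplete
        chunk = self.data[self.pos:self.pos + n]
        self.pos += n
        return chunk

    def u8(self) -> int:
        return self.take(1)[0]

    def u16(self) -> int:
        return struct.unpack("!H", self.take(2))[0]


def extract_sni(stream: bytes) -> Optional[str]:
    """Return the SNI host name, None if the hello has no SNI extension,
    or raise Incomplete if more bytes are needed."""
    r = Reader(stream)
    if r.u8() != 0x16:               # TLS record type: handshake
        raise ValueError("not a TLS handshake record")
    r.take(2)                        # record-layer version
    r.u16()                          # record length
    if r.u8() != 0x01:               # handshake type: ClientHello
        raise ValueError("not a ClientHello")
    r.take(3)                        # handshake length (24-bit)
    r.take(2 + 32)                   # client version + random
    r.take(r.u8())                   # session id
    r.take(r.u16())                  # cipher suites
    r.take(r.u8())                   # compression methods
    ext_len = r.u16()
    ext_end = r.pos + ext_len
    while r.pos < ext_end:
        ext_type, ext_size = r.u16(), r.u16()
        if ext_type != 0:            # 0 is the server_name extension
            r.take(ext_size)
            continue
        r.u16()                      # server_name_list length
        r.u8()                       # name_type (0 = host_name)
        return r.take(r.u16()).decode("ascii")
    return None


def build_client_hello(host: str) -> bytes:
    """Build a minimal ClientHello carrying only an SNI extension (test helper)."""
    name = host.encode("ascii")
    sni = struct.pack("!HBH", len(name) + 3, 0, len(name)) + name
    ext = struct.pack("!HH", 0, len(sni)) + sni
    body = (b"\x03\x03" + bytes(32) + b"\x00"      # version, random, empty session id
            + struct.pack("!H", 2) + b"\x13\x01"   # one cipher suite
            + b"\x01\x00"                          # one (null) compression method
            + struct.pack("!H", len(ext)) + ext)
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake
```

Calling `extract_sni` on only the first fragment of a split ClientHello raises `Incomplete`, which is the signal to cache the bytes and wait for the next fragment.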
[0007] In some implementations, the step of extracting the server indication name field from the current packet fragment based on the received cached packet fragment and the current packet fragment includes: The server indication name field is extracted from the current message segment to determine the first server indication name field segment; The first server indication name field fragment and the second server indication name field fragment are combined, wherein the second server indication name field fragment is determined by extracting the server indication name field from the cached message fragment.
[0008] In this way, by extracting and recombining fragments, the problem of the server indication name field being split into multiple message fragments is effectively solved. This ensures that no matter how the server indication name field is split, the complete server indication name field content can be obtained through combination processing. To a certain extent, this improves the success rate and completeness of server indication name field extraction, providing a reliable guarantee for accurate comparison with the preset rule base. This enhances the accuracy of web page access control policy execution to a certain extent and avoids misjudgment or omission caused by incomplete extraction of the server indication name field.
[0009] In some embodiments, the method further includes: If the result of the combined processing is an incomplete server indication name field, the result of the combined processing is cached, and the sequence number of the last byte in the current message segment is recorded, wherein the sequence number is the sequence position of the last byte in the current message segment in the target message.
[0010] In this way, by caching incomplete combination results, the repetitive operation of re-extracting existing field content after subsequent fragments arrive is avoided, reducing redundant calculations and reducing the system resource consumption of the gateway device to a certain extent. Furthermore, by recording the sequence number of the last byte of the current packet fragment, a precise location basis is provided for subsequent field completion, ensuring that the missing part can be quickly located after subsequent fragments arrive, achieving accurate field completion. This guarantees the continuity and integrity of the server-indicated name field extraction, thereby effectively solving the problem of difficult field extraction caused by packet fragmentation delay and out-of-order delivery to a certain extent, and enhancing the adaptability and reliability of web access control in complex network environments.
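The caching behaviour described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the application's implementation: `FlowCache`, `toy_extract`, and the newline-terminated toy field are hypothetical, and sequence-number wraparound and overlapping retransmissions are ignored.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

Extractor = Callable[[bytes], Optional[str]]


@dataclass
class FlowCache:
    """Per-connection cache of in-order bytes plus fragments that arrived early."""
    next_seq: int                                              # sequence number expected next
    assembled: bytearray = field(default_factory=bytearray)    # contiguous bytes so far
    pending: Dict[int, bytes] = field(default_factory=dict)    # out-of-order fragments by seq

    @property
    def last_byte_seq(self) -> int:
        """Sequence number of the last byte already combined into `assembled`."""
        return self.next_seq - 1

    def add(self, seq: int, payload: bytes, extract: Extractor) -> Optional[str]:
        """Store a fragment, splice in whatever is now contiguous, and retry extraction."""
        self.pending[seq] = payload
        while self.next_seq in self.pending:
            chunk = self.pending.pop(self.next_seq)
            self.assembled += chunk
            self.next_seq += len(chunk)
        return extract(bytes(self.assembled))


def toy_extract(buf: bytes) -> Optional[str]:
    """Stand-in extractor: the field is complete once a newline has arrived."""
    head, sep, _ = buf.partition(b"\n")
    return head.decode("ascii") if sep else None


flow = FlowCache(next_seq=1000)
early = flow.add(1004, b"ple.com\n", toy_extract)   # arrives first, leaves a gap at 1000
late = flow.add(1000, b"exam", toy_extract)         # fills the gap
```

In a real gateway the extractor would be an SNI parser; the recorded sequence number is what lets a later fragment be placed without re-reading earlier ones.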
[0011] In some implementations, the step of extracting the server indication name field from the current packet fragment based on the received cached packet fragment and the current packet fragment includes: The gateway device establishes a transmission control protocol connection with the target terminal through its local upper-layer protocol stack. The current message segment is received via the transmission control protocol connection; Based on the received cached message fragments and the current message fragment, the server indication name field is extracted from the current message fragment.
[0012] Thus, by establishing a Transmission Control Protocol (TCP) connection and relying on its reliable, ordered delivery, problems such as lost and out-of-order message fragments are handled effectively. This provides a complete and stable data foundation for extracting the Server Name Indication (SNI) field, improving the accuracy and success rate of field extraction. At the same time, unlike man-in-the-middle proxy schemes, which require complex encryption and decryption operations, the implementation of this application does not rely on the protocol support of external servers. Connection establishment and data reception can be completed through the gateway device's own protocol stack, which reduces dependence on external resources to a certain extent, reduces system resource consumption, and preserves the forwarding performance of the gateway device.
[0013] In some implementations, the step of extracting the server indication name field from the current packet fragment based on the received cached packet fragment and the current packet fragment includes: Extract the metadata group of the transport layer message, wherein the transport layer message is obtained by encapsulating the current message segment based on the transport layer protocol; Based on the transport layer protocol and destination port in the metadata group, determine whether the current message fragment contains an encryption protocol; In the presence of the encryption protocol, the server indication name field is extracted from the current message segment based on the cached message segment and the current message segment.
[0014] In this way, by quickly filtering out message fragments containing encryption protocols through metadata groups, and only performing server indication name field extraction processing on messages containing encryption protocols, invalid extraction of non-encrypted protocol messages is avoided to a certain extent, improving the processing efficiency of gateway devices and reducing system resource waste. At the same time, based on the judgment of transport layer protocol and destination port, the field extraction is ensured to be targeted, avoiding misprocessing of non-target protocol messages to a certain extent, improving the accuracy of server indication name field extraction, and laying the foundation for the effective execution of subsequent access control policies.
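A minimal sketch of this pre-filter is given below. It assumes the metadata group is the usual five-tuple and that the encryption protocol is recognized by TCP with destination port 443; both are assumptions for illustration, and `Metadata`, `HTTPS_PORTS`, and `carries_encryption_protocol` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metadata:
    """Assumed shape of the metadata group: a transport-layer five-tuple."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str   # "tcp" or "udp"


# Assumption: only the default HTTPS port is treated as carrying TLS.
HTTPS_PORTS = frozenset({443})


def carries_encryption_protocol(meta: Metadata) -> bool:
    """Cheap check made before any payload parsing is attempted."""
    return meta.protocol == "tcp" and meta.dst_port in HTTPS_PORTS


https_meta = Metadata("192.168.0.10", "203.0.113.5", 50000, 443, "tcp")
http_meta = Metadata("192.168.0.10", "203.0.113.5", 50001, 80, "tcp")
```

Only fragments for which the check returns true would go on to SNI extraction; the rest can take the plaintext path.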
[0015] In some embodiments, the method further includes: In the absence of the encryption protocol, extract the Uniform Resource Locator (URL) field of the current message segment; The target operation for the current message segment is determined based on the preset rule base and the Uniform Resource Locator field.
[0016] Thus, by adding web page access control logic for the case where no encryption protocol is present, both mainstream web access protocols, Hypertext Transfer Protocol (HTTP) and Hypertext Transfer Protocol Secure (HTTPS), are covered. This improves the applicability and comprehensiveness of web page access control to a certain extent. Furthermore, by extracting the Uniform Resource Locator (URL) field and comparing it with the preset rule base, access control in the HTTP scenario is executed accurately. This avoids, to a certain extent, missed access control for plaintext protocol messages, thereby improving the stability and reliability of web page access control.
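For the plaintext case, URL extraction might look like the following sketch. It assumes the whole HTTP request header has already arrived and that the URL is formed from the `Host` header plus the request target; `extract_url` is a hypothetical helper, not a function named in the application.

```python
from typing import Optional


def extract_url(payload: bytes) -> Optional[str]:
    """Return host + path from an HTTP/1.x request, or None if it cannot be read yet."""
    head, sep, _ = payload.partition(b"\r\n\r\n")
    if not sep:
        return None                       # header block not complete yet
    lines = head.decode("latin-1").split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3:
        return None                       # not a well-formed request line
    path = parts[1]
    host = ""
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip()
    return host + path


request = b"GET /a/b HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\n\r\n"
url = extract_url(request)
```

The resulting string can then be compared with the preset rule base in the same way as an SNI host name.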
[0017] In some implementations, the step of extracting the server indication name field from the current message segment based on the cached message segment and the current message segment, when the encryption protocol is present, includes: The source Internet Protocol address and destination Internet Protocol address in the metadata group are compared with a preset cache table of allowed Internet Protocol addresses; If the encryption protocol exists and the source Internet Protocol address and the destination Internet Protocol address are not in the allowed Internet Protocol address cache table, the server instruction name field of the current message fragment is extracted based on the cached message fragment and the current message fragment.
[0018] In this way, by allowing the filtering mechanism of the Internet Protocol address cache table, duplicate access requests that have passed access control can be quickly screened out. This avoids repeated extraction and comparison operations of access requests to a certain extent, reduces the system resource consumption of the gateway device, and improves the forwarding performance and processing efficiency of the gateway device. Furthermore, for first-time access or requests that have not passed verification, the complete field extraction and comparison process is still performed, which improves the security and accuracy of web page access control to a certain extent.
[0019] In some implementations, the step of determining the target operation for the target message corresponding to the current message fragment based on the complete Server Name Indication (SNI) field and a preset rule base, when the complete SNI field has been extracted, includes: If the SNI field matches an allow policy in the preset rule base, it is determined to allow the target message, and the source Internet Protocol address and the destination Internet Protocol address are added to the allowed Internet Protocol address cache table. If the SNI field matches an interception policy in the preset rule base, it is determined that an interception operation will be performed on the target message.
[0020] In this way, by clearly defining the policy execution logic that follows extraction of the SNI field, the problem of unclear policy execution in traditional solutions is addressed, ensuring that legitimate requests are forwarded normally and illegitimate requests are blocked effectively. This improves the accuracy and reliability of access control to a certain extent. Furthermore, adding address pairs that match the allow policy to the cache table makes the cache table more effective, reducing redundant processing of repeated requests to a certain extent, reducing system resource consumption, and improving the overall processing efficiency of the gateway device.
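The interaction between the rule base and the allowed-address cache table can be sketched as below. This is an illustration under assumptions: rules are modelled as shell-style wildcard patterns, the interception policy is checked first when both could match, and the `"default"` result for unmatched names is invented for the example; `AccessController` is a hypothetical name.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Set, Tuple


class AccessController:
    """Rule base plus a cache of (source IP, destination IP) pairs already allowed."""

    def __init__(self, allow: Iterable[str], block: Iterable[str]) -> None:
        self.allow = [p.lower() for p in allow]
        self.block = [p.lower() for p in block]
        self.allowed_pairs: Set[Tuple[str, str]] = set()

    def needs_inspection(self, src_ip: str, dst_ip: str) -> bool:
        """Pairs found in the cache table skip extraction and matching entirely."""
        return (src_ip, dst_ip) not in self.allowed_pairs

    def decide(self, src_ip: str, dst_ip: str, sni: str) -> str:
        name = sni.lower()
        if any(fnmatchcase(name, p) for p in self.block):
            return "block"
        if any(fnmatchcase(name, p) for p in self.allow):
            self.allowed_pairs.add((src_ip, dst_ip))    # remember the allowed pair
            return "allow"
        return "default"                                # assumption: not specified by the text


ctl = AccessController(allow=["*.example.com"], block=["*.bad.test"])
first = ctl.decide("10.0.0.2", "203.0.113.5", "www.example.com")
```

After the first allowed decision, `needs_inspection("10.0.0.2", "203.0.113.5")` returns false, so a repeated request from the same pair bypasses field extraction.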
[0021] In some implementations, the step of determining to allow the target message and adding the source Internet Protocol address and the destination Internet Protocol address to the allowed Internet Protocol address cache table, when the Server Name Indication (SNI) field matches the allow policy in the preset rule base, includes: Construct a reset message, wherein the source Internet Protocol address, destination Internet Protocol address, source port, and destination port of the reset message are the same as the destination Internet Protocol address, source Internet Protocol address, destination port, and source port of the metadata group of the target message, respectively. The reset message is sent to the target terminal, wherein the target terminal, after receiving the reset message, initiates the web page access request again.
[0022] Thus, by constructing a reset message and sending it to the target terminal, the current invalid connection can be quickly terminated, avoiding transmission anomalies caused by connection context mismatch. This ensures that the target terminal can successfully establish a direct connection with the target server, guaranteeing the normal execution of legitimate access requests. Furthermore, the reverse address and port configuration of the reset message prevent the target terminal from detecting the intervention of the gateway device, ensuring the transparency and smoothness of the access process to a certain extent. In addition, the access request re-initiated by the target terminal can hit the cache table, enabling the rapid passage of web page access requests. This effectively improves the processing efficiency of duplicate legitimate requests to a certain extent, fully leveraging the optimization function of the cache table.
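The address and port reversal for the reset message can be illustrated as follows. The sketch models a segment as a plain data class rather than raw bytes; `TcpSegment` and `build_reset` are hypothetical, and the choice of sequence and acknowledgment numbers follows common TCP practice rather than anything stated in the application.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class TcpSegment:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    seq: int
    ack: int
    flags: FrozenSet[str]
    payload: bytes = b""


def build_reset(request: TcpSegment) -> TcpSegment:
    """Build a reset that appears to come from the endpoint the terminal addressed."""
    return TcpSegment(
        src_ip=request.dst_ip,            # reversed addresses
        dst_ip=request.src_ip,
        src_port=request.dst_port,        # reversed ports
        dst_port=request.src_port,
        seq=request.ack,                  # assumption: the sequence number the terminal expects
        ack=(request.seq + len(request.payload)) % 2**32,
        flags=frozenset({"RST", "ACK"}),
    )


req = TcpSegment("192.168.0.10", "203.0.113.5", 50000, 443,
                 seq=1000, ack=7000, flags=frozenset({"PSH", "ACK"}),
                 payload=b"x" * 100)
rst = build_reset(req)
```

Because the four-tuple is mirrored, the terminal attributes the reset to the server side of its existing connection and can retry, at which point the address pair is already in the cache table.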
[0023] In some implementations, determining to perform an interception operation on the target message when the Server Name Indication (SNI) field matches the interception policy in the preset rule base includes: Construct a response message including a preset interception prompt page, wherein the source Internet Protocol address, destination Internet Protocol address, source port, and destination port of the response message are the same as the destination Internet Protocol address, source Internet Protocol address, destination port, and source port of the target message, respectively. The response message is sent to the target terminal so that the interception prompt page can be displayed on the target terminal.
[0024] In this way, by constructing a response message that includes an interception prompt page, clear feedback information is provided to the user while performing the interception operation, improving the user experience, avoiding repeated access requests by the user due to lack of awareness, reducing the waste of network and system resources, and ensuring that the target terminal can receive and parse the response message normally to a certain extent, avoiding message dropping or parsing failure due to mismatched configurations, ensuring the effective delivery of interception prompt information, and enhancing the stability and practicality of the web access control function.
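A rough sketch of the response payload is shown below. The status code, header set, and page content are assumptions chosen for illustration, and `build_block_response` and `reply_endpoints` are hypothetical helpers; the application only states that the response carries a preset interception prompt page with reversed addresses and ports.

```python
from typing import Tuple

Endpoint = Tuple[str, int]   # (IP address, port)


def build_block_response(html: str) -> bytes:
    """Serialize an HTTP response whose body is the interception prompt page."""
    body = html.encode("utf-8")
    head = (
        "HTTP/1.1 403 Forbidden\r\n"          # assumption: status code is not specified
        "Content-Type: text/html; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def reply_endpoints(src: Endpoint, dst: Endpoint) -> Tuple[Endpoint, Endpoint]:
    """Return (source, destination) for the reply: the request's endpoints swapped."""
    return dst, src


response = build_block_response("<h1>Blocked</h1>")
reply_src, reply_dst = reply_endpoints(("192.168.0.10", 50000), ("203.0.113.5", 443))
```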
[0025] In some implementations, the gateway device is a Vector Packet Processing Engine (VPP), and a detection filtering node is registered in the packet processing feature arc of the VPP. The step of obtaining the current message fragment sent by the target terminal includes: The VPP receives the current message fragment. The step of extracting the Server Name Indication (SNI) field from the current message fragment based on the received cached message fragments and the current message fragment includes: The local upper-layer protocol stack of the VPP extracts the SNI field from the current message fragment based on the cached message fragments and the current message fragment. The step of determining, upon extracting the complete SNI field, the target operation for the target message corresponding to the current message fragment based on the complete SNI field and a preset rule base includes: When the local upper-layer protocol stack extracts the complete SNI field, it determines the target operation for the target message corresponding to the current message fragment based on the complete SNI field and the preset rule base.
[0026] Thus, by configuring the gateway device as the Vector Packet Processing Engine (VPP), the high-performance forwarding architecture and user-space protocol stack of the VPP improve the speed of packet reception, field extraction, and policy matching. This reduces the overhead of context switching and hierarchical encapsulation in traditional architectures to some extent, lowers the latency of web access control, and the flexible deployment of functional nodes in the scalable architecture of the VPP, as well as the collaborative work between the detection and filtering nodes and the local upper-layer protocol stack, improves the packet flow speed between different processing nodes, thereby improving the efficiency and accuracy of packet processing to a certain extent.
[0027] In some implementations, the step in which the local upper-layer protocol stack of the Vector Packet Processing Engine (VPP) extracts the Server Name Indication (SNI) field from the current message fragment based on the cached message fragments and the current message fragment includes: The metadata group of the transport layer message is extracted by the detection filtering node; Based on the transport layer protocol and destination port in the metadata group, it is determined whether the current message fragment contains an encryption protocol; If the encryption protocol is present, the VPP sets the target node of the current message fragment to the local node of the VPP and sends the current message fragment to the local node, so that it is delivered to the local upper-layer protocol stack. The local upper-layer protocol stack extracts the SNI field from the current message fragment based on the cached message fragments and the current message fragment.
[0028] In this way, through accurate filtering by the detection filtering node, only encrypted protocol messages enter the local upper-layer protocol stack for processing, so unencrypted protocol messages do not occupy its resources. This improves the processing efficiency and targeting of web page access control under the VPP architecture. Furthermore, the direct hand-off from the local node to the local upper-layer protocol stack reduces message transmission latency to a certain extent, allows SNI field extraction to run quickly, and improves the efficiency of SNI-based detection and interception.
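The node-to-node hand-off can be pictured with the toy model below. It is plain Python and does not use or imitate the real VPP API; the node names, the `GRAPH` table, and the decision to send non-encrypted traffic straight to a lookup node are all assumptions made for the illustration.

```python
from typing import Callable, Dict, List, Optional

Packet = Dict[str, object]
Node = Callable[[Packet], Optional[str]]   # a node returns the next node's name, or None


def detect_filter(pkt: Packet) -> Optional[str]:
    """Divert encrypted-protocol fragments to the local node; forward the rest."""
    encrypted = pkt["protocol"] == "tcp" and pkt["dst_port"] == 443
    return "local" if encrypted else "lookup"


def local(pkt: Packet) -> Optional[str]:
    """Local delivery: hand the fragment to the upper-layer protocol stack."""
    return "host-stack"


def host_stack(pkt: Packet) -> Optional[str]:
    """Stand-in for the stack that reassembles fragments and extracts the SNI field."""
    pkt["inspected"] = True
    return None


def lookup(pkt: Packet) -> Optional[str]:
    """Stand-in for route lookup and normal forwarding."""
    return None


GRAPH: Dict[str, Node] = {
    "detect-filter": detect_filter,
    "local": local,
    "host-stack": host_stack,
    "lookup": lookup,
}


def run(pkt: Packet, start: str = "detect-filter") -> List[str]:
    """Walk the graph from `start` and return the names of the nodes visited."""
    path: List[str] = []
    node: Optional[str] = start
    while node is not None:
        path.append(node)
        node = GRAPH[node](pkt)
    return path


tls_path = run({"protocol": "tcp", "dst_port": 443})
plain_path = run({"protocol": "tcp", "dst_port": 80})
```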
[0029] In some implementations, the local upper-layer protocol stack performs server indication name field extraction processing on the current packet fragment based on the cached packet fragment and the current packet fragment, including: The Vector Message Processing Engine (VPP) establishes a Transmission Control Protocol (TCP) connection with the target terminal. Based on the Transmission Control Protocol connection, the current message fragment is sent to the local upper-layer protocol stack via the local node; The local upper-layer protocol stack extracts the server indication name field from the current message segment based on the cached message segment and the current message segment.
[0030] In this way, the vector message processing engine simulates the establishment of a transmission control protocol connection between the target server and the target terminal through the local upper-layer protocol stack. All message fragments under this connection are stored in the corresponding receive buffer. With the help of the caching and sorting mechanism of the protocol stack, missing message fragments are filled in, ensuring that the complete server indication name field can be extracted in the case of message fragmentation and out-of-order delivery. This effectively avoids the problem of interception omissions and improves the accuracy of access control in the case of encryption protocol to a certain extent. Moreover, this connection does not require full proxy communication and encryption / decryption operations, which reduces memory resource consumption to a certain extent and ensures gateway forwarding performance and network access efficiency.
[0031] Furthermore, the graph node scheduling architecture of the vector packet processing engine enables packets to be quickly sent to the local upper-layer protocol stack after being diverted by the detection and filtering nodes, without the need for complex kernel forwarding paths. This reduces packet processing latency to a certain extent, improves detection and interception efficiency, and optimizes the user's network experience.
[0032] In some implementations, the step in which the local upper-layer protocol stack, upon extracting the complete Server Name Indication (SNI) field, determines the target operation for the target message corresponding to the current message fragment based on the complete SNI field and a preset rule base includes: If the SNI field matches the interception policy in the preset rule base, a response message including a preset interception prompt page is constructed, wherein the source Internet Protocol address, destination Internet Protocol address, source port, and destination port of the response message are the same as the destination Internet Protocol address, source Internet Protocol address, destination port, and source port of the target message, respectively. The target node of the response message is set to the Internet Protocol address route lookup node of the Vector Packet Processing Engine (VPP); The response message is sent to the target terminal via the Internet Protocol address route lookup node, so that the interception prompt page is displayed on the target terminal.
[0033] Thus, the vector packet processing engine architecture clearly defines the construction, configuration, and forwarding mechanisms for intercepted response packets. By introducing Internet Protocol address routing lookup nodes, the high-performance routing lookup advantage of the vector packet processing engine is fully utilized. This ensures, to a certain extent, that response packets can be delivered to the target terminal quickly and accurately via the optimal path, improving the efficiency and reliability of interception feedback. Furthermore, the reverse address and port configuration of the response packets ensures, to a certain extent, that the target terminal can correctly identify and parse the packets, successfully display the interception prompt page, and allow users to know the reason for the interception in a timely manner, avoiding invalid retries and reducing the waste of network resources and the processing capacity of the vector packet processing engine.
[0034] This application also provides an electronic device, including a memory and a processor, wherein the memory stores a computer program, and when the computer program is executed by the processor, it implements the methods described in some of the above embodiments.
[0035] This application also provides a computer-readable storage medium storing a computer program that, when executed by one or more processors, implements the methods described in some of the above embodiments.
[0036] This application also provides a computer program product, including a computer program / instructions that, when executed by a processor, implement the methods described in some of the above embodiments.
[0037] The electronic device, computer-readable storage medium, and computer program product provided in this application, when implementing the above method, first obtain the current message fragment sent by the target terminal; then, based on the received cached message fragments and the current message fragment, perform server indication name field extraction processing on the current message fragment, wherein the cached message fragment and the current message fragment are fragments of the same message; finally, if the complete server indication name field is extracted, the target operation for the message corresponding to the current message fragment is determined based on the complete server indication name field and a preset rule base. In this way, by associating cached fragments and the current fragment with the same target message for field extraction, the problem that traditional packet-by-packet independent analysis schemes cannot handle message fragmentation and out-of-order delivery is solved. This ensures, to a certain extent, the complete extraction of the server indication name field, improves the accuracy of access control interception, and effectively avoids interception omissions caused by incomplete field extraction.
[0038] Additional aspects and advantages of embodiments of this application will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of embodiments of this application.
Attached Figure Description
[0039] The above and/or additional aspects and advantages of this application will become apparent and readily understood from the description of the embodiments taken in conjunction with the following drawings, wherein:
Figure 1 is the first flowchart of a web page access control method according to certain embodiments of this application;
Figure 2 is the second flowchart of a web page access control method according to certain embodiments of this application;
Figure 3 is the third flowchart of a web page access control method according to certain embodiments of this application;
Figure 4 is the fourth flowchart of a web page access control method according to certain embodiments of this application;
Figure 5 is the fifth flowchart of a web page access control method according to certain embodiments of this application;
Figure 6 is the sixth flowchart of a web page access control method according to certain embodiments of this application;
Figure 7 is the seventh flowchart of a web page access control method according to certain embodiments of this application;
Figure 8 is the eighth flowchart of a web page access control method according to certain embodiments of this application;
Figure 9 is the ninth flowchart of a web page access control method according to certain embodiments of this application;
Figure 10 is the tenth flowchart of a web page access control method according to certain embodiments of this application;
Figure 11 is a schematic diagram of the flow of Hypertext Transfer Protocol Secure (HTTPS) messages from the local area network (LAN) side to the gateway in certain embodiments of this application;
Figure 12 is a schematic diagram of the message flow in which the local upper-layer protocol stack replies in place of the target server in certain embodiments of this application;
Figure 13 is a schematic diagram of the flow of Hypertext Transfer Protocol (HTTP) messages from the local area network (LAN) side to the gateway in certain embodiments of this application;
Figure 14 is a schematic diagram of the message flow from the local upper-layer protocol stack to the target terminal in certain embodiments of this application;
Figure 15 is the eleventh flowchart of a web page access control method according to certain embodiments of this application;
Figure 16 is a schematic diagram of the message flow in a graph node architecture based on a vector packet processing engine in certain embodiments of this application;
Figure 17 is the twelfth flowchart of a web page access control method according to certain embodiments of this application;
Figure 18 is the thirteenth flowchart of a web page access control method according to certain embodiments of this application;
Figure 19 is the fourteenth flowchart of a web page access control method according to certain embodiments of this application;
Figure 20 is a schematic diagram of the packet flow from the wide area network (WAN) side to the gateway in certain embodiments of this application.
Detailed Implementation
[0040] The embodiments of this application are described in detail below. Examples of the embodiments are shown in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and are only used to explain the embodiments of this application, and should not be construed as limiting the embodiments of this application.
[0041] In the field of network security technology, web access control is one of the security functions of gateway devices or firewall devices. Its core purpose is to allow or prohibit users from accessing specific websites or web pages on the Internet through the gateway according to preset policies. This function is also known as Uniform Resource Locator (URL) filtering. In a typical network architecture, enterprise intranet users access the Internet through the gateway device and then through the operator's router. When intranet users need to access web pages or multimedia resources on the Internet, they usually send access requests using Hypertext Transfer Protocol (HTTP) or Hypertext Transfer Protocol Secure (HTTPS). This request is forwarded by the gateway device to the target World Wide Web (Web) server, and the response message generated by the Web server is then sent back to the intranet user through the gateway device.
[0042] The Uniform Resource Locator (URL) filtering and detection module on the gateway device analyzes the URL or SNI fields involved in the HTTP request transmission process, and then decides whether to continue forwarding the request or send back the corresponding HTTP response.
[0043] In real-world web access control scenarios, some gateway devices analyze each packet forwarded from the LAN to the Internet independently. They extract the Internet Protocol (IP) layer and Transmission Control Protocol (TCP) layer payloads in turn, obtain the Uniform Resource Locator (URL) or Server Name Indication (SNI) field, and then compare the obtained content with a preset interception policy. If interception is required, a TCP reset message is constructed and sent back to the LAN-side user terminal. However, when packets are subject to complex network conditions such as TCP-layer fragmentation or out-of-order delivery, this approach cannot analyze the complete TCP connection information, making it difficult to fully extract the URL or SNI field and leading to missed interceptions.
[0044] In some solutions, when internal network users access the internet via HTTPS, a connection must first be established with the gateway proxy server. The proxy server then establishes a connection with the target web server and analyzes relevant information. If the packet meets the allow policy, it is decrypted and re-encrypted before forwarding; if it meets the block policy, the connection between the two ends is terminated. However, in this solution, the proxy server needs to reissue certificates and perform a large number of encryption and decryption operations, consuming excessive system resources. This is unsuitable for gateway devices with limited resources and will also reduce gateway forwarding performance, affecting network access efficiency.
[0045] Based on the above issues, please refer to Figure 1. This application provides a web page access control method for a gateway device, the method comprising: 01: Obtain the current message fragment sent by the target terminal; 02: Based on the received cached message fragments and the current message fragment, extract the server indication name field from the current message fragment, wherein the cached message fragments and the current message fragment are fragments of the same target message and are both encapsulated by the transport layer protocol before being transmitted to the gateway device; 03: If the complete server indication name field is extracted, determine the target operation for the target message corresponding to the current message fragment based on the complete server indication name field and the preset rule base.
[0046] This application provides a web page access control device. The web page access control method of this application can be implemented by the web page access control device of this application. Specifically, the web page access control device includes an acquisition module, an extraction module, and a determination module. The acquisition module is used to acquire the current message fragment sent by the target terminal. The extraction module is used to extract the server indication name field from the current message fragment based on the received cached message fragment and the current message fragment, wherein the cached message fragment and the current message fragment are fragments of the same message. The determination module is used to determine the target operation for the target message corresponding to the current message fragment based on the complete server indication name field and a preset rule base when the complete server indication name field is extracted.
[0047] This application also provides a server, which includes a memory and a processor. The web page access control method of this application can be implemented by the server of this application. Specifically, the memory stores a computer program, and the processor is used to acquire a current message fragment sent by a target terminal. The processor is also used to extract the server indication name field from the current message fragment based on the received cached message fragment and the current message fragment, wherein the cached message fragment and the current message fragment are fragments of the same message. The processor is also used to, if the complete server indication name field is extracted, determine the target operation for the target message corresponding to the current message fragment based on the complete server indication name field and a preset rule base.
[0048] Specifically, a gateway device is a network device deployed at the network boundary to connect different networks and perform functions such as data forwarding and access control. For example, a gateway between an enterprise intranet and the Internet.
[0049] The target terminal is the terminal device that initiates the webpage access request, such as a computer, mobile phone, tablet, or other device that can access the network.
[0050] The current message fragment is a data block that the gateway device receives in real time and that belongs to a target message. During transmission, the target message may be split into multiple independent data blocks because its length exceeds the Maximum Transmission Unit (MTU) threshold or for other reasons, or it may be transmitted as a single data block without splitting.
[0051] Cached message fragments are other data blocks that the gateway device has previously received and stored, belonging to the same target message as the current message fragment, and are used to supplement field information that may be missing from the current message fragment.
[0052] The target message is a complete data message generated when the target terminal initiates an access request. It can be split into multiple message fragments for transmission according to transmission requirements.
[0053] Transport layer protocols are protocol layers in the network protocol stack responsible for end-to-end data transmission. They are used to encapsulate message fragments and ensure the reliability and orderliness of transmission.
[0054] The server indication name field, i.e., the Server Name Indication (SNI) field, is an extension field in the Client Hello message of the Transport Layer Security (TLS) protocol. It identifies the hostname of the server that the user wants to access and is the key field for identifying the access target in HTTPS scenarios.
[0055] The preset rule base is a database that stores user-defined access control policies, such as rules for allowing and blocking access, used to determine whether an access request meets the permission requirements.
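The location of this field inside a Client Hello can be illustrated with a minimal Python sketch. It is not the implementation of this application: it assumes the whole Client Hello sits in a single TLS record, and the function names `build_client_hello` and `extract_sni` are hypothetical. The builder exists only to produce test input.

```python
import struct
from typing import Optional


def build_client_hello(hostname: str) -> bytes:
    """Build a minimal TLS ClientHello record with one SNI entry (test helper)."""
    name = hostname.encode("ascii")
    # server_name entry: name_type=0 (host_name), 2-byte length, hostname bytes
    entry = b"\x00" + struct.pack("!H", len(name)) + name
    sni_body = struct.pack("!H", len(entry)) + entry          # server_name_list
    extensions = struct.pack("!HH", 0x0000, len(sni_body)) + sni_body
    body = (
        b"\x03\x03"                                # client_version
        + bytes(32)                                # random
        + b"\x00"                                  # empty session_id
        + struct.pack("!H", 2) + b"\x13\x01"       # one cipher suite
        + b"\x01\x00"                              # one compression method (null)
        + struct.pack("!H", len(extensions)) + extensions
    )
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI hostname, or None if the data is truncated or has no SNI."""
    try:
        if record[0] != 0x16 or record[5] != 0x01:   # handshake record, ClientHello
            return None
        pos = 5 + 4 + 2 + 32          # record header, handshake header, version, random
        pos += 1 + record[pos]                                  # session_id
        pos += 2 + struct.unpack_from("!H", record, pos)[0]     # cipher_suites
        pos += 1 + record[pos]                                  # compression_methods
        ext_end = pos + 2 + struct.unpack_from("!H", record, pos)[0]
        pos += 2
        while pos + 4 <= ext_end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0x0000:                              # server_name extension
                name_len = struct.unpack_from("!H", record, pos + 3)[0]
                name = record[pos + 5 : pos + 5 + name_len]
                if len(name) < name_len:
                    return None       # hostname continues in a later fragment
                return name.decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None                   # ran past the end of the data: truncated input
    return None
```

A truncated input returns `None` rather than a partial hostname, which is the situation the fragment caching described in this application is meant to handle.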
[0056] The target operation is the processing action performed on the message based on the comparison results of the preset rule base. It usually includes the allow operation to allow the message to be forwarded and the block operation to prevent the message from being forwarded.
[0057] The gateway device receives, in real time, network data encapsulated by the transport layer protocol from the target terminal and treats this network data as the current message fragment. Before receiving the current message fragment, the gateway device may already have received other fragments of the same target message. These previously received fragments are stored in the cache and form the cached message fragments. The cached message fragments and the current message fragment are thus different parts of the same target message.
[0058] Subsequently, the gateway device initiates the server indication name field extraction process. This process does not operate on the current message fragment individually, but rather integrates the current message fragment with other cached fragments belonging to the same target message based on the correlation of the same target message. Since the server indication name field is located in the extended field of the TLS protocol Client Hello message, it may be split into multiple fragments due to message fragmentation. By integrating all relevant fragments of the same target message, the integrity of the field extraction can be ensured.
[0059] Once the complete server indication name field is successfully extracted through fragment integration, the gateway device compares this field with the preset rule base. The preset rule base includes various access control policies set by the user according to actual needs, such as lists of allowed servers and prohibited website keywords.
[0060] By matching the extracted server indication name field with policies in the rule base one by one, the gateway device can determine whether the access request meets the permission requirements, and then determine the target operation for the complete message corresponding to the current message fragment, thereby achieving precise control over web page access.
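The comparison against the preset rule base can be sketched as follows. This is an illustrative Python example only: the `RuleBase` class, its two policy types (an allow list of servers and a list of blocked keywords, as mentioned above), and the default of allowing unmatched hosts are assumptions, not details given by this application.

```python
from dataclasses import dataclass, field


@dataclass
class RuleBase:
    """Hypothetical preset rule base: allowed servers plus blocked keywords."""
    allowed_hosts: set = field(default_factory=set)
    blocked_keywords: list = field(default_factory=list)

    def target_operation(self, hostname: str) -> str:
        """Return "allow" or "block" for a complete server indication name."""
        host = hostname.lower().rstrip(".")
        if host in self.allowed_hosts:
            return "allow"                       # explicit allow entry wins
        if any(keyword in host for keyword in self.blocked_keywords):
            return "block"                       # a prohibited keyword matched
        return "allow"                           # assumed default when no policy matches
```

For instance, `RuleBase({"intranet.example.com"}, ["casino"])` would block `www.casino-site.example` while allowing the intranet host.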
[0061] In this way, by associating cached fragments of the same target message with the current fragment for field extraction, the problem that traditional packet-by-packet independent analysis schemes cannot handle packet fragmentation and out-of-order delivery is solved. To a certain extent, this ensures the complete extraction of the server indication name field, improves the accuracy of access control interception, and effectively avoids interception omissions caused by incomplete field extraction.
[0062] Please see Figure 2. In some implementations, step 02 includes: 021: Perform server indication name field extraction on the current message fragment to determine the first server indication name field fragment; 022: Combine the first server indication name field fragment and the second server indication name field fragment, wherein the second server indication name field fragment is determined by performing server indication name field extraction on the cached message fragment.
[0063] In some implementations, the extraction module is further configured to extract the server indication name field from the current message segment to determine a first server indication name field segment. The extraction module is also configured to combine the first server indication name field segment and a second server indication name field segment, wherein the second server indication name field segment is determined by extracting the server indication name field from a cached message segment.
[0064] In some implementations, the processor is further configured to perform server indication name field extraction processing on the current message segment to determine a first server indication name field segment. The processor is also configured to combine the first server indication name field segment and a second server indication name field segment, wherein the second server indication name field segment is determined by performing server indication name field extraction processing on a cached message segment.
[0065] Specifically, the first server indication name field fragment is a portion of the content belonging to the server indication name field extracted from the currently received message fragment.
[0066] The second server indication name field fragment is a portion of the server indication name field extracted from other fragments of the same message that have been cached by the gateway device.
[0067] The combined processing involves splicing and completing the first server indication name field fragment and the second server indication name field fragment according to their inherent order in the original message, integrating them into a complete server indication name field.
[0068] After receiving a message, the gateway device performs server indication name field extraction processing on the received current message fragment. This extraction process is typically based on the format characteristics of the Client Hello message in the TLS protocol. It identifies and extracts the byte sequence related to the server indication name field from the TCP payload of the message fragment to extract the first server indication name field fragment. Since the current message fragment may only be a part of the target message, the first server indication name field fragment may also be a partial fragment of the complete server indication name field.
[0069] In addition, before receiving the current message segment, the gateway device may have received and cached other segments of the same message in advance, and has already performed the same extraction logic as the current message segment on the cached message segments. That is, the gateway device has parsed the cached message segments according to the format characteristics of the Client Hello message, identified and extracted the content related to the server indication name field, and then obtained the second server indication name field segment.
[0070] The second server indication name field fragment and the first server indication name field fragment extracted from the current message fragment are essentially different parts of the same complete server indication name field after being split. Since the fragmentation of the target message during transmission follows a fixed order rule, the second and first server indication name field fragments naturally have a clear sequential relationship within the target message. This inherent sequential association provides a solid logical foundation for the subsequent combination of the second and first server indication name field fragments, ensuring that a complete and accurate server indication name field can be reconstructed after combination.
[0071] Based on the aforementioned explicit sequential association, the gateway device combines the first server indication name field fragment and the second server indication name field fragment.
[0072] In some implementations, the combination process concatenates the second server indication name field fragment and the first server indication name field fragment according to the inherent order of the server indication name field in the target message, removes any duplicated overlapping content, and fills in any missing parts, ultimately forming a complete server indication name field. For example, if the first server indication name field fragment includes the first half of the server indication name field and the second server indication name field fragment includes the second half, the two parts are concatenated into a complete server indication name field through the combination process. If the second and first server indication name field fragments partially overlap, only one copy of the overlapping content is retained to ensure the accuracy of the combined server indication name field.
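The concatenation with overlap removal can be illustrated with a short Python sketch. It assumes each field fragment is tagged with the offset of its first byte within the field and that the total field length is known; both the `FieldFragment` representation and the function name are hypothetical.

```python
from typing import List, Optional, Tuple

FieldFragment = Tuple[int, bytes]   # (offset of first byte within the field, bytes)


def combine_fragments(fragments: List[FieldFragment], field_len: int) -> Optional[bytes]:
    """Splice fragments in field order; return None while any byte is missing."""
    combined = bytearray()
    for offset, data in sorted(fragments):
        if offset > len(combined):
            return None                             # gap: a fragment has not arrived yet
        combined += data[len(combined) - offset:]   # drop bytes already present (overlap)
    if len(combined) < field_len:
        return None                                 # tail of the field still missing
    return bytes(combined[:field_len])
```

Sorting by offset means the arrival order of the fragments does not matter, and overlapping bytes are kept only once.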
[0073] In this way, by extracting and recombining fragments, the problem of the server indication name field being split into multiple message fragments is effectively solved. This ensures that no matter how the server indication name field is split, the complete server indication name field content can be obtained through combination processing. To a certain extent, this improves the success rate and completeness of server indication name field extraction, providing a reliable guarantee for accurate comparison with the preset rule base. This enhances the accuracy of web page access control policy execution to a certain extent and avoids misjudgment or omission caused by incomplete extraction of the server indication name field.
[0074] Please see Figure 3. In some implementations, the method further includes: 04: If the result of the combined processing is an incomplete server indication name field, cache the result of the combined processing and record the sequence number of the last byte in the current message fragment, where the sequence number is the sequence position, within the target message, of the last byte in the current message fragment.
[0075] In some implementations, the extraction module is further configured to cache the result of the combined processing if the result of the combined processing is an incomplete server indication name field, and record the sequence number of the last byte in the current message segment, wherein the sequence number is the sequence position of the last byte in the current message segment in the target message.
[0076] In some implementations, the processor is further configured to cache the result of the combined processing if the result of the combined processing is an incomplete server indication name field, and record the sequence number of the last byte in the current message segment, wherein the sequence number is the sequence position of the last byte in the current message segment in the target message.
[0077] Specifically, the incomplete server indication name field is a field that, even after combination processing, still lacks some bytes and cannot fully identify the target server hostname.
[0078] The sequence number is used to mark the specific position of the last byte in the current message segment within its target message. The sequence number determines the order of the bytes.
[0079] After extracting and combining the first and second server indication name field fragments, the gateway device performs an integrity check on the combined result. The integrity check is usually based on the protocol format characteristics of the server indication name field, such as the field length range and the field end identifier. If the combined result does not conform to the protocol format characteristics, it is determined to be an incomplete server indication name field.
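A length-based integrity check of this kind might look like the following Python sketch. It assumes the combined result starts at the `name_type` byte of a server name entry (1-byte type, 2-byte length, then the hostname), and it treats 1 to 255 bytes as the plausible hostname length range; the function name and that range are assumptions made for illustration.

```python
import struct


def is_complete_sni_entry(entry: bytes) -> bool:
    """Check a combined result against the declared length of the hostname."""
    if len(entry) < 3:
        return False                       # the length prefix itself is incomplete
    name_type, name_len = struct.unpack_from("!BH", entry, 0)
    if name_type != 0 or not 1 <= name_len <= 255:
        return False                       # not a host_name entry, or implausible length
    return len(entry) - 3 >= name_len      # all declared hostname bytes are present
```

A result that fails this check would be treated as an incomplete server indication name field and cached for later completion.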
[0080] If the combination result is determined to be incomplete, the gateway device stores the incomplete result in a dedicated cache area to preserve the extracted field content and avoid duplicate extraction. Simultaneously, the gateway device records the sequence number of the last byte in the current message segment. This sequence number accurately identifies the specific location of that byte in the target message. By recording this sequence number, the gateway device can determine the byte range corresponding to the currently received segment, thereby determining the byte range that subsequent message segments should include, providing a clear location basis for subsequent field completion.
[0081] For example, if the sequence number of the last byte of the current message fragment is 1200, it indicates that the combined incomplete fields cover the range of bytes 1 to 1200 of the target message. The remaining part of the server indication name field must exist in the message fragment with sequence number 1201 and thereafter. When the subsequent message fragment carrying this part arrives, the gateway device will first extract the sequence number of the new fragment, combine it with the previously recorded sequence number to quickly locate the missing field, then extract the corresponding server indication name field fragment from the new fragment, and perform a secondary combination with the cached incomplete combination result until the complete field is obtained.
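The caching of an incomplete result together with the recorded sequence number can be sketched in Python as below. This is a simplified illustration, not the implementation of this application: bytes are numbered from 1 as in the example above, fragment payloads are treated as the field bytes themselves, and a fragment that arrives before the bytes preceding it is simply not consumed. The `PartialFieldCache` class and the `FlowKey` tuple are hypothetical.

```python
from typing import Dict, Optional, Tuple

FlowKey = Tuple[str, int, str, int]   # hypothetical: (src ip, src port, dst ip, dst port)


class PartialFieldCache:
    """Per-flow cache of the incomplete result and the last byte sequence number."""

    def __init__(self) -> None:
        self._partial: Dict[FlowKey, Tuple[bytes, int]] = {}

    def last_seq(self, flow: FlowKey) -> int:
        """Sequence number of the last cached byte (0 if nothing is cached)."""
        return self._partial.get(flow, (b"", 0))[1]

    def feed(self, flow: FlowKey, first_seq: int, data: bytes,
             total_len: int) -> Optional[bytes]:
        """Add one fragment; return the complete field or None if bytes are missing."""
        cached, last_seq = self._partial.get(flow, (b"", 0))
        if first_seq > last_seq + 1:
            return None                          # gap before this fragment (not handled here)
        skip = last_seq + 1 - first_seq          # bytes already covered by the cache
        cached += data[skip:]
        last_seq = max(last_seq, first_seq + len(data) - 1)
        if len(cached) >= total_len:
            self._partial.pop(flow, None)        # field complete: release the cache entry
            return cached[:total_len]
        self._partial[flow] = (cached, last_seq) # still incomplete: keep for later
        return None
```

Because the cached bytes are reused, a later fragment only contributes the bytes after the recorded sequence number, so earlier content is not extracted again.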
[0082] In this way, by caching incomplete combination results, the repetitive operation of re-extracting existing field content after subsequent fragments arrive is avoided, reducing redundant calculations and, to a certain extent, the system resource consumption of the gateway device. Furthermore, recording the sequence number of the last byte of the current message fragment provides a reliable location basis for subsequent field completion, so that the missing part can be quickly located after subsequent fragments arrive, achieving accurate field completion and guaranteeing the continuity and integrity of server indication name field extraction. Thus, to a certain extent, the problem of difficult field extraction caused by fragment delay and out-of-order delivery is effectively solved, enhancing the adaptability and reliability of web page access control in complex network environments.
[0083] Please see Figure 4. In some implementations, step 02 includes: 023: Establish a Transmission Control Protocol (TCP) connection with the target terminal through the gateway device's local upper-layer protocol stack; 024: Receive the current message fragment via the TCP connection; 025: Based on the received cached message fragments and the current message fragment, extract the server indication name field from the current message fragment.
[0084] In some implementations, the extraction module is further configured to establish a Transmission Control Protocol (TCP) connection with the target terminal via the gateway device's local upper-layer protocol stack. The extraction module is also configured to receive the current message fragment via the TCP connection. Furthermore, the extraction module is configured to extract the server indication name field from the current message fragment based on the received cached message fragments and the current message fragment.
[0085] In some implementations, the processor is further configured to establish a Transmission Control Protocol (TCP) connection with the target terminal via the gateway device's local upper-layer protocol stack. The processor is also configured to receive the current message fragment via the TCP connection. Furthermore, the processor is configured to extract the server indication name field from the current message fragment based on the received cached message fragments and the current message fragment.
[0086] Specifically, the local upper-layer protocol stack is a collection of software modules in the gateway device that implement layered processing of network protocols, including protocol processing logic such as transport layer and application layer. It does not rely on protocol support from external servers and can independently complete functions such as connection management and data transmission for end-to-end communication. Examples include the Linux kernel protocol stack and the Vector Packet Processing (VPP) network protocol stack.
[0087] A Transmission Control Protocol (TCP) connection is a connection-oriented communication link established based on TCP. It provides reliable transmission, ordered transmission, and flow control, which ensure that data is transmitted completely and in order between the gateway device and the target terminal.
[0088] Upon receiving a synchronization segment request from the target terminal, the gateway device initiates the connection establishment function of its local upper-layer protocol stack to establish a Transmission Control Protocol (TCP) connection with the target terminal that sent the current segment. The connection establishment process follows the TCP three-way handshake: the target terminal sends a synchronization segment requesting connection establishment; upon receiving this segment, the gateway device's local upper-layer protocol stack immediately replies with a synchronization acknowledgment segment, confirming its readiness to receive data; the target terminal, upon receiving the synchronization acknowledgment segment, replies with an acknowledgment segment, thus formally establishing the TCP connection. Furthermore, during the connection establishment process, the local upper-layer protocol stack simulates the communication behavior of the target server, preventing the target terminal from recognizing the gateway device as the communication partner, thereby ensuring smooth subsequent data transmission.
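The sequence and acknowledgment numbers exchanged in the three-way handshake can be illustrated with a small Python sketch. It only models the arithmetic of the three segments (SYN, SYN-ACK, ACK); the function name and the dictionary layout are hypothetical, and no real network traffic is generated.

```python
from typing import Dict, List


def three_way_handshake(client_isn: int, gateway_isn: int) -> List[Dict[str, object]]:
    """Return the three handshake segments with their seq/ack numbers."""
    mod = 2 ** 32                            # TCP sequence numbers wrap at 32 bits
    syn = {"from": "terminal", "flags": "SYN", "seq": client_isn, "ack": None}
    syn_ack = {"from": "gateway", "flags": "SYN-ACK",
               "seq": gateway_isn, "ack": (client_isn + 1) % mod}
    ack = {"from": "terminal", "flags": "ACK",
           "seq": (client_isn + 1) % mod, "ack": (gateway_isn + 1) % mod}
    return [syn, syn_ack, ack]
```

Each side acknowledges the other's initial sequence number plus one, because the SYN flag itself consumes one sequence number.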
[0089] After the connection is established, the current message fragment sent by the target terminal is transmitted over this TCP connection. That is, the current message fragment is first encapsulated into an independent transport layer message based on the transport layer protocol, and the transport layer message is then transmitted to the gateway device through the TCP connection. Because TCP provides reliable transmission, mechanisms such as sequence numbers and acknowledgment numbers ensure that message fragments are received completely and in order, effectively avoiding message loss or out-of-order delivery.
[0090] After receiving a transport layer message from the target terminal via the Transmission Control Protocol (TCP) connection, the gateway device parses the transport layer message to obtain the current message fragment and stores it in the corresponding receive buffer. Subsequently, the detection module of the local upper-layer protocol stack retrieves the current message fragment from the receive buffer and performs field extraction processing according to the protocol format characteristics of the server indication name field. If the server indication name field is split into multiple fragments, the local upper-layer protocol stack determines the order of the fragments based on the TCP sequence number, combines and concatenates the server indication name field content extracted from different fragments, and finally extracts the complete server indication name field.
[0091] Thus, by establishing a Transmission Control Protocol (TCP) connection and fully utilizing its reliable and ordered transmission characteristics, problems such as message fragment loss and out-of-order delivery are effectively solved. This provides a complete and stable data foundation for extracting the server indication name field, improving the accuracy and success rate of field extraction. At the same time, compared with the complex encryption and decryption operations in man-in-the-middle proxy schemes, the implementation method of this application does not rely on the protocol support of external servers. Connection establishment and data reception can be completed through the gateway device's own protocol stack, which reduces dependence on external resources to a certain extent, reduces system resource consumption, and ensures the forwarding performance of the gateway device.
[0092] Please see Figure 5. In some implementations, step 02 includes: 026: Extract the metadata group of the transport layer message, where the transport layer message is obtained by encapsulating the current message fragment based on the transport layer protocol; 027: Determine whether the current message fragment contains an encryption protocol based on the transport layer protocol and destination port in the metadata group; 028: If an encryption protocol is present, extract the server indication name field from the current message fragment based on the cached message fragments and the current message fragment.
[0093] In some implementations, the extraction module is further configured to extract the metadata group of the transport layer message, wherein the transport layer message is obtained by encapsulating the current message fragment based on the transport layer protocol. The extraction module is also configured to determine whether the current message fragment contains an encryption protocol based on the transport layer protocol and destination port in the metadata group. The extraction module is further configured to, if an encryption protocol exists, extract the server indication name field from the current message fragment based on the cached message fragments and the current message fragment.
[0094] In some implementations, the processor is further configured to extract the metadata group of the transport layer message, wherein the transport layer message is obtained by encapsulating the current message fragment based on the transport layer protocol. The processor is further configured to determine whether an encryption protocol exists in the current message fragment based on the transport layer protocol and destination port in the metadata group. The processor is further configured to, if an encryption protocol exists, extract the server indication name field from the current message fragment based on the cached message fragments and the current message fragment.
[0095] Specifically, a transport layer message is a complete transmission unit formed by encapsulating the current message fragment based on the transport layer protocol. It includes a metadata group and the payload content of the current message fragment, and is used to transmit data in the network.
[0096] The metadata group is a collection of information describing the transmission attributes of a message. It typically includes the transport layer protocol, source Internet Protocol address, source port, destination Internet Protocol address, and destination port.
[0097] Transport layer protocols are protocol layers in the network protocol stack that are responsible for end-to-end data transmission. They determine the reliability and transmission method of data transmission. Common types are TCP and User Datagram Protocol (UDP).
[0098] The destination port is the port number on the target server to which the message is intended. Different ports correspond to different network services and serve as identifiers to distinguish the service type of the message. For example, port 80 corresponds to HTTP service, and port 443 corresponds to HTTPS service.
[0099] An encryption protocol is a protocol used to encrypt data in transit. In the implementations of this application, the encryption protocol is the TLS protocol, which is used together with HTTP to form HTTPS and ensures the security of data transmission.
[0100] After receiving the current packet fragment, the gateway device first processes the transport layer packet encapsulating the current packet fragment, extracting the metadata group of the transport layer packet. The extraction of the metadata group is based on the layered parsing capability of the network protocol stack, obtaining attributes such as transport layer protocol and destination port from the header information of the transport layer packet, without parsing the packet payload content, ensuring that the extraction process is efficient and fast.
[0101] Subsequently, the gateway device determines whether the current packet fragment contains an encryption protocol based on the transport layer protocol and destination port information in the metadata group. For example, if the transport layer protocol in the metadata group is TCP and the destination port is a fixed port corresponding to the encryption protocol, then the current packet fragment is determined to contain an encryption protocol, and the packet is an HTTPS packet, requiring the extraction of the server indication name field.
[0102] If an encryption protocol is determined to be present, the gateway device retrieves the received cached message fragments and the current message fragment, and extracts the server indication name field from the TCP payload of the message fragment according to the format characteristics of the Client Hello message in the TLS protocol.
[0103] If it is determined that no encryption protocol exists, there is no need to extract the server indication name field. Instead, subsequent operations can be performed directly according to the processing logic corresponding to other protocols. For example, the Uniform Resource Locator (URL) field can be extracted from HTTP messages.
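The decision based on the metadata group can be sketched in Python as follows. The sketch assumes that TCP port 443 indicates TLS and TCP port 80 indicates plaintext HTTP, following the port examples given above; the `MetadataGroup` class, the port sets, and the function name are hypothetical, and a real deployment could configure other ports.

```python
from dataclasses import dataclass

TCP = 6                  # IANA-assigned IP protocol number for TCP
UDP = 17                 # IANA-assigned IP protocol number for UDP
TLS_PORTS = {443}        # assumed ports for the encryption protocol (HTTPS)
HTTP_PORTS = {80}        # assumed ports for plaintext HTTP


@dataclass(frozen=True)
class MetadataGroup:
    """Header-level attributes of a transport layer message."""
    protocol: int
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def select_extractor(meta: MetadataGroup) -> str:
    """Choose the field to extract using header information only."""
    if meta.protocol == TCP and meta.dst_port in TLS_PORTS:
        return "sni"     # encryption protocol present: extract the server indication name
    if meta.protocol == TCP and meta.dst_port in HTTP_PORTS:
        return "url"     # plaintext HTTP: extract the URL field
    return "none"        # other traffic: no field extraction
```

Only header fields are inspected here, so the payload does not need to be parsed before the decision is made.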
[0104] In this way, by quickly filtering out message fragments containing encryption protocols through metadata groups, and only performing server indication name field extraction processing on messages containing encryption protocols, invalid extraction of non-encrypted protocol messages is avoided to a certain extent, improving the processing efficiency of gateway devices and reducing system resource waste. At the same time, based on the judgment of transport layer protocol and destination port, the field extraction is ensured to be targeted, avoiding misprocessing of non-target protocol messages to a certain extent, improving the accuracy of server indication name field extraction, and laying the foundation for the effective execution of subsequent access control policies.
[0105] Please see Figure 6. In some implementations, the method further includes: 05: Extract the Uniform Resource Locator (URL) field of the current message fragment in the absence of an encryption protocol; 06: Determine the target operation for the current message fragment based on the preset rule base and the URL field.
[0106] In some implementations, the extraction module is further configured to extract the Uniform Resource Locator (URL) field of the current message segment in the absence of an encryption protocol. The extraction module is also configured to determine the target operation for the current message segment based on a preset rule base and the URL field.
[0107] In some implementations, the processor is further configured to extract the Uniform Resource Locator (URL) field of the current message segment in the absence of an encryption protocol. The processor is also configured to determine the target operation for the current message segment based on a preset rule base and the URL field.
[0108] Specifically, the Uniform Resource Locator (URL) field is an address field used to identify resources on the Internet, including information such as the resource's access protocol, server address, and resource path. It is a key field for identifying the user's access target in the HTTP plaintext protocol scenario.
[0109] When the gateway device determines that the current message fragment does not contain an encryption protocol based on the transport layer protocol and destination port in the metadata group, it indicates that the target message is an HTTP plaintext protocol message, and the gateway device initiates the Uniform Resource Locator (URL) field extraction process.
[0110] Extraction of the Uniform Resource Locator (URL) field is typically based on the request format characteristics of the HTTP protocol. The request line of an HTTP request message includes the URL field. Gateway devices parse the current message fragment, identify the format of the HTTP request line, and accurately extract the URL field from it. Furthermore, for cases where HTTP messages may be fragmented, since the URL field is located at the beginning of the request line, even if the target message is fragmented, the URL field can be quickly obtained by prioritizing the extraction of the beginning fragments, without waiting for all fragments to arrive.
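As a non-limiting illustration, extracting the URL field from the request line of the first fragment might be sketched as follows. The function name `extract_request_target` and the list of accepted methods are assumptions made for the example, not part of the described method.

```python
from typing import Optional

# Assumed set of request methods recognised by the sketch.
HTTP_METHODS = (b"GET", b"POST", b"HEAD", b"PUT", b"DELETE", b"OPTIONS", b"PATCH")


def extract_request_target(payload: bytes) -> Optional[str]:
    """Return the URL field of the HTTP request line, or None.

    None means the payload is not an HTTP request or the request
    line has not fully arrived yet.
    """
    line_end = payload.find(b"\r\n")
    if line_end == -1:
        return None  # request line incomplete: wait for the next fragment
    parts = payload[:line_end].split(b" ")
    if len(parts) != 3:
        return None
    method, target, version = parts
    if method not in HTTP_METHODS or not version.startswith(b"HTTP/"):
        return None
    return target.decode("ascii", errors="replace")
```

Because the request line is the first line of the message, the sketch only needs the beginning fragment and does not wait for the rest of the message.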
[0111] After extracting the complete Uniform Resource Locator (URL) field, the gateway device compares the URL field with the preset rule base to determine the target operation for the current packet segment, i.e., the target packet.
[0112] Thus, by supplementing the web page access control processing logic for the case where no encryption protocol exists, both mainstream web page access protocols, Hypertext Transfer Protocol (HTTP) and Hypertext Transfer Protocol Secure (HTTPS), are covered. This improves the applicability and comprehensiveness of web page access control to a certain extent. Furthermore, by extracting the Uniform Resource Locator (URL) field and comparing it with the preset rule base, access control in the HTTP scenario is executed accurately. This avoids omissions in access control for plaintext protocol messages to a certain extent, thereby improving the stability and reliability of web page access control.
[0113] Please see Figure 7. In some implementations, step 028 further includes: 0281: Compare the source Internet Protocol address and destination Internet Protocol address in the metadata group with the preset allowed Internet Protocol address cache table; 0282: When an encryption protocol exists and the source Internet Protocol address and destination Internet Protocol address are not in the allowed Internet Protocol address cache table, extract the server indication name field of the current packet fragment based on the cached packet fragment and the current packet fragment.
[0114] In some implementations, the extraction module is further configured to compare the source Internet Protocol address and destination Internet Protocol address in the metadata group with a preset allowed Internet Protocol address cache table. The extraction module is also configured to, in the case where an encryption protocol exists and the source Internet Protocol address and destination Internet Protocol address are not in the allowed Internet Protocol address cache table, extract the server indication name field from the current packet fragment based on the cached packet fragment and the current packet fragment.
[0115] In some implementations, the processor is further configured to compare the source Internet Protocol address and destination Internet Protocol address in the metadata group with a preset allowed Internet Protocol address cache table. The processor is also configured to, in the case where an encryption protocol exists and the source Internet Protocol address and destination Internet Protocol address are not in the allowed Internet Protocol address cache table, extract the server indication name field from the current packet fragment based on the cached packet fragment and the current packet fragment.
[0116] Specifically, the source Internet Protocol address is the Internet Protocol address corresponding to the target terminal that sent the current message segment, used to identify the sender of the message.
[0117] The destination Internet Protocol address is the Internet Protocol address corresponding to the target server that the current message segment is trying to access, and is used to identify the recipient of the message.
[0118] The allowed Internet Protocol address cache table is a cache database that stores source and destination Internet Protocol address pairs that have passed access control verification and can subsequently be allowed directly. It can be used to quickly filter repeated legitimate access requests.
[0119] When the gateway device determines that the current message segment contains an encryption protocol, it extracts the source Internet Protocol address and the destination Internet Protocol address from the metadata group to form an address pair. The address pair can uniquely identify the sender and receiver of an access request.
[0120] The gateway device then compares the address pairs with a pre-defined cache table of allowed Internet Protocol (IP) addresses. This cache table stores source and destination IP address pairs that have passed access control verification and are permitted access.
[0121] If the query result shows that the address pair already exists in the allowed Internet Protocol address cache table, the access request has been verified before. There is no need to extract the server indication name field and compare it with the rule base again. The gateway device can directly allow the current packet fragment to continue forwarding and quickly complete the access control process.
[0122] If the query results show that the address pair is not in the cache table, it indicates that the access request is the first access or has not passed the verification before. The gateway device starts the server indication name field extraction process, that is, it combines the received cached message fragments and the current message fragment to extract the complete server indication name field, and compares it with the preset rule base to determine the corresponding target operation.
[0123] Furthermore, the Internet Protocol address cache table can be dynamically updated based on access control results. When an access request corresponding to a certain address pair passes verification, the address pair is added to the cache table. When an entry in the cache table reaches its aging time or is determined to be invalid, it is deleted from the cache table, ensuring the effectiveness of the cache table and the rationality of storage space. Understandably, the dynamic updating of the cache table can adapt to the dynamic changes in network access, maintaining the effectiveness and simplicity of the cache table to a certain extent, balancing processing efficiency and system resource consumption, and improving the practicality of web access control methods in complex network environments.
[0124] In some implementations, the allowed Internet Protocol address cache table consists of a hash table and a least recently used linked list. The hash table, as a globally shared data structure, is accessible to all packet processing threads. Its key-value pairs use the source and destination Internet Protocol address pairs as keys, and the corresponding values include the index of the least recently used linked list entry and the thread number that processed that address pair. This allows any thread to quickly locate the address pair by checking if it exists, without traversing all the data, resulting in low search overhead and effectively meeting the fast retrieval requirements in high-concurrency scenarios.
[0125] Secondly, the Least Recently Used (LRU) list is a linear data structure used for cache eviction. Data is ordered by how recently it was accessed, with the least recently accessed data located at the tail of the list for priority eviction, ensuring efficient use of cache table storage space. Each LRU list corresponds one-to-one with a packet processing thread, with each thread maintaining its own independent list without interference. The packet processing thread is an independent execution unit within the gateway device used for parallel processing of network packets. Multiple threads can process different packet streams simultaneously, improving the overall processing throughput of the gateway to some extent.
[0126] When a thread processes a message and needs to add matching address pairs to its cache table, it first adds a new entry to its corresponding Least Recently Used (LRU) list to record the creation time and related information of the address pairs. The list is sorted according to the LRU principle: each time an entry is accessed, it is moved to the head of the list, while long-unaccessed entries are gradually moved to the tail. When the number of entries in the list exceeds or equals a preset threshold, or when some entries reach their aging time, the thread prioritizes eliminating inefficient data at the tail of the list to ensure the list remains concise and efficient, preventing redundant data from consuming excessive memory.
[0127] Furthermore, when a thread adds or removes an entry from its least recently used list, the corresponding entry in the global hash table is updated synchronously. When adding an entry, the address pair, the corresponding list index, and the thread number are written to the hash table; when removing an entry, the key-value pair corresponding to that address pair is deleted from the hash table, ensuring data consistency between the hash table and all linked lists. In essence, this combination of thread isolation and global synchronization avoids, to some extent, the contention caused by multiple threads operating on the same linked list simultaneously, while also ensuring cross-thread data sharing and fast access through the hash table.
[0128] When the gateway device processes a packet, the thread first checks the hash table to see if the source and destination Internet Protocol (IP) address pairs of the current packet already exist in the cache table. If they exist in the cache table, the packet is allowed to pass directly. If they do not exist in the cache table, the subsequent field extraction and policy matching process is executed, and address pairs that meet the allowance conditions are added to the thread-specific least recently used linked list and the global hash table.
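As a non-limiting illustration, the combination of a globally shared hash table and per-thread least recently used lists might be sketched as follows. The class name `AllowedIpCache`, its parameters, and the use of a lock around the shared table are assumptions. As a simplification, the hash table value stores only the owning thread number, and an entry's recency is refreshed only when its owning thread looks it up.

```python
import threading
import time
from collections import OrderedDict
from typing import Dict, List, Optional, Tuple

AddrPair = Tuple[str, str]  # (source IP address, destination IP address)


class AllowedIpCache:
    """Sketch: shared hash table plus one LRU list per processing thread."""

    def __init__(self, num_threads: int, max_entries: int, aging_seconds: float):
        self._lock = threading.Lock()           # guards the shared hash table
        self._table: Dict[AddrPair, int] = {}   # address pair -> owning thread number
        # Per-thread list: address pair -> creation time; first item is the head.
        self._lru: List[OrderedDict] = [OrderedDict() for _ in range(num_threads)]
        self._max = max_entries
        self._aging = aging_seconds

    def lookup(self, thread_id: int, pair: AddrPair) -> bool:
        with self._lock:
            owner = self._table.get(pair)
        if owner is None:
            return False
        if owner == thread_id:
            # Accessed entries move to the head of the owning thread's list.
            self._lru[thread_id].move_to_end(pair, last=False)
        return True

    def add(self, thread_id: int, pair: AddrPair, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        with self._lock:
            if pair in self._table:
                return
            self._table[pair] = thread_id
        lru = self._lru[thread_id]
        lru[pair] = now
        lru.move_to_end(pair, last=False)
        self._evict(thread_id, now)

    def _evict(self, thread_id: int, now: float) -> None:
        """Drop tail entries while the list is over capacity or the tail has aged out."""
        lru = self._lru[thread_id]
        while lru:
            tail = next(reversed(lru))
            if len(lru) > self._max or now - lru[tail] >= self._aging:
                del lru[tail]
                with self._lock:
                    self._table.pop(tail, None)  # keep the hash table in sync
            else:
                break
```

Only the owning thread modifies its list, so the lock is held just for the short hash table operations; other threads can still hit an entry through the shared table.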
[0129] In this way, through the filtering mechanism of the allowed Internet Protocol address cache table, repeated access requests that have already passed access control can be quickly screened out. This avoids repeated extraction and comparison operations for such requests to a certain extent, reduces the system resource consumption of the gateway device, and improves the forwarding performance and processing efficiency of the gateway device. Furthermore, for first-time access or requests that have not passed verification, the complete field extraction and comparison process is still executed, which improves the security and accuracy of web page access control to a certain extent.
[0130] Please see Figure 8. In some implementations, step 03 further includes: 031: If the server indication name field matches the allow policy in the preset rule base, determine to perform an allow operation on the target packet and add the source Internet Protocol address and destination Internet Protocol address to the allowed Internet Protocol address cache table; 032: If the server indication name field matches the interception policy in the preset rule base, determine to perform an interception operation on the target packet.
[0131] In some implementations, the determining module is further configured to determine, if the server indication name field matches the allow policy in the preset rule base, to perform an allow operation on the target packet, and to add the source Internet Protocol address and destination Internet Protocol address to the allowed Internet Protocol address cache table. The determining module is also configured to determine to perform an interception operation on the target packet if the server indication name field matches the interception policy in the preset rule base.
[0132] In some implementations, the processor is further configured to determine, if the server indication name field matches the allow policy in the preset rule base, to perform an allow operation on the target packet, and to add the source Internet Protocol address and destination Internet Protocol address to the allowed Internet Protocol address cache table. The processor is also configured to determine, if the server indication name field matches the interception policy in the preset rule base, to perform an interception operation on the target packet.
[0133] Specifically, the allow policy is a set of policies in the preset rule base that allow access, including a list of server indication names that allow access, legal domain keywords, etc. Access requests that conform to the allow policy will be allowed to be forwarded.
[0134] The allow operation is the processing action performed on messages that conform to the allow policy, ensuring the normal execution of legitimate access requests.
[0135] The interception policy is the set of policies in the preset rule base that prohibit access, including a list of prohibited server indication names, prohibited keywords, etc. Access requests that match the interception policy will be intercepted.
[0136] The interception operation is the processing action performed on packets that conform to the interception policy, preventing the packets from being forwarded and terminating the unauthorized access request.
[0137] After the gateway device extracts the complete server indication name field, it compares the server indication name field with the policies in the preset rule base. The allow policy and the interception policy in the preset rule base are pre-configured by the user according to actual needs. For example, the allow policy includes a list of server indication names that the enterprise allows employees to access, and the interception policy includes a list of server indication names of malicious websites and related keywords that are prohibited from access.
[0138] If the server indication name field exactly matches any rule in the allow policy, or meets the keyword matching rule of the allow policy, the server indication name field is determined to comply with the allow policy. At this point, the gateway device determines to allow the packet corresponding to the current packet fragment. At the same time, to avoid repeated field extraction and comparison for subsequent access requests of this address pair, the gateway device adds the address pair formed by the source and destination Internet Protocol addresses of the packet to the allowed Internet Protocol address cache table. Subsequent access requests for this address pair can then be quickly allowed through the cache table.
[0139] If the server indication name field matches a rule in the interception policy, or conforms to the keyword matching rule of the interception policy, the field is determined to conform to the interception policy. At this point, the gateway device determines to perform an interception operation on the packet corresponding to the current packet fragment, preventing the packet from being forwarded further and terminating the access request. For intercepted access requests, the address pair is not added to the allowed Internet Protocol address cache table, so subsequent access requests for that address pair still undergo the complete field extraction and comparison process, thus guaranteeing the security of access control.
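As a non-limiting illustration, the comparison of the server indication name field against the allow policy and the interception policy might be sketched as follows. The names `Policy`, `Action`, and `decide` are hypothetical. Checking the allow policy before the interception policy, and returning a separate `NO_MATCH` result when neither policy matches, are assumptions, since the description does not state a precedence or a default action.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Set, Tuple

AddrPair = Tuple[str, str]  # (source IP address, destination IP address)


class Action(Enum):
    ALLOW = "allow"
    INTERCEPT = "intercept"
    NO_MATCH = "no_match"  # assumed placeholder; handling is not specified


@dataclass(frozen=True)
class Policy:
    names: FrozenSet[str] = frozenset()   # exact server indication names
    keywords: Tuple[str, ...] = ()        # substrings that also count as a match

    def matches(self, sni: str) -> bool:
        return sni in self.names or any(k in sni for k in self.keywords)


def decide(sni: str, allow: Policy, intercept: Policy,
           allowed_pairs: Set[AddrPair], pair: AddrPair) -> Action:
    sni = sni.lower().rstrip(".")
    if allow.matches(sni):
        allowed_pairs.add(pair)  # only allowed requests are cached
        return Action.ALLOW
    if intercept.matches(sni):
        return Action.INTERCEPT  # the address pair is deliberately not cached
    return Action.NO_MATCH
```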
[0140] In this way, by clearly defining the policy execution logic after the server indication name field is extracted, the problem of unclear policy execution in traditional solutions is solved, ensuring the normal forwarding of legitimate requests and the effective blocking of illegal requests. This improves the accuracy and reliability of access control to a certain extent. Furthermore, by adding address pairs that conform to the allow policy to the cache table, the optimization effect of the cache table is enhanced, which reduces the invalid processing of repeated requests to a certain extent, reduces system resource consumption, and improves the overall processing efficiency of the gateway device.
[0141] Please see Figure 9. In some embodiments, step 031 further includes: 0311: Construct a reset message, wherein the source Internet Protocol address, destination Internet Protocol address, source port, and destination port of the reset message are the same as the destination Internet Protocol address, source Internet Protocol address, destination port, and source port of the metadata group of the target message, respectively; 0312: Send the reset message to the target terminal. After receiving the reset message, the target terminal initiates web page access again.
[0142] In some implementations, the determining module is further configured to construct a reset message, wherein the source Internet Protocol address, destination Internet Protocol address, source port, and destination port of the reset message are the same as the destination Internet Protocol address, source Internet Protocol address, destination port, and source port of the metadata group of the target message, respectively. The determining module is also configured to send the reset message to the target terminal, and upon receiving the reset message, the target terminal initiates web page access again.
[0143] In some implementations, the processor is further configured to construct a reset message, wherein the source Internet Protocol address, destination Internet Protocol address, source port, and destination port of the reset message are the same as the destination Internet Protocol address, source Internet Protocol address, destination port, and source port of the metadata group of the target message, respectively. The processor is also configured to send the reset message to the target terminal, and upon receiving the reset message, the target terminal initiates web page access again.
[0144] Specifically, the reset message is used to terminate the current TCP connection. By setting the reset flag in the TCP header, both parties to the connection are triggered to reset the connection, which can quickly disconnect the current communication link.
[0145] The source port is the port number on the target terminal that initiated the access request, which, together with the source Internet Protocol address, identifies the communication endpoint of the initiator.
[0146] When the server indication name field matches the allow policy in the preset rule base, the gateway device needs to perform an allow operation. However, the current TCP connection is established between the gateway device's local upper-layer protocol stack and the target terminal, not between the target terminal and the actual target server. If this TCP connection continued to be used to forward data, the communication context between the target terminal and the actual target server would be inconsistent, resulting in data transmission failures or anomalies. Therefore, the gateway device needs to terminate the current TCP connection first, and then guide the target terminal to establish a direct connection with the target server.
[0147] The gateway device first constructs a reset message based on the TCP protocol specification. In this message, the source Internet Protocol address is set to the destination Internet Protocol address in the target message's metadata group (i.e., the target server's address), the destination Internet Protocol address is set to the source Internet Protocol address in the target message's metadata group (i.e., the target terminal's address), the source port is set to the destination port in the target message's metadata group (i.e., the target server's service port), and the destination port is set to the source port in the target message's metadata group (i.e., the target terminal's request port). With this reverse configuration, the target terminal, upon receiving the reset message, interprets it as a connection reset initiated by the target server and therefore triggers its own connection reset process.
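As a non-limiting illustration, the reverse configuration of the reset message might be sketched as follows. The names `Endpoints`, `reversed_endpoints`, and `build_rst_header` are hypothetical. The sequence number is passed in as a parameter because the description does not state how it is chosen, and the checksum is left at zero on the assumption that a later stage computes it over the pseudo-header.

```python
import struct
from dataclasses import dataclass

TCP_FLAG_RST = 0x04  # reset flag bit in the TCP header


@dataclass(frozen=True)
class Endpoints:
    """Addresses and ports taken from the target message's metadata group."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def reversed_endpoints(target: Endpoints) -> Endpoints:
    """Swap source and destination so the reply appears to come from the server."""
    return Endpoints(target.dst_ip, target.src_ip, target.dst_port, target.src_port)


def build_rst_header(target: Endpoints, seq: int) -> bytes:
    """Build a 20-byte TCP header with the RST flag set (checksum left at 0)."""
    rst = reversed_endpoints(target)
    offset_and_flags = (5 << 12) | TCP_FLAG_RST  # data offset of 5 words, no options
    return struct.pack(
        "!HHIIHHHH",
        rst.src_port,      # server's service port
        rst.dst_port,      # terminal's request port
        seq,               # sequence number (assumed to be supplied by the caller)
        0,                 # acknowledgment number
        offset_and_flags,
        0,                 # window
        0,                 # checksum placeholder
        0,                 # urgent pointer
    )
```

The IP header of the reset message would carry `reversed_endpoints(target).src_ip` and `.dst_ip` in the same way.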
[0148] Subsequently, the gateway device sends the constructed reset message to the target terminal. At the same time as sending the reset message, the gateway device adds the address pair formed by the source Internet Protocol address and the destination Internet Protocol address of the current message segment to the allowed Internet Protocol address cache table, and sets the aging time of the table entry to ensure the validity of the table entry.
[0149] Upon receiving the reset message, the target terminal, according to the TCP protocol's processing logic, immediately terminates the existing TCP connection with the gateway device's local upper-layer protocol stack and releases the related connection resources. Subsequently, the target terminal automatically re-initiates a web page access request to the target server. Since the gateway device has already added the corresponding Internet Protocol address pair to its allowed Internet Protocol address cache table, the newly initiated access request directly hits the cache table upon reaching the gateway device. This eliminates the need for server indication name field extraction and rule base comparison, allowing the request to be forwarded rapidly through the gateway to the target server and completing the legitimate access.
[0150] Thus, by constructing a reset message and sending it to the target terminal, the current invalid connection can be quickly terminated, avoiding transmission anomalies caused by connection context mismatch. This ensures that the target terminal can successfully establish a direct connection with the target server, guaranteeing the normal execution of legitimate access requests. Furthermore, the reverse address and port configuration of the reset message prevent the target terminal from detecting the intervention of the gateway device, ensuring the transparency and smoothness of the access process to a certain extent. In addition, the access request re-initiated by the target terminal can hit the cache table, enabling the rapid passage of web page access requests. This effectively improves the processing efficiency of duplicate legitimate requests to a certain extent, fully leveraging the optimization function of the cache table.
[0151] Please see Figure 10. In some embodiments, step 032 further includes: 0321: Construct a response message including a preset interception prompt page, wherein the source Internet Protocol address, destination Internet Protocol address, source port, and destination port of the response message are the same as the destination Internet Protocol address, source Internet Protocol address, destination port, and source port of the target message, respectively; 0322: Send the response message to the target terminal to display the interception prompt page on the target terminal.
[0152] In some implementations, the determining module is further configured to construct a response message including a preset interception prompt page, wherein the source Internet Protocol address, destination Internet Protocol address, source port, and destination port of the response message are the same as the destination Internet Protocol address, source Internet Protocol address, destination port, and source port of the target message, respectively. The determining module is further configured to send the response message to the target terminal to display the interception prompt page on the target terminal.
[0153] In some embodiments, the processor is further configured to construct a response message including a preset interception prompt page, wherein the source Internet Protocol address, destination Internet Protocol address, source port, and destination port of the response message are the same as the destination Internet Protocol address, source Internet Protocol address, destination port, and source port of the target message, respectively. The processor is further configured to send the response message to the target terminal to display the interception prompt page on the target terminal.
[0154] Specifically, the response message is a message constructed by the gateway device to provide feedback on the interception result to the target terminal. The implementation of this application adopts the HTTP response format and may include the content of the interception prompt page.
[0155] The preset interception prompt page is a HyperText Markup Language (HTML) page pre-configured in the gateway device, which includes information such as the reason for the access being intercepted, relevant regulations, and compliance guidelines, and is used to inform users of the interception.
[0156] First, when the server indication name field matches the interception policy in the preset rule base, the gateway device determines to perform an interception operation, preventing the packet from being forwarded to the target server. Furthermore, in some implementations, to ensure that the user clearly understands the reason for the interception and avoids unnecessary retries, the gateway device constructs a response message including the content of the interception prompt page.
[0157] The construction of response messages must strictly adhere to the HTTP protocol specifications to ensure that the target terminal's browser can parse them correctly. During construction, the address and port configuration of the response message follows a reverse mapping principle: the source Internet Protocol address of the response message is set to the destination Internet Protocol address of the target message (i.e., the target server's address); the destination Internet Protocol address is set to the source Internet Protocol address of the target message (i.e., the target terminal's address); the source port is set to the destination port of the target message (i.e., the target server's service port); and the destination port is set to the source port of the target message (i.e., the target terminal's request port). This configuration, to a certain extent, ensures that the response message is accurately delivered to the target terminal and that the target terminal perceives the response message as originating from the target server, thus enabling normal reception and parsing.
[0158] The TCP payload of the response message typically includes the complete content of the preset interception prompt page. This prompt page is an HTML page pre-configured in the gateway device, and its content can be customized according to actual needs. It usually includes a clear message indicating that access has been intercepted, an explanation of the reason, references to relevant compliance rules, and information on user consultation channels, helping users understand the basis for the interception and guiding them to compliant access. The gateway device copies the complete HTML code of the preset interception prompt page into the TCP payload of the response message, ensuring the complete transmission of the page content.
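As a non-limiting illustration, the TCP payload of such a response message might be assembled as follows. The page content, the function name `build_block_response`, the 403 status code, and the `Connection: close` header are all assumptions chosen for the example; the description only requires an HTTP-format response that carries the prompt page.

```python
# Hypothetical prompt page; a real deployment would load its own configured HTML.
BLOCK_PAGE_HTML = (
    "<html><head><title>Access blocked</title></head>"
    "<body><h1>Access blocked</h1>"
    "<p>This site is not permitted by the access control policy.</p>"
    "</body></html>"
)


def build_block_response(html: str = BLOCK_PAGE_HTML) -> bytes:
    """Return an HTTP response (status line, headers, body) as the TCP payload."""
    body = html.encode("utf-8")
    head = (
        "HTTP/1.1 403 Forbidden\r\n"          # assumed status code
        "Content-Type: text/html; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"    # lets the browser know where the page ends
        "Connection: close\r\n"               # the gateway closes the connection afterwards
        "\r\n"
    ).encode("ascii")
    return head + body
```

The addresses and ports of the packet that carries this payload would be configured in reverse relative to the target message, as described above.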
[0159] After the response message is constructed, the gateway device sends it to the target terminal. Because the address and port configuration of the response message correspond inversely to the current message and conform to the HTTP protocol specification, the target terminal's browser can successfully receive the response message and parse its HTML content, subsequently displaying a pre-set interception prompt page on the interface. Users can intuitively understand the reason for the blocked access through this pre-set interception prompt page, effectively preventing the repeated initiation of invalid requests.
[0160] Furthermore, after sending the response message, the gateway device will terminate the current TCP connection with the target terminal and release the relevant connection resources to avoid excessive resource consumption.
[0161] In this way, by constructing a response message that includes an interception prompt page, clear feedback is provided to the user while the interception operation is performed. This improves the user experience, avoids repeated access requests made because the user is unaware of the interception, and reduces the waste of network and system resources. In addition, the reverse address and port configuration ensures, to a certain extent, that the target terminal can receive and parse the response message normally, avoiding message dropping or parsing failure due to mismatched configurations. This ensures the effective delivery of the interception prompt information and enhances the stability and practicality of the web page access control function.
[0162] Please refer to Figure 11, Figure 12, Figure 13, and Figure 14. The following uses an example in which the target terminal is an intranet user and the target server is a web server to illustrate the web page access control method of this application. When an intranet user initiates a web page access request as the target terminal, the current packet fragment is encapsulated into a transport layer packet using a transport layer protocol. This packet then enters the gateway device via its local area network (LAN) interface and flows through a detection and routing node. This node extracts the five-tuple information from the transport layer packet to perform initial packet routing. First, the detection and routing node compares the source and destination Internet Protocol (IP) addresses of the packet with the allowed IP address cache table. If the address pair is in the allowed IP address cache table, the packet is directly allowed through to the web server on the wide area network (WAN) side. If it is not in the allowed IP address cache table, the packet proceeds to the next processing step.
[0163] Subsequently, the gateway device determines whether the packet is encrypted based on the transport layer protocol and destination port in the metadata group. If the packet fragment is encrypted, the TCP protocol layer of the local upper-layer protocol stack establishes a Transmission Control Protocol (TCP) connection with the intranet user to reliably receive the current packet fragment and the cached packet fragments, complete TCP reassembly, and store the result in the receive buffer. Before the content of the TCP receive buffer is read by the TLS protocol layer, the SNI detection module of the local upper-layer protocol stack extracts the complete server indication name field from the receive buffer of the TCP protocol layer and compares it with the preset rule base. If the field matches the interception policy and an interception prompt page needs to be displayed, the current TLS handshake process continues. After the intranet user sends an HTTP request, the application layer HTTP server constructs a response packet including the preset interception prompt page, configures its addresses and ports in reverse, and sends it to the intranet user. If no prompt page needs to be displayed, the packet is directly discarded or the connection is terminated.
[0164] If the message fragment is unencrypted, the message format is matched to determine whether it is an HTTP GET message. The URL detection module extracts the Uniform Resource Locator (URL) field and the server hostname field and compares them with the preset rule base. If the interception policy is matched, the message rewriting module constructs a Transmission Control Protocol (TCP) reset message or an HTTP response message and sends it back to the intranet user. If the interception policy is not matched, the message is allowed to pass to the web server.
[0165] Understandably, because the local upper-layer protocol stack completes the TCP handshake with the intranet user in place of the web server, the complete Client Hello message can be obtained from the TCP receive buffer. This improves the SNI extraction success rate, to some extent, in various packet fragmentation and out-of-order delivery scenarios. At the same time, the local upper-layer protocol stack only needs to complete the three-way handshake and receive one TLS Client Hello message from the intranet user before closing the connection. Compared with fully proxying the interaction between the user and the web server, this reduces system resource consumption and improves system performance to some extent.
[0166] Referring to Figure 15 and Figure 16, in some implementations, the gateway device is a Vector Packet Processing Engine (VPP), and the detection and filtering node is registered in the packet processing feature arc of the VPP. Step 01 includes: 011: the VPP receives the current packet fragment. Step 02 includes: 029: the local upper-layer protocol stack of the VPP extracts the server indication name field from the current packet fragment based on the cached packet fragments and the current packet fragment. Step 03 includes: 033: when the local upper-layer protocol stack has extracted the complete server indication name field, the target operation for the target packet corresponding to the current packet fragment is determined based on the complete server indication name field and the preset rule base.
[0167] In some implementations, the acquisition module is further configured to receive the current packet fragment by the Vector Packet Processing Engine (VPP). The extraction module is further configured to allow the VPP's local upper-layer protocol stack to extract the server indication name field from the current packet fragment based on the cached packet fragments and the current packet fragment. The determination module is further configured to, when the local upper-layer protocol stack has extracted the complete server indication name field, determine the target operation for the target packet corresponding to the current packet fragment based on the complete server indication name field and a preset rule base.
[0168] In some implementations, the processor is further configured to receive the current packet fragment by the Vector Packet Processing Engine (VPP). The processor is also configured to allow the VPP's native upper-layer protocol stack to extract the server indication name field from the current packet fragment based on the cached packet fragments and the current packet fragment. Furthermore, the processor is configured to, upon obtaining the complete server indication name field, determine the target operation for the target packet corresponding to the current packet fragment based on the complete server indication name field and a preset rule base.
[0169] Specifically, the Vector Packet Processing Engine (VPP) is a fast and scalable multi-platform network protocol stack that runs in Linux user space, supports various hardware architectures such as x86 and ARM, and has high-performance packet forwarding capabilities. It is a commonly used underlying architecture core for high-performance gateways.
[0170] The detection and filtering node is a functional node registered in the packet processing feature arc of the VPP. It filters, diverts, and performs preliminary processing on packets flowing through the VPP, and is a key node for achieving precise packet control.
[0171] The packet processing feature arc is the logical link within the VPP that carries the packet processing flow. Functional nodes are mounted on it in a preset order, and a packet passes through each node in sequence along the feature arc to complete processing.
[0172] The local upper-layer protocol stack is the network protocol stack built into VPP, which runs in user space and has the ability to process protocols such as TCP and TLS. It does not rely on the operating system kernel protocol stack and can independently complete operations such as message reception, parsing, and field extraction.
[0173] In some implementations, the Vector Packet Processing Engine (VPP) can be used as the gateway device. VPP directly interfaces with the hardware network card through technologies such as the Data Plane Development Kit (DPDK), which can efficiently receive and send packets, avoiding the context switching overhead between kernel mode and user mode in traditional architectures and improving packet processing speed.
[0174] In some implementations, a detection and filtering node can also be registered in the packet processing feature arc of the VPP. This node is given the packet filtering and diversion function and becomes the first checkpoint that packets pass before entering the subsequent processing flow.
[0175] When a message sent by the target terminal arrives at the VPP, the VPP first receives the current segment of the message and quickly completes the initial reception and encapsulation parsing of the message through hardware acceleration technology, ensuring that the message can quickly enter the processing flow.
[0176] Subsequently, the detection and filtering node performs preliminary processing on the current packet fragment. After confirming the TCP connection to which it belongs and its related attributes, the packet fragment is directed to the VPP's local upper-layer protocol stack.
[0177] Subsequently, VPP's native upper-layer protocol stack invokes its own TCP connection management and fragment reassembly capabilities to associate and integrate the current packet fragment with other fragments of the same packet that have been received and cached.
[0178] During the integration process, the local upper-layer protocol stack determines the correct order of each segment in the original message based on information such as the TCP sequence number, ensuring the accuracy of segment concatenation. Simultaneously, following the format characteristics of the Client Hello message in the TLS protocol, it extracts the server indication name field from the assembled message segments. For fields split into multiple segments, the protocol stack's ordered caching and concatenation capabilities ensure the completeness of the extracted field.
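The ordering step can be illustrated with a small reassembly buffer keyed on TCP sequence numbers. This is a simplified sketch: the `ReassemblyBuffer` class is hypothetical, and it deliberately ignores overlapping segments, retransmissions, and sequence number wraparound, all of which a real protocol stack handles.

```python
from typing import Dict

class ReassemblyBuffer:
    """Collect segments in any arrival order and expose the contiguous byte stream."""

    def __init__(self, initial_seq: int) -> None:
        self.next_seq = initial_seq            # sequence number of the next expected byte
        self.pending: Dict[int, bytes] = {}    # out-of-order segments, keyed by sequence number
        self.data = bytearray()                # bytes already placed in order

    def add(self, seq: int, payload: bytes) -> bytes:
        """Cache one segment, then append every segment that is now contiguous."""
        self.pending[seq] = payload
        while self.next_seq in self.pending:
            chunk = self.pending.pop(self.next_seq)
            self.data += chunk
            self.next_seq += len(chunk)
        return bytes(self.data)

buf = ReassemblyBuffer(initial_seq=1000)
after_late_half = buf.add(1005, b"world")     # arrives first, but belongs second
after_first_half = buf.add(1000, b"hello")    # fills the gap
```

Because nothing is released until the gap is filled, a field that straddles two segments is only ever parsed from ordered, contiguous bytes.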
[0179] Once the local upper-layer protocol stack successfully extracts the complete server indication name field, it compares this field with a preset rule base. If the field matches the allow policy in the rule base, the packet is allowed to continue forwarding; if it matches the block policy, the packet forwarding is blocked, thus achieving precise control over web page access.
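A minimal sketch of the comparison against the rule base is shown below. The rule format (an ordered list of domain and action pairs), suffix matching on domain labels, and the `"no_match"` result are assumptions made for the example; this application does not prescribe how the rule base is stored or matched.

```python
from typing import List, Tuple

def match_rule(sni: str, rules: List[Tuple[str, str]]) -> str:
    """Return the action of the first rule whose domain equals or is a parent of sni."""
    host = sni.lower().rstrip(".")
    for domain, action in rules:
        if host == domain or host.endswith("." + domain):
            return action
    return "no_match"

rules = [("video.example", "block"), ("example.org", "allow")]
verdict = match_rule("cdn.video.example", rules)
```

Matching on a leading dot avoids treating `notvideo.example` as a subdomain of `video.example`.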
[0180] Thus, by using the Vector Packet Processing Engine (VPP) as the gateway device, the high-performance forwarding architecture and user-space protocol stack of the VPP improve the speed of packet reception, field extraction, and policy matching. This reduces, to some extent, the context switching and layered encapsulation overhead of traditional architectures and lowers the latency of web page access control. In addition, the flexible deployment of functional nodes in the VPP's scalable architecture, together with the cooperation between the detection and filtering node and the local upper-layer protocol stack, speeds up packet flow between processing nodes, thereby improving the efficiency and accuracy of packet processing to a certain extent.
[0181] Referring to Figure 16 and Figure 17, in some implementations, step 029 further includes: 0291: the detection and filtering node extracts the metadata group of the transport layer packet; 0292: whether an encryption protocol exists in the current packet fragment is determined based on the transport layer protocol and destination port in the metadata group; 0293: when an encryption protocol exists, the Vector Packet Processing Engine (VPP) sets the target node of the current packet fragment to the local node of the VPP and sends the current packet fragment to the local node, so that it is delivered to the local upper-layer protocol stack; 0294: the local upper-layer protocol stack extracts the server indication name field from the current packet fragment based on the cached packet fragments and the current packet fragment.
[0182] In some implementations, the extraction module is further configured to extract the metadata group of the transport layer packet through the detection and filtering node. The extraction module is also configured to determine whether an encryption protocol exists in the current packet fragment based on the transport layer protocol and destination port in the metadata group. If an encryption protocol exists, the extraction module is further configured to set the target node of the current packet fragment to the local node of the Vector Packet Processing Engine (VPP) and send the current packet fragment to the local node, so that it is delivered to the local upper-layer protocol stack. The extraction module is also configured to have the local upper-layer protocol stack extract the server indication name field from the current packet fragment based on the cached packet fragments and the current packet fragment.
[0183] In some implementations, the processor is further configured to extract the metadata group of the transport layer packet through the detection and filtering node. The processor is also configured to determine whether an encryption protocol exists in the current packet fragment based on the transport layer protocol and destination port in the metadata group. If an encryption protocol exists, the processor is further configured to set the target node of the current packet fragment to the local node of the Vector Packet Processing Engine (VPP) and send the current packet fragment to the local node, so that it is delivered to the local upper-layer protocol stack. The processor is also configured to have the local upper-layer protocol stack extract the server indication name field from the current packet fragment based on the cached packet fragments and the current packet fragment.
[0184] Specifically, the local node is a built-in functional node of VPP, used to receive packets that need to be processed by the local upper-layer protocol stack. It is a key node for packets to enter the protocol processing path from the VPP forwarding path.
[0185] Upon entering VPP, the current packet fragment first flows through the detection and filtering nodes registered in the packet processing feature arc. These nodes perform preliminary parsing of the transport layer packet, extracting metadata groups. The efficient packet parsing capabilities of VPP enable the extraction of metadata groups, quickly retrieving key transmission information from the IP and TCP headers without requiring deep parsing of the packet payload, thus improving packet processing efficiency to some extent.
[0186] Subsequently, the detection and filtering nodes determine whether the current packet fragment contains an encryption protocol based on the transport layer protocol and destination port in the metadata group.
[0187] When an encryption protocol is detected, VPP sets the target node of the current message fragment to its own local node and accurately sends the current message fragment to the local node through its internal node scheduling mechanism. The local node then imports the message fragment into the receive buffer of the local upper-layer protocol stack.
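The diversion decision can be sketched as a simple next-node selection. The node names `ip4-local` and `ip4-lookup` are the ones used in the worked example later in this description, whereas `url-detect` is a hypothetical name for the unencrypted detection path. Treating TCP port 443 as encrypted and TCP port 80 as unencrypted also follows that worked example.

```python
def select_next_node(protocol: str, dst_port: int) -> str:
    """Pick the next processing node from the transport protocol and destination port."""
    if protocol == "tcp" and dst_port == 443:
        return "ip4-local"    # encrypted: hand over to the local upper-layer protocol stack
    if protocol == "tcp" and dst_port == 80:
        return "url-detect"   # unencrypted HTTP: hypothetical URL detection node
    return "ip4-lookup"       # everything else: ordinary route lookup and forwarding

next_node = select_next_node("tcp", 443)
```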
[0188] Finally, the local upper-layer protocol stack retrieves the current message fragment from the receive buffer, combines it with the other fragments of the same message that have already been received and cached, and extracts the server indication name field according to the preset extraction logic.
[0189] In this way, through the accurate diversion performed by the detection and filtering node, only messages that use an encryption protocol enter the local upper-layer protocol stack for processing, so unencrypted messages do not occupy its resources. This improves the efficiency and targeting of web page access control under the vector packet processing engine architecture. Furthermore, the direct hand-off between the local node and the local upper-layer protocol stack reduces message transmission latency to a certain extent, ensures that server indication name field extraction is executed promptly, and improves the efficiency of detection and interception based on the server indication name field.
[0190] Referring to Figure 16 and Figure 18, in some implementations, step 0294 further includes: 02941: the Vector Packet Processing Engine (VPP) establishes a Transmission Control Protocol (TCP) connection with the target terminal; 02942: based on the TCP connection, the local node sends the current message fragment to the local upper-layer protocol stack; 02943: the local upper-layer protocol stack extracts the server indication name field from the current message fragment based on the cached message fragments and the current message fragment.
[0191] In some implementations, the extraction module is also configured to establish a Transmission Control Protocol (TCP) connection between the Vector Packet Processing Engine (VPP) and the target terminal. The extraction module is also configured to send the current packet fragment to the local upper-layer protocol stack via the local node based on the TCP connection. The extraction module is further configured to have the local upper-layer protocol stack extract the server indication name field from the current packet fragment based on the cached packet fragments and the current packet fragment.
[0192] In some implementations, the processor is also configured to establish a Transmission Control Protocol (TCP) connection between the Vector Packet Processing Engine (VPP) and the target terminal. The processor is also configured to send the current packet fragment to the local upper-layer protocol stack via the local node based on the TCP connection. The processor is further configured to have the local upper-layer protocol stack extract the server indication name field from the current packet fragment based on the cached packet fragments and the current packet fragment.
[0193] Specifically, the local upper-layer protocol stack of the Vector Packet Processing Engine (VPP) simulates the behavior of the target server and establishes a Transmission Control Protocol (TCP) connection with the target terminal that sent the current packet fragment. The connection is established through the TCP three-way handshake: the target terminal sends a SYN packet to request a connection; after receiving this packet, the VPP's local upper-layer protocol stack replies with a SYN-ACK packet as confirmation; and after receiving the confirmation, the target terminal replies with an ACK packet, which completes the TCP connection establishment.
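The sequence and acknowledgment numbers exchanged in this handshake follow standard TCP arithmetic, sketched below from the point of view of the side that answers the SYN. The dictionary-based packet representation and both function names are hypothetical and serve only to illustrate the arithmetic.

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around

def syn_ack_for(syn_seq: int, local_isn: int) -> dict:
    """Build the SYN-ACK reply: acknowledge the peer's SYN by its sequence number + 1."""
    return {"flags": "SYN-ACK", "seq": local_isn, "ack": (syn_seq + 1) % SEQ_MODULUS}

def handshake_complete(ack_pkt: dict, local_isn: int) -> bool:
    """The final ACK must acknowledge our own initial sequence number + 1."""
    return ack_pkt["flags"] == "ACK" and ack_pkt["ack"] == (local_isn + 1) % SEQ_MODULUS

reply = syn_ack_for(syn_seq=4000, local_isn=9000)
final_ack = {"flags": "ACK", "seq": 4001, "ack": 9001}
established = handshake_complete(final_ack, local_isn=9000)
```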
[0194] After the connection is established, the current message fragment sent by the target terminal is transmitted to VPP through the TCP connection. VPP's local upper-layer protocol stack receives the current message fragment through the local node. The local node acts as a bridge for messages to enter the protocol stack from the VPP forwarding path, and can quickly import message fragments into the protocol stack's receive buffer, reducing transmission latency.
[0195] Subsequently, the local upper-layer protocol stack retrieves the current message fragment from the receive buffer and simultaneously retrieves other fragments of the same message that have been previously received and cached. Following the format characteristics of the Client Hello message in the TLS protocol, it extracts the server indication name field from the current message fragment. If the field is split into multiple fragments, the protocol stack determines the order of the fragments based on the Transmission Control Protocol sequence numbers and concatenates the fragments according to their sequence numbers to ensure that the complete server indication name field is extracted.
[0196] In this way, the Vector Packet Processing Engine (VPP), through its local upper-layer protocol stack, emulates the target server and establishes a Transmission Control Protocol (TCP) connection with the target terminal. All message fragments on this connection are stored in the corresponding receive buffer. With the caching and ordering mechanism of the protocol stack, fragments that arrive late or out of order are placed in their correct positions, so that the complete server indication name field can be extracted even when messages are fragmented or delivered out of order. This effectively avoids interception omissions and improves, to a certain extent, the accuracy of access control for encrypted traffic. Moreover, this connection does not require full proxying of the communication or any encryption and decryption operations, which reduces memory resource consumption to a certain extent and preserves gateway forwarding performance and network access efficiency.
[0197] Furthermore, the graph node scheduling architecture of the vector packet processing engine enables packets to be quickly sent to the local upper-layer protocol stack after being diverted by the detection and filtering nodes, without the need for complex kernel forwarding paths. This reduces packet processing latency to a certain extent, improves detection and interception efficiency, and optimizes the user's network experience.
[0198] Referring to Figure 16 and Figure 19, in some embodiments, step 033 further includes: 0331: if the server indication name field matches the interception policy in the preset rule base, a response message including the preset interception prompt page is constructed, wherein the source Internet Protocol address, destination Internet Protocol address, source port and destination port of the response message are the same as the destination Internet Protocol address, source Internet Protocol address, destination port and source port of the target message, respectively; 0332: the target node of the response message is set to the Internet Protocol address routing lookup node of the Vector Packet Processing Engine (VPP); 0333: the response message is sent to the target terminal via the Internet Protocol address routing lookup node, so that the interception prompt page is displayed on the target terminal.
[0199] In some implementations, the determining module is further configured to construct a response message including a preset interception prompt page if the server indication name field matches the interception policy in a preset rule base. The source Internet Protocol address, destination Internet Protocol address, source port, and destination port of the response message are the same as the destination Internet Protocol address, source Internet Protocol address, destination port, and source port of the target message, respectively. The determining module is also configured to set the target node of the response message to the Internet Protocol address routing lookup node of the Vector Packet Processing Engine (VPP). The determining module is further configured to send the response message to the target terminal via the Internet Protocol address routing lookup node to display the interception prompt page on the target terminal.
[0200] In some implementations, the processor is further configured to construct a response message including a preset interception prompt page if the server indication name field matches the interception policy in a preset rule base. The source Internet Protocol address, destination Internet Protocol address, source port, and destination port of the response message are the same as the destination Internet Protocol address, source Internet Protocol address, destination port, and source port of the target message, respectively. The processor is further configured to set the target node of the response message to the Internet Protocol address routing lookup node of the Vector Packet Processing Engine (VPP). The processor is further configured to send the response message to the target terminal via the Internet Protocol address routing lookup node to display the interception prompt page on the target terminal.
[0201] Specifically, the Internet Protocol address routing lookup node is a built-in functional node in VPP used to determine the packet forwarding path. It can look up the routing table based on the destination Internet Protocol address of the packet to determine the packet's exit interface and next-hop address. It is the node in VPP that implements packet forwarding.
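As an illustration of what such a lookup does, the sketch below selects the most specific matching prefix from a small routing table and returns the exit interface and next hop. The linear scan, the table layout, and the interface names are assumptions made for readability and do not describe how VPP stores or searches its routes.

```python
import ipaddress
from typing import Dict, Optional, Tuple

Route = Tuple[str, Optional[str]]  # (exit interface, next-hop address or None if directly connected)

def lookup_route(dst_ip: str, table: Dict[str, Route]) -> Optional[Route]:
    """Return the route of the longest prefix containing dst_ip, or None if none matches."""
    addr = ipaddress.ip_address(dst_ip)
    best_len = -1
    best_route = None
    for prefix, route in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best_len = net.prefixlen
            best_route = route
    return best_route

table = {
    "0.0.0.0/0": ("wan0", "203.0.113.1"),   # default route toward the WAN side
    "192.168.0.0/24": ("lan0", None),       # directly connected LAN
}
to_lan = lookup_route("192.168.0.10", table)
```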
[0202] When the VPP's local upper-layer protocol stack extracts the complete server indication name field, and the server indication name field matches the interception policy in the preset rule base, it needs to perform an interception operation and report the result to the user. The local upper-layer protocol stack then initiates the response message construction process.
[0203] During the construction of the response message, the address and port configuration of the response message must follow a reverse correspondence principle: the source Internet Protocol address of the response message is set to the destination Internet Protocol address of the target message, i.e., the address of the target server; the destination Internet Protocol address of the response message is set to the source Internet Protocol address of the target message, i.e., the address of the target terminal; the source port of the response message is set to the destination port of the target message, i.e., the service port of the target server; and the destination port of the response message is set to the source port of the target message, i.e., the request port of the target terminal. This reverse configuration ensures that the response message is accurately delivered to the target terminal, and the target terminal will believe that the response message comes from the target server, thus correctly parsing and displaying the prompt page.
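The reverse correspondence principle amounts to swapping the two endpoints of the target message. A minimal sketch follows, in which the `Addressing` type and the `reply_addressing` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Addressing:
    """Addresses and ports of one message (hypothetical helper type)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int

def reply_addressing(target: Addressing) -> Addressing:
    """Swap source and destination so the reply appears to come from the target server."""
    return Addressing(
        src_ip=target.dst_ip,      # server address becomes the source
        dst_ip=target.src_ip,      # terminal address becomes the destination
        src_port=target.dst_port,  # server service port becomes the source port
        dst_port=target.src_port,  # terminal request port becomes the destination port
    )

request = Addressing("192.168.0.10", "203.0.113.5", 51000, 443)
response = reply_addressing(request)
```

Applying the swap twice returns the original addressing, which is a quick sanity check that no field was crossed.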
[0204] After the response message is constructed, the local upper-layer protocol stack sets the target node of the response message to the Internet Protocol address routing lookup node of the VPP. This node has efficient route lookup capabilities: it quickly looks up the routing table based on the destination Internet Protocol address of the response message to determine the message's exit interface and forwarding path, thereby reducing, to a certain extent, the latency and redundant operations of route lookup in traditional architectures.
[0205] Finally, the routing lookup node sends the response message to the target terminal according to the determined forwarding path. After receiving the response message, the target terminal parses the HTML content and displays a preset blocking prompt page, informing the user that access has been blocked, thus completing the blocking operation loop.
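For illustration, the payload of such a response message could be assembled as below. The 403 status code, the header set, and the function name `build_block_response` are assumptions made for the example; this application only requires that the response carry the preset interception prompt page.

```python
def build_block_response(html: str) -> bytes:
    """Serialize an HTTP/1.1 response whose body is the interception prompt page."""
    body = html.encode("utf-8")
    head = (
        "HTTP/1.1 403 Forbidden\r\n"            # status code is an assumption
        "Content-Type: text/html; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"      # byte length, not character count
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body

page = "<h1>Blocked</h1>"
resp = build_block_response(page)
```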
[0206] Thus, the vector packet processing engine architecture clearly defines the construction, configuration, and forwarding mechanisms for intercepted response packets. By introducing Internet Protocol address routing lookup nodes, the high-performance routing lookup advantage of the vector packet processing engine is fully utilized. This ensures, to a certain extent, that response packets can be delivered to the target terminal quickly and accurately via the optimal path, improving the efficiency and reliability of interception feedback. Furthermore, the reverse address and port configuration of the response packets ensures, to a certain extent, that the target terminal can correctly identify and parse the packets, successfully display the interception prompt page, and allow users to know the reason for the interception in a timely manner, avoiding invalid retries and reducing the waste of network resources and the processing capacity of the vector packet processing engine.
[0207] Referring to Figure 16, the webpage access control method of this application is illustrated below using an example in which the target terminal is an intranet user, the target server is a web server, and the preset Internet Protocol address cache table is an IP whitelist. When the intranet user, acting as the target terminal, initiates a web page access request, the current packet fragment is encapsulated into a transport layer packet using a transport layer protocol and enters the VPP processing framework through the gateway's LAN-side interface. It is first received by the input node ip4-input and then forwarded to the detection and filtering node. The detection and filtering node extracts the five-tuple information of the transport layer packet and determines the packet type from the transport protocol and destination port in the five-tuple. If the destination port is 443, the current packet fragment is treated as encrypted. The detection and filtering node compares the source and destination Internet Protocol addresses of the packet with the IP whitelist. If a match is found, the packet is passed directly: after a route lookup by the Internet Protocol address routing lookup node ip4-lookup, it is forwarded to the web server on the WAN side via the output node ip4-output, completing the normal access process.
[0208] If the IP whitelist is not matched, the VPP sets the target node of the packet to the local node ip4-local, so that the packet enters the local upper-layer protocol stack. The TCP protocol layer of the local upper-layer protocol stack establishes a Transmission Control Protocol (TCP) connection with the intranet user, receives the current packet fragment and the other cached fragments of the same packet, completes TCP fragment reassembly, and stores the result in the receive buffer. Before the content of the TCP receive buffer is read by the TLS protocol layer, the SNI extraction and detection module of the local upper-layer protocol stack parses that content byte by byte to extract the complete server indication name (SNI) field and compares the SNI field with the preset rule base. If the field matches the allow policy, a TCP RST procedure is initiated, and the pair of source and destination IP addresses is added to the IP whitelist. If it matches the interception policy and an interception prompt interface needs to be displayed, the HTTP protocol layer fills in the preset HTTP interception prompt page, completes the construction of the response packet, and routes the response packet back to the intranet user via the ip4-lookup node. If it matches the interception policy and no interception prompt interface needs to be displayed, a response packet is constructed directly and routed back to the intranet user via the ip4-lookup node.
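The three outcomes of this comparison can be summarized in a short sketch. The function name, the verdict strings, and the returned action labels are hypothetical. Only the behavior (reset and whitelist on allow, prompt page or plain response on interception) is taken from the paragraph above.

```python
from typing import Set, Tuple

def handle_verdict(verdict: str, show_prompt: bool, src_ip: str, dst_ip: str,
                   whitelist: Set[Tuple[str, str]]) -> str:
    """Map the rule-base result to the action taken on the gateway's own connection."""
    if verdict == "allow":
        whitelist.add((src_ip, dst_ip))   # later packets of this pair skip inspection
        return "send_tcp_rst"
    if verdict == "block":
        return "send_block_page" if show_prompt else "send_plain_response"
    return "undecided"                    # outcome for unmatched fields is not specified here

whitelist: Set[Tuple[str, str]] = set()
action = handle_verdict("allow", False, "192.168.0.10", "203.0.113.5", whitelist)
```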
[0209] If the filtering node determines the destination port to be 80, it indicates that the current packet fragment is unencrypted and will proceed to the HTTP URL extraction and detection process. After confirming the packet is an HTTP GET packet through packet format matching, the URL detection module parses the TCP payload, extracts the Uniform Resource Locator (URL) and hostname fields, and performs keyword matching against a preset rule base. If the interception policy is matched, the packet is sent to the packet rewriting module, rewritten as a TCP RST packet or an HTTP response packet including an interception notification interface, and then routed through the ip4-lookup node and replied to intranet users via the LAN side. If the interception policy is not matched, the packet is forwarded normally to the web server on the WAN side.
[0210] Referring to Figure 20, in some implementations, when a web server sends a response message or other external network data to an intranet user, the data enters the gateway through the WAN-side network interface of the gateway device, bypasses the detection and routing node, and goes directly into the gateway's basic routing and forwarding framework.
[0211] After entering the basic routing and forwarding framework, the gateway device parses the IP layer information of the packet, extracts routing information such as the source Internet Protocol address and the destination Internet Protocol address, and then sends the packet to the routing decision module. The routing decision module determines the target forwarding path of the packet and successfully sends the packet to the internal network user via the LAN-side network interface, completing the transmission from the external network to the internal LAN.
[0212] This application also provides an electronic device, including a memory and a processor, wherein the memory stores a computer program, and when the computer program is executed by the processor, it implements the methods of some of the above-described embodiments.
[0213] This application also provides a computer-readable storage medium storing a computer program that, when executed by one or more processors, implements the methods of some of the above-described embodiments.
[0214] This application also provides a computer program product, including a computer program / instructions that, when executed by a processor, implement the methods of some of the above-described embodiments.
[0215] It is understood that a computer program includes computer program code. Computer program code can be in the form of source code, object code, executable files, or some intermediate form. Computer-readable storage media can include: any entity or device capable of carrying computer program code, recording media, USB flash drives, portable hard drives, magnetic disks, optical disks, computer memory, read-only memory (ROM), random access memory (RAM), and software distribution media, etc.
[0216] In this specification, the terms "specifically," "furthermore," "particularly," "understandably," etc., refer to specific features, structures, materials, or characteristics described in connection with embodiments or examples that are included in at least one embodiment or example of this application. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples. Moreover, without contradiction, those skilled in the art can combine and integrate the different embodiments or examples described in this specification, as well as the features of different embodiments or examples.
[0217] Any process or method description in a flowchart or otherwise described herein can be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a particular logical function or process. The scope of the preferred embodiments of this application includes additional implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of this application pertain.
[0218] Although embodiments of this application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting this application. Those skilled in the art can make changes, modifications, substitutions and variations to the above embodiments within the scope of this application.
Claims
1. A webpage access control method, characterized in that the method is used for a gateway device and includes: obtaining the current message fragment sent by the target terminal; extracting, based on the received cached message fragments and the current message fragment, the server indication name field of the current message fragment, wherein the cached message fragments and the current message fragment are fragments of the same target message and are both encapsulated by the transport layer protocol before being transmitted to the gateway device; and, if the complete server indication name field is extracted, determining the target operation for the target message corresponding to the current message fragment based on the complete server indication name field and the preset rule base.
2. The webpage access control method according to claim 1, characterized in that the step of extracting the server indication name field from the current message fragment based on the received cached message fragments and the current message fragment includes: extracting the server indication name field from the current message fragment to determine a first server indication name field fragment; and combining the first server indication name field fragment with a second server indication name field fragment, wherein the second server indication name field fragment is determined by extracting the server indication name field from the cached message fragments.
3. The webpage access control method according to claim 2, characterized in that the method further includes: if the result of the combination is an incomplete server indication name field, caching the result of the combination and recording the sequence number of the last byte in the current message fragment, wherein the sequence number is the position of the last byte of the current message fragment within the target message.
4. The webpage access control method according to claim 1, characterized in that the step of extracting the server indication name field from the current message fragment based on the received cached message fragments and the current message fragment includes: establishing, by the gateway device through its local upper-layer protocol stack, a transmission control protocol connection with the target terminal; receiving the current message fragment via the transmission control protocol connection; and extracting the server indication name field from the current message fragment based on the received cached message fragments and the current message fragment.
5. The webpage access control method according to claim 1, characterized in that the step of extracting the server indication name field from the current message fragment based on the received cached message fragments and the current message fragment includes: extracting the metadata group of the transport layer message, wherein the transport layer message is obtained by encapsulating the current message fragment based on the transport layer protocol; determining, based on the transport layer protocol and destination port in the metadata group, whether the current message fragment contains an encryption protocol; and, in the presence of the encryption protocol, extracting the server indication name field from the current message fragment based on the cached message fragments and the current message fragment.
6. The webpage access control method according to claim 5, characterized in that, The method further includes: In the absence of the encryption protocol, extract the Uniform Resource Locator (URL) field of the current message segment; The target operation for the current message segment is determined based on the preset rule base and the Uniform Resource Locator field.
7. The webpage access control method according to claim 5, characterized in that, In the presence of the encryption protocol, the step of extracting the server indication name field from the current message fragment based on the cached message fragment and the current message fragment includes: The source Internet Protocol address and destination Internet Protocol address in the metadata group are compared with a preset cache table of allowed Internet Protocol addresses; If the encryption protocol exists and the source Internet Protocol address and the destination Internet Protocol address are not in the allowed Internet Protocol address cache table, the server indication name field of the current message fragment is extracted based on the cached message fragment and the current message fragment.
8. The webpage access control method according to claim 7, characterized in that, when the complete server indication name field is extracted, determining the target operation for the target message corresponding to the current message fragment based on the complete server indication name field and the preset rule base includes: if the server indication name field matches an allow policy in the preset rule base, determining to allow the target message, and adding the source Internet Protocol address and the destination Internet Protocol address to the allowed Internet Protocol address cache table; if the server indication name field matches an interception policy in the preset rule base, determining to perform an interception operation on the target message.
9. The webpage access control method according to claim 8, characterized in that, if the server indication name field matches the allow policy in the preset rule base, determining to allow the target message and adding the source Internet Protocol address and the destination Internet Protocol address to the allowed Internet Protocol address cache table includes: constructing a reset message, wherein the source Internet Protocol address, destination Internet Protocol address, source port, and destination port of the reset message are the same as the destination Internet Protocol address, source Internet Protocol address, destination port, and source port in the metadata group of the target message, respectively; and sending the reset message to the target terminal, wherein the target terminal, after receiving the reset message, initiates the webpage access request again.
10. The webpage access control method according to claim 8, characterized in that, if the server indication name field matches the interception policy in the preset rule base, determining to perform an interception operation on the target message includes: constructing a response message including a preset interception prompt page, wherein the source Internet Protocol address, destination Internet Protocol address, source port, and destination port of the response message are the same as the destination Internet Protocol address, source Internet Protocol address, destination port, and source port of the target message, respectively; and sending the response message to the target terminal so that the interception prompt page is displayed on the target terminal.
11. The webpage access control method according to claim 1, characterized in that, the gateway device is a Vector Packet Processing Engine (VPP), and a detection and filtering node is registered in a packet processing feature arc of the VPP; the step of obtaining the current message segment sent by the target terminal includes: the VPP receives the current message segment; the step of extracting the server indication name field from the current message fragment based on the received cached message fragments and the current message fragment includes: the local upper-layer protocol stack of the VPP performs server indication name field extraction processing on the current message fragment based on the cached message fragment and the current message fragment; and, upon extracting the complete server indication name field, determining the target operation for the target message corresponding to the current message fragment based on the complete server indication name field and the preset rule base includes: when the local upper-layer protocol stack extracts the complete server indication name field, it determines the target operation for the target message corresponding to the current message fragment based on the complete server indication name field and the preset rule base.
12. The webpage access control method according to claim 11, characterized in that, the local upper-layer protocol stack of the Vector Packet Processing Engine (VPP) performing server indication name field extraction processing on the current message fragment based on the cached message fragment and the current message fragment includes: extracting the metadata group of the transport layer message by means of the detection and filtering node; determining, based on the transport layer protocol and destination port in the metadata group, whether the current message fragment contains an encryption protocol; in the presence of the encryption protocol, the VPP sets the target node of the current message fragment to the local node of the VPP and sends the current message fragment to the local node, so that the current message fragment is delivered to the local upper-layer protocol stack; and the local upper-layer protocol stack extracts the server indication name field from the current message fragment based on the cached message fragment and the current message fragment.
13. The webpage access control method according to claim 12, characterized in that, the local upper-layer protocol stack performing server indication name field extraction processing on the current message fragment based on the cached message fragment and the current message fragment includes: the Vector Packet Processing Engine (VPP) establishes a Transmission Control Protocol (TCP) connection with the target terminal; based on the Transmission Control Protocol connection, the current message fragment is sent to the local upper-layer protocol stack via the local node; and the local upper-layer protocol stack extracts the server indication name field from the current message fragment based on the cached message fragment and the current message fragment.
14. The webpage access control method according to claim 11, characterized in that, upon extracting the complete server indication name field, the local upper-layer protocol stack determining the target operation for the target message corresponding to the current message fragment based on the complete server indication name field and the preset rule base includes: if the server indication name field matches an interception policy in the preset rule base, constructing a response message including a preset interception prompt page, wherein the source Internet Protocol address, destination Internet Protocol address, source port, and destination port of the response message are the same as the destination Internet Protocol address, source Internet Protocol address, destination port, and source port of the target message, respectively; setting the target node of the response message to the Internet Protocol address route lookup node of the Vector Packet Processing Engine (VPP); and sending the response message to the target terminal via the Internet Protocol address route lookup node, so that the interception prompt page is displayed on the target terminal.
15. An electronic device, characterized in that, the electronic device includes a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, implements the method according to any one of claims 1-14.
16. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores a computer program that, when executed by one or more processors, implements the method of any one of claims 1-14.
17. A computer program product comprising a computer program/instructions, characterized in that, when the computer program/instructions are executed by a processor, the method according to any one of claims 1-14 is implemented.
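For illustration only, and not forming part of the claims, the fragment caching and field extraction recited in claims 1 to 3 might be sketched as follows. This is a minimal Python sketch under stated assumptions, not the applicant's implementation: the names `parse_sni`, `SniReassembler`, and `on_fragment` are hypothetical, the rule base is modeled as a plain dictionary, the default action "allow" for unmatched names is an assumption, and each fragment is assumed to arrive with its byte offset inside the target message (already derived from the transport layer sequence number). Overlapping retransmissions and a ClientHello spanning several TLS records are not handled.

```python
import struct


class Incomplete(Exception):
    """The buffered bytes end before the field being read."""


def _take(buf, pos, n):
    # Return n bytes starting at pos, or signal that more fragments are needed.
    if pos + n > len(buf):
        raise Incomplete
    return buf[pos:pos + n], pos + n


def parse_sni(buf):
    """Return the host name from a TLS ClientHello, "" if it carries no
    server_name extension, or None if buf is still incomplete."""
    try:
        record, pos = _take(buf, 0, 5)        # TLS record header
        hs, pos = _take(buf, pos, 4)          # handshake type + 3-byte length
        if record[0] != 0x16 or hs[0] != 0x01:
            raise ValueError("not a TLS ClientHello")
        _, pos = _take(buf, pos, 2 + 32)      # client version + random
        for width in (1, 2, 1):               # session id, cipher suites, compression
            raw, pos = _take(buf, pos, width)
            _, pos = _take(buf, pos, int.from_bytes(raw, "big"))
        raw, pos = _take(buf, pos, 2)         # total length of all extensions
        end = pos + int.from_bytes(raw, "big")
        while pos < end:
            raw, pos = _take(buf, pos, 4)
            ext_type, ext_len = struct.unpack("!HH", raw)
            body, pos = _take(buf, pos, ext_len)
            if ext_type == 0:                 # server_name extension
                # body: list length (2), name type (1), name length (2), name
                name_len = int.from_bytes(body[3:5], "big")
                return body[5:5 + name_len].decode("ascii")
        return ""
    except Incomplete:
        return None


class SniReassembler:
    def __init__(self, rules):
        self.rules = rules   # hypothetical rule base: host name -> action
        self.cache = {}      # flow key -> {byte offset in message: payload}

    def on_fragment(self, flow, offset, payload):
        """Cache one fragment; return (host, action) once the field is complete,
        otherwise None while fragments are still missing."""
        if not payload:
            return None
        frags = self.cache.setdefault(flow, {})
        frags[offset] = payload
        buf = bytearray()
        while len(buf) in frags:              # join contiguous fragments from offset 0
            buf += frags[len(buf)]
        host = parse_sni(bytes(buf))
        if host is None:
            return None                       # incomplete: keep the cached fragments
        del self.cache[flow]
        return host, self.rules.get(host, "allow")
```

In this sketch a fragment that arrives out of order is simply held in the cache until the fragment at offset 0 and every fragment between them have arrived, which is the situation in which single-packet inspection would miss the field.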