Network packet protocol identification method and system

A network data packet and protocol identification technology, applied in the network field, can solve the problems of scalability and processing efficiency, and achieve the effects of improving identification performance, enhancing applicability, and strong scalability

Active Publication Date: 2013-04-03
科来网络技术股份有限公司
2 Cites 13 Cited by

AI-Extracted Technical Summary

Problems solved by technology

[0003] The purpose of the present invention is to overcome the inadequacy of scalability and processing efficiency existing in the prior art, and...

Abstract

The invention discloses a network packet protocol identification method. The method includes a protocol configuration step and a packet identification step. The protocol configuration step includes: storing characteristic information of protocols; establishing a protocol tree; and establishing a table of characteristic values and judgment logic. The packet identification step includes: acquiring a data packet to be recognized; selecting the protocol tree for identifying the data packet protocol; and comparing values of keywords, read from the data packet, to the table of characteristic values and the judgment logic so as to identify the data packet protocol. The invention further provides a network packet protocol identification system. The table of characteristic values and the judgment logic are established according to characteristic information of all protocols, the protocol used by the data packet can be quickly found by single table look-up, and accordingly the method and system are high in identification efficiency. Protocol identification information of a new protocol needs to be added only when the new protocol is added, so that the method and system are highly extensible.


Examples

  • Experimental program(1)

Example Embodiment

[0042] The present invention is described in further detail below with reference to test examples and specific implementations. This should not be taken to mean that the scope of the subject matter is limited to the following embodiments; all techniques implemented on the basis of the content of the present invention fall within its scope.
[0043] The invention discloses a network data packet protocol identification method, which includes a protocol configuration step and a data packet protocol identification step.
[0044] As shown in Figure 1, the protocol configuration step includes:
[0045] S101: Input protocol feature information of all protocols and store it.
[0046] In this step, if a new protocol is added, enter the protocol feature information of the new protocol and store it.
[0047] In this step, the protocol feature information (also called protocol identification information) includes: the lower-level protocols of the protocol (there may be several; for example, the lower-level protocols of TCP include IP and IPv6); the keyword, which indicates a position and length in the data packet, so that the value of the specified length read at the specified position is the value of the keyword in that packet; the feature value; the comparison mode (greater than, equal to, or less than); and the keyword definition. Each protocol has its own unique protocol feature information, which is specified and published by the organization that issued the protocol standard. For example, common protocols such as IP and TCP are specified and published by standards organizations (IEEE, IANA).
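The fields listed in [0047] can be pictured as one record per protocol. The following is a minimal Python sketch, not the patent's storage format: the class name `ProtocolFeature`, the field names, and the string encoding of the comparison mode are illustrative assumptions. The IPv6 values (EType feature value 0x86DD, keyword definition Pro=D8[6]) are taken from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProtocolFeature:
    """One protocol's feature information (hypothetical layout)."""
    name: str
    lower_protocols: List[str]   # e.g. TCP -> ["IP", "IPv6"]
    keyword: str                 # keyword this protocol is recognized by
    feature_value: int           # value the keyword is compared against
    comparison: str              # assumed encoding: "eq", "gt" or "lt"
    # Keyword definitions this protocol supplies to its upper-level protocols.
    keyword_definitions: Dict[str, str] = field(default_factory=dict)


# IPv6 is recognized by EType == 0x86DD and defines Pro for the layer above it.
IPV6 = ProtocolFeature(
    name="IPv6",
    lower_protocols=["EthernetII"],
    keyword="EType",
    feature_value=0x86DD,
    comparison="eq",
    keyword_definitions={"Pro": "D8[6]"},
)
```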
[0048] The method of the invention can also be used to identify private protocols. If it is a private protocol, the protocol feature information of the private protocol is input and stored by the person who makes or understands the private protocol.
[0049] It should be noted that a data packet carries multiple protocols arranged hierarchically, one layer on top of another. For example, the notation EthernetII\IP\TCP\HTTP means that the data packet begins with the EthernetII protocol, the next level up is IP, the level above that is TCP, and the level above that is HTTP. When a web page is accessed from a PC, the data packets generally use the protocol format EthernetII\IP\TCP\HTTP. To identify such data packets, the protocol feature information of the four protocols EthernetII, IP, TCP, and HTTP must be known. If IPv6 is used, the protocol format is EthernetII\IPv6\TCP\HTTP, and identifying such a data packet requires that the IPv6 protocol feature information be entered, as shown in Table 1:
[0050] Table 1
[0051] Protocol
[0052] It should be noted that EthernetII is the initial protocol of Ethernet. It has no lower-level protocol; only the physical medium needs to be specified. The physical medium is determined by the hardware that captures the data packet: if an Ethernet card is used, the physical medium is Ethernet, and if a wireless network card is used, the physical medium is a wireless network.
[0053] It should be noted that each protocol has a keyword, which indicates a position and length in the data packet. For some protocols the keyword states its position and length directly. For EthernetII, the keyword is D16[12], meaning two consecutive bytes (16 bits) starting at the 12th byte of the data packet. For other protocols the keyword does not state its position and length directly; instead these are given in the keyword definition of the protocol's lower-level protocol. For TCP, the keyword is the protocol number Pro, whose position and length are defined in the keyword definitions of the lower-level protocols IP and IPv6: in the IP protocol, Pro=D8[9], one byte (8 bits) at the 9th byte; in the IPv6 protocol, Pro=D8[6], one byte (8 bits) at the 6th byte. During identification, the choice between Pro=D8[9] and Pro=D8[6] depends on the lower-level protocol: if it is IP, Pro=D8[9] is used; if it is IPv6, Pro=D8[6] is used.
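The locator notation in [0053] can be decoded mechanically. The sketch below is one possible reading, under two assumptions that the text does not state explicitly: the index in brackets is a zero-based byte offset from the start of the header that carries the field, and multi-byte values are big-endian (network byte order). The names `read_keyword`, `read_pro`, and `PRO_LOCATOR` are hypothetical.

```python
import re

_LOCATOR = re.compile(r"D(\d+)\[(\d+)\]")


def read_keyword(locator: str, header: bytes) -> int:
    """Read the value addressed by a locator such as 'D16[12]'."""
    match = _LOCATOR.fullmatch(locator)
    if match is None:
        raise ValueError(f"not a locator: {locator!r}")
    bits, offset = int(match.group(1)), int(match.group(2))
    if bits % 8 != 0:
        raise ValueError("only whole-byte widths are handled in this sketch")
    size = bits // 8
    chunk = header[offset:offset + size]
    if len(chunk) != size:
        raise ValueError("header too short for this keyword")
    return int.from_bytes(chunk, "big")  # assumed network byte order


# The locator for Pro depends on the lower-level protocol, as in [0053].
PRO_LOCATOR = {"IP": "D8[9]", "IPv6": "D8[6]"}


def read_pro(lower_protocol: str, header: bytes) -> int:
    """Pick the Pro locator from the lower-level protocol, then read it."""
    return read_keyword(PRO_LOCATOR[lower_protocol], header)


# Synthetic 14-byte frame header whose last two bytes are 0x86DD.
frame = bytes(12) + bytes.fromhex("86dd")
print(read_keyword("D16[12]", frame))  # 34525
```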
[0054] S102: Integrate protocol feature information of all protocols (including existing protocols and newly added protocols). One protocol serves as a protocol node, and all protocol nodes with the same lower-level protocol form a node layer.
[0055] S103: According to the upper and lower level protocol relationships of the protocol, all protocol nodes are connected to form a protocol tree, all protocol nodes forming the same node layer are located in the same layer of the protocol tree, and the entrance of the protocol tree is a physical medium.
[0056] For example, the form of the protocol tree whose physical medium is Ethernet is:
[0057] Ethernet
[0058]
[0059] It should be noted that the protocol tree listed above is not complete; it shows only the part of the complete tree formed by the protocols recorded in Table 1. EthernetII serves as the root node of the protocol tree. IP and IPv6 share the same lower-level protocol, EthernetII, so they form one node layer, located directly below the root node. The upper-level protocol of IP is TCP, so TCP, as a protocol node, is located in the node layer below IP; the upper-level protocol of TCP is HTTP, so HTTP is located in the node layer below TCP. Likewise, the upper-level protocol of IPv6 is TCP, which is located in the node layer below IPv6, and the upper-level protocol of that TCP node is HTTP, which is located in the node layer below TCP.
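Steps S102 and S103 amount to inverting the "lower-level protocol" relation. The following Python sketch builds the parent-to-children links for the protocols named in [0059] and lists the root-to-leaf paths. The dictionary `LOWER` and both function names are illustrative assumptions; because TCP has two lower-level protocols, it appears once under IP and once under IPv6 when the paths are expanded.

```python
from collections import defaultdict
from typing import Dict, List

# protocol -> its lower-level protocols (the physical medium for EthernetII)
LOWER: Dict[str, List[str]] = {
    "EthernetII": ["Ethernet"],
    "IP": ["EthernetII"],
    "IPv6": ["EthernetII"],
    "TCP": ["IP", "IPv6"],
    "HTTP": ["TCP"],
}


def build_children(lower: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert the lower-level relation: each node maps to its next node layer."""
    children = defaultdict(list)
    for protocol, lowers in lower.items():
        for low in lowers:
            children[low].append(protocol)
    return dict(children)


def protocol_paths(children: Dict[str, List[str]], node: str) -> List[List[str]]:
    """List every path from `node` down to a leaf of the tree."""
    kids = children.get(node, [])
    if not kids:
        return [[node]]
    return [[node] + rest for kid in kids for rest in protocol_paths(children, kid)]


children = build_children(LOWER)
for path in protocol_paths(children, "Ethernet"):
    print("\\".join(path[1:]))  # skip the physical medium at the entry
```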
[0060] S104: Traverse the protocol feature information of all protocols and collect all protocols whose comparison mode is "equal to". For each keyword, build one feature value table from the keywords and feature values of those protocols; the table records every feature value of the keyword and the protocol name corresponding to each feature value. For example, the feature value table for the keyword EType has the form shown in Table 2 (Table 2 shows only part of the EType feature value table, not every feature value and corresponding protocol recorded in it):
[0061] Table 2
[0062] Feature value
[0063] If the protocol corresponding to a feature value in the feature value table is "none", no protocol uses that feature value. For example, no protocol has the feature value 0xFFFF, so a new protocol with the feature value 0xFFFF can be defined; that is, if a new protocol's feature value is 0xFFFF, the new protocol is added to the EType feature value table.
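A feature value table of this kind maps naturally onto a dictionary keyed by feature value. The sketch below is an illustration under assumptions: the tuple layout `(name, keyword, comparison, value)` and the function name are hypothetical, and only the EType values that appear in the description (0x0800 for IP, 0x0806 for ARP, 0x86DD for IPv6) are used. A missing key plays the role of the "none" entry.

```python
from typing import Dict, Iterable, Tuple

Feature = Tuple[str, str, str, int]  # (name, keyword, comparison, value)

FEATURES = [
    ("IP", "EType", "eq", 0x0800),
    ("ARP", "EType", "eq", 0x0806),
    ("IPv6", "EType", "eq", 0x86DD),
    ("EthernetII", "D16[12]", "gt", 1500),  # not "eq", so not tabulated
]


def build_feature_tables(features: Iterable[Feature]) -> Dict[str, Dict[int, str]]:
    """Build one value -> protocol table per keyword from the 'eq' protocols."""
    tables: Dict[str, Dict[int, str]] = {}
    for name, keyword, comparison, value in features:
        if comparison != "eq":
            continue
        table = tables.setdefault(keyword, {})
        if value in table:
            raise ValueError(f"{keyword}={value:#x} already maps to {table[value]}")
        table[value] = name
    return tables


tables = build_feature_tables(FEATURES)
print(tables["EType"].get(0x86DD))  # IPv6
print(tables["EType"].get(0xFFFF))  # None: the value is still unassigned
```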
[0064] S105: Collect the judgment conditions of all protocols whose comparison mode is greater than or less than, and establish the judgment logic. The judgment logic consists of the name of each such protocol and its judgment condition.
[0065] S106: Store all feature value tables and the judgment logic to generate a recognition engine. The recognition engine is executable code containing all feature value tables and the judgment logic; packaged as a dynamic link library (DLL), it can be loaded and called at run time.
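The judgment logic of S105 can be sketched as an ordered list of conditions that is evaluated only for protocols whose comparison mode is greater than or less than. This is an assumed representation, not the generated engine code described in S106; the names `OPS`, `build_judgment_logic`, and `run_judgment_logic` are hypothetical. The single condition shown (D16[12] > 1500 identifies EthernetII) comes from the worked example in the description.

```python
import operator
from typing import Callable, Dict, Iterable, List, Optional, Tuple

OPS: Dict[str, Callable[[int, int], bool]] = {"gt": operator.gt, "lt": operator.lt}

Condition = Tuple[str, str, Callable[[int, int], bool], int]


def build_judgment_logic(features: Iterable[Tuple[str, str, str, int]]) -> List[Condition]:
    """Keep only the protocols compared with 'greater than' or 'less than'."""
    return [
        (name, keyword, OPS[comparison], threshold)
        for name, keyword, comparison, threshold in features
        if comparison in OPS
    ]


def run_judgment_logic(logic: List[Condition], values: Dict[str, int]) -> Optional[str]:
    """Return the first protocol whose condition holds, or None."""
    for name, keyword, op, threshold in logic:
        if keyword in values and op(values[keyword], threshold):
            return name
    return None


logic = build_judgment_logic([
    ("EthernetII", "D16[12]", "gt", 1500),
    ("IPv6", "EType", "eq", 0x86DD),  # 'eq' entries belong in a feature value table
])
print(run_judgment_logic(logic, {"D16[12]": 34525}))  # EthernetII
```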
[0066] Referring to Figure 2, the data packet protocol identification step includes:
[0067] S201: Input the data packet to be recognized.
[0068] S202: Determine the physical medium according to the physical device capturing the data packet, and then select the protocol tree that identifies the data packet protocol, and then enter the root node of the protocol tree.
[0069] S203: Read from the data packet the values of the keywords of all protocol nodes in the current node layer.
[0070] S204: Query the feature value tables for the keywords of all protocol nodes in the current node layer and determine whether any of the values read equals a feature value in a table. If so, the protocol of the current node is the protocol corresponding to that feature value, and the method proceeds to step S206; if not, it proceeds to step S205.
[0071] S205: Execute the judgment logic, comparing the values of all keywords of the current node with the judgment conditions in turn. If a judgment condition is met, the protocol of the current node is the protocol corresponding to that condition, and the method proceeds to step S206. If the keyword values satisfy none of the judgment conditions, the recognition result is output.
[0072] S206: Enter the next node layer of the current node layer in the protocol tree, and return to step S203, and execute steps S203 to S205 cyclically.
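Steps S202 to S206 form a loop over node layers. The sketch below is a simplified illustration, not the patented engine: keyword reading is abstracted behind a caller-supplied `read_value` function, and the `layer_keywords` mapping, the table contents, and the SP entry for port 80 are assumptions made so that the example is self-contained. The keyword values are the ones used in the description's worked example.

```python
import operator
from typing import Callable, Dict, List, Tuple


def identify(
    medium: str,
    layer_keywords: Dict[str, List[str]],
    tables: Dict[str, Dict[int, str]],
    logic: List[Tuple[str, str, Callable[[int, int], bool], int]],
    read_value: Callable[[str, str], int],
) -> List[str]:
    """Walk the node layers and return the identified protocol stack."""
    path: List[str] = []
    current = medium                                   # S202: enter the tree
    while current in layer_keywords:
        # S203: read every keyword of the next node layer.
        values = {kw: read_value(current, kw) for kw in layer_keywords[current]}
        found = None
        for kw, value in values.items():               # S204: table lookup
            found = tables.get(kw, {}).get(value)
            if found is not None:
                break
        if found is None:                              # S205: judgment logic
            for name, kw, op, threshold in logic:
                if kw in values and op(values[kw], threshold):
                    found = name
                    break
        if found is None:
            break                                      # nothing matched: stop
        path.append(found)
        current = found                                # S206: descend one layer
    return path


layer_keywords = {"Ethernet": ["D16[12]"], "EthernetII": ["EType"],
                  "IP": ["Pro"], "IPv6": ["Pro"], "TCP": ["DP", "SP"]}
tables = {"EType": {0x0800: "IP", 0x86DD: "IPv6"}, "Pro": {6: "TCP"},
          "DP": {80: "HTTP"}, "SP": {80: "HTTP"}}
logic = [("EthernetII", "D16[12]", operator.gt, 1500)]
captured = {("Ethernet", "D16[12]"): 34525, ("EthernetII", "EType"): 34525,
            ("IPv6", "Pro"): 6, ("TCP", "DP"): 80, ("TCP", "SP"): 150}

result = identify("Ethernet", layer_keywords, tables, logic,
                  lambda current, kw: captured[(current, kw)])
print("\\".join(result))  # EthernetII\IPv6\TCP\HTTP
```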
[0073] The following example illustrates the process of data packet protocol identification. If the physical medium used to capture the data packet to be identified is Ethernet, the protocol tree whose entry is Ethernet is selected to identify the data packet protocol. The form of the protocol tree is:
[0074] Ethernet
[0075]
[0076]
[0077] It should be noted that this is not a complete protocol tree; only part of it is shown to support this example. Identification starts from the root node of the protocol tree. The keyword of the root node is D16[12], and the two-byte value read starting at the 12th byte of the data packet is 34525; this is the value of the keyword D16[12] in the packet. The feature value table of the keyword D16[12] is queried. 34525 is not found in that table, so the judgment logic is executed; 34525 satisfies the judgment condition > 1500, so the protocol of the root node is EthernetII. The method then enters the next node layer below the root node. The keyword of all protocol nodes in this layer is EType. According to the keyword definition in the EthernetII protocol, EType=D16[12], so the value of EType read from the packet is 34525. Querying the EType feature value table shows that 34525 equals the feature value 0x86DD, so the protocol of the current protocol node is IPv6. The method then enters the next node layer, where the keyword of the protocol nodes is Pro. Because the lower-level protocol is IPv6, Pro=D8[6] is taken from the keyword definition of the IPv6 protocol, and the one-byte value read starting at the 6th byte is 6. Querying the feature value table of the keyword Pro shows that the protocol of the current protocol node is TCP. The method then enters the next node layer, where the keywords of the protocol nodes are DP and SP. From the keyword definition of the TCP protocol, DP=D16[2] and SP=D16[0]; the values read from the packet are 80 and 150 respectively. Querying the feature value tables of DP and SP finds 80 in the DP table, so the protocol is HTTP. The recognition result is output: the protocol used by the data packet to be identified is EthernetII\IPv6\TCP\HTTP.
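The values in this walkthrough can be checked against raw bytes. The sketch below builds a synthetic packet (mostly zero bytes, not a valid capture) and reads the four fields. It assumes zero-based offsets relative to the start of each header and relies on the fixed header lengths of Ethernet II (14 bytes) and IPv6 without extension headers (40 bytes); the helper `read` is hypothetical.

```python
import struct

ETH_HEADER_LEN = 14   # destination MAC (6) + source MAC (6) + EtherType (2)
IPV6_HEADER_LEN = 40  # fixed IPv6 header, no extension headers assumed

eth = bytes(12) + struct.pack("!H", 0x86DD)     # EtherType at offset 12
ipv6 = bytes(6) + bytes([6]) + bytes(33)        # next header (Pro) at offset 6
tcp = struct.pack("!HH", 150, 80) + bytes(16)   # SP at offset 0, DP at offset 2
packet = eth + ipv6 + tcp


def read(data: bytes, base: int, offset: int, size: int) -> int:
    """Read `size` big-endian bytes at `offset` within the header at `base`."""
    start = base + offset
    return int.from_bytes(data[start:start + size], "big")


etype = read(packet, 0, 12, 2)                   # EType = D16[12]
pro = read(packet, ETH_HEADER_LEN, 6, 1)         # Pro = D8[6] under IPv6
tcp_base = ETH_HEADER_LEN + IPV6_HEADER_LEN
sp = read(packet, tcp_base, 0, 2)                # SP = D16[0]
dp = read(packet, tcp_base, 2, 2)                # DP = D16[2]
print(etype, pro, sp, dp)  # 34525 6 150 80
```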
[0078] The traditional data packet protocol identification method calls protocol plug-ins in sequence. Each time a plug-in is called, the value of the keyword is read from the data packet and compared with the value held in the plug-in. If they are equal, the protocol has been found; otherwise the next protocol plug-in is called, and so on until the protocol is found. For example, to determine the IPv6 protocol in this example, the IP plug-in is called first: the value of the keyword in the data packet is 34525, and the comparison 34525=0x0800 fails, so the protocol is not IP. Next the ARP plug-in is called: the value of the keyword is 34525, and the comparison 34525=0x0806 fails, so the protocol is not ARP. Then the IPX plug-in, the SNMP plug-in, and so on are called, until the IPv6 plug-in is called: the value of the keyword is 34525, and the comparison 34525=0x86DD succeeds, so the protocol is judged to be IPv6. The search at this layer then ends and the search at the next layer begins.
[0079] The method of the present invention identifies the protocol used by a data packet by establishing feature value tables and judgment logic and then searching them. The protocol can be found with a single lookup in the feature value table, instead of one comparison per protocol as in the traditional method, so the identification cost falls from O(n) to O(1), which improves the processing performance of network packet protocol identification. After extensive research, the system supports more than 800 protocols, covering commonly used Ethernet, WAN, and wireless networks, and identifies more than 1 million data packets per second. O(n) and O(1) are notations for execution cost; for detailed definitions, refer to "Data Structure". O(n) means that the cost grows in proportion to n: if identification takes time s when there are n identifiable protocols, it takes twice s with 2n protocols and one hundred times s with 100n protocols. O(1) means that the cost is constant: if identification takes time s with n protocols, it still takes s when the number of protocols grows to 2n, 10n, or 100n.
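The difference between the two approaches can be made concrete by counting comparisons. The sketch below contrasts a sequential plug-in scan with a dictionary lookup, which in Python takes constant time on average regardless of table size. The function names and the plug-in list are illustrative; only the three EType values given in the description are used.

```python
from typing import Dict, List, Optional, Tuple


def sequential_identify(value: int, plugins: List[Tuple[str, int]]) -> Tuple[Optional[str], int]:
    """Traditional approach: compare against each plug-in in turn."""
    comparisons = 0
    for name, feature_value in plugins:
        comparisons += 1
        if value == feature_value:
            return name, comparisons
    return None, comparisons


def table_identify(value: int, table: Dict[int, str]) -> Optional[str]:
    """Table approach: one lookup, independent of the number of protocols."""
    return table.get(value)


plugins = [("IP", 0x0800), ("ARP", 0x0806), ("IPv6", 0x86DD)]
table = {feature_value: name for name, feature_value in plugins}

print(sequential_identify(34525, plugins))  # ('IPv6', 3)
print(table_identify(34525, table))         # IPv6
```

With n plug-ins the sequential scan needs up to n comparisons, while the lookup count stays at one as entries are added to the table.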
[0080] The method of the present invention is also highly extensible. When a new protocol needs to be added, only its protocol feature information needs to be input and stored; then, according to the new protocol's comparison mode, either its feature value is added to the corresponding feature value table or its judgment condition is added to the judgment logic.
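Adding a protocol therefore touches exactly one of the two structures. A minimal sketch follows, assuming dictionary-based feature value tables and a list-based judgment logic; the function `add_protocol` and the protocol name `PrivateX` are hypothetical, and the feature value 0xFFFF is the unassigned EType value mentioned in the description.

```python
from typing import Dict, List, Tuple


def add_protocol(
    tables: Dict[str, Dict[int, str]],
    logic: List[Tuple[str, str, str, int]],
    name: str,
    keyword: str,
    comparison: str,
    value: int,
) -> None:
    """Route a new protocol to a feature value table or to the judgment logic."""
    if comparison == "eq":
        table = tables.setdefault(keyword, {})
        if value in table:
            raise ValueError(f"{keyword}={value:#x} is already used by {table[value]}")
        table[value] = name
    elif comparison in ("gt", "lt"):
        logic.append((name, keyword, comparison, value))
    else:
        raise ValueError(f"unknown comparison mode: {comparison!r}")


tables = {"EType": {0x0800: "IP", 0x86DD: "IPv6"}}
logic = [("EthernetII", "D16[12]", "gt", 1500)]

# A private protocol claims the previously unassigned EType value 0xFFFF.
add_protocol(tables, logic, "PrivateX", "EType", "eq", 0xFFFF)
print(tables["EType"][0xFFFF])  # PrivateX
```

Existing entries are left untouched, so protocols that were already configured do not need to be re-entered.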
[0081] Referring to Figure 3, the present invention also provides a network data packet protocol identification system comprising a protocol configuration device and a data packet protocol identification device. The protocol configuration device includes a protocol feature storage unit, a protocol tree generation unit, and a feature value table and judgment logic generation unit. The data packet protocol identification device includes a data packet acquisition unit, a protocol tree selection unit, and a data packet protocol identification unit.
[0082] The protocol feature storage unit is used to store the protocol feature information of each protocol. The protocol feature information includes: the name of the protocol, the names of its lower-level protocols, the keyword of the protocol, the feature value, the keyword definition, and the comparison mode, where the comparison mode is greater than, equal to, or less than.
[0083] The protocol tree generation unit is used to generate the protocol tree from the protocol feature information. It treats each protocol as a protocol node, groups all protocols with the same lower-level protocol into a node layer, and then connects all protocol nodes according to their upper-level and lower-level relationships to form the protocol tree. Protocol nodes that form the same node layer are located in the same layer of the protocol tree, and the entry of the protocol tree is a physical medium.
[0084] The feature value table and judgment logic generation unit is used to generate, from the protocol feature information, the feature value tables and judgment logic for identifying the data packet protocol. A feature value table consists of the names of the protocols whose comparison mode is equal to, together with their feature values; the judgment logic consists of the names of the protocols whose comparison mode is greater than or less than, together with their judgment conditions.
[0085] The data packet acquisition unit is used to acquire the data packets to be identified.
[0086] The protocol tree selection unit is used to select the protocol tree for identifying the data packet protocol according to the physical medium from which the data packet was obtained.
[0087] The data packet protocol identification unit is used to read the keywords in the data packet to be identified and compare the values read with the feature value tables and the judgment logic to identify the data packet protocol. It includes a keyword acquisition module, which reads from the data packet the value of the keyword of each protocol node in the protocol tree, and a protocol identification module, which queries the feature value tables and executes the judgment logic. When a feature value table is queried, if the value of the keyword read from the data packet equals a feature value in the table, the protocol of the protocol node is the protocol corresponding to that feature value. If the value differs from every feature value in the table, the judgment logic is executed: if the value satisfies a judgment condition, the protocol of the protocol node is the protocol corresponding to that condition; if it satisfies none of the judgment conditions, the recognition result is output.

