Intelligent protocol parsing method and device

A protocol parsing and intelligent analysis technology, applied in digital transmission systems, electrical components, and transmission systems, that addresses problems such as errors in in-depth protocol analysis, difficulty analyzing protocols accurately, and failed matches, with the effect of improved accuracy and broad applicability.

Inactive Publication Date: 2007-09-12
BEIJING VENUS INFORMATION TECH
0 Cites 52 Cited by

AI-Extracted Technical Summary

Problems solved by technology

In this case, IDS/IPS products cannot correctly identify the protocol type of the message or the specific software usage according to the port mapping table or specific field pattern matching, which brings great trouble to some specific requirements. This requires intelligent identification of the protocol type of the message according to the operating behavior characteristics of the network protocol, otherwis...

Method used

In some cases, however, it is difficult to determine the specific software or version. For example, if an IP message carries the protocol static feature "HTTP", its protocol type is very likely HTTP, but the software in use cannot be uniquely determined. In that case, protocol analysis or audit results obtained purely from static feature matching may be wrong, so behavior feature matching rules are needed to confirm the protocol identification result. The protocol analysis method therefore uses protocol behavior feature recognition alongside static feature matching, because any application software, whatever protocol it uses, has its own specific behavior characteristics. Relying on those characteristics greatly improves identification accuracy. The protocol behavior feature set is tied to the specific protocol application. Each rule in the set contains a sequence of behavior features, and this sequence uniquely identifies the criteria an IP message must meet to be identified as a protocol of this type. The protocol behavior feature set established for a protocol type can therefore be regarded as a set of necessary conditions for the protocol specification of that type.
Protocol feature extraction: Main feature extraction is divided into two steps, at first is the static feature extraction of protocol packet. This part mainly relies on a si...

Abstract

The invention relates to an intelligent protocol analysis method and device for intrusion detection and prevention (IDS/IPS) and audit products. Its purpose is to provide an intelligent protocol analysis technique that does not rely solely on static ports and the matching of static protocol feature fields, and that automatically adjusts the analysis format for different software versions to give accurate results, thereby improving the accuracy of protocol analysis. The invention consists of three major steps: establishing the protocol feature model; protocol identification; and intelligent analysis of the protocol. The invention solves the problem that traditional IDS/IPS products cannot identify the network protocol of data packets that use non-standard ports or carry no static feature fields, and it can automatically correct analysis errors caused by differences between applications or protocol versions.

Application Domain

Data switching networks

Technology Topic

Networking protocol, Web protocols

Examples

  • Experimental program(2)

Example Embodiment

[0051] Embodiment 1 (Static Features of BitTorrent Protocol):
[0052] The string %13BitTorrent%20Protocol identifies BitTorrent messages in communication that uses the BitTorrent protocol, whether by the protocol itself or by software built on it, so it can serve as a static identification rule for the BitTorrent protocol;
[0053] Establish the BITTORRENT protocol static identification rule set:
[0054] The text must contain the string "Bittorrent";
[0055] And so on. For example, an actual data packet sample is:
[0056]-[0060] GET /announce?info_hash=%0D%40_%F3%0A%269%81%94%B9/%B80%5EC%8A%8A%9A%9C%E5&peer_id=Plus---tL3l5oWGtwZ9o&port=9096&uploaded=0&downloaded=0&left=28742712&event=started HTTP/1.0..Host: btfans.3322.org:8000..Accept-encoding: gzip..User-agent: BitTorrent/Plus! II 1.02 RC1....
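The static rules of this embodiment can be sketched as a single-packet check. This is a minimal illustration, not the patented implementation: the rule names `bt_handshake` and `bt_keyword`, the function `match_static_features`, and the case-insensitive comparison are assumptions made for the example, and `sample` is a shortened stand-in for the captured request.

```python
from urllib.parse import unquote_to_bytes

# Static features taken from the embodiment text; the rule names are hypothetical.
# "%13BitTorrent%20Protocol" decodes to the byte 0x13 followed by "BitTorrent Protocol".
HANDSHAKE = unquote_to_bytes("%13BitTorrent%20Protocol").lower()
KEYWORD = b"bittorrent"


def match_static_features(payload: bytes) -> set[str]:
    """Return the names of the static rules that a single payload satisfies."""
    lowered = payload.lower()  # assumption: match case-insensitively
    hits = set()
    if lowered.startswith(HANDSHAKE):
        hits.add("bt_handshake")
    if KEYWORD in lowered:
        hits.add("bt_keyword")
    return hits


# Shortened stand-in for the tracker request shown in the embodiment.
sample = (
    b"GET /announce?info_hash=x&event=started HTTP/1.0\r\n"
    b"User-agent: BitTorrent/Plus! II 1.02 RC1\r\n\r\n"
)
print(sorted(match_static_features(sample)))
```

A hit here only nominates BitTorrent as a candidate; the text that follows explains why behavior features are still needed to confirm it.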
[0061] However, in some cases it is difficult to determine the specific software or version. For example, if an IP packet carries the protocol static feature "HTTP", its protocol type is very likely HTTP, but the software in use cannot be uniquely determined. In that case, a protocol analysis or audit result obtained purely from static feature matching may be wrong, so behavior feature matching rules are needed to confirm the protocol identification result. This protocol analysis method therefore uses protocol behavior feature identification alongside static feature matching, because any application software, whatever protocol it uses, has its own specific behavior characteristics. Relying on those characteristics greatly improves recognition accuracy. The protocol behavior feature set is tied to the specific protocol application. Each rule in the set contains a sequence of behavior features, and this sequence uniquely identifies the criteria an IP message must meet to be judged as this type of protocol. The protocol behavior feature set established for a protocol type can therefore be regarded as a set of necessary conditions for the protocol specification of that type.

Example Embodiment

[0062] Embodiment 2 (BitTorrent protocol behavior characteristics):
[0063] First, the tracker HTTP protocol is used to interact with the tracker server:
[0064] 1) The client sends an HTTP GET request to the tracker.
[0065] The feature of this step is a GET /announce... request sent to the tracker over HTTP/1.0 that includes the keyword "BitTorrent".
[0066] 2) The tracker returns information about the other downloaders of the same file. The feature of this step is a bencoded dictionary containing the list of peer addresses and ports.
[0067] 3) The BitTorrent client sends connection requests according to the peer list it obtained. The feature of this step is that the "BitTorrent" keyword is included in the connection request to each peer.
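The three steps above form an ordered behavior feature sequence, which can be sketched as a small per-client tracker. The class `BehaviorTracker`, the three predicate functions, and the heuristics inside them (for example, looking for the bencoded key `5:peers` in the tracker response) are assumptions made for illustration; the patent does not specify these checks.

```python
from dataclasses import dataclass


def is_announce_request(payload: bytes) -> bool:
    # Step 1: HTTP GET /announce request carrying the "BitTorrent" keyword.
    return (payload.startswith(b"GET /announce")
            and b"HTTP/1." in payload
            and b"bittorrent" in payload.lower())


def is_tracker_response(payload: bytes) -> bool:
    # Step 2: bencoded dictionary ("d...e") with a "peers" key ("5:peers").
    body = payload.split(b"\r\n\r\n", 1)[-1]
    return body.startswith(b"d") and b"5:peers" in body


def is_peer_handshake(payload: bytes) -> bool:
    # Step 3: peer connection request containing the "BitTorrent" keyword.
    return b"BitTorrent" in payload[:32]


STEPS = [is_announce_request, is_tracker_response, is_peer_handshake]


@dataclass
class BehaviorTracker:
    """Tracks how far one client has progressed through the sequence."""
    position: int = 0

    def feed(self, payload: bytes) -> bool:
        """Advance on a matching packet; return True once every step matched."""
        if self.position < len(STEPS) and STEPS[self.position](payload):
            self.position += 1
        return self.position == len(STEPS)


packets = [
    b"GET /announce?info_hash=x&event=started HTTP/1.0\r\n"
    b"User-agent: BitTorrent/Plus!\r\n\r\n",
    b"HTTP/1.0 200 OK\r\n\r\nd8:intervali1800e5:peers0:e",
    b"\x13BitTorrent protocol" + bytes(8),
]
tracker = BehaviorTracker()
results = [tracker.feed(p) for p in packets]
print(results)
```

Packets that arrive out of order leave the tracker where it is, so a lone peer handshake does not complete the sequence on its own.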
[0068] Protocol feature extraction: Feature extraction is divided into two steps. The first is static feature extraction from protocol packets. This step relies on a single data packet to make a preliminary judgment about the protocol, and covers text command format protocols, fixed header format protocols, and protocols with no fixed format. In this step, as many feature fields as possible are extracted from the protocol data packet to narrow the scope of behavior feature matching. The second is extraction of the protocol's operating behavior features. This step handles cases where a single data packet cannot effectively identify information such as the protocol type or version: the actual operation process must be monitored, and features extracted from it, to determine the specific protocol type and version number more accurately. Behavior feature matching targets the detailed behaviors and actions of the protocol over a stage of its operation, so its accuracy is higher.
[0069] The protocol behavior feature rule set is tied to specific protocol types and versions. The main purposes of establishing protocol behavior feature rule sets for the various protocol types are as follows:
[0070] 1) The correctness of the static rule matching result can be verified through the behavior feature rule set. That is, the specific identification result is uniquely determined from the set of possible protocol types or software candidates that remains after static rule matching.
[0071] 2) Based on the protocol type judged by static rule matching, details such as the specific protocol version in use can be identified to ensure the correctness of the subsequent protocol analysis results.
[0072] 3) Matching protocol behavior features allows in-depth inspection or auditing of the events and actions of a specific protocol or software. Only after a message has passed both static rule matching and behavior feature matching can the specific protocol or software used in the communication be accurately determined.
[0073] The protocol behavior feature rule set established for a type of protocol is described with a control flow graph (CFG) model. As shown in Figure 3, in the CFG representation each step of the protocol operating behavior is represented by an ellipse node. Apart from the two special rules TRUE and FALSE, which return the protocol matching result, every verification rule is a Boolean expression whose result can only be true or false. The verification rule set is executed from the root node. If the current verification rule evaluates to true, the verification rule tree on the left is executed; if it evaluates to false, the tree on the right is executed. This continues until execution reaches a TRUE or FALSE node. Figure 3 gives an example that defines the behavior feature rule set of the BitTorrent protocol. Execution of the behavior matching rule set starts from the root node, and only an IP message that passes the match of the whole behavior feature sequence returns the BitTorrent protocol ID; otherwise FALSE is returned. The size of the behavior feature sequence established for a protocol feature model directly affects the accuracy and efficiency of protocol recognition. When the static feature and behavior feature sequences of a protocol type have more entries, the accuracy of the recognition result is higher but the efficiency of identification is lower. When they have fewer entries, identification is more efficient but the accuracy of the result may be reduced. The protocol verification rule set should therefore be defined to suit the actual requirements.
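The rule evaluation described above can be sketched as a binary decision tree in which each node holds a Boolean check and two branches. The tree shape below is an assumption, since Figure 3 is not reproduced here, and the names `Rule`, `Leaf`, `evaluate`, and the keys of the `flow` dictionary are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass
class Leaf:
    """Terminal node: a protocol ID for TRUE, or None for FALSE."""
    protocol_id: Optional[str]


@dataclass
class Rule:
    """Boolean verification rule with a branch for each outcome."""
    check: Callable[[dict], bool]
    if_true: "Node"   # the "left" tree in the text
    if_false: "Node"  # the "right" tree in the text


Node = Union[Leaf, Rule]


def evaluate(node: Node, flow: dict) -> Optional[str]:
    """Walk from the root until a TRUE or FALSE leaf is reached."""
    while isinstance(node, Rule):
        node = node.if_true if node.check(flow) else node.if_false
    return node.protocol_id


FALSE = Leaf(None)
# Assumed tree: the three BitTorrent behavior steps checked in order.
bittorrent_rules = Rule(
    lambda f: f.get("announce_seen", False),
    Rule(
        lambda f: f.get("peer_list_seen", False),
        Rule(lambda f: f.get("handshake_seen", False), Leaf("BitTorrent"), FALSE),
        FALSE,
    ),
    FALSE,
)

complete = {"announce_seen": True, "peer_list_seen": True, "handshake_seen": True}
print(evaluate(bittorrent_rules, complete))
print(evaluate(bittorrent_rules, {"announce_seen": True}))
```

Adding more rules to the tree tightens the match at the cost of more checks per flow, which is the accuracy and efficiency trade-off the paragraph describes.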
[0074] C. The protocol intelligent analysis and correction stage, as shown in Figure 2:
[0075] For the determined protocol type, the corresponding analysis method is used. If the analysis result contains a format error, the intelligent analysis correction method retries the analysis until a more accurate result is obtained. In real network environments, especially with certain proprietary protocols, a software upgrade or change usually brings changes in the analysis format and method. In this situation it is unrealistic to expect a single analysis format and method to apply everywhere. Even when the protocol type and version information have been determined by the preceding modules, that determination covers only the software versions that currently exist. Many software products are upgraded very frequently, so analysis support for existing versions often cannot keep up with the pace of software updates. If a full analysis were required for every new or unknown version, the workload would be very large and much of it repetitive. In fact, the structural changes to the protocol in such upgrades are usually small, and this device uses the intelligent analysis and correction method to avoid unnecessary duplication of work.
[0076] In the actual analysis process, the main changes to a protocol include the following:
[0077] 1. Change of field size
[0078] 2. Change of field offset
[0079] 3. Changes in the order of fields
[0080] The purpose of the protocol intelligent analysis correction attempt module is to automatically analyze and adapt to changes in the data packet format of protocols used by proprietary software, whether those changes come from version changes or from certain specific behaviors. When analysis errors arise from such changes, the workload of re-analysis is greatly reduced, and knowledge of how the protocol variants relate to one another gives the analysis greater accuracy and flexibility once the protocol type has been determined.
[0081] Intelligent analysis correction is used when a protocol cannot be analyzed accurately. The range of attempts chosen for correction affects both the accuracy and the efficiency of protocol analysis. The larger the range, the more software or protocol types and versions can be resolved correctly, but the lower the efficiency. The smaller the range, the poorer the accuracy of in-depth analysis for a specific type or version, but the higher the efficiency. Users should set an appropriate correction range based on their understanding of the specific protocol and its likely changes.
[0082] This device uses the following algorithms:
[0083] 1. Fast matching of protocol static feature rules;
[0084] After the static feature rules of the various protocol types are defined in the protocol sample extraction stage, a multi-pattern matching algorithm is used in the protocol identification stage to quickly match the static features of the IP message application data and so find the set of possible protocol types to which the IP message belongs. The fast matching process works as follows: the application layer payload of the IP packet is used as the text of the multi-pattern matching algorithm, and the set of all extracted protocol static features is used as the pattern set. The algorithm collects all possible protocol types, and the protocol behavior feature matching module is then called to eliminate wrong protocol types until a suitable protocol type is found.
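The text does not name a particular multi-pattern matching algorithm. As one common choice, the sketch below uses Aho-Corasick, which scans the payload once regardless of how many patterns are loaded. The pattern set and the function names `build_automaton` and `candidate_protocols` are assumptions made for the example.

```python
from collections import deque


def build_automaton(patterns: dict[bytes, str]):
    """Build goto/fail/output tables from a pattern -> protocol mapping."""
    goto, fail, out = [{}], [0], [set()]
    for pattern, protocol in patterns.items():
        state = 0
        for byte in pattern:
            if byte not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][byte] = len(goto) - 1
            state = goto[state][byte]
        out[state].add(protocol)
    # Breadth-first pass: depth-1 states already fail to the root (state 0).
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for byte, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and byte not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(byte, 0)
            out[nxt] |= out[fail[nxt]]  # inherit matches ending at the suffix
    return goto, fail, out


def candidate_protocols(payload: bytes, automaton) -> set[str]:
    """Scan the payload once and collect every protocol whose pattern occurs."""
    goto, fail, out = automaton
    state, found = 0, set()
    for byte in payload:
        while state and byte not in goto[state]:
            state = fail[state]
        state = goto[state].get(byte, 0)
        found |= out[state]
    return found


# Hypothetical static feature set: pattern -> candidate protocol type.
automaton = build_automaton({
    b"BitTorrent": "BitTorrent",
    b"GET ": "HTTP",
    b"HTTP/1.": "HTTP",
})
payload = (b"GET /announce?event=started HTTP/1.0\r\n"
           b"User-agent: BitTorrent/Plus!\r\n\r\n")
print(sorted(candidate_protocols(payload, automaton)))
```

The result is a candidate set rather than a verdict: here both HTTP and BitTorrent are reported, and behavior feature matching is what narrows the set down.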
[0085] 2. Establishment and matching of protocol behavior characteristics rules;
[0086] In the process of extracting protocol behavior feature rules, data mining is performed on a large number of collected protocol samples, and association rules and self-learning methods are used to gradually extract and refine the behavior feature sequences. For the sake of efficiency, the behavior feature sequences generated for different protocol operation processes differ in size. The length of a behavior feature sequence can be chosen according to the accuracy required, and if necessary several behavior feature sequences can be matched for different behaviors of a specific protocol. For the set of protocols output by static feature matching, the multi-pattern matching algorithm is used to match against all of the protocol behavior feature sequence sets until details such as the specific protocol type and version are determined.
[0087] 3. Intelligent protocol analysis and correction algorithm;
[0088] After the protocol type is determined through static feature matching and behavior feature matching, if a data packet is encountered that cannot be parsed correctly, the intelligent protocol analysis and correction module is called to make corrections. The main method is cyclic traversal verification: the possible variations in field size, field offset, and field order are verified one by one until a more detailed protocol analysis result is obtained. Because loop traversal verification is expensive, this module has a noticeable impact on efficiency, and the correction range must be set appropriately.
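Cyclic traversal verification can be sketched as nested loops over candidate offsets, field orders, and field sizes, stopping at the first layout that passes a validity check. Everything concrete here is an assumption made for illustration: the two-field message format, the big-endian integers, the `is_valid` predicate, and the default correction range.

```python
from itertools import permutations, product


def parse(data: bytes, layout, offset: int):
    """Slice data into big-endian integer fields; None if it does not fit."""
    fields, pos = {}, offset
    for name, size in layout:
        if pos + size > len(data):
            return None
        fields[name] = int.from_bytes(data[pos:pos + size], "big")
        pos += size
    fields["body"] = data[pos:]
    return fields


def corrected_parse(data, base_layout, is_valid,
                    offsets=range(0, 4), sizes=(1, 2, 4)):
    """Try the known layout first, then traverse the correction range."""
    fields = parse(data, base_layout, 0)
    if fields is not None and is_valid(fields):
        return base_layout, 0, fields
    names = [name for name, _ in base_layout]
    for offset in offsets:                                  # field offset change
        for order in permutations(names):                   # field order change
            for widths in product(sizes, repeat=len(names)):  # field size change
                layout = list(zip(order, widths))
                fields = parse(data, layout, offset)
                if fields is not None and is_valid(fields):
                    return layout, offset, fields
    return None


# Hypothetical old format: 1-byte msg_type, then 2-byte length, then body.
old_layout = [("msg_type", 1), ("length", 2)]


def is_valid(fields: dict) -> bool:
    # Assumed sanity check: known message type and a consistent length field.
    return fields["msg_type"] == 7 and fields["length"] == len(fields["body"])


# Hypothetical new version: 4-byte length first, then 1-byte msg_type.
new_message = b"\x00\x00\x00\x05\x07hello"
result = corrected_parse(new_message, old_layout, is_valid)
print(result)
```

The search space grows multiplicatively with each widened range (offsets × orderings × size combinations), which is why the text asks for the correction range to be chosen with care.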
[0089] An intelligent protocol analysis device, as shown in Figure 4, includes a protocol static rule library, a protocol behavior feature model library, a protocol static rule matching engine, a protocol behavior feature matching engine, and an automated adjustment analysis attempt module. The protocol static rule library is connected with the protocol static matching engine; the protocol behavior feature model library is connected with the protocol behavior feature matching engine; the protocol behavior feature matching engine is connected with the protocol analysis engine; and the protocol analysis engine is connected with the intelligent analysis correction attempt module.
[0090] Among them, the protocol static rule library and the protocol behavior feature model library respectively store the static matching rules established in the protocol feature model stage and the behavior feature sequence extracted according to the actual running process of the protocol or software. The protocol static rule matching engine implements fast matching algorithms for all data field features that can be matched in a single data packet. The protocol behavior feature matching engine needs to record a series of actions and states during the protocol operation to match the established behavior feature sequence.
