
Pattern matching algorithm and system based on high-frequency traffic content

A pattern matching algorithm technology, applied in character and pattern recognition, calculation, and computer components. It addresses the low matching performance that results from ignoring the characteristics of repeated traffic content, and it improves matching performance.

Active Publication Date: 2021-07-02
HARBIN INST OF TECH

AI Technical Summary

Problems solved by technology

[0004] The AC algorithm must scan the text character by character. Its complexity is O(n), and it ignores the repeated content in the traffic, so its matching performance is low.



Examples


Embodiment 1

[0038] Embodiment 1. This embodiment is described with reference to Figure 1. The pattern matching system based on high-frequency traffic content comprises an AC automaton, a mapping module, and a UHC matching module. The UHC matching module communicates with the AC automaton through the mapping module. The mapping module is composed of multiple mapping sets, and the UHC matching module creates multiple subsets corresponding to the mapping sets. The AC automaton scans the text, the mapping module matches high-frequency content, and the UHC matching module processes high-frequency content and saves the state used to jump back to the AC automaton.

[0039] The mapping module is a fast-lookup module. It generates a small mapping set that acts as a bridge between the AC nodes and the high-frequency content matching module, and it performs a fast lookup during scanning to determine whether the high-frequency conte...
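
As a rough illustration of the mapping module described above, the following Python sketch groups a set of high-frequency strings by their first character. The function names `build_mapping` and `needs_secondary_search`, the dictionary layout, and the sample strings are assumptions made for this example; the visible text does not specify the actual data structures.

```python
def build_mapping(high_freq_content):
    """Group high-frequency strings by first character (hypothetical layout).

    The keys of the returned dict play the role of the mapping set
    (deduplicated first characters). Each value is the subset of
    high-frequency strings that start with that character.
    """
    subsets = {}
    for s in high_freq_content:
        if s:  # ignore empty strings
            subsets.setdefault(s[0], []).append(s)
    return subsets


def needs_secondary_search(ch, subsets):
    """Fast check: can any high-frequency string start with ch?"""
    return ch in subsets


# Made-up high-frequency strings, for illustration only.
subsets = build_mapping(["<html>", "<head>", "HTTP/1.1 200 OK"])
mapping_set = set(subsets)
```

With this layout, a single dictionary membership test per scanned character decides whether the more expensive comparison against the high-frequency subset is worth attempting.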

Embodiment 2

[0042] Embodiment 2. This embodiment is described with reference to Figures 1 to 5. The pattern matching algorithm based on high-frequency traffic content comprises the following steps:

[0043] Step 1. Create an automaton;

[0044] Step 1.1. Build the automaton from the pattern set, starting by creating the root node;

[0045] Step 1.2. Read the next character of the pattern in sequence. If the current node has no edge for that character, go to step 1.3; otherwise, go to step 1.4. When all characters of all patterns have been inserted into the automaton, go to step 1.5;

[0046] Step 1.3. Create a new node, set the edge value to the scanned character, and return to step 1.2;

[0047] Step 1.4. The automaton state moves along the edge to the next node; return to step 1.2;

[0048] Step 1.5. Traverse the automaton in depth and add a failure pointer to each node;

[0049] Step 1.6. Extract the first characters of all patterns of high-fr...
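
Steps 1.1 to 1.5 follow the usual Aho-Corasick construction, which can be sketched in Python as below. This is a minimal sketch, not the patent's implementation: the names `Node` and `build_automaton` are hypothetical, and the failure pointers are assigned in breadth-first (level) order, which is the standard construction. Whether that is the traversal order intended by step 1.5 is an assumption.

```python
from collections import deque


class Node:
    """One automaton state (hypothetical structure for illustration)."""

    def __init__(self):
        self.edges = {}   # character -> child Node
        self.fail = None  # failure pointer
        self.out = []     # patterns that end at this state


def build_automaton(patterns):
    root = Node()                      # step 1.1: create the root node
    for pattern in patterns:
        node = root
        for ch in pattern:             # step 1.2: next character
            if ch not in node.edges:   # step 1.3: no edge, create a node
                node.edges[ch] = Node()
            node = node.edges[ch]      # step 1.4: follow the edge
        node.out.append(pattern)

    # Step 1.5: add failure pointers, level by level (assumed order).
    root.fail = root
    queue = deque()
    for child in root.edges.values():
        child.fail = root
        queue.append(child)
    while queue:
        node = queue.popleft()
        for ch, child in node.edges.items():
            f = node.fail
            while f is not root and ch not in f.edges:
                f = f.fail
            child.fail = f.edges.get(ch, root)
            # Inherit patterns that end at the failure target.
            child.out = child.out + child.fail.out
            queue.append(child)
    return root


root = build_automaton(["he", "she", "his", "hers"])
```

In this example the state reached by "she" also reports "he", because its failure pointer leads to the state for "he".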


Abstract

The invention provides a pattern matching algorithm and system based on high-frequency traffic content. By building an automaton for the high-frequency content set and the mapping set, the character currently being scanned is matched against the high-frequency content set through the mapping set. The mapping set is formed by deduplicating the first characters of all strings in the high-frequency content set. Each time an automaton node is visited, a fast lookup determines whether a secondary search of the high-frequency content set is needed. The automaton scans the text from left to right, starting from the root node. As each character is scanned, it is checked against the strings of the high-frequency content set, and when it matches one of them the high-frequency content is skipped. This addresses the low matching efficiency of the prior art, which ignores the characteristics of repeated content in the traffic, and improves matching efficiency.
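
The scanning idea in the abstract can be illustrated with the Python sketch below. It is one possible reading under stated assumptions, not the patented method: the names `build_ac`, `step`, and `scan` are hypothetical, and the way a skip is made safe here (caching, per automaton state and high-frequency string, the end state and the matches found inside that string) is an assumption chosen so that the result equals a plain character-by-character scan.

```python
from collections import deque


def build_ac(patterns):
    """Compact Aho-Corasick tables: goto[s][ch] -> state, fail[s], out[s]."""
    goto, fail, out = [{}], [0], [[]]
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]
            queue.append(t)
    return goto, fail, out


def step(ac, s, ch):
    """Advance the automaton by one character."""
    goto, fail, _ = ac
    while s and ch not in goto[s]:
        s = fail[s]
    return goto[s].get(ch, 0)


def scan(text, patterns, high_freq):
    """Return (start, pattern) matches, skipping known high-frequency strings."""
    ac = build_ac(patterns)
    by_first = {}  # mapping set: first character -> high-frequency subset
    for h in high_freq:
        by_first.setdefault(h[0], []).append(h)
    for group in by_first.values():
        group.sort(key=len, reverse=True)  # assumed: prefer the longest
    cache = {}  # (state, string) -> (end state, [(relative start, pattern)])
    matches, s, i = [], 0, 0
    while i < len(text):
        hit = next((h for h in by_first.get(text[i], ())
                    if text.startswith(h, i)), None)
        if hit is None:  # ordinary character-by-character step
            s = step(ac, s, text[i])
            matches += [(i - len(p) + 1, p) for p in ac[2][s]]
            i += 1
            continue
        if (s, hit) not in cache:  # first encounter: scan once and remember
            t, found = s, []
            for k, ch in enumerate(hit):
                t = step(ac, t, ch)
                found += [(k - len(p) + 1, p) for p in ac[2][t]]
            cache[(s, hit)] = (t, found)
        s, found = cache[(s, hit)]  # later encounters: jump directly
        matches += [(i + off, p) for off, p in found]
        i += len(hit)
    return matches


text = "<html>attack</html><html>"
result = scan(text, ["attack", "html"], ["<html>", "</html>"])
```

Because the cache is keyed by the automaton state at the point where the high-frequency string begins, matches that straddle the boundary of a skipped string are still reported. Whether skipping saves time in practice depends on how cheaply the high-frequency comparison itself is implemented, which the visible text does not specify.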

Description

Technical field

[0001] The present application relates to a pattern matching algorithm, in particular to a pattern matching algorithm and system based on high-frequency traffic content.

Background

[0002] In the traffic inspected by the NIDS at an enterprise gateway, a large amount of HTTP traffic contains repeated content, including full repetition and partial repetition. Full repetition means that an entire string occurs multiple times, like a stylesheet (e.g., <html,,), while partial repetition involves a substring, such as shared HTML code. In addition, traffic from the same Internet content provider is very similar: it uses the same HTML framework and similar files.

[0003] The classic algorithm for pattern matching is the AC algorithm. The AC algorithm is an automaton algorithm based on prefix search. It uses prefixes to build a deterministic finite automaton (DFA). This automaton scans the text to find content in the text that is identical to the pattern set...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06K9/32, G06K9/62
CPC: G06V30/418, G06V30/40, G06V20/62, G06V10/751, G06V30/10
Inventor: 余翔湛, 刘立坤, 韦贤葵, 史建焘, 叶麟, 葛蒙蒙, 李精卫, 石开宇, 车佳臻, 王久金, 冯帅, 赵跃, 宋赟祖
Owner: HARBIN INST OF TECH