Character string matching method based on a finite state automaton, and content filtering equipment

A string matching technique based on finite state automata, applied in the retrieval field, that addresses problems such as long character matching delays. It improves processing speed and efficiency, reduces delay, and relieves system performance bottlenecks.

Inactive Publication Date: 2010-11-03
RUIJIE NETWORKS CO LTD
0 Cites, 19 Cited by

AI Technical Summary

Problems solved by technology

[0027] The embodiment of the present invention provides a character string matching method and a content filtering device based on a


Examples


Embodiment Construction

[0041] In the prior art, the frequent main memory accesses that occur when a DFA is used to search a database or filter a network data stream seriously degrade the speed and efficiency of character string matching. To solve this problem and reduce the delay in the character matching process, the number of memory accesses during character string matching must be reduced as much as possible.

[0042] A careful analysis of the existing DFA state table (Table 1) singles out states 2, 5, 8, 9, 10, and 11. These six states share the following two characteristics:

[0043] (1) State sequence association.

[0044] For example, consider states 2, 5, 8, 9, 10, and 11 in Table 1: in state 2 or 5, the input character 'R' leads to state 8; in state 8, the input character 'Z' leads to state 9; in state 9, the input character 'W' leads to state 10; in state 10, the input character 'X' leads to state 11. These six states are therefore sequentially associated.
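The sequential association described in [0044] can be illustrated with a short Python sketch. It encodes only the five transitions quoted above (the rest of Table 1 is not reproduced here), and the helper `follow_chain` is a hypothetical name introduced for illustration, not something defined by the patent. The sketch treats a state as part of a chain when it has exactly one listed outgoing transition; the actual conditions for merging states are not fully stated in this excerpt, so this is an assumption.

```python
# Only the transitions quoted in paragraph [0044]; the remainder of
# Table 1 is not available in this excerpt.
TRANSITIONS = {
    (2, "R"): 8,
    (5, "R"): 8,
    (8, "Z"): 9,
    (9, "W"): 10,
    (10, "X"): 11,
}


def follow_chain(transitions, start):
    """Walk forward from `start` while the current state has exactly one
    listed outgoing transition (an assumed notion of 'sequentially
    associated'). Returns the visited states and the characters consumed."""
    outgoing = {}
    for (state, char), target in transitions.items():
        outgoing.setdefault(state, []).append((char, target))

    states, chars = [start], []
    current = start
    while len(outgoing.get(current, [])) == 1:
        char, target = outgoing[current][0]
        if target in states:  # guard against cycles
            break
        chars.append(char)
        states.append(target)
        current = target
    return states, "".join(chars)


chain_states, chain_label = follow_chain(TRANSITIONS, 8)
print(chain_states, chain_label)
```

Starting from state 8, the walk visits states 8, 9, 10, and 11 and collects the characters "ZWX", which is the kind of character sequence a merged state could store as a single string.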

[0045] (2)...


Abstract

The invention discloses a character string matching method based on a finite state automaton, and content filtering equipment. The method combines two sequence-associated states in a DFA (Deterministic Finite Automaton) that meet set conditions to obtain a combined DFA. The corresponding matching process is as follows: characters are read sequentially from a character string database, and the current state and the character read determine whether character string matching applies. If not, the DFA jumps to the next state according to the current state and the character read. If so, the matching character string of the current state is fetched from its storage address, the next character is read, and it is compared with the next character of the matching character string; while the characters match, the next character continues to be read until the character string has been matched successfully, and the DFA then jumps to the next state; on a mismatch, the DFA jumps to the next state according to the current state and the character read. The method reduces the number of memory accesses during character string matching and improves its speed and efficiency.
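The control flow in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the tiny DFA for the single keyword "RZWX", the names `run_plain` and `run_merged`, and the `merged` table layout are all hypothetical. In particular, the sketch assumes that the original intermediate states remain available, so that a mismatch partway through the stored string can fall back to ordinary table transitions.

```python
def run_plain(text, delta, accepting, start=0):
    """Reference run: one transition-table lookup per input character."""
    state, hits = start, []
    for pos, char in enumerate(text):
        state = delta.get((state, char), start)
        if state in accepting:
            hits.append(pos + 1)  # end index (exclusive) of the match
    return hits


def run_merged(text, delta, merged, accepting, start=0):
    """Run with merged states. merged[state] = (stored, chain), where
    chain[k] is the original state reached after consuming stored[:k + 1]."""
    state, hits, i = start, [], 0
    while i < len(text):
        entry = merged.get(state)
        if entry is not None and text[i] == entry[0][0]:
            # String-matching mode: fetch the stored string once, then
            # compare input characters against it without table lookups.
            stored, chain = entry
            k = 1
            while (k < len(stored) and i + k < len(text)
                   and text[i + k] == stored[k]):
                k += 1
            for j in range(k):
                if chain[j] in accepting:
                    hits.append(i + j + 1)
            state = chain[k - 1]  # full match or last matched position
            i += k
        else:
            # Ordinary mode: jump by current state and character read.
            state = delta.get((state, text[i]), start)
            if state in accepting:
                hits.append(i + 1)
            i += 1
    return hits


# Hypothetical DFA for the keyword "RZWX"; missing entries return to state 0.
DELTA = {(0, "R"): 1, (1, "Z"): 2, (2, "W"): 3, (3, "X"): 4}
DELTA.update({(s, "R"): 1 for s in range(1, 5)})
ACCEPTING = {4}
MERGED = {1: ("ZWX", [2, 3, 4])}  # states 2, 3, 4 folded into state 1

print(run_plain("ARZRZWXB", DELTA, ACCEPTING))
print(run_merged("ARZRZWXB", DELTA, MERGED, ACCEPTING))
```

Both runs report the same match positions. The merged run replaces the per-character table lookups along the chain with a single fetch of the stored string, which is the effect on memory accesses that the abstract describes.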

Description

Technical field

[0001] The invention relates to the technical field of retrieval, in particular to a character string matching method and a content filtering device based on a finite state automaton (Deterministic Finite State Automaton, DFA).

Background technique

[0002] The Aho-Corasick algorithm was proposed by Aho and Corasick of Bell Labs in "Efficient String Matching: An Aid to Bibliographic Search" in 1975. Its core is a finite state automaton (DFA) covering all query keywords. Each character in the database to be searched is input into the DFA one by one, and when a query keyword is hit, the DFA outputs a report. The algorithm can be used in string matching, text retrieval, deep content filtering of network data flows, intrusion detection, network antivirus, and other fields.

[0003] In the process of obtaining the DFA through the Aho-Corasick algorithm, three functions need to be constructed: GOTO, FAILURE and OUTPUT. The process of constructing ...
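As an illustration of the three functions named in [0003], the following Python sketch builds GOTO, FAILURE and OUTPUT in the textbook Aho-Corasick manner and uses them to search a text. The function names `build` and `search` and the sample keywords are assumptions chosen for the example; the patent's own construction details are truncated in this excerpt. The sketch keeps explicit failure links rather than precomputing them into a full DFA transition table.

```python
from collections import deque


def build(keywords):
    """Construct the GOTO, FAILURE and OUTPUT functions for `keywords`."""
    goto = [{}]       # goto[state][char] -> next state (a keyword trie)
    output = [set()]  # output[state] -> keywords recognized at this state
    for word in keywords:
        state = 0
        for char in word:
            if char not in goto[state]:
                goto.append({})
                output.append(set())
                goto[state][char] = len(goto) - 1
            state = goto[state][char]
        output[state].add(word)

    failure = [0] * len(goto)
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:  # breadth-first, so shallower states are finished first
        state = queue.popleft()
        for char, nxt in goto[state].items():
            queue.append(nxt)
            fallback = failure[state]
            while fallback and char not in goto[fallback]:
                fallback = failure[fallback]
            failure[nxt] = goto[fallback].get(char, 0)
            output[nxt] |= output[failure[nxt]]
    return goto, failure, output


def search(text, goto, failure, output):
    """Feed `text` one character at a time; return (start index, keyword) hits."""
    state, hits = 0, []
    for pos, char in enumerate(text):
        while state and char not in goto[state]:
            state = failure[state]
        state = goto[state].get(char, 0)
        for word in output[state]:
            hits.append((pos - len(word) + 1, word))
    return hits


goto, failure, output = build(["he", "she", "his", "hers"])
print(sorted(search("ushers", goto, failure, output)))
```

For the text "ushers", the search reports "she" starting at index 1 and both "he" and "hers" starting at index 2. Hits are sorted before printing because the keywords at a state are stored in a set, whose iteration order is not fixed.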


Application Information

IPC(8): G06F17/30
Inventor: 黄凯明
Owner: RUIJIE NETWORKS CO LTD