Method and device for constructing an AC (Aho-Corasick) state machine

A construction method and state machine technology, applied in the field of pattern matching, that addresses the limited scope of use of existing AC state machines.

Status: Inactive; Publication Date: 2012-08-22
BEIJING XINWANG RUIJIE NETWORK TECH CO LTD
Cites: 0; Cited by: 23

AI Technical Summary

Problems solved by technology

[0023] However, the existing AC state machine can only handle certain search patterns, and the AC state machine can...



Examples


Embodiment 1

[0045] This embodiment provides a method for constructing an AC state machine, applicable to a construction device of the AC state machine. The construction device treats each wildcard as a specific character in order to construct a standard goto function table and output function table. It then splits and merges the state nodes that are reached through wildcard-based transitions so as to eliminate uncertain goto functions, modifies the output function table at the same time, and finally obtains an AC state machine that can handle wildcards.

[0046] Figure 2A is a schematic flowchart of the construction method of the AC state machine in this embodiment. The construction method includes:

[0047] Step 201, setting each wildcard character in each search pattern as a specific character.

[0048] The specific character in this step may be represented by a non-ASCII code value, such as the valu...
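As an illustration of step 201 and of building a standard goto function table and output function table, here is a minimal sketch. It is not the patent's implementation: the sentinel value `WILDCARD = 256`, the use of `?` as the wildcard symbol in pattern strings, and the function names `encode` and `build_keyword_tree` are all assumptions introduced for this example, and the input is assumed to be ASCII so that the sentinel cannot collide with a real character.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical sentinel outside the 0-127 ASCII range (an assumption of this sketch).
WILDCARD = 256


def encode(pattern: str, wildcard_char: str = "?") -> List[int]:
    """Map each character to its code point and each wildcard to the sentinel."""
    return [WILDCARD if ch == wildcard_char else ord(ch) for ch in pattern]


def build_keyword_tree(
    patterns: List[str],
) -> Tuple[List[Dict[int, int]], Dict[int, Set[str]]]:
    """Build a goto table and an output table, treating the wildcard as an ordinary symbol."""
    goto: List[Dict[int, int]] = [{}]      # goto[state][symbol] -> next state
    output: Dict[int, Set[str]] = {}       # final state -> patterns ending there
    for pattern in patterns:
        state = 0
        for symbol in encode(pattern):
            if symbol not in goto[state]:
                goto.append({})            # allocate a new state node
                goto[state][symbol] = len(goto) - 1
            state = goto[state][symbol]
        output.setdefault(state, set()).add(pattern)
    return goto, output
```

With the patterns `a?c` and `abd`, state 1 (reached on `a`) ends up with two outgoing transitions: one on the wildcard sentinel and one on `b`. This is the kind of wildcard node with a sibling state node that the later steps operate on.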

Embodiment 2

[0058] This embodiment provides a further supplementary description of the method for constructing the AC state machine in the foregoing embodiments.

[0059] After step 203 and before step 204, the method for constructing the AC state machine in this embodiment further includes: identifying uncertain goto functions in the copied goto function table and the original goto function table of the sibling state node. Identifying the uncertain goto functions includes:

[0060] Identifying each transfer function expression and each transfer function output value in the copied goto function table and in the original goto function table of the sibling state node;

[0061] Treating transfer functions that have the same transfer function expression but different transfer function output values as uncertain goto functions.
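The identification rule in [0060] and [0061] can be sketched as a comparison of two tables. This is only an illustration under assumptions: each goto function table is modeled as a dictionary from input symbol to next state for a single sibling state node, so the "transfer function expression" reduces to the input symbol and the "output value" to the target state. The names `GotoTable` and `find_uncertain` are hypothetical.

```python
from typing import Dict, List, Tuple

# One state node's goto table: input symbol -> next state (assumed representation).
GotoTable = Dict[int, int]


def find_uncertain(copied: GotoTable, original: GotoTable) -> List[Tuple[int, int, int]]:
    """Return (symbol, original_target, copied_target) for each conflicting transition.

    A transition is uncertain when both tables define it for the same input
    symbol but send it to different target states.
    """
    conflicts = []
    for symbol, copied_target in copied.items():
        original_target = original.get(symbol)
        if original_target is not None and original_target != copied_target:
            conflicts.append((symbol, original_target, copied_target))
    return sorted(conflicts)
```

Transitions that appear in only one table, or that agree on the target state, are not reported, because they leave the next state uniquely determined.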

[0062] The construction method of the AC state machine in t...

Embodiment 3

[0064] This embodiment further defines the method for constructing the AC state machine in the foregoing embodiments.

[0065] In this embodiment, the step of copying the goto function table of the state node corresponding to the output value of the new goto function into the goto function table of the state node corresponding to the output value of the old goto function, removing the new goto function, and returning to the identification step until no uncertain goto function is identified, further includes:

[0066] Determining whether the state node corresponding to the output value of the new goto function is a final state node; if so, copying the output function table of that state node into the output function table of the state node corresponding to the output value of the old goto function. The final state node is the state node corresponding to the last input chara...
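The output-table update in [0066] can be sketched as follows. The sketch assumes that the output function table is a dictionary from state to the set of patterns recognized there, and that a state counts as final exactly when it has a non-empty entry in that dictionary. Both points, and the name `copy_output_if_final`, are assumptions of this example rather than details confirmed by the text.

```python
from typing import Dict, Set


def copy_output_if_final(
    output: Dict[int, Set[str]], new_target: int, old_target: int
) -> bool:
    """Copy the new target's output entries to the old target when it is final.

    Returns True if something was copied, False if new_target is not final.
    """
    if not output.get(new_target):
        return False  # not a final state node: nothing to copy
    output.setdefault(old_target, set()).update(output[new_target])
    return True
```

The new target's own entry is left untouched here; whether it is later discarded depends on steps that are not visible in this excerpt.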


Abstract

The invention provides a method and a device for constructing an AC (Aho-Corasick) state machine. The method comprises the following steps: setting each wildcard character in each search pattern as a specific character; constructing a keyword tree according to each search pattern; copying the goto function table of each wildcard character node to its sibling state nodes, and recording the input characters that transfer to the sibling state nodes as to-be-excluded characters; when an uncertain goto function is recognized, processing it so as to eliminate it, and repeating until no uncertain goto function is recognized; and excluding from the wildcard characters all to-be-excluded characters corresponding to the sibling state nodes. With the disclosed method and device, the AC state machine can handle wildcard characters.

Description

Technical field

[0001] The invention relates to pattern matching technology, and in particular to a method and device for constructing an AC state machine.

Background

[0002] The multi-pattern matching problem is one of the basic problems in computer science. It can be described simply as follows: given a search text and a search pattern set, where the set includes more than two search patterns and each search pattern is usually a string, find each search pattern of the set in the search text. For example, if the search text A is abcdefg123456 and the search pattern set C is {abc, ef, tian, 123, 67, 890}, then the result of multi-pattern matching is that the search text A contains the search patterns abc, ef and 123.

[0003] Multi-pattern matching only needs to scan the search text once to find out all the search patterns matched by the search text. It has high matching effic...
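The multi-pattern matching example in the background (search text A and pattern set C) can be reproduced with a standard, wildcard-free Aho-Corasick automaton. The sketch below is the textbook construction with goto, failure, and output tables, not the wildcard-aware method of this patent; the function names `build_automaton` and `search` are chosen for this example.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def build_automaton(
    patterns: List[str],
) -> Tuple[List[Dict[str, int]], List[int], List[Set[str]]]:
    """Build goto, failure, and output tables for a set of plain string patterns."""
    goto: List[Dict[str, int]] = [{}]
    output: List[Set[str]] = [set()]
    for pattern in patterns:                      # phase 1: keyword tree
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                output.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].add(pattern)

    fail = [0] * len(goto)                        # phase 2: failure links, breadth-first
    queue = deque(goto[0].values())               # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            output[nxt] |= output[fail[nxt]]      # inherit matches ending at the suffix state
    return goto, fail, output


def search(text: str, patterns: List[str]) -> Set[str]:
    """Scan the text once and return every pattern that occurs in it."""
    goto, fail, output = build_automaton(patterns)
    found: Set[str] = set()
    state = 0
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        found |= output[state]
    return found


if __name__ == "__main__":
    print(sorted(search("abcdefg123456", ["abc", "ef", "tian", "123", "67", "890"])))
```

Each character of the text is consumed exactly once by the outer loop, which is the single-scan property described in [0003].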


Application Information

IPC(8): G06F17/30
Inventor 陈国鹏
Owner BEIJING XINWANG RUIJIE NETWORK TECH CO LTD