Deterministic finite automaton (DFA) matching method and device based on TCAM (ternary content addressable memory)

A finite automaton matching technology, applied in special data processing applications, instruments, electrical digital data processing, etc., that addresses the problems of oversized automata and of DFAs that are too large to store.

Inactive Publication Date: 2013-09-11
UNIV OF SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

However, the number of DFA states may grow exponentially with the size of the regular expression rule set, so the DFA for a rule set is often too large to store.
Current DFA-based methods cannot break through this storage bottleneck, that is, the number of sto...



Examples


Embodiment 1

[0035] The specific processing flow of the TCAM-based DFA matching method provided in this embodiment is shown in Figure 1 and includes the following processing steps:

[0036] Step S101: each state of the DFA is represented by several TCAM entries. Each TCAM entry consists of three fields: a source state field, an input character field, and a destination state field. The source state field and the destination state field each consist of two subfields: a copy ID field and a private ID field.

[0037] A DFA can be implemented using TCAM. As a high-speed parallel search engine, TCAM has been widely deployed in high-speed network devices for applications such as IP address lookup and header-based packet classification. A TCAM storage cell can store the bit values "0", "1", and "*", where "*" means "don't care", that is, it matches either "0" or "1". A TCAM chip consists of a certain ...
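The lookup described above can be sketched in software as follows. This is a minimal illustration, not the patented encoding: the 2-bit state codes, the toy rule table `ENTRIES`, and the helper names `ternary_match`, `tcam_lookup`, and `run_dfa` are all assumptions made for the example. It also assumes the common TCAM convention that the first matching entry wins, and it does not model the copy ID / private ID subfields.

```python
from typing import List, Optional, Tuple

# One TCAM entry: (source-state pattern, input-character pattern, destination state).
# Patterns are strings over '0', '1', '*'; '*' is the "don't care" bit.
Entry = Tuple[str, str, str]


def ternary_match(pattern: str, key: str) -> bool:
    """Return True if every pattern bit equals the key bit or is '*'."""
    return len(pattern) == len(key) and all(
        p == "*" or p == k for p, k in zip(pattern, key)
    )


def tcam_lookup(entries: List[Entry], state: str, char_bits: str) -> Optional[str]:
    """Return the destination state of the first matching entry, if any.

    A hardware TCAM compares all entries in parallel; the loop below only
    imitates the assumed "first match wins" priority behaviour.
    """
    for src, ch, dst in entries:
        if ternary_match(src, state) and ternary_match(ch, char_bits):
            return dst
    return None


def run_dfa(entries: List[Entry], start: str, data: bytes) -> Optional[str]:
    """Feed the input one byte at a time; the search key is (state, byte)."""
    state = start
    for byte in data:
        nxt = tcam_lookup(entries, state, format(byte, "08b"))
        if nxt is None:
            return None
        state = nxt
    return state


A = format(ord("a"), "08b")
B = format(ord("b"), "08b")

# Hypothetical toy DFA that detects the substring "ab".
# States: 00 = start, 01 = seen "a", 10 = seen "ab" (absorbing).
ENTRIES: List[Entry] = [
    ("00", A, "01"),
    ("01", B, "10"),
    ("01", A, "01"),
    ("10", "********", "10"),  # one wildcard entry covers all 256 characters
    ("**", "********", "00"),  # default transition back to the start state
]

print(run_dfa(ENTRIES, "00", b"xxaab"))
```

The two wildcard entries show why ternary storage helps: a single entry stands in for many explicit transitions.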

Embodiment 2

[0130] Use the Thompson algorithm to convert the regular expression into a standard NFA, then use the subset construction method and the Hopcroft algorithm to convert this NFA into a DFA with the minimum number of states.
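The subset construction step can be sketched as below. This is a generic textbook version under stated assumptions: the NFA is given as a dictionary of labelled edges, `EPS` (here `None`) marks an ε edge, and the toy NFA `TOY_NFA` for a(b|c) is invented for the example. Thompson construction and Hopcroft minimization are not shown.

```python
from typing import Dict, FrozenSet, Iterable, List, Optional, Set, Tuple

EPS = None  # label used for ε edges in this sketch
Nfa = Dict[int, List[Tuple[Optional[str], int]]]
DfaState = FrozenSet[int]


def eps_closure(nfa: Nfa, states: Iterable[int]) -> DfaState:
    """All NFA states reachable from `states` through ε edges only."""
    closure: Set[int] = set(states)
    stack = list(closure)
    while stack:
        s = stack.pop()
        for label, t in nfa.get(s, []):
            if label is EPS and t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def subset_construction(
    nfa: Nfa, start: int, alphabet: Iterable[str]
) -> Tuple[DfaState, Dict[Tuple[DfaState, str], DfaState]]:
    """Return the DFA start state and its transition table.

    Each DFA state is a frozenset of NFA states.
    """
    symbols = list(alphabet)
    start_set = eps_closure(nfa, {start})
    dfa: Dict[Tuple[DfaState, str], DfaState] = {}
    seen = {start_set}
    todo = [start_set]
    while todo:
        cur = todo.pop()
        for ch in symbols:
            moved = {t for s in cur for label, t in nfa.get(s, []) if label == ch}
            if not moved:
                continue  # no transition on this character
            nxt = eps_closure(nfa, moved)
            dfa[(cur, ch)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return start_set, dfa


# Hypothetical toy ε-NFA for the expression a(b|c); state 4 is accepting.
TOY_NFA: Nfa = {
    0: [("a", 1)],
    1: [(EPS, 2), (EPS, 3)],
    2: [("b", 4)],
    3: [("c", 4)],
}

START, DFA = subset_construction(TOY_NFA, 0, "abc")
print(len(DFA))
```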

[0131] Figure 2 shows the NFA used to match the regular expressions ab[^h]*cd and ef[^h]*c[hH]. Circles are NFA states and arrows are NFA state transition edges. State 0 is the initial state; states 13 and 14 are accepting states, drawn as double circles.

[0132] Define the "ε edge" of an NFA as an edge that transfers to a new state without consuming any input character. In Figure 2, states 0, 5, 6, 7, and 8 have ε edges.

[0133] Define an "ε-NFA" as an NFA with ε edges. The NFA in Figure 2 is an ε-NFA.

[0134] Define the "combined edge" of an NFA, which is an edge composed of all outgoing edges that start from the same NFA state and arrive at the same...

Embodiment 3

[0151] A method for identifying chains in the NFA provided in this embodiment includes:

[0152] Find all NFA chain head states, that is, NFA states that have no non-ε incoming edge. For example, in Figure 2, states 1, 2, 9, and 10 can serve as chain head states.

[0153] Extend a chain starting from each chain head state in turn: if the current NFA state has a non-ε outgoing edge to a next NFA state, and that next state has no ε edge pointing to a state already on the chain, extend the chain to the next state, take it as the current state, and continue in the same way; otherwise, stop extending the chain. For example, when extending from chain head state 1 in Figure 2, state 1 has a non-ε outgoing edge to state 3, and state 3 has no ε edge, so the chain is extended to state 3; state 3 is extended to state 5 on character b; state 5 tries to extend...
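The chain identification rule above can be sketched as follows. This is one reading of the text under explicit assumptions: the NFA is a dictionary of labelled edges with `EPS` (here `None`) marking ε edges, a chain is extended only when the current state has exactly one non-ε successor, and the toy NFA `TOY_NFA` is invented for the example (it is not the NFA of Figure 2). The function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

EPS = None  # label used for ε edges in this sketch
Nfa = Dict[int, List[Tuple[Optional[str], int]]]


def find_chain_heads(nfa: Nfa) -> List[int]:
    """States with no non-ε incoming edge, as in paragraph [0152]."""
    states = set(nfa)
    for edges in nfa.values():
        states.update(t for _, t in edges)
    has_non_eps_in = {
        t for edges in nfa.values() for label, t in edges if label is not EPS
    }
    return sorted(states - has_non_eps_in)


def extend_chain(nfa: Nfa, head: int) -> List[int]:
    """Grow a chain from `head` following the rule in paragraph [0153]."""
    chain = [head]
    cur = head
    while True:
        successors = {t for label, t in nfa.get(cur, []) if label is not EPS}
        # Assumption: stop unless there is exactly one non-ε successor.
        if len(successors) != 1:
            break
        nxt = next(iter(successors))
        if nxt in chain:
            break  # avoid looping on a non-ε cycle
        eps_targets = {t for label, t in nfa.get(nxt, []) if label is EPS}
        if eps_targets & set(chain):
            break  # next state has an ε edge back into the chain
        chain.append(nxt)
        cur = nxt
    return chain


# Hypothetical toy NFA: 0 -ε-> 1 -a-> 2 -b-> 3 -c-> 4, and 4 -ε-> 2.
TOY_NFA: Nfa = {
    0: [(EPS, 1)],
    1: [("a", 2)],
    2: [("b", 3)],
    3: [("c", 4)],
    4: [(EPS, 2)],
}

for head in find_chain_heads(TOY_NFA):
    print(head, extend_chain(TOY_NFA, head))
```

In the toy NFA the chain from state 1 stops before state 4, because state 4 has an ε edge pointing back to state 2, which is already on the chain.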



Abstract

The invention relates to a deterministic finite automaton (DFA) matching method and device based on TCAM (ternary content addressable memory). The method comprises the following steps: representing each state of a DFA with a plurality of TCAM entries, each comprising a source state field, an input character field, and a destination state field; identifying chain structures in the DFA based on the chain structures in the nondeterministic finite automaton (NFA) from which the DFA is generated and on the relationship between the chain structures of the NFA and those of the DFA; renumbering and re-encoding the DFA states based on the chain structures in the DFA, and encoding the state transition edges of the DFA in the TCAM; and forming a search key from a specific source state field and input character field, searching all TCAM entries of the DFA with that key, and returning the destination state field of the matching entry as the output. By encoding and compressing the state transition edges of the DFA in the TCAM, the method and device provided by the embodiments of the invention allow the replicated chain structures of the DFA to be merged in the TCAM, thereby reducing storage space.

Description

Technical field

[0001] The present invention relates to the field of computer application technology, in particular to a DFA matching method and device based on TCAM (ternary content addressable memory).

Background

[0002] Regular expression technology is a core basic technology of computer network systems, widely used in intrusion detection and prevention, signature matching, worm detection, packet content filtering, traffic analysis, protocol identification, and other fields. Regular expressions offer a flexible and powerful way to describe string patterns. Regular expression matching is implemented with finite automata, which include the NFA (non-deterministic finite automaton) and the DFA (deterministic finite automaton). That is, the regular expression can be compiled into an NFA, and then the NFA can be converted into an equivale...


Application Information

IPC(8): G06F17/30
Inventor 董群峰, 彭坤杨
Owner UNIV OF SCI & TECH OF CHINA