Regular expression matching acceleration method based on a deterministic finite automaton with memory

A finite automaton and regular expression matching technology, applied in the field of information processing, that addresses the sharp growth in the number of states for regular expressions with repetition, and reduces rule database size, development difficulty, cost, and complexity.

Inactive Publication Date: 2008-06-18
ZHEJIANG UNIV
0 Cites, 36 Cited by

AI Technical Summary

Problems solved by technology

[0017] Although the above conversion method can support repetition operators, when the number of repetitions and its range are both large, the number of states generated for the regular expression increases sharply, placing a heavy burden on both the rule compiler and the pattern matching engine.
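The growth described in [0017] can be illustrated with a minimal sketch. It assumes the "conversion method" unfolds A{n, m} into n mandatory copies of A followed by m - n nested optional copies; the excerpt does not spell out the exact conversion, so the `unfold` helper below is hypothetical and only shows why the expanded rule grows linearly with m before any automaton is built.

```python
import re


def unfold(sub: str, n: int, m: int) -> str:
    """Expand sub{n,m} into n mandatory copies plus (m - n) nested optional copies.

    Hypothetical helper for illustration; not taken from the patent text.
    """
    if not 0 <= n <= m:
        raise ValueError("need 0 <= n <= m")
    atom = f"(?:{sub})"
    optional = ""
    # Build the optional tail from the inside out: (?:A(?:A(?:A)?)?)?
    for _ in range(m - n):
        optional = f"(?:{atom}{optional})?"
    return atom * n + optional


if __name__ == "__main__":
    pattern = unfold("ab", 2, 4)
    # The unfolded pattern accepts exactly 2 to 4 copies of "ab".
    for k in range(7):
        print(k, bool(re.fullmatch(pattern, "ab" * k)))
    # The number of copies of the sub-expression equals m, however small A is.
    print(unfold("a", 100, 200).count("(?:a)"))
```

Every copy of A contributes its own states to the automaton, so a rule such as A{100, 200} carries 200 copies of A through every later compilation stage.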

Examples

Embodiment Construction

[0039] Regular expression matching acceleration method based on a deterministic finite automaton with memory: its core is to use a deterministic finite automaton with memory to support the repetition operator (A{n, m}) directly. Without materially reducing matching performance, this greatly reduces the size of the generated deterministic finite automaton (DFA) for rules with a large number of repetitions, and lowers the storage cost. It also simplifies the design of the compiler and greatly reduces rule processing time. To use a deterministic finite automaton with memory, both the rule compiler and the pattern matching engine must support it.
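The idea of giving the automaton memory can be sketched as follows. This is not the patent's construction, whose details are not part of this excerpt; it is a minimal illustration that assumes the "memory" is a single integer counter attached to one loop state. The `CountedRule` class and `matches` function are hypothetical names, and the sketch only handles anchored rules of the form prefix, then C{n, m}, then suffix, where the first suffix character is not in the repeated set C (so each step stays deterministic).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CountedRule:
    """Rule of the form: prefix C{n,m} suffix (hypothetical, for illustration)."""
    prefix: str
    repeat_set: frozenset
    n: int
    m: int
    suffix: str

    def __post_init__(self):
        if not 0 <= self.n <= self.m:
            raise ValueError("need 0 <= n <= m")
        if self.suffix and self.suffix[0] in self.repeat_set:
            raise ValueError("suffix must not start with a repeated character")


def matches(rule: CountedRule, text: str) -> bool:
    """Anchored match using len(prefix) + len(suffix) + 1 states and one counter."""
    loop = len(rule.prefix)        # index of the single loop state
    s = len(rule.suffix)
    state, counter = 0, 0
    for ch in text:
        if state < loop:           # still consuming the literal prefix
            if ch != rule.prefix[state]:
                return False
            state += 1
        elif state == loop:        # loop state: the counter replaces m copied states
            if ch in rule.repeat_set:
                counter += 1
                if counter > rule.m:
                    return False
            elif s and counter >= rule.n and ch == rule.suffix[0]:
                state += 1
            else:
                return False
        else:                      # consuming the literal suffix
            i = state - loop       # suffix characters consumed so far
            if i >= s or ch != rule.suffix[i]:
                return False
            state += 1
    if s == 0:
        return state == loop and rule.n <= counter <= rule.m
    return state == loop + s


if __name__ == "__main__":
    rule = CountedRule("x", frozenset("ab"), 2, 1000, "y")
    print(matches(rule, "x" + "a" * 1000 + "y"))
    print(matches(rule, "x" + "a" * 1001 + "y"))
```

The number of states depends only on the lengths of the prefix and suffix; changing the bound from 4 to 1000 changes one integer comparison rather than adding hundreds of states.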

[0040] The rule description file: the rule description file contains the character strings (regular expressions) to be searched for. The file can contain any number of rules, and each rule consists of the following parts: a unique identifier, used to distinguish it from other r...

Abstract

The invention discloses a regular expression matching acceleration method based on a deterministic finite automaton with memory, comprising a regular expression rule compiler and a pattern matching engine. The rule compiler first transforms the regular expression into a parse tree, then transforms the parse tree into a nondeterministic finite automaton with memory, and finally into a deterministic finite automaton with memory. The pattern matching engine accelerates pattern matching by using the deterministic finite automaton with memory generated by the rule compiler. The invention has the following advantages: 1) because repetition operators are supported directly, the compiler does not need to unfold repeated expressions, which greatly reduces the difficulty of developing the compiler and lowers its memory usage and compile time; 2) for the same reason, the size of the rule database generated by the compiler is reduced, which lowers the cost and complexity of the pattern matching engine.
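The first compiler stage can be sketched with a small parse-tree model. The node classes `Char`, `Concat`, and `Repeat` and the two counting functions below are hypothetical; they assume that the number of character positions in the tree is a rough proxy for the size of the automaton built from it, and compare keeping each repetition as one counted node against unfolding it into m copies.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Char:
    value: str


@dataclass(frozen=True)
class Concat:
    left: "Node"
    right: "Node"


@dataclass(frozen=True)
class Repeat:
    """A{n,m} kept as a single node carrying its bounds."""
    child: "Node"
    n: int
    m: int


Node = Union[Char, Concat, Repeat]


def positions_compact(node: Node) -> int:
    """Character positions when each Repeat stays one counted node."""
    if isinstance(node, Char):
        return 1
    if isinstance(node, Concat):
        return positions_compact(node.left) + positions_compact(node.right)
    return positions_compact(node.child)


def positions_unfolded(node: Node) -> int:
    """Character positions when each Repeat is expanded into m copies of its child."""
    if isinstance(node, Char):
        return 1
    if isinstance(node, Concat):
        return positions_unfolded(node.left) + positions_unfolded(node.right)
    return node.m * positions_unfolded(node.child)


if __name__ == "__main__":
    # a(bc){10,500}
    tree = Concat(Char("a"), Repeat(Concat(Char("b"), Char("c")), 10, 500))
    print(positions_compact(tree), positions_unfolded(tree))
```

Nested repetitions multiply under unfolding, while the compact form grows only with the number of distinct characters in the rule.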

Description

Technical field

[0001] The invention relates to the field of information processing, in particular to a regular expression matching acceleration method based on a deterministic finite automaton with memory.

Background technique

[0002] String matching is one of the most basic operations in the field of information processing and the basis of many information processing applications. String matching is the process of finding substrings in an input string (hereinafter referred to as the target string) that have a specific relationship with a given string (hereinafter referred to as the feature string). String matching can be divided into exact string matching and fuzzy string matching. Fuzzy string matching finds substrings that are similar to the feature string (for example, a substring of the target string that has one or several characters added, removed, or modified compared with the feature string). Exact string matching is especially widely used.

[0003] A regular expression is a lit...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
Inventor: 王继民, 平玲娣, 潘雪增, 陈小平, 陈健, 陆魁军
Owner: ZHEJIANG UNIV