Regular expression matching method and device

A regular expression matching technology, applied in the field of data processing, that addresses the slow matching speed of regular expressions and improves matching speed.

Active Publication Date: 2014-03-05
浙江杭海新城控股集团有限公司

AI Technical Summary

Problems solved by technology

[0005] Embodiments of the present invention provide a regular expression match...



Examples


Example 1

[0085] Example 1: The regular expression "a(bcd)ef" contains one level of the nesting metacharacters "()". The group contains no branch metacharacter "|" and is not followed by a repetition metacharacter, so the "()" can be deleted to obtain the required string "abcdef" of the regular expression "a(bcd)ef". The fingerprints of the regular expression "a(bcd)ef" are therefore "abc", "bcd", "cde", and "def".
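The extraction in Example 1 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names `strip_plain_groups` and `fingerprints` are hypothetical, and the fingerprint length of 3 is an assumption inferred from the fingerprints listed in the example.

```python
def strip_plain_groups(pattern):
    """Delete "(" and ")" from a pattern.

    Only valid under the conditions of Example 1: no group contains "|"
    and no group is followed by a repetition metacharacter.
    """
    return pattern.replace("(", "").replace(")", "")


def fingerprints(required, k=3):
    """Return every length-k substring of a required string, in order.

    k=3 is an assumption taken from the example, not a stated parameter.
    """
    return [required[i:i + k] for i in range(len(required) - k + 1)]


required = strip_plain_groups("a(bcd)ef")   # "abcdef"
print(fingerprints(required))               # ['abc', 'bcd', 'cde', 'def']
```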

Example 2

[0086] Example 2: The regular expression "abc+de" contains the repetition metacharacter "+", which indicates that the character "c" is repeated one or more times. To extract the required strings, "abc+de" can be split into two branch regular expressions, "abc" and "cde". The required string of the branch "abc" is "abc", and the required string of the branch "cde" is "cde", so the required strings of "abc+de" are "abc" and "cde". The fingerprints of the regular expression "abc+de" are therefore "abc" and "cde".
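The split matters because the text around a repeated character is not fixed: "abc+de" matches both "abcde" and "abccde", and only "abc" and "cde" appear in every match. A minimal Python sketch of the split, assuming a pattern of literal characters with exactly one "+" applied to a single character (the names `split_single_char_plus` and `kgrams` are hypothetical, and the fingerprint length of 3 is inferred from the example):

```python
def split_single_char_plus(pattern):
    """Split a pattern such as "abc+de" at its "+" into two branches.

    The left branch ends with one copy of the repeated character and the
    right branch starts with one copy, as described in Example 2.
    """
    i = pattern.index("+")
    if i == 0:
        raise ValueError('"+" must follow a character')
    left = pattern[:i]                       # "abc"
    right = pattern[i - 1] + pattern[i + 1:] # "c" + "de"
    return left, right


def kgrams(text, k=3):
    """Return every length-k substring of text (k=3 is an assumption)."""
    return [text[i:i + k] for i in range(len(text) - k + 1)]


left, right = split_single_char_plus("abc+de")
print(kgrams(left) + kgrams(right))          # ['abc', 'cde']
```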

Example 3

[0087] Example 3: The regular expression "a(bc)+f" contains the nesting metacharacters "()" and the repetition metacharacter "+", which indicates that the string "bc" is repeated one or more times. As in Example 2, to extract the required strings, "a(bc)+f" can be split into two branch regular expressions, "a(bc)" and "(bc)f". As in Example 1, the nesting metacharacters "()" in the two branches can then be deleted to obtain "abc" and "bcf". The required string of the branch "a(bc)" is "abc", and the required string of the branch "(bc)f" is "bcf", so the required strings of "a(bc)+f" are "abc" and "bcf". The fingerprints of the regular expression "a(bc)+f" are therefore "abc" and "bcf".
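Example 3 combines the two previous steps: split at the repeated group, then delete the parentheses in each branch. The sketch below assumes a single, non-nested group immediately followed by "+"; the function name `split_group_plus` is hypothetical.

```python
def split_group_plus(pattern):
    """Split a pattern such as "a(bc)+f" at a repeated group.

    Returns the two branches with their parentheses removed. Assumes one
    non-nested group directly followed by "+".
    """
    plus = pattern.index(")+") + 1            # position of "+"
    start = pattern.rindex("(", 0, plus)      # position of the group's "("
    group = pattern[start:plus]               # "(bc)"
    left = pattern[:plus]                     # "a(bc)"
    right = group + pattern[plus + 1:]        # "(bc)f"
    strip = lambda p: p.replace("(", "").replace(")", "")
    return strip(left), strip(right)


print(split_group_plus("a(bc)+f"))            # ('abc', 'bcf')
```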


Abstract

The invention discloses a regular expression matching method and device that aim to increase the matching speed of regular expressions. The method includes: determining the fingerprints of a regular expression; determining a representative fingerprint of the regular expression; determining a regular expression set according to the representative fingerprint of the regular expression, and determining a representative fingerprint of the regular expression set; and performing regular expression matching on the data to be matched according to the correspondence between the representative fingerprint of the regular expression set and a DFA (deterministic finite automaton) compiled from the regular expression set.
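The flow in the abstract can be sketched roughly as below. Several parts are assumptions made only for illustration: the rule for choosing a representative fingerprint is not given in this text, so `pick_representative` simply takes the first fingerprint; Python's `re` module (a backtracking engine) stands in for the compiled DFA; and the names `build_index` and `match` are hypothetical. The fingerprints are the ones listed in Examples 1 to 3.

```python
import re
from collections import defaultdict

# Fingerprints as listed in Examples 1-3 of this document.
FINGERPRINTS = {
    "a(bcd)ef": ["abc", "bcd", "cde", "def"],
    "abc+de": ["abc", "cde"],
    "a(bc)+f": ["abc", "bcf"],
}


def pick_representative(fps):
    # Hypothetical selection rule: the source does not state how the
    # representative fingerprint is chosen.
    return fps[0]


def build_index(fingerprints):
    """Group patterns by representative fingerprint, one matcher per set."""
    groups = defaultdict(list)
    for pattern, fps in fingerprints.items():
        groups[pick_representative(fps)].append(pattern)
    # Stand-in for "a DFA compiled from the regular expression set".
    return {
        fp: re.compile("|".join("(?:%s)" % p for p in patterns))
        for fp, patterns in groups.items()
    }


def match(data, index):
    """Run a set's matcher only if its representative fingerprint occurs."""
    return any(
        fp in data and matcher.search(data) is not None
        for fp, matcher in index.items()
    )


index = build_index(FINGERPRINTS)
print(match("xxabccdeyy", index))   # True
print(match("zabcz", index))        # False
```

The substring test is cheap, so data that contains none of the representative fingerprints never reaches the more expensive matcher.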

Description

Technical field

[0001] The invention relates to the field of data processing, and in particular to a regular expression matching method and device.

Background

[0002] Regular expressions can describe complex data features with simple syntax, so they are widely used in fields such as network intrusion detection and document content retrieval.

[0003] Determining whether the data to be matched contains the data features described by a regular expression is called regular expression matching. In current regular expression matching schemes, regular expressions that contain the same string are usually placed in one group, and that shared string is called the generalized string of the regular expression group. Each regular expression group is then compiled into a deterministic finite automaton (DFA), and a correspondence is established between the generalized string of each regular expression group and the...


Application Information

IPC(8): G06F17/30
CPC: G06F16/90344
Inventors: 王宇平, 王雨濛
Owner: 浙江杭海新城控股集团有限公司