Multi-character-string mode matching method, device, computer equipment and storage medium

A multi-string pattern matching technology, applied in database retrieval, database queries, and special-purpose data processing. It addresses the high cost of hardware alternatives, throughput that rarely exceeds 40 Gbps, and limited PCI-E bandwidth. It overcomes the computation-model and storage-resource limitations of programmable switches, saves storage space, and increases the number of characters processed per match.

Pending Publication Date: 2021-05-11
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

Hardware replacement products often have high cost, and their throughput rarely exceeds 40 Gbps. In addition, hardware replacements are usually connected to the server via PCI-E, whose limited bandwidth makes it difficult to exploit their full potential.

[0007] In short, neither algorithm-optimized solutions nor hardware-accelerated solutions provide the throughput or capital cost needed to keep up with the dramatic increase in network traffic and network bandwidth.

Examples

Specific embodiment

[0125] The following is a specific embodiment of a multi-string pattern matching method. The method comprises:

[0126] Obtaining the rule set for string matching;

[0127] Extracting a set of character pattern strings, and the logical relationships between those pattern strings, from the rule set;

[0128] Constructing a goto transfer table and a failure transfer table from the set of character pattern strings, based on the AC algorithm;

[0129] Building a failure transfer tree from the failure transfer table;

[0130] Encoding the states of the failure transfer table based on the shadow encoding algorithm, assigning each state a ternary code for the match field and an exact code for the action field, to obtain the encoded failure transfer table;

[0131] Converting the transfer edges of the goto transfer table into entries, and assigning a priority to each entry to obtain the converted goto transfer tab...
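
Steps [0128] and [0129] can be sketched as follows. This is a minimal textbook Aho-Corasick construction in Python, not the patent's own implementation. The function names `build_ac_tables` and `build_failure_tree`, and the list-of-dicts table layout, are assumptions made for illustration.

```python
from collections import deque


def build_ac_tables(patterns):
    """Build the goto table, failure table, and output sets (hypothetical layout)."""
    goto = [{}]        # goto[state][char] -> next state; state 0 is the root
    output = [set()]   # output[state] -> patterns recognized on reaching state
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                output.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].add(pattern)

    # Breadth-first pass: a state's failure target is always shallower,
    # so it has been finalized before the state itself is processed.
    fail = [0] * len(goto)
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            output[nxt] |= output[fail[nxt]]
            queue.append(nxt)
    return goto, fail, output


def build_failure_tree(fail):
    """Invert the failure table: children[s] lists the states that fail to s."""
    children = [[] for _ in fail]
    for state in range(1, len(fail)):
        children[fail[state]].append(state)
    return children


goto, fail, output = build_ac_tables(["he", "she", "his", "hers"])
children = build_failure_tree(fail)
```

With these four patterns the trie has ten states, and the state reached by "she" fails to the state reached by "he", so both patterns are reported there.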

Abstract

The invention provides a multi-string pattern matching method, a device, computer equipment and a storage medium. The method comprises: obtaining a string matching rule set; extracting a set of character pattern strings and the logical relationships between them; constructing a goto (unconditional transfer) table and a failure transfer table from the pattern string set, based on the Aho-Corasick (AC) automaton algorithm; encoding the states of the transfer table, assigning a ternary code and an exact code to each state; constructing a nondeterministic finite automaton (NFA) matching table from the transfer table; constructing a policy matching table from the pattern string set and the logical relationships; and matching the input string against the matching tables and outputting the matching result. The method realizes the complete semantics of the NFA in the AC algorithm while keeping the number of table entries equal to the number of state-transition entries in the goto table. This greatly reduces storage space, overcomes the limitations of the programmable switch computation model and storage resources, and increases the number of characters processed per match, which raises throughput.
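
The idea of matching with one table entry per goto edge can be illustrated with a small sketch. The excerpt does not give the details of the shadow encoding algorithm, so the code below substitutes a simple prefix code over the failure tree as a stand-in: each state gets a fixed-width exact code, and its ternary code (with `*` wildcards) matches itself and every state below it in the failure tree. All function names, the bit layout, and the use of trie depth as entry priority are assumptions, not the patent's method.

```python
from collections import deque


def build_ac(patterns):
    """Textbook Aho-Corasick tables plus the trie depth of each state."""
    goto, depth, out = [{}], [0], [set()]
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                depth.append(depth[s] + 1)
                out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(p)
    fail = [0] * len(goto)
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, nxt in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
            queue.append(nxt)
    return goto, fail, depth, out


def encode_failure_tree(fail):
    """Assumed stand-in encoding: path-prefix codes over the failure tree."""
    children = [[] for _ in fail]
    for s in range(1, len(fail)):
        children[fail[s]].append(s)
    levels, frontier = [], [0]
    while frontier:
        levels.append(frontier)
        frontier = [c for s in frontier for c in children[s]]
    # One bit field per tree level; child index 0 is reserved for "path ends here".
    widths = [max(len(children[s]) for s in lvl).bit_length() for lvl in levels[:-1]]
    total = sum(widths)
    prefix = {0: ""}
    for lvl, width in zip(levels, widths):
        for s in lvl:
            for i, c in enumerate(children[s], start=1):
                prefix[c] = prefix[s] + format(i, f"0{width}b")
    exact = {s: p.ljust(total, "0") for s, p in prefix.items()}
    ternary = {s: p.ljust(total, "*") for s, p in prefix.items()}
    return exact, ternary


def build_match_table(patterns):
    """One (priority, ternary match, char, next exact code) entry per goto edge."""
    goto, fail, depth, out = build_ac(patterns)
    exact, ternary = encode_failure_tree(fail)
    entries = [(depth[s], ternary[s], ch, exact[t])
               for s in range(len(goto)) for ch, t in goto[s].items()]
    entries.sort(key=lambda e: -e[0])  # deeper source state = higher priority
    outputs = {exact[s]: out[s] for s in range(len(goto)) if out[s]}
    return entries, exact[0], outputs


def ternary_match(code, mask):
    return all(m == "*" or m == c for c, m in zip(code, mask))


def scan(text, entries, root, outputs):
    """Return (start position, pattern) hits; a table miss falls back to the root."""
    state, hits = root, []
    for pos, ch in enumerate(text):
        state = next((nxt for _, mask, c, nxt in entries
                      if c == ch and ternary_match(state, mask)), root)
        hits += [(pos - len(p) + 1, p) for p in sorted(outputs.get(state, ()))]
    return hits


entries, root, outputs = build_match_table(["he", "she", "his", "hers"])
hits = scan("ushers", entries, root, outputs)
```

Because an entry for state s also matches every state whose failure chain passes through s, the highest-priority matching entry reproduces the AC failure-then-goto step in a single lookup, and the table holds exactly as many entries as the goto table has edges (nine for these patterns).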

Description

Technical field

[0001] The invention relates to the technical field of computer matching algorithms, in particular to a multi-string pattern matching method, device, computer equipment and storage medium.

Background technique

[0002] String matching is widely used in network security applications. It can be formalized as follows: given an alphabet $\Sigma$, an input string $T = t_1 t_2 \dots t_n$ and a set of pattern strings $P = \{P_i\}$, where $P_i = p_1 p_2 \dots p_m$ and every $t_i, p_i \in \Sigma$, the multi-string pattern matching algorithm should output, for each pattern string $P_i$, the set of all positions in $T$ at which $P_i$ occurs as a substring.

[0003] The Aho-Corasick (automaton) algorithm, referred to as the AC algorithm, is an effective method for multi-string pattern matching. It constructs a nondeterministic finite automaton (NFA) by building a goto (unconditional transfer) state transition table from a dictionary tree (trie) composed o...
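
The formal definition in [0002] can be made concrete with a brute-force reference in Python. It is only a correctness baseline under the stated definition, far slower than the AC algorithm, and the function name `find_all_occurrences` is hypothetical.

```python
def find_all_occurrences(text, patterns):
    """Map each pattern to the 0-based start positions where it occurs in text."""
    return {
        p: [i for i in range(len(text) - len(p) + 1) if text.startswith(p, i)]
        for p in patterns
    }


positions = find_all_occurrences("ushers", ["he", "she", "his", "hers"])
```

A faster matcher can be validated by comparing its output against this function on the same text and pattern set.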

Application Information

IPC(8): G06F16/903
CPC: G06F16/90344
Inventor: 刘莹, 王士诚, 张梦豪, 李冠宇, 刘畅, 徐明伟
Owner: TSINGHUA UNIV