
A character string matching method, system, storage medium and device

A string matching technology, applied in database retrieval, special-purpose data processing, database querying and related fields. It addresses the growing total number of DFA states, the need for small string matching storage space, and low matching throughput, with the effect of improving matching throughput and reducing storage space overhead.

Active Publication Date: 2022-01-21
INST OF COMPUTING TECH CHINESE ACAD OF SCI
7 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

As the number of string rules grows, the total number of DFA states (M) increases explosively, so DFA storage overhead rises rapidly and exceeds the fast memory capacity available in network devices.
The existing string matching compression algorithms D2FA and ΔFA reduce DFA space overhead, but each has drawbacks. D2FA takes O(M² log M) time to compute its default transition edges, which makes it hard to apply to the DFA of a large-scale string set. In D2FA, reading one character in a state may also require following multiple default transition edges, which lowers matching throughput. ΔFA must update the next state's transition table on every state transition, which also lowers string matching throughput.
Therefore, existing string matching algorithms cannot simultaneously meet the requirements of small storage space, short construction time and high matching throughput.
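
The throughput cost of default edges can be illustrated with a minimal Python sketch. It is not the D2FA construction itself: the toy automaton, the deliberately long default chain, and the names `labeled`, `default` and `d2fa_step` are assumptions made for illustration only. Each state keeps only the edges that differ from its default state, so a lookup may have to follow several default edges before it finds a stored edge for the input character.

```python
# Hypothetical toy automaton over the alphabet {"a", "b"}.
# labeled[s] holds only the edges stored explicitly in state s.
# default[s] is the state consulted when s has no edge for the character.
labeled = [
    {"a": 1, "b": 0},  # state 0 (root) stores every edge
    {"b": 2},          # state 1 inherits "a" from state 0
    {"b": 0},          # state 2 inherits "a" via state 1, then state 0
]
default = [None, 0, 1]


def d2fa_step(labeled, default, state, ch):
    """Return (next_state, hops), where hops counts default edges followed."""
    hops = 0
    # Walk the default chain until some state stores an edge for ch.
    # The root stores all edges, so the loop always terminates.
    while ch not in labeled[state]:
        state = default[state]
        hops += 1
    return labeled[state][ch], hops


next_state, hops = d2fa_step(labeled, default, 2, "a")
print(next_state, hops)
```

In this toy case, reading "a" in state 2 follows two default edges before an edge is found, while reading "b" in state 2 needs none. The number of memory accesses per input character therefore depends on the length of the default chain.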



Examples


Embodiment Construction

[0090] To address the large space overhead, long construction time and low matching throughput of existing string matching algorithms, the present invention proposes RDFA, a ranking-based string matching algorithm. RDFA not only significantly compresses DFA storage space, but also overcomes the long construction time and low matching throughput of existing string matching compression algorithms. The RDFA algorithm is applied to network systems based on string matching, such as intrusion detection systems, application traffic identification systems, and Web application firewalls. Figure 1 shows the workflow of a network system based on string matching: first, the network system receives data packets through I/O devices; second, after network protocol analysis, string matching is performed on the packet contents.

[0091] The core idea of the RDFA algorithm is: first, construct a global trans...


Abstract

The present invention proposes a character string matching method. First, a global transition table is constructed to store the transition edges that share the same input character and the same destination state. Second, a local transition table is constructed for each state to store the transition edges whose destination states differ from those in the global transition table, and a bitmap is used to further compress the local transition table. The construction time complexity of the global transition table is O(M×N), where M is the total number of DFA states and N is the number of unique characters in the alphabet, so RDFA takes less time to construct than existing algorithms. At the same time, the global transition table eliminates a large number of redundant transition edges, so RDFA significantly compresses DFA storage space. For each character read, RDFA only needs to look up the current state's local table and the global table, thereby improving string matching throughput.
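
The two-table layout described in the abstract can be sketched in Python as follows. This is a sketch under stated assumptions, not the patented implementation. Choosing the most frequent destination state per character as the global entry is an assumed reading of "ranking-based". The names `build_rdfa` and `step`, and the three-state toy DFA, are hypothetical. The global table is built in a single pass over the M×N DFA entries. Each state then keeps a bitmap that marks the characters whose destination differs from the global entry, together with a packed list of those destinations.

```python
from collections import Counter


def build_rdfa(delta):
    """Split a full DFA table delta[state][char] into global and local parts."""
    n_chars = len(delta[0])
    # Global table: for each character, the most frequent destination state
    # (assumed ranking rule; ties go to the destination seen first).
    global_table = [
        Counter(row[c] for row in delta).most_common(1)[0][0]
        for c in range(n_chars)
    ]
    bitmaps, local_tables = [], []
    for row in delta:
        bitmap, local = 0, []
        for c, dst in enumerate(row):
            if dst != global_table[c]:
                bitmap |= 1 << c   # mark character c as a local exception
                local.append(dst)  # packed in increasing character order
        bitmaps.append(bitmap)
        local_tables.append(local)
    return global_table, bitmaps, local_tables


def step(rdfa, state, c):
    """Return the next state for character index c."""
    global_table, bitmaps, local_tables = rdfa
    bitmap = bitmaps[state]
    if (bitmap >> c) & 1:
        # Position in the packed list = number of set bits below bit c.
        idx = bin(bitmap & ((1 << c) - 1)).count("1")
        return local_tables[state][idx]
    return global_table[c]


# Toy DFA for the single pattern "ab" over the alphabet {a: 0, b: 1}.
delta = [[1, 0], [1, 2], [1, 0]]
rdfa = build_rdfa(delta)
```

For this toy DFA the global table holds two entries and only state 1 stores a local exception, so three edges are stored instead of six. Each input character costs one bitmap test plus one read from either the local or the global table.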

Description

Technical field

[0001] The invention relates to the fields of string matching algorithms, deep packet inspection, application traffic identification and network security, and in particular to a ranking-based character string matching method, system, storage medium and device.

Background

[0002] String matching algorithms are widely used in network devices based on deep packet inspection, such as intrusion detection systems and traffic monitoring. The algorithm finds all matching signature string rules by matching packet contents against a signature string rule set. As the number of signature string rules grows, string matching has become the performance bottleneck of network devices, making it difficult to meet the performance and scalability requirements of deep packet inspection.

[0003] The Aho-Corasick algorithm (AC algorithm) is currently the most widely used multi-string matching algorithm. It uses a deter...


Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F16/903
CPC: G06F16/90344
Inventor: 黄昆, 陈雪琳, 谢高岗
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI