Multiple character string matching method and chip

A multi-character string matching technology, applied in the field of information processing, that addresses the slow matching speed, impracticality, and heavy storage demands of existing approaches, reducing space requirements and solving the problem of space explosion.

Inactive Publication Date: 2007-10-10
BEIJING ZHEAN TECH CORP +1
0 Cites, 12 Cited by

AI Technical Summary

Problems solved by technology

Although the traditional algorithm can support multi-string matching for a large-scale rule base, it

Examples

Embodiment 1

[0078] This embodiment provides a multi-character string matching method based on the cache state machine principle. Its main feature is that it eliminates cross conversion rules; it can also completely eliminate restart conversion rules and failure conversion rules. Referring to Figure 5, and taking as an example a current state S_i, a current input K, and N = 1 cache states, the method includes the following steps:

[0079] Step 101: Define the cache rule of the cache state machine, that is, the cache state function, to suit the application.

[0080] The cache rule of this embodiment is described by the cache state function θ: after the current state receives the current character, if the initial state S_0, on receiving that same character, has a next state under the basic conversion rules, then that next state is stored in the cach...
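The cache rule in [0080] can be illustrated with a minimal sketch for N = 1. It assumes the basic conversion rules are held as a dictionary keyed by (state, character), and that the cache falls back to S_0 when the initial state has no basic rule for the character; the excerpt is truncated before it covers that case, so the fallback is a guess. The names `build_basic_rules` and `theta` are hypothetical, not taken from the patent.

```python
S0 = 0  # initial state


def build_basic_rules(patterns):
    """Return the basic conversion rules as {(state, char): next_state}."""
    rules = {}
    next_id = 1
    for pattern in patterns:
        state = S0
        for ch in pattern:
            if (state, ch) not in rules:
                rules[(state, ch)] = next_id
                next_id += 1
            state = rules[(state, ch)]
    return rules


def theta(rules, ch):
    """Cache rule: keep S0's basic next state on ch, if one exists.

    Returning S0 when no such rule exists is an assumption; the source
    text breaks off before describing that branch.
    """
    return rules.get((S0, ch), S0)


rules = build_basic_rules(["betters", "pattern"])
cache_trace = [theta(rules, ch) for ch in "bpx"]
print(cache_trace)
```

With states numbered in insertion order, 'b' and 'p' each cache the depth-one state they lead to from S_0, while 'x' leaves the cache at the assumed default.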

Embodiment 2

[0098] This embodiment provides a multi-character string matching method based on the cache state machine principle, whose main feature is isomorphic path merging.

[0099] First, the existing basic conversion rules are merged along isomorphic paths. Take P = {betters, pattern} as an example. Figure 8 shows the DFA constructed by the AC algorithm (restart conversion rules and failure conversion rules are not drawn); it requires 14 basic conversion rules and 15 states in total. States S_2-S_5 and states S_9-S_12 share the same property: both sequences receive the string "tter". Such a pair is called an isomorphic path. The cache state machine is used to merge isomorphic paths.
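The counts in [0099] can be checked with a short sketch that builds only the basic conversion rules for P = {betters, pattern}. It assumes states are numbered in insertion order, which happens to agree with the state names used in the text; the helper names `build_trie` and `received_string` are hypothetical.

```python
def build_trie(patterns):
    """Build basic conversion rules {(state, char): next_state}; state 0 is S0."""
    rules = {}
    next_id = 1
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if (state, ch) not in rules:
                rules[(state, ch)] = next_id
                next_id += 1
            state = rules[(state, ch)]
    return rules, next_id  # next_id equals the total number of states


def received_string(rules, first, last):
    """Characters received, in order, by the states first..last.

    Assumes every state in the range has exactly one outgoing basic rule,
    which holds for the two paths examined here.
    """
    out = []
    for state in range(first, last + 1):
        chars = [ch for (s, ch) in rules if s == state]
        out.append(chars[0])
    return "".join(out)


rules, num_states = build_trie(["betters", "pattern"])
path_a = received_string(rules, 2, 5)
path_b = received_string(rules, 9, 12)
print(num_states, len(rules), path_a, path_b)
```

Both paths consume the same four characters, which is what makes them candidates for merging; the sketch only detects the isomorphism and does not perform the merge.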

[0100] FIG. 9 is a schematic diagram of the cache state machine obtained by merging FIG. 8. State S_1 and state S_8, whose next states are merged into a single state, are called convergent states, t...

Embodiment 3

[0113] Referring to FIG. 10, this embodiment provides a multi-character string matching chip, which includes: an interface module, a state register, a cache state register, a conversion rule module and a control module.

[0114] Wherein, the interface module is used for receiving input characters;

[0115] The state register is used to store the current state;

[0116] The cache state register is used to store the cache state, and there are N cache states, and N can be 1 or other values;

[0117] The conversion rule module is used for storing the state conversion rule base, and searches for the next state according to the character received by the interface module, the current state stored in the state register and the cache state stored in the cache state register;

[0118] The control module is used to control the interface module to receive input characters normally, control the state register to update the current state, control the cache state register to update th...

Abstract

A multi-character string matching method: after the current character is received, the next state is looked up in the conversion rule base according to the current state, the cache state, and the current character, where the cache state is a state stored according to a preset cache rule. If the lookup succeeds, the machine jumps to the next state, which becomes the current state, and the steps are repeated until there are no more input characters. A chip for implementing the method is also disclosed.
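The loop described in the abstract might look like the following sketch for N = 1. Several parts are assumptions rather than the patent's stated design: the lookup order (current state first, then cache state), the fallback to the newly cached state when both lookups fail, and the cache rule itself. The sketch stores only basic conversion rules, so it recovers only fallbacks of depth at most two and reports only the pattern that ends at the reached state; it is not a full replacement for an Aho-Corasick automaton. The function names are hypothetical.

```python
S0 = 0  # initial state


def build_basic_rules(patterns):
    """Return ({(state, char): next_state}, {accepting_state: pattern})."""
    rules, accepting = {}, {}
    next_id = 1
    for pattern in patterns:
        state = S0
        for ch in pattern:
            if (state, ch) not in rules:
                rules[(state, ch)] = next_id
                next_id += 1
            state = rules[(state, ch)]
        accepting[state] = pattern
    return rules, accepting


def match(patterns, text):
    """Scan text and return (end_position, pattern) pairs."""
    rules, accepting = build_basic_rules(patterns)
    current, cache = S0, S0
    hits = []
    for pos, ch in enumerate(text):
        new_cache = rules.get((S0, ch), S0)  # assumed cache rule, N = 1
        nxt = rules.get((current, ch))       # look up from the current state
        if nxt is None:
            nxt = rules.get((cache, ch))     # then from the cache state
        if nxt is None:
            nxt = new_cache                  # assumed fallback when both fail
        current, cache = nxt, new_cache
        if current in accepting:
            hits.append((pos, accepting[current]))
    return hits


hits = match(["ab", "bc"], "xabcab")
print(hits)
```

On the input "xabcab", the step from the state for "ab" on 'c' has no basic rule; the cache state (reached from S_0 on 'b') supplies the transition to the state for "bc", which is the kind of cross conversion rule the method aims to avoid storing.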

Description

Technical field

[0001] The invention relates to the field of information processing, in particular to a multi-character string matching method and a chip.

Background

[0002] Multi-string matching technology, also called multi-keyword matching technology, is relatively mature and widely used in fields such as text processing and content filtering. It finds one or more of a predefined set of strings in the one-dimensional content to be matched. Content matching is performed on an intermediate data structure produced by preprocessing, which is how a set of predefined character strings is matched.

[0003] The performance of a multi-string matching algorithm is mainly affected by the following factors: the size of the string set (also called the rule set, feature set, or keyword set), the minimum string length in the set, the likelihood of matches in the text to be matched, etc. According to the different preprocessing methods of string sets by mult...

Application Information

IPC(8): G06F17/30
CPC: G06F17/30985, G06F16/90344
Inventor 嵩天
Owner BEIJING ZHEAN TECH CORP