Method for large-scale feature matching of text content or network content analyses

A feature matching technology for text and network content, applied in special data processing applications, instruments, and electrical digital data processing. It addresses two problems: existing algorithms cannot be applied directly and effectively to text or network content, and they do not account for the uneven occurrence probability of characters. The stated benefits are easy management, practical applicability, a stable search rate and matching time, and high operating efficiency.

Active Publication Date: 2013-11-27
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

Existing multi-pattern matching algorithms do not take into account the serious problems caused by the uneven occurrence probability of characters in the pattern set, so they cannot be applied directly and effectively to text or network content with large-scale pattern sets.

Embodiment Construction

[0030] The specific implementation manners of the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. The following examples are used to illustrate the present invention, but are not intended to limit the scope of the present invention.

[0031] As shown in Figure 1, the method for large-scale feature matching according to an embodiment of the present invention includes the following steps:

[0032] S1. Read in all feature strings and create a double hash table.

[0033] S2. Establish a finite state machine in the hash table.

[0034] S3. Convert the finite state machine in the hash table into a double array structure for storage.

[0035] S4. Text matching search.

[0036] Steps S2 and S3 are executed cyclically, alternating with each other.
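The overall flow of steps S1, S2 and S4 resembles the standard pipeline for automaton-based multi-pattern matching. The following is a minimal sketch in Python, assuming a plain Aho-Corasick automaton whose transitions are held in dictionaries (hash tables). It is not the patented method: the double hash table layout and the alternating S2/S3 loop are not detailed in the text above, and the names `build_automaton` and `search` are hypothetical. The feature strings are the ones listed in step S1.1.

```python
from collections import deque

def build_automaton(patterns):
    """Build a finite state machine over hash tables (assumed: Aho-Corasick)."""
    goto = [{}]   # goto[s] maps a character to the next state
    fail = [0]    # fail[s] is the fallback state after a mismatch
    out = [[]]    # out[s] lists the patterns that end at state s
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    # Breadth-first pass: depth-1 states keep fail = 0, deeper states
    # inherit a fallback from their parent's fallback chain.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out

def search(text, automaton):
    """Return (start_index, pattern) for every occurrence in text."""
    goto, fail, out = automaton
    s = 0
    hits = []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits

features = ["lightweight", "facebook", "globalcom", "microsoft",
            "sunshine", "moonlight", "starlight"]
automaton = build_automaton(features)
hits = search("moonlight and sunshine on facebook", automaton)
```

Because the text is scanned once and every character causes at most a bounded amount of fallback work, matching time in this sketch depends on the text length rather than on the number of feature strings.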

[0037] Step S1 further includes:

[0038] S1.1 Read all feature strings {lightweight,facebook,globalcom,microsoft,sunshine,moonlight,starlight} in sequence, an...

Abstract

The invention provides a method for large-scale feature matching in text content or network content analysis. The method comprises four steps: first, reading in all feature strings and building double hash tables; second, building a finite state machine in each hash table; third, converting the finite state machine in each hash table into a double-array structure for storage; and fourth, performing the matching search on the text content or network content. The method effectively improves matching speed for text content or network content analysis and reduces memory consumption.
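To illustrate how a finite state machine built in hash tables can be converted into a double array structure (step S3), here is a hedged sketch in Python. It assumes the classic base/check layout of a double-array trie, where the child of state `s` on character code `c` sits at slot `base[s] + c` and is valid only if `check` at that slot equals `s`. Whether the patent uses exactly this layout is an assumption, the sketch covers only forward transitions (no failure links), and all function names are hypothetical.

```python
def build_trie(patterns):
    """Trie in hash tables: trie[s] maps a character to a child state."""
    trie, terminal = [{}], [False]
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in trie[s]:
                trie.append({})
                terminal.append(False)
                trie[s][ch] = len(trie) - 1
            s = trie[s][ch]
        terminal[s] = True
    return trie, terminal

def to_double_array(trie, terminal):
    """Pack the trie into base/check arrays (assumed layout, see lead-in)."""
    alphabet = sorted({ch for node in trie for ch in node})
    code = {ch: i + 1 for i, ch in enumerate(alphabet)}  # codes start at 1
    base, check, is_end = [0], [0], [terminal[0]]        # slot 0 is the root

    def ensure(n):
        # Grow all three arrays so that index n exists; -1 marks a free slot.
        while len(check) <= n:
            base.append(0)
            check.append(-1)
            is_end.append(False)

    slot_of = {0: 0}   # trie node id -> slot in the double array
    stack = [0]
    while stack:
        node = stack.pop()
        s = slot_of[node]
        codes = [code[ch] for ch in trie[node]]
        if not codes:
            continue
        b = 1
        while True:    # first base value whose child slots are all free
            ensure(b + max(codes))
            if all(check[b + c] == -1 for c in codes):
                break
            b += 1
        base[s] = b
        for ch, child in trie[node].items():
            t = b + code[ch]
            check[t] = s
            is_end[t] = terminal[child]
            slot_of[child] = t
            stack.append(child)
    return base, check, is_end, code

def contains(word, double_array):
    """Exact lookup using only array indexing, no hash probing per state."""
    base, check, is_end, code = double_array
    s = 0
    for ch in word:
        c = code.get(ch)
        if c is None:
            return False
        t = base[s] + c
        if t >= len(check) or check[t] != s:
            return False
        s = t
    return is_end[s]

features = ["lightweight", "facebook", "globalcom", "microsoft",
            "sunshine", "moonlight", "starlight"]
double_array = to_double_array(*build_trie(features))
```

The design trade-off is the usual one: the hash-table form is convenient to build and modify, while the packed arrays replace one dictionary per state with two integer arrays, which is where the memory saving in such layouts typically comes from. The linear scan for a free base value is kept simple here and would need a smarter free-slot search for a large pattern set.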

Description

Technical field

[0001] The invention belongs to the technical field of computer data processing, and in particular relates to a large-scale feature matching method for text or network content analysis.

Background

[0002] Multi-pattern matching is one of the fundamental problems in computer science. The task is to determine, quickly and accurately, the positions of all occurrences of any pattern string in the text or network content under test. Multi-pattern matching technology is applied very widely. Beyond network security, where it is used in network intrusion detection/prevention systems (IDS/IPS), virus scanning systems, spam filtering systems, application-layer network protocol analysis systems, network auditing systems, and the recently proposed Unified Threat Management (UTM) systems, it has also been extended to other disciplines and fields, such as information management...

Application Information

IPC(8): G06F17/30
Inventors: 薛一波, 袁振龙
Owner: TSINGHUA UNIV