Massive feature string sets matching method and apparatus

A matching method for feature string sets, applied in the Internet field, that addresses problems such as the SOG algorithm having no jumping ability and the BG algorithm losing its jumping ability on large feature string sets.

Active Publication Date: 2017-04-26
NEUSOFT CORP

AI Technical Summary

Problems solved by technology

[0021] 2) The performance of the SOG algorithm is stable: it does not rescan the text repeatedly as the number of feature strings in the feature string set grows. However, because it has no jumping ability, its matching speed does not improve when the feature string set shrinks.
[0022] 3) Because of shortcomings of the underlying BNDM algorithm, the BG algorithm loses its jumping ability once the number of feature strings in the feature string set reaches a certain scale.
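
The "jumping ability" mentioned above is the ability to advance past text positions without reading them. As a minimal single-pattern sketch, assuming SOG belongs to the Shift-Or/Shift-And bit-parallel family (this excerpt does not expand the acronym), the Shift-And matcher below reads every text character exactly once, whatever the pattern. The function name `shift_and_search` and the `chars_read` counter are illustrative, not taken from the patent.

```python
def shift_and_search(text, pattern):
    """Bit-parallel Shift-And search for one non-empty pattern.

    Returns (start positions of all occurrences, characters read).
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")

    # masks[ch] has bit i set when pattern[i] == ch.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    accept = 1 << (m - 1)  # bit that signals a full match
    state = 0
    hits = []
    chars_read = 0
    for pos, ch in enumerate(text):
        chars_read += 1  # no skipping: every character is consumed
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(pos - m + 1)
    return hits, chars_read
```

The counter always equals `len(text)`, which is why such a scheme is stable but cannot speed up when the pattern set becomes small.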



Examples


Embodiment Construction

[0084] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary only for explaining the present invention and should not be construed as limiting the present invention. On the contrary, the embodiments of the present invention include all changes, modifications and equivalents coming within the spirit and scope of the appended claims.

[0085] Figure 1 is a schematic flowchart of a matching method for a massive feature string set according to an embodiment of the present invention.

[0086] Referring to Figure 1, the matching method for a massive feature string set includes a preprocessing stage and a scanning stage.

[0087] The workflow of the preprocessing stage includes:

[0088] S11: Receive...


Abstract

The present invention provides a matching method and apparatus for massive feature string sets. The method comprises: determining the length of a matching window according to the number of characters in a feature string of the feature string set, and setting an initial offset value for the current data to be matched; positioning within the current data to be matched, starting from the initial offset value, and acquiring as a first character block the character block that lies within the matching window, ends at the tail of the window, and has a preset first length; when the first character block is a sub-feature string of the feature string set, acquiring as a second character block the character block that lies within the matching window, starts at the head of the window, and has a preset second length; calculating a conversion value of the second character block and reading the bit vector corresponding to that conversion value as the current matching vector; and performing feature string matching on the current data to be matched according to the current matching vector. The method makes full use of the space of the bit vector mask table, lowers the filter pass rate, and speeds up matching.
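
The steps in the abstract can be illustrated with a simplified sketch. It is not the patented procedure, whose details are truncated in this listing. The following are all assumptions made for illustration: the window length is the shortest feature string length, the block lengths are `q1` and `q2`, the polynomial hash in `convert` stands in for the "conversion value", the skip distance is `m - q1 + 1`, and the last step is a direct verification of the candidates selected by the bit vector.

```python
def convert(block, table_size):
    """Hypothetical conversion value: a small polynomial hash."""
    value = 0
    for ch in block:
        value = (value * 131 + ord(ch)) % table_size
    return value


def preprocess(patterns, q1=2, q2=2, table_bits=12):
    """Build the tail-block set and the bit vector mask table."""
    m = min(len(p) for p in patterns)  # assumed window length
    if m < max(q1, q2):
        raise ValueError("shortest pattern is shorter than a block")
    # Every q1-gram inside the first m characters of any pattern.
    tail_blocks = {p[i:i + q1] for p in patterns for i in range(m - q1 + 1)}
    table_size = 1 << table_bits
    mask_table = [0] * table_size
    for idx, p in enumerate(patterns):
        # Bit idx marks pattern idx under the hash of its first q2 chars.
        mask_table[convert(p[:q2], table_size)] |= 1 << idx
    return m, tail_blocks, mask_table


def scan(text, patterns, q1=2, q2=2, table_bits=12, offset=0):
    """Return (position, pattern) for every occurrence at or after offset."""
    m, tail_blocks, mask_table = preprocess(patterns, q1, q2, table_bits)
    table_size = len(mask_table)
    hits = []
    pos = offset
    while pos + m <= len(text):
        first = text[pos + m - q1:pos + m]  # block ending at window tail
        if first not in tail_blocks:
            # No pattern can start at pos .. pos + m - q1, so skip them.
            pos += m - q1 + 1
            continue
        second = text[pos:pos + q2]  # block starting at window head
        vector = mask_table[convert(second, table_size)]
        idx = 0
        while vector:  # verify only patterns whose bit is set
            if vector & 1 and text.startswith(patterns[idx], pos):
                hits.append((pos, patterns[idx]))
            vector >>= 1
            idx += 1
        pos += 1
    return hits
```

For example, `scan("xxhello world, hellish helper", ["hello", "help", "world"])` reports the three occurrences while skipping three positions at a time whenever the tail block is not part of any feature string prefix.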

Description

Technical field

[0001] The present invention relates to the technical field of the Internet, and in particular to a matching method and device for a massive feature string set.

Background art

[0002] Pattern matching is an important research direction in computer science; it is used to find feature strings in a target string. With the rapid development of Internet technology, pattern matching is widely used in network security, information retrieval, and biomedicine.

[0003] Pattern matching means finding, in a text T = t_1 t_2 ... t_n, the occurrences of a given feature string set P = {p_1, p_2, ..., p_r}, where T and each p_i (1 ≤ i ≤ r) are sequences of characters over a finite alphabet Σ. As networks and biology develop, more feature string entries must be matched while a high processing speed is maintained, which places higher demands on the processing capacity of multi-pattern matching. However, in many existing ...
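
The problem statement in [0003] can be made concrete with a naive reference implementation. It is only a baseline for checking faster matchers, not a method from the patent; the function name `find_feature_strings` is illustrative.

```python
def find_feature_strings(text, patterns):
    """Return sorted (position, pattern) pairs for every occurrence
    of every non-empty pattern in text, including overlapping ones."""
    hits = []
    for p in patterns:
        if not p:
            continue  # empty patterns are ignored in this sketch
        start = text.find(p)
        while start != -1:
            hits.append((start, p))
            # Resume one character later so overlaps are reported too.
            start = text.find(p, start + 1)
    return sorted(hits)
```

Its running time grows with the number of patterns r, because the text is rescanned once per pattern; multi-pattern algorithms aim to avoid exactly that.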


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/90344
Inventor: 尹延伟
Owner: NEUSOFT CORP