An on-line string matching method without gap constraint

String matching without gap constraints, applied in the field of online string matching. It addresses three problems of existing methods: incomplete character matching, space and time overhead that is difficult to control effectively, and poor handling of frequent patterns. The effect is high efficiency, reduced space and time overhead, and a complete solution.

Active Publication Date: 2019-01-25
HEBEI UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0013] Two conditions are very important in pattern matching: the solution must be complete, and it should be found quickly with low space overhead. Existing methods have difficulty satisfying both conditions at the same time. For example, the online method proposed by Wu Xindong and Zhu Xinquan in "An efficient on-line algorithm for approximate pattern matching with wildcards and length constraints" (IEEE) departs from traditional offline techniques and saves a lot of time by reading characters online, but because it targets the one-off condition it cannot handle frequent patterns well. Huang Guolin, Guo Dan, and Hu Xuegang, in "Approximate pattern matching method based on wildcards and length constraints" (Computer Application), proposed an approximate pattern matching method based on EDM that handles the three edit operations of approximate matching (insertion, replacement, and deletion), but characters are matched incompletely while solving the approximate match, so the result can only be a loose match. Wu Youxi and Shen Cong published "Strict pattern matching under non-overlapping condition" (Science China Information Sciences). A patt

Examples

Example Embodiment

[0086] Example 1

[0087] Example of biological sequence matching: a DNA sequence is a biological sequence composed of the four bases a, c, g, and t. The given biological sequence string is S = s₁s₂s₃s₄s₅s₆s₇s₈s₉s₁₀ = acgatgacgg, and the given pattern string is P = p₁p₂p₃p₄ = aagg.
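To make the setting concrete, the following minimal Python sketch checks what counts as an occurrence of P in S without gap constraints, and when two occurrences are non-overlapping. The function names and the two position lists are assumptions chosen by hand for illustration; they are not taken from the patent text.

```python
from typing import Sequence


def is_occurrence(s: str, p: str, pos: Sequence[int]) -> bool:
    """True if the 1-based positions strictly increase and spell out p in s."""
    if len(pos) != len(p):
        return False
    increasing = all(a < b for a, b in zip(pos, pos[1:]))
    in_range = all(1 <= i <= len(s) for i in pos)
    return increasing and in_range and all(s[i - 1] == c for i, c in zip(pos, p))


def non_overlapping(occ_a: Sequence[int], occ_b: Sequence[int]) -> bool:
    """True if no position of s is used for the same p_j in both occurrences."""
    return all(a != b for a, b in zip(occ_a, occ_b))


S, P = "acgatgacgg", "aagg"
first, second = (1, 4, 6, 9), (4, 7, 9, 10)  # hand-picked, illustrative only

print(is_occurrence(S, P, first), is_occurrence(S, P, second))  # True True
print(non_overlapping(first, second))  # True
```

Position 4 appears in both occurrences, but as p₂ in the first and as p₁ in the second, which the non-overlapping condition allows.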

[0088] The first step is to read in the pattern string P and create multiple queues:

[0089] Read in the pattern string P = p₁p₂p₃p₄ = aagg and determine that the length of the pattern string P is 4. The characters of the pattern string P are p₁, p₂, p₃, and p₄, and 4 queues are established for the pattern string P, numbered queue 1, queue 2, queue 3, and queue 4: p₁ = a is queue 1, p₂ = a is queue 2, p₃ = g is queue 3, and p₄ = g is queue 4;
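A minimal Python sketch of this first step, assuming each queue can be represented as a `collections.deque` paired with its pattern character. The helper name `make_queues` is hypothetical, and the contents of a node are left open here because the source does not show them at this point.

```python
from __future__ import annotations

from collections import deque


def make_queues(pattern: str) -> list[tuple[str, deque]]:
    """Pair each pattern character p_j with its own empty queue (queue j)."""
    return [(ch, deque()) for ch in pattern]


queues = make_queues("aagg")
for number, (ch, q) in enumerate(queues, start=1):
    print(f"queue {number}: p_{number} = {ch}, nodes = {list(q)}")
```

Queues 1 and 2 both belong to the character a, and queues 3 and 4 both belong to g, so a repeated pattern character still gets one queue per position.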

[0090] The second step is to read the given sequence string S in sequence:

[0091] Sequentially read in the given biological sequence string S = s₁s₂s₃s₄s₅s₆s₇s₈s₉s₁₀ = E...

Example Embodiment

[0143] Example 2

[0144] Example of shopping psychology matching: to discover relationships among a user's multiple purchase behaviors, and so take more effective targeted measures, the types of goods purchased by customers are symbolized as a, b, c, d, e, f, g. The symbolized sequence string of the goods purchased by one customer is S = s₁s₂s₃s₄s₅s₆s₇s₈s₉s₁₀ = adgacgacef, and the given pattern string is P = p₁p₂p₃p₄ = agac, which means purchasing a, g, and a in sequence and then purchasing c.
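The symbolization step could look like the short Python sketch below. The labels `category_1` to `category_7` and the purchase list are hypothetical placeholders, since the source gives only the symbols a to g and the resulting string; the sketch merely shows how a purchase history might be turned into S.

```python
import string

# Hypothetical labels: the source names only the symbols a..g.
CATEGORIES = [f"category_{k}" for k in range(1, 8)]
SYMBOL_OF = dict(zip(CATEGORIES, string.ascii_lowercase))  # category_1 -> a, ...


def symbolize(purchases):
    """Turn a list of purchased categories into a sequence string."""
    return "".join(SYMBOL_OF[item] for item in purchases)


# Illustrative purchase history constructed to reproduce the S of this example.
purchases = [
    "category_1", "category_4", "category_7", "category_1", "category_3",
    "category_7", "category_1", "category_3", "category_5", "category_6",
]
print(symbolize(purchases))  # adgacgacef
```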

[0145] The first step is to read in the pattern string P and create multiple queues:

[0146] Read in the pattern string P = p₁p₂p₃p₄ = agac and determine that the length of the pattern string P is 4. The characters of the pattern string P are p₁, p₂, p₃, and p₄, and 4 queues are established for the pattern string P, numbered queue 1, queue 2, queue 3, and queue 4, namely...

Abstract

The invention relates to an on-line string matching method without gap constraint, in the technical field of electric digital data processing. It processes, in an online manner, the pattern matching problem without gap constraints under the non-overlapping condition, that is, a character at a given position in the sequence string may be matched at different positions of the pattern string. The method comprises the following steps: reading a pattern string P and establishing a plurality of queues; reading the given sequence string S in order from front to back; determining whether queue i can create a node; and determining whether a non-overlapping occurrence is formed and, when one is formed, outputting it on the display, until all characters in the sequence string S have been processed. The invention overcomes the defect of the prior art that space cost and time cost are difficult to control effectively while ensuring completeness; it both improves the solving speed and ensures the completeness of the solution.
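The steps listed in the abstract can be illustrated with the following Python sketch. It is a simplified greedy reading of the queue idea, not the patented procedure: the node layout (a tuple of matched positions), the first-in-first-out choice of which partial match to extend, and the name `match_online` are all assumptions, and the sketch makes no claim about the space control or completeness guarantees described in the patent.

```python
from __future__ import annotations

from collections import deque
from typing import Iterable, Iterator


def match_online(pattern: str, stream: Iterable[str]) -> Iterator[tuple[int, ...]]:
    """Yield occurrences (1-based positions) as characters arrive one by one.

    Assumption: a node is the tuple of positions matched so far, and queue j
    holds partial matches of length j that are waiting for p_(j+1).
    """
    m = len(pattern)
    queues = [deque() for _ in range(m)]  # queues[j - 1] plays the role of queue j
    for i, ch in enumerate(stream, start=1):
        # Visit queues from last to first, so a node created at position i
        # cannot be extended again by the same character.
        for j in range(m, 0, -1):
            if ch != pattern[j - 1]:
                continue
            if j == 1:
                node = (i,)  # start a new partial match
            elif queues[j - 2]:
                node = queues[j - 2].popleft() + (i,)  # extend the oldest one
            else:
                continue  # queue j cannot create a node for this character
            if j == m:
                yield node  # a complete occurrence has been formed
            else:
                queues[j - 1].append(node)


print(list(match_online("aagg", "acgatgacgg")))
# [(1, 4, 6, 9), (4, 7, 9, 10)]
```

Because each character creates at most one node per queue, a position of S is used at most once for any given pₗ, which is what keeps the reported occurrences non-overlapping in this sketch.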

Description

Technical field

[0001] The technical solution of the present invention relates to the technical field of electric digital data processing, in particular to an online string matching method without gap constraints.

Background

[0002] With the continuous progress of society and the vigorous development of the computer field, data processing has gradually become a research hotspot, and retrieving more useful information from data is particularly important. Researchers therefore characterize the information in the data and compute statistics on it, which gave rise to string matching technology. Finding all substrings of a given string that are the same as a given substring is defined as string matching or pattern matching. String matching has a wide range of practical applications, not only in simple biological sequence matching but also in shopping psychology matching in...

Application Information

IPC(8): G06F16/2458
Inventor: 武优西, 王建姣, 刘靖宇, 张帅, 柴欣, 朱怀忠, 李艳
Owner: HEBEI UNIV OF TECH