
Approximate Pattern Matching Method with Local-Global Constraints

A holistic pattern-matching technology, applied in data mining, instrumentation, computing, and related fields, that addresses problems such as the inability to handle noise, incomplete solutions, limited flexibility, accuracy, and generality, and deviation between sequences.

Active Publication Date: 2021-02-26
HEBEI UNIV OF TECH
20 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0024] Pattern matching problems generally require a complete algorithm and a solution that is flexible, accurate, and general, but existing pattern matching techniques struggle to meet all of these conditions at once. For example, "Strict pattern matching under non-overlapping condition" (Science China Information Sciences) by Wu Youxi and Shen Cong studies pattern matching under the non-overlapping condition based on the network tree structure. An occurrence is determined by iteratively searching for the rightmost root-leaf path of the network tree, and the occurrence and the related invalid nodes are then pruned, so the proposed algorithm is complete, correct, and effective. However, that work studies exact pattern matching, which cannot handle noise, so the solution lacks generality. "Approximate pattern matching with gap constraints" (Journal of Information Science) by Wu Youxi and Tang Zhiqiang studies approximate pattern matching with gap constraints and proposes an efficient algorithm based on a single network tree, which can find more valuable information in many fields than exact pattern matching. However, that work uses the Hamming distance, which does not consider local constraints between sequences, so matched sequences can deviate greatly from the pattern and the results are not accurate. "An Improved Multi-Pattern Matching Algorithm for Strings" (Computer Engineering and Applications) by Dong Shibo and Li Xungen proposes a multi-pattern matching algorithm based on finite automata that reduces unnecessary character comparisons and improves matching efficiency, but it addresses pattern matching without gap constraints and therefore lacks flexibility. "NETASPNO: Approximate strict pattern matching under nonoverlapping condition" (IEEE Access) by Wu Youxi and Li Shasha studies approximate pattern matching based on the Hamming distance under the non-overlapping condition and improves the efficiency of the algorithm by avoiding backtracking and applying a pruning strategy. Although that work considers gap constraints and is therefore flexible and general, it can lose solutions and is not complete.
[0025] In short, for the approximate pattern matching problem with local-global constraints, existing techniques struggle to balance the completeness of the solution with its flexibility, accuracy, and generality, and no good method for this kind of problem has been available so far.


Examples


Embodiment

[0087] Given the time series of Wolfer sunspot numbers from 1800 to 1847, 48 data points in total, divided into 12 segments of 4 data points each, and the character set Σ={a,b,c,d,e,f}, the time series is symbolized by the SAX (Symbolic Aggregate approXimation) method into the character sequence "ccabcbcecfce", so the sequence S=ccabcbcecfce.
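The symbolization step in [0087] can be illustrated with a minimal SAX sketch. This is the textbook form of SAX (z-normalization, piecewise aggregate averaging, equiprobable Gaussian breakpoints), not necessarily the exact variant used in the patent. The function name `sax` and its parameters are hypothetical, and the sunspot data itself is not reproduced here.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet):
    """Convert a numeric series into a SAX word (textbook variant, an assumption).

    Assumes the series is not constant and that its length is a
    multiple of n_segments.
    """
    # 1. Z-normalize the series to zero mean and unit variance.
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma for x in series]

    # 2. Piecewise aggregate approximation: average each equal-length segment.
    seg_len = len(z) // n_segments
    paa = [mean(z[i * seg_len:(i + 1) * seg_len]) for i in range(n_segments)]

    # 3. Breakpoints that split the standard normal distribution into
    #    len(alphabet) equiprobable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]

    # 4. Map each segment mean to the symbol of the region it falls into.
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


if __name__ == "__main__":
    # Placeholder data (NOT the Wolfer sunspot series): 48 rising values.
    demo = list(range(48))
    print(sax(demo, n_segments=12, alphabet="abcdef"))
```

With 48 values, 12 segments, and the six-letter alphabet from [0087], the function returns a 12-character word over {a,...,f}, the same shape as S in the embodiment.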

[0088] Time series frequent pattern mining finds the patterns whose support in the symbolized time series is greater than or equal to the minimum support threshold minsup, that is, the frequent patterns. It consists of two steps: generating candidate patterns and calculating the support of each candidate pattern in the sequence. The candidate patterns are generated from the character set, and the support of a candidate pattern in the sequence is calculated by the pattern matching method. When the support of the candidate pattern in the sequence is greater than or equal to the ...
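The two steps named in [0088] can be sketched as a simple loop. The sketch below is only an illustration: `frequent_patterns` and `exact_support` are hypothetical names, candidates are enumerated naively as all strings of a fixed length, and the default matcher counts contiguous exact occurrences as a stand-in for the patent's approximate matching method.

```python
from itertools import product


def exact_support(seq, pattern):
    """Placeholder matcher (an assumption): number of contiguous exact
    occurrences of pattern in seq, overlaps allowed."""
    m = len(pattern)
    return sum(seq[i:i + m] == pattern for i in range(len(seq) - m + 1))


def frequent_patterns(seq, alphabet, length, minsup, support=exact_support):
    """Return {pattern: support} for all candidates with support >= minsup."""
    result = {}
    # Step 1: generate candidate patterns from the character set.
    for chars in product(alphabet, repeat=length):
        candidate = "".join(chars)
        # Step 2: compute the candidate's support with the supplied matcher.
        s = support(seq, candidate)
        if s >= minsup:
            result[candidate] = s
    return result


if __name__ == "__main__":
    S = "ccabcbcecfce"  # sequence from the embodiment
    print(frequent_patterns(S, "abcdef", length=2, minsup=2))
    # prints {'bc': 2, 'ce': 2}
```

Passing a different `support` function swaps in another matching method without changing the mining loop.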


Abstract

The invention is an approximate pattern matching method with local-global constraints and relates to the technical field of electrical digital data processing. It solves the approximate pattern matching problem under the (δ, γ)-distance by means of a network tree structure: it first reads in the sequence S, the pattern P, the local threshold δ, and the global threshold γ, then creates a network tree according to these inputs, and finally calculates all occurrences of the pattern P in the sequence S from the nodes in the leaf layer of the network tree. The method realizes approximate pattern matching under the (δ, γ)-distance with gap constraints and overcomes the difficulty, in prior-art approximate pattern matching with local-global constraints, of balancing the completeness of the solution with its flexibility, accuracy, and generality.
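To make the inputs in the abstract concrete, here is a brute-force sketch of matching under a (δ, γ)-distance with a gap constraint. It is not the network tree algorithm of the invention, and several points are assumptions because the excerpt does not define them: the distance between two characters is taken as the difference of their alphabet positions, δ bounds each per-character distance, γ bounds their sum, and one gap range [min_gap, max_gap] applies between every pair of consecutive pattern characters. The name `find_occurrences` is hypothetical.

```python
def find_occurrences(seq, pattern, min_gap, max_gap, delta, gamma):
    """Enumerate all occurrences (tuples of 0-based positions) of pattern in seq.

    Assumed semantics, not taken from the patent text:
      - local constraint:  |seq[pos_j] - pattern[j]| <= delta for every j
      - global constraint: the sum of those distances is <= gamma
      - gap constraint:    min_gap <= pos_{j+1} - pos_j - 1 <= max_gap
    """
    occurrences = []

    def extend(positions, total):
        j = len(positions)
        if j == len(pattern):
            occurrences.append(tuple(positions))
            return
        if j == 0:
            candidates = range(len(seq))
        else:
            start = positions[-1] + 1 + min_gap
            stop = min(len(seq), positions[-1] + 1 + max_gap + 1)
            candidates = range(start, stop)
        for i in candidates:
            d = abs(ord(seq[i]) - ord(pattern[j]))
            # Prune as soon as the local or the global threshold is exceeded.
            if d <= delta and total + d <= gamma:
                extend(positions + [i], total + d)

    extend([], 0)
    return occurrences


if __name__ == "__main__":
    print(find_occurrences("abc", "ac", min_gap=0, max_gap=1, delta=1, gamma=1))
    # prints [(0, 1), (0, 2), (1, 2)]
```

Setting delta=0 reduces this to exact matching with gap constraints. The exhaustive search is exponential in the worst case, which is the kind of cost a structure such as the network tree is meant to avoid.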

Description

Technical field

[0001] The technical solution of the invention relates to the technical field of electrical digital data processing, in particular to an approximate pattern matching method with local-global constraints.

Background technique

[0002] With the advent of the big data era, large amounts of data have emerged in many fields, and how to mine valuable information from these data has become a research hotspot. Frequent pattern mining refers to finding frequent patterns in large amounts of data. Its main task is pattern matching, because frequent pattern mining usually needs to calculate the support of a pattern, and support calculation is in essence a pattern matching problem. Pattern matching is therefore the basis and core of frequent pattern mining. It is applied not only to time series frequent pattern mining but also to music information retrieval, and has important research value.

[0003] Pattern matching refers to the process of finding subsequences t...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/2458
CPC: G06F2216/03, G06F16/2465
Inventor: 武优西, 菅博境, 范金泉, 王月华, 刘茜, 张帅, 李艳
Owner: HEBEI UNIV OF TECH