Non-overlapping adaptive frequent sequence pattern mining method

A frequent sequence and pattern mining technology, applied in character and pattern recognition, special data processing applications, instruments, etc.

Pending Publication Date: 2020-11-13
HEBEI UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0029] The technical problem to be solved by the present invention is to provide an adaptive frequent sequence pattern mining method under the non-overlapping condition. The method uses a pattern growth strategy to reduce the number of candidate patterns generated, and builds a single-branch network tree through a depth-first strategy to improve the efficiency of support calculation, thereby solving the problem of non-overlapping adaptive frequent sequence pattern mining.


Examples


Embodiment 1

[0122] Given a symbolized time series database D = {s_1 = s_1 s_2 s_3 s_4 s_5 s_6 s_7 s_8 = AHAHYCAH, s_2 = s_1 s_2 s_3 s_4 s_5 s_6 s_7 = GCHAGHA} and the minimum support threshold minsup = 3.
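The database above can be used to illustrate what a non-overlapping support count looks like when no gap constraint is given. The following is a minimal sketch, not the single-branch network tree of the invention: it swaps in a simple greedy leftmost matcher. It assumes that two occurrences are non-overlapping when they never use the same sequence position for the same pattern index, and that the support in D is the sum of the per-sequence supports. The function names `nonoverlap_support` and `database_support` are hypothetical.

```python
def nonoverlap_support(pattern: str, seq: str) -> int:
    """Count pairwise non-overlapping occurrences of `pattern` in `seq`.

    Assumption: gaps between matched characters are arbitrary, and two
    occurrences may share a position only if it is matched to different
    pattern indices. Occurrences are found greedily from the left.
    """
    if not pattern:
        return 0
    # used[i] holds the positions already matched to pattern[i]
    used = [set() for _ in pattern]
    count = 0
    while True:
        occurrence = []
        pos = -1
        for i, ch in enumerate(pattern):
            nxt = pos + 1
            # skip positions with the wrong character or already used at index i
            while nxt < len(seq) and (seq[nxt] != ch or nxt in used[i]):
                nxt += 1
            if nxt == len(seq):
                return count  # no further occurrence can be completed
            occurrence.append(nxt)
            pos = nxt
        for i, p in enumerate(occurrence):
            used[i].add(p)
        count += 1


def database_support(pattern: str, database: list[str]) -> int:
    """Assumed definition: sum of per-sequence supports."""
    return sum(nonoverlap_support(pattern, s) for s in database)


D = ["AHAHYCAH", "GCHAGHA"]
minsup = 3
for p in ["AH", "HA", "Y"]:
    sup = database_support(p, D)
    print(p, sup, "frequent" if sup >= minsup else "infrequent")
```

Under these assumptions the pattern AH is matched three times in AHAHYCAH and once in GCHAGHA, so it meets minsup = 3, while Y does not.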

[0123] The first step is to read the sequence database D and the minimum support threshold minsup:

[0124] Read in the given sequence database D and the minimum support threshold minsup to obtain the character set E and the number N of sequences in D. The sequences are denoted s_1, s_2, ..., s_k, ..., s_N, where 1 ≤ k ≤ N, and the characters of a sequence s_k are denoted s_1, s_2, ..., s_j, ..., s_n, where 1 ≤ j ≤ n;
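As a rough illustration of this first step, the sketch below derives the character set E and the sequence count N from an in-memory list of strings. The function name `read_database` and the choice to pass the sequences as a Python list are assumptions made for the example; the source does not specify an input format.

```python
def read_database(sequences):
    """Return (D, E, N): cleaned sequences, sorted character set, sequence count."""
    # Drop surrounding whitespace and skip empty entries (an assumption about input hygiene)
    database = [s.strip() for s in sequences if s.strip()]
    # E: every distinct character that appears anywhere in D
    alphabet = sorted(set("".join(database)))
    return database, alphabet, len(database)


D, E, N = read_database(["AHAHYCAH", "GCHAGHA"])
minsup = 3
print(E, N)
```

Sorting E is only for reproducible output; the method as described does not depend on the order of the characters.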

[0125] Read in the given sequence database D = {s_1 = s_1 s_2 s_3 s_4 s_5 s_6 s_7 s_8 = AHAHYCAH, s_2 = s_1 s_2 s_3 s_4 s_5 s_6 s_7...

Embodiment 2

[0211] Given a symbolized stock sequence database D = {s_1 = s_1 s_2 s_3 s_4 s_5 s_6 s_7 s_8 = FQFAQYCF, s_2 = s_1 s_2 s_3 s_4 s_5 s_6 s_7 s_8 s_9 s_10 = GCDAQFGQCF} and the expected number of patterns K = 3.

[0212] In addition to "the first step, read in the sequence database D and the minimum support threshold minsup; the second step, calculate the support of each candidate pattern in the candidate pattern set C_1, and add each pattern whose support is greater than or equal to the minimum support threshold minsup to the set F_1 of non-overlapping adaptive frequent sequence patterns of length 1; the fourth step, when the candidate pattern set C_{l+1} is empty, mining ends and the sixth step is executed; otherwise, the candidate pattern set C_{l+1} is traversed, each pattern p_h is taken out in turn, and the support of pattern p_h in the sequence database ...
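The second step and the growth from F_l to C_{l+1} can be sketched roughly as follows. This is an illustration under stated assumptions, not the procedure of the invention: the support of a length-1 pattern is taken to be its total number of occurrences in D, candidates are generated by joining two frequent patterns whose suffix and prefix agree (a common form of pattern growth), and minsup = 3 is borrowed from Embodiment 1 purely for demonstration because this embodiment specifies K instead. The function names are hypothetical.

```python
from collections import Counter


def frequent_length_one(database, minsup):
    """F_1: characters whose total occurrence count in D reaches minsup."""
    counts = Counter()
    for seq in database:
        counts.update(seq)  # counts each character of the sequence
    return sorted(ch for ch, c in counts.items() if c >= minsup)


def grow_candidates(frequent):
    """C_{l+1}: join p and q from F_l when p without its first character
    equals q without its last character, giving p + q[-1]."""
    return sorted(p + q[-1] for p in frequent for q in frequent
                  if p[1:] == q[:-1])


D = ["FQFAQYCF", "GCDAQFGQCF"]
minsup = 3  # illustrative value only, not given for this embodiment

F1 = frequent_length_one(D, minsup)
C2 = grow_candidates(F1)
print(F1)
print(C2)
```

Joining only frequent patterns means that no candidate is built on an infrequent prefix or suffix, which is how this style of growth keeps the candidate set small.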


Abstract

The invention relates to a non-overlapping adaptive frequent sequence pattern mining method and belongs to the field of sequence pattern analysis in data mining. The method reduces candidate pattern generation by using a pattern growth strategy, and constructs a single-branch network tree through a depth-first strategy to improve the efficiency of support calculation, thereby solving the problem of non-overlapping adaptive frequent sequence pattern mining. The method performs non-overlapping frequent pattern mining without requiring gap constraints to be given. It addresses the difficulty that existing sequence pattern mining techniques have in achieving flexibility, efficiency and completeness of mining at the same time. The method is convenient for users, can effectively reduce time and space complexity, and obtains complete and valuable information while keeping the mining result free of redundancy.

Description

Technical field

[0001] The technical solution of the invention relates to the field of sequence pattern analysis, in particular to a frequent sequence pattern mining method with adaptive gaps under the non-overlapping condition.

Background technique

[0002] Time series data is a common and important kind of data. It is a numerical sequence arranged in chronological order and is found widely in human production and life, for example in passenger flow analysis, marketing, river flow, weather temperature and humidity, stock prices, and electrocardiogram (ECG) and electroencephalogram (EEG) signals. Such time series data contains a large amount of potentially valuable information, which can reveal different characteristics of the underlying dynamic system and help users analyze decisions and judge future trends. However, because time series are volatile, variable and uncertain, extracting effective information directly from time series data is a significant challenge. Therefore, it is necessary to convert it into a character seque...


Application Information

IPC(8): G06F16/2458, G06K9/62
CPC: G06F16/2465, G06F18/22
Inventor: 王月华, 李艳, 王珠林, 刘锦, 赵晓倩, 陈明婕, 武优西
Owner HEBEI UNIV OF TECH