A concurrency-based Top-K contrast sequential pattern mining algorithm with interval constraints

A pattern mining and sequence comparison technology, applied in computing and special data processing applications. It addresses problems such as the loss of patterns through pruning and the difficulty users have in setting support thresholds, and it improves the applicability and efficiency of the algorithm.

Inactive Publication Date: 2018-12-18
STATE GRID CHONGQING ELECTRIC POWER CO ELECTRIC POWER RES INST +2

AI Technical Summary

Problems solved by technology

Its goal is to mine the minimal patterns whose positive support is greater than or equal to α and whose negative support is less than or equal to β under the interval constraints. The support threshold of , if an inappropri



Examples


Embodiment 1

[0048] Embodiment 1: As shown in Figures 1 to 5, a concurrency-based Top-k contrast sequential pattern mining algorithm with interval constraints includes:

[0049] S1: Input the data sets and parameters in the specified format;

[0050] S2: Scan the data sets to produce the set of candidate elements and the position information of every element in it;

[0051] S3: Traverse all candidate patterns with a set enumeration tree and find the k patterns with the most significant contrast;

[0052] S4: Output the k patterns with the most significant contrast to the specified location.

[0053] The data sets and parameters input in step S1 include: a) the positive example data set; b) the negative example data set; c) the interval constraint; d) the value of k.
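Steps S1 to S4 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation. It assumes that the interval constraint is a pair (min_gap, max_gap) bounding the number of elements skipped between consecutive pattern elements, that contrast is positive support minus negative support, and that candidate elements come from the positive data set. The function names and the max_len cutoff are hypothetical, and the pruning rules of the invention are not reproduced.

```python
import heapq
from collections import defaultdict


def build_position_index(dataset):
    """S2: map each element to {sequence id: [positions where it occurs]}."""
    index = defaultdict(lambda: defaultdict(list))
    for sid, seq in enumerate(dataset):
        for pos, item in enumerate(seq):
            index[item][sid].append(pos)
    return index


def extend(ends, item_positions, gap):
    """Return, per sequence, the end positions of prefix+item occurrences.

    `ends` holds the end positions of the prefix; an occurrence of `item`
    at position p is kept if some prefix end q satisfies the gap bounds.
    """
    lo, hi = gap
    new_ends = {}
    for sid, prev in ends.items():
        hits = [p for p in item_positions.get(sid, [])
                if any(lo <= p - q - 1 <= hi for q in prev)]
        if hits:
            new_ends[sid] = hits
    return new_ends


def mine_top_k(pos, neg, gap, k, max_len=3):
    """S3: depth-first walk of the enumeration tree, keeping the k best patterns."""
    pos_idx = build_position_index(pos)
    neg_idx = build_position_index(neg)
    items = sorted(pos_idx)  # candidate elements (assumed: from the positive set)
    heap = []  # min-heap of (contrast, pattern); heap[0] is the current k-th best

    def visit(pattern, pos_ends, neg_ends):
        contrast = len(pos_ends) / len(pos) - len(neg_ends) / len(neg)
        entry = (contrast, pattern)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            heapq.heapreplace(heap, entry)
        if len(pattern) == max_len:
            return
        for item in items:
            p = extend(pos_ends, pos_idx[item], gap)
            if not p:  # no positive occurrence: this branch cannot gain contrast
                continue
            n = extend(neg_ends, neg_idx.get(item, {}), gap)
            visit(pattern + (item,), p, n)

    for item in items:
        visit((item,), dict(pos_idx[item]), dict(neg_idx.get(item, {})))
    # S4: report patterns ordered by decreasing contrast
    return sorted(((p, c) for c, p in heap), key=lambda t: (-t[1], t[0]))


if __name__ == "__main__":
    positives = ["acb", "ab", "axb"]
    negatives = ["ba", "b", "bca"]
    print(mine_top_k(positives, negatives, gap=(0, 1), k=1, max_len=2))
```

Tracking every valid end position per sequence, rather than only the first match, keeps the gap check correct when an element occurs several times in one sequence.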

[0054] Step S2 specifically includes:

[0055] S211: scan the positive example data set in the data set;

[0056] S212: For the input sequence data set, traverse each sequence according to the order of the sequence, and then upda...

Embodiment 2

[0078] Embodiment 2: This embodiment is a preferred implementation. As shown in Figures 1 to 5, a concurrent Top-k contrast sequential pattern mining algorithm with interval constraints takes as input a positive example data set (D+), a negative example data set (D-), an interval constraint γ, and the number k of most significant contrast patterns the user wants to mine. The present invention decomposes the task into multiple parallel parts, executes them concurrently, and finally obtains the k contrast sequential patterns that satisfy the interval constraint and whose support changes most significantly from the positive example data set to the negative example data set.
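One way to picture the decomposition is to treat each subtree of the enumeration tree, rooted at one candidate element, as an independent task and then merge the local results. The sketch below is an assumption about how such a split could look, not the scheme described in the patent. The names matches, mine_subtree, and concurrent_top_k are hypothetical, the gap semantics (elements skipped between consecutive pattern elements) are assumed, and the subtree search is a brute-force enumeration for brevity.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def matches(seq, pattern, gap):
    """True if `pattern` occurs in `seq` with every gap inside [lo, hi]."""
    lo, hi = gap
    ends = {i for i, x in enumerate(seq) if x == pattern[0]}
    for item in pattern[1:]:
        ends = {p for p, x in enumerate(seq)
                if x == item and any(lo <= p - q - 1 <= hi for q in ends)}
        if not ends:
            return False
    return bool(ends)


def support(dataset, pattern, gap):
    """Fraction of sequences in `dataset` that contain `pattern`."""
    return sum(matches(s, pattern, gap) for s in dataset) / len(dataset)


def mine_subtree(root, alphabet, pos, neg, gap, k, max_len):
    """One task: local top-k among patterns that start with `root`."""
    scored = []
    for extra in range(max_len):
        for tail in product(alphabet, repeat=extra):
            pattern = (root,) + tail
            contrast = support(pos, pattern, gap) - support(neg, pattern, gap)
            scored.append((contrast, pattern))
    return heapq.nlargest(k, scored)


def concurrent_top_k(pos, neg, gap, k, max_len=2, workers=4):
    """Run one task per root element, then merge the local top-k lists."""
    alphabet = sorted({x for s in pos for x in s})
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(mine_subtree, r, alphabet, pos, neg, gap, k, max_len)
                   for r in alphabet]
        partial = [f.result() for f in futures]
    # The global top-k is always contained in the union of the local top-k lists.
    merged = heapq.nlargest(k, (e for part in partial for e in part))
    return [(pattern, contrast) for contrast, pattern in merged]


if __name__ == "__main__":
    print(concurrent_top_k(["acb", "ab", "axb"], ["ba", "b", "bca"], gap=(0, 1), k=1))
```

Threads are used here only to keep the sketch short. In CPython, CPU-bound tasks like these usually need a process pool (for example ProcessPoolExecutor) to run in parallel on several cores.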

[0079] To mine top-k contrast sequential patterns with interval constraints effectively and efficiently, the present invention must solve four problems: validity, applicability, efficiency, and scalability of high-dimensio...


Abstract

The invention discloses a concurrency-based Top-K contrast sequential pattern mining algorithm with interval constraints, which includes: inputting the data sets and parameters in the prescribed format; scanning the data sets to produce the set of candidate elements and the position information of every element in it; traversing all candidate patterns with a set enumeration tree to find the k patterns with the most significant contrast; and outputting those k patterns to the specified location. The invention introduces the top-k concept into contrast sequential pattern mining with interval constraints: the algorithm aims to find the k contrast sequential patterns whose support varies most significantly between two data sets. This avoids losing useful patterns because of an inappropriate threshold. The user only has to set the number of desired patterns, which makes the method much easier to use than previous methods, and it also improves the interpretability of the mining results.

Description

Technical field

[0001] The invention relates to sequence data mining within the field of data mining, and in particular to a concurrency-based pattern mining algorithm that handles interval constraints and uses the top-k concept in place of specific support thresholds.

Background

[0002] Since Agrawal and Srikant proposed sequential pattern mining, it has been an important data mining task that has attracted the attention of a large number of researchers, and many kinds of sequential patterns have been proposed, such as frequent sequential patterns, contrast sequential patterns, closed sequential patterns, biased sequential patterns, and periodic patterns. Sequential patterns have a wide range of real-world applications. For example, health and disease control departments can mine time-series patterns of infectious disease transmission, and the mining results can be used to discover the spatio-temporal aggregation outbreak ...


Application Information

IPC(8): G06F17/30
Inventor: 李刚邹波尹心侯兴哲周全胡晓锐吴彬周艳玲籍勇亮张羽
Owner: STATE GRID CHONGQING ELECTRIC POWER CO ELECTRIC POWER RES INST