Frequent adjacent sequence pattern mining method

A sequential pattern mining technique for adjacent sequences, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses problems such as additional overhead and achieves the effects of avoiding dimension explosion, good timeliness, and low time complexity.

Pending Publication Date: 2019-02-15
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

[0014] In the prior art, the sequence pattern mining process also mines a large number of useless patterns, such as cyclic sequence patterns and redundant sequence patterns, which causes additional overhead. The purpose of the present invention is to provide a frequent adjacent sequence pattern mining method that removes these cyclic sequence patterns and redundant sequence patterns (that is, indirect sequence patterns and sequence patterns of length 1). This makes the mining results better suited to practical problems, gives better time efficiency, avoids the additional overhead, and allows the method to be applied to real large-scale frequent sequential pattern mining tasks.



Examples


Embodiment Construction

[0033] In order to facilitate the implementation of the present invention, further description will be given below in conjunction with specific examples.

[0034] As shown in Figure 1, a frequent adjacent sequence pattern mining method includes the following specific steps:

[0035] Step 1, sequence data collation

[0036] S1. Organize the data in the sequence data set and obtain the sequence database

[0037] The relevant definition of Frequent Sequential Pattern (FSP) in this embodiment is: given a sequence data set D, the FSP mining problem is to find all frequent sequence patterns that appear in D, where a frequent sequence pattern is a subsequence of at least a certain percentage of the sequences in D. The sequence $\alpha = \alpha_1 \to \alpha_2 \to \dots \to \alpha_n$ is called a subsequence of $\beta = \beta_1 \to \beta_2 \to \dots \to \beta_m$ if and only if there exists a set of integers $1 \le i_1 \le i_2 \le \dots \le i_n \le m$ such that each $\alpha_j$ is matched by $\beta_{i_j}$. That is, to make $\alpha$ a subsequence of $\beta$, each term α in ...
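The definition above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the source: each term is a single item compared by equality, matched positions are strictly increasing, and support is measured as the fraction of sequences in D that contain the pattern. The function names `is_subsequence`, `is_adjacent_subsequence`, and `support` are hypothetical. The second function shows the stricter adjacent (contiguous) case that the method's title refers to.

```python
from typing import Hashable, List, Sequence


def is_subsequence(alpha: Sequence[Hashable], beta: Sequence[Hashable]) -> bool:
    """True if the items of alpha occur in beta in order, gaps allowed.

    Assumption: items are matched by equality at strictly increasing positions.
    """
    it = iter(beta)
    # Each `any` call consumes the iterator up to the next match,
    # so later items of alpha can only match later positions of beta.
    return all(any(a == b for b in it) for a in alpha)


def is_adjacent_subsequence(alpha: Sequence[Hashable], beta: Sequence[Hashable]) -> bool:
    """True if alpha occurs in beta as one contiguous run (no gaps)."""
    a, b = list(alpha), list(beta)
    n = len(a)
    return any(b[k:k + n] == a for k in range(len(b) - n + 1))


def support(pattern: Sequence[Hashable], database: List[Sequence[Hashable]]) -> float:
    """Fraction of sequences in the database that contain the pattern."""
    return sum(is_subsequence(pattern, s) for s in database) / len(database)


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"], ["c", "b"]]
    print(is_subsequence(["a", "c"], db[0]))           # indirect match
    print(is_adjacent_subsequence(["a", "c"], db[0]))  # not contiguous
    print(support(["a", "c"], db))
```

In this toy database the pattern a → c is contained in three of the four sequences, but in the first one only indirectly (with b in between), which is the kind of indirect pattern the method aims to exclude.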



Abstract

The invention provides a frequent adjacent sequence pattern mining method, comprising the following steps: collating the sequence data; obtaining the number mt of all items in the sequence database and the maximum sequence length L; creating a sparse tensor of order l and an empty array of dimension l; traversing the sequence database, querying the sequences of length l and storing them in the array of dimension l; mapping each row (column) of each array to a position index in the sparse tensor and accumulating the value of the corresponding element in the sparse tensor, where the value is the frequency of the corresponding sequence pattern; and selecting from the sparse tensor the elements whose frequency is higher than the minimum support, where the sequence pattern corresponding to each such element is a frequent sequence pattern. Redundant frequent patterns and cyclic sequence patterns are removed, and dimension explosion is effectively avoided by using the sparse tensor data structure. The algorithm has low time complexity and good timeliness in large-scale frequent sequence mining. The invention is applied to the technical field of data mining.
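The steps in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the sparse tensor of order l is modelled as a `Counter` keyed by index tuples, so only non-zero entries are stored; every contiguous window of length l counts as one occurrence; l starts at 2 because patterns of length 1 are treated as redundant; and a pattern is treated as cyclic when any item repeats in it, which is a guess, since the source does not define the term here. The name `mine_frequent_adjacent` is hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple


def mine_frequent_adjacent(
    database: List[Sequence[Hashable]], min_support: int
) -> Dict[Tuple[Hashable, ...], int]:
    """Count adjacent patterns per length and keep the frequent ones."""
    # Item vocabulary (its size plays the role of mt) and maximum length L.
    items = list(dict.fromkeys(x for seq in database for x in seq))
    index = {item: i for i, item in enumerate(items)}
    max_len = max((len(seq) for seq in database), default=0)

    frequent: Dict[Tuple[Hashable, ...], int] = {}
    for l in range(2, max_len + 1):  # assumption: length-1 patterns are skipped
        # Array of windows: each row is one contiguous run of l item indices.
        windows = [
            tuple(index[x] for x in seq[k:k + l])
            for seq in database
            for k in range(len(seq) - l + 1)
        ]
        # Sparse order-l tensor: position tuple -> accumulated frequency.
        tensor = Counter(windows)
        for pos, freq in tensor.items():
            is_cyclic = len(set(pos)) < l  # assumption: repeated item = cyclic
            # "Higher than the minimum support", as worded in the abstract.
            if freq > min_support and not is_cyclic:
                frequent[tuple(items[i] for i in pos)] = freq
    return frequent


if __name__ == "__main__":
    db = [list("abcd"), list("abce"), list("xabc"), list("aba")]
    for pattern, freq in mine_frequent_adjacent(db, min_support=2).items():
        print(" -> ".join(pattern), freq)
```

Storing only observed index tuples keeps memory proportional to the number of distinct windows rather than to mt raised to the power l, which is the sense in which a sparse representation avoids dimension explosion.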

Description

Technical field

[0001] The invention relates to the technical field of data mining, in particular to a frequent adjacent sequence pattern mining method.

Background technique

[0002] Sequence pattern mining is an important method for finding frequent sequence patterns in sequence collections. Given a collection of different sequences, each an ordered arrangement of different elements, and a user-specified minimum support threshold min_sup, sequential pattern mining finds all subsequences whose occurrence frequency is not less than min_sup.

[0003] Commonly used basic sequential pattern mining algorithms include Apriori-like algorithms (AprioriAll, AprioriSome, DynamicSome) and algorithms based on data projection (FreeSpan, PrefixSpan).

[0004] The ideas behind the Apriori-like algorithms are roughly the same. First, the algorithm traverses the sequence database to generate candidate sequences and uses the Apriori property to prune them, obtaining the frequent sequence patterns. Each tra...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/2458
Inventor: 王江, 周鋆, 王培超, 易侃, 任华
Owner: NAT UNIV OF DEFENSE TECH