Parallel-PSO-based maximum fault-tolerant frequent item set mining method

A maximum fault-tolerant frequent itemset mining technique, applicable to special data processing applications, instruments, and electrical digital data processing. It addresses the low efficiency of existing algorithms and their inability to maintain the same efficiency, improving operating efficiency and achieving high performance.

Active Publication Date: 2019-10-08
GUILIN UNIV OF ELECTRONIC TECH

AI Technical Summary

Problems solved by technology

[0004] Most current fault-tolerant frequent itemset mining algorithms are inefficient on large-scale and sparse target transaction databases, and they cannot maintain the same efficiency across different degrees of fault tolerance on the same target transaction database. To address these problems, the present invention provides a maximum fault-tolerant frequent itemset mining method based on parallel PSO.


Examples

Embodiment Construction

[0030] To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to specific examples and the accompanying drawings.

[0031] A frequent itemset is a set of items that satisfies the minimum support condition in the target transaction database. A fault-tolerant frequent itemset is an itemset that satisfies the minimum support under a given degree of fault tolerance. Fault-tolerant block mining works on the binary representation of the target transaction database and looks for the region formed by the transactions that support a fault-tolerant itemset, so finding a fault-tolerant block that meets the conditions is equivalent to finding the corresponding fault-tolerant itemset. A transaction database is a collection of sales transactions, and each transaction record is a collection of items. The common representation of the ...
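The binary representation and the notion of support under fault tolerance can be illustrated with a minimal Python sketch. It is not taken from the patent text: the function names, the sample transactions, and in particular the tolerance rule (a transaction counts as supporting an itemset if it misses at most `max_missing` of its items) are assumptions chosen for illustration.

```python
from typing import List, Sequence, Set, Tuple


def to_binary_matrix(
    transactions: Sequence[Set[str]],
) -> Tuple[List[str], List[List[int]]]:
    """Return (sorted item list, 0/1 matrix) with one row per transaction."""
    items = sorted(set().union(*transactions)) if transactions else []
    matrix = [[1 if item in t else 0 for item in items] for t in transactions]
    return items, matrix


def tolerant_support(
    matrix: List[List[int]], columns: List[int], max_missing: int
) -> float:
    """Fraction of rows that miss at most `max_missing` of the given columns.

    Assumed tolerance rule for illustration; max_missing=0 gives the
    ordinary (exact) support of the itemset.
    """
    if not matrix:
        return 0.0
    hits = sum(
        1 for row in matrix
        if sum(1 - row[c] for c in columns) <= max_missing
    )
    return hits / len(matrix)


# Hypothetical sales transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
items, matrix = to_binary_matrix(transactions)
cols = [items.index(name) for name in ("bread", "butter", "milk")]

exact = tolerant_support(matrix, cols, max_missing=0)
relaxed = tolerant_support(matrix, cols, max_missing=1)
print(items, exact, relaxed)
```

With these sample transactions only one row contains all three items, while every row misses at most one of them, which is why relaxing the tolerance raises the support of the same itemset.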

Abstract

The invention discloses a parallel-PSO-based maximum fault-tolerant frequent itemset mining method. The method includes the following steps: converting the target transaction database into a corresponding binary matrix; if the converted binary matrix is sparse, setting a minimum item support and deleting the items that do not meet the threshold; initializing a particle swarm according to the matrix dimension; adding a Gaussian disturbance term to the velocity update formula of the particle swarm algorithm to prevent the population from falling into a local optimum; designing the fitness function of the particle swarm algorithm according to the concept of the fault-tolerant block; converting the particles in the population into an RDD data set on the Spark platform; and iterating until a termination condition is reached to obtain a maximum fault-tolerant block. When the support of the fault-tolerant block is larger than the minimum support threshold, the itemset corresponding to the fault-tolerant block is a maximum fault-tolerant itemset. The method improves the operating efficiency of the algorithm, keeps its efficiency unchanged across different fault-tolerance degrees on the same target transaction database, and performs well on sparse target transaction databases.
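Two of the steps above, pruning low-support items and updating a particle with a Gaussian disturbance in the velocity, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented formulas: the abstract does not give the velocity update, the particle encoding, or the fitness function, so the sketch assumes a standard binary PSO (one bit per item column, sigmoid mapping from velocity to bit probability) with an added `rng.gauss` noise term. The parameters `w`, `c1`, `c2`, `sigma`, and `v_max` are hypothetical, and the fitness function and the Spark RDD parallelization are left out.

```python
import math
import random
from typing import List, Tuple


def prune_sparse_items(matrix: List[List[int]], min_item_support: float) -> List[int]:
    """Return indices of columns whose share of 1s reaches min_item_support."""
    n_rows = len(matrix)
    if n_rows == 0:
        return []
    n_cols = len(matrix[0])
    return [
        c for c in range(n_cols)
        if sum(row[c] for row in matrix) / n_rows >= min_item_support
    ]


def update_particle(
    position: List[int],
    velocity: List[float],
    personal_best: List[int],
    global_best: List[int],
    rng: random.Random,
    w: float = 0.7,       # inertia weight (assumed value)
    c1: float = 1.5,      # cognitive coefficient (assumed value)
    c2: float = 1.5,      # social coefficient (assumed value)
    sigma: float = 0.1,   # std. dev. of the Gaussian disturbance (assumed)
    v_max: float = 4.0,   # velocity clamp (assumed)
) -> Tuple[List[int], List[float]]:
    """One binary-PSO step with a Gaussian disturbance added to the velocity."""
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        v_next = (
            w * v
            + c1 * rng.random() * (p - x)
            + c2 * rng.random() * (g - x)
            + rng.gauss(0.0, sigma)  # disturbance term against local optima
        )
        v_next = max(-v_max, min(v_max, v_next))
        prob_one = 1.0 / (1.0 + math.exp(-v_next))  # sigmoid -> P(bit = 1)
        new_velocity.append(v_next)
        new_position.append(1 if rng.random() < prob_one else 0)
    return new_position, new_velocity


# Hypothetical 4-transaction, 4-item binary matrix.
matrix = [
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 1],
]
kept = prune_sparse_items(matrix, min_item_support=0.5)

rng = random.Random(0)
position = [rng.randint(0, 1) for _ in kept]
velocity = [0.0 for _ in kept]
position, velocity = update_particle(
    position, velocity, personal_best=[1, 1, 0], global_best=[1, 1, 1], rng=rng
)
print(kept, position, velocity)
```

Here the last column appears in only one of four transactions, so it is dropped before the swarm is initialized, which shrinks the particle dimension from four to three.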

Description

Technical field

[0001] The invention relates to the technical field of frequent itemset mining, in particular to a maximum fault-tolerant frequent itemset mining method based on parallel PSO.

Background technique

[0002] The problem of frequent itemset mining is to find all frequent itemsets, that is, the sets of items whose support reaches a given minimum threshold. This problem underlies data mining and knowledge discovery tasks such as association rule mining and subspace clustering. As an important result in data mining, frequent itemsets reflect the mining of data in a noise-free environment. In many engineering research fields, mining frequent itemsets becomes more difficult because the data contain noise. This noise can have various causes, such as measurement errors, missing values, and certain anomalies. Earlier methods of eliminating noise often divide larger patterns into smaller patterns. However, th...

Application Information

IPC(8): G06F16/2455, G06F16/2458, G06N3/00
CPC: G06F16/24564, G06F16/2465, G06N3/006, Y02D10/00
Inventor: 张红梅, 齐东升
Owner: GUILIN UNIV OF ELECTRONIC TECH