
A GPU-accelerated frequent itemset mining method based on the CUDA framework

A frequent itemset mining technology, applied in the field of structured data retrieval. It addresses two problems: steps whose data are interdependent cannot be applied to parallel execution, and existing algorithms are inefficient. It achieves good acceleration performance, improves processing capacity, and reduces the number of startups.

Active Publication Date: 2020-09-29
DALIAN UNIV OF TECH
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0005] Although the above-mentioned Apriori algorithm can be accelerated in parallel through the CUDA framework, when there are many frequent itemsets or the minimum support threshold is low, the algorithm still has to generate a large number of candidate itemsets and compute their support by repeatedly scanning the database. The resulting frequent I/O access to external memory makes the algorithm inefficient.
Although the Eclat algorithm avoids repeated database scans when computing support, it does not exploit the a priori property used by Apriori and therefore generates a large number of candidate itemsets. For discrete data sets, the bitmap it generates may also require so much memory that it exceeds the limit of GPU global memory.
At the same time, neither algorithm can apply the candidate generation step, which involves complex logic and interdependent data, to the CUDA framework.
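
To make the candidate generation step concrete, the following is a minimal Python sketch of the textbook Apriori join-and-prune procedure. It is not the patented method; the function name `generate_candidates` and the representation of itemsets as sorted tuples are assumptions chosen for illustration. It shows why the step is logic-heavy: each candidate depends on membership checks against the whole set of frequent itemsets.

```python
from itertools import combinations

def generate_candidates(frequent_k, k):
    """Textbook Apriori step (illustrative, not the patented method).

    frequent_k: iterable of frequent k-itemsets, each a sorted tuple.
    Returns candidate (k+1)-itemsets whose k-subsets are all frequent.
    """
    frequent = set(frequent_k)
    ordered = sorted(frequent)
    candidates = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            # Join only itemsets that share the same (k-1)-item prefix.
            if a[:k - 1] != b[:k - 1]:
                break  # sorted order: no later tuple shares this prefix
            candidate = a + (b[-1],)
            # Prune: every k-subset of the candidate must itself be frequent.
            if all(sub in frequent for sub in combinations(candidate, k)):
                candidates.append(candidate)
    return candidates

frequent_2 = [(1, 2), (1, 3), (2, 3), (2, 4)]
candidates_3 = generate_candidates(frequent_2, 2)
```

Here `(2, 3, 4)` is joined from `(2, 3)` and `(2, 4)` but pruned because `(3, 4)` is not frequent, so only `(1, 2, 3)` survives.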



Examples


Embodiment Construction

[0026] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The overall flow of the method is shown in Figure 1.

[0027] The first step is to convert the traditional horizontally stored data set into a data structure stored as a vertical bit table.

[0028] 1) In the original transaction data set, each row represents one transaction, and each transaction has its own transaction identifier (Tid). The original database is scanned and converted into vertical storage;

[0029] 2) The vertically stored transaction data set is then converted into a vertical bit-table storage structure: each item corresponds to the set of Tids of the transactions that contain it, and this Tid set is encoded as a bitmap. If the item exists in a certain transactio...
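
The horizontal-to-vertical conversion described in this first step can be sketched in Python as follows. This is a hedged illustration, not the patent's implementation: the function name `to_vertical_bitmaps` is hypothetical, Tids are assumed to be 0-based row positions, a set bit is assumed to mean "the transaction contains the item" (the source sentence is truncated at that point), and a Python integer stands in for the word arrays a GPU version would use.

```python
def to_vertical_bitmaps(transactions):
    """Convert horizontal transactions into one bitmap per item.

    Assumption: bit t of an item's bitmap is 1 when the transaction
    with Tid t contains the item, and 0 otherwise.
    """
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            # Set bit `tid` in the bitmap of every item in this transaction.
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps

# Horizontal storage: one row per transaction (Tid = row index).
transactions = [{"a", "b"}, {"b", "c"}, {"a", "b", "c"}]
bitmaps = to_vertical_bitmaps(transactions)

# Print each item's bitmap, least significant bit = Tid 0.
for item in sorted(bitmaps):
    print(item, format(bitmaps[item], "03b"))
```

With this layout, item `a` (present in Tids 0 and 2) is stored as `101` and item `c` (Tids 1 and 2) as `110`.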


Abstract

The invention belongs to the technical field of computer applications and provides a GPU-accelerated frequent itemset mining algorithm based on CUDA. The algorithm adopts graph connection and a dynamic queue. The method combines the advantages of the Apriori and Eclat algorithms and converts the logically complex task of generating candidate itemsets into a computation-intensive task that suits the CUDA computing model. The dynamic queue overcomes the limit of GPU global memory; for example, after a discrete data set is converted into a vertical bitmap, it avoids the situation in which the required memory exceeds the GPU's global memory. Experiments show that, when processing various types of large-scale data sets, the method's acceleration performance exceeds that of a serial algorithm, its processing capability is significantly improved, and the extracted frequent itemsets are accurate and reliable. In practical engineering applications, the algorithm offers advantages that other algorithms cannot replace.

Description

Technical field

[0001] The invention belongs to the field of computer application technology and relates to a frequent itemset mining method used in association rule mining. In particular, it relates to a method that uses a GPU to accelerate Apriori-based frequent itemset mining on large-scale data sets.

Background technique

[0002] Frequent itemset mining is the main step of association rule mining, and its results can be used to derive association rules. It plays a major role in discovering associations between items and is widely used in financial forecasting, medical diagnosis, and business recommendation. The purpose of frequent itemset mining is to find potential combinations in the data set, that is, frequent items. The criterion is that frequent items are all itemsets whose support is greater than the minimum support threshold.

[0003] The problem of this algorithm is described as follows: I={i1, i2,...,im} represe...
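
The frequency criterion stated in [0002] can be illustrated with a short Python sketch. It assumes that support is an absolute transaction count and that each item is stored as a bitmap with one bit per transaction; the names `support` and `is_frequent` and the sample bitmaps are hypothetical. The strict `>` comparison follows the "greater than" wording of the text.

```python
def support(itemset, bitmaps):
    """Count transactions containing every item of `itemset`.

    Assumption: bitmaps[item] is an int whose bit t is 1 when
    transaction t contains the item.
    """
    items = iter(itemset)
    acc = bitmaps[next(items)]
    for item in items:
        acc &= bitmaps[item]  # keep only transactions containing all items so far
    return bin(acc).count("1")  # number of set bits = number of transactions

def is_frequent(itemset, bitmaps, min_support):
    """An itemset is frequent when its support exceeds the threshold."""
    return support(itemset, bitmaps) > min_support

# Hypothetical bitmaps for three transactions: {a,b}, {b,c}, {a,b,c}.
bitmaps = {"a": 0b101, "b": 0b111, "c": 0b110}
print(support(("a", "b"), bitmaps), is_frequent(("a", "b"), bitmaps, 1))
print(support(("a", "c"), bitmaps), is_frequent(("a", "c"), bitmaps, 1))
```

Because the intersection is a bitwise AND followed by a bit count, the support of a candidate can be computed without rescanning the transaction database.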


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/20
Inventor: 王宇新, 徐彤坤, 薛世卿
Owner: DALIAN UNIV OF TECH