Frequent item set mining method and device based on MapReduce and array

A frequent item set mining technology, applied in the fields of electrical digital data processing, special data processing applications, and digital data information retrieval, that addresses the low efficiency of frequent item set mining.

Active Publication Date: 2019-08-27
禤世丽

AI Technical Summary

Problems solved by technology

[0004] To this end, the embodiments of the present invention provide a frequent itemset mining method, device, electronic equipment and storage medium based on MapReduce and arrays, to solve the low mining efficiency caused by the serial frequent itemset mining algorithms of the prior art.




Embodiment Construction

[0048] The implementation of the present invention is illustrated below by specific examples, and those familiar with this technology can readily understand other advantages and effects of the present invention from the contents disclosed in this description. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the protection scope of the present invention.

[0049] Figure 1 shows the implementation environment involved in the frequent itemset mining method based on MapReduce and arrays provided by the embodiment of the present invention. Referring to Figure 1, the implementation environment includes a client 101 and a server 102.

[0050] The client 101 may be a PDA, a notebook computer, a desktop computer, a tablet computer, a smart phone, etc., an...



Abstract

The embodiment of the invention discloses a frequent item set mining method based on MapReduce and arrays. The method comprises the following steps: converting a data set into a two-dimensional array; decomposing the two-dimensional array into a plurality of two-dimensional sub-arrays; allocating the sub-arrays to at least two nodes that execute frequent item set mining tasks in parallel, wherein each node mines the sub-frequent item sets of its sub-array and retains its non-frequent item sets; and counting and summarizing the sub-frequent item sets and combining the non-frequent item sets to obtain the frequent item sets of the data set. The database is scanned only once and converted into a two-dimensional array, which reduces database scans and shortens I/O time. The array is then decomposed into multiple sub-arrays by horizontal division, and the sub-arrays are processed in parallel on a plurality of nodes using the MapReduce programming model of the Hadoop platform. The method therefore achieves relatively good speedup and scalability and is suitable for mining frequent item sets from large data sets.
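The steps in the abstract can be sketched in a few lines of Python. This is only an illustrative, single-process simulation, not the patented implementation: the function names (`to_matrix`, `split_rows`, `map_count`, `reduce_counts`, `mine`), the `max_len` cap on itemset size, and the choice to have every partition emit local counts for all of its itemsets (so that locally infrequent ones can still be summed globally) are assumptions made for the example. On a real Hadoop cluster the map and reduce functions would run as separate tasks on different nodes.

```python
from collections import Counter
from itertools import combinations


def to_matrix(transactions):
    """Scan the data set once and encode it as a 0/1 two-dimensional array."""
    items = sorted({item for t in transactions for item in t})
    matrix = [[1 if item in t else 0 for item in items] for t in transactions]
    return items, matrix


def split_rows(matrix, n_parts):
    """Horizontal division: contiguous row blocks, one per (simulated) node."""
    size = max(1, -(-len(matrix) // n_parts))  # ceiling division, at least 1
    return [matrix[i:i + size] for i in range(0, len(matrix), size)]


def map_count(sub_matrix, items, max_len):
    """Map step: local support counts for every itemset in one sub-array.

    Counts are kept even for itemsets that are infrequent locally, because
    they may still reach the global threshold once all partitions are summed.
    """
    counts = Counter()
    for row in sub_matrix:
        present = [items[j] for j, bit in enumerate(row) if bit]
        for k in range(1, max_len + 1):
            counts.update(combinations(present, k))
    return counts


def reduce_counts(partials, min_support):
    """Reduce step: sum the local counts and keep globally frequent itemsets."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return {itemset: n for itemset, n in total.items() if n >= min_support}


def mine(transactions, min_support, n_parts=2, max_len=3):
    """Run the whole pipeline in-process (hypothetical driver function)."""
    items, matrix = to_matrix(transactions)
    partials = [map_count(p, items, max_len) for p in split_rows(matrix, n_parts)]
    return reduce_counts(partials, min_support)


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in sorted(mine(data, min_support=3).items()):
        print(itemset, support)
```

Because the reducer sums absolute counts, the result does not depend on how many sub-arrays the rows are split into; only the amount of work per node changes. Enumerating all itemsets up to `max_len` per row is exponential in row width, so a practical version would prune candidates rather than count exhaustively.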

Description

Technical field

[0001] Embodiments of the present invention relate to the technical fields of data mining and big data, and in particular to a frequent itemset mining method, device, electronic device and storage medium based on MapReduce and arrays.

Background

[0002] In the field of data mining, R. Agrawal and R. Srikant proposed the classic Apriori algorithm, and many publications have since proposed improved frequent itemset mining algorithms. Compared with the Apriori algorithm, these algorithms shorten the I/O time to a certain extent and moderately improve the efficiency of finding frequent itemsets, but they still do not resolve the algorithm's bottleneck.

[0003] At present, with the rapid development of information technology, the volume of data that needs to be analyzed grows day by day, which leaves current serial frequent itemset mining algorithms facing two difficult problems: one is that they are limited by the memory of a single machine, it ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/2458
CPC: G06F16/2465, Y02D10/00
Inventor: 禤世丽
Owner: 禤世丽