An embodiment of the invention discloses a frequent
item set mining method based on MapReduce and an array. The frequent
item set mining method comprises the following steps: converting a
data set into a two-dimensional array; decomposing the two-dimensional array into a plurality of two-dimensional sub-arrays; allocating the plurality of two-dimensional sub-arrays to at least two nodes which execute frequent
item set mining tasks in parallel, wherein each node mines a sub-frequent item set corresponding to the two-dimensional sub-array and retains a non-frequent item set of each node; and counting and summarizing the sub-frequent item sets and combining the non-frequent item sets to obtain a frequent item set of the
data set. According to the method, the
database is scanned only once and converted into the two-dimensional array, so that scanning of the
database is reduced and the I/O time is shortened; meanwhile, the array is decomposed into a plurality of two-dimensional sub-arrays through a horizontal division method; a MapReduce
programming model of a Hadoop platform is used, and a method of
processing the two-dimensional sub-arrays in parallel is adopted, that is, frequent item set mining is performed on the sub-arrays in parallel at a plurality of nodes, so that the method has a relatively good speedup ratio and scalability, and is suitable for mining frequent item sets from
big data sets.
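
The steps summarized above can be illustrated with a minimal single-process sketch in Python. It is not the claimed implementation: Hadoop is not used, the "nodes" are simulated by a plain loop, and every function name (to_array, split_rows, mine_sub_array, mine) as well as the parameters min_ratio, n_parts and max_len are hypothetical. It also assumes that each node counts candidate item sets by brute-force enumeration up to max_len items, and that the retained non-frequent item sets keep their local counts so that the final summation is exact.

```python
from collections import Counter
from itertools import combinations


def to_array(transactions):
    """Scan the data set once and encode it as a 0/1 two-dimensional array."""
    items = sorted({item for t in transactions for item in t})
    rows = [[1 if item in t else 0 for item in items] for t in transactions]
    return items, rows


def split_rows(rows, n_parts):
    """Horizontal division: contiguous blocks of rows, one block per node."""
    size = max(1, -(-len(rows) // n_parts))  # ceiling division, at least 1
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def mine_sub_array(sub_rows, items, min_ratio, max_len):
    """Map step (assumed): count item sets in one sub-array, then split
    them into locally frequent and locally non-frequent item sets."""
    counts = Counter()
    for row in sub_rows:
        present = [items[j] for j, bit in enumerate(row) if bit]
        for k in range(1, max_len + 1):
            counts.update(combinations(present, k))
    threshold = min_ratio * len(sub_rows)
    frequent = {s: c for s, c in counts.items() if c >= threshold}
    infrequent = {s: c for s, c in counts.items() if c < threshold}
    return frequent, infrequent


def mine(transactions, min_ratio, n_parts=2, max_len=3):
    """Reduce step (assumed): sum the local counts of both groups and keep
    the item sets that reach the global support threshold."""
    items, rows = to_array(transactions)
    total = Counter()
    for sub_rows in split_rows(rows, n_parts):  # stands in for parallel nodes
        frequent, infrequent = mine_sub_array(sub_rows, items, min_ratio, max_len)
        total.update(frequent)
        total.update(infrequent)
    threshold = min_ratio * len(rows)
    return {s: c for s, c in total.items() if c >= threshold}
```

For example, calling mine([{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}], min_ratio=0.5) splits the four-row array into two sub-arrays of two rows each and returns the item sets supported by at least two transactions. Because the locally non-frequent counts are retained rather than discarded, the data set does not need to be scanned a second time to confirm global support.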