Multi-computing framework processing system applied in big data and association rule mining method thereof

A multi-computing-framework processing system for big data, addressing the insufficient computing and storage capacity of conventional item-set mining algorithms.

Status: Inactive. Publication date: 2016-11-16
CHENGDU UNIV OF INFORMATION TECH
Cites: 5. Cited by: 15.

AI Technical Summary

Problems solved by technology

Big data cannot be processed by a single computer.



Embodiment Construction

[0025] A system based on multiple computing frameworks comprises a distributed computer cluster together with a MapReduce framework and a Spark framework that share this cluster. The computer cluster comprises a database cluster forming the transactional database, a switch, an application server, a Hadoop cluster, and a Spark cluster. The distributed computer cluster is connected to the Internet.

[0026] The present invention combines the operating characteristics of each computing model so that the computing resources of the cluster are used efficiently. Both Hadoop and Spark follow the MapReduce programming model. Spark is built around the RDD (resilient distributed dataset) abstraction and encapsulates commonly used MapReduce-style data processing patterns as classes. Spark keeps intermediate data in memory, using a large amount of memory to hold and compute on intermediate results during data processing. Compared with Hadoop, Spark has obvious advantages ...
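To make the shared programming model concrete, the following is a minimal sketch in plain Python of a map, shuffle, and reduce pass that counts item support over partitioned transactions. It does not use the Hadoop or Spark APIs; the function names (`map_phase`, `shuffle`, `reduce_phase`, `item_support`) and the sample data are hypothetical and chosen only for illustration. The intermediate pairs stay in memory here, which loosely mirrors the in-memory handling of intermediate data described for Spark.

```python
from collections import defaultdict
from itertools import chain


def map_phase(transaction):
    """Emit one (item, 1) pair per distinct item in a transaction."""
    return [(item, 1) for item in set(transaction)]


def shuffle(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped values to obtain each item's support count."""
    return {key: sum(values) for key, values in groups.items()}


def item_support(partitions):
    # Each partition stands in for the data held by one worker node
    # (an assumption made for this sketch, not a detail from the source).
    mapped = chain.from_iterable(
        map_phase(t) for part in partitions for t in part
    )
    return reduce_phase(shuffle(mapped))


# Hypothetical sample data: two partitions of market-basket transactions.
partitions = [
    [["bread", "milk"], ["bread", "beer"]],
    [["milk", "beer", "bread"], ["milk"]],
]
support = item_support(partitions)
```

On a real cluster the map and reduce steps would run on different nodes; the sketch only shows the data flow between the phases.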


Abstract

The invention discloses a multi-computing framework processing system applied to big data and an association rule mining method thereof. The system comprises a distributed computer cluster together with a MapReduce framework and a Spark framework that share it. The computer cluster comprises a database cluster forming the transactional database, a switch, an application server, a Hadoop cluster, and a Spark cluster; the clusters are used to distribute computational tasks. Exploiting multi-node parallel computation and distributed storage, the method combines the multi-computing framework with the association rule mining algorithm and combines transaction cluster division with a pruning strategy, which reduces the amount of data to be handled and effectively improves processing efficiency. The conditional pattern base is obtained by directly scanning values. The system substantially overcomes the limitation of shared memory, so that computing resources are allocated in a balanced manner. The method effectively addresses the insufficient computing and storage capacity of conventional item-set mining algorithms.
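The abstract does not spell out how the conditional pattern base is obtained. As a rough illustration only, the sketch below uses the standard FP-growth notion of a conditional pattern base (the prefix paths that precede an item, with their counts) and computes it by scanning the transactions directly rather than by building a tree. This is one possible reading, not the patented method; the function names, the alphabetical tie-breaking rule, and the sample data are assumptions.

```python
from collections import Counter


def frequent_item_order(transactions, min_support):
    """Rank frequent items by descending support count."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = [i for i, c in counts.items() if c >= min_support]
    # Ties are broken alphabetically so the order is deterministic
    # (an assumption of this sketch).
    frequent.sort(key=lambda i: (-counts[i], i))
    return {item: rank for rank, item in enumerate(frequent)}


def conditional_pattern_base(transactions, target, min_support):
    """Collect the prefix paths that precede `target`, with their counts."""
    rank = frequent_item_order(transactions, min_support)
    if target not in rank:
        return {}
    base = Counter()
    for t in transactions:
        # Pruning: drop infrequent items, then order by global frequency.
        ordered = sorted((i for i in set(t) if i in rank), key=rank.get)
        if target in ordered:
            prefix = tuple(ordered[:ordered.index(target)])
            if prefix:
                base[prefix] += 1
    return dict(base)


# Hypothetical sample transactions.
transactions = [
    ["bread", "milk"],
    ["bread", "beer", "eggs"],
    ["milk", "beer", "bread"],
    ["milk", "bread"],
    ["milk", "beer"],
]
base = conditional_pattern_base(transactions, "beer", min_support=2)
```

Because each transaction is processed independently, the scan could in principle be split across partitions and the per-partition counters merged afterwards, which fits the transaction division the abstract mentions.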

Description

Technical field

[0001] The invention belongs to the technical field of big data, and in particular relates to a multi-computing framework processing system applied to big data and an association rule mining method thereof.

Background

[0002] Big data refers to a collection of data that cannot be captured, managed, and processed by conventional software tools within an affordable time frame; it is a high-growth-rate, diverse information asset.

[0003] In "The Era of Big Data", Viktor Mayer-Schönberger and Kenneth Cukier explain that big data means analyzing and processing all of the data rather than taking shortcuts such as random sampling (sample surveys). Big data has the 5V characteristics (proposed by IBM): Volume (large scale), Velocity (high speed), Variety (diversity), Value, and Veracity (authenticity).

[0004] In recent years, big data has penetrated into all walks of life in society, and its rapid rise has profoundly changed people's lives and thinking p...


Application Information

IPC(8): G06F9/50
CPC: G06F9/5083
Inventors: 李彤岩, 张婷, 赵伦, 苟瀚元, 徐嘉临
Owner CHENGDU UNIV OF INFORMATION TECH