MPP (massively parallel processing)-based parallel data mining framework and MPP-based parallel data mining method

A data mining and distributed data technology, applied in electronic digital data processing, structured data retrieval, special-purpose data processing applications, and related fields. It addresses the problem that conventional data mining becomes slow, or cannot be carried out at all, on very large data sets. Its effects are improved processing efficiency and data capacity, which gives it good application prospects.

Active Publication Date: 2014-12-24
天津神舟通用数据技术有限公司
4 Cites, 12 Cited by

AI Technical Summary

Problems solved by technology

When the amount of data is particularly large, this mode becomes extremely slow and may fail outright, that is, the data mining task cannot be performed at all.

Examples

Embodiment Construction

[0030] Embodiments of the present invention will be described in further detail below in conjunction with the accompanying drawings.

[0031] An MPP-based parallel data mining architecture, as shown in Figure 1 and Figure 2, is a distributed data mining structure in which one mining engine drives multiple mining agents, that is, the E-As (Engine-Agents) mode. It is supplemented by load balancing and a nearby mining strategy based on data distribution, and its mining algorithms are designed for parallel execution using the Master-Slaver(s) operator mode.
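The Master-Slaver(s) operator mode described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the mining algorithm is a frequency count and that threads stand in for mining agent nodes; the names `slaver_operator` and `master_operator` are hypothetical and do not come from the patent.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def slaver_operator(block):
    """Hypothetical Slaver operator: count items in one data block only."""
    return Counter(block)


def master_operator(blocks):
    """Hypothetical Master operator: run one Slaver per block, then merge.

    Threads stand in for mining agent nodes here (an assumption for the
    sketch); a real MPP deployment would run Slavers on separate nodes.
    """
    workers = max(1, len(blocks))  # ThreadPoolExecutor rejects 0 workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(slaver_operator, blocks))
    total = Counter()
    for partial in partials:
        total.update(partial)  # add the partial counts into the total
    return total


# Three data blocks, as if stored on three different agent nodes.
blocks = [["a", "b", "a"], ["b", "c"], ["a", "c", "c"]]
counts = master_operator(blocks)
```

The point of the split is that each Slaver touches only its own block, so the only data that moves between nodes is the small partial result that the Master merges.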

[0032] A parallel data mining architecture based on MPP comprises a mining engine node and a plurality of distributed mining agent nodes. The mining engine node includes an engine resource supervision module, a task supervision module, a message service module, a metadata management module, an agent resource management module, a task scheduling module, a task load balancing module and a computing load balancing module; the mining...

Abstract

The invention relates to an MPP (massively parallel processing)-based parallel data mining framework and an MPP-based parallel data mining method. The framework comprises a mining engine node and a plurality of distributed mining agent nodes. In the method, the mining engine node assigns the current data mining task to a mining agent node with a lighter data mining task load, and that node serves as the Master mining agent node for the task. The Master mining agent node assigns mining subtasks to the corresponding mining agent nodes, using load balancing and a nearby mining strategy based on data distribution. Each mining agent node then executes a Slaver operator according to its allocated subtask, and each Slaver operator processes only its allocated data block. By adopting an MPP approach and combining it with the characteristics of data mining, the framework and method can process massive data efficiently and at high speed, address the small data capacity and slow running speed of traditional data mining software, and greatly improve the efficiency and data-bearing capacity of data mining algorithms on massive data.
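The scheduling steps in the abstract can be sketched as follows. This is an illustrative reading, not the patented implementation: the `Agent` class, its fields, and the tie-breaking rule are assumptions, "lighter load" is interpreted as "lightest current load", and "nearby mining" is interpreted as preferring an agent that already stores the data block.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical view of a mining agent node as seen by the engine."""
    name: str
    task_load: int = 0  # mining tasks currently assigned (assumed metric)
    local_blocks: set = field(default_factory=set)  # block IDs stored locally


def pick_master(agents):
    """Engine step: choose the agent with the lightest task load as Master."""
    master = min(agents, key=lambda a: a.task_load)
    master.task_load += 1
    return master


def assign_blocks(agents, block_ids):
    """Master step: give each block to an agent that stores it locally.

    Among the holders, the agent with the fewest blocks assigned so far
    wins (an assumed tie-break for load balancing). A block that no agent
    holds falls back to all agents.
    """
    plan = {a.name: [] for a in agents}
    for block in block_ids:
        holders = [a for a in agents if block in a.local_blocks] or agents
        target = min(holders, key=lambda a: len(plan[a.name]))
        plan[target.name].append(block)
    return plan


agents = [
    Agent("A1", task_load=2, local_blocks={"b1", "b2"}),
    Agent("A2", task_load=0, local_blocks={"b2", "b3"}),
    Agent("A3", task_load=1, local_blocks={"b3", "b4"}),
]
master = pick_master(agents)
plan = assign_blocks(agents, ["b1", "b2", "b3", "b4"])
```

Each agent would then run its Slaver operator over only the blocks listed for it in `plan`, so no data block has to be shipped to another node before it is mined.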

Description

Technical field

[0001] The invention belongs to the technical field of data mining, and in particular relates to an MPP-based parallel data mining framework and a method thereof.

Background art

[0002] With the rapid development of computer technology, and especially the continued spread of Internet technology, people's ability to generate and collect data using network information technology has greatly improved, and data volumes are growing rapidly. How to obtain the required information from massive data has become an urgent research problem. Data mining technology emerged in response to this challenge, and it is used to extract hidden, useful information from massive data. However, because of the explosive growth of data, using data mining technology to extract that information quickly and effectively is becoming more and more important.

[0003] Distributed storage system is to di...

Application Information

IPC(8): G06F17/30
CPC: G06F16/183, G06F16/1858, G06F16/24532, G06F16/24569, G06F16/27, G06F16/90335
Inventor: 卢中亮, 黄瑞, 李海峰, 苏卫卫, 刘祺, 钱勇, 苗润华, 李靖, 王文青
Owner 天津神舟通用数据技术有限公司