Data processing method and system thereof

A data processing method and system, applied in the field of communication, that addresses the low efficiency of preprocessing operations, improving efficiency and avoiding the limitation of memory capacity.

Inactive Publication Date: 2010-08-11
CHINA MOBILE COMM GRP CO LTD
Cites: 0 | Cited by: 39

AI Technical Summary

Problems solved by technology

[0007] Embodiments of the present invention provide a data processing method and a data processing system to solve the prior-art problem of inefficient preprocessing operations in the data mining process.

Examples

Embodiment Construction

[0019] The embodiment of the present invention adopts the Map/Reduce (mapping/reduction) mechanism for data processing, such as the data preprocessing operation in the data mining process. Map/Reduce is an approach to the distributed processing of massive data: it allows a program to be executed in a distributed manner on a very large cluster composed of ordinary nodes.

[0020] Embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0021] Figure 1 is a schematic diagram of the data mining preprocessing flow in the embodiment of the present invention. Once data preprocessing is started, the flow includes the following steps:

[0022] Step 101: generate multiple Map tasks, each of which is responsible for processing part of the data to be processed.

[0023] In this step, multiple Map tasks can be generated by calling the Map function. The Map function can generate multiple Map tasks according to the ...
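Step 101 can be pictured with a small sketch. The source text breaks off before stating what the Map function bases the split on, so the even split by record count used below is only an assumption made for illustration; `split_into_map_tasks` and `num_tasks` are hypothetical names that do not appear in the patent.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_map_tasks(records: Sequence[T], num_tasks: int) -> List[Sequence[T]]:
    """Divide records into at most num_tasks contiguous, non-empty parts.

    Assumption: an even split by record count. The patent text available
    here does not say which criterion is actually used.
    """
    if num_tasks < 1:
        raise ValueError("num_tasks must be at least 1")

    # Each part gets `size` records; the first `extra` parts get one more.
    size, extra = divmod(len(records), num_tasks)

    parts: List[Sequence[T]] = []
    start = 0
    for i in range(num_tasks):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty parts when there are few records
            parts.append(records[start:end])
        start = end
    return parts
```

Each returned part would then be handed to one Map task, so that no single task has to hold the whole data set.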

Abstract

The invention discloses a data processing method and a data processing system. The method of the invention comprises the following steps: concurrently executing a plurality of Map tasks, wherein each Map task acquires data of the corresponding part in the data to be processed and processes the acquired data to obtain the local processing results of the data to be processed; and executing the Reduce task, wherein the Reduce task obtains the global processing result of the data to be processed according to all the local processing results. The invention can improve the data preprocessing efficiency in the data mining process.
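The flow in the abstract (concurrent Map tasks producing local results, then one Reduce task producing the global result) can be sketched as follows. This is a minimal single-machine illustration, not the patented implementation: a thread pool stands in for cluster nodes, and the choice of summary statistics (count, sum, minimum, maximum) as the preprocessing result is an assumption. The names `LocalStats`, `map_task`, `reduce_task`, and `run_preprocessing` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, NamedTuple, Sequence


class LocalStats(NamedTuple):
    """Summary of one part of the data (also reused for the global result)."""
    count: int
    total: float
    minimum: float
    maximum: float


def map_task(part: Sequence[float]) -> LocalStats:
    """Process one non-empty part of the data and return its local result."""
    return LocalStats(len(part), sum(part), min(part), max(part))


def reduce_task(local_results: List[LocalStats]) -> LocalStats:
    """Combine all local results into the global result."""
    return LocalStats(
        count=sum(r.count for r in local_results),
        total=sum(r.total for r in local_results),
        minimum=min(r.minimum for r in local_results),
        maximum=max(r.maximum for r in local_results),
    )


def run_preprocessing(parts: Sequence[Sequence[float]]) -> LocalStats:
    """Run one Map task per part concurrently, then a single Reduce task."""
    with ThreadPoolExecutor() as pool:
        local_results = list(pool.map(map_task, parts))
    return reduce_task(local_results)
```

Because each Map task sees only its own part and returns a small summary, the Reduce task never needs the full data set in memory; it works from the local results alone.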

Description

Technical field

[0001] The invention relates to data mining technology in the field of communication, and in particular to a data processing method and system thereof.

Background art

[0002] Data mining is the process of extracting hidden, previously unknown but potentially useful information and knowledge from large volumes of incomplete, noisy, fuzzy, and random real-world application data.

[0003] The data mining process usually includes three main steps: data preprocessing (ETL), data mining algorithm implementation, and result display. In the ETL step, the source data is preprocessed to obtain the data to be mined. In the algorithm implementation step, a data mining algorithm that meets the business needs is run to obtain the analysis results. In the result display step, the results produced by the data mining algorithm are displayed to the user.

[0004] ETL (extraction, transformation, loading, and so on) is responsible for ext...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 高丹, 邓超, 徐萌, 罗治国, 周文辉, 何清, 谭庆, 马旭东, 郑诗豪, 沈亚飞, 陈磊
Owner: CHINA MOBILE COMM GRP CO LTD