
Method and system for distributed computation and querying of massive data in online analytical processing

A technology for online analytical processing and distributed computing, applied in computing, electrical digital data processing, special data processing applications, etc. It addresses the problem that the use of MapReduce for data cube computation and query tasks has not been studied.

Inactive Publication Date: 2010-05-19
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

However, the existing literature does not study how to use MapReduce for the computation and query tasks of a data cube, or how many Map and Reduce tasks allow the data cube to strike a balance between storage space and query time.



Examples


Embodiment Construction

[0027] Embodiments of the present invention will be further described below in conjunction with the accompanying drawings, but the present invention is not limited thereto.

[0028] As shown in Figure 1, the cluster system used by the invention consists mainly of a name node and data nodes. The name node divides the data into blocks, distributes the blocks to the nodes, and handles reading and writing of the blocks; that is, it manages the data nodes and schedules the distributed computing tasks. The data nodes store the data blocks and execute the Map and Reduce computing tasks.

[0029] As shown in Figure 2, the invention processes a large-capacity data set on the cluster system of Figure 1 as follows:

[0030] 1) MapReduce divides the large-capacity data set to be computed into blocks, where the size of each block equals the size of the data set divided by the number of Map tasks, and distributes the blocks to the nodes;

...
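Step 1) can be illustrated with a minimal in-memory sketch. It is not the patent's implementation: the function names `partition_dataset` and `assign_blocks`, the rounding up of the block size when the data set does not divide evenly, and the round-robin placement of blocks on nodes are all assumptions made for illustration.

```python
import math


def partition_dataset(records, num_map_tasks):
    """Split records into blocks of size len(records) / num_map_tasks.

    Assumption: the block size is rounded up so that an uneven data set
    still yields at most num_map_tasks blocks.
    """
    if num_map_tasks <= 0:
        raise ValueError("num_map_tasks must be positive")
    block_size = max(1, math.ceil(len(records) / num_map_tasks))
    return [records[i:i + block_size]
            for i in range(0, len(records), block_size)]


def assign_blocks(blocks, nodes):
    """Distribute blocks over nodes (assumption: simple round-robin)."""
    placement = {node: [] for node in nodes}
    for index, block in enumerate(blocks):
        placement[nodes[index % len(nodes)]].append(block)
    return placement


blocks = partition_dataset(list(range(10)), num_map_tasks=4)
placement = assign_blocks(blocks, ["node-1", "node-2"])
```

With 10 records and 4 Map tasks the block size is 3, so the last block holds the single remaining record.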


Abstract

The invention discloses a method and a system for distributed computation and querying of massive data in online analytical processing, in which a cluster system performs distributed pre-computation and querying of data cubes. Based on the MapReduce framework, a large-capacity data set is first partitioned into blocks that MapReduce distributes to the nodes. A Map task on each node then computes the local closed cube corresponding to its data block. Finally, Map tasks on the different nodes are started to query the local closed cubes in parallel, and the measure values they return are merged by Reduce tasks. The invention makes pre-computation and querying in online analytical processing of large-capacity data simple and effective, greatly compresses the storage space of the data cubes, and responds quickly to user queries.
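The pipeline in the abstract can be sketched in a single process. This is a simplified illustration under stated assumptions, not the patented algorithm: it assumes a SUM measure, builds each local closed cube by brute force (enumerating every cell of a block and keeping only its closure), and simulates the Map and Reduce phases with plain Python loops. All names (`local_closed_cube`, `query_local`, `distributed_query`) are hypothetical.

```python
from itertools import product

STAR = "*"  # wildcard: "all values" of a dimension


def covers(cell, dims):
    """True if every non-wildcard position of cell equals dims."""
    return all(c == STAR or c == d for c, d in zip(cell, dims))


def local_closed_cube(block):
    """Map-side precomputation: closed cells of one block -> (count, sum).

    A cell's closure replaces each wildcard by a concrete value when all
    rows covered by the cell agree on that dimension. Cells sharing a
    closure cover the same rows, so only the closure is stored.
    """
    if not block:
        return {}
    n_dims = len(block[0][0])
    cover = {}
    for dims, measure in block:
        for mask in product((False, True), repeat=n_dims):
            cell = tuple(d if keep else STAR for d, keep in zip(dims, mask))
            cover.setdefault(cell, []).append((dims, measure))
    closed = {}
    for rows in cover.values():
        first = rows[0][0]
        closure = tuple(
            first[i] if all(r[0][i] == first[i] for r in rows) else STAR
            for i in range(n_dims)
        )
        closed[closure] = (len(rows), sum(m for _, m in rows))
    return closed


def query_local(closed, query):
    """Map-side query: partial SUM for query from one local closed cube.

    The closure of the query is the matching closed cell covering the
    most rows, so the candidate with the largest count is selected.
    """
    best_count, best_sum = 0, 0
    for cell, (count, total) in closed.items():
        if covers(query, cell) and count > best_count:
            best_count, best_sum = count, total
    return best_sum


def distributed_query(blocks, query):
    """Simulated Map (per-block query) followed by Reduce (merge by sum)."""
    cubes = [local_closed_cube(block) for block in blocks]   # pre-computation
    partials = [query_local(cube, query) for cube in cubes]  # Map phase
    return sum(partials)                                     # Reduce phase


rows = [
    (("a1", "b1", "c1"), 10),
    (("a1", "b1", "c2"), 20),
    (("a1", "b2", "c1"), 5),
    (("a2", "b1", "c1"), 7),
]
blocks = [rows[:2], rows[2:]]
total_a1 = distributed_query(blocks, ("a1", STAR, STAR))
```

In this toy data the first block has 12 distinct cells but only 3 closed cells, which hints at the storage compression the abstract describes. Merging by addition is valid here only because SUM is a distributive measure; other measures would need a different Reduce step.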

Description

Technical field

[0001] The invention relates to a method and system for distributed pre-computation and querying in online analytical processing (OLAP), in particular for OLAP over massive data.

Background art

[0002] OLAP has been an active research area in recent years. It is built around the dimensional model, that is, the data cube, is oriented toward analysis, and uses pre-aggregation to give users online data analysis from multiple perspectives. However, as the Internet continues to develop and user requirements grow more complex, high-dimensional, large-capacity data causes an explosion in the size of the data cube. Compressing the cube effectively and computing it quickly has become a major challenge for OLAP.

[0003] Researchers have proposed many data cube compression algorithms. Yannis Sismanis et al. proposed the Dwarf cube in 2002, which eliminates storage redundancy by identifying shared prefixes and shared suffixes. Laks V.S. Lakshmanan, Jian Pei et al. ...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
Inventors: 奚建清, 游进国, 陈虎, 张平建
Owner: SOUTH CHINA UNIV OF TECH