On-line analytical processing (OLAP) massive multidimensional data dimension storage method

A storage method for massive multidimensional data, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses the inability of earlier storage techniques to adapt to the query and analysis of massive multidimensional data, the space occupied by preprocessed data, and the lack of indexing mechanisms. Its effects are avoiding brute-force scanning, fast calculation and positioning, and saving I/O overhead.

Inactive
Publication Date: 2013-03-20
SOUTHEAST UNIV
2 Cites, 36 Cited by

AI Technical Summary

Problems solved by technology

Row-oriented ROLAP often needs to scan entire rows of data, which hurts overall query efficiency. MOLAP stores pre-aggregated data in multidimensional arrays and can respond quickly to OLAP aggregation calculations, but its update cost is high and the space occupied by pre-processed data grows exponentially with increasing dimensionality.
Earlier OLAP storage technology cannot keep up with the growing demand for query and analysis of massive multidimensional OLAP data. Some scholars have proposed the Hadoop-based massive data warehouse systems Hive and Pig for OLAP massive data analysis.
Pig can process data in parallel, but it still uses row-oriented storage and therefore faces brute-force scanning of entire rows. Hive can avoid full-row retrieval, but it lacks an effective indexing mechanism.
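
To make the exponential growth of MOLAP pre-aggregation concrete: a full cube over n dimensions, ignoring hierarchy levels, has 2^n possible group-by combinations (cuboids). The following is a minimal sketch of that count; the function names and the dimension names are hypothetical and do not come from the patent.

```python
from itertools import combinations


def count_cuboids(num_dimensions: int) -> int:
    """Number of group-by combinations in a full cube without hierarchies."""
    return 2 ** num_dimensions


def enumerate_cuboids(dimensions):
    """List every subset of dimensions that a full cube would pre-aggregate."""
    return [
        combo
        for size in range(len(dimensions) + 1)
        for combo in combinations(dimensions, size)
    ]


if __name__ == "__main__":
    # Hypothetical dimension names, for illustration only.
    dims = ["time", "region", "product"]
    for cuboid in enumerate_cuboids(dims):
        print(cuboid)
    print(count_cuboids(len(dims)), "cuboids for", len(dims), "dimensions")
```

Each added dimension doubles the number of cuboids, which is why pre-computing all of them becomes impractical as dimensionality grows.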


Examples

Embodiment Construction

[0021] The present invention is further illustrated below with reference to a specific embodiment. It should be understood that this embodiment only illustrates the invention and is not intended to limit its scope. After reading the present disclosure, modifications made by those skilled in the art to various equivalent forms of the invention all fall within the scope defined by the appended claims of the present application.

[0022] In this embodiment, the form of the source data is shown in Attached Table 1. It includes a TID column, dimension hierarchy attribute columns, and a measure column. TID indicates the position at which a dimension hierarchy attribute value appears in the original base table, quantity is the measure column, and the dimension hierarchy attribute columns lie between TID and quantity.
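
The source layout described in [0022], and a per-dimension split of it, might be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: Attached Table 1 is not reproduced here, so the column names other than TID and quantity, the grouping of columns into dimensions, and the sample rows are all hypothetical. Keeping the measure in a separate TID-keyed map is likewise an assumption.

```python
# Hypothetical schema: TID first, quantity last, hierarchy attributes between.
COLUMNS = ["TID", "year", "month", "country", "city", "quantity"]

# Hypothetical grouping of attribute columns into dimensions, coarse to fine.
DIMENSIONS = {"time": ["year", "month"], "region": ["country", "city"]}

# Hypothetical sample rows, not taken from the patent.
ROWS = [
    (1, 2012, 1, "CN", "Nanjing", 10),
    (2, 2012, 2, "CN", "Nanjing", 5),
    (3, 2012, 1, "US", "Boston", 7),
]


def split_by_dimension(rows, columns, dimensions, measure="quantity"):
    """Split base-table rows into one (TID, hierarchy path) list per dimension.

    The measure is kept in a separate TID-keyed dict (an assumption), so a
    query touching one dimension never reads the other dimensions' columns.
    """
    position = {name: i for i, name in enumerate(columns)}
    dim_tables = {dim: [] for dim in dimensions}
    measures = {}
    for row in rows:
        tid = row[position["TID"]]
        for dim, levels in dimensions.items():
            path = tuple(row[position[level]] for level in levels)
            dim_tables[dim].append((tid, path))
        measures[tid] = row[position[measure]]
    return dim_tables, measures


if __name__ == "__main__":
    tables, measures = split_by_dimension(ROWS, COLUMNS, DIMENSIONS)
    print(tables["time"])
    print(tables["region"])
    print(measures)
```

An aggregation over the time dimension would then read only `tables["time"]` and the measure map, leaving the region data untouched.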

[0023] 1) Split by dimension

[0024] In view of the characteristic that OLAP analysis usually performs aggregation ...

Abstract

The invention discloses a dimension storage method for massive multidimensional data in on-line analytical processing (OLAP). First, the OLAP multidimensional data are divided by dimension, a dimension hierarchical encoding is built, and an HDFile (HDFS Dimension File) dimension storage file structure is designed, so that an aggregation calculation only needs to access the data of the relevant dimensions and retrieval of unrelated data is avoided. Second, a B+ tree index based on the dimension hierarchical encoding is built to locate the dimension storage data quickly, which saves I/O overhead. Finally, an efficient parallel query algorithm is designed to further improve OLAP query efficiency. The result is an efficient, easy-to-use, and scalable dimension storage method for massive multidimensional OLAP data, intended for massive data analysis applications such as scientific experimental statistics, environmental meteorology, and bioinformatics computing.
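
The idea of locating dimension data through an index over hierarchical codes can be sketched as below. The abstract does not give the encoding scheme, so this sketch assumes a fixed-width code per hierarchy level, concatenated from coarse to fine, so that all members under one ancestor occupy a contiguous code range. A sorted list searched with `bisect` stands in for the B+ tree; it is not a B+ tree, but it supports the same ordered range lookup. All class, function, and parameter names are hypothetical.

```python
from bisect import bisect_left, bisect_right

LEVEL_BITS = 8  # Assumption: each level has fewer than 256 distinct values.


class HierarchyEncoder:
    """Assigns integer codes per level and packs a path into one integer."""

    def __init__(self, num_levels):
        self.num_levels = num_levels
        self.codes = [dict() for _ in range(num_levels)]

    def _code(self, level, value):
        table = self.codes[level]
        if value not in table:
            table[value] = len(table) + 1  # Code 0 is left unused.
        return table[value]

    def encode(self, path):
        """Pack a (possibly partial) path, coarsest level first."""
        code = 0
        for level, value in enumerate(path):
            code = (code << LEVEL_BITS) | self._code(level, value)
        return code

    def prefix_range(self, prefix):
        """Inclusive code range covering every full path under the prefix."""
        remaining_bits = (self.num_levels - len(prefix)) * LEVEL_BITS
        low = self.encode(prefix) << remaining_bits
        high = low | ((1 << remaining_bits) - 1)
        return low, high


def build_index(encoder, dim_rows):
    """dim_rows: iterable of (tid, full_path). Returns sorted (code, tid)."""
    return sorted((encoder.encode(path), tid) for tid, path in dim_rows)


def lookup(encoder, index, prefix):
    """Return the TIDs whose path starts with prefix, via binary search."""
    low, high = encoder.prefix_range(prefix)
    keys = [code for code, _ in index]
    start = bisect_left(keys, low)
    end = bisect_right(keys, high)
    return [tid for _, tid in index[start:end]]


if __name__ == "__main__":
    # Hypothetical (TID, (year, month)) rows for a time dimension.
    time_rows = [(1, (2012, 1)), (2, (2012, 2)), (3, (2013, 1)), (4, (2012, 1))]
    encoder = HierarchyEncoder(num_levels=2)
    index = build_index(encoder, time_rows)
    print(sorted(lookup(encoder, index, (2012,))))
    print(sorted(lookup(encoder, index, (2012, 1))))
```

Because a query at a coarse level (for example, one year) maps to a single contiguous range of codes, the lookup touches only the matching entries instead of scanning the whole dimension file.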

Description

Technical field

[0001] The invention relates to a dimension storage method for OLAP massive multidimensional data. It is suitable for fast analysis of massive multidimensional data in an OLAP system, and in particular can distinguish the hierarchical characteristics of dimensions in OLAP analysis.

Background technique

[0002] First, the abbreviations and terms used in the present invention are explained:
[0003] OLAP: Online Analytical Processing;
[0004] ROLAP: Relational OLAP;
[0005] MOLAP: Multidimensional OLAP;
[0006] Hadoop: a distributed system infrastructure;
[0007] Hive: a data warehouse tool based on Hadoop;
[0008] Pig: a dataflow language and runtime for querying very large datasets;
[0009] HDFS: Hadoop Distributed File System;
[0010] HDFile: HDFS Dimension File, a distributed dimension storage file;
[0011] MapReduce: a parallel pro...

Application Information

IPC(8): G06F17/30
Inventor: 宋爱波, 何战国, 罗军舟
Owner: SOUTHEAST UNIV