Big data multidimensional data analysis method and system based on Hadoop and HBase

A multidimensional data analysis method, applied in the database field, that addresses low query efficiency, the lack of efficient interactive queries, and architectures that only support vertical scaling, with the effect of reducing data analysis time.

Pending Publication Date: 2019-10-18
LINEWELL SOFTWARE

AI Technical Summary

Problems solved by technology

[0004] Traditional databases and big data platforms have hit a bottleneck. A traditional architecture only supports vertical scaling: data processing capacity is increased by adding hardware resources such as memory and CPU to a single machine. Data volumes, however, grow exponentially, so single-machine scaling soon reaches its limit.
The Hadoop big data platform can store and process large-scale data, but it cannot provide efficient interactive queries, and its query efficiency is low.

Examples

Embodiment Construction

[0030] As shown in figures 1, 3 and 4, the big data multidimensional data analysis method based on Hadoop and HBase of the present invention comprises the following steps:

[0031] Step 1: define a multidimensional data model. The dimensions of the model are selected, according to the user's needs, from the fields of the source data table (for example, fields A, B, C, and D as four dimensions); the data source can be Hive, Kafka, or an RDBMS (relational database). Then configure the required analysis metrics, which include one or more of SUM, MIN, MAX, COUNT, and TOP_N;
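A model definition of the kind described in Step 1 might be sketched as follows. This is only an illustration under stated assumptions: the class names `Measure` and `CubeModel`, the table name `sales`, and the column name `amount` are hypothetical and do not come from the patent text. Only the dimension fields A to D and the metric names are taken from the description.

```python
from dataclasses import dataclass

# Metric names listed in the description; everything else here is hypothetical.
SUPPORTED_METRICS = {"SUM", "MIN", "MAX", "COUNT", "TOP_N"}


@dataclass(frozen=True)
class Measure:
    """One analysis metric: an aggregation applied to a source column."""
    column: str
    function: str

    def __post_init__(self):
        if self.function not in SUPPORTED_METRICS:
            raise ValueError(f"unsupported metric: {self.function}")


@dataclass(frozen=True)
class CubeModel:
    """A multidimensional model: dimensions chosen from source-table fields."""
    source_table: str
    dimensions: tuple
    measures: tuple

    def __post_init__(self):
        if len(set(self.dimensions)) != len(self.dimensions):
            raise ValueError("dimensions must be unique")


# Hypothetical example: four dimensions A, B, C, D over an assumed "sales" table.
model = CubeModel(
    source_table="sales",
    dimensions=("A", "B", "C", "D"),
    measures=(Measure("amount", "SUM"), Measure("amount", "COUNT")),
)
```

Validating the metric name at construction time keeps an unsupported aggregation from reaching the cube-building stage.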

[0032] Step 2: according to the number of dimensions n of the defined multidimensional analysis data model, read the corresponding data from the data source, compute over the data using the MapReduce framework (or Spark) on the Hadoop platform to obtain a data cube of 2^n dimension combinations, and save the data cube in the HBase database, such as using the Map...
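To make the 2^n figure concrete: n dimensions have 2^n subsets, and each subset is one grouping (cuboid) of the cube. The sketch below enumerates those subsets in plain Python and aggregates SUM and COUNT for each. It is an in-memory stand-in, not the MapReduce or Spark job the description refers to, and the function name `build_cube` and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Aggregate `measure` for every subset of `dimensions`.

    Returns {cuboid_dims: {dim_values: {"SUM": ..., "COUNT": ...}}}.
    """
    cube = {}
    for k in range(len(dimensions) + 1):
        # Every k-element subset of the dimensions is one cuboid.
        for dims in combinations(dimensions, k):
            cells = defaultdict(lambda: {"SUM": 0, "COUNT": 0})
            for row in rows:
                key = tuple(row[d] for d in dims)
                cells[key]["SUM"] += row[measure]
                cells[key]["COUNT"] += 1
            cube[dims] = dict(cells)
    return cube


# Hypothetical sample data with n = 2 dimensions.
rows = [
    {"A": "x", "B": "p", "amount": 10},
    {"A": "x", "B": "q", "amount": 5},
    {"A": "y", "B": "p", "amount": 7},
]
cube = build_cube(rows, ("A", "B"), "amount")
```

With two dimensions the result holds four cuboids: `()`, `("A",)`, `("B",)`, and `("A", "B")`. The empty cuboid carries the grand total.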

Abstract

The invention provides a big data multidimensional data analysis method based on Hadoop and HBase. The method comprises the following steps: defining a multidimensional data model, whose dimensions are selected from the fields of a source data table according to the user's needs, and configuring the required analysis metrics; reading the corresponding data from the data source according to the number of dimensions n of the defined model, computing over the data with Hadoop to obtain a data cube of 2^n dimension combinations, and storing the data cube in an HBase database; and accepting a database query statement, forwarding it through a REST service layer to a query engine, which converts it into the corresponding API calls for HBase queries, and then, through routing selection based on those API calls and HBase's query mechanism, retrieving the data or data set in HBase that satisfies the query conditions. The invention further provides a big data multidimensional data analysis system based on Hadoop and HBase. The method and system support real-time querying, grouping, and aggregation over very large data sets and improve data processing efficiency.
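The query path in the abstract (query statement, query engine, routing, HBase lookup) can be illustrated with a toy router. The abstract does not specify a row-key layout or a routing rule, so everything below is an assumption: the row key is a zero-padded cuboid bitmask followed by `|`-separated dimension values, a Python dict stands in for the HBase table, and a prefix match over sorted keys stands in for an HBase prefix scan. No real HBase API is called, and values containing `|` are not handled.

```python
ALL_DIMS = ("A", "B", "C")  # hypothetical dimension order


def cuboid_mask(dims):
    """Encode a set of dimensions as a bitmask over ALL_DIMS."""
    return sum(1 << i for i, d in enumerate(ALL_DIMS) if d in dims)


def make_row_key(values_by_dim):
    """Assumed key layout: '<mask>|<value>|<value>...' in ALL_DIMS order."""
    dims = [d for d in ALL_DIMS if d in values_by_dim]
    prefix = f"{cuboid_mask(dims):03d}|"
    return prefix + "|".join(str(values_by_dim[d]) for d in dims)


def query(store, group_by, filters):
    """Route to the cuboid covering group_by and filter dims, then scan it."""
    needed = set(group_by) | set(filters)
    dims = [d for d in ALL_DIMS if d in needed]
    prefix = f"{cuboid_mask(needed):03d}|"
    result = {}
    for key in sorted(store):  # stands in for an HBase prefix scan
        if not key.startswith(prefix):
            continue
        values = dict(zip(dims, key.split("|")[1:]))
        if all(values[d] == v for d, v in filters.items()):
            group = tuple(values[d] for d in group_by)
            result[group] = result.get(group, 0) + store[key]
    return result


# Precomputed SUM cells for two cuboids (hypothetical data).
store = {
    make_row_key({"A": "x", "B": "p"}): 10,
    make_row_key({"A": "x", "B": "q"}): 5,
    make_row_key({"A": "y", "B": "p"}): 7,
    make_row_key({"A": "x"}): 15,
    make_row_key({"A": "y"}): 7,
}
by_b_where_a_is_x = query(store, group_by=("B",), filters={"A": "x"})
```

Putting the cuboid identifier at the front of the key means all cells of one cuboid sort together, so a grouped query touches only the precomputed cells it needs instead of the raw data.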

Description

Technical field

[0001] The invention relates to the field of databases, in particular to a Hadoop and HBase-based big data multidimensional data analysis method and system.

Background

[0002] Hadoop is an open source Apache framework for processing large data sets. The Hadoop framework provides a distributed storage and computing environment across computer clusters. Hadoop is designed to scale from a single server to thousands of machines, each of which provides local computation and storage. At its core, Hadoop has two main layers: a processing/computing layer (MapReduce) and a storage layer (Hadoop Distributed File System).

[0003] HBase is a highly reliable, high-performance, column-oriented, scalable distributed NoSQL database. Its design goal is to overcome the theoretical and implementation limitations of relational databases when processing massive data. The number of records (rows) in each table in HBase can re...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/28, G06F16/27, G06F16/21
CPC: G06F16/284, G06F16/27, G06F16/212
Inventor: 蔡剑齐, 蔡炜榕
Owner: LINEWELL SOFTWARE