Big-data parallel processing system and method based on column storage

A parallel processing technology for big data, applied in the field of big data processing, that addresses problems such as the inability to provide a structured query language (SQL) interface.

Active Publication Date: 2014-03-19
WUHAN POST & TELECOMM RES INST CO LTD

AI Technical Summary

Problems solved by technology

However, the disadvantage of HBase is that it can



Embodiment Construction

[0021] The present invention will be described in further detail below in conjunction with the accompanying drawings and embodiments.

[0022] As shown in Figure 1, the column-storage-based big data parallel processing system of the present invention adopts a distributed architecture comprising a client, a master control node and a plurality of data nodes. Each node can be built from an ordinary x86 server, and the number of data nodes can be expanded linearly according to business needs. The master control node includes an HBase master controller and a SQL master engine. The HBase master controller is responsible for managing and maintaining the data nodes. The SQL master engine is responsible for parsing SQL statements and then distributing them to the data nodes. Each data node includes an HBase partition node and a SQL slave engine. The HBase partition node is responsible for the storage and management of data on each data node, and the SQL slave engine is responsible for the parsing and ...
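The division of labor between the master control node and the data nodes can be illustrated with a small sketch. It is not the patented implementation: the class names `DataNode` and `MasterNode`, the `events` table and its columns, and the use of an in-memory SQLite database as a stand-in for the SQL slave engine and HBase storage are all assumptions made for illustration. The merge step on the master, which sums partial results, is likewise an assumption that only suits a `SUM ... GROUP BY` query.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


class DataNode:
    """Hypothetical data node: local rows plus a stand-in SQL slave engine."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = rows  # the partition of the table held by this node

    def execute(self, sql):
        # Assumption: SQLite in memory replaces HBase storage and the slave engine.
        # A fresh connection per call keeps each worker thread isolated.
        conn = sqlite3.connect(":memory:")
        try:
            conn.execute("CREATE TABLE events (user_id INTEGER, bytes INTEGER)")
            conn.executemany("INSERT INTO events VALUES (?, ?)", self.rows)
            return conn.execute(sql).fetchall()
        finally:
            conn.close()


class MasterNode:
    """Hypothetical master: sends one statement to every data node in parallel."""

    def __init__(self, data_nodes):
        self.data_nodes = data_nodes

    def query_sum_by_key(self, sql):
        # Distribute the same SQL statement to all data nodes.
        with ThreadPoolExecutor(max_workers=len(self.data_nodes)) as pool:
            partials = list(pool.map(lambda node: node.execute(sql), self.data_nodes))
        # Merge step (assumption): add up the per-node partial sums per key.
        totals = {}
        for partial in partials:
            for key, value in partial:
                totals[key] = totals.get(key, 0) + value
        return sorted(totals.items())


nodes = [
    DataNode("node-1", [(1, 100), (2, 50)]),
    DataNode("node-2", [(1, 25), (3, 10)]),
    DataNode("node-3", [(2, 5)]),
]
master = MasterNode(nodes)
result = master.query_sum_by_key(
    "SELECT user_id, SUM(bytes) FROM events GROUP BY user_id"
)
print(result)
```

Adding a fourth `DataNode` to the list requires no change to `MasterNode`, which loosely mirrors the point that data nodes can be added as business needs grow.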


Abstract

The invention provides a big data parallel processing system and method based on column storage and relates to the field of big data processing. The system comprises a client, a master control node and multiple data nodes. The master control node comprises an HBase master controller and a SQL master engine: the HBase master controller manages and maintains the data nodes, while the SQL master engine parses SQL statements and then distributes them to the data nodes. Each data node comprises an HBase partition node and a SQL slave engine: the HBase partition node handles data storage and management on that data node, and the SQL slave engine parses and executes SQL on that data node. The master control node and the data nodes contain HBase tables and, in addition, SQL tables. The system and method are suited to environments with large-scale data volumes and provide a complete SQL interface over a structured relational data model.

Description

Technical field

[0001] The invention relates to the field of big data processing, in particular to a system and method for parallel processing of big data based on column storage.

Background

[0002] With the spread of the mobile Internet, smart terminals, the Internet of Things, cloud computing and smart cities, people have gradually entered the era of "big data". Big data refers to very large and complex data sets. When the amount of data reaches the PB, EB or ZB level, traditional database management tools run into many problems in acquisition, storage, retrieval and analysis. Big data places new demands on databases: highly concurrent reads and writes, efficient storage of and access to massive data, and high scalability and high availability. Traditional database and data warehouse technologies cannot meet these requirements.

[0003] Hadoop is an open source software framework capa...


Application Information

IPC(8): G06F17/30
CPC: G06F16/221, G06F16/284
Inventor: 郝俊瑞, 向智宇, 高汉松, 唐业祎, 郭嘉, 许德玮, 王静
Owner: WUHAN POST & TELECOMM RES INST CO LTD