Big-data parallel computing method and system based on distributed columnar storage

A distributed columnar storage and parallel computing technology, applied in the field of big data processing, that addresses slow computing speed, reduces time consumption, improves data query efficiency, and supports real-time query analysis.

Inactive Publication Date: 2017-11-07

SOUTH CHINA UNIV OF TECH

4 Cites, 44 Cited by

Problems solved by technology

[0005] In addition, traditional serial computing can no longer meet the needs of real-time query and analysis of big data. Serial computing requires tasks to be executed one at a time, in chronological or priority order. This prevents existing multi-core, multi-threaded CPUs and distributed processing architectures from using their ability to handle multiple tasks at the same time, so computation is slow.
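The contrast between serial and parallel execution described in [0005] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the partitioned data and the function names `partial_sum`, `serial_total`, and `parallel_total` are assumptions made for the example, and a thread pool from the Python standard library stands in for a multi-core or cluster scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(partition):
    """Aggregate one partition independently of the others."""
    return sum(partition)


def serial_total(partitions):
    """Serial model: partitions are processed one by one, in order."""
    total = 0
    for partition in partitions:
        total += partial_sum(partition)
    return total


def parallel_total(partitions, workers=4):
    """Parallel model: all partitions are submitted, then partial results are combined."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, partitions))


# Hypothetical data: 8 partitions of 1000 integers each.
partitions = [list(range(i * 1000, (i + 1) * 1000)) for i in range(8)]
```

Both functions return the same total; only the scheduling differs. In CPython, threads do not speed up CPU-bound work because of the global interpreter lock, so a real deployment would rely on separate processes or on a cluster framework such as the Spark framework the patent names.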



Embodiment Construction

[0037] The present invention will be further described below in conjunction with specific examples.

[0038] The big-data parallel computing method and system based on distributed columnar storage provided by this embodiment makes full use of the in-memory query performance of the clustered cloud servers and the advantages of columnar storage. It avoids both the latency caused by reading data directly from the HDFS file system at query time and the redundant data transmission caused by row storage, which greatly improves data reading efficiency. In addition, the solution runs a Spark-based parallel computing framework on top of NoSQL-based columnar storage to further improve the efficiency of real-time query analysis through parallel computing. At the same time, because distributed clusters are scalable, the distributed architecture can meet the elastic scaling requirements of massive data storage. The hierarchical structure of this scheme is as follows ...
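As an aside on the row-versus-column point in [0038], the sketch below shows why a query over a single field reads less data from a columnar layout. It is an illustration under stated assumptions, not the patent's storage format: the records, the field names, and the helper functions `to_columns`, `sum_row_store`, and `sum_column_store` are all hypothetical.

```python
# Hypothetical records; field names are illustrative only.
rows = [
    {"id": 1, "region": "north", "amount": 120.0},
    {"id": 2, "region": "south", "amount": 80.0},
    {"id": 3, "region": "north", "amount": 50.0},
]


def to_columns(records):
    """Pivot row-oriented records into one list per field."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


def sum_row_store(records, field):
    """Row layout: every full record is visited to extract one field."""
    values_touched = sum(len(record) for record in records)
    return sum(record[field] for record in records), values_touched


def sum_column_store(columns, field):
    """Column layout: only the requested column is read."""
    column = columns[field]
    return sum(column), len(column)


columns = to_columns(rows)
row_result = sum_row_store(rows, "amount")
col_result = sum_column_store(columns, "amount")
```

Both layouts yield the same sum of 250.0, but the row layout touches all 9 stored values while the column layout touches only the 3 values in the `amount` column. The unneeded values read in the row case correspond to the redundant transmission the paragraph mentions.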


Abstract

The invention discloses a big-data parallel computing method and system based on distributed columnar storage. The data currently accessed most often is held in memory-based NoSQL columnar storage, which acts as a cache and enables fast data queries. A distributed cluster architecture meets big-data storage demands and allows storage capacity to scale dynamically. A Spark-based parallel computing framework performs data analysis and runs business-layer operations in parallel, increasing computing speed. A chart engine provides real-time data visualization for rolling analysis on large screens. The method and system make full use of the in-memory processing performance and parallel computing advantages of distributed cloud servers, overcome the bottlenecks of a single server and of serial computing, avoid redundant data transmission between data nodes, improve the system's real-time response speed, and achieve fast big-data analysis.
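The caching role the abstract assigns to the in-memory columnar store can be sketched as below. This is a simplified stand-in, not the patent's mechanism: the class `HotColumnCache`, its `loader` callback, the `backing_store` dictionary, and the least-recently-used eviction policy are all assumptions chosen for illustration. The abstract only says that the most often accessed data is kept in memory.

```python
from collections import OrderedDict


class HotColumnCache:
    """Keep recently used columns in memory; fall back to a slow loader."""

    def __init__(self, loader, capacity=2):
        self._loader = loader          # stand-in for a read from disk storage
        self._capacity = capacity
        self._columns = OrderedDict()  # column name -> list of values
        self.misses = 0

    def get(self, name):
        if name in self._columns:
            self._columns.move_to_end(name)    # mark as most recently used
            return self._columns[name]
        self.misses += 1
        values = self._loader(name)
        self._columns[name] = values
        if len(self._columns) > self._capacity:
            self._columns.popitem(last=False)  # evict least recently used
        return values


# Hypothetical backing data standing in for files on a distributed file system.
backing_store = {
    "amount": [120.0, 80.0, 50.0],
    "id": [1, 2, 3],
    "region": ["north", "south", "north"],
}
cache = HotColumnCache(loader=lambda name: backing_store[name])
for name in ["amount", "amount", "id", "amount", "region", "amount"]:
    cache.get(name)
```

In this access sequence the frequently requested `amount` column is loaded once and then served from memory, so six lookups cause only three loads from the backing store.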

Description

Technical field

[0001] The present invention relates to the technical field of big data processing, in particular to a big-data parallel computing method and system based on distributed columnar storage.

Background art

[0002] The rapid development of the Internet and the continual upgrading and replacement of hardware have caused the data held by organizations such as governments and enterprises to grow explosively and gradually reach massive scale. Traditional relational databases, which operate mainly on tables and data rows, increasingly fail to meet user needs for storing and processing massive data, and can even restrict such storage and processing. Relying solely on traditional storage technology is therefore no longer sufficient. It is necessary to establish a new big data storage technology based on traditional processing technology to ensure that data sto...


Application Information


IPC(8): G06F17/30, G06F9/50

CPC: G06F9/5088, G06F16/2219, G06F16/24532

Inventor: 张星明, 陈霖, 王昊翔, 梁桂煌, 古振威, 吴世豪

Owner: SOUTH CHINA UNIV OF TECH
