Big data distributed real-time query method and system

A distributed real-time query method, applied in electrical digital data processing, special data processing applications, instruments, etc., that addresses problems affecting the query performance of distributed real-time query systems, with the effect of reducing query pressure and improving query performance and efficiency.

Inactive Publication Date: 2017-10-27
SOUTH CHINA UNIV OF TECH
4 Cites, 16 Cited by

AI Technical Summary

Problems solved by technology

Therefore, how to schedule and minimize IO scheduling will be an important bottlene


Examples


Embodiment Construction

[0038] The present invention will be further described below in conjunction with specific examples.

[0039] In the big data distributed real-time query method provided by this embodiment, the query statements and query script files obtained through SQL APP and ODBC / JDBC are first parsed and analyzed. A query script file is divided into multiple query statements, which are processed separately. A single query statement is divided by keywords such as limit, group by, and join on. The query coordinator then represents each part of the division as a node corresponding to a distinct operation and generates a basic query plan tree according to the execution order of the nodes, as shown in Figure 4. The root node of the query plan tree aggregates the results and returns them, while the leaf nodes are generally Scan operations that obtain the relevant table data.
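The parsing steps of paragraph [0039] can be illustrated with a minimal Python sketch. It is not the implementation of the invention: the keyword list, the execution order, the node names (`Result`, `Project`, `Filter` and so on) and the helper functions are all assumptions introduced for illustration. The splitting is deliberately naive and assumes a statement with a `from` clause, at most one `join ... on`, and no keywords or semicolons inside string literals.

```python
import re
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed keyword set; the source names only limit, group by and join on as examples.
KEYWORDS = ["select", "from", "join", "on", "where", "group by", "limit"]
# Assumed execution order used to arrange the nodes from leaf to root.
EXECUTION_ORDER = ["from", "join", "on", "where", "group by", "select", "limit"]
# Hypothetical operator names for the non-scan, non-join clauses.
OPS = {"where": "Filter", "group by": "GroupBy", "select": "Project", "limit": "Limit"}


@dataclass
class PlanNode:
    op: str
    detail: str = ""
    children: List["PlanNode"] = field(default_factory=list)


def split_script(script: str) -> List[str]:
    """Divide a query script file into individual statements (naive split on ';')."""
    return [s.strip() for s in script.split(";") if s.strip()]


def split_by_keywords(statement: str) -> List[Tuple[str, str]]:
    """Divide one statement into (keyword, clause body) pairs."""
    alternatives = "|".join(k.replace(" ", r"\s+") for k in KEYWORDS)
    parts = re.split(r"\b(" + alternatives + r")\b", statement, flags=re.IGNORECASE)
    # With a capturing group, parts is [prefix, kw1, body1, kw2, body2, ...].
    return [(" ".join(kw.lower().split()), body.strip())
            for kw, body in zip(parts[1::2], parts[2::2])]


def build_plan_tree(statement: str) -> PlanNode:
    """Build a basic plan tree: Scan leaves at the bottom, a result node at the root."""
    clauses = sorted(split_by_keywords(statement),
                     key=lambda c: EXECUTION_ORDER.index(c[0]))
    node = None
    for kw, body in clauses:
        if kw == "from":
            node = PlanNode("Scan", body)
        elif kw == "join":
            node = PlanNode("Join", "", [node, PlanNode("Scan", body)])
        elif kw == "on":
            node.detail = body  # attach the join condition to the Join node
        else:
            node = PlanNode(OPS[kw], body, [node])
    # The root aggregates the results and returns them.
    return PlanNode("Result", "", [node])


def leaves(node: PlanNode) -> List[PlanNode]:
    """Collect the leaf nodes of a plan tree, left to right."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]
```

For a statement such as `SELECT a, count(*) FROM t1 JOIN t2 ON t1.id = t2.id GROUP BY a LIMIT 10`, the sketch yields a chain Result, Limit, Project, GroupBy, Join, with two Scan leaves for `t1` and `t2`.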

[0040] For the query plan tree with low complex...


Abstract

The present invention discloses a big data distributed real-time query method and system. In the method, a real-time query processing flow is defined as a query plan tree containing multiple execution phases. Query plan trees with low complexity are executed in parallel in independent threads; query plan trees with high complexity are allocated multiple operation threads, and their execution phases are processed in parallel across a distributed cloud server cluster. The distributed real-time query system integrates the query result sets from each server, returns the final results, and uses NoSQL to cache-optimize data query operations. The method and system take full advantage of the real-time query processing performance of the cloud servers, break through the performance bottleneck of a single server, and avoid redundant data access between the servers and the HDFS data nodes, thereby improving the efficiency of distributed real-time queries.
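The scheduling idea in the abstract can be sketched roughly as follows, under several stated assumptions. The complexity measure (the number of work units in a plan), the threshold value, the `execute` function and the in-memory dictionary standing in for a NoSQL cache are all hypothetical; the source does not specify how complexity is measured or which NoSQL store is used. Each work unit is modeled as a plain callable that returns a partial result set.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical: one unit of work that returns a partial result set.
WorkUnit = Callable[[], List[int]]

# Assumed cut-off between "low" and "high" complexity plans.
COMPLEXITY_THRESHOLD = 2

# In-memory dict standing in for a NoSQL cache (an assumption, not the patent's store).
cache: Dict[str, List[int]] = {}


def execute(query_key: str, units: List[WorkUnit], max_threads: int = 4) -> List[int]:
    """Run a plan's work units and merge their partial results, with caching."""
    if query_key in cache:
        # Cache hit: skip execution and the underlying data access entirely.
        return cache[query_key]
    if len(units) <= COMPLEXITY_THRESHOLD:
        workers = 1  # low complexity: a single independent thread
    else:
        workers = min(max_threads, len(units))  # high complexity: several threads
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(lambda unit: unit(), units))
    # Integrate the partial result sets into one final result.
    merged = sorted(x for part in partial_results for x in part)
    cache[query_key] = merged
    return merged
```

Calling `execute` twice with the same `query_key` runs the work units only once; the second call is served from the cache, which mirrors the stated goal of avoiding redundant data access.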

Description

Technical field

[0001] The invention relates to the technical field of big data processing, and in particular to a distributed real-time query method and system for big data.

Background technique

[0002] With the rapid development and popularization of computer and information technology, the scale of industrial application systems has expanded rapidly, and the data generated by industrial applications has grown explosively. Industry and enterprise big data, which can easily reach hundreds of terabytes or even tens to hundreds of petabytes, has far exceeded the processing capabilities of existing traditional computing technologies and information systems. Seeking effective big data processing technologies, methods and means has therefore become an urgent real-world need. If the rapid development of big data technology can be used to mine and statistically analyze data in various big data business fields, it can provide a reference for decision makers in all wal...


Application Information

IPC(8): G06F17/30
CPC: G06F16/182, G06F16/2471
Inventor: 王昊翔, 吴世豪, 张星明, 陈霖, 梁桂煌, 古振威
Owner SOUTH CHINA UNIV OF TECH