Data processing method and device in SparkSQL system

A data processing technology, applied in the computer field, that addresses the limited speed of SQL queries in SparkSQL and achieves the effect of improving query speed.

Active Publication Date: 2017-06-13
BEIJING QIHOO TECH CO LTD +1

AI Technical Summary

Problems solved by technology

The batch-processing calculation model of SparkSQL limits the speed of SQL queries.
For example, when performing an aggregation query such as "compute the average age of people with the same name in the user table", SparkSQL reads all the required data in the table into memory for the aggregation calculation, which is very slow.
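
The example query corresponds roughly to `SELECT name, AVG(age) FROM user GROUP BY name` (the table and column names are assumed from the description). A minimal sketch in plain Python, not SparkSQL itself, illustrates why such a query has to touch every row; the sample rows and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows of a "user" table as (name, age); illustrative data only.
USER_TABLE = [
    ("Li", 30),
    ("Wang", 25),
    ("Li", 40),
    ("Zhang", 35),
    ("Wang", 35),
]


def average_age_by_name(rows):
    """Scan every row, group by name, and return {name: average age}."""
    totals = defaultdict(lambda: [0, 0])  # name -> [sum of ages, row count]
    for name, age in rows:  # one full pass over the whole table
        totals[name][0] += age
        totals[name][1] += 1
    return {name: age_sum / count for name, (age_sum, count) in totals.items()}


if __name__ == "__main__":
    print(average_age_by_name(USER_TABLE))
```

The cost of this loop grows with the number of rows in the table, which is the slowdown the passage describes for large data sets.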

Embodiment Construction

[0058] Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure will be more thoroughly understood, and will fully convey the scope of the present disclosure to those skilled in the art.

[0059] Figure 1 shows a schematic flowchart of a data processing method in a SparkSQL system according to an embodiment of the present invention. As shown in Figure 1, the method includes:

[0060] Step S110, when a query request for a data table in the SparkSQL system is received, determine whether the request hits a column of an aggregate query preprocessing task.

[0061] Step S120, if it hits, send the query r...

Abstract

The invention discloses a data processing method and device in a SparkSQL system. The method comprises the steps that when a query request for a data table in the SparkSQL system is received, whether the request hits a queue of aggregate query preprocessing tasks or not is judged; if yes, the query request is sent to an on-line analytical processing (OLAP) engine, and an aggregate query result returned by the OLAP engine is received; and if not, an SQL query module of the SparkSQL system is called to complete the query request. According to the technical scheme, the millisecond-level multidimensional aggregate analytical ability of the OLAP engine is effectively utilized, and the speed of performing aggregate query in the SparkSQL system is remarkably increased.
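
The routing described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `QueryRequest` type, the registry of preprocessed `(table, column)` pairs, and both engine stubs are hypothetical names introduced here, and the source does not specify how a "hit" is actually determined.

```python
from dataclasses import dataclass
from typing import Callable, Set, Tuple


@dataclass(frozen=True)
class QueryRequest:
    """Hypothetical, simplified description of an aggregate query."""
    table: str
    group_by: str     # column the query groups on
    agg_func: str     # e.g. "avg"
    agg_column: str   # column being aggregated


# Assumption: preprocessing tasks are registered as (table, column) pairs.
PREPROCESSED_COLUMNS: Set[Tuple[str, str]] = {("user", "name")}


def hits_preprocessing_task(request: QueryRequest) -> bool:
    """Return True if the request targets a preprocessed (table, column) pair."""
    return (request.table, request.group_by) in PREPROCESSED_COLUMNS


def olap_engine_stub(request: QueryRequest) -> Tuple[str, QueryRequest]:
    """Stand-in for the OLAP engine; a real engine would return the aggregate result."""
    return ("olap", request)


def sparksql_module_stub(request: QueryRequest) -> Tuple[str, QueryRequest]:
    """Stand-in for the SQL query module of the SparkSQL system."""
    return ("sparksql", request)


def handle_query(
    request: QueryRequest,
    olap: Callable = olap_engine_stub,
    sparksql: Callable = sparksql_module_stub,
):
    """Send the request to the OLAP engine on a hit, otherwise to SparkSQL."""
    if hits_preprocessing_task(request):
        return olap(request)
    return sparksql(request)


if __name__ == "__main__":
    print(handle_query(QueryRequest("user", "name", "avg", "age"))[0])
    print(handle_query(QueryRequest("orders", "city", "sum", "amount"))[0])
```

Passing the two engines in as callables keeps the dispatch logic separate from whichever OLAP engine and SQL module are actually deployed.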

Description

Technical field

[0001] The invention relates to the technical field of computers, in particular to a data processing method and device in a SparkSQL system.

Background technique

[0002] SparkSQL is a big data analysis system using SQL, which can perform statistics on terabyte- to petabyte-scale data. However, the batch-processing calculation model of SparkSQL limits the speed of its SQL queries. For example, when performing an aggregation query such as "compute the average age of people with the same name in the user table", SparkSQL reads all the required data in the table into memory for the aggregation calculation, which is very slow.

Summary of the invention

[0003] In view of the above problems, the present invention is proposed to provide a data processing method and apparatus in a SparkSQL system that overcome the above problems or at least partially solve them.

[0004] According to one aspect of the present invention, a data processing method in a S...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/244, G06F16/2455, G06F16/24556
Inventors: 李远策, 李振炜
Owner: BEIJING QIHOO TECH CO LTD