
A data processing method and device in a SparkSQL system

A data processing technique, applied in the computer field, that addresses the slow speed of aggregation queries in SparkSQL

Inactive Publication Date: 2021-08-20
BEIJING QIHOO TECH CO LTD +1

AI Technical Summary

Problems solved by technology

The batch-processing computation model of SparkSQL limits the speed of SQL queries.
For example, for an aggregation query such as "compute the average age of people with the same name in the user table", SparkSQL reads all of the required data in the table into memory and only then performs the aggregation, which is very slow.
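The example query can be written out concretely. The sketch below is illustrative only: it uses Python's built-in `sqlite3` as a stand-in for SparkSQL so that it runs anywhere, and the table and column names (`users`, `name`, `age`) are assumptions, not taken from the patent.

```python
import sqlite3

# sqlite3 is only a stand-in for SparkSQL; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("Li", 30), ("Li", 40), ("Wang", 25)],
)

# The engine must visit every row of the table before it can
# return one average per distinct name.
rows = conn.execute(
    "SELECT name, AVG(age) FROM users GROUP BY name ORDER BY name"
).fetchall()
print(rows)
```

On a three-row table the cost is negligible; the problem described above arises when the same scan-then-aggregate pattern is applied to TB- or PB-scale tables at query time.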


Examples


Detailed Description of Embodiments

[0058] Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided for more thorough understanding of the present disclosure and to fully convey the scope of the present disclosure to those skilled in the art.

[0059] Figure 1 is a schematic flow chart of a data processing method in a SparkSQL system according to an embodiment of the present invention. As shown in Figure 1, the method includes:

[0060] Step S110: when a query request for a data table in the SparkSQL system is received, determine whether the request hits a column covered by an aggregation query preprocessing task.

[0061] Step S120, if it is hit, send the query requ...
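The hit check in step S110 might be sketched as follows. The excerpt does not define what counts as a "hit", so this sketch assumes that a query hits when every column it groups by or aggregates is covered by a preprocessing task registered for the table. The `AggQuery` type, the `PREPROCESSED` registry, and the function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AggQuery:
    """Hypothetical, already-parsed form of an aggregation query."""
    table: str
    group_by: tuple
    agg_column: str


# Hypothetical registry: table name -> columns covered by a
# preprocessing task for that table.
PREPROCESSED = {"users": {"name", "age"}}


def hits_preprocessing(query, registry=PREPROCESSED):
    """Assumed rule: hit only if every referenced column is covered."""
    covered = registry.get(query.table, set())
    needed = set(query.group_by) | {query.agg_column}
    return needed <= covered


print(hits_preprocessing(AggQuery("users", ("name",), "age")))
print(hits_preprocessing(AggQuery("users", ("city",), "age")))
```

A real implementation would derive the referenced columns from the parsed SQL rather than from a hand-built object; that step is outside what the excerpt describes.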


Abstract

The invention discloses a data processing method and device in a SparkSQL system. The method comprises: when a query request for a data table in the SparkSQL system is received, determining whether the request hits a column covered by an aggregation query preprocessing task; if it hits, sending the query request to an online analytical processing (OLAP) engine and receiving the aggregated query result returned by the OLAP engine; if it does not hit, calling the SQL query module of the SparkSQL system to complete the query request. This technical solution makes effective use of the millisecond-level multi-dimensional aggregation analysis capability of the OLAP engine and significantly improves the speed of aggregation queries in the SparkSQL system.
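The routing described in the abstract can be illustrated with a small sketch. It is not the patent's implementation: the OLAP engine is modelled as a dictionary of per-name sums and counts built ahead of query time, the SparkSQL path is modelled as a full scan at query time, and every name here (`USERS`, `CUBE`, `handle_query`, and so on) is hypothetical.

```python
from collections import defaultdict

# Hypothetical user table as (name, age) rows.
USERS = [("Li", 30), ("Li", 40), ("Wang", 25)]


def preaggregate(rows):
    """Stand-in for the preprocessing task: per-name [sum, count]."""
    cube = defaultdict(lambda: [0, 0])
    for name, age in rows:
        cube[name][0] += age
        cube[name][1] += 1
    return dict(cube)


CUBE = preaggregate(USERS)  # built once, before any query arrives


def olap_average_age(cube):
    """Stand-in OLAP path: answers from the pre-aggregated values."""
    return {name: total / count for name, (total, count) in cube.items()}


def full_scan_average_age(rows):
    """Stand-in SparkSQL path: scans every row at query time."""
    totals, counts = defaultdict(int), defaultdict(int)
    for name, age in rows:
        totals[name] += age
        counts[name] += 1
    return {name: totals[name] / counts[name] for name in totals}


def handle_query(hit, rows=USERS, cube=CUBE):
    """Route to the OLAP stand-in on a hit, else fall back to a scan."""
    if hit:
        return "olap", olap_average_age(cube)
    return "sparksql", full_scan_average_age(rows)


print(handle_query(True))
print(handle_query(False))
```

Both paths return the same averages; the difference is that the hit path only touches one pre-aggregated entry per name, while the fallback path touches every row.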

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a data processing method and device in a SparkSQL system.

Background

[0002] SparkSQL is a system that uses SQL for big data analysis and can compute statistics over TB- to PB-scale data. However, the batch-processing computation model of SparkSQL limits the speed of SQL queries. For example, for an aggregation query such as "compute the average age of people with the same name in the user table", SparkSQL reads all of the required data in the table into memory and only then performs the aggregation, which is very slow.

Summary of the invention

[0003] In view of the above problems, the present invention provides a data processing method and device in a SparkSQL system that overcome, or at least partially solve, the above problems.

[0004] According to one aspect of the present invention, a data processing met...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/242, G06F16/2455
CPC: G06F16/244, G06F16/2455, G06F16/24556
Inventor: 李远策, 李振炜
Owner: BEIJING QIHOO TECH CO LTD