
SparkSql external data source device, implementation method thereof and system

A technique for SparkSql external data source devices, applied in the field of parsing external data sources with SparkSql. It addresses problems such as the lack of versatility across data sources, and aims to make development less error-prone and to shorten the development cycle.

Status: Inactive. Publication Date: 2020-05-05
SUNING CLOUD COMPUTING CO LTD

AI Technical Summary

Problems solved by technology

[0004] In the prior art, the syntax used for different data sources is not universal.



Examples


Embodiment 1

[0056] As shown in Figure 1, the present invention provides a SparkSql external data source device, which includes:

[0057] A registration module 11, configured to register the received Sql query language and parse it to obtain a logical plan;

[0058] A receiving module 12, configured to obtain the business logic code corresponding to the logical plan and pass it to the physical plan;

[0059] A request data set module 13, configured to call the business logic code to query the corresponding external data source and obtain a data result set;

[0060] A partition conversion module 14, configured to process the data result set according to a preset number of partitions and convert the processed data result set into iterator blocks, specifically Iterator[InternalRow].

[0061] It should be noted that the partition conversion module is optional; in some embodiments, partitioning may not be performed, and the result set may instead be stored...
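The partition conversion described in [0060] can be illustrated with a minimal pure-Python sketch. It is not Spark code: the tuple type `Row` stands in for Spark's `InternalRow`, the function name `to_partition_iterators` is hypothetical, and splitting the result set into contiguous slices is an assumption, since the text does not say how rows are assigned to partitions.

```python
from typing import Iterator, List, Sequence, Tuple

Row = Tuple  # assumption: a plain tuple stands in for Spark's InternalRow


def to_partition_iterators(rows: Sequence[Row], num_partitions: int) -> List[Iterator[Row]]:
    """Split a result set into num_partitions contiguous slices, one iterator per slice."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    size, extra = divmod(len(rows), num_partitions)
    parts: List[Iterator[Row]] = []
    start = 0
    for i in range(num_partitions):
        # The first `extra` partitions receive one additional row each.
        end = start + size + (1 if i < extra else 0)
        parts.append(iter(rows[start:end]))
        start = end
    return parts


# Hypothetical result set returned by an external data source.
result_set = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]
partitions = [list(p) for p in to_partition_iterators(result_set, 2)]
```

With a preset partition count of 2, the five rows are split into one block of three rows and one block of two. An empty result set still yields the requested number of (empty) iterators.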

Embodiment 2

[0083] Figure 2 is a flow chart of an embodiment of the method implemented by the SparkSql external data source device. The method includes:

[0084] S21. Register the received Sql query language and parse it to obtain a logical plan;

[0085] S22. Obtain the business logic code corresponding to the logical plan and pass it to the physical plan;

[0086] S23. Call the business logic code to query the corresponding external data source and obtain a data result set;

[0087] S24. Call the running block of SparkCore to format the data result set and return it to the client.
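Steps S21 to S24 can be modeled end to end with a small pure-Python sketch. This is an illustration under stated assumptions, not the patented implementation and not the Spark API: the regex parser only handles `SELECT cols FROM table`, and `LogicalPlan`, `PhysicalPlan`, `run_query`, `users_logic` and the in-memory `USERS` source are all hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class LogicalPlan:
    columns: List[str]
    table: str


Rows = List[Dict[str, object]]
BusinessLogic = Callable[[LogicalPlan], Rows]


def register_and_parse(sql: str) -> LogicalPlan:
    """S21 (toy version): parse 'SELECT a, b FROM t' into a logical plan."""
    m = re.fullmatch(r"\s*SELECT\s+(.+?)\s+FROM\s+(\w+)\s*", sql, re.IGNORECASE)
    if m is None:
        raise ValueError(f"unsupported query: {sql!r}")
    columns = [c.strip() for c in m.group(1).split(",")]
    return LogicalPlan(columns=columns, table=m.group(2))


@dataclass
class PhysicalPlan:
    logical: LogicalPlan
    logic: BusinessLogic  # S22: the business logic handed to the physical plan

    def execute(self) -> Rows:
        # S23: call the business logic to query the external data source.
        return self.logic(self.logical)


def format_rows(plan: LogicalPlan, rows: Rows) -> List[tuple]:
    # S24 stand-in: order each row's values by the requested columns.
    return [tuple(row[c] for c in plan.columns) for row in rows]


def run_query(sql: str, logic_by_table: Dict[str, BusinessLogic]) -> List[tuple]:
    logical = register_and_parse(sql)                                # S21
    physical = PhysicalPlan(logical, logic_by_table[logical.table])  # S22
    rows = physical.execute()                                        # S23
    return format_rows(logical, rows)                                # S24


# Hypothetical external data source: an in-memory list of records.
USERS = [{"id": 1, "name": "ann", "age": 30}, {"id": 2, "name": "bob", "age": 25}]


def users_logic(plan: LogicalPlan) -> Rows:
    # Developer-supplied business logic for the "users" source.
    return [{c: row[c] for c in plan.columns} for row in USERS]


result = run_query("SELECT id, name FROM users", {"users": users_logic})
```

The point of the sketch is the separation of concerns: the framework side (`run_query`) never knows how a source is queried, and supporting a new source only requires supplying another business logic function.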

[0088] The method also includes:

[0089] S210. After performing a recursive operation on the logical plan, inject the logical plan into a business logic interface so that developers can implement the business logic code;

[0090] Step S22 includes: obtaining the business logic code corresponding to the logical plan and passing it to the physical plan.
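One possible reading of step S210 is sketched below in pure Python. The text does not say what the recursive operation computes or what the business logic interface looks like, so everything here is an assumption: the plan is modeled as a chain of `PlanNode` objects, the recursion flattens it into a dictionary keyed by node kind, and `ExternalSourceLogic` is a hypothetical interface that a developer implements.

```python
import operator
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PlanNode:
    kind: str                          # e.g. "Project", "Filter", "Relation"
    args: List[str]
    child: Optional["PlanNode"] = None


def flatten(node: Optional[PlanNode], out: Optional[Dict[str, List[str]]] = None) -> Dict[str, List[str]]:
    """Recursively walk the plan, collecting each node's arguments by kind."""
    if out is None:
        out = {}
    if node is None:
        return out
    out.setdefault(node.kind, []).extend(node.args)
    return flatten(node.child, out)


class ExternalSourceLogic(ABC):
    """Hypothetical business logic interface that receives the flattened plan."""

    @abstractmethod
    def query(self, plan: Dict[str, List[str]]) -> List[dict]:
        ...


OPS = {">=": operator.ge, "<": operator.lt, "=": operator.eq}


class InMemoryLogic(ExternalSourceLogic):
    # Hypothetical external source holding integer columns only.
    DATA = {"users": [{"id": 1, "age": 30}, {"id": 2, "age": 17}]}

    def query(self, plan: Dict[str, List[str]]) -> List[dict]:
        rows = self.DATA[plan["Relation"][0]]
        if "Filter" in plan:
            column, op, value = plan["Filter"]
            rows = [r for r in rows if OPS[op](r[column], int(value))]
        return [{c: r[c] for c in plan["Project"]} for r in rows]


# Plan for: SELECT id FROM users WHERE age >= 18
tree = PlanNode("Project", ["id"],
                PlanNode("Filter", ["age", ">=", "18"],
                         PlanNode("Relation", ["users"])))
flat = flatten(tree)
rows = InMemoryLogic().query(flat)
```

In this reading, the developer sees the whole plan (projection, filter and relation) rather than only the filters, which matches the stated goal of finer-grained control over the logical plan.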

[0091] Preferably, the method also i...

Embodiment 3

[0186] The present invention also provides a computer system, comprising:

[0187] one or more processors; and

[0188] A memory associated with the one or more processors, the memory being used to store program instructions which, when read and executed by the one or more processors, perform the operations of the above method embodiments, specifically including:

[0189] Register and parse the received Sql query language to obtain a logical plan;

[0190] Obtain the business logic code corresponding to the logical plan and pass it to the physical plan;

[0191] Call the business logic code to query the corresponding external data source and obtain the data result set;

[0192] Call the running block of SparkCore to format the data result set and return it to the client.

[0193] Figure 3 shows an exemplary architecture of the computer system, which may specifically include a processor 1510, a video display adapter 1511, a disk drive 1...


Abstract

The invention discloses a SparkSql external data source device and an implementation method thereof. The external data source device comprises: a registration module, used for registering a received Sql query language and parsing it to obtain a logical plan; a receiving module, used for acquiring the business logic code corresponding to the logical plan and passing it to the physical plan; a request data set module, used for calling the business logic code to query the corresponding external data source and obtain a data result set; and a return module, used for calling the running block of SparkCore to format the data result set and then return it to the client. The invention is simple and easy to use, its code is highly reusable, and the development cycle is short; no custom RDD is needed, and components for different data sources do not need to be developed individually.

Description

Technical field

[0001] The present invention relates to the field of SparkSql parsing of external data sources, in particular to a SparkSql external data source device, implementation method and system.

Background technique

[0002] When connecting to different data sources, SparkSql uses different syntaxes to extract data because the data source engines differ. When a new data source engine is loaded, the SparkSql external data source interface needs to be redeveloped.

[0003] In the prior art, defining an external data source is cumbersome and the code is highly complex. To implement an external data source in the way SparkSql prescribes, you need to implement n interfaces such as RelationProvider, PrunedFilteredScan, and BaseRelation, and these interfaces only handle the Filter logical plan provided by SparkSql. If you want finer-grained control over the logical plan, you need to implement SparkStrategy in SparkSql; if you customize...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/242, G06F16/2458
CPC: G06F16/2433, G06F16/2471
Inventor: 卢勇亮赵云李成徐根林
Owner: SUNING CLOUD COMPUTING CO LTD