ETL task scheduling method, system, device, and storage medium
A task scheduling and storage medium technology, applied in program control design, program startup/switching, resource allocation, etc. It addresses the problems of wasted cluster computing resources, wasted time, and low resource utilization by a single task, thereby avoiding wasted resources.
Examples
Embodiment 1
[0031] Figure 1 is a flow chart of an ETL task scheduling method provided by Embodiment 1 of the present invention. This embodiment is applicable to situations where one or more ETL tasks are waiting to be executed, and specifically includes the following steps:
[0032] Step 110, acquiring the SQL-based ETL task.
[0033] ETL is the process of extracting, cleaning, and transforming data from business systems and loading it into a data warehouse. Its purpose is to integrate the scattered, messy, and non-uniform data in an enterprise to provide a basis for analysis in enterprise decision-making. There are currently many common ways to implement ETL tasks, such as using ETL tools or implementing them in SQL. The SQL-based method is more flexible and can improve the efficiency of ETL operations. Therefore, this embodiment implements ETL tasks in SQL.
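As an illustration of what a SQL-based ETL task can look like, the following is a minimal sketch using Python's built-in sqlite3 module. The table names (orders_raw, customer_totals), the sample rows, and the cleaning rules are hypothetical and are not taken from the embodiment; a production task would run against a real warehouse engine rather than an in-memory database.

```python
import sqlite3


def run_sql_etl(conn):
    """Run one SQL-based ETL task: extract, clean/transform, and load."""
    cur = conn.cursor()
    # Hypothetical business-system table holding messy, non-uniform data.
    cur.execute("CREATE TABLE orders_raw (id INTEGER, customer TEXT, amount TEXT)")
    cur.executemany(
        "INSERT INTO orders_raw VALUES (?, ?, ?)",
        [(1, " alice ", "10.50"), (2, "BOB", "n/a"), (3, "Alice", "4.50")],
    )
    # Hypothetical data-warehouse table that receives the cleaned result.
    cur.execute("CREATE TABLE customer_totals (customer TEXT PRIMARY KEY, total REAL)")
    # The ETL task itself is one SQL statement: it extracts the raw rows,
    # normalizes customer names, drops non-numeric amounts, and loads totals.
    cur.execute(
        """
        INSERT INTO customer_totals (customer, total)
        SELECT LOWER(TRIM(customer)), SUM(CAST(amount AS REAL))
        FROM orders_raw
        WHERE amount GLOB '[0-9]*'
        GROUP BY LOWER(TRIM(customer))
        """
    )
    conn.commit()
    return cur.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()


rows = run_sql_etl(sqlite3.connect(":memory:"))
print(rows)
```

Because the whole transformation is expressed as SQL text, the same task can in principle be handed to different execution engines, which is what makes scheduling across several computing resources possible.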
[0034] Step 120, judging whether the ETL task is the first type task or the second type ta...
Embodiment 2
[0056] Figure 4 is a flow chart of an ETL task scheduling method provided by Embodiment 2 of the present invention. On the basis of Embodiment 1, this embodiment further specifies the process of executing the second type of task through the second computing resource, specifically including:
[0057] Step 210, using the minimum value algorithm to calculate the currently free computing power pool for the ETL task.
[0058] According to the second computing resource configuration, the executor Spark thriftServer has multiple computing power pools: Spark thriftServer1, Spark thriftServer2, Spark thriftServer3, and so on. When an ETL task classified as a second type task is to be executed by Spark thriftServer, the minimum value algorithm obtains a suitable free computing power pool, for example Spark thriftServer2, which can execute this ETL task.
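The text does not define the minimum value algorithm in detail. One plausible reading, sketched below, is that the scheduler tracks a load figure per computing power pool and picks the pool with the smallest value. The load metric (number of running tasks), the function name pick_free_pool, and the sample numbers are assumptions made for illustration only.

```python
from typing import Dict, Optional


def pick_free_pool(running_tasks: Dict[str, int]) -> Optional[str]:
    """Return the pool with the smallest load, or None if no pool is configured.

    Assumption: "load" is the number of tasks currently running in each pool.
    """
    if not running_tasks:
        return None
    # min() over the pool names, compared by their current load.
    return min(running_tasks, key=lambda name: running_tasks[name])


# Hypothetical snapshot of the pools named in the description.
pools = {
    "Spark thriftServer1": 4,
    "Spark thriftServer2": 1,
    "Spark thriftServer3": 3,
}
chosen = pick_free_pool(pools)
print(chosen)  # Spark thriftServer2
```

With ties, min() returns the first pool encountered, so the configuration order acts as a tie-breaker in this sketch.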
[0059] Step 220, perform an executable judgment according to the free computing power pool.
[0060] After Spark thri...
Embodiment 3
[0067] Figure 5 is a schematic structural diagram of an ETL task scheduling system 300 provided in Embodiment 3 of the present invention. The specific structure of the ETL task scheduling system is as follows:
[0068] A task acquiring module 310, configured to acquire SQL-based ETL tasks.
[0069] There are currently many common ways to implement ETL tasks, such as using ETL tools or implementing them in SQL. The SQL-based method is more flexible and can improve the efficiency of ETL operations. Therefore, this embodiment implements ETL tasks in SQL.
[0070] The task judging module 320 is configured to judge whether the ETL task is a first type task or a second type task according to preset rules.
[0071] The first execution module 330 is configured to select a first computing resource to execute the ETL task when the ETL task is a first type task.
[0072] The second execution module 340 is configured to select a second computing resource to exec...
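The module structure above can be sketched as a small dispatcher. This is a rough illustration, not the patented implementation: the class and function names are hypothetical, the preset rule is injected as a callable because the text does not specify it, and the two executors are stand-ins that only return a label.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class TaskType(Enum):
    FIRST = 1
    SECOND = 2


@dataclass
class EtlTask:
    name: str
    sql: str


class EtlScheduler:
    """Mirrors modules 310-340: acquire, judge, then execute on one resource."""

    def __init__(
        self,
        classify: Callable[[EtlTask], TaskType],
        run_on_first: Callable[[EtlTask], str],
        run_on_second: Callable[[EtlTask], str],
    ) -> None:
        self.classify = classify            # task judging module 320
        self.run_on_first = run_on_first    # first execution module 330
        self.run_on_second = run_on_second  # second execution module 340

    def acquire(self, name: str, sql: str) -> EtlTask:
        """Task acquiring module 310: wrap the SQL text as an ETL task."""
        return EtlTask(name=name, sql=sql)

    def schedule(self, task: EtlTask) -> str:
        """Route the task to the resource that matches its type."""
        if self.classify(task) is TaskType.FIRST:
            return self.run_on_first(task)
        return self.run_on_second(task)


# Placeholder rule for the demo only; the real preset rules are not given here.
def demo_rule(task: EtlTask) -> TaskType:
    return TaskType.SECOND if "JOIN" in task.sql.upper() else TaskType.FIRST


scheduler = EtlScheduler(
    classify=demo_rule,
    run_on_first=lambda t: f"first:{t.name}",
    run_on_second=lambda t: f"second:{t.name}",
)
small = scheduler.acquire("daily_count", "SELECT COUNT(*) FROM orders")
large = scheduler.acquire("enrich", "SELECT * FROM a JOIN b ON a.id = b.id")
print(scheduler.schedule(small), scheduler.schedule(large))
```

Passing the rule and the executors in as callables keeps the dispatcher independent of any particular classification rule or execution engine.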