Method and apparatus for implementing ETL scheduling

The implementation method and triggering method, applied in the computer field, address problems of existing approaches: no visual management of the overall ETL scheduling process, inability to adapt to actual application requirements, and reduced execution efficiency of the parent process.

Active Publication Date: 2009-10-28
ADVANCED NEW TECH CO LTD
Cites: 0 | Cited by: 73

AI Technical Summary

Problems solved by technology

The JBPM engine core adopts a token-based sequential transfer method, which cannot meet actual requirements of ETL process scheduling such as "task fallback" and "task jump forward".
[0007] In addition, the process description language (JPDL) used by the JBPM engine cannot describe the dependency relationship between parent and child processes; programmers can only extend the node class in the JAVA...

Examples

Embodiment Construction

[0027] When building a data warehouse, to improve the execution efficiency of the data extraction, transformation and loading (Extraction-Transformation-Loading, ETL) scheduling process, the embodiment of the present invention performs the following operations for any one of the task processes included in ETL scheduling: according to a preset configuration file, determine the triggering mode, execution sequence and mutual dependencies of the subtask processes contained in the task process; trigger the corresponding subtask processes in turn according to the set triggering mode; and execute the triggered subtask processes in the set order. When it is determined that at least one subtask process has finished executing, the other subtask processes that depend on it and have already been triggered start executing, according to the dependency relationships between the subtask processes.
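The rule in [0027] (a subtask process runs only once it has been triggered and everything it depends on has finished) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `CONFIG` dictionary stands in for the preset configuration file, and the subtask names, the `trigger`, `order` and `depends_on` keys, and the `run_task_flow` function are all hypothetical. Triggering is simulated by passing in the set of names that have already been triggered.

```python
# Hypothetical in-memory stand-in for the "preset configuration file".
# All names and keys below are illustrative assumptions.
CONFIG = {
    "extract_orders": {"trigger": "time", "order": 1, "depends_on": []},
    "extract_users": {"trigger": "time", "order": 2, "depends_on": []},
    "transform": {"trigger": "event", "order": 3,
                  "depends_on": ["extract_orders", "extract_users"]},
    "load": {"trigger": "event", "order": 4, "depends_on": ["transform"]},
}


def run_task_flow(config, triggered, actions=None):
    """Execute triggered subtasks in configured order, honoring dependencies.

    Returns (done, blocked): the names executed, in execution order, and the
    triggered names that could not run because a dependency never finished.
    """
    actions = actions or {}
    done = []
    # Only triggered subtasks are candidates; the configured order breaks ties.
    pending = sorted((name for name in config if name in triggered),
                     key=lambda name: config[name]["order"])
    progressed = True
    while pending and progressed:
        progressed = False
        for name in list(pending):
            # A subtask may start once every subtask it depends on has finished.
            if all(dep in done for dep in config[name]["depends_on"]):
                if name in actions:
                    actions[name]()  # the real work of the subtask, if supplied
                done.append(name)
                pending.remove(name)
                progressed = True
    return done, pending


if __name__ == "__main__":
    print(run_task_flow(CONFIG, set(CONFIG)))
    print(run_task_flow(CONFIG, {"extract_orders", "transform", "load"}))
```

In the second call `extract_users` is never triggered, so `transform` and `load` stay blocked even though they were triggered themselves; this is the dependency check the paragraph describes.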

[0028] In th...

Abstract

The invention discloses a method for implementing ETL scheduling, comprising: for any one of the plural task flows included in the ETL scheduling, determining, based on a preset configuration file, the triggering mode, execution sequence and mutual dependency relationships of the subtask flows included in the task flow; and triggering the corresponding subtask flows in turn according to the set triggering mode and executing the triggered subtask flows in the set sequence, wherein, after it is determined that at least one subtask flow has been executed, the other triggered subtask flows that depend on it start executing, based on the dependency relationships between the subtask flows. Each subtask flow in a task flow thus has clear service logic and a clear service function, which effectively improves the execution efficiency of the ETL scheduling flow. The invention also discloses an apparatus for implementing ETL scheduling.
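The source does not specify the format of the preset configuration file. As a hedged sketch, assuming a JSON file in which each subtask flow lists the flows it depends on, the declared dependencies could be checked and turned into a valid execution sequence with Python's standard `graphlib` module (Python 3.9+). The `CONFIG_TEXT` content, the `depends_on` key and the `load_order` function are hypothetical.

```python
import json
from graphlib import CycleError, TopologicalSorter

# Hypothetical configuration text; the JSON layout is an assumption.
CONFIG_TEXT = """
{
  "extract":   {"trigger": "time",  "depends_on": []},
  "transform": {"trigger": "event", "depends_on": ["extract"]},
  "load":      {"trigger": "event", "depends_on": ["transform"]}
}
"""


def load_order(config_text):
    """Parse the configuration and return one dependency-respecting order."""
    config = json.loads(config_text)
    # Map each subtask flow to the set of flows it depends on.
    graph = {name: set(spec.get("depends_on", []))
             for name, spec in config.items()}
    # Reject dependencies on subtask flows that are not declared at all.
    unknown = {dep for deps in graph.values() for dep in deps} - set(graph)
    if unknown:
        raise ValueError(f"undeclared subtask flows: {sorted(unknown)}")
    try:
        # static_order() yields every flow after all of its dependencies.
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError("circular dependency between subtask flows") from exc


if __name__ == "__main__":
    print(load_order(CONFIG_TEXT))
```

Validating the file before scheduling starts means a circular or dangling dependency is reported once, up front, rather than surfacing as a subtask flow that never becomes runnable.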

Description

Technical field

[0001] The present application relates to the field of computers, in particular to a process control method and device.

Background technique

[0002] A Data Warehouse (DW) is a subject-oriented, integrated, relatively stable data collection that reflects historical changes and is used to support management decisions. A data warehouse is an independent data environment, and data extraction, transformation and loading (Extraction-Transformation-Loading, ETL) is an important part of building a data warehouse.

[0003] ETL is used to extract data from distributed and heterogeneous data sources (for example, relational data, flat data files, etc.) and load it into the data warehouse, so that the resulting data warehouse becomes the basis of online analytical processing and data mining. Technically, ETL mainly involves association, transformation, increment, scheduling and monitoring. Usually, the data in the data warehouse does not require real-tim...

Application Information

IPC(8): G06F17/30
Inventor 蒋杰, 陈荣松, 蒋萃林
Owner ADVANCED NEW TECH CO LTD