Branch optimization method executed by big data ETL model

An optimization method applied in the field of big data analysis. It addresses the waste of computing resources caused by redundant computation, reduces the rate of repeated execution, improves the execution efficiency of the ETL model, and makes big data analysis more efficient.

Active Publication Date: 2020-12-22
NANJING BEIDOU INNOVATION & APPL TECH RES INST CO LTD

AI Technical Summary

Problems solved by technology

With the rapid development of the Internet, various industries have accumulated large volumes of data assets, and ETL is the first step in analyzing them. Because the raw data is large and ETL operators are complex, a single ETL model often takes from a few minutes to tens of minutes to compute. If every operator in the ETL model is computed without prior analysis, much of the computation may be redundant, wasting computing resources.

Embodiment Construction

[0025] The present invention is further described below with reference to the accompanying drawings:

[0026] As shown in Figure 1, data sets are divided into two types according to the characteristics of the analyzed data. The first is a stable data set, which remains unchanged over intervals of hours or days and is not updated frequently. The second is an active data set, which changes within intervals of minutes or hours as new records are continually appended to the original data set. The ETL analysis model is executed on a schedule: after the original data is updated, the model is automatically submitted and run at the preset time points, so it is executed many times within a given period. When dynamic data is associated with static data, the static data set may not have changed; however, because the dynamic data has been updated, the ETL analysis of the static data is triggered again....
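The distinction drawn in [0026] between stable and active data sets can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `STABLE_INTERVAL`, `classify_dataset` and `changed_since` are hypothetical, and the six-hour threshold is an assumption chosen only to separate "hours or days" from "minutes or hours".

```python
from datetime import datetime, timedelta

# Hypothetical threshold (an assumption, not from the patent): a data set whose
# average gap between updates is at least this long is treated as "stable".
STABLE_INTERVAL = timedelta(hours=6)


def classify_dataset(update_times):
    """Label a data set 'stable' or 'active' from its past update timestamps."""
    if len(update_times) < 2:
        # Too little history to measure a gap; assume the data set is stable.
        return "stable"
    ordered = sorted(update_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    mean_gap = sum(gaps, timedelta()) / len(gaps)
    return "stable" if mean_gap >= STABLE_INTERVAL else "active"


def changed_since(update_times, last_run):
    """Return True if the data set was updated after the previous scheduled run."""
    return any(t > last_run for t in update_times)
```

A scheduler could call `changed_since` for every source before each run and re-execute only the parts of the model that depend on a changed source.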

Abstract

The invention discloses a branch optimization method for big data ETL model execution. The necessity of model execution is analyzed dynamically according to the update characteristics of the original data sets and the characteristics of the ETL model. An optimization judgment is made for each of the model's operator branches; for branches with a low update frequency, the intermediate repeated computation is skipped by means of cache table reconstruction. Compared with the prior art, this reduces the rate of repeated execution at the operator level, improves the execution efficiency of the ETL model, and allows big data analysis to be carried out more efficiently.
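The branch-skipping idea in the abstract can be sketched as follows. This is one plausible reading under stated assumptions, not the patented procedure: the source text available here does not describe how cache tables are rebuilt, so the sketch simply keeps each operator's last output in an in-memory dictionary. The function `run_model` and the demo operators in `OPERATORS` are hypothetical.

```python
def run_model(operators, order, changed_sources, cache, sources):
    """Run an ETL model, reusing cached outputs for branches with unchanged inputs.

    operators:       operator name -> (function, list of input names)
    order:           operator names in topological (dependency) order
    changed_sources: names of source data sets updated since the last run
    cache:           operator name -> output of the previous run (updated in place)
    sources:         source name -> current source data
    Returns (results, executed), where executed lists the operators recomputed.
    """
    dirty = set(changed_sources)   # everything whose value may differ from last run
    results = dict(sources)
    executed = []
    for name in order:
        func, inputs = operators[name]
        if name in cache and not any(i in dirty for i in inputs):
            # No upstream change: skip the computation and reuse the cached table.
            results[name] = cache[name]
        else:
            results[name] = func(*[results[i] for i in inputs])
            cache[name] = results[name]
            dirty.add(name)        # downstream operators must be recomputed too
            executed.append(name)
    return results, executed


# Hypothetical demo model: one stable branch, one active branch, then a join.
OPERATORS = {
    "clean_regions": (lambda rows: [r.strip().lower() for r in rows], ["regions"]),
    "clean_events": (lambda rows: [e for e in rows if e[1] > 0], ["events"]),
    "join": (lambda regs, evs: [(r, v) for r, v in evs if r in regs],
             ["clean_regions", "clean_events"]),
}
ORDER = ["clean_regions", "clean_events", "join"]
```

On a second run in which only `events` has changed, `clean_regions` is served from the cache while `clean_events` and `join` are recomputed, which is the operator-level reduction in repeated execution that the abstract describes.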

Description

Technical field

[0001] The invention relates to the field of big data analysis, in particular to a branch optimization method for big data ETL model execution.

Background technique

[0002] ETL is the process of extracting, cleaning and transforming the data of a business system and loading it into a data warehouse. It is an important part of business intelligence. With the rapid development of the Internet, various industries have accumulated large volumes of data assets, and ETL is the first step in analyzing them. Because the raw data is large and ETL operators are complex, a single ETL model often takes from a few minutes to tens of minutes to compute. If every operator in the ETL model is computed without prior analysis, much of the computation may be redundant, wasting computing resources.

[0003] DAG (Directed Acyclic Graph) refers to a directed graph without cycles. In graph theory, if a directed graph cannot start from a certain vert...
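The acyclicity property introduced in [0003] can be checked with a standard topological sort (Kahn's algorithm), which also yields an order in which the operators of a model can be executed. The sketch below is illustrative only; the function name `topological_order` and the edge-list representation are assumptions, not something the patent specifies.

```python
from collections import deque


def topological_order(nodes, edges):
    """Return the nodes in dependency order, or raise ValueError if a cycle exists.

    nodes: list of node names
    edges: list of (source, destination) pairs, meaning source must run first
    """
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1

    # Start from nodes that have no unprocessed predecessors.
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Nodes left unprocessed can only be explained by a cycle.
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return order
```

If every node is emitted, the graph is a DAG; if some nodes are never emitted, they lie on or behind a cycle and the function raises an error.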

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/25
CPC: G06F16/254, G06F16/24552
Inventor: 朱欣焰, 郭宇达, 呙维, 樊亚新
Owner: NANJING BEIDOU INNOVATION & APPL TECH RES INST CO LTD