Real-time data warehouse automatic ETL method and system, equipment and computer storage medium

An automatic, data-driven technology in the field of real-time data warehouse automatic ETL methods, systems, equipment, and computer storage media. It addresses problems such as complicated ETL processes and serious data delays, and improves program security and robustness.

Pending Publication Date: 2021-08-06
上海钢银科技发展有限公司

AI Technical Summary

Problems solved by technology

[0005] To address the above-mentioned problems of a complicated ETL process and serious data delay in real-time data warehouses, this application provides a real-time data warehouse automatic ETL method, system, equipment and computer storage medium.




Embodiment Construction

[0036] The application will be described in further detail below in conjunction with the accompanying drawings.

[0037] This application first provides a real-time data warehouse automatic ETL method, including:

[0038] Data extraction: the data in the MySQL database is extracted to the Kafka message queue, specifically including:

[0039] Obtain the binlog of the data by disguising as a MySQL slave, parse it, and send the parsed data to the Kafka queue;
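The hand-off from a parsed binlog event to a Kafka message can be sketched as follows. This is only an illustration under stated assumptions: the `RowChange` shape, the `database.table` topic naming, the `id` primary-key column, and the JSON payload layout are all hypothetical and are not taken from the application. Binlog parsing and the actual Kafka producer call are left out.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass
class RowChange:
    """Hypothetical shape of one parsed binlog row event."""
    database: str
    table: str
    op: str  # "insert", "update" or "delete"
    before: Optional[Dict[str, Any]]
    after: Optional[Dict[str, Any]]
    binlog_file: str
    binlog_pos: int


def to_kafka_record(change: RowChange, pk: str = "id") -> Tuple[str, bytes, bytes]:
    """Return (topic, key, value) for one row change.

    Topic naming and keying by primary key are assumptions made for
    this sketch, not details from the application.
    """
    # Deletes carry only a "before" image, inserts only an "after" image.
    row = change.after if change.after is not None else change.before
    if row is None:
        raise ValueError("row change carries neither before nor after image")
    topic = f"{change.database}.{change.table}"
    key = str(row[pk]).encode("utf-8")
    value = json.dumps(
        {
            "op": change.op,
            "before": change.before,
            "after": change.after,
            "position": {"file": change.binlog_file, "pos": change.binlog_pos},
        },
        sort_keys=True,
    ).encode("utf-8")
    return topic, key, value
```

Carrying the binlog file name and offset inside each message is a design choice of this sketch; it lets a downstream consumer tell how far extraction had progressed when the message was produced.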

[0040] To ensure that the data is safe and not lost, the progress of binlog parsing is saved in Redis, and a maintainer can move the parsing position at will by changing the corresponding value in Redis, which improves flexibility. At the same time, multi-node distribution is implemented through Zookeeper: if the dam that is parsing the binlog fails for reasons such as a network fault, the standby dam will seamlessly take ov...
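A minimal sketch of the checkpoint idea, assuming the parsing progress is stored as a single `file:position` string. The key name `etl:binlog:checkpoint` and the `InMemoryStore` class are hypothetical; the latter stands in for Redis so that the example runs on its own. Zookeeper-based election of the active node is not shown.

```python
from typing import Dict, Optional, Tuple

CHECKPOINT_KEY = "etl:binlog:checkpoint"  # hypothetical key name


class InMemoryStore:
    """Stand-in for Redis offering only string get/set."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


def save_checkpoint(store: InMemoryStore, binlog_file: str, pos: int) -> None:
    """Record how far the binlog has been parsed."""
    store.set(CHECKPOINT_KEY, f"{binlog_file}:{pos}")


def load_checkpoint(store: InMemoryStore) -> Optional[Tuple[str, int]]:
    """Return (file, position) to resume from, or None on first start."""
    raw = store.get(CHECKPOINT_KEY)
    if raw is None:
        return None
    # Split on the last colon so a colon in the file name cannot break parsing.
    binlog_file, pos = raw.rsplit(":", 1)
    return binlog_file, int(pos)
```

In this sketch a standby node that takes over would call `load_checkpoint` and resume from the returned position, and a maintainer who overwrites the stored value changes where parsing resumes.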


Abstract

The invention relates to the technical field of constructing quasi-real-time data warehouses for e-commerce platforms, and discloses a real-time data warehouse automatic ETL method and system. The method comprises the following steps: data extraction: extracting data in a MySQL database to the Kafka message queue; data conversion: processing the data entering the Kafka message queue based on a conversion strategy configured in advance in a MySQL database according to user requirements, the strategy including one or more of setting a default value, summarizing the data, using data from other databases as a value condition, and uploading a script description; and data loading: loading the processed data to a standard MySQL database based on the correspondence between data source and data destination configured in advance in the MySQL database. With this technical scheme, all ETL requirements across the whole process can be met through configuration, without writing code, so the ETL process of the real-time data warehouse is simplified and data delay is shortened.
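The configuration-driven conversion and loading steps can be illustrated with a short sketch. Everything named here is an assumption made for illustration: the `RULES` layout, the table and column names, and the use of `REPLACE INTO` for loading are hypothetical. In the described method the rules are stored in MySQL configuration tables rather than in a Python dictionary, and only the default-value strategy is shown.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical rule set for one source table.
RULES: Dict[str, Any] = {
    "source": "shop.orders",
    "destination": "dw.fact_orders",
    # source column -> destination column
    "columns": {"id": "order_id", "amount": "order_amount", "status": "order_status"},
    # destination column -> default value used when the source value is missing
    "defaults": {"order_status": "unknown"},
}


def apply_rules(row: Dict[str, Any], rules: Dict[str, Any]) -> Dict[str, Any]:
    """Rename columns according to the mapping and fill in defaults."""
    out: Dict[str, Any] = {}
    for src, dst in rules["columns"].items():
        if row.get(src) is not None:
            out[dst] = row[src]
    for col, default in rules["defaults"].items():
        out.setdefault(col, default)
    return out


def build_upsert(table: str, row: Dict[str, Any]) -> Tuple[str, List[Any]]:
    """Build a parameterized MySQL statement for the destination table.

    The table and column names come from trusted configuration; the row
    values are passed separately as parameters.
    """
    cols = sorted(row)
    placeholders = ", ".join(["%s"] * len(cols))
    sql = f"REPLACE INTO {table} ({', '.join(cols)}) VALUES ({placeholders})"
    return sql, [row[c] for c in cols]
```

Because the conversion is expressed as data, adding a new source table in this sketch means adding another rule set rather than writing new code, which mirrors the configuration-only goal stated in the abstract.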

Description

Technical Field

[0001] This application relates to the technical field of constructing quasi-real-time data warehouses for e-commerce platforms, and in particular to an automatic ETL method, system, equipment and computer storage medium for real-time data warehouses.

Background

[0002] A data warehouse is a strategic collection that provides system-wide data support for all of an enterprise's decision-making processes. Analyzing the data in the data warehouse can help an enterprise improve business processes, control costs, and improve product quality. The data warehouse prepares data for its final destination, and this preparation includes data cleaning, escaping, classification, reorganization, merging, splitting, statistics, etc.

[0003] The real-time data warehouse collects, synchronizes, and computes real-time data, and delivers the result data for use by the business side. The real-time data warehouse can provide query solutions at t...


Application Information

IPC(8): G06F16/25, G06F16/28, G06F9/54
CPC: G06F16/254, G06F16/284, G06F16/283, G06F9/546, G06F2209/548
Inventor: 葛昊
Owner: 上海钢银科技发展有限公司