Lightweight distributed ETL architecture method based on column storage data warehouse

The method applies data warehouse and columnar storage technology to a lightweight distributed ETL architecture. It addresses problems such as low read and write efficiency and data inconsistency, and it improves horizontal scalability, data consistency, and query capability.

Pending Publication Date: 2022-05-24
鞍钢集团自动化有限公司

AI Technical Summary

Problems solved by technology

Traditional ETL tools extract and transform each data source row by row, which makes reading and writing very inefficient when the data volume is large and there are multiple target sources.
In addition, traditional ETL tools suffer from data inconsistency when they are applied to a distributed ETL architecture.



Detailed Description of the Embodiments

[0032] In order to facilitate understanding of the present invention, the present invention will be described more fully below. However, the present invention may be embodied in many different forms and is not limited to the embodiments described herein. Rather, these embodiments are provided so that a thorough and complete understanding of the present disclosure is provided.

[0033] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terms used herein in the description of the present invention are for the purpose of describing specific embodiments only, and are not intended to limit the present invention.

[0034] The specific embodiments provided by the present invention will be described in detail below with reference to the accompanying drawings.

[0035] As shown in Figure 1, the architecture diagram of a lightweight distributed...


Abstract

A lightweight distributed ETL architecture method based on a column storage data warehouse comprises the following steps: a data source of a database server is extracted and converted through the Kettle tool set; an Airflow distributed scheduling platform is combined with a Celery distributed message queue to schedule the work; and a Redis cluster server is adopted as the broker that communicates between a client and a plurality of worker nodes. The method ensures the consistency of data in the lightweight distributed ETL architecture. The data is stored and backed up through the object storage KS3 server, which ensures data security, and the horizontal scalability and query capability of the scheduled jobs are improved.
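
The scheduling arrangement described in the abstract, in which a broker sits between the client and several worker nodes, can be illustrated with a minimal Python sketch. It is not the patented implementation and does not use Airflow, Celery, or Redis: an in-process `queue.Queue` stands in for the Redis broker, threads stand in for worker nodes, and `run_etl_jobs` together with its upper-casing transform are hypothetical names chosen only for illustration.

```python
import queue
import threading


def run_etl_jobs(jobs, num_workers=3):
    """Dispatch (job_id, rows) pairs through a shared broker queue to workers.

    Assumption: the queue is an in-process stand-in for a message broker,
    and the transform is a placeholder, not the method from the patent.
    """
    broker = queue.Queue()      # stand-in for the broker between client and workers
    results = {}
    lock = threading.Lock()     # protects the shared results dict

    def worker(name):
        while True:
            job = broker.get()
            if job is None:     # sentinel: no more work for this worker
                broker.task_done()
                break
            job_id, rows = job
            # Hypothetical transform step: trim whitespace and upper-case.
            transformed = [value.strip().upper() for value in rows]
            with lock:
                results[job_id] = (name, transformed)
            broker.task_done()

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(num_workers)
    ]
    for thread in threads:
        thread.start()

    # The client only talks to the broker, never to a worker directly.
    for job in jobs:
        broker.put(job)
    for _ in threads:
        broker.put(None)        # one sentinel per worker

    broker.join()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    done = run_etl_jobs([("job-1", [" a ", "b"]), ("job-2", ["c "])], num_workers=2)
    for job_id in sorted(done):
        print(job_id, done[job_id])
```

Because the client only enqueues work and never addresses a worker directly, adding workers in this sketch means raising `num_workers`; which worker handles which job is not deterministic.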

Description

Technical field

[0001] The invention relates to the technical field of ETL architecture, in particular to a lightweight distributed ETL architecture method based on a columnar storage data warehouse.

Background

[0002] ETL is a very important part of the data warehouse. A traditional ETL tool sets up a dedicated transformation step between the data source and the target data warehouse, in which all transformation procedures are applied. This approach solves the problem of using different programming languages on different system platforms. However, performing all of the transformation work in this single dedicated step becomes the bottleneck of the data transformation process. When extracting and transforming data, each data source needs to be extracted and transformed row by row, and finally stored in the data warehouse. This is very inefficient for reading and writing in the case of a large amount of data and multiple target sou...
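
The contrast between row-by-row handling and a columnar layout can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the storage format of any particular data warehouse: the sample records, the field names (`id`, `line`, `tons`), and the helper functions are all hypothetical, and every row is assumed to carry the same fields.

```python
def rows_to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists.

    Assumes every row has the same keys, so all columns end up equally long.
    """
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


def total_row_oriented(rows, field):
    """Sum one field by visiting every complete row."""
    return sum(row[field] for row in rows)


def total_column_oriented(columns, field):
    """Sum one field by reading only that column."""
    return sum(columns[field])


# Hypothetical sample data for illustration only.
SAMPLE_ROWS = [
    {"id": 1, "line": "A", "tons": 10.0},
    {"id": 2, "line": "B", "tons": 20.5},
    {"id": 3, "line": "A", "tons": 5.0},
]

if __name__ == "__main__":
    sample_columns = rows_to_columns(SAMPLE_ROWS)
    print(total_row_oriented(SAMPLE_ROWS, "tons"))
    print(total_column_oriented(sample_columns, "tons"))
```

Both functions return the same total. The difference is that the column-oriented version touches only the `tons` values, whereas the row-oriented version has to walk through every record to reach them.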


Application Information

IPC(8): G06F16/25, G06F16/242, G06F16/23, G06F16/22, G06F16/27, G06F16/28
CPC: G06F16/254, G06F16/283, G06F16/27, G06F16/221, G06F16/2433, G06F16/2365, Y02D10/00
Inventor: 魏铭濡, 王里程
Owner: 鞍钢集团自动化有限公司