Big data ETL dispatching system, capable of supporting visualization and process-oriented implementation

The invention relates to scheduling systems and big data technology in the field of big data processing. It addresses problems such as complex background operations, the risk of cluster misoperation, and increased enterprise project implementation costs, thereby improving ETL development speed and efficiency and reducing costs.

Active Publication Date: 2017-09-08
科技谷(厦门)信息技术有限公司
8 Cites, 6 Cited by

AI Technical Summary

Problems solved by technology

ETL development usually requires related operations in the background. These background operations are complex, which reduces ETL development speed and efficiency. They also carry a risk of misoperating the cluster, which greatly increases the cost of enterprise project implementation.


Examples

Embodiment 1

[0023] Referring to Figure 1, the present invention discloses a big data ETL scheduling system that supports visualization and process-oriented implementation. The system is implemented on a B/S architecture and includes a big data component operation unit, an ETL job management module, an ETL scheduling management module, a system management module, and a job configuration database. The ETL job management module, the ETL scheduling management module, and the big data component operation unit are independent of each other and do not affect each other, wherein:

[0024] As shown in Figure 1, the big data component operation unit includes a data query module supporting visual operations, a component script editing module, a script execution monitoring module, a platform component driver module, a big data platform, a local business system, and a remote business system. The components of the big data platform include HDFS, Hive, HBase, Solr, YARN, Oozie, Spark, Storm, Sqoop, Pig, Impala, and ZooKeeper.
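One way to picture the independence described in [0023] and [0024] is a design in which the job management and scheduling modules never call each other and only share the job configuration database. The following is a minimal sketch under that assumption, not the patented implementation. All class and method names (JobConfigDatabase, PlatformComponentDriver, EtlJobManager, EtlScheduleManager, define_job, run) are hypothetical, and the "drivers" are plain Python callables standing in for real Hive or Sqoop clients.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class JobConfigDatabase:
    """In-memory stand-in for the job configuration database (assumption)."""
    jobs: Dict[str, dict] = field(default_factory=dict)

    def save(self, name: str, config: dict) -> None:
        self.jobs[name] = config

    def load(self, name: str) -> dict:
        return self.jobs[name]


class PlatformComponentDriver:
    """Hypothetical driver layer: maps a component name to a callable."""

    def __init__(self) -> None:
        self._drivers: Dict[str, Callable[[str], str]] = {}

    def register(self, component: str, run: Callable[[str], str]) -> None:
        self._drivers[component] = run

    def execute(self, component: str, script: str) -> str:
        if component not in self._drivers:
            raise KeyError(f"no driver registered for {component!r}")
        return self._drivers[component](script)


class EtlJobManager:
    """Defines jobs; it only writes to the configuration database."""

    def __init__(self, db: JobConfigDatabase) -> None:
        self.db = db

    def define_job(self, name: str, component: str, script: str) -> None:
        self.db.save(name, {"component": component, "script": script})


class EtlScheduleManager:
    """Runs jobs in the given order; it reads definitions from the database
    and has no reference to the job manager."""

    def __init__(self, db: JobConfigDatabase, driver: PlatformComponentDriver) -> None:
        self.db = db
        self.driver = driver

    def run(self, job_names: List[str]) -> List[str]:
        results = []
        for name in job_names:
            config = self.db.load(name)
            results.append(self.driver.execute(config["component"], config["script"]))
        return results


if __name__ == "__main__":
    db = JobConfigDatabase()
    driver = PlatformComponentDriver()
    # Fake drivers: a real system would submit the script to the component.
    driver.register("Sqoop", lambda script: f"Sqoop ran: {script}")
    driver.register("Hive", lambda script: f"Hive ran: {script}")

    jobs = EtlJobManager(db)
    jobs.define_job("import_orders", "Sqoop", "import --table orders")
    jobs.define_job("clean_orders", "Hive", "INSERT OVERWRITE TABLE orders_clean ...")

    for line in EtlScheduleManager(db, driver).run(["import_orders", "clean_orders"]):
        print(line)
```

Because the two managers share only the database object, either one could be replaced or restarted without touching the other, which is one plausible reading of "independent of each other and do not affect each other".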

[0025...


Abstract

The invention discloses a big data ETL scheduling system that supports visualization and process-oriented implementation. The system is implemented on a B/S architecture and comprises a big data component operation unit, an ETL job management module, an ETL scheduling management module, a system management module, and a job configuration database, wherein the ETL job management module, the ETL scheduling management module, and the big data component operation unit are mutually independent and do not affect each other. The invention effectively avoids complicated background operations, greatly increases the speed and efficiency of ETL development, and reduces the cost of enterprise project implementation.

Description

Technical field

[0001] The invention relates to the technical field of big data processing, in particular to a big data ETL scheduling system supporting visualization and process-oriented implementation.

Background

[0002] ETL (Extract-Transform-Load) is the most important part of BI (big data) projects. ETL usually takes one third of the time of the entire project, and the quality of the ETL design directly determines the success or failure of the BI project.

[0003] Big data ETL is also a long-term process. Only by constantly discovering and solving problems can ETL run more efficiently and provide accurate data for later-stage development of the project. Big data ETL is responsible for extracting data from scattered and heterogeneous data sources, such as relational data and flat data files, to the big data platform system, cleaning, converting, and integrating them, and finally loading them into big data platforms, data warehouses or datasets ...
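The extract, clean, convert, integrate, and load steps described in [0003] can be illustrated with a small sketch. It is an illustration only, using the Python standard library and in-memory data rather than a real big data platform. The function names, the `name` and `amount` columns, and the cleaning rules (drop incomplete or non-numeric rows, lowercase names, remove duplicates) are assumptions made for the example.

```python
import csv
import io
from typing import Dict, Iterable, List

Row = Dict[str, str]


def extract_flat_file(text: str) -> List[Row]:
    """Extract rows from a flat data file (CSV text with a header line)."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_relational(rows: Iterable[tuple], columns: List[str]) -> List[Row]:
    """Extract rows from a relational source, given as tuples plus column names."""
    return [dict(zip(columns, (str(value) for value in row))) for row in rows]


def transform(rows: Iterable[Row]) -> List[dict]:
    """Clean, convert, and integrate rows from all sources."""
    cleaned: List[dict] = []
    seen = set()
    for row in rows:
        name = (row.get("name") or "").strip().lower()
        amount = (row.get("amount") or "").strip()
        if not name or not amount:
            continue  # cleaning: drop incomplete records
        try:
            value = float(amount)  # converting: text to number
        except ValueError:
            continue  # cleaning: drop records with non-numeric amounts
        key = (name, value)
        if key in seen:
            continue  # integrating: remove duplicates across sources
        seen.add(key)
        cleaned.append({"name": name, "amount": value})
    return cleaned


def load(rows: List[dict], target: List[dict]) -> int:
    """Load rows into the target store (a list standing in for a warehouse table)."""
    target.extend(rows)
    return len(rows)


if __name__ == "__main__":
    flat = "name,amount\nAlice, 10\nBob,abc\n,5\n"
    relational = [("alice", 10), ("Carol", 7.5)]
    warehouse: List[dict] = []
    extracted = extract_flat_file(flat) + extract_relational(relational, ["name", "amount"])
    loaded = load(transform(extracted), warehouse)
    print(loaded, warehouse)
```

In a scheduling system of the kind described here, each of these functions would correspond to a separately configured job step, so that a failure in one step can be monitored and retried without rerunning the others.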


Application Information

IPC(8): G06F17/30, G06F11/30, G06F9/48
CPC: G06F9/4881, G06F11/302, G06F11/3051, G06F16/215, G06F16/254
Inventors: 陈思恩, 杨紫胜, 廖雅哲, 林振州
Owner: 科技谷(厦门)信息技术有限公司