Distributed data integration system and method based on Web and Kafka

A distributed data integration technology, applied in database management systems, electronic digital data processing, and structured data retrieval, that addresses problems such as poor ease of use and manageability and complex, cumbersome processes.

Active Publication Date: 2020-04-24
BEIJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

However, building a real-time ETL process based on Kafka Connect requires manual configuration of the Kafka Connect Worker process, management of



Embodiment Construction

[0028] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary and are intended to explain the present invention and should not be construed as limiting the present invention.

[0029] The following describes the distributed data integration system and method based on Web and Kafka according to the embodiments of the present invention with reference to the accompanying drawings. First, the distributed data integration system based on the Web and Kafka proposed by the embodiments of the present invention will be described with reference to the accompanying drawings.

[0030] Figure 1 is a schematic structural diagram of a distributed data integration system based on Web and Kafka according to an embodi...


Abstract

The invention discloses a distributed data integration system and method based on Web and Kafka. The system comprises: a console module, which provides a console through which a user creates and monitors ETL tasks by operating a Web page; a management service module, which provides a management service API for the console module; a schema management module, which manages the schema of the data source end, the schema of the destination end, and the mapping between schemas; a data extraction module, which extracts data from the managed data source end to a message queue; a data processing module, which cleans and converts the data; and a data loading module, which loads the data from the message queue to the destination. With this system, creating an ETL instance based on Kafka Connect is simpler to operate, more standardized to manage, and more flexible to configure, and the ETL program has a low degree of coupling, high fault tolerance, and is easy to extend and integrate.
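The data path described in the abstract (extraction to a message queue, cleaning and conversion, loading to the destination) can be illustrated with a minimal Python sketch. This is not the patent's implementation: an in-memory `queue.Queue` stands in for a Kafka topic, and the names `SCHEMA_MAPPING`, `extract`, `transform`, `load`, and `run_pipeline` are hypothetical, chosen only for illustration.

```python
from queue import Queue

# Hypothetical schema mapping: source field name -> destination field name.
SCHEMA_MAPPING = {"uid": "user_id", "nm": "name"}


def extract(source_rows, topic):
    """Extraction step: push each source record onto the message queue."""
    for row in source_rows:
        topic.put(row)


def transform(record, mapping):
    """Processing step: strip whitespace from strings and rename fields."""
    cleaned = {}
    for src_field, dst_field in mapping.items():
        value = record.get(src_field)
        if isinstance(value, str):
            value = value.strip()
        cleaned[dst_field] = value
    return cleaned


def load(topic, destination, mapping):
    """Loading step: drain the queue, transform, and append to the destination."""
    while not topic.empty():
        destination.append(transform(topic.get(), mapping))


def run_pipeline(source_rows, mapping=SCHEMA_MAPPING):
    topic = Queue()    # in-memory stand-in for a Kafka topic (assumption)
    destination = []   # stand-in for the destination store (assumption)
    extract(source_rows, topic)
    load(topic, destination, mapping)
    return destination
```

Because the extraction and loading steps only share the queue, either side can be replaced without touching the other, which is the kind of low coupling the abstract claims for the system.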

Description

Technical field

[0001] The invention relates to the technical fields of information technology and data business, in particular to a distributed data integration system and method based on Web and Kafka.

Background

[0002] When performing cross-application data fusion calculations, it is first necessary to collect data from isolated data sources and bring it together at a destination that the computing platform can access efficiently. This process is called ETL: extract, transform, and load. Traditionally, ETL is done with batch jobs, which periodically load (incremental) data from the source, process it according to transformation logic, and write it to the destination. Depending on business needs and computing power, the delay of batch processing usually ranges from minutes to days. In some application scenarios, ETL requires the shortest possible delay, which creates the need for real-time ETL. ...
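The traditional batch pattern described in the background (periodically load incremental data, transform it, write it to the destination) can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the patent: the function name `run_batch`, the integer `id` field used as an incremental watermark, and the list-based source and destination are all hypothetical.

```python
def run_batch(source_rows, destination, watermark, transform):
    """Run one batch: process rows newer than the watermark.

    Assumes each row is a dict with a monotonically increasing integer "id"
    (a hypothetical incremental key). Returns the updated watermark.
    """
    new_rows = [row for row in source_rows if row["id"] > watermark]
    for row in new_rows:
        destination.append(transform(row))
    # Keep the old watermark when the batch found nothing new.
    return max((row["id"] for row in new_rows), default=watermark)
```

A scheduler would call `run_batch` at a fixed interval, so the delay seen at the destination is bounded below by that interval. This is the limitation that motivates real-time ETL.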


Application Information

IPC(8): G06F16/215, G06F16/2457, G06F16/25, G06F16/27
CPC: G06F16/215, G06F16/24578, G06F16/254, G06F16/27
Inventor: 鄂海红, 宋美娜, 王园
Owner: BEIJING UNIV OF POSTS & TELECOMM