A distributed ETL data acquisition method and device

A data-collection and distributed-processing technology, applied in structured data retrieval, database management systems, database distribution/replication, and related fields. It addresses the problems that data collection is time-consuming and that collection efficiency and progress cannot be guaranteed, and it achieves the effect of improving efficiency.

Active Publication Date: 2019-04-05
SHENZHEN THINKIVE INFORMATION TECH CO LTD
8 Cites, 2 Cited by
AI Technical Summary

Problems solved by technology

[0004] However, existing enterprises usually run multiple internal systems in parallel, and existing ETL data collection is performed by a single ETL collection server. Unified data processing and analysis therefore often requires a great deal of time and effort for data collection. Especially with multiple collection tasks and relatively large data volumes, the efficiency and progress of collection cannot be guaranteed.


Examples

Embodiment 1

[0089] In step S16 above, if the result is yes, the flow proceeds to the concurrent-processing sub-flow.

[0090] Referring to Figure 2, the concurrent-processing sub-flow includes the following steps:

[0091] S161) The ETL data server fetches a task from the task list;

[0092] S162) The ETL data server invokes the row-lock mechanism of the ORACLE database, selects the task request instruction of one of the ETL execution servers to match the task, locks the task record, changes the task status to "executing", and finally unlocks the task record;

[0093] S163) The ETL data server returns pairing success information to the successfully matched ETL execution server, and pairing failure information to the remaining ETL execution servers.
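
Steps S161 to S163 can be illustrated with a minimal sketch. It does not use an ORACLE database: a per-record `threading.Lock` stands in for the row lock, and threads stand in for the ETL execution servers. The names `TaskRecord`, `try_claim`, `run_pairing`, and the status strings are hypothetical and chosen only for illustration.

```python
import threading
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TaskRecord:
    task_id: int
    status: str = "pending"            # hypothetical status names
    owner: Optional[str] = None
    # Stand-in for the database row lock (assumption, not ORACLE itself).
    row_lock: threading.Lock = field(default_factory=threading.Lock)


def try_claim(task: TaskRecord, server_name: str) -> bool:
    """Return True on pairing success, False on pairing failure."""
    with task.row_lock:                # lock the task record
        if task.status != "pending":
            return False               # another server already matched
        task.status = "executing"      # change the status while locked
        task.owner = server_name
        return True                    # record is unlocked on exit


def run_pairing(task: TaskRecord, servers: list) -> dict:
    """Let every execution server request the same task concurrently."""
    results = {}

    def request(name: str) -> None:
        results[name] = try_claim(task, name)

    threads = [threading.Thread(target=request, args=(n,)) for n in servers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


task = TaskRecord(task_id=1)
results = run_pairing(task, ["exec-a", "exec-b", "exec-c"])
```

Exactly one entry in `results` is `True`, which corresponds to the pairing success message of S163; the other servers receive the failure result. Which server wins depends on thread scheduling.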

[0094] The steps after S21 of the collection task execution flow include:

[0095] S22) The ETL execution server feeds back the completion information of the collection task to the ETL data server;

[0096] S23) The task execution status is updated to "execution completed".
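
As a rough sketch of S22 and S23, the completion report can be modeled as a guarded status update on the task list. The dictionary layout, the function name `mark_completed`, and the status strings are assumptions made for this example, not details from the patent.

```python
def mark_completed(task_list: dict, task_id: int) -> None:
    """Apply a completion report (S22) to the task list (S23).

    task_list maps a task id to its status string (hypothetical layout).
    Only a task that is currently executing may be marked completed.
    """
    current = task_list.get(task_id)
    if current != "executing":
        raise ValueError(f"task {task_id} is not executing (status: {current})")
    task_list[task_id] = "completed"


tasks = {7: "executing", 8: "pending"}
mark_completed(tasks, 7)   # tasks[7] is now "completed"
```

Rejecting reports for tasks that are not in the executing state is a design choice of this sketch; it keeps a stray or duplicate report from overwriting the status.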

...

Embodiment 2

[0099] In step S12 above, if the result is yes, the ETL data server checks the execution status of the historical collection task in the task list that corresponds to the current collection task. If the historical task has completed, the flow goes to step S13. If it is executing, the current task is not added to the task list and the flow returns to step S11. If it is waiting to be executed, the current task is likewise not added to the task list and the flow returns to step S11.
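
The branching in this paragraph can be written as a small decision function. This is a sketch covering only the "yes" branch of S12 described here; the function name `next_step` and the status strings are hypothetical.

```python
def next_step(history_status: str) -> str:
    """Decide where the flow goes when a matching historical task exists.

    Returns "S13" if the historical task has completed, so the current
    task may proceed. Returns "S11" if the historical task is executing
    or still waiting, so the current task is not added to the task list.
    """
    if history_status == "completed":
        return "S13"
    if history_status in ("executing", "pending"):
        return "S11"
    raise ValueError(f"unknown task status: {history_status}")


for status in ("completed", "executing", "pending"):
    print(status, "->", next_step(status))
```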

[0100] In this embodiment, it is determined whether a task identical to the current collection task already exists in the task list. Because data collection usually repeats detection and collection (incremental collection) on a fixed execution cycle, only one collection task needs to be configured, and it can be activated repeatedly according to the set cycle. However, in order to avoid repeated collection within a short period of time, or overlap between successive runs of the same collection task, the comparison process between the current collection task and t...

Embodiment 3

[0103] In the above, in the step S13, the ETL data server initializes the date replacement flag table.

[0104] In this embodiment, the ETL data server initializes the date replacement flag table so that condition placeholders can be replaced with specific values when input statements are assembled. For example, a macro string in the data source SQL is replaced, so that the string represented by the original macro becomes data with real meaning.
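
A minimal sketch of this kind of macro replacement follows. The macro names `#TODAY#` and `#YESTERDAY#`, the `YYYYMMDD` date format, and the example table and column names are assumptions for illustration; the patent text does not specify them.

```python
from datetime import date, timedelta


def build_date_flags(run_date: date) -> dict:
    """Build a hypothetical date replacement flag table for one run."""
    return {
        "#TODAY#": run_date.strftime("%Y%m%d"),
        "#YESTERDAY#": (run_date - timedelta(days=1)).strftime("%Y%m%d"),
    }


def expand_macros(sql: str, flags: dict) -> str:
    """Replace every macro string in the SQL text with its concrete value."""
    for macro, value in flags.items():
        sql = sql.replace(macro, value)
    return sql


flags = build_date_flags(date(2019, 4, 5))
template = "SELECT * FROM orders WHERE trade_date = '#YESTERDAY#'"
print(expand_macros(template, flags))
# prints: SELECT * FROM orders WHERE trade_date = '20190404'
```

Plain string substitution is enough to show the idea. When values come from untrusted input, bind parameters are the safer choice.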

Abstract

The invention provides a distributed ETL data acquisition method and device. The method comprises the following steps: ETL acquisition tasks are configured centrally in an ETL data server and added to a task list; the tasks are then distributed to a plurality of distributed ETL execution servers; and the ETL execution servers acquire tasks according to the data in the to-be-executed task table of the ETL data server. The original mode, in which a single machine executes ETL data acquisition independently, is thereby changed into a mode that supports multiple machines. Data acquisition is carried out through division of labor, and acquisition efficiency is greatly improved.

Description

Technical field

[0001] The invention relates to a data acquisition method, and in particular to a distributed ETL data acquisition method and device.

Background technique

[0002] ETL, the abbreviation of Extract-Transform-Load, describes the process of extracting data from a source, transforming it, and loading it into a destination.

[0003] Information is an important resource of modern enterprises and the basis of scientific management and decision analysis. At present, most companies spend a lot of money and time building online transaction processing (OLTP) business systems and office automation systems to record the various data related to transaction processing. Whether an enterprise can make maximum use of its existing data resources and convert them into information and knowledge has therefore become the main bottleneck in improving its core competitiveness. ETL is the main technical means of doing so.

[0004] However, existing enterprises usually have multiple internal systems in pa...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/25, G06F16/27
Inventor: 王杰
Owner: SHENZHEN THINKIVE INFORMATION TECH CO LTD