A data lineage analysis method, device, equipment and storage medium

A data lineage analysis method, applied in the computer field, that addresses problems such as the inability to use wildcards in SQL statements

Active Publication Date: 2021-09-17
杭州天均科技有限公司

AI Technical Summary

Problems solved by technology

[0006] (1) Developers are required to list specific field names in the SQL statement, and wildcards cannot be used, which imposes certain constraints on developers;
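The wildcard constraint can be illustrated with a minimal sketch. It assumes that the analyzer can look up the current column list of a table at analysis time; the `SCHEMAS` dictionary, the `expand_wildcard` helper and the simple regular expression are hypothetical and are not taken from the patent text. A real implementation would use a full SQL parser and read column metadata from the database.

```python
import re

# Hypothetical table schemas, standing in for column metadata that would be
# read from the database at analysis time (an assumption of this sketch).
SCHEMAS = {
    "orders": ["order_id", "customer_id", "amount"],
}


def expand_wildcard(sql, schemas):
    """Return the output field list of a simple 'SELECT ... FROM table' statement.

    If the field part is the wildcard '*', it is replaced by the columns
    currently recorded for the table; otherwise the listed fields are returned.
    """
    match = re.match(
        r"\s*select\s+(.+?)\s+from\s+(\w+)", sql, re.IGNORECASE | re.DOTALL
    )
    if match is None:
        raise ValueError("unsupported statement: " + sql)
    field_part, table = match.group(1), match.group(2)
    if field_part.strip() == "*":
        return list(schemas[table])
    return [field.strip() for field in field_part.split(",")]


print(expand_wildcard("SELECT * FROM orders", SCHEMAS))
print(expand_wildcard("select order_id, amount from orders", SCHEMAS))
```

Because the column list is looked up rather than written into the statement, a change to the table structure changes the result without any edit to the SQL text.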


Detailed Description of the Embodiments

[0066] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are some of the embodiments of the present invention, but not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0067] It should be understood that, when used in this specification and the appended claims, the terms "comprising" and "comprises" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or collections thereof.

[0068] The present invention will be further described below i...


Abstract

The present invention provides a data lineage analysis method, device, equipment, and storage medium. The method includes the following steps: S1, acquiring the script of a scheduling task; S2, parsing the script of the scheduling task to obtain database connection information, SQL statements, and a directed acyclic graph of the flow between the job nodes of the scheduling task; S3, traversing each job node and determining the input field set, the output field set, and the field mapping information inside the job node; S4, based on the directed acyclic graph and the field mapping information inside each job node, constructing the connections between the input field sets and the output field sets to generate a data lineage graph. The invention performs data lineage analysis based on task scheduling and database buried points (tracking points). It can process SQL statements that use wildcards and automatically adapts to changes in the table structure in the database, so the field part of the SQL statements in the ETL script does not need to be modified manually and the data lineage graph is generated dynamically.
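Steps S3 and S4 can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the job names, field names, the `JOB_EDGES` and `FIELD_MAPPINGS` structures, and the rule of joining fields across jobs by identical name are all hypothetical.

```python
from collections import defaultdict

# Hypothetical S2 output: edges of the directed acyclic graph between job nodes.
JOB_EDGES = [("job_extract", "job_aggregate")]

# Hypothetical S3 output: field mappings inside each job node,
# written as (input field, output field) pairs.
FIELD_MAPPINGS = {
    "job_extract": [
        ("src.orders.amount", "stg.orders.amount"),
        ("src.orders.customer_id", "stg.orders.customer_id"),
    ],
    "job_aggregate": [
        ("stg.orders.amount", "dw.sales.total_amount"),
        ("stg.orders.customer_id", "dw.sales.customer_id"),
    ],
}


def build_lineage(job_edges, field_mappings):
    """S4 sketch: build an adjacency map over (job, 'in'/'out', field) nodes."""
    graph = defaultdict(set)
    # Inside each job node: connect every input field to its output field.
    for job, pairs in field_mappings.items():
        for src, dst in pairs:
            graph[(job, "in", src)].add((job, "out", dst))
    # Between job nodes: follow each DAG edge and join an upstream output
    # field to the downstream input field that carries the same name.
    for upstream, downstream in job_edges:
        outputs = {dst for _, dst in field_mappings.get(upstream, [])}
        inputs = {src for src, _ in field_mappings.get(downstream, [])}
        for field in outputs & inputs:
            graph[(upstream, "out", field)].add((downstream, "in", field))
    return graph


def downstream_nodes(graph, start):
    """Return every node reachable from start (depth-first traversal)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


lineage = build_lineage(JOB_EDGES, FIELD_MAPPINGS)
for node in sorted(downstream_nodes(lineage, ("job_extract", "in", "src.orders.amount"))):
    print(node)
```

Keeping the job name in each node makes it possible to tell which job produced a given field, at the cost of a larger graph than one keyed by field name alone.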

Description

Technical field

[0001] The invention belongs to the field of computer technology and relates to a data lineage analysis method, device, equipment and storage medium, and in particular to a data lineage analysis method, device, equipment and storage medium based on scheduling task scripts and database buried points.

Background

[0002] Lineage analysis is a technique for comprehensively tracking how data is processed, so as to find all metadata objects related to a given data object, together with the relationships between those metadata objects. These relationships are the data-flow input-output relationships between the metadata objects.

[0003] With the development and application of big data technology, there is a demand for data lineage analysis in the field of big data governance. Through the analysis and processing of structured query statements for database operations, the mapping relat...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/22, G06F16/242, G06F16/25, G06F16/28
CPC: G06F16/2246, G06F16/242, G06F16/254, G06F16/284
Inventor: 胡黎玮, 彭飞, 卢凯瑞, 何亚鹏, 王毅, 聂黎洲, 王诚
Owner: 杭州天均科技有限公司