Field-level data lineage determination method and device based on a knowledge graph

A knowledge-graph-based, field-level data lineage technique applied in the field of data processing. It addresses problems such as high personnel turnover, the difficulty of reflecting intermediate processing steps, and the inability to expose lineage to data users.

Active Publication Date: 2021-04-27
TIANYUN RONGCHUANG DATA TECH BEIJING CO LTD

AI Technical Summary

Problems solved by technology

[0003] If business changes are frequent, development cycles are short, and personnel turnover is high, table structures and table dependencies may change frequently over time, eventually making the relationships between tables complicated and difficult to trace.
The dependencies commonly recorded are those between data tables and production tasks. Which fields of the upstream data table a production task uses is reflected only in the coding logic. For example, database A or department A saves the updated data file to a specified directory and sends it to database B through FTP (File Transfer Protocol) or another synchronization method, datab...

Examples

Embodiment 1

[0034] Referring to the flow chart of the knowledge-graph-based field-level data lineage determination method shown in Figure 1, the method includes the following steps:

[0035] In step S102, the respective table names and table structure information of the two data tables in the first data table pair are obtained.

[0036] The first data table pair consists of any two data tables in the database, referred to as the first data table and the second data table. The first and second data tables may belong to the same database or to different databases; when they belong to different databases, those databases may be in the same database system or in different database systems. It follows that there are multiple first data table pairs.
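
As an illustration of why there are multiple first data table pairs, the following minimal sketch enumerates every pair from a small catalog. The table identifiers and the `first_table_pairs` helper are hypothetical, and treating pairs as unordered is an assumption; the source only says "any two data tables".

```python
from itertools import combinations

# Hypothetical catalog of (database, table) identifiers; names are illustrative only.
tables = [
    ("db_a", "customer"),
    ("db_a", "customer_order"),
    ("db_b", "cust_order_summary"),
]

def first_table_pairs(tables):
    """Return every unordered pair of distinct tables, within or across databases."""
    return list(combinations(tables, 2))

pairs = first_table_pairs(tables)
for first, second in pairs:
    print(first, second)
```

Under this assumption, a catalog of n tables yields n(n-1)/2 candidate pairs, so the number of pairs grows quadratically with the number of tables.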

[0037] The table name of the data table can be a Chinese table name or an English table name. The table structure...

Embodiment 2

[0088] Referring to Figure 3, a structural block diagram of a knowledge-graph-based field-level data lineage determination device is shown. The device includes:

[0089] A first obtaining module 302, configured to obtain the respective table names and table structure information of the two data tables in the first data table pair, wherein the first data table pair includes any two data tables in the database, and the table structure information includes multiple fields;

[0090] A table name similarity calculation module 304, configured to calculate the first table name similarity between the table names of the two data tables in the first data table pair;

[0091] A field similarity calculation module 306, configured to calculate the similarity between fields in the table structure information of the two data tables in the first data table pair, to obtain a first field similarity matrix;

[0092] A lineage determination module 310, configured to determine whether the rel...
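
The two similarity modules (304 and 306) could be sketched roughly as follows. The visible text does not state which similarity measure is used, so Python's `difflib.SequenceMatcher` stands in here purely as an assumption; the function names and the sample table and field names are hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; SequenceMatcher is a stand-in metric (assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def field_similarity_matrix(fields_a, fields_b):
    """matrix[i][j] is the similarity between fields_a[i] and fields_b[j]."""
    return [[similarity(fa, fb) for fb in fields_b] for fa in fields_a]

# Hypothetical table names and table structures.
name_sim = similarity("customer_order", "cust_order_summary")
matrix = field_similarity_matrix(
    ["order_id", "customer_id", "amount"],
    ["order_id", "cust_id", "total_amount", "order_date"],
)
```

The matrix has one row per field of the first table and one column per field of the second, so identical field names such as `order_id` score 1.0 in the corresponding cell.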

Abstract

The invention relates to a field-level data lineage determination method and device based on a knowledge graph. The method comprises: obtaining the table names and table structure information of the two data tables in a first data table pair, wherein the first data table pair comprises any two data tables in a database; calculating a first table name similarity between the table names of the two data tables in the first data table pair, and calculating the similarity between fields in the table structure information to obtain a first field similarity matrix; determining, according to the first table name similarity, the first field similarity matrix, and a pre-acquired target weight, whether a lineage relationship exists between the two data tables in the first data table pair; taking the first data table pairs that have a lineage relationship as target data table pairs; and generating a field-level data relationship graph according to the knowledge graph and the lineage relationships between the data tables in the target data table pairs. The method and device reduce the difficulty of determining data table lineage and improve the accuracy of the determination.
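
The decision and graph-generation steps summarized above might look like the sketch below. The abstract does not give the scoring formula, so the weighted blend, the row-wise best-match averaging, the thresholds, and all function names are assumptions made only for illustration. Note that similarity alone does not establish which table is upstream, so the emitted triples are undirected field relationships.

```python
def lineage_score(name_sim, field_matrix, weight):
    """Blend table-name similarity with the mean best-match field similarity (assumed formula)."""
    best_matches = [max(row) for row in field_matrix if row]
    field_sim = sum(best_matches) / len(best_matches) if best_matches else 0.0
    return weight * name_sim + (1.0 - weight) * field_sim

def has_lineage(name_sim, field_matrix, weight=0.4, threshold=0.6):
    """Decide whether a table pair is related; weight and threshold are hypothetical values."""
    return lineage_score(name_sim, field_matrix, weight) >= threshold

def field_edges(table_a, fields_a, table_b, fields_b, field_matrix, min_sim=0.8):
    """Emit (field_a, field_b, similarity) triples for sufficiently similar field pairs."""
    edges = []
    for i, row in enumerate(field_matrix):
        for j, sim in enumerate(row):
            if sim >= min_sim:
                edges.append((f"{table_a}.{fields_a[i]}", f"{table_b}.{fields_b[j]}", sim))
    return edges

# Hypothetical inputs: a name similarity and a 2x2 field similarity matrix.
name_sim = 0.7
matrix = [[1.0, 0.2], [0.3, 0.9]]
related = has_lineage(name_sim, matrix)
edges = field_edges("orders", ["order_id", "amount"],
                    "order_summary", ["order_id", "total_amount"], matrix)
```

Triples of this shape can be loaded into a graph store as field nodes joined by relationship edges, which is one plausible way to arrive at a field-level relationship graph.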

Description

Technical field

[0001] The present disclosure relates to the technical field of data processing, and in particular to a field-level data lineage determination method and device based on a knowledge graph.

Background

[0002] Data lineage refers to the chain through which data is produced. It describes which other tables a table depends on and how the fields of a table are derived from the fields of other tables. With data lineage, the upstream and downstream dependencies of data production can be clearly known. When an enterprise has a wide variety of businesses and a large business volume, the database system supporting the business involves hundreds or thousands of tables, with very complicated dependencies between them.

[0003] If business changes are frequent, development cycles are short, and personnel turnover is high, table structures and table dependencies may change frequently over time, e...
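
To make the notion of data lineage in [0002] concrete, the sketch below stores field-level dependencies in a dictionary and walks them to find everything a field depends on. The table and field names and the `trace_upstream` helper are hypothetical and not taken from the patent.

```python
# Hypothetical field-level lineage: each field maps to the upstream fields it is derived from.
upstream = {
    "report.total_sales": ["orders.amount"],
    "orders.amount": ["raw_orders.price", "raw_orders.quantity"],
}

def trace_upstream(field, upstream):
    """Return all fields that `field` depends on, directly or transitively."""
    seen = []
    stack = [field]
    while stack:
        current = stack.pop()
        for parent in upstream.get(current, []):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen

sources = trace_upstream("report.total_sales", upstream)
```

Reversing the same mapping gives the downstream view, which answers which tables and fields are affected when an upstream field changes.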

Application Information

IPC(8): G06F16/22, G06F16/36, G06K9/62
CPC: G06F16/2282, G06F16/367, G06F18/22
Inventor: 雷涛, 乔旺龙, 赵琳, 曹晓磊
Owner: TIANYUN RONGCHUANG DATA TECH BEIJING CO LTD