
Neo4j-based big data consanguinity management method, system, apparatus and storage medium

A blood relationship (data lineage) management technology applied in the database field. It addresses problems such as high cost, and has the effect of raising the level of metadata management capability and strengthening control.

Inactive Publication Date: 2019-03-08
SF TECH
Cites: 5; Cited by: 31

AI Technical Summary

Problems solved by technology

If the big data team has to maintain two underlying storage systems, HBase and ElasticSearch, just for the Atlas system, the cost is too high.

Method used

SQL statements are parsed into abstract syntax trees; each tree is traversed depth-first and node data is collected at each node; the collected node data relationships are stored in the neo4j graph database and the collected necessary information in HBase; information from heterogeneous data sources is imported into the consanguinity system.


Examples


Embodiment 1

[0043] The neo4j-based big data blood relationship (lineage) management method of this embodiment includes:

[0044] S1. Parse each SQL statement to generate a corresponding abstract syntax tree. For each abstract syntax tree, traverse its nodes depth-first and collect the corresponding node data at each node;

[0045] Here, the Antlr parsing tool performs the lexical, syntactic and semantic analysis and generates the corresponding abstract syntax tree.

[0046] The corresponding node data includes the source data table, the target data table, the fields of the source data table and the fields of the target data table.

[0047] Specifically, the LineageMgr service uses the Antlr parser to parse the successfully executed Hive SQL stored in HDFS and obtain the Hive SQL abstract syntax tree. It traverses each node of the syntax tree depth-first, analyzes the relative structure of each subtree, and collects the data of the important nodes; the collected data in...
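The traversal described in step S1 can be illustrated with a minimal Python sketch. It does not use Antlr or the Hive grammar: the tree is built by hand, and the `Node` and `Lineage` classes, the `collect` function and the node kinds (`SOURCE_TABLE`, `TARGET_TABLE`, `SOURCE_FIELD`, `TARGET_FIELD`) are hypothetical names chosen for this example, as are the table and field names. It only shows how a depth-first walk can gather the four kinds of node data listed in [0046].

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A toy AST node; the `kind` labels are invented for this sketch."""
    kind: str
    text: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


@dataclass
class Lineage:
    """The node data collected for one SQL statement."""
    source_tables: List[str] = field(default_factory=list)
    target_tables: List[str] = field(default_factory=list)
    source_fields: List[str] = field(default_factory=list)
    target_fields: List[str] = field(default_factory=list)


def collect(node: Node, out: Lineage) -> Lineage:
    """Depth-first (pre-order) walk that records node data as it goes."""
    bucket = {
        "SOURCE_TABLE": out.source_tables,
        "TARGET_TABLE": out.target_tables,
        "SOURCE_FIELD": out.source_fields,
        "TARGET_FIELD": out.target_fields,
    }.get(node.kind)
    # Record the node's text once if this kind of node carries lineage data.
    if bucket is not None and node.text is not None and node.text not in bucket:
        bucket.append(node.text)
    # Visit the whole subtree of each child before moving to the next child.
    for child in node.children:
        collect(child, out)
    return out


# Hand-built tree standing in for the statement:
#   INSERT INTO dw.orders_daily (order_id, total)
#   SELECT id, amount FROM ods.orders
tree = Node("QUERY", children=[
    Node("INSERT", children=[
        Node("TARGET_TABLE", "dw.orders_daily"),
        Node("TARGET_FIELD", "order_id"),
        Node("TARGET_FIELD", "total"),
    ]),
    Node("SELECT", children=[
        Node("SOURCE_FIELD", "id"),
        Node("SOURCE_FIELD", "amount"),
        Node("FROM", children=[Node("SOURCE_TABLE", "ods.orders")]),
    ]),
])

lineage = collect(tree, Lineage())
print(lineage)
```

In a real pipeline the tree would come from the parser rather than being built by hand, and the mapping from grammar nodes to source and target roles would depend on the grammar in use.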


Abstract

The present invention relates to a neo4j-based big data consanguinity management method, system, apparatus and storage medium. The method comprises the following steps: parsing SQL statements to generate corresponding abstract syntax trees; for each abstract syntax tree, traversing its nodes depth-first and collecting the corresponding node data at each node; storing the collected node data relationships in the neo4j graph database and the collected necessary information in HBase; and importing information from heterogeneous data sources into the consanguinity system to form consanguinity. The method makes it easy to display graphically the dependencies and consanguinity between data sources and Hive tables, raises the metadata management capability of the big data platform, strengthens control over data flow within the platform, traces the origin and development of data, and breaks down the barriers between heterogeneous data sources by serving as a bridge of consanguinity that connects them.
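As a rough illustration of the storage step, the following Python sketch turns collected source and target tables into parameterized Cypher `MERGE` statements that a neo4j client could execute. The abstract does not specify the graph schema, so the `Table` label, the `name` property, the `FLOWS_TO` relationship type, the function name `lineage_statements` and the sample table names are all assumptions made for this example. No database connection is opened; the sketch only builds the statements and their parameter maps.

```python
from typing import Dict, List, Tuple

# Hypothetical schema: one node per table, one edge per source -> target flow.
# MERGE creates the node or relationship only if it does not already exist,
# so re-running the same statement does not duplicate lineage edges.
EDGE_QUERY = (
    "MERGE (s:Table {name: $source}) "
    "MERGE (t:Table {name: $target}) "
    "MERGE (s)-[:FLOWS_TO]->(t)"
)


def lineage_statements(
    sources: List[str], targets: List[str]
) -> List[Tuple[str, Dict[str, str]]]:
    """Return one (query, parameters) pair per source/target table pair."""
    return [
        (EDGE_QUERY, {"source": s, "target": t})
        for t in targets
        for s in sources
    ]


statements = lineage_statements(["ods.orders", "ods.users"], ["dw.orders_daily"])
for query, params in statements:
    print(query, params)
```

Passing table names as parameters rather than formatting them into the query text keeps the statement fixed and avoids quoting problems with unusual table names.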

Description

Technical field

[0001] The invention relates to the technical field of databases, in particular to a method, system, device and storage medium for blood relationship management based on neo4j big data.

Background technique

[0002] In the era of big data, data contains infinite value. The vigorous development of the mobile Internet has allowed Internet companies to accumulate PB-level user data and business data. Driven by strong demand, big data technology has also matured steadily. Through HDFS, HBase, MongoDB, Kafka and other storage components, massive and continuously growing volumes of data have been recorded.

[0003] From generation, through processing, integration and circulation, to final demise, relationships naturally form between data. Expressing these relationships by analogy with kinship in human society gives what is called the blood relationship of data.

[0004] With the rise of big data, data mining is becomi...


Application Information

IPC(8): G06F16/28
Inventor: 邓燕辉, 蔡适择, 姚小龙, 曾昭正, 唐国凯, 张文斌
Owner: SF TECH