Data lake schema classification method, device, equipment, medium and product

A data lake schema classification technology that addresses the difficulty of determining which table schema a partition schema belongs to, with the effects of reducing quantity, reducing destructiveness and offering strong applicability.

Pending Publication Date: 2022-05-06
GUANGZHOU WERIDE TECH LTD CO

AI Technical Summary

Problems solved by technology

[0004] In view of the related technologies mentioned above, the inventor believes that when at least two table schemas are compatible with the partition schema, it is difficult to determine which table schema the partition schema should be classified into.

Examples

Embodiment Construction

[0056] This specific embodiment is only an explanation of this application and does not limit it. After reading this specification, those skilled in the art can modify this embodiment as needed without making a creative contribution, and all such modifications are protected by patent law as long as they fall within the scope of the claims of this application.

[0057] In order to make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. Based on the embodiments in this application, all other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the protection scope of this application. ...

Abstract

The invention relates to a data lake schema classification method, device, equipment, medium and product. The method comprises the steps of: creating a queue and initializing it; obtaining all fields of at least two table schemas and storing them in the queue; comparing, in sequence, the field types and field names of the fields in the partition schema with those of the at least two table schemas in the queue; calculating, in sequence, the distance between the fields in the partition schema and the same fields of the at least two table schemas in the queue; and determining the table schema in the queue for which the distance is smallest, and merging the partition schema into that table schema. This solves the problem that it is difficult to determine which table schema a partition schema should be classified into when at least two compatible table schemas exist. The method and device have the effect of quickly merging the partition schema into the table schema.
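The steps in the abstract can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the abstract does not define the compatibility check or the distance metric, so the sketch assumes that a schema is a mapping from field name to field type, that two schemas are compatible when every shared field name has the same type, and that the distance is the number of field names present in only one of the two schemas. The names `Schema`, `is_compatible`, `distance` and `classify_and_merge` are hypothetical.

```python
from collections import deque
from typing import Dict, Tuple

# Assumption: a schema is modeled as {field name: field type}.
Schema = Dict[str, str]


def is_compatible(partition: Schema, table: Schema) -> bool:
    """Assumed rule: fields that share a name must also share a type."""
    return all(
        table[name] == ftype
        for name, ftype in partition.items()
        if name in table
    )


def distance(partition: Schema, table: Schema) -> int:
    """Assumed metric: count of field names found in only one schema."""
    return len(set(partition) ^ set(table))


def classify_and_merge(
    partition: Schema, tables: Dict[str, Schema]
) -> Tuple[str, Schema]:
    """Pick the closest compatible table schema and merge the partition into it."""
    # Create and initialize the queue with the candidate table schemas.
    queue = deque(tables.items())

    best_name, best_dist = None, None
    while queue:
        name, table = queue.popleft()
        # Compare field names and types before computing a distance.
        if not is_compatible(partition, table):
            continue
        d = distance(partition, table)
        if best_dist is None or d < best_dist:
            best_name, best_dist = name, d

    if best_name is None:
        raise ValueError("no compatible table schema")

    # Merge: keep the table's fields and add any fields new in the partition.
    merged = {**tables[best_name], **partition}
    return best_name, merged


if __name__ == "__main__":
    # Hypothetical example data; field and table names are made up.
    partition_schema = {"vehicle_id": "string", "timestamp": "int64", "speed": "double"}
    table_schemas = {
        "sensor_v1": {"vehicle_id": "string"},
        "sensor_v2": {"vehicle_id": "string", "timestamp": "int64",
                      "speed": "double", "heading": "double"},
        "logs": {"vehicle_id": "int64", "message": "string"},
    }
    print(classify_and_merge(partition_schema, table_schemas))
```

In this sketch, ties are resolved in favor of the table schema that entered the queue first; a real system would need an explicit tie-breaking rule, which the abstract does not specify.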

Description

Technical field

[0001] The present application relates to the technical field of data classification, and in particular to a data lake schema classification method, device, equipment, medium and product.

Background

[0002] An unmanned driving system includes originally collected sensor data, labeled data, log data and so on. These different kinds of data constitute a data lake. Because each kind of data is generated every day, it is usually stored in different partitions according to date, vehicle and other information for ease of management. Generally, the table structure description information corresponding to the data lake is called a table schema, and the table structure description information corresponding to a given partition in the data lake is called a partition schema. As time goes by, the data keeps evolving, so different partitions have different partition schemas.

[0003] For ease of use, compatible partition schema...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/22, G06F16/2455, G06F16/28
CPC: G06F16/2282, G06F16/221, G06F16/24554, G06F16/285
Inventor: 孙子文, 陈飞, 韩旭
Owner: GUANGZHOU WERIDE TECH LTD CO