Big data cluster metadata information collection method, device, equipment and medium

A technology for collecting metadata information, applied in database models, relational databases, structured data retrieval, etc. It addresses problems such as missing metadata information, failure to collect metadata information, and low efficiency of metadata information collection.

Active Publication Date: 2021-03-19
PINGAN YIQIANBAO E COMMERCE CO LTD
9 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

However, because users can submit tasks to the cluster through many channels and the amount of data is large, collecting metadata information in this way can easily fail to capture it accurately and completely. This leads to missing metadata information and low collection efficiency, which is not conducive to the management of big data clusters.

Examples

Specific embodiment

[0101] Referring to Figure 8, which shows a specific implementation following step S5, the method includes:

[0102] S51: Identify the data information in the metadata information that is the same as the historical data in the big data warehouse, and treat it as duplicate data information.

[0103] Specifically, the collected metadata information may contain data information identical to the historical data in the big data warehouse. To reduce data redundancy, and thereby the load on the big data cluster, the data information in the metadata information that is identical to the historical data in the big data warehouse is identified as duplicate data information.

[0104] S52: In the big data warehouse, delete the duplicate data information in the metadata information to obtain newly added metadata information.

[0105] Specifically, the duplicate data information in the metadata information is deleted, and the remaining metadata information will be distinguished from the historical data in ...
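Steps S51 and S52 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `MetadataRecord` type, its fields, and both function names are assumptions introduced here, and "the same as the historical data" is interpreted as exact equality on all fields.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class MetadataRecord:
    """Hypothetical metadata record; the field names are assumptions."""
    location: str
    size_bytes: int
    storage_format: str


def find_duplicates(collected: Iterable[MetadataRecord],
                    historical: Iterable[MetadataRecord]) -> List[MetadataRecord]:
    """S51: records in the collected metadata that already exist as historical data."""
    historical_set = set(historical)
    return [record for record in collected if record in historical_set]


def remove_duplicates(collected: Iterable[MetadataRecord],
                      duplicates: Iterable[MetadataRecord]) -> List[MetadataRecord]:
    """S52: drop the duplicates, leaving only newly added metadata."""
    duplicate_set = set(duplicates)
    return [record for record in collected if record not in duplicate_set]


# Illustrative data only.
historical = [MetadataRecord("/warehouse/orders", 1024, "parquet")]
collected = [
    MetadataRecord("/warehouse/orders", 1024, "parquet"),
    MetadataRecord("/warehouse/users", 2048, "orc"),
]

duplicates = find_duplicates(collected, historical)
newly_added = remove_duplicates(collected, duplicates)
```

Using a frozen dataclass makes records hashable, so membership checks against the historical data are set lookups rather than pairwise comparisons.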

Abstract

The invention relates to the technical field of data collection, and discloses a method, device, equipment and medium for collecting metadata information of a big data cluster. The method comprises: receiving a task submitted by a user to a big data cluster, parsing the task, and obtaining an execution plan corresponding to the task; performing computation on the nodes of the big data cluster according to the execution plan, and, upon detecting that the computation has completed, receiving the execution plan returned by the corresponding interface of the big data cluster; parsing the execution plan to obtain the metadata information corresponding to it, and storing the metadata information in a relational database; and importing the metadata information stored in the relational database into a big data warehouse using Sqoop data import. The invention also relates to blockchain technology: the metadata information is stored in a blockchain. By parsing the execution plan, the metadata information is collected completely and collection efficiency is improved.
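The parse-store-import portion of the method might look roughly like the sketch below. Everything concrete in it is an assumption made for illustration: the JSON layout of the execution plan, the `cluster_metadata` table and its columns, and the function names. SQLite stands in for the relational database only so the example is self-contained; a real Sqoop import would read from a JDBC source such as MySQL. The Sqoop command is only assembled, not executed.

```python
import json
import sqlite3
from typing import List, Tuple

# Hypothetical execution plan returned by the cluster interface (format assumed).
SAMPLE_PLAN = json.dumps({
    "task_id": "task-001",
    "outputs": [
        {"table": "orders", "location": "/warehouse/orders",
         "size_bytes": 1024, "format": "parquet"},
    ],
})

Row = Tuple[str, str, int, str]


def extract_metadata(plan_json: str) -> List[Row]:
    """Parse the execution plan and pull out one metadata row per output."""
    plan = json.loads(plan_json)
    return [
        (out["table"], out["location"], out["size_bytes"], out["format"])
        for out in plan.get("outputs", [])
    ]


def store_metadata(conn: sqlite3.Connection, rows: List[Row]) -> None:
    """Store the metadata rows in a relational table (schema is an assumption)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cluster_metadata ("
        "table_name TEXT, location TEXT, size_bytes INTEGER, storage_format TEXT)"
    )
    conn.executemany("INSERT INTO cluster_metadata VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def sqoop_import_command(jdbc_url: str, table: str, hive_table: str) -> List[str]:
    """Assemble a `sqoop import` invocation that loads the table into Hive."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--hive-import",
        "--hive-table", hive_table,
    ]


conn = sqlite3.connect(":memory:")
store_metadata(conn, extract_metadata(SAMPLE_PLAN))
command = sqoop_import_command(
    "jdbc:mysql://db.example.com/metadata",  # placeholder JDBC URL
    "cluster_metadata",
    "warehouse.cluster_metadata",            # placeholder Hive table name
)
```

The command list could be passed to `subprocess.run` on a host where Sqoop is installed; credentials (for example `--username` and `--password-file`) are omitted here.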

Description

Technical field

[0001] The present application relates to the technical field of data collection, and in particular to a method, device, equipment and medium for collecting metadata information of a big data cluster.

Background technique

[0002] Metadata information is an important concept in the field of big data. It reflects the real data information stored in the current big data cluster. For example, metadata information A generally includes the storage location, data size, and storage method of the corresponding real data, and is the basic unit by which a big data cluster manages and stores data. However, with the advent of the era of big data, the amount of user data has shown explosive growth and is increasing day by day, resulting in excessive redundancy of cluster data, which brings great challenges to the storage of big data clusters. At the same time, these data need to be managed by corresponding metadata information, which also causes cluster meta...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/25, G06F16/28
CPC: G06F16/25, G06F16/284, Y02D10/00
Inventor: 陆魏, 胡凭智
Owner: PINGAN YIQIANBAO E COMMERCE CO LTD