Data integration method based on data lake, server and storage medium

A data integration and server technology, applied in the field of big data, that addresses problems such as increased storage costs, difficulty in quickly adjusting the structure of stored data, and inflexible data use, and thereby improves the flexibility of data use.

Pending Publication Date: 2019-02-01
PING AN TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

When data is analyzed in different scenarios, it is difficult to quickly adjust the structure of the stored data, which makes data use inflexible.
At the same time, building data warehouses for different themes greatly increases storage costs.

Examples

Embodiment Construction

[0031] It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0032] Figure 1 is a schematic diagram of the data lake-based system architecture of the present invention.

[0033] The present invention aims to provide a data integration system based on a data lake. The system receives data from each data source in its original form and saves it in the data lake, a storage system that can hold large amounts of data from different sources and in different formats. The system includes storage space for multiple data pools, such as an original data pool, a simulated data pool, an application data pool, a text data pool and a file data pool. The system receives raw data from the data source and stores it directly in the original data pool without processing, then classifies the raw data and places it into the corresponding classification data pool, such ...
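The pool layout described in [0033] can be pictured with a minimal Python sketch. The pool names come from the paragraph above; everything else is an assumption made for illustration. In particular, the `DataLake` class, its method names, and the type-based `route` rule are hypothetical, since the excerpt is truncated before it states how records are actually classified.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Pool names taken from paragraph [0033].
POOL_NAMES = ("original", "simulated", "application", "text", "file")


def route(record: Any) -> str:
    """Hypothetical routing rule keyed on the record's Python type.

    The patent text does not specify this rule; it is a stand-in.
    """
    if isinstance(record, str):
        return "text"
    if isinstance(record, (bytes, bytearray)):
        return "file"
    if isinstance(record, dict):
        return "application"
    return "simulated"


@dataclass
class DataLake:
    """In-memory stand-in for a data lake holding several data pools."""

    pools: Dict[str, List[Any]] = field(
        default_factory=lambda: {name: [] for name in POOL_NAMES}
    )
    _classified: int = 0  # number of original records already routed

    def receive(self, record: Any) -> None:
        """Store a record in the original pool without any processing."""
        self.pools["original"].append(record)

    def classify(self) -> None:
        """Copy each not-yet-routed original record into one classification pool."""
        pending = self.pools["original"][self._classified:]
        for record in pending:
            self.pools[route(record)].append(record)
        self._classified += len(pending)
```

In this sketch the original pool keeps every record untouched, so a different classification could later be rerun over the same raw data. That design choice is an assumption, not something the excerpt states.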

Abstract

The invention relates to big data technology and discloses a data integration method based on a data lake, a server and a storage medium. The method constructs a data lake, constructs an original data pool and a plurality of classification data pools in the data lake, and sets processing rules for each classification data pool. The method then receives the original data of each data source, stores it in the original data pool, extracts the original data from the original data pool, and classifies it into the corresponding classification data pools in a preset manner according to the category of each classification data pool. Finally, according to the preset processing rules of each classification data pool, the method normalizes the original data in that pool to obtain target data and stores the target data in the corresponding classification data pool. The invention can improve the flexibility of data use and reduce data storage cost.
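The steps in the abstract (store raw data, classify it, then normalize it with each pool's own processing rule) can be sketched end to end in Python. This is an illustrative sketch only: the function `integrate`, the classifier `classify_by_type`, and the two example rules in `RULES` are hypothetical names and behaviors chosen for the example, because the abstract does not say what the preset classification mode or the processing rules actually are.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Rule = Callable[[Any], Any]

# Hypothetical processing rules, one per classification pool.
RULES: Dict[str, Rule] = {
    "text": lambda s: s.strip().lower(),
    "application": lambda d: {str(k).lower(): v for k, v in d.items()},
}


def classify_by_type(record: Any) -> str:
    """Hypothetical preset classification: route by Python type."""
    return "text" if isinstance(record, str) else "application"


def integrate(
    raw_records: Iterable[Any],
    classify: Callable[[Any], str],
    rules: Dict[str, Rule],
) -> Tuple[List[Any], Dict[str, List[Any]]]:
    """Run the store, classify, normalize sequence described in the abstract."""
    # Store the original data unchanged in the original pool.
    original_pool = list(raw_records)
    # One classification pool per configured processing rule.
    pools: Dict[str, List[Any]] = {name: [] for name in rules}
    for record in original_pool:
        name = classify(record)           # pick the classification pool
        target = rules[name](record)      # normalize with that pool's rule
        pools[name].append(target)        # store the target data in the pool
    return original_pool, pools


original, pools = integrate(["  Hello ", {"ID": 7}], classify_by_type, RULES)
```

Because the rules are kept per pool, changing how one category is normalized only touches that pool's entry in `RULES`, which is one way to read the flexibility claim in the abstract.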

Description

Technical field

[0001] The invention relates to the technical field of big data, in particular to a data lake-based data integration method, server and computer-readable storage medium.

Background technique

[0002] In the context of big data, advances in technology and software allow us to process and analyze large amounts of data. However, when processing and analyzing data, in addition to the scale of the data, we also need to consider the diversity of the data types to be analyzed and the complexity of the data usage scenarios. Different data types and usage scenarios mean that data sets need to be stored in different formats and used in different systems, for example by storing data of different formats in data warehouses for different subjects. When the data is analyzed in different scenarios, it is difficult to quickly adjust the structure of the stored data, resulting in inflexible data usage. At the same time, building data warehouses with different themes gre...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F3/06, G06F16/2458, G06F16/35
CPC: G06F3/0607, G06F3/0644
Inventor: 周文豪, 符尊群, 吴逸丰, 孙屹峰
Owner: PING AN TECH (SHENZHEN) CO LTD