Data storage platform construction method compatible with data warehouse and data lake

A data warehouse and data lake storage technique, applied in the field of data processing. It addresses wasted storage cost, the lack of a unified data management view, and data redundancy. Its effects are reduced data redundancy and storage costs, improved enterprise productivity, and lower management and operation and maintenance costs.

Pending Publication Date: 2022-05-24
杭州石原子科技有限公司

AI Technical Summary

Problems solved by technology

[0011] The purpose of the present invention is to provide a data storage platform construction method compatible with data warehouses and data lakes, so as to solve the problem that there is a large a



Examples


Embodiment

[0026] Referring to Figure 1, the overall architecture is divided into an upper layer and a lower layer. The upper layer is the application workload layer of the data warehouse and data lake ("workload" in Figure 1), and the lower layer is the data platform layer ("Data platform" in Figure 1). The present invention provides a technical solution: a data storage platform construction method compatible with a data warehouse and a data lake, comprising the following steps:

[0027] Step 1: Use columnar storage and row-column mixed storage to store the data of the data lake and the data warehouse. To meet storage requirements that support both the data lake and the data warehouse, the platform must support the storage method for database data (including result data and process data that needs to be stored temporarily) while still guaranteeing data lake storage. This means that object storage or file storage needs to support columnar storage technology and row-column...
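The two layouts named in Step 1 can be illustrated with a minimal in-memory sketch. It is not the patent's implementation: the function names (`to_columnar`, `to_row_groups`, `scan_column`, `read_row`) are hypothetical, and the sketch assumes that every row carries the same set of fields. Row-column mixed storage is modeled here as fixed-size row groups that are each stored column by column, which is one common way to build such a layout.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]
ColumnBlock = Dict[str, List[Any]]


def to_columnar(rows: List[Row]) -> ColumnBlock:
    """Pivot rows into one value list per column (columnar layout).

    Assumes every row has the same keys as the first row.
    """
    if not rows:
        return {}
    return {name: [row[name] for row in rows] for name in rows[0]}


def to_row_groups(rows: List[Row], group_size: int) -> List[ColumnBlock]:
    """Row-column mixed layout: split rows into groups, store each group by column."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return [
        to_columnar(rows[start:start + group_size])
        for start in range(0, len(rows), group_size)
    ]


def scan_column(row_groups: List[ColumnBlock], name: str) -> List[Any]:
    """Analytical access: read a single column across all row groups."""
    values: List[Any] = []
    for group in row_groups:
        values.extend(group[name])
    return values


def read_row(row_groups: List[ColumnBlock], group_size: int, index: int) -> Row:
    """Point access: rebuild one row by touching only the group that holds it."""
    group = row_groups[index // group_size]
    offset = index % group_size
    return {name: values[offset] for name, values in group.items()}


rows = [{"id": i, "amount": i * 10} for i in range(5)]
groups = to_row_groups(rows, group_size=2)
```

Scanning one column touches only that column's lists, which suits warehouse-style analysis, while rebuilding a row touches a single group, which keeps record-level access cheap.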


Abstract

The invention belongs to the technical field of data processing and relates in particular to a data storage platform construction method compatible with a data warehouse and a data lake. The method comprises the following steps: (1) storing the data of the data lake and the data warehouse using columnar storage and row-column mixed storage; (2) integrating the storage layers of the data warehouse and the data lake, and dividing the storage into independent areas for the data lake and the data warehouse by partitioning; (3) constructing a unified metadata management layer that shields the upper-layer application workloads from the underlying storage implementation details and provides them with unified data services. The method addresses data redundancy between the two products and gives users a globally unified data management view. It also reduces and simplifies the technology stack, which lowers overall management and operation and maintenance costs as well as data redundancy and storage costs.
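Steps 2 and 3 can be sketched as a small catalog that places each dataset in either the lake partition or the warehouse partition of one shared store and lets workloads look data up by logical name only. This is an illustrative assumption, not the patent's design: the `MetadataCatalog` and `DatasetEntry` classes, the zone names, the layout labels, and the `store://platform` root are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical names for the two independently partitioned storage areas.
ZONES = ("lake", "warehouse")


@dataclass(frozen=True)
class DatasetEntry:
    name: str    # logical name used by upper-layer workloads
    zone: str    # which partition of the shared store holds the data
    layout: str  # e.g. "columnar" or "hybrid" (illustrative labels)
    path: str    # physical location, hidden from workloads


class MetadataCatalog:
    """Single metadata layer over one integrated storage layer."""

    def __init__(self, root: str) -> None:
        self._root = root.rstrip("/")
        self._entries: Dict[str, DatasetEntry] = {}

    def register(self, name: str, zone: str, layout: str) -> DatasetEntry:
        if zone not in ZONES:
            raise ValueError(f"unknown zone: {zone}")
        if name in self._entries:
            raise ValueError(f"dataset already registered: {name}")
        # The zone becomes a path prefix, so each area stays separate.
        entry = DatasetEntry(name, zone, layout, f"{self._root}/{zone}/{name}")
        self._entries[name] = entry
        return entry

    def resolve(self, name: str) -> DatasetEntry:
        """Workloads ask by logical name and never build paths themselves."""
        return self._entries[name]

    def list_datasets(self) -> List[str]:
        """Global view across both the lake and the warehouse areas."""
        return sorted(self._entries)


catalog = MetadataCatalog("store://platform/")
catalog.register("raw_events", zone="lake", layout="columnar")
catalog.register("sales_report", zone="warehouse", layout="hybrid")
```

Because both areas live under one root and one catalog, a warehouse job can read lake data through `resolve` without copying it, which is the redundancy saving the abstract describes.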

Description

Technical field

[0001] The invention relates to the technical field of data processing, in particular to a data storage platform construction method compatible with a data warehouse and a data lake.

Background

[0002] Status: With the advent of the big data era, more and more big data products have appeared. Data warehouses and data lakes are two representative big data products that provide services to users.

[0003] 1. Data lake: A data lake stores all kinds of unprocessed raw data, including structured, semi-structured, and unstructured data at any scale, and must offer convenient access for all users. Data lakes can serve as data sources for data warehouses or other big data applications.

[0004] 2. Data warehouse: A data warehouse is suited to general analysis, including reports, data dashboards, interactive analysis, and other high-performance analysis. Data warehouses generally contain only processed and refined d...


Application Information

IPC(8): G06F16/21, G06F16/22, G06F16/28
CPC: G06F16/211, G06F16/283, G06F16/221, Y02D10/00
Inventor: 徐辛
Owner: 杭州石原子科技有限公司