OLAP pre-computation engine optimization method based on object storage and application

An optimization method for object storage, applied in the field of data analysis, that addresses problems such as degraded write performance, the inability to change data by fragments, and incomplete data loading.

Active Publication Date: 2021-04-02
KUYUN SHANGHAI INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

Because object storage is accessed over the network, concurrent access to the same resource is limited by network I/O.
Object storage does not allow data to be changed by fragments, only as entire objects, which affects write performance.
Regarding data consistency, Amazon S3 provides eventual consistency for some operations, so new data may not be available immediately after uploading, which may result in incomplete data loading or the loading of outdated data.



Examples


Embodiment 1

[0054] Embodiment 1 of the present invention provides a method for optimizing an OLAP pre-computation engine based on object storage which, as shown in Figure 1, includes the following steps:

[0055] Step 1: Reduce renaming object operations in object storage;

[0056] In the specific implementation of the present invention, Amazon S3 is mainly used as the object storage. A rename in object storage is actually a copy followed by a delete, unlike in file storage, where a rename only modifies the index file, so the operation is very inefficient and hurts performance. To change an object's name directly, a new object must first be copied and the original object then deleted. To rename a logical directory, the entire directory must first be traversed to copy its files, at a high cost in time and space. Therefore, the present invention proposes reducing rename operations on objects as an optimization direction, an...
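
As a rough illustration of why renames are costly, the sketch below models an object store as a flat key-to-bytes map in Python. The class `InMemoryObjectStore` and its methods are hypothetical stand-ins, not the Amazon S3 API; the only point is to count the copy and delete calls implied by renaming a logical directory.

```python
class InMemoryObjectStore:
    """Toy flat object store: keys are full paths, there are no real directories."""

    def __init__(self):
        self.objects = {}      # key -> bytes
        self.copy_calls = 0
        self.delete_calls = 0

    def put(self, key, data):
        self.objects[key] = data

    def copy(self, src, dst):
        self.objects[dst] = self.objects[src]
        self.copy_calls += 1

    def delete(self, key):
        del self.objects[key]
        self.delete_calls += 1

    def rename_object(self, src, dst):
        # No in-place rename exists: copy the whole object, then delete the original.
        self.copy(src, dst)
        self.delete(src)

    def rename_prefix(self, src_prefix, dst_prefix):
        # Renaming a logical directory means visiting every object under the prefix.
        for key in [k for k in self.objects if k.startswith(src_prefix)]:
            self.rename_object(key, dst_prefix + key[len(src_prefix):])


store = InMemoryObjectStore()
for i in range(3):
    store.put(f"cube/tmp/part-{i}", b"x")

store.rename_prefix("cube/tmp/", "cube/final/")
print(sorted(store.objects))  # ['cube/final/part-0', 'cube/final/part-1', 'cube/final/part-2']
print(store.copy_calls, store.delete_calls)  # 3 3
```

In this model the number of copy and delete calls grows linearly with the number of objects under the prefix, which is the cost the paragraph above describes.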

Embodiment 2

[0085] Embodiment 2 of the present invention, as shown in Figure 7, provides an OLAP pre-computation engine optimization system based on object storage. The system applies the above OLAP pre-computation engine optimization method based on object storage and includes at least one of a file renaming conversion module, an inverted path conversion module, and a data consistency checking module:

[0086] The file renaming conversion module uses a file mapping table added to the metadata layer to match files before and after renaming, and is used to reduce rename operations at the bottom layer of the file system;
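
A minimal Python sketch of the mapping-table idea follows. The source does not give the table's layout, so `MappedFileSystem`, its fields, and the physical key format are all assumptions made for illustration: a rename updates only the metadata entry, and the stored object is never copied or deleted.

```python
import itertools


class MappedFileSystem:
    """Hypothetical metadata layer: logical paths map to immutable physical keys."""

    def __init__(self):
        self.physical = {}            # physical key -> bytes (stands in for the object store)
        self.mapping = {}             # file mapping table: logical path -> physical key
        self._ids = itertools.count()

    def write(self, logical_path, data):
        # Assumed key scheme: a unique suffix keeps later writes from colliding
        # with objects that were renamed away from this logical path.
        physical_key = f"{logical_path}#{next(self._ids)}"
        self.physical[physical_key] = data
        self.mapping[logical_path] = physical_key

    def rename(self, old_path, new_path):
        # Metadata-only rename: the underlying object is untouched.
        self.mapping[new_path] = self.mapping.pop(old_path)

    def read(self, logical_path):
        return self.physical[self.mapping[logical_path]]


fs = MappedFileSystem()
fs.write("tmp/segment", b"cuboid data")
fs.rename("tmp/segment", "final/segment")
print(fs.read("final/segment"))  # b'cuboid data'
print(len(fs.physical))          # 1
```

The trade-off in this sketch is that every read goes through one extra lookup in the mapping table, in exchange for renames that no longer touch the object store.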

[0087] The inverted path conversion module adds a path adaptation mechanism to the retrieval logic at the bottom of the OLAP engine and inverts the logical path of the file's partition directory hierarchy so that it corresponds to the file's prefix in the object storage, which is used to realize fast query and ...
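
The text available here does not define the inversion precisely. The Python sketch below shows one possible reading, in which the order of the path segments is reversed so that the most specific segment becomes the leading prefix of the object key. The function name `invert_path` and the example path are hypothetical.

```python
def invert_path(logical_path: str) -> str:
    """Reverse the order of the path segments (one assumed reading of 'inversion')."""
    parts = [p for p in logical_path.split("/") if p]
    return "/".join(reversed(parts))


logical = "project/cube/segment/part-0"
object_key = invert_path(logical)
print(object_key)               # part-0/segment/cube/project
print(invert_path(object_key))  # project/cube/segment/part-0
```

Under this reading the conversion is its own inverse, so a path adaptation layer could translate in both directions with the same function.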

Embodiment 3

[0090] Embodiment 3 of the present invention provides a storage medium in which a computer program is stored; running the computer program executes the object storage-based OLAP pre-computation engine optimization method described in Embodiment 1.

[0091] In a specific embodiment of the present invention, build and query performance were tested and compared before and after applying the optimization method provided by the present invention. The tests verified the optimized build performance, and data consistency was ensured. Under high concurrency and complex queries there is no obvious performance loss, and speed is significantly improved.


Abstract

The invention provides an OLAP pre-computation engine optimization method based on object storage, and an application thereof. Three optimization directions are provided: reducing rename operations on objects, checking data consistency, and inverting the logical paths of index files. A file mapping table added to the metadata layer matches files before and after renaming, which reduces rename operations at the bottom layer of the file system. The logical path of a file's partition directory hierarchy is inverted to correspond to the file's prefix in the object storage, which enables fast query and reading from the object storage. Logic verification is added to read, delete, and write operations to check data consistency. The method optimizes the read-write mode of the OLAP engine when it uses object storage, improves the engine's execution efficiency, and speeds up responses to the analysis requirements of an upper-layer report system. Based on the method, an efficient OLAP calculation and query execution engine can be constructed, with improved build efficiency and faster queries.
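
The abstract does not say how the logic verification on read, write, and delete is carried out. The Python sketch below assumes one simple scheme: a SHA-256 checksum recorded in metadata at write time and compared on every read. `CheckedStore` and the dict used as a backend are hypothetical stand-ins for the engine's storage layer.

```python
import hashlib


class CheckedStore:
    """Hypothetical wrapper that verifies reads, writes, and deletes against metadata."""

    def __init__(self, backend):
        self.backend = backend   # dict-like stand-in for object storage: key -> bytes
        self.checksums = {}      # metadata: key -> expected SHA-256 hex digest

    def write(self, key, data):
        self.backend[key] = data
        self.checksums[key] = hashlib.sha256(data).hexdigest()
        self.read(key)           # verify the write is visible and intact

    def read(self, key):
        data = self.backend.get(key)
        if data is None:
            raise LookupError(f"{key} is not visible yet")
        if hashlib.sha256(data).hexdigest() != self.checksums.get(key):
            raise ValueError(f"{key} is stale or incomplete")
        return data

    def delete(self, key):
        self.backend.pop(key, None)
        self.checksums.pop(key, None)
        # With a plain dict this check never fires; against an eventually
        # consistent store it would catch a delete that is not yet visible.
        if key in self.backend:
            raise RuntimeError(f"{key} is still visible after delete")


backend = {}
store = CheckedStore(backend)
store.write("cube/segment-0", b"complete data")
print(store.read("cube/segment-0"))  # b'complete data'

backend["cube/segment-0"] = b"complete"  # simulate a truncated or outdated object
try:
    store.read("cube/segment-0")
except ValueError as err:
    print("rejected:", err)
```

A caller that receives `LookupError` or `ValueError` could retry after a delay rather than load incomplete or outdated data, which is the failure mode the problem statement describes.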

Description

Technical field

[0001] The invention relates to the technical field of data analysis, in particular to an object storage-based OLAP pre-computation engine optimization method and application.

Background

[0002] OLAP is a software technology that enables analysts to observe information quickly, consistently, and interactively from various aspects in order to achieve a deep understanding of data. The mainstream OLAP engines on the market focus mainly on three key issues: data volume, performance, and flexibility.

[0003] The OLAP pre-computation engine based on the open source Apache Kylin uses cloud-native computing and storage to build fast, elastic, and cost-effective big data analysis applications, and can connect seamlessly to existing data warehouses and cloud storage, such as Amazon S3, Azure Blob Storage, Snowflake, etc. High-performance OLAP services on the cloud are inseparable from the choice of storage media. Cloud storage solution...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/172, G06F16/16, G06F16/14, G06F16/13
CPC: G06F16/172, G06F16/13, G06F16/16, G06F16/162, G06F16/148, G06F16/164, G06F16/283, G06F9/5083, H04L67/1097
Inventor: 顾单超, 李栋, 李扬, 韩卿
Owner: KUYUN SHANGHAI INFORMATION TECH CO LTD