Multi-column joint storage method based on column storage

A joint storage and joint index technique for multi-column joint storage based on column storage. It addresses excessive CPU and IO resource consumption, the low compression ratio of column-store databases, and degraded query performance, with the effects of faster retrieval, greater data similarity, and a higher compression ratio.

Status: Inactive. Publication date: 2019-11-05
南京录信软件技术有限公司

AI Technical Summary

Problems solved by technology

Because the content of the corresponding row must be read from every column attribute file, a great deal of IO time is wasted. Current column stores sort each column independently, so grouped statistics over multiple columns consume excessive CPU and IO resources. The compression ratio of existing column-store databases is low. The basic unit of data access in existing column-store execution engines is a single attribute value, and access to attribute values involves a large number of jump accesses and random accesses, which seriously degrades query performance.
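To illustrate why sort order matters for the compression ratio, the following is a minimal sketch that assumes run-length encoding as the compression scheme. The source does not name a specific codec, so `run_length_encode` and the sample column are purely hypothetical.

```python
def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([v, 1])       # start a new run
    return [(v, n) for v, n in runs]


# Hypothetical column: the same values, stored unsorted and sorted.
unsorted_col = ["b", "a", "b", "a", "c", "a", "b", "c"]
sorted_col = sorted(unsorted_col)

unsorted_runs = run_length_encode(unsorted_col)
sorted_runs = run_length_encode(sorted_col)

print(len(unsorted_runs), "runs unsorted vs", len(sorted_runs), "runs sorted")
```

When equal values sit next to each other, the encoder emits fewer runs, which is one way greater similarity between adjacent values can translate into a higher compression ratio.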

Examples


Detailed Description of Embodiments

[0026] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0027] Referring to Figures 1-3, the present invention provides a technical solution: a multi-column joint storage method based on column storage, with the following steps:

[0028] S1: Create a joint index over multiple columns, and control how the data is sorted and distributed when it is written to the database (the data is sorted and stored according to the group-by columns). In S1, the maximum and minimum values of the level1 nodes are first retrieved, and the leve...
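The sorting part of S1 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function names `ingest_sorted` and `group_count`, the sample rows, and the choice of `city` and `year` as the joint key are all hypothetical.

```python
from itertools import groupby


def ingest_sorted(rows, key_columns):
    """Sort rows by the joint key at write time, then split them into columns."""
    ordered = sorted(rows, key=lambda r: tuple(r[c] for c in key_columns))
    names = list(rows[0].keys()) if rows else []
    return {name: [r[name] for r in ordered] for name in names}


def group_count(columns, key_columns):
    """Count rows per joint key in one sequential pass over the sorted columns."""
    keys = zip(*(columns[c] for c in key_columns))
    return [(k, sum(1 for _ in g)) for k, g in groupby(keys)]


# Hypothetical input rows; "city" and "year" form the assumed joint key.
rows = [
    {"city": "Nanjing", "year": 2019, "amount": 5},
    {"city": "Beijing", "year": 2018, "amount": 7},
    {"city": "Nanjing", "year": 2018, "amount": 1},
    {"city": "Beijing", "year": 2018, "amount": 2},
    {"city": "Nanjing", "year": 2019, "amount": 4},
]

columns = ingest_sorted(rows, ["city", "year"])
counts = group_count(columns, ["city", "year"])
print(counts)
```

Because rows with the same joint key are adjacent after ingestion, a grouped count needs no hash table or re-sort at query time; it only walks the key columns once.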


Abstract

The invention discloses a multi-column joint storage method based on column storage, and relates to the technical field of multi-column storage. The method comprises the following steps: S1, creating a joint index over multiple columns and controlling the sorted distribution of the data as it is stored (the data is sorted and stored by group); S2, storing each column of the multi-column index in column-store form, with each column's data in its own independent contiguous area and each column's index data stored in blocks, where a block is the minimum unit of data storage; S3, building a two-level skip list query structure to accelerate data retrieval. Because the data is stored by column, executing an SQL statement avoids the mapping overhead of a row-store database. Because the data is stored column by column and partitioned, only the attributes and data that are actually needed are read from disk, which saves IO bandwidth.
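Steps S2 and S3 can be sketched as a block-partitioned column with two levels of min/max summaries. The abstract does not specify the node layout, so this sketch assumes that each level-2 node summarizes one block and each level-1 node summarizes a group of level-2 nodes. `Node`, `build_index`, `lookup`, `block_size` and `fanout` are hypothetical names and parameters.

```python
from dataclasses import dataclass


@dataclass
class Node:
    lo: object   # minimum value covered by this node
    hi: object   # maximum value covered by this node
    start: int   # index of the first child (block or level-2 node)
    end: int     # one past the index of the last child


def build_index(values, block_size=4, fanout=2):
    """Split a sorted column into blocks and build two levels of min/max nodes."""
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    level2 = [Node(min(b), max(b), i, i + 1) for i, b in enumerate(blocks)]
    level1 = []
    for i in range(0, len(level2), fanout):
        group = level2[i:i + fanout]
        level1.append(Node(min(n.lo for n in group), max(n.hi for n in group),
                           i, i + len(group)))
    return blocks, level1, level2


def lookup(blocks, level1, level2, target):
    """Return (row positions equal to target, number of blocks actually read)."""
    block_size = len(blocks[0]) if blocks else 0
    positions, blocks_read = [], 0
    for top in level1:
        if not (top.lo <= target <= top.hi):
            continue                      # skip a whole group of blocks
        for node in level2[top.start:top.end]:
            if not (node.lo <= target <= node.hi):
                continue                  # skip a single block
            blocks_read += 1
            base = node.start * block_size
            for offset, value in enumerate(blocks[node.start]):
                if value == target:
                    positions.append(base + offset)
    return positions, blocks_read


# Hypothetical sorted column; tuples would work the same way for joint keys.
values = [1, 3, 3, 5, 8, 8, 9, 12, 15, 15, 20, 21, 30]
blocks, level1, level2 = build_index(values)
hit = lookup(blocks, level1, level2, 15)
miss = lookup(blocks, level1, level2, 7)
print(hit, miss)
```

In this sketch, a lookup for a value that falls between block ranges reads no blocks at all, and a successful lookup reads only the block whose range contains the value. Since Python tuples compare lexicographically, the same structure would apply to a joint key such as `(city, year)`.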

Description

Technical field

[0001] The invention relates to the technical field of multi-column storage, in particular to a multi-column joint storage method based on column storage.

Background

[0002] In recent years, as data volumes in every industry have grown substantially, storage and management costs have grown with them. Faced with OLAP workloads over massive data, row-store databases spend a great deal of time on mapping when computing statistics, and operating on massive data leaves database performance limited by the machine's memory and hard disk. To reduce storage costs and hardware requirements, database systems store data in compressed form. To serve the statistical needs of OLAP, data is stored by column. However, traditional column storage handles each column on its own: there is no association between multiple columns, and the data is not arranged in order, so the effi...


Application Information

IPC(8): G06F16/22, G06F16/28
CPC: G06F16/221, G06F16/283
Inventor: 王帅
Owner: 南京录信软件技术有限公司