Indexing method of distributed column storage system

An indexing technique for distributed column storage systems that addresses the low accuracy of grouped bitmap queries, the loss of data statistics in extended bitmaps, and the resulting inconvenience of querying statistical information, thereby preserving data integrity and improving query accuracy.

Active Publication Date: 2016-12-21
BEIJING GUODIANTONG NETWORK TECH CO LTD +3

AI Technical Summary

Problems solved by technology

Query accuracy is comparatively low.
Extending the bitmap concept reduces computation cost, but the extended bitmap loses the statistical information of the data.
This makes subsequent queries of statistical information inconvenient.

Embodiment Construction

[0030] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0031] Note that the terms "first" and "second" in the embodiments of the present invention only distinguish two entities that share a name but are not the same, or two parameters that are not the same. These terms are used for convenience of expression and should not be construed as limiting the embodiments of the present invention; this will not be repeated in the subsequent embodiments.

[0032] Referring to Figure 1, which is a flow chart of an embodiment of the indexing method for a distributed column storage system provided by the present invention, the indexing method includes:

[0033] Step 101, obtai...

Abstract

The invention discloses an indexing method for a distributed column storage system. The method comprises: obtaining the distribution characteristics of each column of data, and setting a domain value and a division rule for each column; dividing each column into continuous data areas according to the domain value and the division rule; establishing an area coding vector for each data area; computing statistical information for each column and combining it with the corresponding area coding vector to obtain an area coding vector carrying statistical information; and performing data indexing using the area coding vector with statistical information as the bit vector of a bitmap index. Setting the domain value and division rule per column makes the division of the grouped bitmap index match the query filter conditions, and computing statistical information for the column data improves the accuracy of grouped bitmap queries while preserving the statistical information of the data in the column storage system, ensuring the integrity of the data information.
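The steps in the abstract can be sketched on a single in-memory column. This is a minimal illustration under stated assumptions, not the patented implementation: the excerpt does not specify how area coding vectors are encoded or which statistics are kept, so the sketch assumes half-open value ranges as the data areas, a Python integer as each area's bit vector, and count, sum, minimum, and maximum as the attached statistics. The names `Region`, `build_index`, and `range_query` are hypothetical, and distribution across nodes is left out.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Region:
    """One continuous data area with its bit vector and statistics (assumed layout)."""
    low: float                       # inclusive lower bound of the area
    high: float                      # exclusive upper bound of the area
    bits: int = 0                    # bit i is set when row i falls in this area
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")


def build_index(column, boundaries):
    """Divide a column at sorted cut points b0 < b1 < ... into areas [b_i, b_{i+1})."""
    regions = [Region(lo, hi) for lo, hi in zip(boundaries, boundaries[1:])]
    for row, value in enumerate(column):
        i = bisect_right(boundaries, value) - 1
        if not 0 <= i < len(regions):
            raise ValueError(f"value {value!r} lies outside all areas")
        r = regions[i]
        r.bits |= 1 << row           # mark the row in this area's bit vector
        r.count += 1                 # keep statistics next to the bit vector
        r.total += value
        r.minimum = min(r.minimum, value)
        r.maximum = max(r.maximum, value)
    return regions


def range_query(column, regions, lo, hi):
    """Return the row ids whose value v satisfies lo <= v < hi."""
    sure, maybe = 0, 0
    for r in regions:
        if r.count == 0 or r.maximum < lo or r.minimum >= hi:
            continue                 # statistics prove that no row can match
        if lo <= r.minimum and r.maximum < hi:
            sure |= r.bits           # statistics prove that every row matches
        else:
            maybe |= r.bits          # partial overlap: rows must be checked
    rows = []
    for i, value in enumerate(column):
        if (sure >> i) & 1 or ((maybe >> i) & 1 and lo <= value < hi):
            rows.append(i)
    return rows


column = [3, 17, 42, 8, 25, 31, 12]
regions = build_index(column, [0, 10, 20, 30, 50])
print(range_query(column, regions, 10, 30))   # [1, 4, 6]
print(range_query(column, regions, 5, 20))    # [1, 3, 6]
```

In the first query the cut points line up with the filter condition, so the answer comes from the bit vectors and statistics alone. In the second query, only the rows of the partially overlapping area are re-read from the column.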

Description

Technical Field

[0001] The invention relates to the technical field of data indexing in column storage systems, and in particular to an indexing method for a distributed column storage system.

Background Art

[0002] A database can physically store its data in two ways: row-based storage and column-based storage. Row-based storage writes each entire record of the logical data table into the data blocks of a file and, to speed up queries, builds indexes such as B+ trees on certain columns. In column-based storage, the records of the logical data table are not mapped directly to physical data. Instead, the records are split by column, the values of the same column across all records are stored together, and join data is kept so that the corresponding column values can be recombined into a record. Among them, the relational database based on row storage has a disadvantage in data query performan...
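The two physical layouts described in paragraph [0002] can be illustrated with a small sketch. The field names and helper functions (`to_row_store`, `to_column_store`, `reassemble`) are hypothetical, and the sketch assumes the row position serves as the join data that ties column values back into a record.

```python
def to_row_store(records, fields):
    """Row-based layout: each record is stored whole, one tuple per row."""
    return [tuple(rec[f] for f in fields) for rec in records]


def to_column_store(records, fields):
    """Column-based layout: the values of one column are stored together."""
    return {f: [rec[f] for rec in records] for f in fields}


def reassemble(column_store, row):
    """Recombine the column values at one row position into a record."""
    return {f: values[row] for f, values in column_store.items()}


fields = ["id", "region", "load"]
records = [
    {"id": 1, "region": "north", "load": 120},
    {"id": 2, "region": "south", "load": 95},
]
rows = to_row_store(records, fields)
cols = to_column_store(records, fields)

# An aggregate over one column reads only that column's values.
print(sum(cols["load"]))      # 215
print(reassemble(cols, 1))    # {'id': 2, 'region': 'south', 'load': 95}
```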

Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F17/30
CPC: G06F16/221, G06F16/2237
Inventors: 孙乔, 付兰梅, 邓卜侨, 孙雷, 马慧远, 刘炜, 崔伟
Owner: BEIJING GUODIANTONG NETWORK TECH CO LTD