Multi-rule combined compression method based on database row and column mixed storage

A mixed-storage, composite-compression technique applied in fields such as electrical digital data processing, special data processing applications, and instruments. It addresses problems such as poor compression results and low database efficiency.

Inactive Publication Date: 2012-10-17
天津神舟通用数据技术有限公司

AI Technical Summary

Problems solved by technology

Row storage stores and reads data contiguously as tuple rows. Because the attributes within a tuple row are largely unrelated to one another, row-stored data compresses poorly.
In contrast, column storage stores each attribute column of a table separately and contiguously, which greatly increases the similarity of adjacent data and yields a higher compression ratio. However, it breaks up the tuple-row organization of the data, which makes the database extremely inefficient for traditional row queries.
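The contrast between the two layouts can be illustrated with a small sketch. It counts runs of identical adjacent values in a row-ordered stream and in a column-ordered stream of the same table. The table contents and the helper `count_runs` are hypothetical, chosen only for illustration; run count is used here as a rough proxy for how similar adjacent data is, not as the measure the patent itself uses.

```python
from itertools import groupby

# Hypothetical toy table of (city, status) tuples; values are invented.
rows = [
    ("Tianjin", "active"),
    ("Tianjin", "active"),
    ("Tianjin", "active"),
    ("Beijing", "active"),
    ("Beijing", "active"),
    ("Beijing", "closed"),
]


def count_runs(values):
    """Return the number of maximal runs of equal adjacent values."""
    return sum(1 for _ in groupby(values))


# Row layout: values of different attributes are interleaved.
row_stream = [value for row in rows for value in row]

# Column layout: each attribute's values are stored contiguously.
column_stream = [row[i] for i in range(len(rows[0])) for row in rows]

row_runs = count_runs(row_stream)        # every neighbour differs
column_runs = count_runs(column_stream)  # long runs of repeated values

print(row_runs, column_runs)
```

Fewer, longer runs in the column-ordered stream are what make per-column encodings such as run-length or dictionary coding effective.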

Examples


Embodiment Construction

[0039] The technical solution of the invention is further described below with reference to a specific implementation and examples.

[0040] 1. As shown in Figure 1 and Figure 2, the specific implementation process and working principle of the invention are as follows:

[0041] 1) Receive the data imported by the user, then reorganize and split it into multiple attribute columns according to the attribute schema of the user table.

[0042] 2) Using the dictionary-rule compression method, construct a dictionary structure and a weight table for the data of each attribute column in the current data package.

[0043] 3) For each attribute column, use the constructed dictionary and weight information to estimate the size of the column when encoded with each of the in-column compression rules, compare the estimates, and select the rule that occupies the least space for that column.

[0044] 4) According to the dictionary information of each attribu...
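Steps 1) to 3) above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the three candidate rules (`raw`, `dictionary`, `run_length`), their bit-cost formulas, the reading of the weight table as a value-frequency table, and all function names are hypothetical. The sketch also assumes a non-empty data package.

```python
from collections import Counter
from math import ceil, log2


def split_into_columns(rows, schema):
    """Step 1: regroup row tuples into one value list per attribute."""
    return {name: [row[i] for row in rows] for i, name in enumerate(schema)}


def build_dictionary(values):
    """Step 2: assign a code to each distinct value and count occurrences.

    The frequency table stands in for the 'weight table' (an assumption).
    """
    weights = Counter(values)
    dictionary = {v: code for code, (v, _) in enumerate(weights.most_common())}
    return dictionary, weights


def byte_len(value):
    return len(str(value).encode("utf-8"))


def estimate_sizes(values, dictionary, weights):
    """Step 3a: estimated size in bits under three illustrative rules."""
    n = sum(weights.values())
    entry_bits = sum(8 * byte_len(v) for v in dictionary)  # dictionary storage
    code_bits = max(1, ceil(log2(len(dictionary))))        # fixed-width codes
    count_bits = max(1, ceil(log2(n + 1)))                 # run-length counter
    runs = 1 + sum(1 for a, b in zip(values, values[1:]) if a != b)
    return {
        "raw": sum(8 * byte_len(v) * w for v, w in weights.items()),
        "dictionary": entry_bits + n * code_bits,
        "run_length": entry_bits + runs * (code_bits + count_bits),
    }


def choose_rule(sizes):
    """Step 3b: pick the rule with the smallest estimated size."""
    return min(sizes, key=sizes.get)


# Hypothetical data package: 16 rows of (city, status, id).
schema = ("city", "status", "id")
rows = [
    ("Tianjin" if i < 8 else "Beijing",
     "active" if i % 2 == 0 else "closed",
     str(1000 + i))
    for i in range(16)
]

columns = split_into_columns(rows, schema)
chosen = {}
for name, values in columns.items():
    dictionary, weights = build_dictionary(values)
    chosen[name] = choose_rule(estimate_sizes(values, dictionary, weights))

print(chosen)
```

In this toy package the sorted `city` column favours run-length coding, the alternating `status` column favours plain dictionary codes, and the all-distinct `id` column is cheapest left raw, which is the kind of per-column choice step 3) describes.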

Abstract

The invention discloses a multi-rule composite compression method based on mixed row and column storage in a database. In view of current software and hardware trends and a severe performance bottleneck in the database industry, it provides a mixed storage and compression mode that organizes data in the database by tuple rows and compresses it by attribute columns, combining the high compression ratio of column storage with the convenient random positioning and access of row storage. In addition, several in-column rule encoding methods are provided to suit different data distribution characteristics. In particular, an inter-column compression rule is provided to exploit possible relationships among the attribute columns of a database table. Together with a back-end general-purpose compression algorithm, this efficiently provides a multi-level composite compression function to upper-layer database applications and guarantees the maximum encoding and decoding speed under a specified compression ratio.
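The idea of organizing by tuple rows while compressing by attribute columns can be sketched as below. This is an illustrative reading of the abstract, not the patent's actual data layout: the pack structure, the function names, the use of JSON for serialization, and the choice of `zlib` as the back-end general-purpose compressor are all assumptions.

```python
import json
import zlib


def encode_pack(rows):
    """Keep tuple-row order, but code each attribute with its own dictionary."""
    dictionaries = [{} for _ in rows[0]]
    coded_rows = []
    for row in rows:
        coded = []
        for col, value in enumerate(row):
            codes = dictionaries[col]
            # A new value receives the next free code for its column.
            coded.append(codes.setdefault(value, len(codes)))
        coded_rows.append(coded)
    # Insertion order equals code order, so a list works as a decode table.
    tables = [list(d) for d in dictionaries]
    return tables, coded_rows


def read_row(tables, coded_rows, index):
    """Random access: decode one row without touching the others."""
    return tuple(tables[col][code] for col, code in enumerate(coded_rows[index]))


def to_backend(tables, coded_rows):
    """Second level: general-purpose compression over the coded pack."""
    return zlib.compress(json.dumps([tables, coded_rows]).encode("utf-8"))


def from_backend(payload):
    tables, coded_rows = json.loads(zlib.decompress(payload).decode("utf-8"))
    return tables, coded_rows


# Hypothetical rows, invented for illustration.
rows = [
    ("Tianjin", "active"),
    ("Tianjin", "closed"),
    ("Beijing", "active"),
    ("Tianjin", "active"),
]

tables, coded_rows = encode_pack(rows)
payload = to_backend(tables, coded_rows)
restored_tables, restored_rows = from_backend(payload)

print(read_row(restored_tables, restored_rows, 2))
```

In this sketch the whole pack must be decompressed by the back-end stage before any row can be read. How the method bounds that cost is not stated in the text above.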

Description

Technical field

[0001] The invention relates to data storage, data compression, and data retrieval technology, and in particular to a multi-rule composite compression method based on mixed storage of rows and columns in a database.

Background

[0002] Query processing and data storage are the two core elements of a database; they complement each other so that the database can provide users with efficient data management and retrieval services. However, as the information revolution deepens, real-world applications continuously generate massive amounts of new data, and users increasingly prefer to retain historical data for longer periods. Limited data storage capacity has become a serious problem that must be addressed without delay. Meanwhile, the development of storage hardware has lagged far behind that of other computer system hardware, and the storage system has become a serious bottleneck...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 曹晖, 冯柯, 毛云青, 何清法, 周丽霞, 蒋志勇, 赵殿奎, 关刚, 王效忠, 李海峰
Owner: 天津神舟通用数据技术有限公司