Block compression of tables with repeated values

A data compression technique, applied in the field of data processing, that addresses the large amount of hardware resources needed to process massive data, reducing storage space consumption and network bandwidth use.

Active Publication Date: 2008-11-26
SAP AG

AI Technical Summary

Problems solved by technology

For massive data, such as a combination of tables containing millions of records, data processing may require a lot of hardware resources.

Examples

Embodiment Construction

[0030] In general, Figures 1-10 concern compressing data with a combination of techniques that may be referred to as dictionary-based compression, bit-vector compression (or vector-based compression), sorted bit-vector compression (or shortened vector-based compression), and block vector compression. The data may be structured business data, where the data is structured in the sense that it may be attributes or key figures organized in a data structure such as a table, and the attributes or key figures may be correlated. For example, in an information table, a row may have dependencies between its data, such that the data in each column of the row is related to the data in the other columns of that row. Certain values of the data, such as null values, may be instantiated very frequently across thousands or millions of rows that may be located in a portion of a data structure such as in [...]. The data can form a sparse distribution in the sense that it is within a particular row of ...
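As an illustration of the first technique named above, dictionary-based compression can be sketched as follows. This is a minimal sketch, not the patent's implementation: the function names `dictionary_compress` and `bits_per_value_id`, the first-seen ordering of the dictionary, and the sample column are all assumptions made for the example.

```python
def dictionary_compress(column):
    """Replace each value in a column with a small integer value identifier.

    Assumption for this sketch: value IDs are assigned in order of first
    appearance; a real system might instead keep the dictionary sorted.
    """
    dictionary = []    # value ID -> original value
    id_by_value = {}   # original value -> value ID
    value_ids = []     # the compressed column
    for value in column:
        if value not in id_by_value:
            id_by_value[value] = len(dictionary)
            dictionary.append(value)
        value_ids.append(id_by_value[value])
    return dictionary, value_ids


def bits_per_value_id(distinct_count):
    """Smallest number of bits that can encode `distinct_count` different IDs."""
    return max(1, (distinct_count - 1).bit_length())


# Hypothetical column in which the null value (None) dominates.
column = [None, "Berlin", None, None, "Paris", "Berlin", None, None]
dictionary, value_ids = dictionary_compress(column)
bits = bits_per_value_id(len(dictionary))
```

With three distinct values, each entry of the compressed column needs only two bits, and the original column can be rebuilt by looking up each value ID in the dictionary.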

Abstract

Methods and apparatus, including computer program products, for block compression of tables with repeated values. In general, value identifiers representing a compressed column of data may be sorted to render repeated values contiguous, and block dictionaries may be generated. A block dictionary may be generated for each block of value identifiers. Each block dictionary may include a list of block identifiers, where each block identifier is associated with a value identifier and there is a block identifier for each unique value in a block. Blocks may have standard sizes and block dictionaries may be reused for multiple blocks.
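The steps in the abstract can be sketched roughly as below. This is a hedged illustration under stated assumptions rather than the claimed method: the names `block_compress` and `block_decompress`, the tuple representation of a block dictionary, and the block size of 4 are all hypothetical, and the sketch does not track the original row positions that are lost by sorting.

```python
def block_compress(value_ids, block_size):
    """Sort value IDs, split them into fixed-size blocks, and build one
    block dictionary per block, reusing identical dictionaries."""
    sorted_ids = sorted(value_ids)  # repeated values become contiguous

    dictionaries = []      # each dictionary: tuple of value IDs, indexed by block ID
    dictionary_number = {} # dictionary contents -> position in `dictionaries`
    blocks = []            # (dictionary number, list of block IDs) per block

    for start in range(0, len(sorted_ids), block_size):
        block = sorted_ids[start:start + block_size]
        # One block identifier per unique value identifier in the block.
        block_dict = tuple(sorted(set(block)))
        if block_dict not in dictionary_number:
            dictionary_number[block_dict] = len(dictionaries)
            dictionaries.append(block_dict)
        block_id_of = {vid: bid for bid, vid in enumerate(block_dict)}
        blocks.append((dictionary_number[block_dict],
                       [block_id_of[vid] for vid in block]))
    return dictionaries, blocks


def block_decompress(dictionaries, blocks):
    """Rebuild the sorted value-ID column from blocks and block dictionaries."""
    result = []
    for number, block_ids in blocks:
        block_dict = dictionaries[number]
        result.extend(block_dict[bid] for bid in block_ids)
    return result


# Hypothetical value-ID column in which ID 0 (for example, a null value) dominates.
value_ids = [0, 1, 0, 0, 2, 1, 0, 0, 0, 0, 3, 0]
dictionaries, blocks = block_compress(value_ids, block_size=4)
restored = block_decompress(dictionaries, blocks)
```

In this example the first two blocks contain only value ID 0, so they share a single one-entry block dictionary, while the last block gets its own three-entry dictionary. A block whose dictionary has one entry carries no information beyond its length, which is where the savings for heavily repeated values come from.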

Description

Technical field

[0001] This invention relates to data processing by digital computers, and more particularly to block compression of tables with repeated values.

Background

[0002] Search engines can search large amounts of data in database tables, such as relational tables, to find results. For massive data, such as a combination of tables containing millions of records, processing the data may require a lot of hardware resources. For example, a large amount of random access memory may be required to store all records related to executing user requests.

Summary of the invention

[0003] The subject matter disclosed herein provides methods and apparatus, including computer program products, that implement techniques involving block compression of tables with repeated values.

[0004] In one aspect, a data column is compressed according to dictionary-based compression to generate a value identifier column, sort the value identifiers, generate a...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, H03M7/30
CPC: H03M7/3084, H03M7/3088, G06F16/1744, G06F16/221, G06F16/2228
Inventors: Franz Faerber (弗朗兹·费尔伯), Guenter Radestock (冈特·拉德斯托克), Andrew Ross (安德鲁·罗斯)
Owner: SAP AG