Payload storage method adopting column storage

A column-storage and metadata-storage technique applied in the field of file indexing. It addresses low data repetition within rows, poor statistical-analysis performance, and low search efficiency, with the stated effects of reducing storage space, improving query performance, and improving response speed.

Pending Publication Date: 2021-01-15
南京好鱼科技有限公司

AI Technical Summary

Problems solved by technology

[0006] Disadvantages of the existing technology: data is stored by row in Payloads, so adjacent values repeat little, the data cannot be compressed optimally, and compression efficiency is low. After the data is stored by row and compressed in blocks, reading requires scanning all the data in a single block, so search efficiency is low and statistical analysis performance is poor.
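The compression argument in [0006] can be illustrated with a small sketch. It is not taken from the patent: the records, the column names, and the use of zlib are all assumptions chosen for illustration. Each hypothetical record has a high-entropy 8-byte ID and a region value that stays the same over long runs of rows. The same records are serialized once row by row and once column by column, and the compressed sizes are compared.

```python
import hashlib
import zlib

N = 1000
REGIONS = [b"Nanjing", b"Beijing", b"Suzhou", b"Wuxi"]  # hypothetical values

# Hypothetical records: an 8-byte digest-like ID (high entropy) plus a
# region that stays constant over long runs of consecutive rows.
ids = [hashlib.sha256(str(n).encode()).digest()[:8] for n in range(N)]
regions = [REGIONS[n // 250] for n in range(N)]

# Row layout: the fields of each record are stored next to each other,
# so the repetitive region values are interrupted by the IDs.
row_bytes = b"".join(rid + reg + b"\n" for rid, reg in zip(ids, regions))

# Column layout: all IDs together, then all regions together,
# so the repeated region values form long uninterrupted runs.
id_column = b"".join(ids)
region_column = b"\n".join(regions)

row_size = len(zlib.compress(row_bytes))
col_size = len(zlib.compress(id_column)) + len(zlib.compress(region_column))

print("row layout, compressed bytes:   ", row_size)
print("column layout, compressed bytes:", col_size)
```

On this synthetic data the column layout compresses to fewer bytes, because the region column collapses to almost nothing once its values are adjacent. The size of the gap depends entirely on the data and the compressor, so the numbers should not be read as a measurement of the patented method.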

Embodiment Construction

[0023] The present invention will be further described in detail below in conjunction with specific embodiments.

[0024] A payload storage method using column storage, specifically including the following:

[0025] Step S1: First, use the Payloads of Lucene entries to store metadata information for the entries in the index, set the block memory size to 32K, and then store the multi-column data to be stored in the Payloads.

Step S2: Divide the multi-column data stored in the Payloads according to the set block memory size; that is, store it in a column-stored manner.

Step S3: After the storage in step S2, compress the column-stored Payloads data; specifically, compress the divided block data and store it in the index file database for query and retrieval.

The block memory size set in step S1 is one of 16K, 32K, 64K, or 128K. The Payloads of step S1 are stored in a file with the suffix pay, and the effect...
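Steps S1 to S3 can be sketched in a few lines, under several assumptions. The sketch does not use the Lucene API and does not write a payload file. It keeps everything in memory, uses zlib as a stand-in compressor, and serializes each column as newline-separated UTF-8 text. The function names `split_blocks`, `store_columns`, and `read_column` are hypothetical. What it does show is the shape of the method: each column is serialized on its own, cut into blocks of the configured size, and compressed block by block, so a query on one column never decompresses the others.

```python
from __future__ import annotations

import zlib

# 32K is the block size named in step S1; 16K, 64K and 128K are the
# other values the text allows.
BLOCK_SIZE = 32 * 1024


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut a byte string into consecutive blocks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def store_columns(columns: dict[str, list[str]],
                  block_size: int = BLOCK_SIZE) -> dict[str, list[bytes]]:
    """Serialize each column separately, split it into blocks, compress each block.

    Assumes every column is non-empty and no value contains a newline.
    """
    store = {}
    for name, values in columns.items():
        raw = "\n".join(values).encode("utf-8")
        store[name] = [zlib.compress(block) for block in split_blocks(raw, block_size)]
    return store


def read_column(store: dict[str, list[bytes]], name: str) -> list[str]:
    """Decompress only the blocks of the requested column and rebuild its values."""
    raw = b"".join(zlib.decompress(block) for block in store[name])
    return raw.decode("utf-8").split("\n")
```

A call such as `read_column(store_columns({"level": levels, "host": hosts}), "level")` returns the original `levels` list and touches only the blocks stored under `"level"`. Blocks are joined before decoding, so a multi-byte character split across a block boundary is still decoded correctly.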

Abstract

The invention relates to a payload storage method adopting column storage, in the technical field of file indexing, comprising the following steps: S1, storing metadata information for entries in an index using the Payloads of Lucene entries, setting the block memory size to 32K, and then storing the multi-column data to be stored into the Payloads; S2, dividing the multi-column data stored in the Payloads according to the set block memory size, that is, storing the data in a column storage mode; and S3, after the storage in step S2, compressing the column-stored Payloads data, specifically compressing the divided block data, and storing the compressed block data into an index file database for query and retrieval. The method occupies little space, processes data quickly, compresses efficiently, and responds quickly to reads.

Description

Technical field

[0001] The invention relates to the technical field of file indexing, in particular to a payload storage method using column storage.

Background technique

[0002] Indexing is the core of search engines in the era of big data. Indexing is the process of turning metadata into index files. Lucene, an open-source, high-performance, scalable information retrieval engine, not only supports full-text indexes but also provides various other index types to meet different kinds of query requirements.

[0003] Payloads were introduced in Lucene 2.2 as an extension of the Lucene 2.1 index file format. They provide a flexibly configurable advanced indexing technique that allows metadata information to be stored for entries in the index. In some specific application scenarios, they can optimize the search performance of Lucene-based applications.

[0004] Using the Payload function of entrie...

Application Information

IPC(8): G06F16/31, G06F16/383
CPC: G06F16/316, G06F16/383
Inventor: 母延年
Owner: 南京好鱼科技有限公司