Compressed encoding and decoding method for network flow data and bitmap indexes thereof

A bitmap indexing and compression coding technique, applicable to database indexing, structured data retrieval, and special-purpose data processing. It addresses two problems: existing methods are poorly suited to bitmap indexes of network flow data, and they cannot compress the original network flow data at the same time. Its effects are reduced coding complexity, a higher compression ratio, and a shorter encoding length.

Active Publication Date: 2020-09-18
中国工业互联网研究院

AI Technical Summary

Problems solved by technology

[0027] 4. Literal (L): a group that cannot be classified under items 1 and 2 is assigned to this type.
However, the existing methods are not suitable for this kind of bitmap index over network flow data, and they cannot compress the original network flow data at the same time.

Examples

Embodiment 1

[0066] The present invention provides a method for compressing and encoding network stream data and its bitmap index, and the specific steps are as follows:

[0067] Step 1: Split the original stream data sequence by field attribute and store it in a columnar database. In particular, the source and destination address sequences are split by field attribute and stored as columns, giving the database shown in Figure 2;

[0068] Step 2: Divide each column's data sequence into blocks of 4K rows, giving the original data blocks shown in Figure 3;

[0069] Step 3: Sort the rows within each original data block to generate the rearranged data blocks shown in Figure 4. The sort key can be the initial letter, the numeric size, a hash, etc. The mapping between each original data block and its rearranged data block is recorded as the rearranged improved bitmap;

[0070]...
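
Steps 1 to 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the use of plain lists in place of a columnar database, the value 4096 for "4K rows", and the representation of the rearrangement mapping as a list of original row positions are all hypothetical choices made for the example.

```python
from typing import Dict, List, Sequence, Tuple

BLOCK_ROWS = 4096  # assumed meaning of "4K rows" per block


def split_columns(records: Sequence[Dict[str, str]]) -> Dict[str, List[str]]:
    """Step 1: turn row-oriented flow records into one list per field."""
    columns: Dict[str, List[str]] = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


def make_blocks(column: Sequence[str], rows: int = BLOCK_ROWS) -> List[List[str]]:
    """Step 2: cut a column into consecutive blocks of at most `rows` rows."""
    return [list(column[i:i + rows]) for i in range(0, len(column), rows)]


def rearrange(block: Sequence[str]) -> Tuple[List[str], List[int]]:
    """Step 3: sort a block and record which original row each sorted row came from."""
    order = sorted(range(len(block)), key=lambda i: block[i])
    return [block[i] for i in order], order


def restore(sorted_block: Sequence[str], order: Sequence[int]) -> List[str]:
    """Undo `rearrange` using the recorded mapping (assumed stand-in for the rearranged bitmap)."""
    original = [""] * len(sorted_block)
    for sorted_pos, original_pos in enumerate(order):
        original[original_pos] = sorted_block[sorted_pos]
    return original
```

Sorting groups equal values into long runs, which is what makes the later run-length step effective; the recorded mapping is what allows the original row order to be recovered on decoding.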

Embodiment 2

[0080] The present invention further provides a coding structure for the improved bitmap index table (see Figure 6);

[0081] The code word length is 8 or 16 bits. The first bit is the type flag: 0 means a 0-fill word and 1 means a 1-fill word. The second bit indicates the length of the fill word: 0 means 1 byte and 1 means 2 bytes. For a 1-byte fill word, bits 3 to 8 are the counter of consecutive 0 or 1 bits; for a 2-byte fill word, bits 3 to 16 are the counter of consecutive 0 or 1 bits.
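
The fill-word layout described in [0081] can be illustrated with a short encoder and decoder. This is a sketch under assumptions the text does not settle: bits are numbered from the most significant bit, two-byte words are stored big-endian, the counter holds the run length itself (not the length minus one), and runs longer than the 14-bit counter allows are split into several words. The function names are hypothetical.

```python
SHORT_MAX = (1 << 6) - 1   # largest run a 1-byte word can hold (6-bit counter)
LONG_MAX = (1 << 14) - 1   # largest run a 2-byte word can hold (14-bit counter)


def encode_fill(bit: int, run: int) -> bytes:
    """Encode one run of identical bits as a 1-byte or 2-byte fill word."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    if not 1 <= run <= LONG_MAX:
        raise ValueError("run length out of range")
    if run <= SHORT_MAX:
        # type flag | length flag 0 | 6-bit counter
        return bytes([(bit << 7) | run])
    # type flag | length flag 1 | 14-bit counter
    return ((bit << 15) | (1 << 14) | run).to_bytes(2, "big")


def encode_bitmap(bits: str) -> bytes:
    """Run-length encode a string of '0'/'1' characters into fill words."""
    out = bytearray()
    i = 0
    while i < len(bits):
        j = i
        while j < len(bits) and bits[j] == bits[i]:
            j += 1
        run = j - i
        while run > 0:  # split runs that exceed the 14-bit counter
            chunk = min(run, LONG_MAX)
            out += encode_fill(int(bits[i]), chunk)
            run -= chunk
        i = j
    return bytes(out)


def decode_bitmap(data: bytes) -> str:
    """Expand fill words back into a string of '0'/'1' characters."""
    out = []
    i = 0
    while i < len(data):
        first = data[i]
        bit = first >> 7
        if first & 0x40:  # length flag set: 2-byte word
            run = ((first & 0x3F) << 8) | data[i + 1]
            i += 2
        else:
            run = first & 0x3F
            i += 1
        out.append(str(bit) * run)
    return "".join(out)
```

With these assumptions, a run of up to 63 identical bits costs one byte and a run of up to 16383 bits costs two bytes.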

Embodiment 3

[0083] The present invention further provides a coding structure for the rearranged improved bitmap (see Figure 7);

[0084] The code word length is 8 bits. The first bit is the type flag: 0 means a 0-fill word and 1 means a non-fill word (dirty word). Every 7 consecutive bits of the bitmap form a sequence block. For a 0-fill word, bits 2 to 8 are a counter of consecutive all-zero sequence blocks; for a dirty word, bits 2 to 8 hold one 7-bit sequence block that is not all zeros.
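
The 8-bit scheme in [0084] resembles byte-aligned run-length coding and can be sketched as follows. The details below are assumptions made for illustration: the flag is the most significant bit, the final sequence block is zero-padded to 7 bits, the original bit length is passed to the decoder separately, and a run of more than 127 all-zero blocks is split across several fill words. The function names are hypothetical.

```python
BLOCK_BITS = 7    # bits per sequence block
MAX_COUNT = 127   # largest value of the 7-bit block counter


def encode_rearranged(bits: str) -> bytes:
    """Encode a '0'/'1' string as 0-fill words and dirty words."""
    padded = bits + "0" * (-len(bits) % BLOCK_BITS)  # zero-pad the last block
    out = bytearray()
    zero_blocks = 0
    for i in range(0, len(padded), BLOCK_BITS):
        block = int(padded[i:i + BLOCK_BITS], 2)
        if block == 0:
            zero_blocks += 1
            if zero_blocks == MAX_COUNT:   # counter full: flush a fill word
                out.append(zero_blocks)
                zero_blocks = 0
        else:
            if zero_blocks:                # flush pending all-zero blocks
                out.append(zero_blocks)
                zero_blocks = 0
            out.append(0x80 | block)       # dirty word: flag 1 + 7 literal bits
    if zero_blocks:
        out.append(zero_blocks)
    return bytes(out)


def decode_rearranged(data: bytes, length: int) -> str:
    """Expand the words and trim the padding back to `length` bits."""
    parts = []
    for word in data:
        if word & 0x80:
            parts.append(format(word & 0x7F, "07b"))
        else:
            parts.append("0" * (BLOCK_BITS * word))
    return "".join(parts)[:length]
```

A sparse bitmap benefits most: one fill word can stand for up to 127 × 7 = 889 zero bits, while each block containing a 1 still costs exactly one byte.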

Abstract

The invention discloses a compressed encoding and decoding method for network flow data and bitmap indexes thereof, which comprises the following steps: 1, splitting an original stream data sequence according to field attributes, and storing the split sequence in a columnar database; 2, partitioning each column's data sequence into units of 4K rows to obtain original data blocks; 3, sorting the data in each original data block by row to obtain a rearranged data block, and generating a rearranged improved bitmap; 4, constructing an improved bitmap index table for the rearranged data blocks; and 5, performing run-length coding compression on the rearranged data block, the rearranged improved bitmap and the improved bitmap index table respectively, recording the type and length of each run of 0s or 1s, generating consecutive 8-bit or 16-bit words, and forming a compressed data block, a rearranged improved bitmap compression table and an improved bitmap index compression table. According to the method, the compression ratio is greatly improved, and the coding length is greatly reduced while the coding complexity is also reduced.

Description

Technical field

[0001] The invention relates to a compression encoding and decoding method for network stream data and its bitmap index, and belongs to the fields of computer network, information retrieval and big data analysis.

Background technique

[0002] The rapid development of Internet technology and mobile devices has brought us into the era of mobile Internet, allowing users to access any content on the network from anywhere and at any time, resulting in more abundant traffic data. According to Cisco's prediction, the user traffic data generated and accumulated by any large Internet company in daily operations is so huge that it cannot be measured by the data magnitude of the past ten years. For this reason, Cisco has predicted that the data traffic of the network will increase at a rate of 3 times between 2017 and 2022, and reach 4.8 Zetta bytes in 2022. According to China Unicom's data, the compound annual growth rate of mobile user traffic on Unicom's 4G network ...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F16/22
CPC: G06F16/2237
Inventor: 马戈, 顾维玺, 黄启洋, 王青春
Owner: 中国工业互联网研究院