System for unstructured data compressing based on big data and method thereof

An unstructured data compression processing technology, applied in electrical digital data processing, special data processing applications, instruments, etc. It addresses the problems of not maximizing savings in transmission bandwidth between branch nodes and of occupying excessive physical storage space, and achieves stable and reliable data collection, a small storage footprint, and low data volume.

Inactive Publication Date: 2017-06-13
SHENZHEN HI GALAXY NETWORK SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0009] 2. During transmission, all data is compressed only with conventional Zip before being sent, which does not maximize the savings in transmission bandwidth between the segmented nodes;
[0010] 3. During storage, if the data is stored in full according to chronological order and logical relationship, it occupies a large amount of physical storage space.




Embodiment Construction

[0032] The present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0033] The present invention specifically targets the unstructured nature of text information in current big data, that is, text and document data that cannot conveniently be represented in the two-dimensional logical tables of a database. This includes office documents in all formats, plain text, subsets of Standard Generalized Markup Language such as XML and HTML, various reports, and so on.

[0034] The present invention collects data streams in real time and uses the unstructured data compression processing algorithm proposed in this patent to compress the unstructured data stream efficiently, thereby saving network bandwidth cost during transmission and the physical resources occupied during storage.
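As a rough illustration of compressing a data stream incrementally, the sketch below uses Python's standard `zlib` module as a stand-in. The patent's own compression algorithm is not described in this excerpt, so the use of DEFLATE, the function names `compress_stream` and `decompress_stream`, and the sample records are all assumptions made purely for illustration.

```python
import zlib
from typing import Iterable, Iterator


def compress_stream(chunks: Iterable[bytes], level: int = 6) -> Iterator[bytes]:
    """Compress an incoming stream chunk by chunk.

    DEFLATE is used here only as a stand-in; it is not the patented algorithm.
    """
    compressor = zlib.compressobj(level)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer input and emit nothing yet
            yield out
    yield compressor.flush()  # emit whatever is still buffered


def decompress_stream(chunks: Iterable[bytes]) -> bytes:
    """Restore the original bytes from the compressed chunks."""
    decompressor = zlib.decompressobj()
    restored = b"".join(decompressor.decompress(c) for c in chunks)
    return restored + decompressor.flush()


# Hypothetical sample: repetitive XML-like records, typical of text-heavy data.
records = [
    f"<event id='{i}'><status>OK</status></event>\n".encode("utf-8")
    for i in range(1000)
]
original = b"".join(records)
compressed = b"".join(compress_stream(records))
restored = decompress_stream([compressed])
```

Because the compressor keeps its state across chunks, repeated markup in later records is encoded against earlier ones, which is why stream-level compression tends to beat compressing each record separately.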

[0035] As shown in Figure 1, a big data-based unstructured data compression processing system includes a d...


Abstract

The invention discloses a big data-based system for compressing unstructured data and a method thereof. In the system, a data collecting module, a residual quantity buffer module, an ExUDP module, a data receiving module, a time series database, a data reduction module, and a data analysis/mining interface are connected in sequence and pass data in one direction. The method comprises the steps of data flow collection, data compression, data storage, and data restoration. The unstructured data generated by big data collection has a smaller data size, requires less bandwidth for transmission, and occupies less storage space, so application scenarios with strict requirements on bandwidth and data speed can be met, and stable and reliable data collection, transmission, and storage can be achieved.
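The four method steps (collection, compression, storage, restoration) can be sketched as a one-way pipeline. This is a minimal sketch under stated assumptions, not the patented design: the class `Pipeline` and its method names are hypothetical, `zlib` stands in for the compression algorithm, a `deque` stands in for the buffer module, a dictionary keyed by timestamp stands in for the time series database, and the ExUDP transport is not modeled at all.

```python
import zlib
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Tuple


@dataclass
class Pipeline:
    """One-way flow: collect -> buffer -> compress -> store -> restore."""

    # Stand-in for the buffer module: holds records awaiting transmission.
    buffer: Deque[Tuple[int, bytes]] = field(default_factory=deque)
    # Stand-in for the time series database: timestamp -> compressed payload.
    # Assumes timestamps are unique; a later record would overwrite an earlier one.
    store: Dict[int, bytes] = field(default_factory=dict)

    def collect(self, timestamp: int, text: str) -> None:
        """Data flow collection: queue a text record with its timestamp."""
        self.buffer.append((timestamp, text.encode("utf-8")))

    def compress_and_store(self) -> None:
        """Data compression and storage: drain the buffer into the store."""
        while self.buffer:
            timestamp, payload = self.buffer.popleft()
            self.store[timestamp] = zlib.compress(payload)

    def restore(self, start: int, end: int) -> List[str]:
        """Data restoration: decompress records in [start, end], oldest first."""
        return [
            zlib.decompress(self.store[ts]).decode("utf-8")
            for ts in sorted(self.store)
            if start <= ts <= end
        ]


pipeline = Pipeline()
pipeline.collect(2, "<report>second</report>")
pipeline.collect(1, "<report>first</report>")
pipeline.compress_and_store()
restored_reports = pipeline.restore(1, 2)
```

Each stage only hands data forward, mirroring the unidirectional transmission between modules described in the abstract.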

Description

Technical field

[0001] The invention relates to the field of data stream collection, transmission, storage, and analysis in big data processing, and in particular to a big data-based unstructured data compression processing system and a method thereof.

Background technique

[0002] According to a survey report by IDC, 80% of enterprise data is unstructured, and this data is growing exponentially, by 60% every year. Unstructured data, as the name suggests, is information stored in a file system rather than in a database. According to reports, only 1%-5% of data is structured on average; more of the valuable information is held in unstructured data, and traditional data processing techniques cannot extract the value hidden in it. To meet this challenge, big data technology has emerged, and more and more companies around the world use it to collect, store, and analyze the data obtained in business operations.

[0003] Data in the big...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, H03M7/30
Inventors: 王倬遥, 高振国, 杨海雷
Owner: SHENZHEN HI GALAXY NETWORK SCI & TECH CO LTD