Statistical data compression method and system based on standard deviation

A statistical data compression technology, classified under code conversion and electrical components. It addresses cases that existing methods cannot handle well, and aims to use less CPU time, process less data, and improve both processing efficiency and long-term data storage efficiency.

Active Publication Date: 2021-09-03
SHANGHAI NETIS TECH

AI Technical Summary

Problems solved by technology

Neither of the above two methods can handle well

Examples

Embodiment 1

[0035] In this method, data compression is triggered by the upper limit on the number of statistical units. When compression is triggered, the statistical units are first sampled, and the upper and lower thresholds of the statistical index are obtained from the standard deviation of the sampled units. The statistical units whose index value lies between the lower and upper thresholds are then selected from the full set and aggregated, by dimensionality reduction, into low-dimensional statistical units. These are output in the statistical report together with the normal statistical units.
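The flow described in [0035] can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `compress_units`, the threshold form mean ± `k` × standard deviation, the default multiplier `k`, the plain random sampling, and the `project` function that maps a key to a lower dimension are all hypothetical choices that the source does not specify.

```python
import random
import statistics
from collections import defaultdict


def compress_units(units, sample_rate=0.1, k=2.0,
                   project=lambda key: key[:1], seed=0):
    """Split statistical units into retained and aggregated units.

    units: dict mapping a dimension tuple to a numeric index value.
    Returns (kept, aggregated). Units whose value lies inside the
    thresholds are summed under a lower-dimensional key; the others
    are kept at full dimension.
    """
    if not units:
        return {}, {}

    rng = random.Random(seed)
    keys = list(units)

    # Sample a subset of the units (assumption: uniform random sampling).
    n = max(2, int(len(keys) * sample_rate))
    sample = rng.sample(keys, min(n, len(keys)))

    # Thresholds from the sample mean and standard deviation
    # (assumption: mean +/- k * population standard deviation).
    values = [units[key] for key in sample]
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    lower, upper = mean - k * std, mean + k * std

    kept = {}
    aggregated = defaultdict(float)
    for key, value in units.items():
        if lower <= value <= upper:
            # Inside the thresholds: reduce the dimension and aggregate.
            aggregated[project(key)] += value
        else:
            kept[key] = value
    return kept, dict(aggregated)
```

With nine units of value 10 and a single unit of value 1000, sampling everything (`sample_rate=1.0`) and `k=2.0` folds the nine similar units into one low-dimensional unit and retains the outlier at full dimension.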

[0036] Referring to Figure 1, the method comprises the following steps:

[0037] Step 1. Determine the upper limit of the statistical unit. Specifically, an upper limit for the number of statistical units is predefined, and when new input data has new dimension values and new statistical units...

Embodiment 2

[0045] Embodiment 2 is a preferred example of Embodiment 1.

[0046] Assume that, in a monitoring scenario, TCP/IP sessions need to be analyzed and counted as a time series. The dimension of the statistics is then the quintuple. In some large-scale monitoring scenarios, such as the online banking of a large bank or the external interfaces of a data center, the number of sessions per minute may reach 10 million or more, which greatly exceeds the upper limit of a monitoring cluster. At the same time, keeping so many detailed intermediate sessions has little practical value. Most of them very likely belong to normal communication and can be aggregated into two-tuples for long-term storage.

[0047] Assume that the monitoring upper limit is set to 2 million per minute.

[0048] Step 1: Calculate the quintuple for the newly input data packet, and look for the statistical unit containing the same quintuple in the set of statistical units cached in the current mi...
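The lookup in Step 1 might look like the sketch below. Everything in it is an assumption made for illustration: the source does not list the fields of the quintuple, so `FiveTuple` uses the conventional TCP/IP layout; the class `MinuteCache`, the per-unit counters `packets` and `bytes`, and the rule that compression is signalled once the unit count reaches `max_units` are hypothetical as well.

```python
from collections import namedtuple

# Assumed field layout; the source text does not spell out the quintuple.
FiveTuple = namedtuple(
    "FiveTuple", "src_ip src_port dst_ip dst_port protocol"
)


class MinuteCache:
    """Statistical units cached for one time slot, keyed by quintuple."""

    def __init__(self, max_units):
        self.max_units = max_units
        self.units = {}  # FiveTuple -> {"packets": int, "bytes": int}

    def add_packet(self, key, size):
        """Update the unit for `key`, creating it if it does not exist.

        Returns True when the number of units has reached the upper
        limit, i.e. when the caller should trigger compression.
        """
        unit = self.units.get(key)
        if unit is None:
            unit = self.units[key] = {"packets": 0, "bytes": 0}
        unit["packets"] += 1
        unit["bytes"] += size
        return len(self.units) >= self.max_units
```

A dictionary keyed by the tuple keeps the lookup at constant average cost per packet, which matters at the session rates mentioned in [0046].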

Abstract

The invention provides a statistical data compression method and system based on standard deviation. The method comprises the following steps: 1. checking the number of statistical units against the upper limit to obtain the full set of statistical units; 2. according to a preset sampling rate, selecting from the full set the statistical units that meet a preset condition; 3. calculating the mean and standard deviation of the index over the selected statistical units, and deriving upper and lower index thresholds from the mean and the standard deviation; 4. filtering the statistical units by these thresholds, selecting every statistical unit whose index is greater than or equal to the lower threshold and less than or equal to the upper threshold, and aggregating the selected units to a lower dimension. Memory used for statistics is controlled through the upper limit on the number of statistical units, which avoids uncontrolled memory allocation for statistical units when data is input continuously in a streaming statistics scenario.
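Steps 3 and 4 can be written out as formulas. The abstract only says that the thresholds are obtained from the mean and the standard deviation, so the symmetric form with a multiplier $k$ and the use of the population standard deviation are assumptions made here for illustration. For sampled index values $x_1, \dots, x_n$:

```latex
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2}

T_{\text{low}} = \mu - k\,\sigma,
\qquad
T_{\text{high}} = \mu + k\,\sigma

\text{a unit with index } x \text{ is aggregated to a lower dimension if }
T_{\text{low}} \le x \le T_{\text{high}}
```

A larger $k$ widens the band, so more units are aggregated and fewer are retained at full dimension.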

Description

Technical field

[0001] The present invention relates to the technical field of data compression, and in particular to a statistical data compression method and system based on standard deviation.

Background

[0002] Streaming data is a sequential, massive, fast, and continuously arriving data sequence. In general, a data stream can be regarded as a dynamic data collection that grows without bound over time. Streaming data is used in network monitoring, sensor networks, aerospace, meteorological measurement and control, financial services, and other fields.

[0003] In streaming data statistics, data is input continuously. In a given dimension, the number of allocated statistical units keeps increasing, which increases memory usage. Memory usage is generally controlled by limiting the number of statistical units: after the number reaches the upper limit, no new statistical units are allocated for new data. You must wait for the stat...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): H03M7/30
CPC: H03M7/30
Inventors: 周奕庆, 蔡晓华
Owner: SHANGHAI NETIS TECH