Statistical data compression method and system based on standard deviation
A statistical data compression technology applied in the fields of code conversion, electrical components, etc. It addresses problems such as statistical data that cannot be processed well, and it reduces CPU time and the volume of data to be calculated while improving processing efficiency and long-term data storage efficiency.
Embodiment 1
[0035] In this method, data compression is triggered by an upper limit on the number of statistical units. When compression is triggered, the statistical units are first sampled, and the upper and lower thresholds of the statistical indicator are obtained by calculating the standard deviation of the sampled set of statistical units. The statistical units whose indicator values lie between the upper and lower thresholds are then filtered out of the full set of statistical units and aggregated, by dimensionality reduction, into low-dimensional statistical units. These are output in the statistical report together with the normal statistical units.
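The compression pass described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `compress_units`, the parameters `sample_size` and `k`, the use of mean ± k standard deviations as the thresholds, and summation as the aggregation rule are all assumptions, since the text only states that the thresholds are derived from the standard deviation of the sampled units.

```python
import random
import statistics
from collections import defaultdict


def compress_units(units, reduce_key, sample_size=1000, k=2.0, seed=0):
    """Hypothetical sketch of one compression pass.

    units:      dict mapping a full dimension tuple to a numeric indicator value
    reduce_key: function projecting a full key onto a lower-dimensional key
    Returns (kept, aggregated): units outside the thresholds keep their full
    dimensions; units inside the thresholds are merged by the reduced key.
    """
    rng = random.Random(seed)
    keys = list(units)
    # Sample the statistical units rather than scanning all of them.
    sampled = rng.sample(keys, min(sample_size, len(keys)))
    values = [units[key] for key in sampled]

    # Assumption: thresholds are mean +/- k population standard deviations.
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    lower, upper = mean - k * std, mean + k * std

    kept = {}
    aggregated = defaultdict(float)
    for key, value in units.items():
        if lower <= value <= upper:
            # Inside the thresholds: reduce dimensions and sum (assumed rule).
            aggregated[reduce_key(key)] += value
        else:
            # Outside the thresholds: keep the detailed unit unchanged.
            kept[key] = value
    return kept, dict(aggregated)
```

For example, calling `compress_units(units, lambda key: key[:1], k=1.0)` would merge all in-threshold units that share the first dimension value, while outliers keep their full key.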
[0036] Referring to Figure 1, the method comprises the following steps:
[0037] Step 1. Determine the upper limit of the statistical unit. Specifically, an upper limit for the number of statistical units is predefined, and when new input data has new dimension values and new statistical units...
Embodiment 2
[0045] Embodiment 2 is a preferred example of Embodiment 1.
[0046] Assume that in a monitoring scenario, TCP/IP sessions must be analyzed and counted as a time series. The statistical dimension is then the quintuple. In some large-scale monitoring scenarios, such as the online banking of large banks or the external interfaces of data centers, the number of sessions per minute may reach 10 million or more, which greatly exceeds the upper limit of a monitoring cluster. At the same time, retaining so many detailed intermediate sessions has little practical value for the data: most of them very probably belong to normal communication and can be aggregated into two-tuples for long-term storage.
[0047] Assume that the monitoring upper limit is set to 2 million per minute.
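As a rough illustration of this scenario, the sketch below keys statistical units by a five-tuple and reports when the number of units exceeds the configured upper limit. The field names in `FiveTuple` follow the conventional TCP/IP five-tuple and are an assumption, as is the choice of `(dst_ip, dst_port)` for the two-tuple; the text does not spell out either set of fields. The helper names `record_packet` and `to_two_tuple` are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class FiveTuple(NamedTuple):
    # Assumed fields: the conventional TCP/IP five-tuple.
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


def record_packet(units, key, max_units):
    """Count one packet for its five-tuple.

    Returns True when the number of distinct statistical units now exceeds
    max_units, i.e. when a compression pass would be triggered.
    """
    units[key] += 1
    return len(units) > max_units


def to_two_tuple(key):
    # Assumption: the reduced dimension keeps the destination endpoint only.
    return (key.dst_ip, key.dst_port)


# Usage with a tiny limit; the scenario above would use 2_000_000 per minute.
units = Counter()
a = FiveTuple("10.0.0.1", 40000, "10.0.0.9", 443, "TCP")
b = FiveTuple("10.0.0.2", 40001, "10.0.0.9", 443, "TCP")
for packet_key in (a, a, b):
    if record_packet(units, packet_key, max_units=1):
        print("upper limit exceeded, compression would be triggered")
```

Both sessions in the usage example map to the same two-tuple, so after aggregation they would occupy a single low-dimensional statistical unit.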
[0048] Step 1: Calculate the quintuple for the newly input data packet, and look for the statistical unit containing the same quintuple in the set of statistical units cached in the current mi...