
Method and apparatus for constructing variance-optimized histogram

A histogram construction method, applied in the field of big data computing, that addresses high construction time complexity, low data analysis accuracy, and limited memory space. It achieves rapid construction, preserves query accuracy, and resolves the problem of insufficient space.

Active Publication Date: 2017-11-24
NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT

AI Technical Summary

Problems solved by technology

In the prior art, constructing a variance-optimized histogram in a streaming data environment has a time complexity of O(n·(B/ε)²·log n), and each write of a new element costs O((B/ε)·log n). The approach is suited to ordered stream data, is limited by memory space, and can only construct a histogram over data within a specified time window.
[0006] In the prior-art dynamically adjusted approximate variance-optimized histogram method, each new element is inserted into its corresponding bucket, and buckets are then split or merged so that the sum of the variances of the histogram as a whole is approximately optimal. This method greatly reduces construction time complexity, but it must retain all of the original data in order to calculate the variance of the buckets to be split and the buckets to be merged, so it is poorly suited to dynamically building a variance-optimized histogram in a large streaming data environment.
[0007] The prior-art method of constructing a variance-optimized histogram from sample data presupposes knowledge of the data distribution and then randomly samples the continuous stream data. Its disadvantage is low data analysis accuracy.



Examples


Embodiment Construction

[0052] The technical solutions provided by the present invention are described in detail below through specific embodiments in conjunction with the accompanying drawings. To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below in conjunction with the drawings. The described embodiments are embodiments of the present invention, and all other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the scope of protection of the present invention.

[0053] The technical solution provided by the present invention uses an adaptive online variance-optimized sampling method to sample stream data within a limited space, and uses the sample data to dynamically construct an approximate variance optimiz...
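
To illustrate what sampling stream data within a limited space can look like, the following is a minimal sketch that assumes plain uniform reservoir sampling as a stand-in. The adaptive variance-optimized sampler of the invention is not detailed in this excerpt, so the function name `reservoir_sample` and its parameters are hypothetical and do not reproduce the patented method.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of at most k items from a stream.

    Stand-in technique (classic reservoir sampling), NOT the patent's
    adaptive sampler. Memory use is O(k) regardless of stream length.
    """
    rng = random.Random(seed)
    sample = []
    for i, x in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k elements.
            sample.append(x)
        else:
            # Element i replaces a stored one with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = x
    return sample


# Usage: keep 10 samples from a stream of 1000 values.
kept = reservoir_sample(range(1000), 10, seed=0)
```

The sample size stays fixed at k as new elements arrive, which mirrors the fixed sample count K described in the abstract. The replacement rule here is uniform and ignores variance.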



Abstract

The present invention provides a method and an apparatus for constructing a variance-optimized histogram. The method comprises: setting the number of samples K and the number of histogram buckets B according to the memory size and the required query accuracy; when a new element appears, optimizing the data samples held in memory with an online data sampling method so that the number of samples remains K; and dynamically constructing the variance-optimized histogram from the optimized in-memory data samples. The apparatus comprises an optimization unit and a construction unit. The technical scheme provided by the present invention reduces the influence of data size and distribution characteristics, and effectively reduces the interval retrieval error caused by data skew or uneven data distribution.
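
As a rough illustration of building a variance-optimized histogram from K in-memory samples with B buckets, the sketch below uses the textbook dynamic program that partitions sorted values into contiguous buckets with minimal total within-bucket squared error. This is an assumption made for illustration: the abstract does not state which construction algorithm the invention uses, and the name `v_optimal_histogram` is hypothetical.

```python
def v_optimal_histogram(values, b):
    """Split sorted values into min(b, n) contiguous buckets minimizing
    the summed squared deviation from each bucket's mean.

    Returns (total_sse, buckets). Exact O(b * n^2) dynamic program,
    intended for a small in-memory sample rather than a raw stream.
    """
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        return 0.0, []
    b = min(b, n)

    # Prefix sums of x and x^2 give any bucket's SSE in O(1).
    ps = [0.0] * (n + 1)
    pq = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        ps[i + 1] = ps[i] + x
        pq[i + 1] = pq[i] + x * x

    def sse(i, j):
        """Squared error of xs[i:j] around its mean (requires j > i)."""
        s = ps[j] - ps[i]
        q = pq[j] - pq[i]
        return max(q - s * s / (j - i), 0.0)

    inf = float("inf")
    # cost[k][j]: minimal SSE when xs[:j] is split into exactly k buckets.
    cost = [[inf] * (n + 1) for _ in range(b + 1)]
    cut = [[0] * (n + 1) for _ in range(b + 1)]
    cost[0][0] = 0.0
    for k in range(1, b + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + sse(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    cut[k][j] = i

    # Walk the recorded cut points backwards to recover the buckets.
    buckets = []
    j = n
    for k in range(b, 0, -1):
        i = cut[k][j]
        buckets.append(xs[i:j])
        j = i
    buckets.reverse()
    return cost[b][n], buckets


# Usage: three clusters of sample values, B = 3.
total, buckets = v_optimal_histogram([1, 2, 3, 10, 11, 12, 50], 3)
# buckets == [[1, 2, 3], [10, 11, 12], [50]], total == 4.0
```

Because the program runs over the K retained samples rather than the full stream, its cost depends on K and B, not on the stream length. Under this assumption a fixed sample size also bounds the construction cost.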

Description

Technical field

[0001] The invention relates to the field of big data computing, in particular to a method and device for constructing a variance-optimized histogram.

Background technique

[0002] In the era of big data, streaming data characterized by massive volume and high speed has become a hot research direction. At the same time, application demand for real-time processing and analysis of streaming data is growing explosively. For example, the peak transaction rate of Tmall's "Double Eleven" in 2015 reached 85,900 transactions per second, 2.23 times the peak of 38,500 transactions per second during "Double Eleven" in 2014; the total number of orders successfully traded on Ant Huabei reached 520,000 within 1 minute of its launch; and a Boeing 737 engine in flight generates nearly 20 TB of data per hour. Scenarios such as network monitoring, network traffic analysis, transaction log analysis, stock quotes, transactions, etc., ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/9024
Inventor: 史亮, 王勇, 张鸿
Owner: NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT