System and Method for Identifying Hierarchical Heavy Hitters in Multi-Dimensional Data

Multi-dimensional data and hierarchical technology, applied in multi-dimensional databases, instruments, data processing applications, etc.; addresses the problems that storing data for all nodes is prohibitive, that results are superfluous, and that the HH algorithm is inconvenient to use.

Inactive Publication Date: 2009-11-26
AMERICAN TELEPHONE & TELEGRAPH CO +1

AI Technical Summary

Problems solved by technology

The conventional heavy hitter (HH) algorithm does not account for any hierarchy in the data set.
Storing data for all nodes of the hierarchy, together with the amount of calculation involved, is prohibitive.
In addition, such a method provides superfluous results.




Embodiment Construction

[0011] The present invention may be further understood with reference to the following description and the appended drawings, wherein like elements are referred to with the same reference numerals. The exemplary embodiment of the present invention describes a method for identifying hierarchical heavy hitters (“HHHs”) in a multidimensional data structure. The multidimensional data structure and methods for identifying the HHHs therein will be discussed in detail below.

[0012] In the exemplary embodiments, the hierarchical data is described as data representing IP addresses in IP traffic data. IP addresses are by their nature hierarchical, i.e., each individual address is arranged into subnets, which are within networks, which are within the IP address space. Therefore, collecting multiple data points based on IP addresses, and generalizing these IP addresses, will result in a hierarchical data structure. The concept of generalization will be described in gre...
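As a rough illustration of how generalizing an IP address yields a hierarchy, the following Python sketch lists the progressively wider prefixes that contain a single address. The function name `generalizations` and the choice of octet-aligned prefix lengths (32, 24, 16, 8, 0) are assumptions made for this example only; the text above does not specify which levels are used.

```python
import ipaddress

def generalizations(address, prefix_lengths=(32, 24, 16, 8, 0)):
    """Return the chain of ever-wider prefixes that contain `address`.

    The prefix lengths are an assumed, octet-aligned hierarchy;
    any decreasing sequence of lengths would work the same way.
    """
    return [
        # strict=False lets ip_network mask off the host bits for us
        str(ipaddress.ip_network(f"{address}/{length}", strict=False))
        for length in prefix_lengths
    ]

chain = generalizations("192.168.10.7")
print(chain)
# ['192.168.10.7/32', '192.168.10.0/24', '192.168.0.0/16', '192.0.0.0/8', '0.0.0.0/0']
```

Each entry is the parent of the one before it, so a collection of addresses generalized this way forms a tree rooted at the whole address space.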



Abstract

A method including: receiving a plurality of elements of a data stream; storing a multi-dimensional data structure in a memory, said multi-dimensional data structure storing the plurality of elements as a hierarchy of nodes, each node having a frequency count corresponding to the number of elements stored therein; comparing the frequency count of each node to a threshold value based on a total number of the elements stored in the nodes; identifying each node for which the frequency count is at least as great as the threshold value as a hierarchical heavy hitter (HHH) node; and propagating the frequency count of each non-HHH node to its corresponding parent nodes.
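The steps in the abstract can be sketched in Python for the simplest case of a single dimension, where every node has exactly one parent. This is a minimal illustration, not the claimed method: the multi-dimensional case, in which a node has several parents, is not handled here. The names `find_hhh`, `parent`, and `LEVELS`, the octet-aligned prefix hierarchy, and the threshold `phi * total` are all assumptions introduced for the example.

```python
from collections import Counter
import ipaddress

# Assumed hierarchy: octet-aligned prefix lengths, most specific first.
LEVELS = (32, 24, 16, 8, 0)

def parent(prefix):
    """Return the next more general prefix, or None at the root."""
    net = ipaddress.ip_network(prefix)
    idx = LEVELS.index(net.prefixlen)
    if idx + 1 == len(LEVELS):
        return None
    return str(net.supernet(new_prefix=LEVELS[idx + 1]))

def find_hhh(addresses, phi):
    """One-dimensional sketch: report nodes whose count reaches phi * total.

    Counts of nodes that are not reported are propagated to their parent,
    so an ancestor is judged only on traffic not already attributed to a
    heavy-hitter descendant.
    """
    threshold = phi * len(addresses)  # assumed form of the threshold
    current = Counter(f"{a}/32" for a in addresses)
    hhh = {}
    for _ in LEVELS:
        next_level = Counter()
        for prefix, count in current.items():
            if count >= threshold:
                hhh[prefix] = count          # identified as an HHH node
            else:
                up = parent(prefix)
                if up is not None:
                    next_level[up] += count  # propagate to the parent
        current = next_level
    return hhh

stream = (["10.0.0.1"] * 4
          + ["10.0.1.1", "10.0.2.1", "10.0.3.1"]
          + ["192.168.1.1", "172.16.0.1", "8.8.8.8"])
result = find_hhh(stream, phi=0.25)
print(result)
# {'10.0.0.1/32': 4, '10.0.0.0/16': 3, '0.0.0.0/0': 3}
```

In this run the single address 10.0.0.1 is heavy on its own, so its count is not passed upward; the three other 10.0.x.1 addresses only become heavy once aggregated at 10.0.0.0/16, and the remaining scattered traffic is reported at the root.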

Description

INCORPORATION BY REFERENCE

[0001] The entire disclosure of U.S. patent application Ser. No. 10/802,605, entitled “Method and Apparatus for Identifying Hierarchical Heavy Hitters in a Data Stream” filed Mar. 17, 2004 is incorporated, in its entirety, herein. The entire disclosure of U.S. Provisional Patent Appln. 60/560,666, entitled “Diamond in the Rough: Finding Hierarchical Heavy Hitters in Multi-Dimensional Data” filed Apr. 8, 2004 is incorporated, in its entirety, herein.

BACKGROUND

[0002] Aggregation along hierarchies is a critical data summarization technique in a large variety of online applications, including decision support (e.g., online analytical processing (OLAP)), network management (e.g., internet protocol (IP) clustering, denial-of-service (DoS) attack monitoring), text (e.g., on prefixes of strings occurring in the text), and extensible markup language (XML) summarization (i.e., on prefixes of root-to-leaf paths in an XML data tree). In such applications, data is inherent...


Application Information

Patent Type & Authority: Applications (United States)
IPC: IPC(8): G06F17/30
CPC: G06F17/30592, G06F17/30489, G06F16/283, G06F16/24556, Y10S707/99945, Y10S707/99948
Inventor: CORMODE, GRAHAM; KORN, PHILIP RUSSELL; MUTHUKRISHNAN, SHANMUGAVELAYUTHAM; SRIVASTAVA, DIVESH
Owner AMERICAN TELEPHONE & TELEGRAPH CO