Efficient computation of top-K aggregation over graph and network data
A graph and network data technology, applied in computing, instruments, electric digital data processing, etc., that addresses the following problems: top-k operations cannot be answered easily in Structured Query Language (SQL) query engines, h-hop queries cannot be processed, and advanced analysis of social networks is not supported.
Examples
Example 1
[0058] An online professional networking tool helps people discover inside connections to recommended job candidates, industry experts and business partners. Companies naturally want to submit queries that find the top-k candidates who have strong expertise and are referred by professionals in the same domain. For example, a query may find the top-k candidates who have experience in database research and are also referred by many database experts.
[0059] There is no doubt that the above queries are useful for emerging applications in many online social communities and other networks, such as book recommendations on a website of an online retailer, targeted marketing on a social networking website, and gene function finding in biological networks. These applications are unified by a general aggregation query definition over a network. In general, a top-k aggregation on graphs needs to solve three problems, listed below as P1, P2 and P3. Note that P1, P2 and P3 are listed below for pr...
Example 2
[0071] FIG. 3A and FIG. 3B illustrate an example of the first forward processing approach. Given a graph 300-1 shown in FIG. 3A with nodes 302-1, 302-2, 302-3, 302-4, 302-5 and 302-6 (a.k.a. nodes e1, e2, e3, e4, e5 and e6, respectively), the SUM function over 1-hop neighbors is computed to generate aggregate scores (a.k.a. aggregate values). Node e3 is selected first for forward processing, and SUM(e3,1) is computed as 1.8 (i.e., the aggregate score of e3).
[0072] In the SUM function in FIGS. 3A and 3B and in the examples that are presented below, the “1” parameter (e.g., the “1” in SUM(e3,1)) indicates 1-hop. Thus, in Example 2, SUM(e3,1) is the sum of the score assigned to node e3 plus the scores assigned to the 1-hop neighbors of node e3 (i.e., score of node e3 + score of node e1 + score of node e2 + score of node e4 + score of node e5, or 0.2 + 0.5 + 0 + 0.1 + 1 = 1.8).
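The 1-hop SUM described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the function name sum_1hop and the dictionary layout are assumptions made here, the node scores are the ones that appear in the worked sum, and only the neighbor list of node e3 is encoded.

```python
import math

# Node scores as they appear in the worked sum for SUM(e3,1).
scores = {"e1": 0.5, "e2": 0.0, "e3": 0.2, "e4": 0.1, "e5": 1.0}

# 1-hop neighbors of e3 as listed in the text; other nodes are omitted here.
neighbors = {"e3": ["e1", "e2", "e4", "e5"]}

def sum_1hop(node, scores, neighbors):
    """Return the score of `node` plus the scores of its 1-hop neighbors.

    Hypothetical helper for illustration only.
    """
    members = [node, *neighbors[node]]
    # fsum limits floating-point rounding error when adding decimals.
    return math.fsum(scores[m] for m in members)

print(round(sum_1hop("e3", scores, neighbors), 6))
```

The result should agree with the value 1.8 given for SUM(e3,1), up to floating-point rounding.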
[0073] In Example 2, node e1 is selected as the next node and SUM(e1,1) is computed as 0.5 + 0 + 0.2 = 0.7. Thus, the aggregate score of ...
Example 3
[0091] Consider graph 400 in FIG. 4, which depicts an example of forward processing using differential index-based pruning. For node e3, the differential indexes of its 1-hop neighbors are: delta(e1−e3)=0, delta(e2−e3)=0, delta(e4−e3)=0, and delta(e5−e3)=1. The differential index values for delta(e1−e3), delta(e2−e3), and delta(e4−e3) are zero because the 1-hop nodes of e1, e2 and e4 are each a subset of e3's 1-hop nodes. The differential index value for delta(e5−e3) is 1 because e5 has one node that is in its 1-hop but not in e3's 1-hop (i.e., node e6).
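The differential index values above can be reproduced with a short Python sketch. Several things here are assumptions rather than statements from the source: the edge list is inferred from the sums and subset relations given in Examples 2 and 3 (in particular the e1-e2 edge), each 1-hop set is taken to include the node itself (otherwise e3 would count against its own neighbors), and the helper names build_adjacency, one_hop and delta are hypothetical.

```python
# Undirected edges inferred from Examples 2 and 3 (an assumption, not FIG. 4 itself).
edges = [("e1", "e2"), ("e1", "e3"), ("e2", "e3"),
         ("e3", "e4"), ("e3", "e5"), ("e5", "e6")]

def build_adjacency(edges):
    """Build a node -> set-of-neighbors map from undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def one_hop(adj, node):
    """1-hop set of `node`: the node itself plus its direct neighbors."""
    return {node} | adj[node]

def delta(adj, u, v):
    """Number of nodes in u's 1-hop set that are not in v's 1-hop set."""
    return len(one_hop(adj, u) - one_hop(adj, v))

adj = build_adjacency(edges)
for u in ("e1", "e2", "e4", "e5"):
    print(f"delta({u}-e3) = {delta(adj, u, 'e3')}")
```

Under these assumptions, only e5 yields a nonzero value, because e6 is the single node reachable from e5 in one hop that is not within one hop of e3.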
[0092] In Example 3, forward processing is performed on node e3, with the SUM aggregate evaluated over node e3's 1-hop nodes to obtain SUM(e3,1)=1.8. Then the upper bounds of e3's neighbor nodes are computed, as FIG. 4 shows. For instance, the upper bound of SUM(e1,1) is 1.8 because, given delta(e1−e3)=0, the aggregate value of node e1 can be at most SUM(e3,1). The upper bound of SUM(e4,1) is 1.1 because node e4's own score is 0.1 and node e4 has only one neighbor. Thus, N(e4...


