Enabling recommendations and community by massively-distributed nearest-neighbor searching

A nearest-neighbor searching and recommendation technology that harnesses end users' own computers for massively-distributed nearest-neighbor searching. It addresses three problems: server hardware limits that constrain the system's ability to find the best matches, the failure of existing solutions to leverage users' own computers, and the difficulty of matching people with extremely similar tastes and interests.

Status: Inactive
Publication Date: 2006-01-26
EMERGENT MUSIC
4 Cites, 474 Cited by

AI Technical Summary

Benefits of technology

[0991] Further, it should be noted that a user may have a plurality of taste profiles. For instance, a user may have one type of music he likes to listen to while studying, and another type he likes to listen to while dancing. Preferred embodiments of the invention allow the user to choose different taste profiles—and correspondingly different nearest neighbors and recommendations—according to mood.
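As a rough illustration of paragraph [0991], a user record could hold one taste profile per mood and hand back whichever one the user selects. This is only a sketch: the `User` class, the mood names, the item identifiers, and the 1-to-5 rating scale are assumptions made for the example and do not come from the source.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical user record holding several taste profiles."""
    user_id: str
    # mood name -> {item id -> rating}; layout and rating scale are assumed
    profiles: dict[str, dict[str, float]] = field(default_factory=dict)

    def profile_for(self, mood: str) -> dict[str, float]:
        """Return the taste profile the user has chosen for this mood."""
        return self.profiles[mood]


user = User(
    "u1",
    {
        "studying": {"ambient_album": 5.0, "dance_single": 1.0},
        "dancing": {"ambient_album": 2.0, "dance_single": 5.0},
    },
)
active_profile = user.profile_for("studying")
```

Whichever profile is active would then be the one passed to the nearest-neighbor search, so different moods can yield different neighbors and recommendations.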
[0999] Note that some similarity metrics, such as Pearson's r, enable the computation of levels of probabilistic certainty, or p-values, with respect to a null hypothesis. In many cases, such as r, it is possible to state a null hypothesis that roughly corresponds to the concept “the two users have no particular tendency to agree.” This enables the system to take into account the fact that some pairs of users have more data to base the metric on than others, and thus more reason to have confidence. This is a significant advantage over many of the simpler techniques. However, this approach nevertheless has a drawback. As an example, consider two users with a very large number of items in common which they have each rated, where a p-value derived from r is used as the metric. Suppose further that on average, there is a slight tendency to agree rather than disagree. Then, simply due to the large number of items with ratings in common, the p-value may be extremely indicative of rejection of the null hypothesis, even though on average, there isn't a very unusual amount of agreement between ratings. In practical use with a large number of users, where not too many nearest neighbors need to be found, this effect is normally not a major problem, because there will also be users who do have a lot of agreement and who also have a high number of rated items in common, and such pairings will result in even greater extremities of p-values. In such cases, there can be a lot of confidence that the similarity metric is finding users who are actually very similar in taste, even though there may be other pairings, with even more similarity, that are left behind due to not having as much data for comparison.
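The drawback described in paragraph [0999] can be seen numerically. The sketch below computes Pearson's r and an approximate two-sided p-value using the Fisher z-transformation (a normal approximation, chosen here so the example needs only the standard library; the source does not say how its p-values are computed). The function names and the sample values of r and n are assumptions for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length rating lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 ratings")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("ratings must not be constant")
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Approximate two-sided p-value for H0 'no tendency to agree'.

    Uses the Fisher z-transformation, a normal approximation.
    """
    if n < 4:
        raise ValueError("approximation needs at least 4 co-rated items")
    if abs(r) >= 1.0:
        return 0.0  # perfect correlation: atanh is undefined at +/-1
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


# Slight agreement over many co-rated items vs. strong agreement over few.
p_weak_many = approx_p_value(0.1, 10_000)
p_strong_few = approx_p_value(0.9, 10)
```

Under this approximation `p_weak_many` comes out far smaller than `p_strong_few`, which is the effect the paragraph warns about: a pair with only slight agreement but many co-rated items can outrank a pair with strong agreement but little data.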
[1006] What is needed is a means for facilitating retrieval of representations of nearest neighbor candidate taste profiles and associated user identifiers in an order such that said nearest neighbor candidate taste profiles tend to be at least as similar to a taste profile of the target user according to a predetermined similarity metric as are subsequently retrieved ones of said nearest neighbor candidate taste profiles.
[1010] In preferred embodiments the data used in facilitating this retrieval is a subset of the data used in the similarity metric, or a summary derived from that data, or a combination of the two, in order to lower computational costs.
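One way to picture paragraphs [1006] and [1010] together is a generator that ranks candidates by a cheap summary of each profile and yields them most-promising first, leaving the full similarity metric to be computed only on the candidates actually consumed. This is a minimal sketch, not the patent's method: the summary vectors, the use of Euclidean distance as the ordering proxy, and the name `candidates_in_order` are all assumptions.

```python
import heapq
import math


def candidates_in_order(target_summary, candidate_summaries):
    """Yield (user_id, summary_distance), closest summary first.

    candidate_summaries maps user_id -> short summary vector (an assumed
    layout). The ordering only tends to match the full metric's ordering,
    because the summary carries less information than the full profile.
    """
    heap = [
        (math.dist(target_summary, summary), user_id)
        for user_id, summary in candidate_summaries.items()
    ]
    heapq.heapify(heap)
    while heap:
        distance, user_id = heapq.heappop(heap)
        yield user_id, distance


summaries = {"a": (3.0, 4.0), "b": (1.0, 0.0), "c": (0.0, 2.0)}
order = [uid for uid, _ in candidates_in_order((0.0, 0.0), summaries)]
```

Because the generator is lazy, a client that runs out of time can simply stop iterating and still have examined the most promising candidates first.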
[1040] Note that to increase the performance over protocols such as Gnutella that are popular at the current time, currently preferred embodiments use the peer-to-peer method described in [12]. Also, at the time that user machines connect for a new session in the peer to peer network, they should connect to randomly chosen seed nodes in order to increase the randomness of results obtained from searches.
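The random seed-node choice in paragraph [1040] might be sketched as follows. The peer-to-peer method of [12] is not reproduced here; the function name, the node identifiers, and the parameter `k` (how many seeds to connect to) are assumptions for illustration only.

```python
import random


def choose_seed_nodes(known_seeds, k, rng=None):
    """Pick up to k distinct seed nodes uniformly at random.

    rng may be a seeded random.Random for reproducible tests; by default
    a fresh generator is used so each session gets a different choice.
    """
    rng = rng or random.Random()
    seeds = list(known_seeds)
    return rng.sample(seeds, min(k, len(seeds)))


session_seeds = choose_seed_nodes(["n1", "n2", "n3", "n4"], 2, random.Random(0))
```

Drawing a fresh random sample at the start of each session means repeated searches start from different parts of the network, which is the stated reason for the random choice.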

Problems solved by technology

However, none of the existing solutions effectively leverages the fact that users of online recommendations systems and online community systems typically own their own computers, and have the opportunity to make the central processing units of those computers available for making such systems more useful and enjoyable.
In particular, the task of matching people with extremely similar tastes and interests becomes very computationally difficult as the number of people increases and as the complexity of the similarity measure increases.
With hundreds of thousands or even millions of people such as are typically enrolled in major online services, limitations of server hardware resources constrain the system's ability to find the best matches between people based on taste and interest.


Embodiment Construction

[1061] FIG. 1 illustrates an embodiment in which each client node is responsible for determining its own user's nearest neighbors. Representations of user profiles and associated user identifiers 5 are provided in order of likely similarity to the user. See, for example, the descriptive text for clusterfitter.py in Appendix 4, which describes a way a client node can determine the order in which to download each one of a set of clusters. (The source code itself appears in the computer program listing appendix.) In the preferred embodiment, these clusters are downloaded with the help of other client nodes using BitTorrent. In the preferred embodiment there are a limited number of clusters, retrieved by each client node in its own appropriate order. Not every cluster is retrieved by every client, because only a certain amount of time is available to do the downloads. But on the whole, each can generally, in time, be found on a number of client nodes. This enables a BitTorrent tracker runni...
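The idea that each client retrieves clusters in its own order, and stops when its time runs out, can be sketched as below. This is not the clusterfitter.py algorithm from Appendix 4, which is not shown in this excerpt. The centroid representation, the Euclidean-distance ranking, the per-cluster download-time estimates, and the function name are all assumptions, and the BitTorrent transfer itself is omitted.

```python
import math


def plan_cluster_downloads(user_vector, clusters, time_budget_s):
    """Return cluster ids to download, most similar first, within a budget.

    clusters maps cluster_id -> (centroid, estimated_download_seconds);
    both fields are hypothetical. Planning stops at the first cluster
    that would exceed the time budget.
    """
    ranked = sorted(
        clusters.items(),
        key=lambda item: math.dist(user_vector, item[1][0]),
    )
    plan, spent = [], 0.0
    for cluster_id, (_centroid, seconds) in ranked:
        if spent + seconds > time_budget_s:
            break
        plan.append(cluster_id)
        spent += seconds
    return plan


clusters = {
    "c1": ((0.0, 0.0), 10.0),
    "c2": ((5.0, 5.0), 10.0),
    "c3": ((1.0, 1.0), 10.0),
}
plan = plan_cluster_downloads((0.0, 0.0), clusters, time_budget_s=25.0)
```

With these sample values the client would fetch its two closest clusters and skip the third. Different clients have different profile vectors and so end up holding different clusters.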


Abstract

The computer associated with each of a potentially large number of end users is harnessed to provide a massively-distributed mechanism for finding the nearest neighbors of each user, according to tastes and/or interests. Once these nearest neighbors are determined, their taste and/or interest profiles are leveraged for highly accurate recommendations, and their online addresses are leveraged for community purposes.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is a continuation-in-part of International Patent Application PCT/US2005/02731, filed 27 Jan. 2005 for Enabling Recommendations and Community By Massively-Distributed Nearest-Neighbor Searching, which claims priority from and benefit of the following U.S. Provisional Patent Applications: 60/540,041 filed 27 Jan. 2004, for Enabling Recommendations and Community by Massively-Distributed Nearest-Neighbor Searching; 60/611,222 filed 18 Sep. 2004 for Community and Recommendation System; and 60/635,197 filed 9 Dec. 2004 for Community and Recommendation System. Applicant hereby claims priority from and benefit of the aforesaid applications 60/611,222 and 60/635,197. Applicant hereby incorporates by reference herein to the fullest extent allowed by law the entire disclosure of each of the aforesaid applications, including all text, drawings, and code whether on paper or machine-readable media.

RESERVATION OF COPYRIGHT

Copyright...


Application Information

Patent Type & Authority: Applications (United States)
IPC(8): G06F15/16
CPC: G06F15/16, G06Q10/00, H04L12/58, H04L12/00, H04L12/185, G06Q30/02, H04L12/1827, H04L12/4625, G06Q10/107, H04L51/52, H04L51/00
Inventor: ROBINSON, GARY
Owner: EMERGENT MUSIC