Method and system for distributing users to clusters

A user clustering technology, applied in special-purpose data processing and computing, that addresses the difficulty of implementing existing clustering techniques at very large scale.

Active Publication Date: 2014-11-12
GOOGLE LLC

AI Technical Summary

Problems solved by technology

However, these techniques have disadvantages. For example, the running time of hierarchical agglomerative clustering (HAC) is O(n^2), which is impractical when n is in the hundreds of millions; and the k-means algorithm needs to represent the mean of the data points, which is not feasible when the data points are sets.

Embodiment Construction

[0024] FIG. 1 shows a logical illustration of the minhash method for clustering users described below. While this approach can be implemented, it is presented here primarily for purposes of explanation. A practical implementation for clustering users in a system with a large number of users is described below with reference to FIG. 2.

[0025] As shown in FIG. 1, the inputs to the minhash method are: a universe of items 110, denoted U; a set of k permutations 112, denoted p1, p2, . . . , pk; and a user's interest set 114, denoted X_A for user A.

[0026] Each permutation is a permutation of U selected uniformly at random from the set of all permutations of U, so that every permutation has the same probability of being selected. A permutation is any one-to-one mapping of U onto U (a bijection). Such permutations are only possible if U is fixed and countable. The integer k is a selectable parameter. Usually the ...
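The inputs described in paragraphs [0025] and [0026] can be sketched in a few lines of Python. This is an illustrative sketch only, not the patent's implementation: the helper names `random_permutation` and `minhash_signature`, the ten-item universe, and the choice k = 3 are all assumptions made for the example. It assumes the usual minhash definition, in which the i-th value for a user is the smallest image of any item of the interest set under permutation p_i.

```python
import random


def random_permutation(universe, rng):
    """Return a uniformly random bijection of `universe` onto itself, as a dict."""
    items = list(universe)
    shuffled = items[:]
    rng.shuffle(shuffled)  # every ordering is equally likely
    return dict(zip(items, shuffled))


def minhash_signature(interest_set, permutations):
    """For each permutation p_i, take the minimum of p_i(x) over x in the set."""
    return [min(p[x] for x in interest_set) for p in permutations]


rng = random.Random(0)   # fixed seed so the example is repeatable
U = range(10)            # hypothetical item universe U
k = 3                    # hypothetical number of permutations
perms = [random_permutation(U, rng) for _ in range(k)]

X_A = {1, 4, 7}          # hypothetical interest set of user A
X_B = {1, 4, 7}          # user B happens to have the same interests
sig_A = minhash_signature(X_A, perms)
sig_B = minhash_signature(X_B, perms)
```

Two users with identical interest sets always receive identical signatures, and users with heavily overlapping sets agree on many of the k values. Storing an explicit permutation of U is only workable for a small universe, which is consistent with the text presenting this form mainly for explanation.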

Abstract

Methods and apparatus, including systems and computer program products, to provide clustering of users in which users are each represented as a set of elements representing items, e.g., items selected by users using a system. In one aspect, a program operates to obtain a respective interest set for each of multiple users, each interest set representing items in which the respective user expressed interest; for each of the users, to determine k hash values of the respective interest set, wherein the i-th hash value is a minimum value under a corresponding i-th hash function; and to assign each of the multiple users to each of the respective k clusters established for the respective user, the i-th cluster being represented by the i-th hash value. The assignment of each of the users to k clusters is done without regard to the assignment of any of the other users to k clusters.
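The assignment step summarized in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the seeded SHA-256 construction in `seeded_hash`, the `(i, value)` cluster identifier, and the sample users are all hypothetical choices made for illustration. Each user's k cluster identifiers are computed from that user's interest set alone, so no user's assignment depends on any other user's.

```python
import hashlib
from collections import defaultdict


def seeded_hash(seed, item):
    """Hypothetical i-th hash function: SHA-256 of 'seed:item', truncated to 64 bits."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def cluster_ids(interest_set, k):
    """Return k cluster ids; the i-th is the minimum i-th hash over the set."""
    return [(i, min(seeded_hash(i, x) for x in interest_set)) for i in range(k)]


def assign_users(interest_sets, k):
    """Map each cluster id to the set of users assigned to it."""
    clusters = defaultdict(set)
    for user, items in interest_sets.items():
        if not items:
            continue  # an empty interest set has no minimum hash value
        for cid in cluster_ids(items, k):
            clusters[cid].add(user)
    return clusters


users = {
    "A": {"item1", "item2", "item3"},
    "B": {"item1", "item2", "item3"},
    "C": {"item9"},
}
clusters = assign_users(users, k=4)
```

Because each user is processed independently, the loop in `assign_users` could be split across machines without coordination, which is the property the last sentence of the abstract points to.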

Description

[0001] This application is a divisional application of the patent application entitled "Scalable User Clustering Based on Set Similarity", a PCT international application with an international filing date of August 15, 2006 and international application number PCT/US2006/031868, which entered the Chinese national phase with national application number 200680038100.7.

Technical field

[0002] The present invention relates to digital data processing, and more particularly to grouping users of computer applications or systems into clusters.

Background

[0003] Grouping users into clusters serves several purposes. For user personalization, for example, a well-known technique, collaborative filtering, involves clustering users and recommending to a user items in which other users in that user's cluster have expressed interest. A user may generally be considered to express interest in an item in a variety of ways, for exam...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
CPC: G06Q30/02, G06F17/30867, G06F16/9535
Inventor: 马尤尔·达塔尔 (Mayur Datar), 阿舒托什·加尔格 (Ashutosh Garg)
Owner: GOOGLE LLC