Accelerated large-scale similarity calculation

Correlation computing technology, applied to calculation, complex mathematical operations, and instruments, that addresses calculation-intensive and time-consuming processing.

Pending Publication Date: 2020-04-03

GOOGLE LLC

Cites: 0; Cited by: 2

AI Technical Summary

Problems solved by technology

This process becomes increasingly computationally intensive and time-consuming as the number of records stored increases

Embodiment Construction

[0024] This document describes techniques for implementing a k-minimum hash or k-minimum value ("KMV") data processing algorithm to sort data preloaded at a graphics processing unit (GPU) in order to compute relationships between entities. Specifically, the described techniques can accelerate data correlation calculations (e.g., for determining similarity between entities) by storing pre-sorted data on the GPU, so that the computing units of the GPU can quickly determine the relationships between entities. The GPU determines relationships by performing a specific type of correlation algorithm. Because the GPU no longer needs to pre-sort the data before performing the correlation algorithm, relationships can be computed or determined at the GPU faster than in current systems.
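As a rough illustration of the KMV idea in paragraph [0024], the following Python sketch builds a sorted k-minimum-value sketch per entity and estimates similarity by merging two pre-sorted sketches. It runs on a CPU and is not the patent's GPU implementation; the function names, the use of SHA-1 as the hash, and the Jaccard-style estimator are all assumptions made for this example.

```python
import hashlib


def hash_to_unit(item: str) -> float:
    """Map an item to a pseudo-uniform value in [0, 1). SHA-1 is an assumed choice."""
    digest = hashlib.sha1(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def kmv_sketch(items, k):
    """Return the k smallest distinct hash values, sorted ascending."""
    return sorted({hash_to_unit(x) for x in items})[:k]


def jaccard_estimate(a, b, k):
    """Estimate Jaccard similarity from two pre-sorted KMV sketches.

    Because both sketches are already sorted, one linear merge visits the
    k smallest values of their union without any further sorting.
    """
    i = j = seen = shared = 0
    while seen < k and (i < len(a) or j < len(b)):
        if j >= len(b) or (i < len(a) and a[i] < b[j]):
            i += 1                      # value appears only in sketch a
        elif i >= len(a) or b[j] < a[i]:
            j += 1                      # value appears only in sketch b
        else:
            shared += 1                 # value appears in both sketches
            i += 1
            j += 1
        seen += 1
    return shared / seen if seen else 0.0


if __name__ == "__main__":
    s1 = kmv_sketch(["a", "b", "c", "d"], k=100)
    s2 = kmv_sketch(["c", "d", "e", "f"], k=100)
    print(jaccard_estimate(s1, s2, k=100))
```

When k is at least as large as both sets, the estimate equals the exact Jaccard similarity; with smaller k it is an approximation whose accuracy grows with k. The merge step is where pre-sorting pays off, since no sorting is needed at comparison time.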

[0025] For example, entity correlation systems store large amounts of data including information describing different entities. The system may include a central processing u...

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining data stored at a storage device using a first processor of an entity correlation system. The data includes information about multiple entities. The first processor generates data arrays using the obtained data. Each data array includes parameter values for multiple entities and is configured for processing at a respective computing cell of a second processor. The system provides the data arrays to the second processor. The second processor is configured to execute a correlation algorithm to concurrently process the data arrays at the respective computing cells. The second processor computes a correlation score based on calculations performed at the cells using the algorithm and the parameter values. The system determines relationships among entities of the data arrays based on the correlation score. The relationships indicate overlapping attributes or similarities that exist among subsets of entities.
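The data flow in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed system: a thread pool stands in for the second processor's computing cells, each entity's parameter values are modeled as a set, the correlation score is a plain Jaccard ratio against a query set, and all names (`build_data_arrays`, `score_cell`, `correlate`, `threshold`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def build_data_arrays(records, arrays_count):
    """'First processor' step: split entity records into one array per cell."""
    entities = sorted(records.items())
    if not entities:
        return []
    size = -(-len(entities) // arrays_count)  # ceiling division
    return [entities[i:i + size] for i in range(0, len(entities), size)]


def jaccard(a, b):
    """Assumed correlation score: overlap of two parameter-value sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_cell(data_array, query):
    """Work done at one 'cell': score every entity in its array."""
    return [(name, jaccard(params, query)) for name, params in data_array]


def correlate(records, query, cells=4, threshold=0.5):
    """Score all entities concurrently and report those related to the query."""
    arrays = build_data_arrays(records, cells)
    with ThreadPoolExecutor(max_workers=cells) as pool:
        per_cell = list(pool.map(lambda arr: score_cell(arr, query), arrays))
    scores = dict(pair for cell in per_cell for pair in cell)
    related = sorted(name for name, s in scores.items() if s >= threshold)
    return scores, related


if __name__ == "__main__":
    records = {"e1": {1, 2, 3}, "e2": {2, 3, 4}, "e3": {7, 8}}
    print(correlate(records, query={1, 2, 3}, cells=2))
```

Each array is scored independently of the others, which is what allows the per-cell work to run concurrently; the final step only merges the per-cell scores and applies a threshold.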

Description

Background

[0001] This specification relates to computing processes for large-scale similarity calculations.

[0002] In many cases, it may be desirable to determine whether, or to what extent, an input sample matches more than one stored record. As one example, it may be desirable to determine whether a DNA sample matches any of the records stored in a database of DNA records. A database may contain many DNA records (e.g., hundreds of thousands or even millions of records). In general, it may be desirable to retrieve a certain number (n) of stored records from the database in response to an input sample. The retrieved records may be the n records in the database that are determined to be the n closest matches to the input sample. The number n of retrieved records is smaller than the total number of records in the database, usually much smaller. The n retrieved records can be ordered with the most probable match first. Conventionally, such a retrieval process may in...
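The retrieval task in paragraph [0002] (return the n closest stored records, best match first) can be illustrated with a simple exhaustive scan. This is only a sketch of the problem statement, not a method taken from the patent: the `position_match` similarity, the record identifiers, and the sample strings are hypothetical.

```python
import heapq


def position_match(a, b):
    """Assumed similarity: fraction of positions at which two equal-length strings agree."""
    if not a or len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def top_n_matches(sample, database, n, similarity=position_match):
    """Score every stored record against the sample and keep the n best, best first."""
    scored = ((similarity(sample, rec), rec_id) for rec_id, rec in database.items())
    return [(rec_id, score) for score, rec_id in heapq.nlargest(n, scored)]


if __name__ == "__main__":
    database = {"r1": "ACGT", "r2": "ACGA", "r3": "TTTT", "r4": "ACCA"}
    print(top_n_matches("ACGT", database, n=2))
```

The scan touches every record once per query, so its cost grows with the number of stored records.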

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06F17/10

CPC: G06F18/22, G06F18/2431, G06F16/288, G06F17/10, G06F17/15, G06F17/18, G06F16/906, G06F12/0802

Inventor: Ma Lin (马琳), N. Wiegand (N.威甘德)

Owner: GOOGLE LLC
