Sample set processing method and device and sample query method and device

The sample set processing and sample query technology, applied in the computer field, addresses problems such as long query times, heavy computation, and difficulty in meeting user needs. It achieves fast query and retrieval and improves computing speed.

Active Publication Date: 2018-07-03
ADVANCED NEW TECH CO LTD

AI Technical Summary

Problems solved by technology

When the sample database to be searched holds a large amount of data, a brute-force search that performs high-dimensional vector calculations for every sample requires a very large amount of computation. The resulting query time is too long to meet user needs.
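To make the cost concrete, here is a minimal sketch of the brute-force baseline described above. The function names, the Euclidean metric, and the toy data are assumptions for illustration only; the source does not specify a distance measure.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_search(samples, query, top_k=1):
    """Compare the query with every sample.

    For N samples of dimension D this performs N distance computations
    of D terms each, which is the cost the summary calls too large.
    """
    scored = [(euclidean(vec, query), sample_id) for sample_id, vec in samples.items()]
    scored.sort()
    return [sample_id for _, sample_id in scored[:top_k]]


# Hypothetical toy sample set: sample id -> feature vector.
samples = {
    "a": [0.0, 0.0, 0.0, 0.0],
    "b": [1.0, 1.0, 1.0, 1.0],
    "c": [5.0, 5.0, 5.0, 5.0],
}
nearest = brute_force_search(samples, [0.9, 1.1, 1.0, 0.8], top_k=2)
```

Every query touches every stored vector, so query time grows linearly with both the number of samples and the vector dimension.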


Detailed description of embodiments

[0023] The solutions provided in this specification will be described below in conjunction with the accompanying drawings.

[0024] Figure 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification. As shown in Figure 1, a sample set composed of a large number of samples is stored in the storage platform; the samples may be of various content types, such as pictures, audio, and documents. The storage platform can be a centralized platform or a distributed platform (such as the Hadoop Distributed File System, HDFS). To respond to users' search queries on these complex samples, the computing platform analyzes and processes the sample set in the storage platform offline, in advance. This offline processing mainly includes two parts: classification processing and index processing. During classification processing, the computing platform performs two-level clustering on the samples in the sampl...

Abstract

Embodiments of the invention provide a method and device for performing classification processing and index processing on sample sets, and a method and device for querying similar samples. In the classification processing, two stages of clustering are performed on the samples in a sample set, and the clustering results are recorded in a first vector table and a second vector table. In the index processing, a two-stage index is established for each sample in the sample set: the first-stage index points to the coarse clustering center to which the sample belongs, and the second-stage index points to the segment clustering center corresponding to each segment vector of the sample. When querying for similar samples, two stages of retrieval are performed on the query sample. In the first stage of retrieval, the coarse clustering centers near the query sample are determined from the first vector table, and the comparison samples belonging to those coarse clustering centers are obtained. In the second stage of retrieval, the comparison samples whose distances satisfy a predetermined condition are taken as similar samples. In this way, rapid retrieval and query of samples is achieved.
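The pipeline in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the patented implementation: the use of k-means, the seeding rule, the inverted list per coarse center, the lookup-table approximation of distance, the threshold as the "predetermined condition", and all names and data are assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, vec):
    """Position of the center closest to vec."""
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i], vec))


def kmeans(vectors, k, iterations=10):
    """Tiny deterministic k-means, seeded with k evenly spaced input vectors."""
    step = max(1, len(vectors) // k)
    centers = [list(vectors[i * step]) for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for vec in vectors:
            groups[nearest(centers, vec)].append(vec)
        for i, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def split(vec, m):
    """Divide a vector into m equal-length segments."""
    size = len(vec) // m
    return [vec[j * size:(j + 1) * size] for j in range(m)]


def build(samples, n_coarse, n_segments, k_per_segment):
    """Offline step: classification (two vector tables) and two-stage indexing."""
    vectors = list(samples.values())
    first_table = kmeans(vectors, n_coarse)  # coarse clustering centers
    second_table = [  # one list of segment clustering centers per segment
        kmeans([split(v, n_segments)[j] for v in vectors], k_per_segment)
        for j in range(n_segments)
    ]
    index, inverted = {}, {}
    for sample_id, vec in samples.items():
        coarse_id = nearest(first_table, vec)  # first-stage index
        codes = [nearest(second_table[j], seg)  # second-stage index
                 for j, seg in enumerate(split(vec, n_segments))]
        index[sample_id] = (coarse_id, codes)
        inverted.setdefault(coarse_id, []).append(sample_id)
    return first_table, second_table, index, inverted


def query(q, first_table, second_table, index, inverted, threshold):
    """Online step: coarse retrieval, then distance filtering of candidates."""
    coarse_id = nearest(first_table, q)
    q_segs = split(q, len(second_table))
    # Distance from each query segment to every center of that segment.
    dist_table = [[sq_dist(c, q_segs[j]) for c in centers]
                  for j, centers in enumerate(second_table)]
    hits = []
    for sample_id in inverted.get(coarse_id, []):
        codes = index[sample_id][1]
        approx = sum(dist_table[j][code] for j, code in enumerate(codes))
        if approx <= threshold:
            hits.append((approx, sample_id))
    return [sample_id for _, sample_id in sorted(hits)]


# Hypothetical toy sample set with two obvious groups.
samples = {
    "s1": [0.0, 0.0, 0.0, 0.0], "s2": [0.2, 0.1, 0.0, 0.1], "s3": [0.1, 0.0, 0.2, 0.0],
    "s4": [9.0, 9.0, 9.0, 9.0], "s5": [9.1, 8.9, 9.0, 9.2], "s6": [8.8, 9.0, 9.1, 9.0],
}
first_table, second_table, index, inverted = build(samples, 2, 2, 2)
similar = query([9.0, 9.1, 9.0, 9.0], first_table, second_table, index, inverted, 1.0)
```

In this sketch the query is compared against a handful of coarse centers and a small distance table rather than against every stored vector; only samples sharing the query's coarse center are scored, and scoring is a sum of table lookups.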

Description

Technical field

[0001] One or more embodiments of this specification relate to the field of computer technology, and in particular to preprocessing of sample sets and to sample query methods and devices.

Background

[0002] With the upgrading of the Internet, people increasingly use it to carry out searches and queries. For example, people are very accustomed to using various search engines to find content of interest. At the same time, the objects people search for and query are becoming more and more complex. For example, searching for text keywords has gradually developed into searching for pictures, searching for music, and so on. The difficulty of searching increases exponentially as the objects to search and query become more complex. First, complex objects usually need to be represented by high-dimensional vectors. Therefore, in the process of searching, it is usually necessary to compare the distance or similarity between many high-dimensional ...


Application Information

Patent type and authority: Applications (China)
IPC(8): G06F17/30; G06K9/62
CPC: G06F16/316; G06F16/35; G06F16/951; G06F18/231; G06F18/241; G06F16/182; G06F16/2237; G06F16/245; G06F16/285
Inventor: 杨文
Owner: ADVANCED NEW TECH CO LTD