Visual target retrieval method and system based on target detection

A target detection and retrieval technology, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses the problems of noise interference and poor practicability, and achieves the effect of improving average accuracy and speeding up retrieval.

Active Publication Date: 2017-12-26
INST OF COMPUTING TECH CHINESE ACAD OF SCI
5 Cites, 48 Cited by

AI Technical Summary

Problems solved by technology

The strategy it adopts is to use the features in multiple candidate regions for exhaustive matching, and then take the similarity score of the most similar pair of candidate regions as the final similarity score of the two images. This process can be regarded as matching the multiple targets predicted in the query image against the multiple targets predicted in each image in the library, which is very time-consuming. On average, it takes nearly 2.5 minutes to query a single picture, so the practicability is poor.
In addition, another work in the literature replaces the sliding window in R-MAC with the candidate regions generated by an RPN (region proposal network) and learns a global feature representation end-to-end on the cleaned Landmark dataset, thus achieving the best current result for visual target retrieval. However, it treats the features of every candidate window equally, although in fact only one or a few windows actually contain the target to be retrieved, so it introduces a lot of noise. Moreover, because the dataset it learns from mainly contains landmarks, it obtains good results only in landmark retrieval, and its results on other datasets are unknown. Both of the above methods treat the features in the predicted candidate windows as descriptors of local image blocks and use them for feature matching. Generating candidate windows with a neural network is itself time-consuming, and a candidate window is a rectangle, which differs from the actual shape of the object.
[0005] Therefore, existing methods that solve target retrieval by means of target detection technology have the following problems: first, exhaustive matching using candidate windows has high time complexity (as shown in Figure 3); second, giving every candidate window equal weight is likely to cause noise interference (as shown in Figure 4); third, rectangular candidate regions do not match the actual shape of the object, as shown in Figure 2A and Figure 2B, which contain two different but very similar buildings.
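
The exhaustive matching strategy criticized above can be illustrated with a minimal sketch. It assumes each candidate region is already described by a feature vector and that cosine similarity is the matching score; the source names neither the similarity measure nor the number of regions, so the function names and the counts in the usage note are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def exhaustive_region_score(query_regions, gallery_regions):
    """Score two images by the best-matching pair of candidate regions.

    Every query region is compared with every gallery region, so the cost
    for one image pair is len(query_regions) * len(gallery_regions).
    """
    return max(cosine(q, g) for q in query_regions for g in gallery_regions)


def count_comparisons(query_regions, regions_per_image, gallery_images):
    """Number of region-to-region comparisons needed for a single query."""
    return query_regions * regions_per_image * gallery_images
```

With illustrative values of 300 candidate regions per image and a library of 5,000 images, `count_comparisons(300, 300, 5000)` gives 450,000,000 comparisons for one query, which shows why the cost grows so quickly. These values are assumptions for illustration and are not taken from the patent.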


Examples


Embodiment Construction

[0034] In view of the above problems, the present invention proposes an attention mechanism based on detection and voting for image retrieval (as shown in Figure 5), which is described in detail below through specific embodiments and with reference to the accompanying drawings. The invention belongs to image search technology based on deep learning. The overall technical framework of the invention is shown in Figure 6. The lower middle part of Figure 6 corresponds to the offline network training stage. It is the framework diagram of Faster R-CNN, which is trained with an IDF (inverse document frequency) weighted cross-entropy loss function to obtain a target detection model, referred to below as the final target detection model. The upper middle part of Figure 6 corresponds to the online feature extraction stage, which uses the final target detection model to extract the convolutional-layer feature vectors of the picture to be retrieved, and aggregates the ...
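
The IDF-weighted cross-entropy loss mentioned above could look roughly like the following sketch. The excerpt does not give the weight formula, so the definition used here, log(N / n_c) with N the number of training images and n_c the number of images containing class c, is an assumption borrowed from the usual text-retrieval form of IDF. All function names are hypothetical.

```python
import math


def idf_weights(class_image_counts, num_images):
    """Assumed IDF weight per class: log(N / n_c).

    Classes that appear in fewer training images get larger weights;
    a class present in every image gets weight 0 under this definition.
    """
    return [math.log(num_images / n) for n in class_image_counts]


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def idf_weighted_cross_entropy(logits, target_index, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    probs = softmax(logits)
    return -weights[target_index] * math.log(probs[target_index])
```

Under this assumed weighting, a misclassified sample from a rare class contributes more to the loss than one from a frequent class, which is one plausible reading of why an IDF weight would be applied during detector training.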

Abstract

The invention relates to a visual target retrieval method and system based on target detection. The method comprises the following steps: an IDF-weighted cross-entropy loss function is used to train on a public target detection dataset, generating a preliminary target detection model; a retrieval dataset containing a target type designated by the user is used to fine-tune the preliminary target detection model, generating a final target detection model; feature extraction is performed on the visual target in a to-be-retrieved picture through the final target detection model, generating multiple convolutional feature maps of the to-be-retrieved picture; the convolutional feature maps are aggregated through a spatial attention matrix to generate an aggregate feature vector; and a picture matching the aggregate feature vector is retrieved from a picture library. According to the method, visual target retrieval and detection are associated, so that a candidate window prediction step is avoided. The attention matrix is obtained by selectively accumulating the feature maps, and the local descriptors of a convolutional layer are aggregated in a weighted manner into a global feature representation, which is used for visual target retrieval, so that retrieval speed and precision are improved.
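
The aggregation step in the abstract can be sketched as follows. This is a minimal illustration, not the patented procedure: it assumes non-negative (post-ReLU) feature maps stored as nested lists of shape C x H x W, builds the attention matrix by summing a chosen subset of channels at each location, and ranks library pictures by dot product of L2-normalized vectors. How channels are actually selected is not stated in the excerpt, so `selected_channels` is a hypothetical parameter.

```python
import math


def spatial_attention(feature_maps, selected_channels=None):
    """Sum the selected channels at each location into an H x W matrix.

    The matrix is normalized to sum to 1. If selected_channels is None,
    all channels are accumulated (an assumption for this sketch).
    """
    channels = range(len(feature_maps)) if selected_channels is None else selected_channels
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    attention = [[sum(feature_maps[c][y][x] for c in channels) for x in range(width)]
                 for y in range(height)]
    total = sum(sum(row) for row in attention)
    if total > 0:
        attention = [[a / total for a in row] for row in attention]
    return attention


def aggregate(feature_maps, attention):
    """Attention-weighted sum pooling per channel, followed by L2 normalization."""
    vec = [sum(attention[y][x] * fmap[y][x]
               for y in range(len(fmap)) for x in range(len(fmap[0])))
           for fmap in feature_maps]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def rank(query_vec, gallery_vecs):
    """Indices of library vectors, most similar (largest dot product) first."""
    scores = [sum(q * g for q, g in zip(query_vec, vec)) for vec in gallery_vecs]
    return sorted(range(len(gallery_vecs)), key=lambda i: -scores[i])
```

Because each picture is reduced to a single vector, a query needs one comparison per library picture rather than one per pair of candidate windows.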

Description

Technical field

[0001] The invention relates to the field of multimedia content analysis, in particular to a visual target retrieval method and system based on target detection.

Background technique

[0002] Visual object retrieval is a kind of image retrieval that is widely used in commodity search, object recognition, object tracking and other fields. Unlike content-based approximate image retrieval, visual object retrieval does not retrieve images similar to the query image, but images that contain the same visual object as the query image. As shown on the left of Figure 1, the visual target occupies only a small part of the image (the target is in the white box), and the query image containing this target differs widely from the library images on the right of Figure 1 in camera angle, lighting, shape, and size. Studying visual object retrieval is therefore both important and challenging.

[0003] The traditional target retrieval method performs ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06K9/62
CPC: G06F16/5838, G06F18/24, G06F18/214
Inventor: 唐胜, 肖俊斌, 李锦涛
Owner INST OF COMPUTING TECH CHINESE ACAD OF SCI