A Weakly Supervised Object Classification and Localization Method Based on Divergence Learning

A weakly supervised target classification and localization method based on divergence learning. It addresses the difficulty of optimizing object localization by jointly optimizing the image classification loss and the divergence loss, while keeping classification performance high.

Active Publication Date: 2020-08-21
UNIVERSITY OF CHINESE ACADEMY OF SCIENCES

AI Technical Summary

Problems solved by technology

Activating only specific object parts is enough to minimize the image classification loss, but it makes object localization difficult to optimize.

Examples

Embodiment 1

[0087] 1. Database and sample classification

[0088] The divergence network is evaluated on the commonly used CUB-200-2011 and ILSVRC2016 data sets. CUB-200-2011 contains 11,788 images of 200 bird species, of which 5,994 are used for training and 5,794 for testing. Following biological taxonomy, we organize the 200 species into a three-level hierarchy of 122 genera, 37 families and 11 orders. For ILSVRC2016, we train on 1.2 million images from 1,000 classes and test on 5,000 images from the validation set. We use the ready-made category hierarchy supplied with the ILSVRC2016 data set. For example, "dog", "cat" and "rabbit" are grouped into the parent category "animal", and "chairs" and "tables" are grouped into the parent category "furniture".
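
The grouping of leaf categories under parent categories can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `PARENT` map holds only the example categories named above, and the helper names `label_path` and `level_targets` are hypothetical.

```python
# Hypothetical parent map built only from the examples in the text; a real
# hierarchy would come from the data set's own category tree.
PARENT = {
    "dog": "animal",
    "cat": "animal",
    "rabbit": "animal",
    "chair": "furniture",
    "table": "furniture",
}


def label_path(leaf, parent=PARENT):
    """Return [leaf, parent, grandparent, ...] by walking up the hierarchy."""
    path = [leaf]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def level_targets(leaves, parent=PARENT):
    """For each hierarchy level, return (class names, per-sample class indices)."""
    paths = [label_path(leaf, parent) for leaf in leaves]
    depth = min(len(p) for p in paths)  # use only levels shared by all samples
    targets = []
    for level in range(depth):
        names = sorted({p[level] for p in paths})
        index = {name: i for i, name in enumerate(names)}
        targets.append((names, [index[p[level]] for p in paths]))
    return targets
```

Each level then provides its own classification target for the same image, which is what allows supervision at several granularities.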

[0089] Construction of classification and positioning network: Integrate the divergent activation module with VGGnet and GoogLeNet, including VGGnet and GoogLeNet: delete the VGG-16 network and the pooling la...

Experimental example 1

[0130] The effectiveness of the hierarchical divergence activation module and the differential divergence activation module in the network, and of the proposed regularization factor λ, is verified in turn.

[0131] 1) The influence of the hierarchical divergence activation module and the differential divergence activation module

[0132] Table 5: The influence of the hierarchical divergence activation module and the differential divergence activation module

[0133]

[0134]

[0135] As shown in Table 5, compared with the baseline CAM method, introducing the hierarchical divergence activation module reduces the top-1 / top-5 positioning error rate by 5.14% / 4.36%. In Figure 6, the example activation maps show the effect of the hierarchical divergence activation module. Supervised only by sub-category labels, CAM tends to activate object parts, such as bird heads. Through the introduction of hierarchical supervision of image categories, the activatio...
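
For readers unfamiliar with the CAM baseline mentioned here, the following is a minimal sketch of how a class activation map is commonly formed (a weighted sum of feature maps using one class's classifier weights) and turned into a box. It is a generic illustration, not the patent's network: the function names and the relative threshold are assumptions, and the box is drawn around all above-threshold cells rather than around the largest connected region.

```python
def class_activation_map(features, class_weights):
    """Weighted sum of C feature maps (each H x W) for a single class."""
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(features, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def bounding_box(cam, rel_threshold=0.5):
    """Tight (x_min, y_min, x_max, y_max) box around cells that reach
    rel_threshold times the map maximum. The threshold is a hypothetical
    value, and the map maximum is assumed to be positive."""
    peak = max(max(row) for row in cam)
    cells = [
        (x, y)
        for y, row in enumerate(cam)
        for x, value in enumerate(row)
        if value >= rel_threshold * peak
    ]
    xs = [x for x, _ in cells]
    ys = [y for _, y in cells]
    return (min(xs), min(ys), max(xs), max(ys))
```

When only a small, highly discriminative part is activated, the resulting box covers that part alone, which is the failure mode described above.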

Experimental example 2

[0142] The influence of the number of feature output layers

[0143] The divergence learning network model based on the VGGnet network was tested on the CUB-200-2011 test set to verify the influence of the number of feature output layers. The results are shown in Table 6 below.

[0144] Table 6 The influence of the number of feature output layers on positioning

[0145]
Feature output layers    Positioning error rate (%)
1                        55.85
2                        52.8
3                        50.71
4                        51.34

[0146] Table 6 shows that the positioning error rate generally decreases as the number of feature output layers increases, which indicates that the hierarchical divergence activation module effectively improves positioning. When the number of feature output layers increases from three to four, however, the positioning result degrades (from 50.71 to 51.34). This is because the shallow features are not discriminative enough to distinguish the object category, which affects the positioning result.
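
As a reference for how a positioning error rate of this kind is usually computed in weakly supervised localization, here is a minimal sketch. It assumes the common convention that a prediction counts as correct only when the class is right and the predicted box overlaps the ground-truth box with an intersection over union of at least 0.5; the text does not state the threshold, so that value and the function names are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def localization_error(samples, iou_threshold=0.5):
    """Percentage of samples with a wrong class or an insufficient box overlap.

    Each sample is (predicted_label, true_label, predicted_box, true_box).
    The default threshold is an assumed convention, not a value from the text.
    """
    wrong = 0
    for pred_label, true_label, pred_box, true_box in samples:
        hit = pred_label == true_label and iou(pred_box, true_box) >= iou_threshold
        if not hit:
            wrong += 1
    return 100.0 * wrong / len(samples)
```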

Abstract

The present invention provides a weakly supervised target classification and positioning method based on divergence learning. The implementation process is: constructing a training sample set; constructing a classification and positioning network based on a hierarchical divergence activation module, or on a hierarchical divergence activation module combined with a differential divergence activation module; inputting the samples of the training sample set into the initially constructed classification and positioning network for multi-scale target feature extraction; and designing the loss function, computing the gradient from the loss function, backpropagating the gradient through the entire convolutional network, and updating the network parameters to optimize the network. The method proposes two forms of divergence learning: differential divergence learning and hierarchical divergence learning. The two forms mine the positioning information of the target from different angles and finally activate the complete target area. The method can discover complementary and discriminative visual patterns, maintains high image classification performance while positioning targets accurately, and has good practicability and scalability.
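
One way to picture a jointly optimized loss of this kind is sketched below. The text here does not give the formula of the divergence loss, so this sketch substitutes an assumed stand-in: the mean pairwise cosine similarity between flattened activation maps, which is small when the maps cover different regions. The weight `lam` plays the role of a regularization factor λ; its default value and all function names are hypothetical.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample (numerically stabilized)."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[target]


def cosine_similarity(a, b):
    """Cosine similarity of two flattened activation maps."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def joint_loss(level_logits, level_targets, activation_maps, lam=0.01):
    """Sum of per-level classification losses plus lam times an assumed
    divergence term (mean pairwise cosine similarity of activation maps)."""
    classification = sum(
        cross_entropy(logits, target)
        for logits, target in zip(level_logits, level_targets)
    )
    count = len(activation_maps)
    pairs = [(i, j) for i in range(count) for j in range(i + 1, count)]
    divergence = 0.0
    if pairs:
        divergence = sum(
            cosine_similarity(activation_maps[i], activation_maps[j])
            for i, j in pairs
        ) / len(pairs)
    return classification + lam * divergence
```

In a real training loop the gradient of such a scalar would be backpropagated through the convolutional layers by an automatic differentiation framework; the sketch only shows how the terms combine.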

Description

Technical field

[0001] The present invention relates to the field of computer vision and image processing, and in particular to a weakly supervised target classification and positioning method based on divergence learning. It uses the idea of divergence to learn different representations of targets and trains the network with a joint optimization method. Given weakly supervised labeling, the target in the image is located, and the method extends well to large-scale positioning data where the image labeling workload is large and the labels are noisy.

Background

[0002] As a basic problem in the field of vision, object detection is the basis of many vision applications. Traditional supervised target detection models often need the location of each target to be accurately marked in a large number of images. Although this kind of method can rely on a large amount of annotation information to learn target recognition and positioning information, it puts forward very high requirements on the co...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06K9/62, G06N3/04
CPC: G06N3/045, G06F18/217, G06F18/24
Inventor: 万方, 薛昊岚, 刘畅, 付梦莹, 叶齐祥, 韩振军, 焦建彬
Owner: UNIVERSITY OF CHINESE ACADEMY OF SCIENCES