Reverse attention model based on multi-scale depth supervision

An attention model and attention technique, applied in the field of pedestrian re-identification, that addresses problems such as weakened attention to some features, loss of feature information, and increased network model complexity, with the effect of improving timeliness and performance.

Inactive Publication Date: 2021-06-04

TONGJI UNIV

Cites: 1; Cited by: 2

AI Technical Summary

Problems solved by technology

When an attention mechanism reweights features, some features are emphasized while attention to others is weakened, resulting in the loss of some important feature information.

Network models based on multi-scale feature learning often embed the multi-scale feature learning module in the feature extraction network. Although this embedding can improve the model's feature learning ability to a certain extent, it also increases the complexity of the network model. A model that solves these problems in the prior art is therefore urgently needed.

Examples

Embodiment 1

[0045] The structure of the reverse attention model based on multi-scale deep supervision proposed by the present invention is shown in Figure 1. A ResNet-50 network pre-trained on the ImageNet dataset serves as the backbone and extracts deep features at different levels from pedestrian images. The last spatial downsampling operation, the original global average pooling operation, and the fully connected layer of ResNet-50 are removed, and an average pooling layer and a linear classification layer are then added at the end of the network. The intermediate features produced by the four stages of ResNet-50 are used as inputs to the attention mechanism module and the reverse attention mechanism module. The proposed multi-scale feature learning layer is shown in Figure 2. To reduce the GPU memory used during training, only the outputs of the second and third stages are selected to partic...
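To make the effect of removing the last spatial downsampling concrete, the following plain-Python sketch computes the output shape of each of the four ResNet-50 stages. The stage strides and channel widths are the standard ResNet-50 values rather than figures quoted in the patent, the helper name `stage_shapes` is hypothetical, and the 384×128 input size used in the usage note is the one given in the experiment details of Embodiment 2.

```python
# Sketch: per-stage output shapes of a ResNet-50 backbone, with and without
# the last spatial downsampling. Strides and channel widths are the standard
# ResNet-50 values (an assumption, not quoted from the patent).

STEM_STRIDE = 4                          # 7x7 conv (stride 2) + 3x3 max pool (stride 2)
STAGE_CHANNELS = [256, 512, 1024, 2048]  # output channels of stages 1..4
STAGE_STRIDES = [1, 2, 2, 2]             # spatial stride of stages 1..4


def stage_shapes(height, width, remove_last_downsampling=True):
    """Return (channels, height, width) for the output of each stage.

    Assumes height and width are divisible by 32 so every stride divides evenly.
    """
    strides = list(STAGE_STRIDES)
    if remove_last_downsampling:
        strides[-1] = 1                  # stage 4 keeps the resolution of stage 3
    shapes = []
    total_stride = STEM_STRIDE
    for channels, stride in zip(STAGE_CHANNELS, strides):
        total_stride *= stride
        shapes.append((channels, height // total_stride, width // total_stride))
    return shapes
```

Calling `stage_shapes(384, 128)` gives a final feature map of 24×8 instead of the 12×4 that the unmodified network would produce, so the last stage keeps four times as many spatial positions for the pooling layer that follows.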

Embodiment 2

[0096] In order to verify the effectiveness of the model proposed in this application, this embodiment conducts relevant experimental verification on three large-scale public pedestrian re-identification datasets: Market-1501, CUHK03, and DukeMTMC-reID. The experimental parameter settings and experimental results of the application will be described in detail below.

[0097] Experiment details:

[0098] The network model proposed in this application is implemented in the PyTorch framework. All experiments are run on two TITAN Xp graphics cards, and the dimensionality reduction ratio r in the attention mechanism module is set to 16. All training images are resized to 384×128 pixels, and the training set is augmented with random erasing and random horizontal flipping. Each training batch contains 64 images: 16 different pedestrians with 4 images each. Loss function weight coefficient λ ...
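The batch composition described above (16 identities with 4 images each, 64 images in total) can be sketched as an identity-balanced sampler. This is a minimal illustration in plain Python, not the application's own code: the function name `sample_pk_batch`, its parameters, and the choice to sample with replacement when an identity has fewer than 4 images are all assumptions.

```python
import random
from collections import defaultdict


def sample_pk_batch(labels, num_ids=16, num_imgs=4, rng=None):
    """Return image indices for one batch of num_ids identities x num_imgs images.

    labels[i] is the identity of image i. Identities with fewer than num_imgs
    images are sampled with replacement (an assumption; the source does not
    say how such identities are handled).
    """
    rng = rng or random.Random()
    by_id = defaultdict(list)
    for index, pid in enumerate(labels):
        by_id[pid].append(index)
    if len(by_id) < num_ids:
        raise ValueError("not enough distinct identities for one batch")

    batch = []
    for pid in rng.sample(sorted(by_id), num_ids):   # pick the identities
        pool = by_id[pid]
        if len(pool) >= num_imgs:
            batch.extend(rng.sample(pool, num_imgs))     # distinct images
        else:
            batch.extend(rng.choices(pool, k=num_imgs))  # with replacement
    return batch
```

With the defaults, every batch holds exactly 4 images for each of 16 identities, which guarantees that each image has same-identity and different-identity partners within the batch.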

Abstract

The invention discloses a reverse attention model based on multi-scale deep supervision. The model comprises an input end, a multi-scale feature learning module, an attention mechanism module, a reverse attention mechanism module, a deep supervision module, a plurality of loss functions, a plurality of average pooling layers, a plurality of linear layers, and branches. The multi-scale feature learning module performs multi-scale learning and training on the deep features. The attention mechanism module strengthens attention on locally important feature information. The reverse attention mechanism module turns the features suppressed by the attention mechanism module into emphasized features, complementing the attention mechanism. The deep supervision module corrects how accurately the attention mechanism module attends to the important features. The reverse attention mechanism module alleviates the loss of feature information caused by the attention mechanism, and some modules can be discarded in the test stage of the model, which improves test efficiency.
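One way to picture how a reverse attention branch complements an attention branch is sketched below. The abstract does not give a formula, so this sketch assumes sigmoid attention weights a in (0, 1) and reverse weights 1 - a; the function names and the per-channel list representation are hypothetical simplifications.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attend(features, scores):
    """Split features into an attention branch and a reverse-attention branch.

    features: per-channel feature values
    scores:   per-channel attention logits, same length as features
    Assumption: the reverse weights are the complement 1 - a of the attention
    weights a = sigmoid(score); the source does not state the actual formula.
    """
    if len(features) != len(scores):
        raise ValueError("features and scores must have the same length")
    weights = [sigmoid(s) for s in scores]
    emphasized = [w * f for w, f in zip(weights, features)]
    complemented = [(1.0 - w) * f for w, f in zip(weights, features)]
    return emphasized, complemented
```

Under this assumption, a channel that the attention branch suppresses (weight near 0) is passed almost unchanged by the reverse branch, and the two branches together sum back to the original feature.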

Description

Technical field

[0001] The present invention relates to the field of pedestrian re-identification, in particular to a reverse attention model based on multi-scale deep supervision.

Background

[0002] Pedestrian re-identification (PReID) is the task of automatically determining whether pedestrians captured by different traffic cameras, or by the same traffic camera at different times, are the same person. Because of its important role in intelligent video surveillance systems, pedestrian re-identification has received extensive attention in the field of computer vision in recent years. Pedestrians captured in real scenes have low resolution, so traditional biometric information cannot be obtained accurately, and the task currently relies mainly on the appearance features of pedestrians. However, there are differences in illumination, posture, angle of view and background in pictures of pedestrians captured in different scen...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06K9/00, G06K9/62, G06N3/08, G06N3/04

CPC: G06N3/08, G06V40/103, G06N3/045, G06F18/217, G06F18/24, G06F18/214, G06V10/82, G06V40/10, G06V20/52, G06N3/09, G06N20/00, G06F18/25

Inventor: 黄德双, 吴迪, 元昌安, 赵仲秋, 黄健斌

Owner: TONGJI UNIV
