Cross-modal person re-identification method based on dual attribute information

A person re-identification technique based on attribute information, applied in the fields of computer vision and deep learning. It addresses the insufficient use, or outright neglect, of pedestrian attribute information, making fuller use of that information and improving the semantic expressiveness of the extracted features.

Active Publication Date: 2022-02-01

SHANDONG ARTIFICIAL INTELLIGENCE INST

Cites: 9; Cited by: 0

AI Technical Summary

Problems solved by technology

However, the semantic expressiveness of the features extracted by existing methods still needs improvement. These methods either ignore, or insufficiently consider, whether pedestrian attribute information can effectively represent semantic concepts.

Examples

Embodiment 1

[0055] Extracting features from the pedestrian text description in step a) includes the following steps:

[0056] a-1.1) The text is preprocessed before feature extraction: the sentences describing the content captured by the surveillance cameras are segmented into words, and a word frequency table is built from the result.

[0057] a-1.2) Filter out low-frequency words in the word frequency table.

[0058] a-1.3) The words in the word frequency table are encoded using one-hot encoding.

[0059] a-1.4) Use a bidirectional LSTM model to extract features from the pedestrian text descriptions. Because the bidirectional LSTM takes the context on both sides of each word into account, the learned text features are richer.
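Steps a-1.1) to a-1.3) can be sketched as follows. This is a minimal illustration, not the patent's implementation: whitespace splitting stands in for the segmentation step, and the names `build_vocab`, `one_hot`, and the `min_count` threshold are hypothetical, since the source does not state a cutoff for "low-frequency". The resulting one-hot vectors would then be fed to the bidirectional LSTM of step a-1.4), which is omitted here.

```python
from collections import Counter


def build_vocab(sentences, min_count=2):
    """Build a word-frequency table and keep only the frequent words.

    Whitespace splitting is an assumed stand-in for real word segmentation;
    min_count is a hypothetical threshold for "low-frequency".
    """
    counts = Counter(word for s in sentences for word in s.lower().split())
    kept = sorted(w for w, c in counts.items() if c >= min_count)
    # Map each surviving word to a fixed index.
    return {word: index for index, word in enumerate(kept)}


def one_hot(word, vocab):
    """Return a one-hot vector for word, or all zeros if it was filtered out."""
    vec = [0] * len(vocab)
    if word in vocab:
        vec[vocab[word]] = 1
    return vec


sentences = [
    "a man wearing a black jacket",
    "a woman wearing a red jacket",
]
vocab = build_vocab(sentences, min_count=2)
encoded = [one_hot(w, vocab) for w in sentences[0].split()]
```

With these two sentences only `a`, `jacket`, and `wearing` occur at least twice, so every other word is encoded as an all-zero vector. A real pipeline might map filtered words to a dedicated unknown token instead.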

[0060] Extracting image features in step a) includes the following steps:

[0061] a-2.1) Use the ResNet network pre-trained on the ImageNet dataset for image feature extraction;

[0062] a-2.2) Perform semantic segmentation on the extracted pictures, a...

Embodiment 2

[0064] Much work has been done on attribute recognition for pedestrian pictures, with good results. The present invention selects a relatively stable pedestrian attribute recognition model and uses it to extract the attributes, and their likelihood values, contained in the pedestrian pictures in the data set. The extraction in step b) proceeds as follows:

[0065] b-1) Use the NLTK tool library to preprocess the pedestrian text descriptions and extract noun phrases of two forms: an adjective followed by a noun, and several consecutive nouns;

[0066] b-2) Sort the extracted noun phrases according to word frequency, discard the low-frequency phrases, keep the top 400 noun phrases to form an attribute table, and obtain the text attribute c T ;

[0067] b-3) Use the PA-100K data set to train the pictures to obtain 26 kinds of predicted values, mark the attributes of the pictures with the predicted value greater than 0 as 1, and mark the attributes of the pictu...
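Steps b-1) and b-2) can be sketched as follows. This is an illustrative sketch under stated assumptions: the input is assumed to be already tokenized and tagged with Penn Treebank part-of-speech tags (for example by NLTK's `nltk.pos_tag`), so the block itself needs no external library. The function names `noun_phrases` and `build_attribute_table` are hypothetical; only the two phrase patterns and the top-400 cutoff come from the text.

```python
from collections import Counter


def noun_phrases(tagged):
    """Collect 'adjective noun' pairs and runs of two or more nouns.

    tagged is a list of (word, tag) pairs with Penn Treebank tags,
    where adjective tags start with JJ and noun tags start with NN.
    """
    phrases = []
    i = 0
    while i < len(tagged):
        word, tag = tagged[i]
        next_is_noun = i + 1 < len(tagged) and tagged[i + 1][1].startswith("NN")
        if tag.startswith("JJ") and next_is_noun:
            # Pattern 1: adjective plus noun.
            phrases.append(f"{word} {tagged[i + 1][0]}")
            i += 2
        elif tag.startswith("NN"):
            # Pattern 2: several consecutive nouns.
            j = i
            while j < len(tagged) and tagged[j][1].startswith("NN"):
                j += 1
            if j - i >= 2:
                phrases.append(" ".join(w for w, _ in tagged[i:j]))
            i = j
        else:
            i += 1
    return phrases


def build_attribute_table(tagged_sentences, top_k=400):
    """Rank phrases by frequency and keep the top_k most frequent ones."""
    counts = Counter(p for sent in tagged_sentences for p in noun_phrases(sent))
    return [phrase for phrase, _ in counts.most_common(top_k)]


tagged_sentences = [
    [("a", "DT"), ("black", "JJ"), ("jacket", "NN"), ("and", "CC"),
     ("tennis", "NN"), ("shoes", "NNS")],
    [("a", "DT"), ("black", "JJ"), ("jacket", "NN")],
]
attribute_table = build_attribute_table(tagged_sentences, top_k=400)
```

Single nouns are dropped because the text names only the two multi-word patterns. How the patent handles longer patterns, such as an adjective followed by several nouns, is not stated, so this sketch simply takes the first adjective-noun pair.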

Embodiment 3

[0069] The present invention uses the shared subspace method, common in cross-modal pedestrian re-identification, to establish the association between the feature vectors of the two modalities. The latent space is set up so that the image features and text features of pedestrians are separable by pedestrian ID and share a basic semantic association between image and text. In cross-modal pedestrian image-text retrieval, the same pedestrian ID corresponds to multiple pictures and multiple text descriptions, so the design goal of the loss function is to shorten the distance between pictures and text descriptions that belong to the same pedestrian ID and to increase the distance between pictures and texts that do not. Specifically, let the data in one of the modalities serve as anchors, take the data belonging to the same class as the anchor in the other modality as positive samples, and take the data be...
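The anchor and positive-sample construction described above resembles a cross-modal triplet loss, which can be sketched as follows. This is a sketch under assumptions, not the patent's loss: the source text is truncated before it defines the negative samples, so the code assumes they are the samples of the other modality with a different pedestrian ID. The Euclidean distance, the hinge form, the `margin` value, and all function names are likewise hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss: the positive should be closer to the anchor than the negative."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


def cross_modal_triplet_loss(anchor_feats, anchor_ids, other_feats, other_ids,
                             margin=0.2):
    """Average triplet loss with anchors from one modality.

    Positives are samples of the other modality with the same pedestrian ID.
    Negatives are assumed to be those with a different ID.
    """
    losses = []
    for a, a_id in zip(anchor_feats, anchor_ids):
        positives = [f for f, i in zip(other_feats, other_ids) if i == a_id]
        negatives = [f for f, i in zip(other_feats, other_ids) if i != a_id]
        for p in positives:
            for n in negatives:
                losses.append(triplet_loss(a, p, n, margin))
    return sum(losses) / len(losses) if losses else 0.0


# Toy latent-space features: two pedestrians, one image and one text each.
img_feats, img_ids = [[0.0, 0.0], [1.0, 1.0]], [1, 2]
txt_feats, txt_ids = [[0.0, 0.0], [1.0, 1.0]], [1, 2]

aligned = cross_modal_triplet_loss(img_feats, img_ids, txt_feats, txt_ids)
swapped = cross_modal_triplet_loss(img_feats, img_ids, txt_feats, [2, 1])
```

When each image coincides with its own text, `aligned` is zero. Swapping the text IDs places every positive farther away than the negative, so `swapped` is positive. Calling the function a second time with the text features as anchors would give the symmetric text-to-image term.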

Abstract

A cross-modal pedestrian re-identification method based on dual attribute information. The method makes full use of the rich semantic information extracted from the data of the two modalities and provides a dual attribute space construction and attribute fusion algorithm based on text attributes and image attributes. By constructing an end-to-end cross-modal person re-identification network based on a latent space and an attribute space, it improves the semantic expressiveness of the features extracted by the model. The proposed network addresses cross-modal image-text person re-identification, greatly improves the semantic expressiveness of the extracted features, and makes fuller use of pedestrian attribute information.

Description

Technical field

[0001] The invention relates to the fields of computer vision and deep learning, in particular to a cross-modal pedestrian re-identification method based on dual attribute information.

Background art

[0002] In the information age, video surveillance plays an irreplaceable role in maintaining public safety. Pedestrian re-identification is an important sub-task in video surveillance scenarios. It aims to find photos of the same pedestrian in the image data generated by different surveillance cameras. Public security monitoring facilities are being deployed ever more widely, producing massive amounts of image data. How to find a target person in this data quickly and accurately is a research hotspot in computer vision. In some emergency scenarios, however, people cannot promptly provide a picture of the pedestrian they are looking for as the basis for retrieval, and can only provide verbal descriptions,...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G06V40/20, G06V40/10, G06V20/40, G06V20/52, G06V10/26, G06N3/04

CPC: G06N3/049, G06V40/25, G06V40/10, G06V20/40, G06V20/52, G06V10/267, G06N3/044, G06N3/105, G06N3/08, G06V40/103, G06N3/045, G06F18/253

Inventor: 高赞, 陈琳, 宋雪萌, 王英龙, 聂礼强

Owner: SHANDONG ARTIFICIAL INTELLIGENCE INST
