Pedestrian identification method based on local feature perception image-text cross-modal model and model training method

The technology combines local feature extraction with a model training method in the field of pattern recognition. It addresses the insufficient precision and complex feature extraction of earlier approaches, which are difficult to deploy in practical application scenarios, and achieves high accuracy with a simple structure.

Pending Publication Date: 2022-07-12
NANJING UNIV OF INFORMATION SCI & TECH

AI Technical Summary

Problems solved by technology

Existing methods still suffer from complex feature extraction processes and insufficient precision, which makes them difficult to deploy in practical application scenarios.

Examples

Embodiment 1

[0052] This embodiment provides a training method for a local feature-aware image-text cross-modal model. The model is built on the PyTorch deep learning framework and is used to mine the feature information of pedestrian images and text descriptions. The local feature-aware image-text cross-modal model includes a visual feature extraction module and a text feature extraction module. The visual feature extraction module includes a PCB structure for extracting local image features, and the text feature extraction module includes a multi-branch convolution structure for extracting text features. Each branch of the multi-branch convolutional structure is aligned with one of the local image regions.
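The local feature extraction described above can be illustrated with a minimal sketch. PCB commonly refers to the Part-based Convolutional Baseline, which splits a convolutional feature map into horizontal stripes and pools each stripe into one local feature. The sketch below uses plain Python lists in place of PyTorch tensors; the function name `stripe_pool`, the use of average pooling, and the number of parts are assumptions for illustration and are not taken from the patent text.

```python
from typing import List

Vector = List[float]


def stripe_pool(feature_map: List[List[Vector]], num_parts: int) -> List[Vector]:
    """Average-pool an H x W x C feature map into num_parts horizontal stripes.

    Hypothetical helper: the patent text does not specify the pooling
    operation or the number of stripes.
    """
    height = len(feature_map)
    if num_parts <= 0 or height % num_parts != 0:
        raise ValueError("height must be divisible by a positive num_parts")
    rows_per_part = height // num_parts
    channels = len(feature_map[0][0])

    parts = []
    for p in range(num_parts):
        # Collect every spatial cell that falls inside stripe p.
        rows = feature_map[p * rows_per_part:(p + 1) * rows_per_part]
        cells = [cell for row in rows for cell in row]
        # Average each channel over the stripe to get one local feature vector.
        parts.append(
            [sum(cell[c] for cell in cells) / len(cells) for c in range(channels)]
        )
    return parts


# Toy 4 x 2 feature map with a single channel (values are made up).
toy_map = [
    [[1.0], [3.0]],
    [[5.0], [7.0]],
    [[2.0], [2.0]],
    [[4.0], [4.0]],
]
local_features = stripe_pool(toy_map, num_parts=2)
```

Under this reading, each of the resulting local feature vectors would be paired with one branch of the text-side multi-branch convolution structure, so the number of branches would equal `num_parts`.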

[0053] Specifically, as shown in Figure 1, the training method of the local feature-aware image-text cross-modal model is as follows.

[0054] 1. Prepare the image-text data set

[0055] Construct an image and text data set, which includes a training s...

Embodiment 2

[0113] This embodiment provides a pedestrian recognition method based on a local feature-aware image-text cross-modal model. As shown in Figures 3 and 4, the pedestrian recognition method includes:

[0114] Acquire the image-text data of pedestrians;

[0115] Input the pedestrian's image-text data into the pre-trained local feature-aware image-text cross-modal model for feature extraction, and output the pedestrian recognition result.

[0116] The construction and training of the local feature-aware image-text cross-modal model have been described in Embodiment 1 and will not be repeated here.
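The recognition step can be sketched as a retrieval problem: features extracted from a text description are compared against features of gallery images, and the gallery is ranked by similarity. The sketch below assumes that the features are already extracted and that cosine similarity is the matching score; the patent text shown here does not state the similarity measure, and the names `cosine_similarity`, `rank_gallery`, and the `person_*` identifiers are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_gallery(
    text_feature: List[float], gallery: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Return (pedestrian id, score) pairs sorted from best to worst match."""
    scores = [
        (pid, cosine_similarity(text_feature, image_feature))
        for pid, image_feature in gallery.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Made-up image features standing in for the visual module's output.
gallery = {
    "person_a": [1.0, 0.0],
    "person_b": [0.0, 1.0],
    "person_c": [1.0, 1.0],
}
# Made-up text feature standing in for the text module's output.
ranking = rank_gallery([1.0, 0.1], gallery)
best_match = ranking[0][0]
```

In this toy example the top-ranked gallery entry plays the role of the pedestrian recognition result; a real system would obtain both feature sets from the trained model described in Embodiment 1.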

[0117] As will be appreciated by one skilled in the art, the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furt...

Abstract

The invention discloses a pedestrian recognition method based on a local feature-aware image-text cross-modal model, together with a model training method, and belongs to the technical field of pattern recognition. The pedestrian recognition method comprises the following steps: acquiring image-text data of pedestrians, inputting the data into a pre-trained local feature-aware image-text cross-modal model for feature extraction, and outputting a pedestrian recognition result. The model comprises a visual feature extraction module and a text feature extraction module. PCB local feature learning is introduced into visual feature extraction and a multi-branch convolution structure into text feature extraction, so that local image and text features can be extracted efficiently without semantic segmentation, attribute learning, or similar auxiliary techniques. Cross-modal matching is carried out at three levels (shallow features, local features, and global features), which progressively draws the image and text feature distributions closer together. The method is simple in structure and high in accuracy, and can promote the application of image-text cross-modal pedestrian retrieval in real-world scenes.

Description

Technical field

[0001] The invention relates to a pedestrian recognition method and a model training method based on a local feature-aware image-text cross-modal model, and belongs to the technical field of pattern recognition.

Background

[0002] Manually reviewing surveillance footage to find a target pedestrian is time-consuming, prone to omissions, and unreliable. In addition, in some scenarios intelligent retrieval through technologies such as pedestrian re-identification and face recognition is impossible. For example, a witness may not have photographed the target and can only describe the pedestrian's appearance verbally.

[0003] Existing related technologies are as follows: (1) A text-based pedestrian retrieval self-supervised visual representation learning system and method, application number CN202010590313.2: the algorithm constructs auxiliary tasks (gender judgment and pedestrian similarity ...

Application Information

IPC(8): G06V40/10, G06V10/40, G06V10/74, G06V10/774, G06K9/62, G06V10/82, G06N3/04, G06N3/08
CPC: G06N3/049, G06N3/08, G06N3/045, G06N3/044, G06F18/22, G06F18/214
Inventor: 陈裕豪, 张国庆
Owner: NANJING UNIV OF INFORMATION SCI & TECH