Object-aware image fusion method for multi-modal target tracking
A target tracking and image fusion technology, applied in the field of target tracking, which can solve problems such as the difficulty of processing and the neglect of modality-shared information and of the potential value of object information, and achieves the effects of enhanced robustness, excellent model performance, and improved texture information.
Embodiment 1
[0036] An embodiment of the present invention provides an object-aware image fusion method for multi-modal target tracking. The method includes the following steps:
[0037] 101: Acquire adaptively fused images: the RGB image and the thermal-modality image are input into two channels, saliency detection is performed on the images, and the outputs of different layers in the network are cascade-connected; each channel contains three aggregation modules. Based on the concatenated deep features, a sliding window is used to evaluate the image gray value, pixel intensity, similarity measure, and consistency loss, adaptively guiding the network to reconstruct the fused image.
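The patent does not publish code, so the following is a minimal Python sketch of how sliding-window statistics could guide a fusion weight map as described in step 101. All function and variable names are illustrative assumptions, and local variance stands in for the patent's fuller set of measures (gray value, pixel intensity, similarity, consistency loss):

```python
# Hypothetical sketch of the sliding-window guidance in step 101.
# The sliding window is implemented as stride-1 average pooling.
import torch
import torch.nn.functional as F

def sliding_window_weights(rgb_gray, thermal, window=7, eps=1e-6):
    """Estimate per-pixel fusion weights from local gray-value statistics
    computed over a sliding window (average pooling with padding keeps
    the output at the input resolution)."""
    pad = window // 2
    # Local mean intensity of each modality within the window.
    mu_rgb = F.avg_pool2d(rgb_gray, window, stride=1, padding=pad)
    mu_th = F.avg_pool2d(thermal, window, stride=1, padding=pad)
    # Local variance as a rough texture/saliency measure.
    var_rgb = F.avg_pool2d(rgb_gray ** 2, window, stride=1, padding=pad) - mu_rgb ** 2
    var_th = F.avg_pool2d(thermal ** 2, window, stride=1, padding=pad) - mu_th ** 2
    # Weight in [0, 1]: windows where the thermal channel is more
    # informative pull the weight toward the thermal modality.
    return (var_th + eps) / (var_rgb + var_th + 2 * eps)

def blend(rgb_gray, thermal, w_th):
    """Pixel-wise convex combination guided by the weight map."""
    return w_th * thermal + (1.0 - w_th) * rgb_gray

if __name__ == "__main__":
    rgb_gray = torch.rand(1, 1, 64, 64)  # stand-in grayscale RGB frame
    thermal = torch.rand(1, 1, 64, 64)   # stand-in thermal frame
    fused = blend(rgb_gray, thermal, sliding_window_weights(rgb_gray, thermal))
    print(fused.shape)  # torch.Size([1, 1, 64, 64])
```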
[0038] Further, the validation set is used to adjust the hyperparameters; the network structure is shown in Figure 1.
[0039] 102: The training set is passed through the adaptive fusion network to obtain a fused training set, and the fused training set is used to train the tracking network,...
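A minimal sketch of the two-stage pipeline in step 102: run the (frozen) fusion network over each training pair, then train the tracker on the fused frames. `FusionNet` and `TrackerNet` are hypothetical placeholders, not the patent's actual networks:

```python
# Illustrative two-stage training loop for step 102 (assumed structure).
import torch
from torch import nn

class FusionNet(nn.Module):          # placeholder for the adaptive fusion network
    def forward(self, rgb, thermal):
        return (rgb + thermal) / 2   # stand-in for the learned fusion

class TrackerNet(nn.Module):         # placeholder for the tracking network
    def __init__(self):
        super().__init__()
        self.head = nn.Conv2d(3, 4, 1)             # toy box-regression head
    def forward(self, fused):
        return self.head(fused).mean(dim=(2, 3))   # (N, 4) box estimate

def train_tracker(pairs, boxes, epochs=1):
    fusion, tracker = FusionNet().eval(), TrackerNet()
    opt = torch.optim.SGD(tracker.parameters(), lr=1e-3)
    loss_fn = nn.SmoothL1Loss()
    for _ in range(epochs):
        for (rgb, thermal), gt in zip(pairs, boxes):
            with torch.no_grad():        # stage 1: build the fused sample
                fused = fusion(rgb, thermal)
            pred = tracker(fused)        # stage 2: tracker forward pass
            loss = loss_fn(pred, gt)
            opt.zero_grad(); loss.backward(); opt.step()
    return tracker

pairs = [(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))]
boxes = [torch.tensor([[0.4, 0.4, 0.2, 0.2]])]
train_tracker(pairs, boxes)
```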
Embodiment 2
[0043] The scheme in Embodiment 1 is further described below in conjunction with specific examples and calculation formulas; see the following description for details:
[0044] The present invention adopts the largest multi-modal target tracking dataset, RGBT234, during training. This dataset is expanded from RGBT210 and contains 234 aligned RGB and thermal-modality video sequences, totaling about 200,000 frames; the longest video sequence reaches 4,000 frames.
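As an illustration of iterating over such aligned pairs, the sketch below yields matched RGB/thermal frame paths. The directory layout (`visible`/`infrared` subfolders per sequence) is an assumption for the sketch; consult the RGBT234 release for its actual structure:

```python
# Hypothetical loader for aligned RGB/thermal sequence pairs.
from pathlib import Path

def load_sequence_pairs(root):
    """Yield (rgb_path, thermal_path) pairs for every aligned frame."""
    for seq in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        rgb_frames = sorted((seq / "visible").glob("*.jpg"))    # assumed layout
        th_frames = sorted((seq / "infrared").glob("*.jpg"))    # assumed layout
        assert len(rgb_frames) == len(th_frames), f"misaligned: {seq.name}"
        yield from zip(rgb_frames, th_frames)
```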
[0045] The fusion task generates an informative image containing abundant thermal information and texture details. As shown in Figure 1, the network consists of three main components: feature extraction, feature fusion, and feature reconstruction. Saliency detection is performed on the entire image, and the outputs of different layers in the network are cascaded and fully connected. The RGB image and the thermal image are fed into two channels respectively; both channels consist of a C1 a...
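A minimal sketch of that three-part layout (feature extraction, feature fusion, feature reconstruction) with cascaded, densely connected layer outputs. Channel widths and layer counts are illustrative assumptions, since the paragraph's layer specification is truncated:

```python
# Assumed encoder-fuse-decoder structure for the fusion network.
import torch
from torch import nn

class DenseEncoder(nn.Module):
    """Feature extraction: each conv sees the concatenation of all
    earlier outputs (the 'cascade and connect' of the description)."""
    def __init__(self, in_ch=1, growth=16, layers=3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(in_ch + i * growth, growth, 3, padding=1)
            for i in range(layers)
        )
    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(torch.relu(conv(torch.cat(feats, dim=1))))
        return torch.cat(feats[1:], dim=1)   # cascaded deep features

class FusionReconstructor(nn.Module):
    def __init__(self, feat_ch=48):
        super().__init__()
        self.enc_rgb, self.enc_th = DenseEncoder(), DenseEncoder()
        self.fuse = nn.Conv2d(2 * feat_ch, feat_ch, 1)     # feature fusion
        self.recon = nn.Conv2d(feat_ch, 1, 3, padding=1)   # reconstruction
    def forward(self, rgb_gray, thermal):
        f = torch.cat([self.enc_rgb(rgb_gray), self.enc_th(thermal)], dim=1)
        return torch.sigmoid(self.recon(torch.relu(self.fuse(f))))

net = FusionReconstructor()
fused = net(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64))
print(fused.shape)  # torch.Size([1, 1, 64, 64])
```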
Embodiment 3
[0078] Example 1 used in the embodiment of the present invention is shown in Figure 3, which presents the RGB image on the left, the thermal image in the middle, and the fused image on the right. The fusion result is shown for parameter λ=0.01; the fused image contains abundant thermal information and texture details.
[0079] Example 2 used in the embodiment of the present invention is shown in Figure 2 and Figure 3, which give the PR and SR score plots of 12 trackers on the RGBT234 dataset. PR is the percentage of frames whose output position lies within a given threshold distance of the ground truth, and SR is the ratio of successful frames whose overlap exceeds a threshold. The results show that the tracker using this method clearly outperforms the other trackers: it exceeds the second-best tracker, MANet (77.8%), by 3.2% in PR and exceeds MANet (54.4%) by 6.1% in SR.
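The two metrics defined above can be computed as sketched below. The distance and overlap threshold defaults are common benchmark choices, not values stated in the patent:

```python
# Sketch of PR (center-distance precision) and SR (overlap success).
import numpy as np

def precision_rate(pred_centers, gt_centers, dist_thresh=20.0):
    """Fraction of frames whose predicted center is within dist_thresh
    pixels of the ground-truth center."""
    d = np.linalg.norm(np.asarray(pred_centers) - np.asarray(gt_centers), axis=1)
    return float(np.mean(d <= dist_thresh))

def iou(a, b):
    """Intersection over union for boxes given as (x, y, w, h)."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def success_rate(pred_boxes, gt_boxes, iou_thresh=0.5):
    """Fraction of frames whose box overlap exceeds iou_thresh."""
    return float(np.mean([iou(p, g) > iou_thresh
                          for p, g in zip(pred_boxes, gt_boxes)]))

print(precision_rate([[10, 10]], [[12, 14]]))            # 1.0 (within 20 px)
print(success_rate([[0, 0, 10, 10]], [[2, 2, 10, 10]]))  # 0.0 (IoU ~ 0.47)
```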