
Object perception image fusion method for multi-modal target tracking

A target tracking and image fusion technology, applied in the field of target tracking, which addresses problems such as the difficulty of handling appearance changes and the neglect of modality-shared and object-level information, and achieves the effects of enhanced robustness, excellent model performance, and improved texture information

Active Publication Date: 2021-05-28
TIANJIN UNIV

AI Technical Summary

Problems solved by technology

On the one hand, this approach weights the two modalities equally, although one modality may in fact be more informative than the other. On the other hand, the benchmark RGBT trackers designed by extending single-modal trackers to multi-modal ones directly concatenate features from the RGB and thermal modalities into vectors that are then fed into the tracker; such fusion of complementary features from RGB and thermal images can exploit modality-specific properties, but it ignores the potential value of modality-shared and object-level information, which is crucial for multimodal tracking
[0005] These methods rely on hand-crafted features or single-structure adapted deep networks for object localization, and they struggle with the challenges posed by appearance changes due to object deformation, sudden motion, background clutter, occlusion, etc.
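For illustration only, here is a minimal PyTorch sketch contrasting the equal-weight concatenation baseline criticized above with a learned per-modality weighting; the module names, pooling scheme, and dimensions are assumptions for the sketch, not the patent's design.

```python
import torch
import torch.nn as nn

class ConcatFusion(nn.Module):
    """Baseline criticized above: concatenate RGB and thermal features
    directly, implicitly weighting both modalities equally."""
    def forward(self, feat_rgb, feat_t):
        return torch.cat([feat_rgb, feat_t], dim=1)

class WeightedFusion(nn.Module):
    """Learn one scalar weight per modality so that a more informative
    modality can dominate the fused feature (a hypothetical alternative)."""
    def __init__(self, channels):
        super().__init__()
        # Global average pooling + linear layer predicts two modality scores.
        self.score = nn.Linear(2 * channels, 2)

    def forward(self, feat_rgb, feat_t):
        pooled = torch.cat([feat_rgb.mean(dim=(2, 3)),
                            feat_t.mean(dim=(2, 3))], dim=1)
        w = torch.softmax(self.score(pooled), dim=1)  # (B, 2), sums to 1
        w_rgb = w[:, 0].view(-1, 1, 1, 1)
        w_t = w[:, 1].view(-1, 1, 1, 1)
        return w_rgb * feat_rgb + w_t * feat_t
```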

Method used



Examples


Embodiment 1

[0036] An embodiment of the present invention provides an object-aware image fusion method for multi-modal target tracking. The method includes the following steps:

[0037] 101: Acquire the adaptive fusion image: the RGB image and thermal modality image are input into two channels and saliency detection is performed on the images; the outputs of different layers in the network are cascaded and connected, and each channel contains three aggregation modules. Based on the connected deep features, a sliding window is used to evaluate the image gray value, pixel intensity, similarity measure, and consistency loss, adaptively guiding the network to reconstruct the fusion image (a sketch of this pipeline follows the steps below);

[0038] Further, the hyperparameters are tuned on the validation set; the network structure is shown in figure 1.

[0039] 102: The training set is passed through the adaptive fusion network to obtain a fused training set, which is used to train the tracking network,...
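As a rough orientation for step 101, the following is a minimal, hypothetical skeleton of the dual-channel fusion network: two encoder channels (RGB and thermal), each with three aggregation modules whose outputs are cascade-connected, followed by a decoder that reconstructs the fused image. All layer widths and the exact aggregation design are assumptions; the patent text is truncated before specifying them.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                         nn.ReLU(inplace=True))

class Encoder(nn.Module):
    """One input channel (RGB or thermal): three aggregation modules whose
    outputs are cascaded (densely concatenated) into one deep feature."""
    def __init__(self, in_ch, width=16):
        super().__init__()
        self.agg1 = conv_block(in_ch, width)
        self.agg2 = conv_block(width, width)
        self.agg3 = conv_block(2 * width, width)  # sees concat of agg1 + agg2

    def forward(self, x):
        f1 = self.agg1(x)
        f2 = self.agg2(f1)
        f3 = self.agg3(torch.cat([f1, f2], dim=1))
        return torch.cat([f1, f2, f3], dim=1)      # cascaded deep features

class FusionNet(nn.Module):
    def __init__(self, width=16):
        super().__init__()
        self.rgb_enc = Encoder(3, width)   # RGB channel
        self.t_enc = Encoder(1, width)     # thermal channel
        self.decoder = nn.Sequential(      # reconstruct the fused image
            conv_block(6 * width, 2 * width),
            conv_block(2 * width, width),
            nn.Conv2d(width, 1, 3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb, thermal):
        feats = torch.cat([self.rgb_enc(rgb), self.t_enc(thermal)], dim=1)
        return self.decoder(feats)         # single-channel fused image
```

In this sketch the fused output is a single grayscale image; the criteria listed in step 101 (gray value, pixel intensity, similarity measure, consistency loss) would supervise the decoder output.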

Embodiment 2

[0043] The scheme in Embodiment 1 is further described below in conjunction with specific examples and calculation formulas; see the following description for details:

[0044] The present invention adopts the largest multi-modal target tracking dataset, RGBT234, during training. This dataset is expanded from RGBT210 and contains 234 aligned RGB and thermal modality video sequences with 200,000 frames in total; the longest video sequence reaches 4,000 frames.

[0045] The fusion task generates an informative image containing a large amount of thermal information and texture detail. As shown in figure 1, the network mainly consists of three components: feature extraction, feature fusion, and feature reconstruction. Saliency detection is performed on the entire image, the outputs of different layers in the network are cascaded and connected, and the RGB image and thermal image are fed into two channels respectively; both channels consist of a C1 a...

Embodiment 3

[0078] Example 1 of the embodiment of the present invention is shown in figure 3, which presents the RGB image on the left, the thermal image in the middle, and the fused image on the right, displaying the fusion result with the parameter λ = 0.01. The fused image contains abundant thermal information and texture detail.
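The λ = 0.01 setting suggests a weighted combination of loss terms. Below is a hedged sketch of one plausible reading, assuming an intensity term (follow the per-pixel maximum of the two sources) plus a λ-weighted gradient term standing in for the texture/similarity measure; the actual terms and where λ enters are not specified in the visible text.

```python
import torch
import torch.nn.functional as F

def gradients(img):
    # Finite-difference image gradients as a cheap texture proxy.
    dx = img[..., :, 1:] - img[..., :, :-1]
    dy = img[..., 1:, :] - img[..., :-1, :]
    return dx, dy

def fusion_loss(fused, rgb_gray, thermal, lam=0.01):
    # Intensity: track the brighter of the two sources per pixel, which keeps
    # hot thermal regions as well as bright RGB structure. (Assumed term.)
    l_int = F.l1_loss(fused, torch.maximum(rgb_gray, thermal))
    # Texture: fused gradients should match the stronger source gradient.
    fdx, fdy = gradients(fused)
    rdx, rdy = gradients(rgb_gray)
    tdx, tdy = gradients(thermal)
    l_tex = (F.l1_loss(fdx.abs(), torch.maximum(rdx.abs(), tdx.abs()))
             + F.l1_loss(fdy.abs(), torch.maximum(rdy.abs(), tdy.abs())))
    # lambda = 0.01 as reported in this embodiment; its placement here is an
    # assumption, letting the intensity term dominate.
    return l_int + lam * l_tex
```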

[0079] Example 2 of the embodiment of the present invention is shown in figure 2 and figure 3, which are the PR and SR score plots of 12 trackers on the RGBT234 dataset. PR is the percentage of frames whose output position lies within a given threshold distance of the ground truth, and SR is the ratio of successful frames whose overlap with the ground truth exceeds a threshold. The results show that the tracker using this method is clearly better than the other trackers; in particular, it outperforms the next-best tracker MANet (77.8%) by 3.2% in PR and the second-best MANet (54.4%) by 6.1% in SR.
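The PR/SR definitions above translate directly into code. A sketch follows; the 20-pixel center-error threshold and the 0.5 overlap threshold are common benchmark conventions assumed here, not values stated in the text.

```python
import numpy as np

def precision_rate(pred_centers, gt_centers, thresh=20.0):
    """PR: fraction of frames whose predicted center lies within `thresh`
    pixels of the ground-truth center. Inputs are (N, 2) arrays."""
    err = np.linalg.norm(pred_centers - gt_centers, axis=1)
    return float((err <= thresh).mean())

def iou(a, b):
    """IoU of boxes in (x, y, w, h) format, vectorized over N frames."""
    x1 = np.maximum(a[:, 0], b[:, 0])
    y1 = np.maximum(a[:, 1], b[:, 1])
    x2 = np.minimum(a[:, 0] + a[:, 2], b[:, 0] + b[:, 2])
    y2 = np.minimum(a[:, 1] + a[:, 3], b[:, 1] + b[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = a[:, 2] * a[:, 3] + b[:, 2] * b[:, 3] - inter
    return inter / np.maximum(union, 1e-12)

def success_rate(pred_boxes, gt_boxes, thresh=0.5):
    """SR at one threshold: fraction of frames whose overlap with the
    ground-truth box exceeds `thresh`."""
    return float((iou(pred_boxes, gt_boxes) > thresh).mean())
```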



Abstract

The invention discloses an object perception image fusion method for multi-modal target tracking. The method comprises the following steps: a self-adaptive fusion image is obtained, saliency detection is carried out on the image, the outputs of different layers in a network are cascaded and connected, an RGB image and a thermal modality image are input into two channels, each channel comprising three aggregation modules, and a fusion image is reconstructed by judging the gray value, pixel intensity, similarity measure, and consistency loss of the image according to the connected deep features, adaptively guiding the network; a feature combination module combines the features extracted from the sample and the search image using a depth cross-correlation operation to generate corresponding similarity features for subsequent target positioning, after which classification and regression are performed for target localization; a regression network without anchor boxes is used for tracking training, with all pixels inside the ground-truth bounding box used as training samples, so that weak predictions can be corrected to a certain extent and the correct position recovered; and deformable convolution is adopted to change the sampling positions so that they align with the predicted bounding box and the classification confidence corresponds to the target object, making the classification confidence more reliable.
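The feature combination module above names a depth cross-correlation between the sample (template) features and the search-image features. A minimal sketch of the standard depth-wise cross-correlation used in Siamese trackers follows; the tensor shapes are illustrative assumptions, and whether the patent's module matches this implementation exactly is not confirmed by the visible text.

```python
import torch
import torch.nn.functional as F

def depthwise_xcorr(search, template):
    """search: (B, C, Hs, Ws); template: (B, C, Ht, Wt), Ht <= Hs, Wt <= Ws.
    Each template channel is slid over the matching search channel as a
    convolution kernel, producing one similarity map per channel."""
    b, c, h, w = search.shape
    # Fold batch into groups so each template correlates with its own search.
    search = search.reshape(1, b * c, h, w)
    kernel = template.reshape(b * c, 1, template.shape[2], template.shape[3])
    out = F.conv2d(search, kernel, groups=b * c)
    return out.reshape(b, c, out.shape[2], out.shape[3])

# Example with typical Siamese-tracker feature sizes (assumed):
z = torch.randn(2, 256, 6, 6)     # template features
x = torch.randn(2, 256, 26, 26)   # search-region features
sim = depthwise_xcorr(x, z)       # -> (2, 256, 21, 21) similarity features
```

The per-channel similarity features would then feed the classification and regression heads described in the abstract.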

Description

technical field

[0001] The invention relates to the field of target tracking, in particular to an object-aware image fusion method for multi-modal target tracking.

Background technique

[0002] Target tracking is widely used in video surveillance, autonomous driving, and robotics, and has long been a focus of computer vision research. It is defined as follows: given the size and position of the target in the initial frame of a video sequence, predict the size and position of the target in subsequent frames. The main challenge of tracking is that the target object may undergo severe occlusion, large deformation, and illumination changes.

[0003] In recent years, thermal infrared imaging has been found to provide more stable signals, and the spread of thermal infrared cameras has promoted the development of several fields, such as object segmentation, person Re-ID, and pedestrian detection, so multimodal (RGBT) tracking has attracted more research. Multimodal tracking ca...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06T7/246; G06K9/46; G06K9/62; G06N3/04; G06N3/08
CPC: G06T7/246; G06N3/04; G06N3/084; G06T2207/10016; G06T2207/10024; G06T2207/10048; G06V10/462; G06F18/24; G06F18/253; Y02T10/40
Inventor: 朱鹏飞, 王童, 胡清华
Owner: TIANJIN UNIV