Visual target tracking method based on multi-level aggregation and attention twin network

A Siamese (twin) network target tracking technology, applied in the field of visual target tracking. It addresses the loss of detail and local structure information, the difficulty of distinguishing objects with the same attributes or semantics, and low feature resolution.

Status: Pending. Publication Date: 2020-10-30

上海蠡图信息科技有限公司

Cites: 0. Cited by: 3.


Problems solved by technology

Although Siamese network trackers are convincingly designed, they still have limitations. Most tracking methods use only deep features, whose representation usually has low resolution. As a result, some target-specific details and local structure information are lost, so these trackers tend to be less sensitive to details and have difficulty distinguishing two targets with the same attributes or semantics.


Examples


Embodiment 1

[0051] Figure 1 is a schematic diagram of the overall framework of the multi-level aggregation and attention Siamese network proposed in this embodiment. Most existing trackers rely on the output features of the last layer of the Siamese backbone network to track the target and often ignore the distinct characteristics of features at different levels. This embodiment therefore proposes a new network, the Siamese Multi-Level Aggregation and Attention network (SiamMLAA), which comprises a head attention (HA) module, a multi-layer aggregation (MLA) module and a self-refinement (SR) module. In brief, the head attention module is added to the top convolutional layer of the backbone network to improve the feature representation, using spatial and channel attention to model a wider and richer context for the top-level features. In addition, the multi-layer aggregation module can effectively integrate low-level spatial...
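The paragraph above says the HA module applies channel and spatial attention to the top-level features, but the excerpt is truncated before any detail is given. The following is only a minimal NumPy sketch of the general idea, assuming parameter-free sigmoid gates computed from pooled statistics; the function names and the gating form are hypothetical and are not taken from the patent.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    """Reweight each channel of a (C, H, W) feature map.

    Assumption: the gate is a sigmoid of the global average of each channel.
    A learned module would normally insert trainable layers here.
    """
    descriptor = feat.mean(axis=(1, 2))       # one value per channel, shape (C,)
    gate = sigmoid(descriptor)                # values in (0, 1)
    return feat * gate[:, None, None]

def spatial_attention(feat):
    """Reweight each spatial position of a (C, H, W) feature map.

    Assumption: the gate is a sigmoid of the mean over channels.
    """
    descriptor = feat.mean(axis=0)            # one value per position, shape (H, W)
    gate = sigmoid(descriptor)
    return feat * gate[None, :, :]

def head_attention(feat):
    """Hypothetical head attention: channel gating followed by spatial gating."""
    return spatial_attention(channel_attention(feat))

rng = np.random.default_rng(0)
top = rng.standard_normal((8, 5, 5))          # stand-in for top-level backbone features
out = head_attention(top)
print(out.shape)
```

Because both gates lie strictly between 0 and 1, this sketch can only attenuate activations; it preserves the shape of the feature map so it can be dropped in front of any later stage.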

Embodiment 2

[0116] To verify the practical effect of the visual object tracking method based on the multi-level aggregation and attention Siamese network proposed in the above embodiment, this embodiment reports experiments on five public tracking benchmarks: OTB2013, OTB50, OTB2015, VOT2016 and VOT2017. The results show that the method outperforms the baseline tracker on the various evaluation criteria and is highly competitive with existing tracking methods, so the proposed SiamMLAA network performs well in all aspects.

[0117] Specifically, the proposed network framework is implemented in PyTorch and trained on four RTX 2080 Ti GPUs.

[0118] The training process is as follows. The backbone network is initialized with a ResNet22 model pre-trained on the ILSVRC classification dataset, the remaining parts are initialized with random noise, and the network is trained offline on the target tracking dataset GOT10K. The dataset contains more than 10,000 video clips of moving targets in ...

Embodiment 3

[0128] To verify the effectiveness of each key module in the proposed tracker, this embodiment also carries out an ablation study. The ablation experiments are run on the OTB benchmark, which comprises three datasets: OTB2013, OTB50 and OTB2015. The results are shown in Figures 8 and 9, where panel (a) is the success plot and panel (b) is the precision (accuracy) plot. The tracker containing all modules (the multi-layer aggregation MLA module, the self-refinement SR module and the head attention HA module) achieves almost the best tracking performance in both accuracy and success rate, which demonstrates that each module of the proposed tracker is needed to significantly improve the final tracking performance.
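For readers unfamiliar with the two OTB curves mentioned above: a success plot reports the fraction of frames whose predicted box overlaps the ground truth by more than a threshold swept from 0 to 1, and a precision plot reports the fraction of frames whose center location error is within a pixel threshold (20 pixels is the value commonly quoted). The sketch below computes one point on each curve; the box format (x, y, w, h), the function names and the sample boxes are illustrative assumptions, not data from the patent.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def center_error(a, b):
    """Euclidean distance in pixels between the centers of two boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return math.hypot((ax + aw / 2) - (bx + bw / 2),
                      (ay + ah / 2) - (by + bh / 2))

def success_rate(preds, gts, overlap_threshold):
    """One point of the success plot: share of frames with IoU above the threshold."""
    hits = sum(iou(p, g) > overlap_threshold for p, g in zip(preds, gts))
    return hits / len(gts)

def precision(preds, gts, pixel_threshold=20.0):
    """One point of the precision plot: share of frames within the pixel threshold."""
    hits = sum(center_error(p, g) <= pixel_threshold for p, g in zip(preds, gts))
    return hits / len(gts)

# Made-up three-frame sequence: a perfect hit, a partial overlap, and a miss.
gts = [(0, 0, 10, 10), (0, 0, 10, 10), (0, 0, 10, 10)]
preds = [(0, 0, 10, 10), (5, 0, 10, 10), (40, 40, 10, 10)]

print(success_rate(preds, gts, 0.5))
print(precision(preds, gts, 20.0))
```

Sweeping `overlap_threshold` over [0, 1] and `pixel_threshold` over a pixel range yields the full curves that panels (a) and (b) plot.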

[0129] Table 2: Ablation study of different module combinations on the OTB datasets.

[0130]

[0131...


Abstract

The invention discloses a visual target tracking method based on a multi-level aggregation and attention Siamese network. The method comprises the following steps: extracting multi-level feature representations of an exemplar sample and a search sample through a Siamese backbone network; defining a multi-layer aggregation module that selectively integrates high-level semantic features and low-level detail features to learn complementary information among the multi-level features, so as to assist shallow features in tracking the target; adding a self-refinement module after the multi-layer aggregation module to suppress noise generated by the aggregation; adding a head attention module at the top convolutional feature of the Siamese backbone network to enhance the semantic representation of the top-level feature and improve target recognition; and constructing a multi-level aggregation and attention Siamese network tracker for visual target tracking. The method has the beneficial effect of markedly improving visual target tracking results.
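The abstract describes integrating high-level semantic features with low-level detail features but does not say how they are combined. As a rough illustration only, the sketch below assumes the simplest possible fusion: nearest-neighbor upsampling of the coarser map followed by a fixed weighted sum. The function names, the weight `alpha` and the requirement that both maps share a channel count are assumptions; the patent's selective, learned aggregation is not reproduced here.

```python
import numpy as np

def upsample_nearest(feat, factor):
    """Nearest-neighbor upsampling of a (C, H, W) map by an integer factor."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)

def aggregate(low, high, alpha=0.5):
    """Hypothetical fusion of a low-level and a high-level feature map.

    low:  (C, H, W) fine-resolution detail features
    high: (C, H/f, W/f) coarse semantic features with the same channel count
    alpha: assumed fixed mixing weight (a learned gate would replace it)
    """
    factor = low.shape[1] // high.shape[1]
    high_up = upsample_nearest(high, factor)   # bring both maps to the same size
    return alpha * low + (1.0 - alpha) * high_up

low = np.ones((2, 4, 4))                       # stand-in for shallow features
high = np.full((2, 2, 2), 3.0)                 # stand-in for deep features
fused = aggregate(low, high)
print(fused.shape, fused[0, 0, 0])
```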

Description

Technical field

[0001] The invention relates to the technical field of visual target tracking, in particular to a visual target tracking method based on a multi-level aggregation and attention Siamese network.

Background art

[0002] Visual object tracking refers to automatically locating a specified object in a constantly changing video sequence. It is one of the most basic research problems in the field of computer vision and is widely needed in visual monitoring, human-computer interaction, and video editing. The core problem of object tracking is how to detect and localize objects accurately and efficiently in challenging scenarios with occlusion, out-of-view motion, deformation, and background clutter.

[0003] In recent years, trackers based on Siamese networks have shown great potential for visual tracking in terms of speed and robustness by transforming the tracking problem into a similarity learning problem. In the offline training phase of the network,...
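To make the phrase "similarity learning" concrete: Siamese trackers in the SiamFC family typically slide the exemplar's feature map over the search region's feature map and take the location of the highest similarity score as the target position. The excerpt is truncated before it names an operator, so the plain cross-correlation below is an assumption, and all names and sizes are illustrative.

```python
import numpy as np

def cross_correlate(search, exemplar):
    """Dense similarity map between a search feature map and an exemplar.

    search:   (C, Hs, Ws) features of the search region
    exemplar: (C, He, We) features of the target template
    returns:  (Hs - He + 1, Ws - We + 1) response map
    """
    _, hs, ws = search.shape
    _, he, we = exemplar.shape
    response = np.empty((hs - he + 1, ws - we + 1))
    for i in range(response.shape[0]):
        for j in range(response.shape[1]):
            window = search[:, i:i + he, j:j + we]
            response[i, j] = np.sum(window * exemplar)   # inner-product similarity
    return response

rng = np.random.default_rng(0)
exemplar = rng.standard_normal((4, 3, 3))
search = np.zeros((4, 9, 9))
search[:, 2:5, 4:7] = exemplar                 # plant the target at row 2, column 4

response = cross_correlate(search, exemplar)
peak = tuple(int(k) for k in np.unravel_index(np.argmax(response), response.shape))
print(peak)
```

In a real tracker the two inputs would come from the shared backbone rather than from random numbers, and the explicit loops would be replaced by a batched convolution.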


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06K9/00, G06K9/62, G06N3/04

CPC: G06V20/42, G06N3/045, G06F18/213, G06F18/22, G06F18/241, G06F18/214

Inventors: 宋晓宁, 范颖, 冯振华

Owner: 上海蠡图信息科技有限公司
