Transformer-based semi-supervised video target segmentation method
A Transformer-based semi-supervised video target segmentation technology, applied in the field of image processing, addresses the problems of poor segmentation accuracy, high computational cost, and the lack of an inherent inductive bias for small-scale targets and similar targets, thereby reducing computational cost and improving segmentation accuracy.
Examples
Embodiment 1
[0063] Figure 1 shows the flowchart of a Transformer-based semi-supervised video target segmentation method according to an embodiment of the present invention; the concrete steps are as follows:
[0064] Step 1, get the dataset and segmentation labels
[0065] Obtain a video target segmentation dataset, a static image dataset, and the segmentation labels corresponding to both datasets, and form each image in the datasets together with its corresponding segmentation label into an image pair.
[0066] Step 2, data augmentation and processing
[0067] (2-a) After normalizing each image pair formed from the static image dataset obtained in step (1) and its corresponding segmentation label, repeat the following process to obtain a synthetic video training sample for each image pair; the collection of synthetic video training samples constitutes the synthetic video training set:
[0068] I. The short side of the image pair is reduced to w pixels, and the long side is reduce...
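The resize step and the construction of synthetic video clips from static image pairs can be sketched as follows. This is a minimal illustration only: the nearest-neighbor resize and the per-frame random translation are assumptions standing in for the patent's actual (truncated) augmentation details, and the function names are hypothetical.

```python
import numpy as np

def resize_short_side(img, w=480):
    """Reduce the short side to w pixels, keeping the aspect ratio.
    Nearest-neighbor indexing for simplicity; a real pipeline would
    use bilinear interpolation for images."""
    h0, w0 = img.shape[:2]
    scale = w / min(h0, w0)
    h1, w1 = round(h0 * scale), round(w0 * scale)
    rows = (np.arange(h1) / scale).astype(int).clip(0, h0 - 1)
    cols = (np.arange(w1) / scale).astype(int).clip(0, w0 - 1)
    return img[rows][:, cols]

def synth_clip(image, label, n_frames=3, max_shift=8, rng=None):
    """Synthesize a short pseudo-video from one image pair by applying
    the same random translation to the image and its segmentation label
    in each frame (translation is an assumed stand-in for the full set
    of augmentations)."""
    rng = rng or np.random.default_rng(0)
    frames = []
    for _ in range(n_frames):
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        frames.append((np.roll(image, (dy, dx), axis=(0, 1)),
                       np.roll(label, (dy, dx), axis=(0, 1))))
    return frames
```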
Embodiment 2
[0109] Using the method of Embodiment 1, video target segmentation experiments were carried out on the public datasets YoutubeVOS2018, YoutubeVOS2019, DAVIS2016 and DAVIS2017. The operating system for the experiments was Linux Ubuntu 18.04; the method was implemented in the PyTorch 1.8.1 framework with CUDA 10.1 and cuDNN 7.6.5, and two NVIDIA 2080Ti 11G GPUs were used for training and testing.
[0110] In this embodiment, the performance of the present invention is evaluated using the region similarity J, the contour accuracy F, and their average J&F. The region similarity J is the average intersection-over-union (IoU) between the estimated label and its corresponding ground-truth label, and is calculated as:
[0111] J = |M ∩ G| / |M ∪ G|
[0112]where M is the predicted segmentation label, G is the ground truth segmentation label, and the symbols ∩ and ∪ denote the intersection and union of two sets, respectively.
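The J metric above can be computed directly from binary masks; a minimal sketch (the edge case where both masks are empty is conventionally scored as 1):

```python
import numpy as np

def region_similarity_j(pred, gt):
    """Region similarity J: intersection-over-union of the predicted
    segmentation mask M and the ground-truth mask G."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:          # both masks empty: define J = 1
        return 1.0
    return np.logical_and(pred, gt).sum() / union
```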
[0113] Contour accuracy F represents the average boundary similarity betwee...
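The contour accuracy F is conventionally the F-measure of boundary precision and recall between the predicted and ground-truth mask contours. Since the paragraph above is truncated, the sketch below is an assumption based on that standard definition: it matches boundary pixels exactly, whereas the DAVIS benchmark additionally allows a small distance tolerance when matching boundaries.

```python
import numpy as np

def _boundary(mask):
    """Boundary pixels: mask pixels with at least one 4-neighbor outside
    the mask (a simple stand-in for a morphological boundary extractor)."""
    m = mask.astype(bool)
    pad = np.pad(m, 1, constant_values=False)
    interior = (pad[:-2, 1:-1] & pad[2:, 1:-1] &
                pad[1:-1, :-2] & pad[1:-1, 2:])
    return m & ~interior

def contour_accuracy_f(pred, gt):
    """Contour accuracy F: F-measure of boundary precision and recall."""
    bp, bg = _boundary(pred), _boundary(gt)
    if bp.sum() == 0 and bg.sum() == 0:
        return 1.0
    match = (bp & bg).sum()
    precision = match / bp.sum() if bp.sum() else 0.0
    recall = match / bg.sum() if bg.sum() else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```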