Transformer-based semi-supervised video target segmentation method
A Transformer-based semi-supervised video target segmentation technology, applied in the field of image processing, addresses the problems of poor segmentation accuracy, high computational cost, and the lack of an inherent inductive bias for small-scale targets and similar targets, thereby reducing computational cost and improving segmentation accuracy.
Examples
Embodiment 1
[0063] Figure 1 shows the flowchart of a Transformer-based semi-supervised video target segmentation method according to an embodiment of the present invention; the concrete steps are as follows:
[0064] Step 1, get the dataset and segmentation labels
[0065] Obtain a video target segmentation dataset, a static image dataset, and the segmentation labels corresponding to both datasets, and form each image in the datasets together with its corresponding segmentation label into an image pair.
[0066] Step 2, data augmentation and processing
[0067] (2-a) After normalizing each image pair formed from the static image dataset obtained in step (1) and its corresponding segmentation label, repeat the following process to obtain a synthetic video training sample for each image pair; the collection of synthetic video training samples constitutes the synthetic video training set:
[0068] I. The short side of the image pair is reduced to w pixels, and the long side is reduce...
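The resize step and the construction of synthetic video clips from static image pairs can be sketched as follows. This is a minimal illustration only: the nearest-neighbor resize and the per-frame random translation are assumptions standing in for the patent's actual (truncated) augmentation details, and the function names are hypothetical.

```python
import numpy as np

def resize_short_side(img, w=480):
    """Reduce the short side to w pixels, keeping the aspect ratio.
    Nearest-neighbor indexing for simplicity; a real pipeline would
    use bilinear interpolation for images."""
    h0, w0 = img.shape[:2]
    scale = w / min(h0, w0)
    h1, w1 = round(h0 * scale), round(w0 * scale)
    rows = (np.arange(h1) / scale).astype(int).clip(0, h0 - 1)
    cols = (np.arange(w1) / scale).astype(int).clip(0, w0 - 1)
    return img[rows][:, cols]

def synth_clip(image, label, n_frames=3, max_shift=8, rng=None):
    """Synthesize a short pseudo-video from one image pair by applying
    the same random translation to the image and its segmentation label
    in each frame (translation is an assumed stand-in for the full set
    of augmentations)."""
    rng = rng or np.random.default_rng(0)
    frames = []
    for _ in range(n_frames):
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        frames.append((np.roll(image, (dy, dx), axis=(0, 1)),
                       np.roll(label, (dy, dx), axis=(0, 1))))
    return frames
```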
Embodiment 2
[0109] Using the method of Embodiment 1, video target segmentation experiments were carried out on the public datasets YoutubeVOS2018, YoutubeVOS2019, DAVIS2016 and DAVIS2017. The operating system for the experiments was Linux Ubuntu 18.04; the method was implemented in the PyTorch 1.8.1 framework with CUDA 10.1 and cuDNN 7.6.5, and two NVIDIA 2080Ti 11G GPUs were used for training and testing.
[0110] In this embodiment, the performance of the present invention is evaluated using the region similarity J, the contour accuracy F, and their average J&F. The region similarity J is the average intersection-over-union (IoU) between the estimated label and its corresponding ground-truth label, and is calculated as:
[0111] J = |M ∩ G| / |M ∪ G|
[0112]where M is the predicted segmentation label, G is the ground truth segmentation label, and the symbols ∩ and ∪ denote the intersection and union of two sets, respectively.
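The J metric above can be computed directly from binary masks; a minimal sketch (the edge case where both masks are empty is conventionally scored as 1):

```python
import numpy as np

def region_similarity_j(pred, gt):
    """Region similarity J: intersection-over-union of the predicted
    segmentation mask M and the ground-truth mask G."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:          # both masks empty: define J = 1
        return 1.0
    return np.logical_and(pred, gt).sum() / union
```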
[0113] Contour accuracy F represents the average boundary similarity betwee...
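The contour accuracy F is conventionally the F-measure of boundary precision and recall between the predicted and ground-truth mask contours. Since the paragraph above is truncated, the sketch below is an assumption based on that standard definition: it matches boundary pixels exactly, whereas the DAVIS benchmark additionally allows a small distance tolerance when matching boundaries.

```python
import numpy as np

def _boundary(mask):
    """Boundary pixels: mask pixels with at least one 4-neighbor outside
    the mask (a simple stand-in for a morphological boundary extractor)."""
    m = mask.astype(bool)
    pad = np.pad(m, 1, constant_values=False)
    interior = (pad[:-2, 1:-1] & pad[2:, 1:-1] &
                pad[1:-1, :-2] & pad[1:-1, 2:])
    return m & ~interior

def contour_accuracy_f(pred, gt):
    """Contour accuracy F: F-measure of boundary precision and recall."""
    bp, bg = _boundary(pred), _boundary(gt)
    if bp.sum() == 0 and bg.sum() == 0:
        return 1.0
    match = (bp & bg).sum()
    precision = match / bp.sum() if bp.sum() else 0.0
    recall = match / bg.sum() if bg.sum() else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```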