Video semantic segmentation method based on ConvLSTM convolutional neural network

Convolutional neural network and semantic segmentation technology, applied in neural learning methods, biological neural network models, neural architectures, etc. It addresses the problem that existing methods ignore the correlation between adjacent video frames, with the effects of improving generalization ability, expanding the receptive field, and improving accuracy.

Active Publication Date: 2020-10-30
SHANDONG UNIV

AI Technical Summary

Problems solved by technology

The existing research methods sequentially perform image semantic segmentation on each frame in the video. Although the req...



Examples


Embodiment 1

[0064] A video semantic segmentation method based on ConvLSTM convolutional neural network, comprising the following steps:

[0065] A. Build and train video semantic segmentation network

[0066] (1) Get the dataset

[0067] Neural networks need a large amount of data to learn, and most networks use supervised learning; that is, each input has corresponding labeled data during network training. Here the input data in the training set is a video sequence, and the corresponding label is the result image after semantic segmentation. Because a video contains many frames, only a few frames in each video sequence have annotated images: in the Cityscapes dataset, each video sequence has 30 frames, of which only the 20th frame has annotation information. The dataset used is the Cityscapes dataset, which contains a variety of video sequences recorded in street scenes from 50 different c...
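The frame layout described above can be illustrated with a short sketch. It assumes the network consumes a window of consecutive frames that ends on the annotated frame; the window length and the helper name `clip_frame_numbers` are hypothetical and are not taken from the patent text.

```python
FRAMES_PER_SEQUENCE = 30  # from the text: each Cityscapes sequence has 30 frames
LABELED_FRAME = 20        # from the text: the 20th frame (1-based) is annotated


def clip_frame_numbers(clip_length):
    """Return the 1-based frame numbers of a clip ending on the labeled frame.

    clip_length is an assumed hyperparameter, not a value given in the source.
    """
    if not 1 <= clip_length <= LABELED_FRAME:
        raise ValueError("clip_length must be between 1 and LABELED_FRAME")
    first = LABELED_FRAME - clip_length + 1
    return list(range(first, LABELED_FRAME + 1))


# Example: a hypothetical 4-frame clip whose last frame carries the label.
frames = clip_frame_numbers(4)
```

With this layout, only the last frame of each clip contributes a supervised loss term, while the earlier frames supply temporal context.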

Embodiment 2

[0086] The video semantic segmentation method based on a ConvLSTM convolutional neural network according to Embodiment 1, with the following differences:

[0087] Before step (3), data augmentation is performed on the training set: random horizontal flipping, random brightness adjustment, and random cropping are applied to expand the training data. This reduces over-fitting and improves the generalization ability of the network.
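The three augmentations named above can be sketched in plain Python as follows. This is a minimal illustration, not the patent's implementation: it assumes a single-channel image stored as rows of floats in [0, 1] and a label map of the same size, and the function name `augment`, the flip probability of 0.5, and the brightness range are all hypothetical choices.

```python
import random


def augment(image, label, crop_h, crop_w, max_brightness_delta=0.2, rng=random):
    """Apply random flip, brightness shift, and crop to an image/label pair.

    image: H x W rows of floats in [0, 1]; label: H x W rows of class ids.
    Geometric changes are applied to both; brightness only to the image.
    """
    height, width = len(image), len(image[0])
    if crop_h > height or crop_w > width:
        raise ValueError("crop size exceeds image size")

    # Random horizontal flip (assumed probability 0.5), same for image and label.
    if rng.random() < 0.5:
        image = [row[::-1] for row in image]
        label = [row[::-1] for row in label]

    # Random brightness adjustment, clamped back into [0, 1]; labels unchanged.
    delta = rng.uniform(-max_brightness_delta, max_brightness_delta)
    image = [[min(1.0, max(0.0, p + delta)) for p in row] for row in image]

    # Random crop using one shared window so pixels and labels stay aligned.
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    image = [row[left:left + crop_w] for row in image[top:top + crop_h]]
    label = [row[left:left + crop_w] for row in label[top:top + crop_h]]
    return image, label
```

For video input, one reasonable design is to draw the random parameters once per clip and reuse them for every frame, so that the temporal correspondence between adjacent frames is preserved.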

[0088] In step (3), a learning rate decay strategy is used to train the video semantic segmentation network. As the number of iterations increases, the learning rate gradually decreases, so that the model does not fluctuate too much in the later stage of training and gets closer to the optimal solution. The initial learning rate l_0 is set to 0.0003, and the learning rate l is attenuated by formula (I) durin...
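Formula (I) is not available in this excerpt, so the sketch below does not reproduce it. As a stand-in, it uses polynomial ("poly") decay, a schedule commonly used for semantic segmentation, purely to illustrate how a learning rate that starts at l_0 = 0.0003 shrinks as iterations increase. The function name `poly_decay`, the exponent 0.9, and the iteration budget are assumptions.

```python
INITIAL_LR = 0.0003  # l_0 from the text


def poly_decay(iteration, max_iterations, initial_lr=INITIAL_LR, power=0.9):
    """Polynomial learning-rate decay (a stand-in, NOT the patent's formula (I)).

    Returns initial_lr * (1 - iteration / max_iterations) ** power,
    with the iteration clamped to the range [0, max_iterations].
    """
    if max_iterations <= 0:
        raise ValueError("max_iterations must be positive")
    progress = min(max(iteration, 0), max_iterations) / max_iterations
    return initial_lr * (1.0 - progress) ** power


# Example: learning rates at a few points of a hypothetical 1000-iteration run.
schedule = [poly_decay(i, 1000) for i in (0, 250, 500, 750, 1000)]
```

Whatever the exact formula, the property the text relies on is the same: the step size is largest at the start and decreases monotonically, which damps oscillation late in training.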


Abstract

The invention relates to a video semantic segmentation method based on a ConvLSTM convolutional neural network. The method comprises the following steps: A, constructing and training a video semantic segmentation network: (1) acquiring a data set; (2) constructing a video semantic segmentation network; (3) training the video semantic segmentation network; (4) testing the segmentation accuracy of the video semantic segmentation network; B, performing video semantic segmentation with the trained video semantic segmentation network. The method adopts a ConvLSTM module to take the correlation between adjacent video frames into account, which improves the accuracy of video semantic segmentation. In addition, a densely connected atrous spatial pyramid pooling module with densely connected blocks is adopted, so that features and gradients propagate more effectively, the vanishing-gradient problem in deep network training is alleviated, multi-scale context information can be systematically aggregated, and the receptive field is expanded.
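The claim that atrous (dilated) convolutions expand the receptive field can be made concrete with a small calculation. The sketch below uses the standard relation for a stride-1 atrous convolution, effective size = k + (k - 1)(d - 1), and sums the growth over layers applied in sequence. The kernel size 3 and the dilation rates 3, 6, and 12 are hypothetical values for illustration; the rates used in the patent are not given in this excerpt.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by one atrous convolution kernel with the given dilation."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(kernel_size, dilations):
    """Receptive field of stride-1 atrous layers applied one after another."""
    field = 1
    for dilation in dilations:
        # Each stride-1 layer widens the field by (effective size - 1) pixels.
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field


# Example with assumed dilation rates 3, 6, 12 and 3x3 kernels.
single = effective_kernel_size(3, 6)                # one layer with dilation 6
stacked = stacked_receptive_field(3, [3, 6, 12])    # three layers in sequence
```

In a densely connected arrangement, each layer also receives the outputs of all earlier layers, so the module combines paths of several different receptive-field sizes, with the fully stacked path giving the largest one.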

Description

Technical field

[0001] The invention relates to a video semantic segmentation method based on a ConvLSTM convolutional neural network, belonging to the technical field of computer vision.

Background

[0002] A neural network is a machine learning technique that simulates the nervous system of the brain. Through learning, the network acquires specific nonlinear expression capabilities. Increasing the number of network layers can improve the expressive power of a neural network. At present, deep neural networks have become the basis of deep learning.

[0003] Building on research into feedforward neural networks, the convolutional neural network (Convolutional Neural Network, CNN) and the recurrent neural network (Recurrent Neural Network, RNN) have become research hotspots and have been widely used.

[0004] A convolutional neural network is a kind of feedforward neural network that includes convolution operations and has a deep structure. It can effectively extract feat...


Application Information

IPC(8): G06K9/00, G06N3/04, G06N3/08
CPC: G06N3/049, G06N3/08, G06V20/49, G06V20/41, G06N3/045
Inventor: 元辉, 周兰, 黄文俊
Owner: SHANDONG UNIV