
Video semantic segmentation method based on convolutional neural network

A convolutional neural network and semantic segmentation technology, applied to real-time video semantic segmentation of targets during autonomous driving. It addresses the slow processing speed of existing target segmentation, improving real-time performance and processing speed, improving segmentation accuracy, and reducing the time required for segmentation.

Active Publication Date: 2019-08-20
HARBIN INST OF TECH

AI Technical Summary

Benefits of technology

This technology improves video analysis by dividing each frame into semantic segments, making important regions and events easier to identify. It achieves this through an efficient way of learning from both the images and related contextual cues such as motion between frames.

Problems solved by technology

The technical problem addressed by this patent is improving the performance of machine learning systems built on accurate object detection and segmentation techniques such as convolutional neural networks (CNNs). Current methods require significant memory due to their computational complexity and cannot handle dynamic scenes without losing useful detail from previous frames. An improved video semantic segmentation method is therefore needed to achieve better results than current approaches.



Examples


Specific Embodiment 1

[0038] Specific Embodiment 1: This embodiment is described with reference to Figure 1.

[0039] A video semantic segmentation method based on a convolutional neural network, comprising the following steps:

[0040] Step 1: Construct a W-shaped network model based on the attention mechanism. The model is composed of two branches, which capture global information and fine detail simultaneously.

[0041] As shown in Figure 3, the W-shaped network model includes two branches:

[0042] One branch takes the image as input and performs three convolutional down-sampling steps to obtain a feature map at one-eighth the resolution of the original image, preserving the details of the original image as much as possible;

[0043] The other branch uses an Xception module or ResNet module to perform deep down-sampling and enlarge the receptive field, obtaining 16-fold and 32-fold down-sampled feature maps respectively; after the two down-sampled f...
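The text above is truncated, but the two-branch layout is clear enough for a rough sketch. Below is a minimal PyTorch sketch of a network in this spirit: a shallow detail branch at 1/8 resolution, a deep context branch producing 1/16- and 1/32-scale features, and a simple channel-attention fusion. All module names, channel widths, and fusion details are illustrative assumptions, not the patent's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_bn_relu(in_ch, out_ch, stride):
    # 3x3 conv + batch norm + ReLU; stride 2 halves the spatial resolution.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class ChannelAttention(nn.Module):
    """Illustrative attention module: reweight channels from globally pooled context."""
    def __init__(self, ch):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch, 1),
            nn.Sigmoid(),
        )
    def forward(self, x):
        return x * self.fc(x)

class WNet(nn.Module):
    """Two-branch segmentation sketch: a shallow detail branch (1/8 resolution)
    and a deep context branch (1/16 and 1/32), fused with channel attention."""
    def __init__(self, num_classes=19):
        super().__init__()
        # Detail branch: three stride-2 convs keep fine spatial information.
        self.detail = nn.Sequential(
            conv_bn_relu(3, 64, 2), conv_bn_relu(64, 64, 2), conv_bn_relu(64, 128, 2),
        )
        # Context branch: plain conv stack standing in for an Xception/ResNet backbone.
        self.ctx16 = nn.Sequential(conv_bn_relu(3, 64, 2), conv_bn_relu(64, 128, 2),
                                   conv_bn_relu(128, 128, 2), conv_bn_relu(128, 128, 2))
        self.ctx32 = conv_bn_relu(128, 128, 2)
        self.attn16 = ChannelAttention(128)
        self.attn32 = ChannelAttention(128)
        self.classifier = nn.Conv2d(256, num_classes, 1)

    def forward(self, x):
        detail = self.detail(x)                         # 1/8 resolution
        c16 = self.attn16(self.ctx16(x))                # 1/16 resolution
        c32 = self.attn32(self.ctx32(c16))              # 1/32 resolution
        size = detail.shape[-2:]
        ctx = F.interpolate(c16, size=size, mode='bilinear', align_corners=False) + \
              F.interpolate(c32, size=size, mode='bilinear', align_corners=False)
        fused = torch.cat([detail, ctx], dim=1)         # combine detail and context
        logits = self.classifier(fused)
        return F.interpolate(logits, scale_factor=8, mode='bilinear', align_corners=False)

seg = WNet()
out = seg(torch.randn(1, 3, 512, 1024))  # -> (1, 19, 512, 1024)
```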

Specific Embodiment 2

[0050] In Step 2 of this embodiment, on the basis of the W-shaped network, an optical flow field algorithm is constructed to propagate and fuse features between frames, as follows:

[0051] The deep feature flow algorithm exploits the propagation correspondence between features: it runs the computationally intensive deep convolutional network only on sparse key frames and transmits their deep feature maps to the other frames through the optical flow field. Compared with running the entire deep convolutional network on every frame, the optical flow computation is cheaper and faster, so the algorithm is significantly accelerated. The optical flow field is itself computed with a convolutional neural network model, so the whole framework can be trained end to end, which improves recognition accuracy. Since the intermediate convolutional feature maps have the same spatial size as the input image, they pres...
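A minimal sketch, assuming PyTorch, of the core propagation step: deep features computed on a key frame are warped to the current frame by bilinear sampling along the flow field. The helper name warp_features is hypothetical, and the flow network (a FlowNet-style CNN in deep feature flow) is stubbed out in comments.

```python
import torch
import torch.nn.functional as F

def warp_features(key_feat, flow):
    """Warp key-frame features to the current frame along an optical flow field.

    key_feat: (N, C, H, W) deep features computed on the key frame.
    flow:     (N, 2, H, W) per-pixel displacement (dx, dy) from the current
              frame back to the key frame, in pixels.
    """
    n, _, h, w = key_feat.shape
    # Build the identity sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing='ij')
    grid = torch.stack((xs, ys), dim=0).float().to(key_feat.device)  # (2, H, W)
    coords = grid.unsqueeze(0) + flow                                # follow the flow
    # Normalize to [-1, 1] as grid_sample expects (x first, then y).
    coords_x = 2.0 * coords[:, 0] / (w - 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / (h - 1) - 1.0
    sample_grid = torch.stack((coords_x, coords_y), dim=-1)          # (N, H, W, 2)
    return F.grid_sample(key_feat, sample_grid, mode='bilinear',
                         padding_mode='border', align_corners=True)

# Sanity check: zero flow is an identity warp.
feat = torch.randn(1, 256, 64, 128)   # key-frame features
flow = torch.zeros(1, 2, 64, 128)
assert torch.allclose(warp_features(feat, flow), feat, atol=1e-5)

# Per-frame loop (stubbed): run the heavy backbone only on key frames,
# then warp its features to the cheap intermediate frames.
# key_feat = backbone(key_frame)              # expensive, sparse
# flow = flow_net(current_frame, key_frame)   # lightweight flow CNN
# cur_feat = warp_features(key_feat, flow)    # propagated features
```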

Specific Embodiment 3

[0056] The branch described in this embodiment takes an image as input and performs three convolutional down-sampling steps to obtain a feature map at one-eighth the resolution of the original image. The specific process (sketched in code after these steps) is as follows:

[0057] The image is first processed by a conv+bn+relu block to achieve 2-fold down-sampling;

[0058] a second conv+bn+relu block performs another 2-fold down-sampling, yielding a 4-fold down-sampled feature map;

[0059] a third application of the same operation performs a final 2-fold down-sampling, yielding a feature map at one-eighth the resolution of the original image.
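A minimal sketch, assuming PyTorch, of the three-block sequence in [0057]-[0059]; the channel widths are illustrative assumptions, not from the patent.

```python
import torch
import torch.nn as nn

def down_block(in_ch, out_ch):
    # conv + bn + relu with stride 2: each block halves height and width.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

detail_branch = nn.Sequential(
    down_block(3, 64),    # [0057] 2x down-sampling -> 1/2 resolution
    down_block(64, 64),   # [0058] another 2x       -> 1/4 resolution
    down_block(64, 128),  # [0059] final 2x         -> 1/8 resolution
)

x = torch.randn(1, 3, 512, 1024)
print(detail_branch(x).shape)  # torch.Size([1, 128, 64, 128])
```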

[0060] Other steps and parameters are the same as those in Embodiment 1 or 2.



Abstract

The invention discloses a video semantic segmentation method based on a convolutional neural network, belonging to the technical field of autonomous driving. It solves the problem that real-time target segmentation in the existing autonomous driving field is too slow. The invention applies a convolutional neural network model to video semantic segmentation: a W-shaped network is constructed using an optical flow field, an attention mechanism, and depthwise separable convolution, and on the basis of the W-shaped network a feature aggregation algorithm based on the optical flow field uses inter-frame correlation to propagate features between frames, further improving the speed of video semantic segmentation and greatly reducing the time required for segmentation. The method is used for video semantic segmentation.
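The abstract names depthwise separable convolution as a building block of the W-shaped network. A minimal PyTorch sketch of that standard operation follows; the channel sizes are illustrative, not from the patent.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution: a per-channel (depthwise) 3x3 conv
    followed by a 1x1 pointwise conv, much cheaper than a dense 3x3 conv."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))

# For 128 -> 128 channels: a dense 3x3 conv has 128*128*9 = 147,456 weights,
# while the separable version has 128*9 + 128*128 = 17,536 conv weights.
block = DepthwiseSeparableConv(128, 128)
out = block(torch.randn(1, 128, 64, 128))  # -> (1, 128, 64, 128)
```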


