
Video semantic segmentation method based on convolutional neural network

A convolutional neural network and semantic segmentation technology, applied to real-time video semantic segmentation of targets during autonomous driving. It addresses the slow processing speed of existing target segmentation, improving real-time performance and processing speed, improving segmentation accuracy, and reducing the time required for segmentation.

Active Publication Date: 2019-08-20
HARBIN INST OF TECH

AI Technical Summary

Benefits of technology

This technology improves video analysis by dividing each frame into semantic segments, making important regions and events easier to identify. It achieves this through an efficient way of learning from both the images and related contextual cues such as motion between frames.

Problems solved by technology

The technical problem addressed by this patent is improving the performance of machine learning systems built on accurate object detection and segmentation techniques such as convolutional neural networks (CNNs). Current methods require significant memory due to their computational complexity and cannot handle dynamic scenes without losing useful detail from previous frames. An improved video semantic segmentation method is therefore needed to achieve better results than current approaches.



Examples


Specific Embodiment 1

[0038] Specific Embodiment 1: This embodiment is described with reference to Figure 1.

[0039] A video semantic segmentation method based on a convolutional neural network, comprising the following steps:

[0040] Step 1: Construct a W-shaped network model based on the attention mechanism. The model is composed of two branches, which capture global information and fine detail simultaneously.

[0041] As shown in Figure 3, the W-shaped network model includes two branches:

[0042] One branch takes the image as input and performs three convolutional down-sampling steps to obtain a feature map at one-eighth the resolution of the original image, preserving the details of the original image as much as possible;

[0043] The other branch uses an Xception module or ResNet module to perform deep down-sampling and enlarge the receptive field, obtaining 16-fold and 32-fold down-sampled feature maps respectively; after the two down-sampled f...
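The text above is truncated, but the two-branch layout is clear enough for a rough sketch. Below is a minimal PyTorch sketch of a network in this spirit: a shallow detail branch at 1/8 resolution, a deep context branch producing 1/16- and 1/32-scale features, and a simple channel-attention fusion. All module names, channel widths, and fusion details are illustrative assumptions, not the patent's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_bn_relu(in_ch, out_ch, stride):
    # 3x3 conv + batch norm + ReLU; stride 2 halves the spatial resolution.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class ChannelAttention(nn.Module):
    """Illustrative attention module: reweight channels from globally pooled context."""
    def __init__(self, ch):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch, 1),
            nn.Sigmoid(),
        )
    def forward(self, x):
        return x * self.fc(x)

class WNet(nn.Module):
    """Two-branch segmentation sketch: a shallow detail branch (1/8 resolution)
    and a deep context branch (1/16 and 1/32), fused with channel attention."""
    def __init__(self, num_classes=19):
        super().__init__()
        # Detail branch: three stride-2 convs keep fine spatial information.
        self.detail = nn.Sequential(
            conv_bn_relu(3, 64, 2), conv_bn_relu(64, 64, 2), conv_bn_relu(64, 128, 2),
        )
        # Context branch: plain conv stack standing in for an Xception/ResNet backbone.
        self.ctx16 = nn.Sequential(conv_bn_relu(3, 64, 2), conv_bn_relu(64, 128, 2),
                                   conv_bn_relu(128, 128, 2), conv_bn_relu(128, 128, 2))
        self.ctx32 = conv_bn_relu(128, 128, 2)
        self.attn16 = ChannelAttention(128)
        self.attn32 = ChannelAttention(128)
        self.classifier = nn.Conv2d(256, num_classes, 1)

    def forward(self, x):
        detail = self.detail(x)                         # 1/8 resolution
        c16 = self.attn16(self.ctx16(x))                # 1/16 resolution
        c32 = self.attn32(self.ctx32(c16))              # 1/32 resolution
        size = detail.shape[-2:]
        ctx = F.interpolate(c16, size=size, mode='bilinear', align_corners=False) + \
              F.interpolate(c32, size=size, mode='bilinear', align_corners=False)
        fused = torch.cat([detail, ctx], dim=1)         # combine detail and context
        logits = self.classifier(fused)
        return F.interpolate(logits, scale_factor=8, mode='bilinear', align_corners=False)

seg = WNet()
out = seg(torch.randn(1, 3, 512, 1024))  # -> (1, 19, 512, 1024)
```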

Specific Embodiment 2

[0050] In Step 2 of this embodiment, on the basis of the W-shaped network, an optical flow field algorithm is constructed to propagate and fuse features between frames, as follows:

[0051] The deep feature flow algorithm exploits the propagation correspondence between features: it runs the computationally intensive deep convolutional network only on sparse key frames and transmits their deep feature maps to the other frames through the optical flow field. Compared with running the entire deep convolutional network on every frame, the optical flow computation is cheaper and faster, so the algorithm is significantly accelerated. The optical flow field is itself computed with a convolutional neural network model, so the whole framework can be trained end to end, which improves recognition accuracy. Since the intermediate convolutional feature maps have the same spatial size as the input image, they pres...
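A minimal sketch, assuming PyTorch, of the core propagation step: deep features computed on a key frame are warped to the current frame by bilinear sampling along the flow field. The helper name warp_features is hypothetical, and the flow network (a FlowNet-style CNN in deep feature flow) is stubbed out in comments.

```python
import torch
import torch.nn.functional as F

def warp_features(key_feat, flow):
    """Warp key-frame features to the current frame along an optical flow field.

    key_feat: (N, C, H, W) deep features computed on the key frame.
    flow:     (N, 2, H, W) per-pixel displacement (dx, dy) from the current
              frame back to the key frame, in pixels.
    """
    n, _, h, w = key_feat.shape
    # Build the identity sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing='ij')
    grid = torch.stack((xs, ys), dim=0).float().to(key_feat.device)  # (2, H, W)
    coords = grid.unsqueeze(0) + flow                                # follow the flow
    # Normalize to [-1, 1] as grid_sample expects (x first, then y).
    coords_x = 2.0 * coords[:, 0] / (w - 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / (h - 1) - 1.0
    sample_grid = torch.stack((coords_x, coords_y), dim=-1)          # (N, H, W, 2)
    return F.grid_sample(key_feat, sample_grid, mode='bilinear',
                         padding_mode='border', align_corners=True)

# Sanity check: zero flow is an identity warp.
feat = torch.randn(1, 256, 64, 128)   # key-frame features
flow = torch.zeros(1, 2, 64, 128)
assert torch.allclose(warp_features(feat, flow), feat, atol=1e-5)

# Per-frame loop (stubbed): run the heavy backbone only on key frames,
# then warp its features to the cheap intermediate frames.
# key_feat = backbone(key_frame)              # expensive, sparse
# flow = flow_net(current_frame, key_frame)   # lightweight flow CNN
# cur_feat = warp_features(key_feat, flow)    # propagated features
```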

Specific Embodiment 3

[0056] The branch described in this embodiment takes an image as input and performs three convolutional down-sampling steps to obtain a feature map at one-eighth the resolution of the original image. The specific process (sketched in code after these steps) is as follows:

[0057] The image is first processed by a conv+bn+relu block to achieve 2-fold down-sampling;

[0058] a second conv+bn+relu block performs another 2-fold down-sampling, yielding a 4-fold down-sampled feature map;

[0059] a third application of the same operation performs a final 2-fold down-sampling, yielding a feature map at one-eighth the resolution of the original image.
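A minimal sketch, assuming PyTorch, of the three-block sequence in [0057]-[0059]; the channel widths are illustrative assumptions, not from the patent.

```python
import torch
import torch.nn as nn

def down_block(in_ch, out_ch):
    # conv + bn + relu with stride 2: each block halves height and width.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

detail_branch = nn.Sequential(
    down_block(3, 64),    # [0057] 2x down-sampling -> 1/2 resolution
    down_block(64, 64),   # [0058] another 2x       -> 1/4 resolution
    down_block(64, 128),  # [0059] final 2x         -> 1/8 resolution
)

x = torch.randn(1, 3, 512, 1024)
print(detail_branch(x).shape)  # torch.Size([1, 128, 64, 128])
```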

[0060] Other steps and parameters are the same as those in Embodiment 1 or 2.



Abstract

The invention discloses a video semantic segmentation method based on a convolutional neural network, belonging to the technical field of autonomous driving. It solves the problem that real-time target segmentation in the existing autonomous driving field is too slow. The invention applies a convolutional neural network model to video semantic segmentation: a W-shaped network is constructed using an optical flow field, an attention mechanism, and depthwise separable convolution, and on the basis of the W-shaped network a feature aggregation algorithm based on the optical flow field uses inter-frame correlation to propagate features between frames, further improving the speed of video semantic segmentation and greatly reducing the time required for segmentation. The method is used for video semantic segmentation.
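The abstract names depthwise separable convolution as a building block of the W-shaped network. A minimal PyTorch sketch of that standard operation follows; the channel sizes are illustrative, not from the patent.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution: a per-channel (depthwise) 3x3 conv
    followed by a 1x1 pointwise conv, much cheaper than a dense 3x3 conv."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))

# For 128 -> 128 channels: a dense 3x3 conv has 128*128*9 = 147,456 weights,
# while the separable version has 128*9 + 128*128 = 17,536 conv weights.
block = DepthwiseSeparableConv(128, 128)
out = block(torch.randn(1, 128, 64, 128))  # -> (1, 128, 64, 128)
```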


