A Video Understanding Method Based on Compression-Excitation Pseudo-3D Network

A pseudo-3D network technology applied in the field of video understanding based on a compression-excitation (squeeze-and-excitation) pseudo-3D network. It addresses the problems of difficult training and difficulty in extracting deep features, with the effects of increasing accuracy and robustness, deepening the network, and improving network performance.

Active Publication Date: 2022-05-03
UNIV OF ELECTRONICS SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

[0006] 3) The three-dimensional convolutional model has far more parameters than a two-dimensional convolutional network, so training is very difficult; as a result, most such models use a shallow structure, which makes it hard to extract deep features.
[0007] In addition, the basic convolutional neural networks currently used in video understanding and classification models have their own problems. The convolution kernel, as the core of a convolutional neural network, aggregates spatial information and feature-channel information over a local receptive field to finally obtain global information. A convolutional neural network consists of a series of convolutional layers, nonlinear layers, and subsampling layers, so that it can capture image characteristics over the global receptive field for image description; however, learning a very powerful network of this kind remains quite difficult.



Embodiment Construction

[0023] All features disclosed in this specification, or steps in all methods or processes disclosed, may be combined in any manner, except for mutually exclusive features and/or steps.

[0024] Any feature disclosed in this specification (including any appended claims, abstract and drawings), unless expressly stated otherwise, may be replaced by alternative features which are equivalent or serve a similar purpose. That is, unless expressly stated otherwise, each feature is one example only of a series of equivalent or similar features.

[0025] The video understanding method based on a compression-excitation pseudo-three-dimensional network proposed by the present invention is realized by a pseudo-three-dimensional residual network based on the compression-excitation (squeeze-and-excitation) mechanism, and includes steps 1-3:

[0026] Step 1: the original video is processed and then input into the network.

[0027] (1.1) Each training video in the training data is divided into several 4-second long segme...



Abstract

The present invention provides a video understanding method based on a compression-excitation pseudo-three-dimensional network. The method comprises: preprocessing the training data and test data to form a training set and a test set; using the training set to train a pseudo-three-dimensional residual network based on the compression-excitation mechanism; using the test set to test the pseudo-three-dimensional residual network based on the compression-excitation mechanism; and giving the detailed structure of the pseudo-three-dimensional residual network based on the compression-excitation mechanism. The video understanding method based on the compression-excitation pseudo-3D network proposed by the present invention evenly extracts the spatial and temporal features of the input video clips, reduces the number of parameters compared with a 3D convolutional model, and deepens the network so that deeper features are extracted; it explicitly models the interdependence between feature channels, thereby improving network performance; and the prediction results of the test samples are averaged as the final prediction result, which increases the accuracy and robustness of the results.

Description

Technical field

[0001] The invention belongs to the technical field of computer vision, relates to the field of video understanding and classification, and in particular relates to a video understanding method based on a compression-excitation pseudo-three-dimensional network.

Background technique

[0002] A large amount of image and video data is generated every minute, which also promotes the development of multimedia content understanding applications such as search and recommendation. How to extract video features well is of great significance for video content analysis and understanding. In the image field, an ensemble of residual network models has achieved a 3.57% top-5 error rate on the ImageNet dataset (a large-scale visual database for visual object recognition research), which is better than the human-level error rate of 5.1%. Compared with images, video contains complex temporal information in addition to ...

Claims


Application Information

Patent Type & Authority: Patent (China)
IPC(8): G06V20/40; G06V10/774; G06V10/764; G06V10/82; G06N3/04; G06N3/08
CPC: G06N3/08; G06V20/42; G06N3/045; G06F18/241; G06F18/214
Inventor: 高建彬, 王嘉琦
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA