
Lightweight video action recognition method and system based on deep learning

A deep-learning-based action recognition technology in the field of video recognition. It addresses the problem that existing models do not realize true spatio-temporal information interaction, achieving efficient and more accurate video action recognition with genuine spatio-temporal interaction.

Active Publication Date: 2022-05-17
WUHAN UNIV
View PDF · 9 Cites · 0 Cited by

AI Technical Summary

Problems solved by technology

[0007] Most existing video action recognition models are (2+1)D models or 2D+3D parallel models. Although these methods have achieved certain results, during learning they split spatio-temporal information into two independent parts that are extracted separately, so true spatio-temporal interaction is not realized.
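The efficiency motivation behind such (2+1)D factorizations can be illustrated with a quick parameter count. The channel and kernel sizes below are illustrative assumptions, not figures from the patent:

```python
# Parameter count of a full 3D convolution vs. a (2+1)D factorization.
# All sizes (64 channels, 3x3x3 kernel) are illustrative assumptions.

def conv3d_params(c_in, c_out, t, h, w):
    # A single 3D kernel spans t x h x w for every in/out channel pair.
    return c_in * c_out * t * h * w

def conv2plus1d_params(c_in, c_mid, c_out, t, h, w):
    spatial = c_in * c_mid * 1 * h * w    # 1 x h x w spatial convolution
    temporal = c_mid * c_out * t * 1 * 1  # t x 1 x 1 temporal convolution
    return spatial + temporal

full = conv3d_params(64, 64, 3, 3, 3)               # 110592 parameters
factored = conv2plus1d_params(64, 64, 64, 3, 3, 3)  # 36864 + 12288 = 49152
print(full, factored)
```

Even with the same number of intermediate channels, the factored version needs less than half the parameters of the full 3D kernel, which is the gap lightweight designs exploit.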

Method used




Embodiment Construction

[0026] To help those of ordinary skill in the art understand and implement the present invention, the invention is described in further detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the embodiments described here are intended only to illustrate and explain the present invention, not to limit it.

[0027] Please see figure 1. The lightweight video action recognition network based on deep learning provided by the present invention uses separable convolution to reconstruct 3D convolution from three different dimensions (t, h, w); that is, it decomposes a 3D convolution along three dimensions into three different 2D convolution branches, constructing three MDM modules of different dimensions: two spatio-temporal collaborative convolution branches, MDM-A(t,h) and MDM-C(t,w), and one spatial convolution branch, MDM-B(h,w); where...
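A minimal sketch of this decomposition: each branch can be realized by folding one axis of the (C, T, H, W) video cube into the batch dimension and applying an ordinary 2D convolution over the remaining pair of axes. The tensor layout and branch wiring below are assumptions for illustration; the patent does not specify exact shapes here, and the 2D convolutions themselves are omitted:

```python
import numpy as np

def branch_views(x):
    """Rearrange a video cube x of shape (C, T, H, W) into batches of 2D
    slices for the three MDM branches. Only the axis folding is shown;
    the layout is an illustrative assumption, not the patent's exact design.
    """
    # MDM-A(t, h): fold W into the batch; a 2D conv would act on the (T, H) plane.
    a = x.transpose(3, 0, 1, 2)   # (W, C, T, H)
    # MDM-B(h, w): fold T into the batch; a 2D conv would act on the (H, W) plane.
    b = x.transpose(1, 0, 2, 3)   # (T, C, H, W)
    # MDM-C(t, w): fold H into the batch; a 2D conv would act on the (T, W) plane.
    c = x.transpose(2, 0, 1, 3)   # (H, C, T, W)
    return a, b, c

x = np.zeros((16, 8, 32, 32))     # C=16 channels, T=8 frames, 32x32 spatial
a, b, c = branch_views(x)
print(a.shape, b.shape, c.shape)  # (32, 16, 8, 32) (8, 16, 32, 32) (32, 16, 8, 32)
```

Because each branch sees a different orthogonal plane of the cube, the (t,h) and (t,w) branches mix temporal and spatial information jointly rather than in separate stages.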



Abstract

The invention discloses a lightweight video action recognition method and system based on deep learning, and proposes a multi-dimensional module (MDM) for action recognition. MDM uses multiple 2D convolution kernels of different dimensions to reconstruct 3D convolution kernels, combining high efficiency with recognition accuracy. Specifically, MDM first performs 2D convolutions on a video cube from three orthogonal dimensions to learn the appearance and motion features of action subjects in videos in a synergistic manner. Second, the three 2D convolutions are grouped along the channel dimension to further reduce parameters and computation. Finally, a temporal shift operation is performed along the time dimension for the two collaborative convolutions that include the temporal dimension, effectively capturing both long-range and short-range temporal information in the video. Compared with the computational cost of 3D CNNs, the method of the present invention is more efficient, meeting the lightweight requirement while maintaining recognition performance.
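The temporal shift step described in the abstract resembles the well-known temporal shift module (TSM) idea: move a fraction of channels one frame forward and another fraction one frame back, adding temporal mixing with zero extra parameters. The sketch below is a minimal numpy illustration; the 1/8 fold ratio and (T, C, H, W) layout are assumptions borrowed from TSM, not details stated in the patent:

```python
import numpy as np

def temporal_shift(x, fold_div=8):
    """Shift a fraction of channels along time. x has shape (T, C, H, W).
    The fold_div=8 ratio is an assumption from TSM, not from the patent.
    """
    t, c, h, w = x.shape
    fold = c // fold_div
    out = np.zeros_like(x)
    out[:-1, :fold] = x[1:, :fold]                   # first fold: shift back in time
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]   # second fold: shift forward
    out[:, 2 * fold:] = x[:, 2 * fold:]              # remaining channels unchanged
    return out

x = np.arange(4 * 8, dtype=float).reshape(4, 8, 1, 1)  # T=4, C=8 toy input
y = temporal_shift(x)
```

After the shift, each frame's first channels carry information from the next frame and the second group from the previous frame, so a subsequent 2D convolution sees a local temporal neighborhood for free.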

Description

technical field

[0001] The invention belongs to the technical field of video recognition and relates to a video-oriented human action recognition network, method and system, in particular a lightweight video action recognition network, method and system based on deep learning.

Background technique

[0002] Deep learning drives progress in video action recognition, achieving results far superior to traditional recognition methods by training on massive video datasets. However, considering factors such as device power consumption and storage cost, video-based human action analysis and recognition applications have strict requirements on real-time performance and speed. It is therefore of great significance to design lightweight human action analysis and recognition models.

[0003] At present, the mainstream methods of video action recognition can be roughly divided into three categories:

[0004] (1) Two-stream approach: Extending 2D CNN...

Claims


Application Information

Patent Timeline
Patent Type & Authority: Patents (China)
IPC (8): G06V40/20; G06V20/40; G06V10/82; G06N3/04
CPC: G06N3/045; Y02D10/00
Inventor: 王中元 (Wang Zhongyuan), 陈建宇 (Chen Jianyu), 曾康利 (Zeng Kangli), 黄宝金 (Huang Baojin)
Owner WUHAN UNIV