
Lightweight video action recognition network, method and system based on deep learning

A deep learning and action recognition technology, applied in the field of video recognition. It addresses the problem that existing models fail to realize genuine spatio-temporal information interaction, and achieves efficient, more accurate video action recognition.

Active Publication Date: 2021-09-03
WUHAN UNIV

AI Technical Summary

Problems solved by technology

[0008] Most existing video action recognition models are (2+1)D models or 2D+3D parallel models. Although these methods have achieved certain results, they split spatio-temporal information into two independent parts during learning and extract each part separately, so genuine spatio-temporal interaction is never realized.



Embodiment Construction

[0027] To help those of ordinary skill in the art understand and implement the present invention, it is described in further detail below in conjunction with the accompanying drawings and implementation examples. It should be understood that the implementation examples described here serve only to illustrate and explain the present invention, and are not intended to limit it.

[0028] Referring to figure 1, the lightweight video action recognition network based on deep learning provided by the present invention uses separable convolution to reconstruct 3D convolution from three different dimensions (t, h, w). That is, the 3D convolution is decomposed along the three dimensions into three different 2D convolution branches, forming three MDM modules of different dimensions: two spatio-temporal cooperative convolution module branches, MDM-A(t,h) and MDM-C(t,w), and one spatial convolution module branch, MDM-B(h,w); where h...
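The lightweight claim behind this decomposition can be illustrated with a rough parameter count. The sketch below compares a dense 3×3×3 3D convolution against three 2D branches with grouped channels; the equal three-way channel grouping and the kernel size are illustrative assumptions, not the patent's exact configuration.

```python
# Rough parameter-count comparison: dense 3D convolution vs. an MDM-style
# decomposition into three 2D branches over (t,h), (h,w) and (t,w).
# The grouping scheme below is an illustrative assumption.

def conv3d_params(c_in, c_out, k=3):
    """Weight count of a k x k x k 3D convolution (bias ignored)."""
    return c_in * c_out * k ** 3

def mdm_params(c_in, c_out, k=3, groups=3):
    """Three 2D branches, each a k x k kernel, with channels split into
    `groups` groups (grouped convolution divides parameters by `groups`)."""
    per_branch = (c_in * c_out * k * k) // groups
    return 3 * per_branch

dense = conv3d_params(64, 64)  # 64 * 64 * 27 = 110592
mdm = mdm_params(64, 64)       # 3 * (64 * 64 * 9 // 3) = 36864
print(dense, mdm, dense // mdm)
```

Under these assumptions the decomposed form uses a third of the parameters of the dense 3D kernel, which is the kind of saving the "lightweight" label refers to.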


Abstract

The invention discloses a lightweight video action recognition network, method and system based on deep learning, and proposes a multi-dimensional module (MDM) for action recognition. The MDM reconstructs a 3D convolution kernel from several 2D convolution kernels of different dimensions, achieving high efficiency and recognition accuracy. Specifically, the MDM first applies 2D convolutions to a video cube along three orthogonal dimensions, learning the appearance and motion features of the acting subject in a collaborative manner. Second, the three 2D convolutions are grouped along the channel dimension to further reduce the parameter and computation cost. Finally, a temporal shift operation is applied along the time dimension to the two cooperative convolutions that contain the temporal dimension, efficiently capturing both long-range and short-range temporal information in the video. Compared with the computational cost of a 3D CNN, the method provided by the invention is more efficient, meeting the lightweight requirement while preserving recognition performance.
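The temporal shift operation mentioned in the abstract can be sketched in a few lines. The idea (popularized by the Temporal Shift Module) is to move a fraction of the channels one step forward or backward along the time axis, so a 2D convolution at frame t also sees activations from neighbouring frames. The 1/4 channel split and zero padding at the clip boundaries are assumptions here, not details taken from the patent.

```python
# Minimal sketch of a temporal shift: a fraction of channels is displaced
# one frame backward, an equal fraction one frame forward, the rest kept.
# The shift fraction (1/shift_div) and zero padding are assumptions.

def temporal_shift(clip, shift_div=4):
    """clip: list of T frames, each a list of C channel activations."""
    T, C = len(clip), len(clip[0])
    fold = C // shift_div
    out = []
    for t in range(T):
        frame = []
        for c in range(C):
            if c < fold:            # these channels look one frame back
                frame.append(clip[t - 1][c] if t > 0 else 0)
            elif c < 2 * fold:      # these channels look one frame ahead
                frame.append(clip[t + 1][c] if t < T - 1 else 0)
            else:                   # remaining channels are untouched
                frame.append(clip[t][c])
        out.append(frame)
    return out

clip = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]  # T=3 frames, C=4 channels
print(temporal_shift(clip))
# [[0, 6, 3, 4], [1, 10, 7, 8], [5, 0, 11, 12]]
```

Because the shift itself is free of multiplications, it adds temporal mixing at essentially zero parameter cost, which fits the lightweight goal stated in the abstract.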

Description

technical field

[0001] The invention belongs to the technical field of video recognition, and relates to a video-oriented human action recognition network, method and system, in particular to a lightweight video action recognition network, method and system based on deep learning.

[0002] technical background

[0003] Deep learning drives progress in video action recognition, achieving results far superior to traditional recognition methods by training on massive video datasets. However, considering factors such as device power consumption and storage cost, video-based human action analysis and recognition applications have high requirements for real-time performance and speed. A lightweight design for the human action analysis and recognition model is therefore of great significance.

[0004] At present, mainstream video action recognition methods can be roughly divided into three categories:

[0005] (1) Two-stream approach: Extendin...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06K9/00; G06N3/04
CPC: G06N3/045; Y02D10/00
Inventor: 王中元, 陈建宇, 曾康利, 黄宝金
Owner: WUHAN UNIV