Video action recognition method based on CNN-LSTM (Convolutional Neural Network-Long Short-Term Memory) and attention

A video action recognition technology in the field of computer vision, aimed at solving the problems that existing methods lack joint attention over, and modeling of, key spatio-temporal features, pay insufficient attention to key spatio-temporal context, and lose spatial correlation during temporal modeling.

Inactive Publication Date: 2020-06-19
CHONGQING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

However, most of the information carried by the individual feature layers of a CNN has not been fully exploited. At the same time, CNN-LSTM-based models are limited not only by the lack of fine-grained, context-rich fully connected features as input for temporal modeling; FC-LSTM also tends to lose spatial correlation when modeling the temporal dynamics of convolutional-layer features that carry a spatial topology, which degrades recognition performance.
Furthermore, current attention models are limited to producing separate temporal and spatial saliency features and lack joint attention over, and modeling of, key spatio-temporal features. The resulting representations therefore do not attend sufficiently to key spatio-temporal context information and lack spatio-temporal dependencies, which directly degrades feature quality and, in turn, the classification accuracy of the model.
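
To make the spatial-correlation issue concrete: feeding convolutional feature maps to an FC-LSTM requires flattening them, which throws away the H x W topology, whereas a Conv-LSTM computes its gates with convolutions and keeps the spatial layout of its hidden state. The following PyTorch snippet is only a minimal sketch under assumed tensor shapes, using a generic textbook ConvLSTM cell rather than the patent's actual network:

import torch
import torch.nn as nn

# Illustrative shapes (assumed): T frames, batch B, and C x H x W feature
# maps taken from the last convolutional layer of a pre-trained CNN.
T, B, C, H, W = 10, 2, 256, 7, 7
conv_feats = torch.randn(T, B, C, H, W)

# FC-LSTM route: the map must be flattened first, so the H x W topology
# (and with it the spatial correlation) is discarded.
fc_lstm = nn.LSTM(input_size=C * H * W, hidden_size=512)
fc_out, _ = fc_lstm(conv_feats.view(T, B, -1))      # (T, B, 512)

# Conv-LSTM route: gates are computed with convolutions, so the hidden
# state keeps its spatial layout. Minimal, generic single-cell version.
class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

cell = ConvLSTMCell(C, 64)
h = torch.zeros(B, 64, H, W)
c = torch.zeros(B, 64, H, W)
for t in range(T):
    h, c = cell(conv_feats[t], (h, c))              # h stays (B, 64, H, W)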

Method used



Examples


Embodiment Construction

[0049] Embodiments of the present invention are described below through specific examples, and those skilled in the art can easily understand other advantages and effects of the present invention from the content disclosed in this specification. The present invention can also be implemented or applied through other specific embodiments, and the details in this specification can be modified or changed based on different viewpoints and applications without departing from the spirit of the present invention. It should be noted that the illustrations provided in the following embodiments only schematically illustrate the basic idea of the present invention, and the following embodiments and the features in the embodiments can be combined with one another provided there is no conflict.

[0050] Wherein, the accompanying drawings are for illustrative purposes only, and represent only schematic diagrams, rather than physical drawings, and should not be const...



Abstract

The invention relates to a video action classification method based on CNN-LSTM (Convolutional Neural Network-Long Short-Term Memory) and attention, and belongs to the field of computer vision. The method comprises the following steps: S1, extracting multi-layer deep features from a pre-trained convolutional neural network (CNN) to represent the video action, capturing contextual relationships between different video frames with Conv-LSTM and FC-LSTM, and performing temporal modeling of the video action; S2, enhancing the temporal saliency and spatio-temporal saliency of the action representation through the TAM and JSTAM attention modules; S3, obtaining a global video action representation containing key information from the two attention models, and using a PCA dimension-reduction algorithm to reduce the dimensionality of, and decorrelate, the high-dimensional action representation vectors; and S4, assigning different weights to the outputs of the two independent networks, the temporal attention network (TAN) and the joint spatio-temporal attention network (JSTAN), and integrating the multiple representation vectors into a final classification vector.
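
As a rough illustration of steps S1-S4, the sketch below strings together the operations named in the abstract: temporal attention (TAM) over FC-LSTM outputs, joint spatio-temporal attention (JSTAM) over Conv-LSTM outputs, PCA-style dimension reduction, and weighted fusion of the TAN and JSTAN branch outputs. All tensor shapes, the linear scoring layers, and the fusion weights are assumptions made for the example, not details taken from the patent:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed shapes: T frames; D-dim FC-LSTM outputs feed the temporal branch,
# and T x N x Dp Conv-LSTM outputs (N = H*W positions) feed the joint branch.
T, D, N, Dp, num_classes = 10, 512, 49, 64, 101
fc_lstm_out   = torch.randn(T, D)        # temporal branch input (TAN)
conv_lstm_out = torch.randn(T, N, Dp)    # joint branch input (JSTAN)

# S2a: temporal attention (TAM) - score each time step, softmax, weighted sum.
tam_score = nn.Linear(D, 1)              # untrained, illustrative scorer
alpha = F.softmax(tam_score(fc_lstm_out), dim=0)              # (T, 1)
tan_repr = (alpha * fc_lstm_out).sum(dim=0)                   # (D,)

# S2b: joint spatio-temporal attention (JSTAM) - one weight per (t, position).
jstam_score = nn.Linear(Dp, 1)
beta = F.softmax(jstam_score(conv_lstm_out).view(T * N, 1), dim=0)
jstan_repr = (beta * conv_lstm_out.view(T * N, Dp)).sum(dim=0)  # (Dp,)

# S3: PCA-style reduction/decorrelation of high-dimensional representations
# (a toy batch of 32 vectors reduced to 16 components).
batch = torch.randn(32, D)
_, _, V = torch.pca_lowrank(batch, q=16)
batch_reduced = (batch - batch.mean(dim=0)) @ V               # (32, 16)

# S4: each branch yields class scores; a weighted sum gives the final vector.
tan_cls, jstan_cls = nn.Linear(D, num_classes), nn.Linear(Dp, num_classes)
w_tan, w_jstan = 0.6, 0.4                                     # assumed weights
final_scores = (w_tan * F.softmax(tan_cls(tan_repr), dim=0)
                + w_jstan * F.softmax(jstan_cls(jstan_repr), dim=0))
predicted_class = final_scores.argmax().item()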

Description

technical field

[0001] The invention belongs to the field of computer vision, and relates to a video action classification method based on CNN-LSTM and attention.

Background technique

[0002] With the increasing maturity of Internet technology, especially the promotion of mobile Internet applications and the popularization of shooting equipment such as smartphones, digital cameras, and surveillance cameras, video has increasingly become an indispensable media form in people's daily production and life, and the video business has developed rapidly. Today's digital content is essentially multimedia information comprising text, images, audio, and video. Among these, video in particular has become a new means of communication among Internet users. As a new information carrier, it contains a wealth of human action information, and communicating information through video actions has gradually become popular. Up to now, with the support of social apps su...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06K9/00, G06K9/62, G06N3/04
CPC: G06N3/049, G06V40/20, G06N3/045, G06F18/213, G06F18/2415
Inventor: 张祖凡, 吕宗明, 甘臣权, 张家波
Owner: CHONGQING UNIV OF POSTS & TELECOMM