Human action recognition method and system based on multi-scale features

A multi-scale feature and action recognition technology in the field of image processing, addressing the problem that existing methods do not perform independent feature modeling along the temporal and spatial dimensions of a video sequence, and achieving reduced computing resource consumption, improved performance, and a lighter computing load.

Pending Publication Date: 2022-07-12
STATE GRID SHANDONG ELECTRIC POWER +1


Problems solved by technology

In recent years, the Transformer has been introduced from natural language processing into image processing tasks, but studies on video understanding tasks remain relatively few. Most existing methods perform the self-attention operation after aggregating features along the feature dimension, and do not perform independent feature modeling along the temporal and spatial dimensions of the video sequence.



Examples


Embodiment approach

[0043] Embodiments of the invention and features of the embodiments may be combined with each other without conflict.

[0044] Nowadays, action recognition algorithms are deployed in diverse application scenarios, such as power plants, factories, and streets, where background environments are relatively complex. For spatiotemporal feature extraction, the algorithm must locate humans in different environments and identify the features that contribute most to action classification. Owing to the diversity of human actions, a well-performing action recognition algorithm must recognize fine-grained actions. Different actions take markedly different amounts of time to perform, which requires the algorithm to consider both short-term and long-term features in the time dimension. Due to the requirements of current monitoring-level equipment, action recognition a...

Embodiment 1

[0046] Referring to Figures 1 and 2, this embodiment discloses a human action recognition method based on multi-scale features, comprising:

[0047] For a video sequence containing T frames, the model first extracts the features of each frame through a 2D convolutional neural network, obtaining a feature representation of dimension T×HW;
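The per-frame extraction and stacking described above can be sketched as follows. The backbone here is a toy stand-in for the 2D convolutional network (the text does not specify which backbone is used), so the numbers are illustrative only:

```python
import numpy as np

def extract_frame_features(video, backbone):
    """Apply a 2D backbone to each frame and stack into a T x HW map.

    video: array of shape (T, C, H_in, W_in). `backbone` maps one frame
    to an (H, W) spatial feature map; any 2D CNN could be substituted.
    """
    feats = []
    for frame in video:
        fmap = backbone(frame)          # (H, W) spatial feature map
        feats.append(fmap.reshape(-1))  # flatten to a 1 x HW vector
    return np.stack(feats)              # stack T frames -> (T, HW)

# Hypothetical stand-in backbone: 2x2 average pooling of channel 0.
def toy_backbone(frame):
    c0 = frame[0]
    H, W = c0.shape
    return c0.reshape(H // 2, 2, W // 2, 2).mean(axis=(1, 3))

video = np.random.rand(8, 3, 16, 16)    # T=8 frames
F = extract_frame_features(video, toy_backbone)
print(F.shape)                          # (8, 64): T x HW
```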

[0048] Take a local window ΔT in the time dimension, perform self-attention calculation within the window, and obtain the maximum response R1 based on the primary local features.
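A minimal NumPy sketch of the windowed self-attention step. The single-head formulation, the non-overlapping windows, and the use of the per-frame maximum activation as the "maximum response" are assumptions, since the text does not fix these details:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def windowed_self_attention(F, window):
    """Self-attention within non-overlapping temporal windows of size ΔT.

    F: (T, D) per-frame features. Attention is computed only among the
    frames inside each window, keeping the cost linear in T.
    """
    T, D = F.shape
    out = np.empty_like(F)
    for s in range(0, T, window):
        X = F[s:s + window]                 # (ΔT, D) window of frames
        A = softmax(X @ X.T / np.sqrt(D))   # (ΔT, ΔT) attention weights
        out[s:s + window] = A @ X           # attended features
    return out

F = np.random.rand(8, 16)                   # T=8 frames, D=16 features
R1 = windowed_self_attention(F, window=4)   # ΔT = 4
max_resp = R1.max(axis=1)                   # strongest activation per frame
print(R1.shape, max_resp.shape)
```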

[0049] A shift operation is performed on the primary features and self-attention is computed again, expanding the receptive field of the model.
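The shift itself can be illustrated as a circular roll of the frame axis, in the style of shifted-window (Swin) attention, so that the next round of windowed attention mixes frames from adjacent windows. The half-window shift size is an assumption:

```python
import numpy as np

def shifted_windows(F, window):
    """Roll the temporal axis by half a window before re-windowing.

    Frames that sat at a window boundary now share a window with their
    neighbours from the adjacent window, enlarging the temporal
    receptive field. The inverse shift is np.roll(..., +shift, axis=0).
    """
    shift = window // 2
    return np.roll(F, -shift, axis=0)

F = np.arange(8)[:, None].astype(float)  # 8 frames, 1-d feature each
print(shifted_windows(F, 4).ravel())     # [2. 3. 4. 5. 6. 7. 0. 1.]
```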

[0050] The above HW represents the size of the feature map extracted by the neural network. Each frame's feature map is reshaped into a feature vector of dimension 1×HW, and the T frames are then stacked to form a feature map of dimension T×HW. The shift operation is an operation of the SwinTra...

Embodiment 2

[0072] The purpose of this embodiment is to provide a computing device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor implements the steps of the above method when executing the program.



Abstract

The invention provides a human action recognition method and system based on multi-scale features. The method comprises the following steps: acquiring spatial features of a video sequence using a convolutional network; taking a local window in the time dimension over the spatial features and performing calculation within the window to obtain the maximum response based on primary local features; processing this maximum response to obtain secondary features; taking a local window over the secondary features in the time dimension and performing calculation within the window to obtain the maximum response based on secondary local features; and obtaining the final action classification by weighted fusion of the maximum responses based on the primary and secondary local features, the obtained features corresponding to short-term, medium-term, and long-term features of the video sequence. Short-term and long-term modeling is carried out on the time sequence, improving the performance of the action recognition algorithm.
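The weighted fusion of the multi-scale maximum responses can be sketched as follows. The fixed fusion weights and the linear classifier are illustrative stand-ins (the abstract states only that fusion is weighted, not how the weights or classifier are obtained):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_and_classify(responses, weights, W_cls):
    """Weighted fusion of short/medium/long-term maximum responses.

    responses: list of (D,) response vectors, one per temporal scale;
    weights: per-scale scalars (assumed fixed here; they could equally
    be learned); W_cls: (D, num_classes) linear classifier.
    """
    fused = sum(w * r for w, r in zip(weights, responses))
    return softmax(fused @ W_cls)  # class probabilities

rng = np.random.default_rng(0)
responses = [rng.random(16) for _ in range(3)]  # short/mid/long-term
probs = fuse_and_classify(responses, [0.5, 0.3, 0.2], rng.random((16, 5)))
print(probs.shape, probs.sum())  # (5,) probabilities summing to 1
```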

Description

Technical field

[0001] The invention belongs to the technical field of image processing, and particularly relates to a method and system for human action recognition based on multi-scale features.

Background technique

[0002] The statements in this section merely provide background information related to the present invention and do not necessarily constitute prior art.

[0003] The action recognition task, as one of the main applications of computer vision, has a wide range of uses in real life. Lightweight action recognition algorithms can be deployed on edge devices, and action recognition algorithms can be applied to real-world scenarios such as factories and streets. Video surveillance and behavior prediction have become new research hotspots, but they also bring great challenges. Action recognition aims to identify the type of action performed by the observed person from a small number of video frames over a short period of time. The research on action r...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06V40/20; G06V10/44; G06K9/62; G06N3/04; G06N3/08
CPC: G06N3/08; G06N3/045; G06F18/253
Inventor 焦敏亓振亮谭冲张伟李晓磊亓鹏陈顺东崔建丁利朝何鹏王洪瑞张文利
Owner STATE GRID SHANDONG ELECTRIC POWER