
Video behavior recognition method and system based on channel attention-oriented time modeling

A recognition method using attention technology, applicable to character and pattern recognition, neural learning methods, biological neural network models, etc.; it addresses problems such as heavy computational burden and large amounts of computation.

Active Publication Date: 2021-05-18
SHANDONG UNIV
  • Summary
  • Abstract
  • Description
  • Claims
  • Application Information

AI Technical Summary

Problems solved by technology

3D convolutional networks achieve good results in video action recognition, but they also introduce a large number of model parameters, resulting in a heavy computational burden. Existing techniques therefore propose decomposing the 3D convolution kernel into a 2D spatial kernel and a 1D temporal kernel; however, these methods still incur heavy computation due to the use of 1D convolution.

Method used


Examples


Embodiment 1

[0054] In one or more embodiments, a video behavior recognition method based on channel attention-oriented temporal modeling is disclosed. Referring to figure 1, the method includes the following steps:

[0055] (1) Obtain the convolution feature map of the input behavior video;

[0056] (2) Generate channel attention weights and use them to adjust the input video convolution feature map;

[0057] (3) Select the feature channels whose attention weights exceed a set threshold for residual temporal modeling: compute the residuals of the spatial features of adjacent frames in these channels to build a temporal correlation model between them, capturing the motion dynamics of human actions over time so as to learn the temporal relationships of the video and obtain a more discriminative video feature representation;

[0058] (4) Perform video behavior recognition based on the obtained feature representation.

[0059] Specifically, given a convolutional feature...
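The excerpt above is truncated, but steps (1)–(3) can be sketched in a minimal NumPy form. The sigmoid gating function, the selection threshold, and the `(T, C, H, W)` tensor layout below are illustrative assumptions, not the patent's actual design:

```python
import numpy as np

def channel_attention(feat):
    """Step (2), sketched: one attention weight per feature channel.

    feat: (T, C, H, W) convolutional feature map of a video clip.
    Global average pooling over time and space yields a per-channel
    descriptor; a sigmoid (an assumed gating choice) maps it to (0, 1).
    """
    desc = feat.mean(axis=(0, 2, 3))          # (C,)
    weights = 1.0 / (1.0 + np.exp(-desc))     # (C,) attention weights
    return weights

def residual_temporal_modeling(feat, weights, thresh=0.5):
    """Step (3), sketched: residuals of adjacent frames on strong channels.

    Re-weights the feature map by channel attention, selects channels
    whose weight exceeds `thresh` (a hypothetical threshold), and takes
    frame-to-frame differences there to capture motion dynamics.
    """
    out = feat * weights[None, :, None, None]     # attention-adjusted map
    sel = weights > thresh                        # strongly attended channels
    motion = out[1:, sel] - out[:-1, sel]         # (T-1, C_sel, H, W) residuals
    return out, motion
```

In a real network these operations would be learned layers embedded in a 2D CNN backbone; the sketch only shows the data flow the four steps describe.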

Embodiment 2

[0136] In one or more embodiments, a video behavior recognition system based on channel attention-oriented temporal modeling is disclosed, characterized in that it includes:

[0137] The data acquisition module is used to obtain the convolution feature map of the input behavior video;

[0138] Channel attention generation (CAG) module, used to obtain channel weights and adjust the original input video convolution feature map;

[0139] The Residual Temporal Modeling (RTM) module is used to select the feature channels whose attention weights exceed a set threshold for residual temporal modeling: it computes the residuals of the spatial features of adjacent frames in these channels to build a temporal correlation model between them, learning the temporal relationships of the video by capturing the dynamics of human motion over time and thereby obtaining a more discriminative video feature representation;

[0140] The video behavior recognition mo...
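As a rough sketch of how the modules enumerated above fit together, the following wiring uses hypothetical placeholder callables; the class name and interfaces are illustrative, not the patent's implementation:

```python
class VideoBehaviorRecognizer:
    """Hypothetical composition of the system's four modules."""

    def __init__(self, backbone, cag, rtm, classifier):
        self.backbone = backbone      # data acquisition: video -> feature map
        self.cag = cag                # channel attention generation (CAG)
        self.rtm = rtm                # residual temporal modeling (RTM)
        self.classifier = classifier  # video behavior recognition module

    def recognize(self, video):
        feat = self.backbone(video)           # convolution feature map
        weights = self.cag(feat)              # channel attention weights
        feat_repr = self.rtm(feat, weights)   # discriminative representation
        return self.classifier(feat_repr)     # predicted behavior label
```

Any 2D backbone producing per-frame feature maps could fill the `backbone` slot, which matches the abstract's claim that the module embeds flexibly into existing 2D network structures.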

Embodiment 3

[0146] In one or more embodiments, a terminal device is disclosed, including a server. The server includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, it implements the video behavior recognition method based on channel attention-oriented temporal modeling of the first embodiment. For the sake of brevity, details are not repeated here.



Abstract

The invention discloses a video behavior recognition method and system based on channel attention-oriented temporal modeling, providing a new video-level channel attention generation method that learns the differences between feature channels from the input video convolution feature map. Under the guidance of channel attention, the method ranks the generated attention scores by importance, computes residuals of the spatial features of adjacent frames in the strongly discriminative feature channels to capture the motion dynamics of human actions over time, and captures the video's temporal dependencies by building a temporal relation model of adjacent frames. This realizes efficient modeling of the video's temporal structure and generates a more discriminative video feature representation, on the basis of which video behavior recognition is performed. The proposed channel attention-oriented residual temporal modeling module can be flexibly embedded into many existing 2D network structures, endowing a 2D network with efficient temporal modeling capability and improving video behavior recognition performance.

Description

Technical field

[0001] The invention relates to the technical field of video behavior recognition, and in particular to a video behavior recognition method and system based on channel attention-oriented temporal modeling.

Background technique

[0002] The statements in this section merely provide background information related to the present invention and do not necessarily constitute prior art.

[0003] As an important research field of computer vision, video behavior recognition has received increasing attention in recent years due to its wide application in video surveillance, video understanding, human behavior analysis, and so on. Compared with image-based vision tasks that use only spatial information, temporal structure modeling is crucial for video behavior recognition, because video data is high-dimensional and a single image is not enough to express an entire behavior. Therefore, video action recognition depends heavily on efficient learning ...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06K9/00; G06K9/62; G06F30/27; G06N3/04; G06N3/08
CPC: G06F30/27; G06N3/049; G06N3/08; G06F2119/12; G06V20/42; G06V20/46; G06F18/24
Inventors: 马昕, 武寒波, 宋锐, 荣学文, 李贻斌
Owner SHANDONG UNIV