Video Action Detection Method Based on Scale Attention Dilated Convolutional Network

A convolutional-network and action-detection technology applied in the field of video analysis. It addresses the increased time and space costs of network construction and training, the constrained temporal scale of extracted features, and interference between different semantics, with the aim of reducing network structure redundancy and achieving high execution efficiency.

Active Publication Date: 2021-04-30

HANGZHOU DIANZI UNIV

Problems solved by technology

[0004] Existing video action detection methods have three main deficiencies. First, in the feature extraction stage, the three-dimensional convolutions used to extract the temporal features of actions reduce the temporal dimension of the input video by a fixed amount at each layer of the network, which constrains the temporal scale of the extracted features: too small a scale may fragment contextual semantics, while too large a scale may introduce interference between different semantics. Second, the key points that determine whether an action is present and what type it is, namely the key frame positions and their durations (for example, runs of consecutive key frames), often differ, yet conventional average pooling ignores the weights of these key points. Third, existing methods use a different network structure (such as a dilated convolutional network) for segments of each scale to extract feature representations of action segments, which greatly increases the time and space costs of network construction and training.
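The first deficiency can be illustrated with simple length arithmetic. The sketch below is not taken from the patent: the kernel size, strides, dilation rates, and the helper names `conv_out_len` and `stack_summary` are assumptions chosen for illustration. It uses the standard output-length formula for a convolution along the time axis to compare a stack of stride-2 layers with a stack of stride-1 dilated layers.

```python
def conv_out_len(t, kernel=3, stride=1, dilation=1, padding=0):
    """Output length of a 1D convolution applied along the time axis."""
    return (t + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def stack_summary(t, layers):
    """Track temporal length and receptive field through a layer stack.

    `layers` is a list of (kernel, stride, dilation) tuples. Padding is set
    to dilation * (kernel - 1) // 2 (an assumption), so stride-1 layers
    preserve the temporal length.
    """
    rf, jump = 1, 1  # receptive field and cumulative stride, in input frames
    lengths = []
    for kernel, stride, dilation in layers:
        padding = dilation * (kernel - 1) // 2
        t = conv_out_len(t, kernel, stride, dilation, padding)
        rf += (kernel - 1) * dilation * jump
        jump *= stride
        lengths.append(t)
    return lengths, rf


# Hypothetical configurations for a 64-frame input.
strided = [(3, 2, 1)] * 3                      # stride 2, no dilation
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]    # stride 1, growing dilation

strided_lengths, strided_rf = stack_summary(64, strided)
dilated_lengths, dilated_rf = stack_summary(64, dilated)
print(strided_lengths, strided_rf)
print(dilated_lengths, dilated_rf)
```

Under these assumed settings both stacks reach the same receptive field of 15 frames, but the strided stack shrinks the temporal length from 64 to 32, 16, and 8, while the dilated stack keeps it at 64 throughout.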

Embodiment Construction

[0029] The present invention is further described below with reference to the accompanying drawings.

[0030] A video action detection method based on a scale attention dilated convolutional network. First, the video is sampled to obtain a frame image sequence, and video segments are obtained according to the action segment annotations. Next, a layer-scale attention action segment model and a frame position attention action recognition model are constructed. Finally, the watershed algorithm is applied to determine the action category to which each video segment belongs. The method uses the dilated convolutional network to capture the temporal and spatial motion information of video data more accurately, uses the layer-scale attention mechanism to describe the temporal context of video frames, and uses the frame position attention mechanism to learn appropriate weights for the video frames of action segments, giving a good reflection of the content of the action cl...
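The contrast between average pooling and learned per-frame weights can be sketched as follows. This is a generic softmax attention pooling example, not the patent's frame position attention model, whose details are not given in this excerpt. The frame features, the relevance scores, and the function names are all hypothetical.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def average_pool(frames):
    """Average pooling: every frame contributes equally."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def attention_pool(frames, scores):
    """Weighted pooling: frames with higher scores contribute more."""
    weights = softmax(scores)
    return [sum(w * x for w, x in zip(weights, col)) for col in zip(*frames)]


# Three hypothetical 2-dimensional frame features; the middle frame is the
# "key frame" and receives a higher (assumed) relevance score.
frames = [[0.0, 1.0], [4.0, 1.0], [0.0, 1.0]]
scores = [0.0, 2.0, 0.0]

avg = average_pool(frames)
att = attention_pool(frames, scores)
print(avg)
print(att)
```

In this toy case the first feature dimension is active only in the key frame, so the attention-pooled value for that dimension is larger than the averaged one, while the dimension shared by all frames is unchanged.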

Abstract

The invention discloses a video action detection method based on a scale attention dilated convolutional network. The method first samples the video to obtain a frame image sequence and obtains video segments according to the segment position annotations. It then constructs a layer-scale attention action segment model and a frame position attention action recognition model and, using these models in sequence together with the watershed algorithm, obtains the weighted feature representation of the frame images and the action category to which each video segment belongs, completing the video action detection task. The method uses the dilated convolutional network to extract spatio-temporal motion information that better reflects the intrinsic structure of the temporal and spatial dimensions of video data. Layer-scale attention more appropriately characterizes how the correlations within the temporal context of video frames vary with scale, and the frame position attention mechanism assigns each video frame of an action segment a weight that more accurately characterizes the key content of the segment. Together these improve both the accuracy and the efficiency of video action detection.
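This excerpt does not say how the watershed algorithm is applied. As a rough illustration only, the sketch below uses a simplified one-dimensional, single-threshold stand-in: consecutive frames whose score reaches a threshold are grouped into candidate segments. It is not the full watershed algorithm and not the patent's procedure. The per-frame scores, the threshold, and the name `group_segments` are assumptions.

```python
def group_segments(scores, threshold=0.5, min_len=1):
    """Group consecutive frames with score >= threshold into segments.

    Returns a list of (start_index, end_index) pairs, both inclusive.
    Segments shorter than `min_len` frames are discarded.
    """
    segments = []
    start = None
    for i, s in enumerate(scores):
        if s >= threshold and start is None:
            start = i  # a new candidate segment begins
        elif s < threshold and start is not None:
            if i - start >= min_len:
                segments.append((start, i - 1))
            start = None
    # Close a segment that runs to the end of the sequence.
    if start is not None and len(scores) - start >= min_len:
        segments.append((start, len(scores) - 1))
    return segments


# Hypothetical per-frame action scores.
scores = [0.1, 0.7, 0.9, 0.8, 0.2, 0.1, 0.6, 0.7, 0.3]
segments = group_segments(scores, threshold=0.5)
long_segments = group_segments(scores, threshold=0.5, min_len=3)
print(segments)
print(long_segments)
```

With these assumed scores, frames 1 to 3 and frames 6 to 7 form candidate segments; requiring at least three frames keeps only the first.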

Description

Technical field

[0001] The invention belongs to the technical field of video analysis, in particular to temporal action detection, and relates to a video action detection method based on a scale attention dilated convolutional network.

Background

[0002] Understanding human motion in video plays an important role in many fields, such as security monitoring and behavior analysis, and has become a frontier research topic in computer vision. However, unedited real-world videos often contain background segments unrelated to human actions, which interfere with the correct understanding of video content. To address this, video action detection not only classifies the actions in a video but also locates the start and end times of each action instance. Video action detection tasks usually take a sequence of video frames as input and output the detection results for multiple groups of segments in the form of "action typ...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G06K9/00, G06K9/62, G06N3/04, G06N3/08

CPC: G06N3/08, G06V40/20, G06V40/10, G06V20/46, G06V20/52, G06N3/047, G06N3/045, G06F18/241, G06F18/2415

Inventor: 李平, 曹佳晨, 陈乐聪, 徐向华

Owner: HANGZHOU DIANZI UNIV
