Method for detecting video action based on convolutional neural network

A convolutional neural network and action detection technique, applied in the field of video action detection based on convolutional neural networks, which can solve problems such as destroyed action continuity, 3D CNNs failing to learn motion features, and increased intra-class variation.

Inactive Publication Date: 2017-06-27
PEKING UNIV SHENZHEN GRADUATE SCHOOL

AI Technical Summary

Problems solved by technology

[0011] (1) Sparse sampling destroys the continuity within an action, so the 3D CNN cannot learn good motion features;
[0012] (2) Video clips obtained at different sampling frequencies share one network for training, which increases intra-class differences, adds to the network's learning burden, and requires a more complex network and more training data.



Embodiment Construction

[0041] The present invention is described in further detail below through embodiments in conjunction with the accompanying drawings, which do not limit the scope of the present invention in any way.

[0042] The invention provides a video action detection method based on a convolutional neural network. By adding a spatiotemporal pyramid pooling layer to the traditional network structure, the method removes the network's restriction on input size, speeds up training and testing, and better mines the motion information of actions in the video, yielding improved performance in both video action classification and temporal localization. The present invention does not require the input video segments to have the same size.
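To make the role of the spatiotemporal pyramid pooling layer concrete, below is a minimal PyTorch sketch of such a layer, assuming it extends the spatial pyramid pooling idea to the temporal dimension using adaptive max pooling; the pyramid levels (1x1x1, 2x2x2, 4x2x2) and the choice of max pooling are illustrative assumptions, not values taken from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatioTemporalPyramidPooling(nn.Module):
    """Pools a variable-size (N, C, T, H, W) feature map into a fixed-length vector
    by applying adaptive max pooling at several pyramid levels and concatenating."""

    def __init__(self, levels=((1, 1, 1), (2, 2, 2), (4, 2, 2))):
        super().__init__()
        self.levels = levels  # (temporal bins, height bins, width bins) per level -- assumed values

    def forward(self, x):
        n, c = x.shape[:2]
        pooled = []
        for t_bins, h_bins, w_bins in self.levels:
            # Adaptive pooling makes the output grid independent of clip length and frame size.
            p = F.adaptive_max_pool3d(x, output_size=(t_bins, h_bins, w_bins))
            pooled.append(p.reshape(n, -1))
        # Fixed length: C * sum(t*h*w over levels), regardless of the input T, H, W.
        return torch.cat(pooled, dim=1)
```

Because each level pools to a fixed grid, the concatenated vector has the same length for any clip length and frame size, which is what allows fully connected layers to follow without resizing the input.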

[0043] As shown in Figure 2, a traditional convolutional neural network requires input video clips of the same size, so the clips must be down-sampled before being fed into the network. The present invention instead removes the downsampling process and inser...
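Continuing the sketch above, the following hypothetical usage shows why the downsampling step can be dropped: feature maps from clips of different temporal lengths pool to vectors of identical length (all shapes are made up for illustration).

```python
# Reuses the SpatioTemporalPyramidPooling sketch above; shapes are illustrative only.
stpp = SpatioTemporalPyramidPooling()
short_clip = torch.randn(1, 256, 8, 14, 14)    # conv feature map of an 8-frame clip
long_clip = torch.randn(1, 256, 32, 14, 14)    # conv feature map of a 32-frame clip
print(stpp(short_clip).shape)   # torch.Size([1, 6400])
print(stpp(long_clip).shape)    # torch.Size([1, 6400])
```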


Abstract

The invention discloses a method for detecting video actions, relating to the technical field of computer vision recognition. The video action detection method is based on a convolutional neural network: a spatiotemporal pyramid pooling layer is added to the network structure, the network's restrictions on its input are eliminated, training and detection are accelerated, and the performance of video action classification and temporal localization is improved. The convolutional neural network includes convolutional layers, ordinary pooling layers, spatiotemporal pyramid pooling layers and fully connected layers, and its output includes a category classification output layer and a temporal localization output layer. With the method provided by the invention, video clips of different temporal lengths need not be obtained through downsampling; instead, a whole video is input directly in one pass, which improves efficiency. At the same time, since the network is trained on video clips of the same frequency, intra-class differences are not increased, the learning burden of the network is reduced, the model converges faster, and the detection effect is better.
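As a rough illustration of the layer list in the abstract (convolutional layers, ordinary pooling layers, a spatiotemporal pyramid pooling layer, fully connected layers, and two output heads for category classification and temporal localization), here is a hedged PyTorch sketch; the layer count, channel widths, and the two-value localization head are assumptions, and it reuses the SpatioTemporalPyramidPooling class sketched earlier.

```python
import torch.nn as nn


class ActionDetectionNet(nn.Module):
    """Sketch of the architecture named in the abstract: 3D conv + ordinary pooling,
    a spatiotemporal pyramid pooling layer, fully connected layers, and two output
    heads. Depths, channel sizes and the (start, end) localization output are assumed."""

    def __init__(self, num_classes, levels=((1, 1, 1), (2, 2, 2), (4, 2, 2))):
        super().__init__()
        self.features = nn.Sequential(            # convolutional + ordinary pooling layers
            nn.Conv3d(3, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),
            nn.Conv3d(64, 128, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=2),
        )
        self.stpp = SpatioTemporalPyramidPooling(levels)  # from the earlier sketch
        feat_dim = 128 * sum(t * h * w for t, h, w in levels)
        self.fc = nn.Sequential(nn.Linear(feat_dim, 2048), nn.ReLU(inplace=True))
        self.cls_head = nn.Linear(2048, num_classes)      # category classification output layer
        self.loc_head = nn.Linear(2048, 2)                # temporal localization output (start, end)

    def forward(self, clip):                              # clip: (N, 3, T, H, W), T/H/W may vary
        x = self.features(clip)
        x = self.stpp(x)
        x = self.fc(x)
        return self.cls_head(x), self.loc_head(x)
```

Because the pyramid pooling output length is fixed, the same fully connected head serves clips of any temporal extent, which matches the abstract's point that downsampling to a fixed clip length is unnecessary.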

Description

Technical field

[0001] The invention relates to computer vision recognition technology, and in particular to a video action detection method based on a convolutional neural network.

Background technique

[0002] In recent years, almost everyone has owned a mobile phone, and every mobile phone has a camera. Coupled with the development of the Internet and advances in communication technology, people are increasingly fond of shooting videos and sharing them online, so the number of videos is growing explosively, and video storage and analysis technology has become very important.

[0003] Video action detection refers to classifying the actions in a video and giving the start time and end time of each action, as shown for example in Figure 1. In recent years, the task of video action recognition has made great progress, but it is mainly suited to cropped videos, that is, videos that contain only one action and no redundant frames. As a result, scholars began to stud...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06K9/00, G06K9/62, G06N3/04, G06V10/764
CPC: G06N3/04, G06V20/42, G06F18/24, G06F18/214, G06N3/084, G06V20/41, G06V10/82, G06V10/764, G06N3/045, G06F17/15, G06F17/18, G06N3/08, G06N3/047
Inventor: 王文敏, 李志豪, 王荣刚, 李革, 董胜富, 王振宇, 李英, 赵辉, 高文
Owner: PEKING UNIV SHENZHEN GRADUATE SCHOOL