Three-dimensional convolution and Faster RCNN-based video action detection method

A three-dimensional convolution and action detection technique, applied in the field of image processing, that addresses the difficulty of locating actions in time and space simultaneously when spatial annotation information is lacking, and achieves action localization with excellent performance.

Inactive Publication Date: 2018-08-14

BEIJING UNIV OF TECH

View PDF4 Cites 54 Cited by

Problems solved by technology

S-CNN lacks the ability to predict at fine temporal resolution and to locate the precise temporal boundaries of action instances.

In addition, because current untrimmed datasets lack spatial annotation information, it is difficult to locate the spatial bounding box of an action at the same time as its temporal boundaries.

Examples

Embodiment 1

[0093] In the present invention, an NVIDIA GPU is used as the computing platform, CUDA is used for GPU acceleration, and Caffe is selected as the CNN framework.

[0094] S1 data preparation:

[0095] The ActivityNet 1.3 dataset is used in this experiment. The ActivityNet dataset consists only of untrimmed videos with 200 different types of activities, including 10024 videos in the training set, 4926 videos in the validation set and 5044 videos in the test set. Compared to THUMOS14, this is a large dataset, both in terms of the number of activity categories involved and the number of videos.
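As a quick check on the split sizes quoted above, the following sketch totals the three subsets and prints each one's share of the dataset. The dictionary keys are illustrative names, not identifiers from the patent; only the three counts come from paragraph [0095].

```python
# Split sizes as stated in paragraph [0095]; the key names are illustrative.
splits = {"training": 10024, "validation": 4926, "test": 5044}

total = sum(splits.values())
fractions = {name: count / total for name, count in splits.items()}

for name, count in splits.items():
    print(f"{name:<10} {count:>6} videos ({fractions[name]:.1%})")
print(f"{'total':<10} {total:>6} videos")
```

The training set accounts for roughly half of the videos, with the remainder divided almost evenly between validation and test.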

[0096] Step 1.1: Download the ActivityNet 1.3 dataset from http://activity-net.org/download.html to the local machine.

[0097] Step 1.2: Convert the downloaded videos into images at 25 frames per second (fps), and place the images of each subset in folders named after the corresponding videos.
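Step 1.2 could be carried out along the following lines. This is a minimal sketch that assumes the ffmpeg command-line tool is installed and uses its `fps` video filter; the patent does not say which tool was used, and the function names, directory layout, and `%06d.jpg` file pattern are hypothetical choices made for illustration.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, frames_root, fps=25):
    """Return (output_dir, command) for dumping one video to JPEG frames.

    Frames go into a folder named after the video, as in Step 1.2.
    The directory layout and file pattern are assumptions.
    """
    video_path = Path(video_path)
    out_dir = Path(frames_root) / video_path.stem
    pattern = out_dir / "%06d.jpg"
    command = ["ffmpeg", "-i", str(video_path), "-vf", f"fps={fps}", str(pattern)]
    return out_dir, command


def extract_frames(video_path, frames_root, fps=25):
    """Create the output folder and run ffmpeg (requires ffmpeg on PATH)."""
    out_dir, command = build_ffmpeg_command(video_path, frames_root, fps)
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(command, check=True)
    return out_dir
```

Calling `extract_frames("videos/v_abc.mp4", "frames/training")` would write the frames to `frames/training/v_abc/`; keeping command construction separate from execution makes the command easy to inspect before processing the whole dataset.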

[0098] Step 1.3: According to the data augmentation strategy, this experi...

Abstract

The invention discloses a video action detection method based on three-dimensional convolution and Faster RCNN. The method first introduces a new model that encodes a video stream using a three-dimensional fully convolutional network. It then generates candidate temporal regions containing actions from the resulting features, producing a group of candidate frames. Finally, it performs classification and detection on the candidate frames from different clips, thereby predicting the action types and the start and end times of actions in the video stream, as well as the spatial bounding boxes of the actions. Compared with existing methods, the method provided by the invention performs excellently in temporal action detection on untrimmed video datasets and can localize actions in the absence of spatial annotation information.
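Candidate temporal regions such as those mentioned in the abstract are commonly compared by temporal intersection-over-union (IoU) and deduplicated with greedy non-maximum suppression (NMS). The sketch below illustrates that general technique only; this excerpt does not state that the patent uses NMS, and the function names, the `(start, end)` representation in seconds, and the 0.5 threshold are assumptions.

```python
def temporal_iou(a, b):
    """IoU of two (start, end) segments, e.g. in seconds."""
    intersection = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


def temporal_nms(segments, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring segments, dropping any segment
    that overlaps an already kept one by more than iou_threshold.
    Returns the indices of the kept segments, best score first."""
    order = sorted(range(len(segments)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(temporal_iou(segments[i], segments[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


candidates = [(0.0, 10.0), (1.0, 11.0), (20.0, 30.0)]
confidences = [0.9, 0.8, 0.7]
kept = temporal_nms(candidates, confidences)
```

Here the second candidate overlaps the first with an IoU of 9/11, so it is suppressed and `kept` contains the indices of the first and third segments.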

Description

Technical field

[0001] The invention belongs to the technical field of image processing, and relates to a video action detection method based on three-dimensional convolution and Faster RCNN.

Background

[0002] With the vigorous development of Internet video media, video content detection and analysis has attracted extensive attention from industry and academia in recent years. Action recognition is an important branch of video content detection and analysis. In the field of computer vision, action recognition has made great progress with both hand-crafted features and deep learning features. Action recognition usually boils down to a classification problem, where each action instance in the training phase is trimmed from a longer video sequence, and the learned action model is used for action recognition in either trimmed videos (e.g., HMDB51 and UCF101) or untrimmed videos (e.g., THUMOS14 and ActivityNet). However, most videos in the real world are unrestricted...

Application Information

IPC(8): G06K9/00, G06K9/32, G06K9/62, G06N3/04

CPC: G06V40/20, G06V20/40, G06V10/25, G06N3/045, G06F18/24

Inventors: 刘波, 聂相琴

Owner BEIJING UNIV OF TECH
