Space-time attention based video classification method

A video classification and attention technology, applied in the fields of instruments, computing, and character and pattern recognition. It addresses the problems that existing methods do not fully exploit the cooperation between spatial and temporal attention, ignore the connection between them, and thus limit the effect of video classification, so as to promote joint learning and improve classification accuracy.

Active Publication Date: 2017-11-07
PEKING UNIV

AI Technical Summary

Problems solved by technology

However, existing deep video classification methods cannot simultaneously model spatial and temporal saliency in videos; they ignore the connection between spatial and temporal attention, which limits the accuracy of video classification.




Detailed Description of the Embodiments

[0020] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0021] The flow of the spatio-temporal attention based video classification method of the present invention is shown in Figure 1. It specifically includes the following steps:

[0022] (1) Data preprocessing

[0023] Data preprocessing extracts frames and optical flow from the training videos and the videos to be predicted. Optical flow is the motion vector field computed from two consecutive video frames, and it can be decomposed into horizontal and vertical components. To make the motion information in the optical flow easier for the deep network to process, this embodiment stacks the horizontal and vertical components of L consecutive optical-flow fields alternately, yielding an image with 2L channels.
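To make the stacking step concrete, here is a minimal NumPy sketch; the function name stack_optical_flows and the representation of each flow field as separate x/y component arrays are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

def stack_optical_flows(flows_x, flows_y):
    """Interleave the horizontal (x) and vertical (y) components of L
    consecutive optical-flow fields into a single 2L-channel image.

    flows_x, flows_y: lists of L arrays, each of shape (H, W).
    Returns an array of shape (H, W, 2L), ordered x1, y1, x2, y2, ...
    """
    assert len(flows_x) == len(flows_y)
    channels = []
    for fx, fy in zip(flows_x, flows_y):
        channels.append(fx)  # horizontal component of one flow field
        channels.append(fy)  # vertical component of the same field
    return np.stack(channels, axis=-1)

# Example: L = 10 flow fields of size 224x224 -> one 224x224x20 input
L, H, W = 10, 224, 224
flows_x = [np.random.randn(H, W).astype(np.float32) for _ in range(L)]
flows_y = [np.random.randn(H, W).astype(np.float32) for _ in range(L)]
print(stack_optical_flows(flows_x, flows_y).shape)  # (224, 224, 20)
```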

[0024] (2) Spatio-temporal attention model construction and training

[0025] The spatio...
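The paragraph above is truncated in this excerpt. According to the abstract, the model comprises a space-domain attention network, a time-domain attention network, and a connection network. The following PyTorch sketch is a hypothetical reconstruction of such a three-part structure; all class, layer, and parameter names are assumptions rather than details from the patent.

```python
import torch
import torch.nn as nn

class SpatioTemporalAttention(nn.Module):
    """Hypothetical reconstruction of the three-part model named in the
    abstract: a space-domain attention network weighting regions within
    each frame, a time-domain attention network weighting frames, and a
    connection network mapping the attended feature to class scores."""

    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.spatial_att = nn.Linear(feat_dim, 1)   # one weight per region
        self.temporal_att = nn.Linear(feat_dim, 1)  # one weight per frame
        self.connection = nn.Sequential(
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, num_classes),
        )

    def forward(self, feats):
        # feats: (N, T, R, D) region features, e.g. from a CNN backbone,
        # for N videos, T frames, R spatial regions, D-dim descriptors.
        s = torch.softmax(self.spatial_att(feats), dim=2)        # (N, T, R, 1)
        frame_feat = (s * feats).sum(dim=2)                      # (N, T, D)
        t = torch.softmax(self.temporal_att(frame_feat), dim=1)  # (N, T, 1)
        video_feat = (t * frame_feat).sum(dim=1)                 # (N, D)
        return self.connection(video_feat)                       # (N, C)

model = SpatioTemporalAttention(feat_dim=512, num_classes=101)
feats = torch.randn(2, 8, 49, 512)  # 2 videos, 8 frames, 7x7 regions
print(model(feats).shape)           # torch.Size([2, 101])
```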



Abstract

The invention relates to a space-time attention based video classification method, which comprises: extracting frames and optical flows from the training videos and the videos to be predicted, and stacking multiple optical flows into a multi-channel image; building a space-time attention model comprising a space-domain attention network, a time-domain attention network, and a connection network; jointly training the three components of the space-time attention model so that the space-domain and time-domain attention improve each other, obtaining a model that accurately captures space-domain and time-domain saliency and is applicable to video classification; and, using the learned model, extracting the space-domain and time-domain saliency of the frames and optical flows of the video to be predicted, performing prediction, and integrating the prediction scores of the frames and the optical flows to obtain the final semantic category of the video. The method models space-domain and time-domain attention simultaneously and fully exploits their cooperation through joint training, thereby learning more accurate space-domain and time-domain saliency and improving the accuracy of video classification.
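The final step above, integrating the prediction scores of the frame and optical-flow streams, could look like the following minimal sketch; the weighted-average fusion rule and the weight w_frame are assumptions, since the excerpt does not specify how the two streams' scores are combined.

```python
import numpy as np

def fuse_two_stream_scores(frame_scores, flow_scores, w_frame=0.5):
    """Combine per-class scores from the frame (appearance) stream and
    the optical-flow (motion) stream by a weighted average; the fusion
    rule and weight are assumptions, as the excerpt does not state how
    the two streams' prediction scores are integrated."""
    fused = w_frame * frame_scores + (1.0 - w_frame) * flow_scores
    return int(np.argmax(fused))  # index of the predicted category

frame_scores = np.array([0.1, 0.6, 0.1, 0.1, 0.1])  # e.g. 5 classes
flow_scores  = np.array([0.2, 0.2, 0.4, 0.1, 0.1])
print(fuse_two_stream_scores(frame_scores, flow_scores))  # -> 1
```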

Description

Technical field

[0001] The invention relates to the technical field of video classification, and in particular to a video classification method based on spatio-temporal attention.

Background technique

[0002] With the widespread popularization and rapid development of social media and self-media, the number of videos on the Internet is growing rapidly. Research shows that in 2016 more than 300 hours of video were uploaded to YouTube every minute. The 2016 video traffic statistics and forecast report of CISCO further predicted that global video traffic would account for 82% of Internet traffic in 2020; at that point, it would take a single user five million years to watch all the video transmitted over the Internet in one month. Video and other media data have become the main body of big data, and accurately analyzing and recognizing video content is of great significance for meeting users' information acquisition needs.

[0003] Video cl...


Application Information

IPC(8): G06K9/00
CPC: G06V20/41; G06V20/46
Inventor: 彭宇新 (Peng Yuxin), 张俊超 (Zhang Junchao)
Owner: PEKING UNIV