
Video space-time action positioning method based on progressive attention hypergraph

A spatio-temporal action localization method using attention technology, applied in the field of computer vision. It addresses problems such as high computational overhead and reduced confidence for short-duration actions, and achieves an accurate action recognition rate, improved operating efficiency, and guaranteed computational efficiency.

Active Publication Date: 2022-08-09
HANGZHOU DIANZI UNIV
Cites: 11 · Cited by: 1

AI Technical Summary

Problems solved by technology

[0004] The shortcomings of the above spatio-temporal action localization methods are mainly manifested in three aspects: (1) Although a long-term feature library with a fixed window size can capture the first-order relationships of long-term targets well, for short-duration actions an excessively large time range in the library causes the model to extract context-irrelevant features, reducing the accuracy of short-duration action representations. (2) The influence of high-order relationships on recognizing the action category at the current moment decreases as the time interval grows, while their computational cost increases with it; it is therefore difficult to construct long-term high-order target relationships while meeting the model's strict real-time requirements. (3) A traditional graph structure can only represent pairwise relationships and struggles to describe the complex and diverse high-order relationships among targets.
Therefore, there is an urgent need for a spatio-temporal action localization method that adaptively adjusts the window size according to the original duration of an action, so as to solve both the low confidence of short-duration actions caused by an inaccurate capture range for first-order target relationships and the high computational overhead caused by an unreasonable description of high-order target relationships, while correctly reflecting the high-order relationships among targets.
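To make the contrast in point (3) concrete, the sketch below builds a hypergraph incidence matrix in Python and runs one simple feature-smoothing step over it. The target grouping, feature dimensions, and degree-normalized propagation rule are illustrative assumptions, not the patent's formulation.

```python
import numpy as np

# A pairwise graph edge joins exactly two targets, so an adjacency
# matrix cannot express "targets 0, 1 and 3 interact as one group".
# A hypergraph incidence matrix H (num_targets x num_hyperedges)
# lets a single hyperedge cover any number of targets.
num_targets = 4
hyperedges = [
    {0, 1, 3},  # e.g. three people sharing one action context (made up)
    {1, 2},     # an ordinary pairwise relation is just a 2-node hyperedge
]
H = np.zeros((num_targets, len(hyperedges)))
for e, members in enumerate(hyperedges):
    for v in members:
        H[v, e] = 1.0

# One degree-normalized smoothing step: X' = Dv^-1 H De^-1 H^T X,
# mixing features among targets that share a hyperedge.
X = np.random.randn(num_targets, 8)   # toy per-target features
De = np.diag(1.0 / H.sum(axis=0))     # inverse hyperedge degrees
Dv = np.diag(1.0 / H.sum(axis=1))     # inverse vertex degrees
X_smoothed = Dv @ H @ De @ H.T @ X
print(X_smoothed.shape)               # (4, 8)
```

In this representation, higher-order interaction is native: adding a fourth participant to a group action changes one column of H rather than requiring three extra pairwise edges.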




Embodiment Construction

[0049] The present invention will be further described below with reference to the accompanying drawings.

[0050] As shown in Figure 1, the video spatio-temporal action localization method based on the progressive attention hypergraph first uniformly samples the original video and uses a convolutional neural network to extract target-region features and the video spatio-temporal feature map; a spatio-temporal relation encoder then yields the target context features and the spatio-temporal relation matrix. The target context features and relation matrix are fed into the progressive variable-length sliding-window module to obtain first-order target features consistent with the original action; in parallel, a hypergraph module with shared-attribute constraints generates short-term high-order target features. Finally, the target action regression module outputs the spatial positions and action categories of all targets at different moments.
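A minimal skeleton of this pipeline is sketched below in PyTorch. Every concrete choice here is an assumption for illustration: a single Conv3d stands in for the CNN backbone, multi-head attention stands in for the spatio-temporal relation encoder, and linear heads stand in for the regression module; the progressive variable-length window and hypergraph modules are only marked where they would act.

```python
import torch
import torch.nn as nn

class SpatioTemporalActionLocalizer(nn.Module):
    # Hypothetical skeleton; module internals are not from the patent.
    def __init__(self, feat_dim=256, num_classes=80):
        super().__init__()
        # stand-in for the backbone that yields target-region features
        # and the video spatio-temporal feature map
        self.backbone = nn.Conv3d(3, feat_dim, kernel_size=3, padding=1)
        # stand-in for the spatio-temporal relation encoder
        self.relation_encoder = nn.MultiheadAttention(
            feat_dim, num_heads=4, batch_first=True)
        # regression head: box coordinates + action class scores
        self.box_head = nn.Linear(feat_dim, 4)
        self.cls_head = nn.Linear(feat_dim, num_classes)

    def forward(self, frames):
        # frames: (B, 3, T, H, W) uniformly sampled from the raw video
        fmap = self.backbone(frames)                    # (B, C, T, H, W)
        # collapse space, keep one token per frame as toy "targets"
        tokens = fmap.mean(dim=(3, 4)).transpose(1, 2)  # (B, T, C)
        # context features + attention weights play the role of the
        # target context features and spatio-temporal relation matrix
        ctx, relation_matrix = self.relation_encoder(tokens, tokens, tokens)
        # the progressive variable-length window and hypergraph modules
        # would refine ctx here; omitted in this sketch
        return self.box_head(ctx), self.cls_head(ctx), relation_matrix

model = SpatioTemporalActionLocalizer()
boxes, scores, rel = model(torch.randn(1, 3, 8, 32, 32))
print(boxes.shape, scores.shape, rel.shape)  # (1,8,4) (1,8,80) (1,8,8)
```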



Abstract

The invention discloses a video spatio-temporal action localization method based on a progressive attention hypergraph. The method comprises the following steps: first, sampling a given original video to obtain a frame sequence, and obtaining target-region features and a video spatio-temporal feature map by using a convolutional neural network; obtaining target context features and a spatio-temporal relation matrix through a spatio-temporal relation encoder; generating long-term first-order target features with a progressive variable-length window module; meanwhile, obtaining short-term high-order target features through a hypergraph module with shared-attribute constraints and a diffusion mechanism; and finally, outputting the spatial positions and action categories of all targets at different moments with a target action regression module. The method adaptively adjusts the window size according to the original duration of an action to obtain first-order target features consistent with that duration, captures potential relationships among targets through the hypergraph module, effectively exploits target interaction relationships, and improves the accuracy of video spatio-temporal action localization.
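The adaptive window idea in the abstract can be illustrated with a toy stopping rule: grow the temporal window around the key frame step by step and stop once the pooled context stops agreeing with the key-frame feature, so short actions keep a short window. The cosine-similarity criterion and tolerance below are assumptions; the patent's actual attention-based progressive mechanism is not reproduced here.

```python
import numpy as np

def progressive_window(features, key_idx, max_half=16, tol=0.05):
    # features: (T, C) per-frame target features; key_idx: current frame.
    # Hypothetical criterion: cosine similarity between the key-frame
    # feature and the mean-pooled window context.
    key = features[key_idx]
    best_half, best_score = 1, -np.inf
    for half in range(1, max_half + 1):
        lo = max(0, key_idx - half)
        hi = min(len(features), key_idx + half + 1)
        ctx = features[lo:hi].mean(axis=0)
        score = ctx @ key / (np.linalg.norm(ctx) * np.linalg.norm(key) + 1e-8)
        if score < best_score - tol:   # added context turned irrelevant: stop
            break
        if score > best_score:
            best_half, best_score = half, score
    return best_half                    # half-window adapted to the action

feats = np.random.randn(64, 128)        # toy feature sequence
print(progressive_window(feats, key_idx=30))
```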

Description

Technical field

[0001] The invention belongs to the technical field of computer vision, in particular the field of action localization in video processing, and relates to a video spatio-temporal action localization method based on a progressive attention hypergraph.

Background technique

[0002] The rapid rise of the self-media industry has produced massive amounts of video-based multimedia data. Compared with traditional image-and-text data, video has gradually become the mainstream media form thanks to its rich visual content and intuitive expression. However, massive videos contain a great deal of complex scene information, such as large numbers of targets and complex actions. How to quickly and accurately recognize and localize the action categories of all targets in complex scenes has therefore become an important research direction, known as the task of spatio-temporal action localization. This task takes as input a long, unedited video that may contain multiple targets...


Application Information

IPC(8): G06V20/40; G06V40/20; G06V10/82; G06N3/04; G06N3/08
CPC: G06V20/46; G06V40/20; G06V10/82; G06N3/08; G06V2201/07; G06N3/045
Inventors: 叶兴超 (Ye Xingchao), 李平 (Li Ping), 曹佳晨 (Cao Jiachen)
Owner: HANGZHOU DIANZI UNIV