Video action detection method, system and equipment and storage medium

An action detection and video technology, applied in the field of video analysis, which addresses problems such as the lack of interaction modeling, the lack of emphasis on local temporal correlation, and the failure to consider the internal correlation between action classes, achieving the effects of solving the multi-label problem at low computational cost and improving robustness and discrimination.

Pending Publication Date: 2022-05-13
UNIV OF SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

[0004] 1) ACRN (from the paper "Actor-Centric Relation Network") automatically mines the spatio-temporal elements in the scene related to the action performers and generates relational features for action classification, but it does not explicitly model the interactions between entities.
[0005] 2) LFB (from the paper "Long-Term Feature Banks for Detailed Video Understanding") provides long-term temporal support for the model and captures global temporal dependencies by computing long-range interactions between entities, but it does not emphasize exploiting the local temporal correlation of actions.
However, these implementations treat the multi-label classification problem directly as binary classification over multiple classes: the final prediction for each class is made independently, without considering the internal relationships between action classes.
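As a hedged illustration of the baseline criticized here (not code from the patent or the cited papers), the sketch below treats multi-label action recognition as independent per-class binary classification with a sigmoid/binary cross-entropy objective; all names and dimensions are placeholders.

```python
# Hypothetical baseline: each action class is predicted independently with its
# own binary cross-entropy term, so no relationship between classes is modeled.
import torch
import torch.nn as nn

class ActorClassifier(nn.Module):
    def __init__(self, feat_dim: int = 2048, num_classes: int = 80):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_classes)  # one logit per class

    def forward(self, actor_feats: torch.Tensor) -> torch.Tensor:
        return self.fc(actor_feats)

# Toy usage: 4 actor features, 80 action classes.
classifier = ActorClassifier()
logits = classifier(torch.randn(4, 2048))
targets = torch.randint(0, 2, (4, 80)).float()
loss = nn.BCEWithLogitsLoss()(logits, targets)  # independent per-class binary loss
```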

Method used



Examples


Embodiment 1

[0040] To address the deficiencies of the prior art, an embodiment of the present invention provides a video action detection method that integrates interaction relationships and category associations. The method explicitly models the spatial interactions, short-term temporal interactions, and long-term temporal interactions between action performers, which enhances the expressive ability of the target features and improves the recognition of interactive actions. The method accounts for the heterogeneity of the spatial and temporal dimensions and considers both local and global information along the temporal axis. For the multi-label problem, a category relationship module is designed to mine the dependencies between different action classes and to use these relationships to fuse the original category representations, making the learned representations more robust and discriminative and further improving...
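A minimal sketch of what such a category relationship module could look like, assuming PyTorch-style self-attention over per-class representations fused back into the originals through a residual connection; the class name, dimensions, and layer choices are illustrative assumptions, not the patented design.

```python
# Illustrative sketch only (assumed names and sizes): self-attention over
# per-class representations mines pairwise dependencies between action classes,
# and the attended output is fused with the originals via a residual connection.
import torch
import torch.nn as nn

class CategoryRelationModule(nn.Module):
    def __init__(self, dim: int = 512, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, class_reprs: torch.Tensor) -> torch.Tensor:
        # class_reprs: (batch, num_classes, dim), one vector per action class
        related, _ = self.attn(class_reprs, class_reprs, class_reprs)
        return self.norm(class_reprs + related)  # fused, more discriminative representations

# Toy usage: 80 action classes, 512-d representations.
fused = CategoryRelationModule()(torch.randn(2, 80, 512))
```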

Embodiment 2

[0101] The present invention also provides a video action detection system, which is mainly implemented based on the method provided in Embodiment 1. As shown in Figure 6, the system mainly includes the following modules (a structural sketch follows this list):

[0102] A video data acquisition module, configured to acquire video clips and determine key frames in the video clips;

[0103] The feature extraction network part, whose input is the video clip, is used to obtain the regional features corresponding to all detection boxes in the key frame through object detection and feature extraction;

[0104] The short-term interaction module, whose input is the regional features corresponding to all detection boxes in the key frame, is used to model interactions in the spatial dimension and the temporal dimension respectively, obtaining enhanced features;

[0105] The long-term interaction module, whose input is the regional features corresponding to all detection boxes in the key frame together with the enhanced features, is u...
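The following is a minimal structural sketch of how these modules could be composed, assuming PyTorch; the module boundaries mirror the description above, but the internals and names are placeholders rather than the patent's implementation.

```python
# Structural sketch, assuming PyTorch; the submodules are stand-ins supplied by
# the caller, and their internals are not specified by this embodiment's text.
import torch
import torch.nn as nn

class VideoActionDetectionSystem(nn.Module):
    def __init__(self, backbone, short_term, long_term, classifier):
        super().__init__()
        self.backbone = backbone      # feature extraction network part
        self.short_term = short_term  # spatial + short-term temporal interaction module
        self.long_term = long_term    # long-term temporal interaction module
        self.classifier = classifier  # e.g. ends with a category-relation module

    def forward(self, clip: torch.Tensor, boxes: torch.Tensor) -> torch.Tensor:
        region_feats = self.backbone(clip, boxes)          # features for key-frame detection boxes
        enhanced = self.short_term(region_feats)           # short-term interaction features
        enhanced = self.long_term(region_feats, enhanced)  # add long-term temporal support
        return self.classifier(enhanced)                   # per-class action scores

# Toy usage with trivial stand-ins (real detection/feature networks would replace these):
system = VideoActionDetectionSystem(
    backbone=lambda clip, boxes: torch.randn(boxes.shape[0], 512),
    short_term=lambda feats: feats,
    long_term=lambda feats, enhanced: enhanced,
    classifier=nn.Linear(512, 80),
)
scores = system(torch.randn(1, 3, 16, 224, 224), torch.zeros(5, 4))  # 5 boxes -> (5, 80) scores
```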

Embodiment 3

[0111] The present invention also provides a processing device. As shown in Figure 7, it mainly includes: one or more processors; and a memory for storing one or more programs; wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the methods provided in the foregoing embodiments.

[0112] Furthermore, the processing device also includes at least one input device and at least one output device; within the processing device, the processor, memory, input device, and output device are connected via a bus.

[0113] In the embodiment of the present invention, the specific types of the memory, input device and output device are not limited; for example:

[0114] The input device can be a touch screen, an image acquisition device, a physical button or a mouse, etc.;

[0115] The output device can be a display terminal;

[0116] The memory may be random access memory (Random Access Memory, RAM), or non-volatile memory...



Abstract

The invention discloses a video action detection method, system, device, and storage medium. On one hand, the interaction relationships between action performers are modeled in a targeted manner, making full use of the spatio-temporal characteristics of the video signal; this greatly enhances the expressive ability of the target features and improves the recognition of interactive actions. On the other hand, the dependency relationships among different action classes are mined using a self-attention mechanism, which further improves the robustness and discrimination of the original class representations and solves the multi-label problem at relatively low computational cost.

Description

Technical field

[0001] The present invention relates to the technical field of video analysis, and in particular to a video action detection method, system, device, and storage medium.

Background technique

[0002] With the popularization of electronic shooting equipment, a large amount of video data is generated every day, and most of these videos contain human-centered actions. Such videos are widely used in intelligent robots, security monitoring, automatic driving, and other fields, and how to understand and analyze the content of massive videos has become a hot research topic. Video action detection is one of its key technologies: given a video, the task is to detect the spatial positions of all people in the video and determine their corresponding action categories. It has great application value and research significance in real-world scenarios.

[0003] At present, for video action detection tasks, most research work is carried out in two ste...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06V40/20, G06V10/764, G06V10/62, G06V10/80, G06V10/82, G06K9/62
CPC: G06F18/254, G06F18/2415
Inventor: 王子磊 (Wang Zilei), 贺楚景 (He Chujing)
Owner: UNIV OF SCI & TECH OF CHINA