Video action detection method, system and equipment and storage medium

A video action detection technology, applied in the field of video analysis, that addresses the lack of explicit interaction modeling, the lack of emphasis on local temporal correlation, and the failure to consider correlations between action classes. It handles the multi-label problem at low computational cost and improves the robustness and discriminability of the learned representations.

Pending Publication Date: 2022-05-13

UNIV OF SCI & TECH OF CHINA


AI Technical Summary

Problems solved by technology

[0004] 1) ACRN (the paper "Actor-Centric Relation Network") uses the network to automatically mine the spatio-temporal elements in the scene that are related to the action performers and generates relational features for action classification, but it does not explicitly model interactions between entities.

[0005] 2) LFB (the paper "Long-Term Feature Banks for Detailed Video Understanding") provides long-term temporal support for the model and models global temporal dependencies by computing long-range interactions between entities, but it does not emphasize the local temporal correlation of actions.

However, this implementation treats the multi-label classification problem as a set of binary classification problems, one per class. In the final prediction, each class is predicted independently, without considering the intrinsic relationships between action classes.
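The limitation described above can be illustrated with a minimal sketch of per-class binary prediction. The function names, the 0.5 threshold, and the example logits are assumptions made for illustration; they do not come from the patent text.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_independent(logits, threshold=0.5):
    """Treat every action class as its own binary problem.

    Each decision depends only on that class's logit, so no
    relationship between classes can influence the result.
    The threshold value is an illustrative assumption.
    """
    return [sigmoid(z) >= threshold for z in logits]


# Hypothetical logits for three action classes.
labels = predict_independent([2.0, -1.0, 0.3])

# Changing the second logit leaves the other two decisions untouched.
labels_changed = predict_independent([2.0, 5.0, 0.3])
```

Because each class is thresholded on its own, co-occurring actions cannot reinforce one another, which is the gap the category relationship modeling is meant to address.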


Examples


Embodiment 1

[0040] To address the deficiencies of the prior art, an embodiment of the present invention provides a video action detection method that integrates interaction relationships and category associations. The method explicitly models the spatial interaction, short-term temporal interaction, and long-term temporal interaction between action performers to enhance the expressive power of target features and improve the recognition of interactive actions. It accounts for the heterogeneity of the spatial and temporal dimensions and for both local and global temporal information. For the multi-label problem, a category relationship module is designed to mine the dependencies between different action classes and to use these relationships to fuse the original category representations, making the learned representations more robust and discriminative, and further improving...
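One way to picture the category relationship module is as self-attention over per-class representations, which is the mechanism the abstract names. The sketch below is only an illustration under stated assumptions: it uses scaled dot-product attention with identity query/key/value projections and a residual sum as the fusion step. The patent excerpt does not specify the projections, the scaling, or the fusion rule, and the function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def class_self_attention(class_repr):
    """Fuse each class representation with attention-weighted context.

    class_repr: list of C vectors, one per action class, each of length D.
    Assumption: identity projections and residual fusion (original + context);
    a trained model would normally use learned projections instead.
    """
    d = len(class_repr[0])
    scale = math.sqrt(d)
    fused = []
    for query in class_repr:
        # Similarity of this class to every class (including itself).
        scores = [sum(a * b for a, b in zip(query, key)) / scale
                  for key in class_repr]
        weights = softmax(scores)
        # Context vector: convex combination of all class representations.
        context = [sum(w * vec[j] for w, vec in zip(weights, class_repr))
                   for j in range(d)]
        # Residual fusion with the original representation.
        fused.append([x + c for x, c in zip(query, context)])
    return fused


# Hypothetical representations for three classes; the first and third match.
fused = class_self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
```

In this toy input the first and third classes have identical representations, so they receive identical fused outputs, while the second class is pulled toward the other two through the attention weights.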

Embodiment 2

[0101] The present invention also provides a video action detection system, which is mainly implemented based on the method provided in Embodiment 1. As shown in Figure 6, the system mainly includes:

[0102] A video data acquisition module, configured to acquire video clips and determine the key frames in the video clips;

[0103] A feature extraction network, whose input is a video clip, configured to obtain the regional features corresponding to all detection boxes of the key frame through object detection and feature extraction;

[0104] A short-term interaction module, whose input is the regional features corresponding to all detection boxes of the key frame, configured to model interactions in the spatial dimension and the temporal dimension separately to obtain enhanced features;

[0105] A long-term interaction module, whose input is the regional features corresponding to all detection boxes of the key frame together with the enhanced features, is u...
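The data flow between the modules listed above can be sketched as a simple chain of callables. Everything concrete here is a hypothetical stand-in: the middle-frame key frame choice, the dummy region features, the mean-based "enhancement", and the concatenation in the long-term step are placeholders chosen only to make the wiring runnable. In particular, the source text is truncated before it states what the long-term interaction module produces.

```python
def run_modules(clip, acquire, extract, short_term, long_term):
    """Wire the four modules together in the order the text lists them."""
    key_index = acquire(clip)              # video data acquisition: pick key frame
    regions = extract(clip, key_index)     # feature extraction: per-box features
    enhanced = short_term(regions)         # short-term interaction: enhanced features
    return long_term(regions, enhanced)    # long-term interaction: uses both inputs


# --- Hypothetical stand-ins, not the patented modules ---

def middle_frame(clip):
    """Assumption: treat the middle frame of the clip as the key frame."""
    return len(clip) // 2


def dummy_regions(clip, key_index):
    """Pretend three detection boxes were found, each with a 2-D feature."""
    return [[float(key_index), float(box)] for box in range(3)]


def add_mean(regions):
    """Placeholder interaction: add the mean feature to every region."""
    d = len(regions[0])
    mean = [sum(r[j] for r in regions) / len(regions) for j in range(d)]
    return [[x + m for x, m in zip(r, mean)] for r in regions]


def concat(regions, enhanced):
    """Placeholder for the long-term step: concatenate both inputs per box."""
    return [r + e for r, e in zip(regions, enhanced)]


clip = ["frame%d" % i for i in range(8)]
out = run_modules(clip, middle_frame, dummy_regions, add_mean, concat)
```

Passing the modules in as callables keeps the sketch independent of any particular detector or backbone; only the input/output relationships stated in the text are encoded.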

Embodiment 3

[0111] The present invention also provides a processing device which, as shown in Figure 7, mainly includes: one or more processors; and a memory for storing one or more programs; wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the methods provided in the foregoing embodiments.

[0112] Further, the processing device further includes at least one input device and at least one output device; in the processing device, the processor, memory, input device, and output device are connected through a bus.

[0113] In the embodiment of the present invention, the specific types of the memory, input device and output device are not limited; for example:

[0114] The input device can be a touch screen, an image acquisition device, a physical button or a mouse, etc.;

[0115] The output device can be a display terminal;

[0116] The memory may be random access memory (Random Access Memory, RAM), or non-volatile memory...


Abstract

The invention discloses a video action detection method, system, device, and storage medium. On the one hand, the interactions between action performers are modeled in a targeted manner and the spatio-temporal characteristics of video signals are fully utilized, which greatly enhances the expressive power of target features and improves the recognition of interactive actions. On the other hand, the dependencies among different action classes are mined using a self-attention mechanism, which further improves the robustness and discriminability of the original class representations and addresses the multi-label problem at relatively low computational cost.

Description

Technical field

[0001] The present invention relates to the technical field of video analysis, in particular to a video action detection method, system, device, and storage medium.

Background

[0002] With the spread of electronic recording equipment, a large amount of video data is generated every day, and most of these videos contain human-centered actions. Such videos are widely used in intelligent robotics, security monitoring, autonomous driving, and other fields. How to understand and analyze the content of massive amounts of video has become an active research topic, and video action detection is one of the key technologies: given a video, the task is to detect the spatial positions of all people in the video and determine their corresponding action categories. It has great application value and research significance in real-world scenarios.

[0003] At present, for video action detection tasks, most of the research work is carried out in two ste...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06V40/20, G06V10/764, G06V10/62, G06V10/80, G06V10/82, G06K9/62

CPC: G06F18/254, G06F18/2415

Inventor: 王子磊, 贺楚景

Owner: UNIV OF SCI & TECH OF CHINA
