Time domain contrast graph learning system and method for self-supervised video representation learning

A contrastive graph learning technology applied to the field of video representation learning.

Pending Publication Date: 2022-03-22

SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

However, existing methods do not introduce discriminative spatio-temporal knowledge from the frequency domain into self-supervised learning from videos.

Examples

Embodiment 1

[0133] As shown in Figure 3, a temporal contrastive graph learning system for self-supervised video representation learning includes:

[0134] A sampling and random shuffling module: for each video, multiple segments are uniformly sampled and shuffled; for each segment, all of its frames are sampled into several fixed-length frame sets;

[0135] A spatio-temporal knowledge discovery module, which discovers discriminative spatio-temporal representations of the video segments and frame sets and uses a 3D CNN to extract the spatio-temporal features of all segments and frame sets;

[0136] A temporal contrastive graph learning module, which uses the spatio-temporal features of the segments and frame sets to construct two temporal contrastive graph structures, namely the intra-segment graph and the inter-segment graph;

[0137] An adaptive order prediction module, in which the video segment features learned by the temporal contrastive graph learning module are adaptively propagated through an ...
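The sampling and shuffling step in [0134] can be illustrated with a minimal Python sketch. The function name, parameter names, default values, and the choice to split each segment into consecutive frame sets are all assumptions made for illustration; the source does not specify them.

```python
import random


def sample_and_shuffle(num_frames, num_segments=3, segment_len=16,
                       frame_set_len=4, seed=None):
    """Return (shuffled_segments, order). All names here are hypothetical.

    shuffled_segments[i] is a list of frame sets (lists of frame indices).
    order[i] is the original temporal position of shuffled_segments[i].
    """
    if num_frames < num_segments * segment_len:
        raise ValueError("video too short for non-overlapping segments")
    if segment_len % frame_set_len != 0:
        raise ValueError("segment_len must be a multiple of frame_set_len")

    # Uniform sampling: spread segment start frames evenly over the video.
    stride = (num_frames - segment_len) // max(num_segments - 1, 1)
    segments = []
    for s in range(num_segments):
        frames = list(range(s * stride, s * stride + segment_len))
        # Assumption: frame sets are consecutive fixed-length chunks.
        frame_sets = [frames[j:j + frame_set_len]
                      for j in range(0, segment_len, frame_set_len)]
        segments.append(frame_sets)

    # Random shuffling: permute the segments and remember the true order.
    order = list(range(num_segments))
    random.Random(seed).shuffle(order)
    return [segments[i] for i in order], order
```

The returned `order` is what a later order prediction step would try to recover from the shuffled segments.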

Embodiment 2

[0187] A temporal contrastive graph learning method for self-supervised video representation learning, comprising the following steps:

[0188] S1: For each video, the sampling and random shuffling module uniformly samples and shuffles multiple segments; for each segment, all of its frames are sampled into several fixed-length frame sets;

[0189] S2: The spatio-temporal knowledge discovery module discovers discriminative spatio-temporal representations of the video segments and frame sets, and uses a 3D CNN to extract the spatio-temporal features of all segments and frame sets;

[0190] S3: The temporal contrastive graph learning module uses the spatio-temporal features of the segments and frame sets to construct two temporal contrastive graph structures, namely the intra-segment graph and the inter-segment graph;

[0191] S4: The video segment features learned by the temporal contrastive graph learning module are adaptively propagated through an adaptive segment order ...
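Steps S1 and S4 rely on the shuffled segment order as the training signal. One common way to turn such an order into a classification target is to assign each permutation a class index; the sketch below shows that encoding. This is an assumed encoding for illustration only, and the helper names are hypothetical; the truncated text above does not say how the patent encodes the order.

```python
from itertools import permutations


def order_to_label(order):
    """Map a permutation of 0..n-1 to a class index in 0..n!-1 (hypothetical helper)."""
    classes = list(permutations(range(len(order))))  # lexicographic order
    return classes.index(tuple(order))


def label_to_order(label, num_segments):
    """Inverse of order_to_label for a given number of segments."""
    return list(list(permutations(range(num_segments)))[label])
```

With three segments there are 3! = 6 classes, so a classifier head over the segment features would have six outputs under this encoding.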

Abstract

The invention provides a temporal contrastive graph learning system and method for self-supervised video representation learning. The system introduces a spatio-temporal knowledge discovery (STKD) module, which extracts motion-enhanced spatio-temporal representations from a video based on frequency-domain analysis using the discrete cosine transform. To explicitly build a multi-scale temporal correlation model of unlabeled videos, prior knowledge about frame and segment order is integrated into graph structures, namely intra-segment and inter-segment temporal contrastive graphs (TCG). A dedicated temporal contrastive graph learning module is then designed to maximize the consistency among nodes in different graph views. To generate supervisory signals for unlabeled videos, an adaptive order prediction module is introduced that uses the correlations between video segments to learn global context representations and adaptively recalibrate channel features.
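To give a feel for the frequency-domain analysis mentioned in the abstract, the following sketch applies a 1D discrete cosine transform (DCT-II) along the time axis of a single pixel's intensity and zeroes the constant (DC) coefficient, so that only temporal variation remains. This is an illustrative assumption, not the STKD module itself: the abstract does not say which coefficients are kept or how the transform is applied, and the function names are hypothetical.

```python
import math


def dct2(x):
    """Unnormalized DCT-II of a sequence of floats."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi / n * (t + 0.5) * k) for t in range(n))
            for k in range(n)]


def idct2(coeffs):
    """Inverse of dct2 (a scaled DCT-III)."""
    n = len(coeffs)
    return [coeffs[0] / n
            + (2.0 / n) * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5) * k)
                              for k in range(1, n))
            for t in range(n)]


def suppress_static(x):
    """Assumed filter: drop the DC term so a static pixel maps to zeros."""
    coeffs = dct2(x)
    coeffs[0] = 0.0
    return idct2(coeffs)
```

Under this assumed filter a pixel that never changes yields an all-zero signal, while a changing pixel keeps its variation around zero, which is one simple sense in which a frequency-domain step can emphasize motion.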

Description

Technical field

[0001] The present invention relates to the technical field of video representation learning, and more specifically to a temporal contrastive graph learning system and method for self-supervised video representation learning.

Background

[0002] Supervised learning methods for video representation learning, both traditional and deep-learning based, have received increasing attention. Although these supervised methods perform well at modeling temporal correlations, they require a large number of labeled videos to train complex models, which is time-consuming and laborious.

[0003] Self-supervised learning provides a feasible way to generate supervisory signals by using large amounts of unlabeled data to model various proxy tasks. Models learned from proxy tasks can be directly applied to downstream tasks for feature extraction or fine-tuning. It require...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06V10/774, G06V10/426, G06V10/62, G06V20/40, G06N3/08, G06N3/04, G06K9/62

CPC: G06N3/088, G06N3/047, G06N3/048, G06N3/045, G06F18/2155

Inventors: 王青, 兰浩源, 刘阳, 林倞

Owner: SUN YAT SEN UNIV
