Video description method based on space-time super-resolution and electronic equipment

A super-resolution and video description technology, applied in the fields of computer vision and natural language processing, that addresses the problems of reduced model running speed and high computing cost, achieving efficient operation with low computational overhead.

Pending Publication Date: 2022-05-27
TONGJI UNIV


Problems solved by technology

However, this approach ignores the information loss caused by frame sampling and image compression. At the same time, if frame sampling is not performed and the original high resolution of the images is kept for feature extraction, substantial computing costs are introduced and the running speed of the model drops significantly.




Embodiment Construction

[0032] The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments. This embodiment is implemented on the premise of the technical solution of the present invention and provides a detailed implementation manner and a specific operation process, but the protection scope of the present invention is not limited to the following embodiments.

[0033] This embodiment proposes a video description method based on spatiotemporal super-resolution. As shown in Figure 1, the method is implemented based on a video description model and includes the following steps: acquiring an input video and sampling it to obtain a video frame sequence consisting of several compressed-size frames; performing multi-modal feature extraction and feature encoding on the video frame sequence, dynamically fusing the encoded multi-modal features, and gradually decoding to generate video description sentences. During the training of the video description model, the frame at the original resolution and the missing middle frame between adjacent sampled frames are reconstructed in the spatial and temporal dimensions, and a loss function is constructed from the reconstruction error and the decoding prediction error to realize model training.
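The inference pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names, the pixel-striding stand-ins for frame compression, and the toy mean/difference "features" are all hypothetical; a real model would use learned appearance and motion encoders with an attention-based caption decoder.

```python
import numpy as np

def sample_frames(video: np.ndarray, stride: int) -> np.ndarray:
    """Temporally sample every `stride`-th frame of a (T, H, W, C) video."""
    return video[::stride]

def compress(frames: np.ndarray, factor: int) -> np.ndarray:
    """Spatially downsample by pixel striding (stand-in for real resizing)."""
    return frames[:, ::factor, ::factor, :]

def extract_multimodal_features(frames: np.ndarray) -> dict:
    """Toy stand-in for appearance/motion encoders: one scalar per frame."""
    t = frames.shape[0]
    flat = frames.reshape(t, -1)
    appearance = flat.mean(axis=1, keepdims=True)                      # (t, 1)
    motion = np.diff(flat, axis=0, prepend=0).mean(axis=1, keepdims=True)
    return {"appearance": appearance, "motion": motion}

def fuse(features: dict) -> np.ndarray:
    """Placeholder for dynamic fusion: concatenate the modality features."""
    return np.concatenate([features["appearance"], features["motion"]], axis=1)

video = np.random.rand(32, 64, 64, 3)                  # toy (T, H, W, C) input
frames = compress(sample_frames(video, stride=4), factor=2)
fused = fuse(extract_multimodal_features(frames))
print(frames.shape, fused.shape)                       # (8, 32, 32, 3) (8, 2)
```

The fused per-frame features would then be fed to a decoder that generates the description sentence token by token.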



Abstract

The invention relates to a video description method based on space-time super-resolution and an electronic device. The method is realized based on a video description model and comprises the following steps: obtaining an input video, and sampling it to obtain a video frame sequence containing a plurality of compressed-size frames; performing multi-modal feature extraction and feature coding on the video frame sequence through the video description model, dynamically fusing the coded multi-modal features, and gradually decoding to generate a video description sentence. When the video description model is trained, the frame at the original resolution and the missing middle frame between adjacent sampled frames are reconstructed in the spatial and temporal dimensions, a loss function is constructed from the reconstruction error and the decoding prediction error, and model training is realized. Compared with the prior art, the method has the advantages of rich and accurate description, strong generalization ability, and low calculation overhead.
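The training objective described above can be sketched as follows: the total loss combines a spatial reconstruction error (super-resolved frames against the original-resolution frames), a temporal reconstruction error (interpolated frames against the true missing middle frames), and the decoding prediction error on the caption tokens. The toy tensors, the use of plain MSE and cross-entropy, and the unweighted sum are illustrative assumptions, not the patent's exact formulation.

```python
import numpy as np

def mse(a: np.ndarray, b: np.ndarray) -> float:
    """Mean squared reconstruction error between two frame tensors."""
    return float(((a - b) ** 2).mean())

def cross_entropy(logits: np.ndarray, target_ids: np.ndarray) -> float:
    """Token-level decoding prediction error for the generated description."""
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))   # stable softmax
    probs = e / e.sum(axis=-1, keepdims=True)
    return float(-np.log(probs[np.arange(len(target_ids)), target_ids]).mean())

rng = np.random.default_rng(0)
# Toy stand-ins for model outputs vs. ground truth.
sr_frames, hr_frames = rng.random((4, 8, 8)), rng.random((4, 8, 8))           # spatial SR
interp_frames, missing_frames = rng.random((3, 8, 8)), rng.random((3, 8, 8))  # temporal SR
logits = rng.random((5, 100))                    # decoder logits over a 100-word vocab
target = rng.integers(0, 100, size=5)            # ground-truth caption token ids

# Combined objective: reconstruction errors plus decoding prediction error.
loss = (mse(sr_frames, hr_frames)
        + mse(interp_frames, missing_frames)
        + cross_entropy(logits, target))
print(loss)
```

At inference time only the captioning branch is needed, so the super-resolution reconstruction acts purely as a training-time auxiliary signal, which is consistent with the low computational overhead claimed for the method.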

Description

technical field

[0001] The invention relates to the fields of computer vision and natural language, and in particular to a video description method and electronic device based on spatiotemporal super-resolution.

Background technique

[0002] In recent years, with the popularization of 5G networks, video as a medium of information interaction has spread widely in people's daily life, and it has also brought various new challenges, such as the automatic classification and retrieval of large-scale videos and video understanding tasks such as motion and event detection. As one of the key tasks of video understanding, video description aims to automatically generate a natural language description for a given video clip, which has very broad application prospects in human-computer interaction, infant teaching and visual impairment assistance. Due to the richness and complex temporal structure of video scenes, it is difficult to model video information. Compared with sta...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06T3/40; G06T9/00; G06F40/30; G06N3/04; G06N3/08
CPC: G06T3/4053; G06N3/08; G06T9/002; G06F40/30; G06T2207/10016; G06N3/045
Inventors: 王瀚漓, 曹铨辉
Owner: TONGJI UNIV