
Video description generation method and device based on bidirectional time sequence diagram

A video description technology based on bidirectional temporal graphs, applied in character and pattern recognition, instrumentation, and computing. It addresses the problems that existing methods ignore fine-grained spatial information and thereby limit the quality of video descriptions, improving the accuracy of the generated descriptions.

Inactive Publication Date: 2019-09-06
PEKING UNIV

AI Technical Summary

Problems solved by technology

The above methods usually extract video features from a CNN's fully connected layer or global pooling layer, ignoring the fine-grained spatial information in the video frames and thus limiting the quality of the generated video descriptions.




Embodiment Construction

[0029] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0030] The flow of the video description generation method based on a bidirectional temporal graph of the present invention is shown in Figure 1; it specifically includes the following steps:

[0031] (1) Extract video frames from the video and perform object detection

[0032] Extract video frames from the videos in the training and test sets, and apply an object detection model to each frame. In this embodiment, T video frames are extracted from each video and N objects are detected in each frame; that is, the N objects with the highest detector scores are kept as the detection results.
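The per-frame selection in step (1) can be sketched as follows. The detector itself is not named in this excerpt, so mock detector output (random boxes and scores) stands in for it; only the top-N selection logic is shown:

```python
import numpy as np

def top_n_detections(boxes, scores, n):
    """Keep the N highest-scoring detections in one frame.

    boxes: (M, 4) array of [x1, y1, x2, y2]; scores: (M,) confidences.
    """
    order = np.argsort(scores)[::-1][:n]  # indices of top-n scores
    return boxes[order], scores[order]

# Mock detector output for one frame; a real system would run an
# actual object detector here (hypothetical stand-in, not the patent's).
rng = np.random.default_rng(0)
boxes = rng.uniform(0, 224, size=(10, 4))
scores = rng.uniform(size=10)

top_boxes, top_scores = top_n_detections(boxes, scores, n=5)
print(top_boxes.shape)  # (5, 4)
```

Applied to every one of the T frames, this yields the T×N object grid that the later steps operate on.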

[0033] (2) Construct a bidirectional temporal graph over the video objects and compute their temporal trajectories

[0034] Establish a bidirectional temporal graph for ...
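A minimal sketch of the forward/reverse graph construction in step (2), assuming detections in adjacent frames are linked greedily by bounding-box IoU; the patent's actual similarity measure and trajectory computation are truncated in this excerpt, so the linking rule here is an illustrative assumption:

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def link_frames(frames):
    """Greedy forward linking: each object in frame t is joined to the
    highest-IoU object in frame t+1. Returns edges[t][i] = j."""
    edges = []
    for cur, nxt in zip(frames, frames[1:]):
        edges.append([int(np.argmax([iou(a, b) for b in nxt])) for a in cur])
    return edges

# Two objects tracked over three frames (toy boxes, N=2, T=3).
frames = [
    np.array([[0, 0, 10, 10], [50, 50, 60, 60]], float),
    np.array([[1, 1, 11, 11], [51, 50, 61, 60]], float),
    np.array([[2, 2, 12, 12], [52, 50, 62, 60]], float),
]
forward = link_frames(frames)          # forward graph edges
backward = link_frames(frames[::-1])   # reverse graph edges (reversed time)
print(forward)  # [[0, 1], [0, 1]]
```

Following the edges of either graph from end to end yields a per-object trajectory; combining both directions gives the bidirectional temporal trajectory the method aggregates features over.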



Abstract

The invention relates to a video description generation method and device based on a bidirectional temporal graph. The method comprises the following steps: extracting video frames from a video and performing object detection, where each frame yields a plurality of detected objects; constructing a bidirectional temporal graph over the video objects, comprising a forward graph and a reverse graph, and computing the bidirectional temporal trajectories of the objects; extracting local features from the video frames and objects, constructing a feature aggregation model, and aggregating the local features into aggregated features with strong expressive power; and constructing a decoding model to generate a natural language description, using a hierarchical attention mechanism during generation to adaptively distinguish different video frames and different object instances. The bidirectional temporal graph models the temporal trajectories of video objects and effectively expresses how they change over time; local feature aggregation improves the expressive power of the video features and models fine-grained spatio-temporal information, thereby improving the accuracy of video description generation.
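The hierarchical attention mentioned in the abstract (attending first over object instances within each frame, then over frames) can be sketched as below. The dot-product scoring and the additive combination of frame and object contexts are illustrative assumptions, not the patent's exact formulation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def hierarchical_attention(obj_feats, frame_feats, query):
    """Two-level attention. obj_feats: (T, N, D) object features,
    frame_feats: (T, D) frame features, query: (D,) decoder state.
    Returns a single (D,) context vector for the next word."""
    T, N, D = obj_feats.shape
    frame_ctx = np.empty((T, D))
    for t in range(T):
        a = softmax(obj_feats[t] @ query)   # weights over N objects in frame t
        frame_ctx[t] = a @ obj_feats[t]     # object-level context
    b = softmax((frame_ctx + frame_feats) @ query)  # weights over T frames
    return b @ frame_ctx                    # frame-level context

rng = np.random.default_rng(1)
ctx = hierarchical_attention(rng.normal(size=(4, 3, 8)),
                             rng.normal(size=(4, 8)),
                             rng.normal(size=8))
print(ctx.shape)  # (8,)
```

The decoder would recompute this context at every generation step with its current hidden state as the query, which is what lets it weight frames and object instances adaptively per word.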

Description

Technical field

[0001] The present invention relates to the technical field of video description generation, and in particular to a video description generation method and device based on a bidirectional temporal graph.

Background technique

[0002] With the rapid development of the Internet and multimedia technology, the number of videos on the Internet has increased dramatically. Statistics show that users of the YouTube video sharing website watch more than 1 billion hours of video every day and upload more than 400 hours of video every minute. According to a forecast by CISCO, global video traffic will account for 82% of IP traffic by 2022. Faced with this massive and fast-growing body of Internet video data, effectively analyzing and understanding its content is of great significance for meeting users' information needs.

[0003] Video description generation refers to the automatic generation of natural language sentences de...

Claims


Application Information

IPC(8): G06K9/00, G06K9/62
CPC: G06V20/46, G06V2201/07, G06F18/253
Inventor: 彭宇新, 张俊超
Owner: PEKING UNIV