
Video description generation method and device based on bidirectional time sequence diagram

A video description technology based on bidirectional temporal graphs, applied in character and pattern recognition, instrumentation, and computing. It addresses the problems that existing methods ignore fine-grained spatial information and thereby limit the quality of video descriptions, improving the accuracy of the generated descriptions.

Inactive Publication Date: 2019-09-06
PEKING UNIV

AI Technical Summary

Problems solved by technology

The above methods usually extract video features from a CNN's fully connected layer or global pooling layer, ignoring the fine-grained spatial information in the video frames and thus limiting the quality of the generated video descriptions.




Embodiment Construction

[0029] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0030] The flow of the video description generation method based on a bidirectional temporal graph of the present invention is shown in Figure 1; it specifically includes the following steps:

[0031] (1) Extract video frames from the video and perform object detection

[0032] Extract video frames from the videos in the training and test sets, and apply an object detection model to each frame. In this embodiment, T video frames are extracted from each video and N objects are detected in each frame; that is, the N objects with the highest detector scores are kept as the detection results.
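The per-frame selection in step (1) can be sketched as follows. The detector itself is not named in this excerpt, so mock detector output (random boxes and scores) stands in for it; only the top-N selection logic is shown:

```python
import numpy as np

def top_n_detections(boxes, scores, n):
    """Keep the N highest-scoring detections in one frame.

    boxes: (M, 4) array of [x1, y1, x2, y2]; scores: (M,) confidences.
    """
    order = np.argsort(scores)[::-1][:n]  # indices of top-n scores
    return boxes[order], scores[order]

# Mock detector output for one frame; a real system would run an
# actual object detector here (hypothetical stand-in, not the patent's).
rng = np.random.default_rng(0)
boxes = rng.uniform(0, 224, size=(10, 4))
scores = rng.uniform(size=10)

top_boxes, top_scores = top_n_detections(boxes, scores, n=5)
print(top_boxes.shape)  # (5, 4)
```

Applied to every one of the T frames, this yields the T×N object grid that the later steps operate on.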

[0033] (2) Construct a bidirectional temporal graph over the video objects and compute their temporal trajectories

[0034] Establish a bidirectional temporal graph for ...
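A minimal sketch of the forward/reverse graph construction in step (2), assuming detections in adjacent frames are linked greedily by bounding-box IoU; the patent's actual similarity measure and trajectory computation are truncated in this excerpt, so the linking rule here is an illustrative assumption:

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def link_frames(frames):
    """Greedy forward linking: each object in frame t is joined to the
    highest-IoU object in frame t+1. Returns edges[t][i] = j."""
    edges = []
    for cur, nxt in zip(frames, frames[1:]):
        edges.append([int(np.argmax([iou(a, b) for b in nxt])) for a in cur])
    return edges

# Two objects tracked over three frames (toy boxes, N=2, T=3).
frames = [
    np.array([[0, 0, 10, 10], [50, 50, 60, 60]], float),
    np.array([[1, 1, 11, 11], [51, 50, 61, 60]], float),
    np.array([[2, 2, 12, 12], [52, 50, 62, 60]], float),
]
forward = link_frames(frames)          # forward graph edges
backward = link_frames(frames[::-1])   # reverse graph edges (reversed time)
print(forward)  # [[0, 1], [0, 1]]
```

Following the edges of either graph from end to end yields a per-object trajectory; combining both directions gives the bidirectional temporal trajectory the method aggregates features over.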



Abstract

The invention relates to a video description generation method and device based on a bidirectional temporal graph. The method comprises the following steps: extracting video frames from a video and performing object detection, where each frame yields a plurality of detected objects; constructing a bidirectional temporal graph over the video objects, comprising a forward graph and a reverse graph, and computing the bidirectional temporal trajectories of the objects; extracting local features from the video frames and objects, constructing a feature aggregation model, and aggregating the local features into aggregated features with strong expressive power; and constructing a decoding model to generate a natural language description, using a hierarchical attention mechanism during generation to adaptively distinguish different video frames and different object instances. The bidirectional temporal graph models the temporal trajectories of video objects and effectively expresses how they change over time; local feature aggregation improves the expressive power of the video features and models fine-grained spatio-temporal information, thereby improving the accuracy of video description generation.
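The hierarchical attention mentioned in the abstract (attending first over object instances within each frame, then over frames) can be sketched as below. The dot-product scoring and the additive combination of frame and object contexts are illustrative assumptions, not the patent's exact formulation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def hierarchical_attention(obj_feats, frame_feats, query):
    """Two-level attention. obj_feats: (T, N, D) object features,
    frame_feats: (T, D) frame features, query: (D,) decoder state.
    Returns a single (D,) context vector for the next word."""
    T, N, D = obj_feats.shape
    frame_ctx = np.empty((T, D))
    for t in range(T):
        a = softmax(obj_feats[t] @ query)   # weights over N objects in frame t
        frame_ctx[t] = a @ obj_feats[t]     # object-level context
    b = softmax((frame_ctx + frame_feats) @ query)  # weights over T frames
    return b @ frame_ctx                    # frame-level context

rng = np.random.default_rng(1)
ctx = hierarchical_attention(rng.normal(size=(4, 3, 8)),
                             rng.normal(size=(4, 8)),
                             rng.normal(size=8))
print(ctx.shape)  # (8,)
```

The decoder would recompute this context at every generation step with its current hidden state as the query, which is what lets it weight frames and object instances adaptively per word.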

Description

Technical field

[0001] The present invention relates to the technical field of video description generation, and in particular to a video description generation method and device based on a bidirectional temporal graph.

Background technique

[0002] With the rapid development of the Internet and multimedia technology, the number of videos on the Internet has increased dramatically. Statistics show that users of the YouTube video sharing website watch more than 1 billion hours of video every day and upload more than 400 hours of video every minute. According to a forecast by CISCO, global video traffic will account for 82% of IP traffic by 2022. Faced with this massive and fast-growing body of Internet video data, effectively analyzing and understanding its content is of great significance for meeting users' information needs.

[0003] Video description generation refers to the automatic generation of natural language sentences de...

Claims


Application Information

IPC(8): G06K9/00, G06K9/62
CPC: G06V20/46, G06V2201/07, G06F18/253
Inventor: 彭宇新, 张俊超
Owner: PEKING UNIV