
Video description generation system based on graph convolution network

A graph convolutional network and video description technology, applied to biological neural network models, instruments, character and pattern recognition, etc. It addresses problems such as the underutilization of temporal and object information in video, incomplete mining of video characteristics, and the loss of information carried by other previously generated words.

Pending Publication Date: 2020-08-04
FUDAN UNIV
Cites: 8 | Cited by: 19

AI Technical Summary

Problems solved by technology

[0005] Although existing work has made great progress on the automatic video description task, the characteristics of the video itself have not been fully exploited: the temporal structure of the video and the object information in different frames are underused. Moreover, the input to the generative model at each step is generally only the word from the previous time step, so the information carried by the other previously generated words is lost.

Embodiment Construction

[0043] As noted in the background, existing video description generation methods do not make full use of the sequence information and the specific target-object information inside a video. The present invention addresses these problems by introducing the graph convolutional network to reconstruct the visual information inside the video. During reconstruction, both the sequence information of the video frames and the semantic correlation between target objects are fully considered. A two-layer GRU is used as the decoder to generate the final description sentence, adopting a coarse-to-fine progressive mode that makes the generated description more accurate. The proposed model is applicable to all video description generation techniques based on the encoding-decoding paradigm and can significantly improve the accuracy of the generated sentences.
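As an illustration of the decoding step described above, the following is a minimal PyTorch sketch, not the patent's exact implementation; all class, parameter and dimension names are assumptions. A coarse GRU layer consumes the previous word and a global (mean-pooled) video feature, and a fine GRU layer refines the coarse state with an attention-weighted context over the reconstructed graph features:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoLayerGRUDecoder(nn.Module):
    def __init__(self, feat_dim, embed_dim, hidden_dim, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Coarse layer: sketches the sentence from the previous word
        # and the global (mean-pooled) video feature.
        self.coarse_gru = nn.GRUCell(embed_dim + feat_dim, hidden_dim)
        # Fine layer: refines the coarse state with attended visual features.
        self.fine_gru = nn.GRUCell(hidden_dim + feat_dim, hidden_dim)
        self.att_w = nn.Linear(hidden_dim + feat_dim, 1)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def attend(self, h, feats):
        # feats: (batch, num_nodes, feat_dim); h: (batch, hidden_dim)
        h_exp = h.unsqueeze(1).expand(-1, feats.size(1), -1)
        scores = self.att_w(torch.cat([h_exp, feats], dim=-1)).squeeze(-1)
        alpha = F.softmax(scores, dim=-1)                # attention weights
        return (alpha.unsqueeze(-1) * feats).sum(dim=1)  # context vector

    def forward(self, feats, prev_word, h_coarse, h_fine):
        emb = self.embed(prev_word)
        global_feat = feats.mean(dim=1)
        h_coarse = self.coarse_gru(torch.cat([emb, global_feat], dim=-1), h_coarse)
        ctx = self.attend(h_coarse, feats)
        h_fine = self.fine_gru(torch.cat([h_coarse, ctx], dim=-1), h_fine)
        return self.out(h_fine), h_coarse, h_fine

# One decoding step on dummy data (shapes are illustrative only).
dec = TwoLayerGRUDecoder(feat_dim=512, embed_dim=300, hidden_dim=512, vocab_size=10000)
feats = torch.randn(2, 26, 512)     # reconstructed frame- and object-level node features
word = torch.tensor([1, 1])         # previous word ids
h_c = torch.zeros(2, 512)
h_f = torch.zeros(2, 512)
logits, h_c, h_f = dec(feats, word, h_c, h_f)
```

At inference time, the argmax (or a beam search) over `logits` would give the next word, which is fed back as `prev_word` for the following step.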

Abstract

The invention belongs to the technical field of cross-media generation, and particularly relates to a video description generation system based on a graph convolution network. The system comprises a video feature extraction network, a graph convolution network, a visual attention network and a sentence description generation network. The video feature extraction network samples the video to obtain video features and outputs them to the graph convolution network. The graph convolution network reconstructs the video features according to their semantic relations and feeds them into the sentence-description-generation recurrent neural network, which generates sentences from the reconstructed video features. Frame-level and target-level sequence features in the video are reconstructed by graph convolution, so that the temporal and semantic information in the video is fully exploited when the description statements are generated, making the generation more accurate. The invention is of great significance to video analysis and multi-modal information research, can improve a model's ability to understand video visual information, and has wide application value.
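To make the reconstruction step concrete, here is a minimal sketch, under assumed shapes and module names, of a graph convolution whose adjacency matrix is built from the pairwise semantic similarity of the sampled frame-level (or object-level) features. It illustrates the general idea rather than the patent's exact formulation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticGCN(nn.Module):
    def __init__(self, feat_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(feat_dim, out_dim)

    def forward(self, feats):
        # feats: (batch, num_nodes, feat_dim) -- frame-level or object-level features
        # Adjacency from scaled pairwise dot-product similarity, row-normalised.
        sim = torch.bmm(feats, feats.transpose(1, 2)) / feats.size(-1) ** 0.5
        adj = F.softmax(sim, dim=-1)
        # One graph-convolution step: aggregate neighbours, then project.
        return F.relu(self.proj(torch.bmm(adj, feats)))

# Example: reconstruct sampled 2048-d frame features into 512-d graph features
# (the numbers of nodes and dimensions here are dummy values).
gcn = SemanticGCN(feat_dim=2048, out_dim=512)
frame_feats = torch.randn(2, 26, 2048)
recon = gcn(frame_feats)   # (2, 26, 512), fed to the sentence-generation decoder
```

Because the adjacency is recomputed from the features of each video, semantically related frames and objects exchange information regardless of their distance in time.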

Description

Technical field
[0001] The invention belongs to the technical field of cross-media generation, and in particular relates to a video description generation system based on a graph convolutional network.
Background technique
[0002] Among the various multimodal information processing tasks, automatic video description generation (Video Captioning) is an important and fundamental research task in the field of video analysis. Given a video, the computer automatically analyzes its content and generates sentences describing its main content. The task builds on automatic image description generation (Image Captioning), but there are significant differences between video and images: a video can be regarded as a collection of multiple images, usually accompanied by audio, its features are diverse, and the complexity of its content and scenes far exceeds that of images, so the technical difficulty and...

Application Information

IPC(8): G06K9/00, G06N3/08, G06N3/04
CPC: G06N3/08, G06V20/41, G06V20/46, G06N3/045
Inventors: 张玥杰, 肖鑫龙
Owner: FUDAN UNIV