Video description data processing method, device and storage medium

A video description data processing technology, applied in the field of video processing, that addresses the heavy, complex model computation and the verbose output sentences of existing approaches.

Active Publication Date: 2021-07-30
GUILIN UNIV OF ELECTRONIC TECH

AI Technical Summary

Problems solved by technology

In the existing approach, each shot is fed into the convolutional neural network separately, which makes the computation complex. When two highly similar shots are input separately, the network generates many similar features for each of them. Because the video description model describes every feature, the model's computation becomes very large, and the final description is long-winded, disfluent, and quite different from a manually written description.
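The redundancy described above can be made concrete with a small sketch. It assumes each shot is summarized by a single feature vector and uses cosine similarity as the closeness measure; both the vectors and the choice of measure are illustrative assumptions, not details taken from the patent.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical shot features (made-up numbers, for illustration only).
shot_a = [0.9, 0.1, 0.3]
shot_b = [0.8, 0.2, 0.3]  # visually close to shot_a
shot_c = [0.1, 0.9, 0.0]  # a different scene

print(cosine_similarity(shot_a, shot_b))  # close to 1: near-duplicate features
print(cosine_similarity(shot_a, shot_c))  # much lower: distinct content
```

If a description model captions `shot_a` and `shot_b` independently, it is likely to produce two nearly identical sentences, which is the verbosity the text refers to.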

Examples

Embodiment 1

[0057] As shown in Figure 1, a video description data processing method includes the following steps:

[0058] Import a video sequence and divide it into a plurality of video pictures;

[0059] Perform feature segmentation analysis on all the video pictures through a preset convolutional neural network to obtain a plurality of shot data sets;

[0060] Perform merging analysis on all the shot data sets through the preset convolutional neural network to obtain a plurality of merged shot data sets;

[0061] Perform feature extraction on the plurality of merged shot data sets through the preset convolutional neural network to obtain a video description feature sequence;

[0062] Convert the video description feature sequence into video description information through a preset video description model.

[0063] It should be understood that the shot data sets are generated sequentially, in the order in which events occur in the input video, for example, the...
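The five steps of this embodiment can be read as a simple pipeline. The sketch below only shows how the stages chain together. The function name `describe_video`, the type aliases, and the toy stage functions are all hypothetical stand-ins: the patent uses a preset convolutional neural network and a preset video description model, neither of which is specified in the text available here.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # stand-in: a video picture represented as a feature vector
Shot = List[Frame]   # a shot data set: a time-ordered run of frames


def describe_video(
    video: Sequence[Frame],
    segment: Callable[[List[Frame]], List[Shot]],
    merge: Callable[[List[Shot]], List[Shot]],
    extract: Callable[[List[Shot]], List[Frame]],
    caption: Callable[[List[Frame]], str],
) -> str:
    frames = list(video)        # import the sequence and split it into pictures
    shots = segment(frames)     # feature segmentation analysis -> shot data sets
    merged = merge(shots)       # merging analysis -> merged shot data sets
    features = extract(merged)  # feature extraction -> description feature sequence
    return caption(features)    # description model -> description information


# Toy stages, for demonstration only (not the patent's networks).
def one_shot_per_frame(frames):
    return [[f] for f in frames]


def merge_everything(shots):
    return [[f for shot in shots for f in shot]] if shots else []


def mean_feature(shots):
    return [[sum(col) / len(shot) for col in zip(*shot)] for shot in shots]


def count_caption(features):
    return f"{len(features)} feature vector(s)"


result = describe_video(
    [[1.0, 2.0], [3.0, 4.0]],
    one_shot_per_frame, merge_everything, mean_feature, count_caption,
)
print(result)  # 1 feature vector(s)
```

Passing the stages in as callables keeps the ordering of the steps explicit while leaving each stage replaceable by a real model.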

Embodiment 2

[0069] As shown in Figure 1, a video description data processing method includes the following steps:

[0070] Import a video sequence and divide it into a plurality of video pictures;

[0071] Perform feature segmentation analysis on all the video pictures through a preset convolutional neural network to obtain a plurality of shot data sets;

[0072] Perform merging analysis on all the shot data sets through the preset convolutional neural network to obtain a plurality of merged shot data sets;

[0073] Perform feature extraction on the plurality of merged shot data sets through the preset convolutional neural network to obtain a video description feature sequence;

[0074] Convert the video description feature sequence into video description information through a preset video description model;

[0075] The process of performing feature segmentation analysis on all the video pictures through the preset convolutional neural network to obtain a plurality of shot data ...
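The text breaks off before explaining how the feature segmentation analysis works, so the following is only an illustrative sketch of one common way to cut a frame sequence into shots: start a new shot whenever the feature distance between consecutive frames exceeds a threshold. The Euclidean distance, the `threshold` parameter, and the function names are assumptions, not the procedure claimed in the patent.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def segment_into_shots(frame_features, threshold):
    """Group consecutive frames into shots.

    A new shot starts at the first frame and whenever the distance to the
    previous frame's features exceeds `threshold`. Each shot is returned
    as a list of frame indices, in time order.
    """
    shots = []
    for i, feat in enumerate(frame_features):
        if i == 0 or euclidean(frame_features[i - 1], feat) > threshold:
            shots.append([])  # shot boundary detected
        shots[-1].append(i)
    return shots


# Hypothetical per-frame features: two frames of one scene, then three of another.
frames = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [5.0, 5.2]]
print(segment_into_shots(frames, threshold=1.0))  # [[0, 1], [2, 3, 4]]
```

Because frames are visited in order, the resulting shots are naturally time-ordered, matching the statement that shot data sets follow the order of events in the video.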

Embodiment 3

[0082] As shown in Figure 1, a video description data processing method includes the following steps:

[0083] Import a video sequence and divide it into a plurality of video pictures;

[0084] Perform feature segmentation analysis on all the video pictures through a preset convolutional neural network to obtain a plurality of shot data sets;

[0085] Perform merging analysis on all the shot data sets through the preset convolutional neural network to obtain a plurality of merged shot data sets;

[0086] Perform feature extraction on the plurality of merged shot data sets through the preset convolutional neural network to obtain a video description feature sequence;

[0087] Convert the video description feature sequence into video description information through a preset video description model;

[0088] The process of performing feature segmentation analysis on all the video pictures through the preset convolutional neural network to obtain a plurality of shot data ...
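Step [0086] turns the merged shot data sets into a single video description feature sequence. As a rough illustration only, the sketch below mean-pools the frame features of each merged shot into one vector and keeps the vectors in time order. Mean pooling and the function name `extract_feature_sequence` are assumptions standing in for the preset convolutional neural network, whose details are not given in the text.

```python
def extract_feature_sequence(merged_shots):
    """Return one mean-pooled feature vector per merged shot, in time order."""
    sequence = []
    for shot in merged_shots:
        if not shot:
            continue  # skip empty shots rather than divide by zero
        dim = len(shot[0])
        pooled = [sum(frame[d] for frame in shot) / len(shot) for d in range(dim)]
        sequence.append(pooled)
    return sequence


# Hypothetical merged shots, each a list of per-frame feature vectors.
merged_shots = [
    [[1.0, 0.0], [3.0, 2.0]],
    [[0.0, 4.0]],
]
feature_sequence = extract_feature_sequence(merged_shots)
print(feature_sequence)  # [[2.0, 1.0], [0.0, 4.0]]
```

The sequence has one entry per merged shot, so the description model sees fewer inputs than it would if every original shot were described separately.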

Abstract

The invention provides a video description data processing method, a device and a storage medium. The method comprises the steps of: importing a video sequence and segmenting it into a plurality of video pictures; performing feature segmentation analysis on all the video pictures through a preset convolutional neural network to obtain a plurality of shot data sets; performing merging analysis on all the shot data sets through the preset convolutional neural network to obtain a plurality of merged shot data sets; performing feature extraction on the plurality of merged shot data sets through the preset convolutional neural network to obtain a video description feature sequence; and converting the video description feature sequence into video description information through a preset video description model. The method does not need to first generate a text description for each shot data set and then derive the final description from those; it directly converts a natural language problem into an image problem, which reduces redundancy in the generated description and improves the fluency of the text description.
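The merging step is what removes redundancy before any text is generated. A minimal sketch of the idea follows, assuming that adjacent shots are merged when the cosine similarity of their mean feature vectors reaches a threshold. The similarity measure, the 0.95 default, the restriction to adjacent shots, and the function names are all illustrative assumptions; the patent performs this analysis with its preset convolutional neural network.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def merge_similar_shots(shots, threshold=0.95):
    """Merge each shot into the preceding merged shot when their mean
    feature vectors have cosine similarity >= threshold."""
    merged = []
    for shot in shots:
        if merged and cosine(mean_vector(merged[-1]), mean_vector(shot)) >= threshold:
            merged[-1] = merged[-1] + shot  # same content: extend the previous shot
        else:
            merged.append(list(shot))       # new content: start a new merged shot
    return merged


# Hypothetical shots: the first two show nearly the same scene.
shots = [
    [[1.0, 0.0], [0.9, 0.1]],
    [[1.0, 0.1]],
    [[0.0, 1.0], [0.1, 1.0]],
]
merged = merge_similar_shots(shots)
print(len(shots), "->", len(merged))  # 3 -> 2
```

Working on image features rather than on per-shot sentences is what the abstract means by converting a natural language problem into an image problem: duplicates are collapsed before the description model runs.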

Description

Technical field

[0001] The present invention mainly relates to the technical field of video processing, and in particular to a video description data processing method, device and storage medium.

Background technique

[0002] At present, video description mainly involves indicators such as machine translation accuracy and sentence fluency, but fluency remains a thorny problem. The existing technology generates several shot data sets by segmenting the video into shots, inputs each shot into the convolutional neural network to generate a series of features, and then inputs these features into the video description model to generate sentences. This process involves complex calculations. Moreover, if two highly similar shots are input separately, each pass through the convolutional neural network generates many similar features. In the video description model, describing each feature makes ...

Application Information

IPC(8): G06K9/00, G06K9/62, G06N3/04
CPC: G06V20/41, G06V20/46, G06V20/49, G06N3/045, G06F18/22
Inventor: 蔡晓东, 黄庆楠
Owner: GUILIN UNIV OF ELECTRONIC TECH