Video description generation method, device and equipment and storage medium

A technology relating to video description and target video, applied in the fields of TV, TV system components, instruments, etc., which can solve the problem of poor video description quality and improve the accuracy and quality of generated descriptions.

Active Publication Date: 2019-06-11
TENCENT TECH (SHENZHEN) CO LTD
Cites: 7; Cited by: 14

AI Technical Summary

Problems solved by technology

[0005] The embodiments of the present application provide a method, apparatus, device, and storage medium for generating a video description, which can solve the problem in the related art that video descriptions generated by video description generation models are of poor quality.

Examples

Detailed Description of Embodiments

[0038] To make the purpose, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.

[0039] In the field of video description, a common approach is to automatically generate video descriptions using a video description generation model built on an encoding-decoding framework. Such a model can be a soft attention long short-term memory (SA-LSTM) model. In an illustrative example, the process of generating a video description using the SA-LSTM model is shown in Figure 1.

[0040] The SA-LSTM model first performs feature extraction on the input video 11 to obtain the visual features 12 of the video 11 (v_1, v_2, ..., v_n). Then, according to the last hidden s...
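As a generic illustration of what "soft attention" over the visual features v_1, ..., v_n usually means, the sketch below weights each feature by a softmax over relevance scores and returns the weighted average as a context vector. This is not taken from the patent text: the function name `soft_attention`, the dot-product scoring, and the toy vectors are all assumptions made for the example.

```python
import math


def soft_attention(hidden, features):
    """Return (weights, context) for one decoding step.

    hidden:   list of floats standing in for the decoder's previous hidden state
    features: list of feature vectors v_1..v_n, each the same length as hidden
    Scoring by dot product is an assumption made for this sketch.
    """
    # Relevance score of each feature with respect to the hidden state.
    scores = [sum(h * x for h, x in zip(hidden, v)) for v in features]
    # Softmax over the scores (subtract the max for numerical stability).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted average of the features.
    dim = len(features[0])
    context = [sum(w * v[d] for w, v in zip(weights, features)) for d in range(dim)]
    return weights, context


# Toy input: three 2-dimensional features and a 2-dimensional hidden state.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [2.0, 0.0]
weights, context = soft_attention(hidden, features)
```

With these toy values the first and third features receive equal weight and the second receives less, because its dot product with the hidden state is smaller.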

Abstract

The invention discloses a video description generation method, apparatus, device, and storage medium. The method comprises the following steps: encoding a target video through an encoder of a video description generation model to obtain a target visual feature of the target video; decoding the target visual feature through a basic decoder of the video description generation model to obtain a first selection probability corresponding to each candidate word, the basic decoder using an attention mechanism to decode the candidate words that match the target visual feature; decoding the target visual feature through an auxiliary decoder of the video description generation model to obtain a second selection probability corresponding to each candidate word, the memory structure of the auxiliary decoder comprising reference visual context information corresponding to each candidate word, and the reference visual context information being generated according to related videos corresponding to the candidate words; determining a decoded word among the candidate words according to the first selection probability and the second selection probability; and generating a video description according to the plurality of decoded words.
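The abstract states that the decoded word is determined from the first and second selection probabilities but does not say how they are combined. The sketch below assumes one simple possibility, a weighted sum followed by taking the highest-scoring candidate. The function name `fuse_and_pick`, the weight `lam`, and the toy probabilities are hypothetical and do not come from the source.

```python
def fuse_and_pick(p_basic, p_aux, lam=0.5):
    """Pick the candidate word with the highest combined score.

    p_basic: dict word -> first selection probability (basic decoder)
    p_aux:   dict word -> second selection probability (auxiliary decoder)
    lam:     hypothetical weight on the auxiliary decoder (an assumption)
    """
    if p_basic.keys() != p_aux.keys():
        raise ValueError("both decoders must score the same candidate words")
    # Assumed combination rule: convex combination of the two probabilities.
    scores = {w: (1.0 - lam) * p_basic[w] + lam * p_aux[w] for w in p_basic}
    best = max(scores, key=scores.get)
    return best, scores


# Toy probabilities, invented for illustration only.
p_basic = {"person": 0.5, "woman": 0.3, "dog": 0.2}
p_aux = {"person": 0.2, "woman": 0.7, "dog": 0.1}

word, scores = fuse_and_pick(p_basic, p_aux, lam=0.5)
word_basic_only, _ = fuse_and_pick(p_basic, p_aux, lam=0.0)
```

In this toy case the basic decoder alone would pick "person", while the combined score picks "woman", which illustrates how the auxiliary decoder's probability can change the selected word.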

Description

Technical field

[0001] The embodiments of the present application relate to the field of video description, and in particular to a method, apparatus, device, and storage medium for generating a video description.

Background

[0002] Video captioning is a technology for generating content description information for videos. In the field of artificial intelligence, video description generation models are usually used to automatically generate video descriptions for videos, and these models are mostly based on the encoder-decoder framework.

[0003] When a video description generation model is applied, it first extracts the visual features of the video through the encoder and then inputs the extracted visual features into the decoder. The decoder generates decoded words in turn according to the visual features, and finally the individual generated decoded words are comb...
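The word-by-word decoding described in the background can be sketched as a simple loop: at each step a decoder returns a probability for every candidate word, one word is chosen, and the chosen words are joined into the description. Everything named here (`generate_description`, `toy_step`, the `<eos>` end token, greedy selection, and the scripted probabilities) is a hypothetical stand-in for a trained model, not the patent's implementation.

```python
def generate_description(step_fn, visual_features, max_len=20, end_token="<eos>"):
    """Generate a description one word at a time with greedy selection.

    step_fn: callable (visual_features, words_so_far, state) -> (probs, state),
             where probs maps each candidate word to its probability.
    """
    words = []
    state = None
    for _ in range(max_len):
        probs, state = step_fn(visual_features, words, state)
        word = max(probs, key=probs.get)  # greedy choice (an assumption)
        if word == end_token:
            break
        words.append(word)
    return " ".join(words)


def toy_step(visual_features, words, state):
    """Stand-in decoder that favors the next word of a fixed script."""
    script = ["a", "man", "is", "running", "<eos>"]
    target = script[min(len(words), len(script) - 1)]
    probs = {w: (0.9 if w == target else 0.025) for w in script}
    return probs, state


description = generate_description(toy_step, visual_features=[[0.1, 0.2]])
```

A real decoder would compute the probabilities from the visual features and its recurrent state; the toy step ignores both so that the loop itself stays easy to follow.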

Application Information

IPC(8): H04N21/234, H04N21/2343, H04N21/235, H04N21/435, H04N21/44, H04N21/4402, G06V10/75, G06V10/772
CPC: H04N21/235, H04N21/4884, H04N21/251, H04N21/234336, H04N21/23418, H04N5/278, G06V20/41, G06V10/75, G06V20/635, G06V10/772, G06F18/22, H04N21/488, H04N21/2343, H04N21/435, H04N21/4402, H04N21/8549, G06V20/47, G06F18/28, G06F18/253
Inventor 裴文杰, 张记袁, 柯磊, 戴宇荣, 沈小勇, 贾佳亚, 王向荣
Owner TENCENT TECH (SHENZHEN) CO LTD