Video description generation method based on multi-concept knowledge mining and storage medium

A video description and knowledge mining technology, applied in character and pattern recognition, instruments, computing, etc. It addresses the problem of extracted prior knowledge not covering all the content of the video, and achieves fast training, fast convergence, and improved quality.

Pending Publication Date: 2022-07-12
TONGJI UNIV

AI Technical Summary

Problems solved by technology

Many existing methods obtain prior knowledge mainly by optimizing the processing of video features and text sequences, or by adding extra modality information to help the model generate description sentences. However, the prior knowledge extracted by such methods focuses on only a single constituent element, such as a subject or an action, and cannot cover the entirety of the video.



Examples


Detailed Description of the Embodiments

[0035] The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments. This embodiment is implemented on the premise of the technical solution of the present invention, and provides a detailed implementation manner and a specific operation process, but the protection scope of the present invention is not limited to the following embodiments.

[0036] This embodiment provides a method for generating video descriptions based on multi-concept knowledge mining. The method includes: acquiring an input video to be processed; extracting visual features and semantic labels from the input video; optimizing the semantic labels to obtain prior semantic labels; and using the extracted visual features and the prior semantic labels as the input of a video description generation model based on the Transformer structure to obtain the corresponding description result, wherein the visual features include 2D features and 3D ...
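As a rough illustration of the input assembly described in this embodiment, the following Python sketch pairs 2D and 3D features per video segment and filters candidate semantic labels into a prior label set. The function names, the concatenation-based fusion, and the score-threshold plus top-k label selection are all assumptions made for illustration; the text here does not specify how the features are fused or how the semantic labels are optimized.

```python
from typing import Dict, List

Vector = List[float]


def fuse_visual_features(feats_2d: List[Vector], feats_3d: List[Vector]) -> List[Vector]:
    """Concatenate the 2D and 3D feature vectors of each video segment.

    Concatenation is an assumed fusion strategy, not one stated in the source.
    """
    if len(feats_2d) != len(feats_3d):
        raise ValueError("2D and 3D features must cover the same segments")
    return [f2 + f3 for f2, f3 in zip(feats_2d, feats_3d)]


def select_prior_labels(
    label_scores: Dict[str, float], top_k: int = 3, min_score: float = 0.5
) -> List[str]:
    """Keep the highest-scoring candidate labels as the prior semantic labels.

    Thresholding and top-k selection stand in for the unspecified
    label-optimization step; both parameters are hypothetical.
    """
    kept = [(label, score) for label, score in label_scores.items() if score >= min_score]
    # Sort by descending score, breaking ties alphabetically for determinism.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [label for label, _ in kept[:top_k]]


def build_model_input(
    feats_2d: List[Vector], feats_3d: List[Vector], label_scores: Dict[str, float]
) -> Dict[str, list]:
    """Bundle fused visual features and prior labels for a description model."""
    return {
        "visual": fuse_visual_features(feats_2d, feats_3d),
        "labels": select_prior_labels(label_scores),
    }
```

Calling `build_model_input` with two segments of 2D and 3D features and a dictionary of label scores returns one fused vector per segment together with the retained labels, which a Transformer-based description model could then consume as its two input streams.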


Abstract

The invention relates to a video description generation method based on multi-concept knowledge mining and to a storage medium. The method comprises: obtaining an input video to be processed; extracting visual features and semantic tags from the input video; optimizing the semantic tags to obtain prior semantic tags; and using the extracted visual features and the prior semantic tags as input to a video description generation model based on a Transformer structure to obtain a corresponding description result, where the visual features comprise 2D features and 3D features. When the video description generation model is trained, video-text knowledge, video-video knowledge and text-text knowledge are mined from the training samples, and the parameters of a multi-head self-attention layer and of a word embedding layer in the model are optimized. Compared with the prior art, the method offers high topic relevance, high semantic richness and fast training, among other advantages.
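For readers unfamiliar with the self-attention layer mentioned in the abstract, the sketch below shows standard single-head scaled dot-product self-attention in plain Python. It follows the generic Transformer formulation, not necessarily the layer used in this patent; the helper names and the projection matrices `w_q`, `w_k`, `w_v` are assumptions for illustration. A multi-head layer runs several such heads with separate projections and concatenates their outputs.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m: Matrix) -> Matrix:
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: Matrix, w_q: Matrix, w_k: Matrix, w_v: Matrix) -> Matrix:
    """Single-head scaled dot-product self-attention over a token sequence x.

    Each row of x is one token (e.g. a visual feature or a label embedding).
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    # Scale by sqrt(d_k) before the softmax, as in the standard formulation.
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)
```

With identity projection matrices, each output row is a weighted average of the input tokens, with the largest weight on the token most similar to the query.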

Description

Technical field

[0001] The invention relates to the field of video description generation, in particular to a video description generation method and storage medium based on multi-concept knowledge mining.

Background art

[0002] With the increasing share of video on the Internet, new markets and application prospects are gradually being opened up. Using computers to automatically understand, analyze and process video data has become a pressing technical need. As one of the key tasks of video understanding, video description generation aims to describe what happens in a video in natural language. This task has broad application prospects in early childhood education, the development of assistive devices for the visually impaired, and human-computer interaction. Since it involves both computer vision and natural language processing, there are certain technical difficulties in modeling video information with ti...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06V20/40, G06V10/774, G06V10/764, G06K9/62, G06F40/30
CPC: G06F40/30, G06F18/214, G06F18/24
Inventors: 王瀚漓, 张沁宇
Owner: TONGJI UNIV