Video content description method and system based on frame selection

A frame-selection-based technique for generating natural language descriptions of video content, which addresses the problems that existing approaches waste computing power by processing every frame and that video-summary annotation data are expensive and subjective.

Inactive Publication Date: 2019-03-01
INST OF COMPUTING TECH CHINESE ACAD OF SCI


Problems solved by technology

Such methods waste a great deal of computing power on the one hand, and are susceptible to noise and unimportant visual content on the other.
[0010] 2. Video content description methods based on the attention mechanism, especially on spatial attention, must first extract features from the video frames and then perform a weighted fusion over them. This is possible only when the video is of limited length and can be observed in its entirety, so such methods are not suitable for real-time video or real-world scenarios.
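The weighted-fusion step the paragraph above criticizes can be illustrated with a minimal numpy sketch. The function name `attention_fuse` and the dot-product scoring are illustrative assumptions, not the patent's formulation; the point is that the softmax requires all T frame features up front, which is exactly why the approach breaks down for real-time video.

```python
import numpy as np

def attention_fuse(frame_feats, query):
    """Soft-attention weighted fusion of frame features (sketch).

    frame_feats: (T, D) array of per-frame features. Note that ALL T
    frames must already be available before fusion can happen -- the
    limitation the text points out for streaming video.
    query: (D,) state vector used to score each frame's relevance.
    """
    scores = frame_feats @ query                  # (T,) relevance scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                      # softmax over frames
    return weights, weights @ frame_feats         # (T,) weights, (D,) fused

T, D = 8, 4
rng = np.random.default_rng(0)
weights, fused = attention_fuse(rng.normal(size=(T, D)), rng.normal(size=D))
```

The fused vector is a convex combination of the frame features, so the attention weights always sum to one.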




Embodiment Construction

[0023] In order to make the purpose, technical solution, and advantages of the present invention clearer, the frame-selection-based video content description method and system proposed by the present invention are described in further detail below in conjunction with the accompanying drawings. It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.

[0024] The present invention adopts a task-driven video frame selection technique. The information content of a video frame can be defined according to the needs of the task, and the technique then selects the required set of high-information video frames according to that task-defined measure, reducing redundant computation while also improving the accuracy of task processing; and the uninterrupted video content description technique based on video fram...
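The task-driven selection idea above can be sketched as follows. The information measure is left pluggable because the patent defines it per task; the feature-norm stand-in for "visual richness" and the top-k selection rule are illustrative assumptions only.

```python
import numpy as np

def select_frames(frame_feats, info_score, k):
    """Task-driven frame selection sketch: keep the k frames whose
    task-defined information content is highest, then restore their
    temporal order. `info_score` maps one frame feature to a scalar."""
    scores = np.array([info_score(f) for f in frame_feats])
    keep = np.argsort(scores)[::-1][:k]     # indices of the top-k frames
    return np.sort(keep)                    # back in temporal order

# Hypothetical information measure: the feature norm as a stand-in for
# "visual richness"; the actual definition is task-dependent.
feats = np.array([[0.1, 0.2], [3.0, 4.0], [0.0, 0.1], [1.0, 1.0]])
idx = select_frames(feats, np.linalg.norm, k=2)
# idx -> array([1, 3]): the two most "informative" frames, in order
```

Only the selected frames are passed to downstream processing, which is where the reduction in redundant computation comes from.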



Abstract

The invention relates to a video content description method based on frame selection, comprising the following steps: a feed-forward neural network is used to construct a screening model, and the screening model screens video frames according to their visual richness and semantic consistency; a description model is constructed for describing the content of the video to be described; the screening model and the description model are trained with training data; description frames are selected from the video to be described through the screening model; and the visual features of the description frames are extracted and input into the description model to obtain the description sentence of the video to be described.
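The screening stage of the abstract can be sketched as a small feed-forward scorer. The layer sizes, ReLU/sigmoid choices, and the 0.5 keep-threshold are assumptions for illustration; the abstract specifies only that the screening model is a feed-forward network producing a per-frame decision.

```python
import numpy as np

rng = np.random.default_rng(42)

def ffn_score(feat, W1, b1, w2, b2):
    """One-hidden-layer feed-forward screening model (sketch): maps a
    frame feature vector to a keep-probability in (0, 1)."""
    h = np.maximum(0.0, feat @ W1 + b1)     # ReLU hidden layer
    z = h @ w2 + b2
    return 1.0 / (1.0 + np.exp(-z))         # sigmoid keep-probability

D, H = 16, 8                                # illustrative dimensions
W1, b1 = rng.normal(size=(D, H)), np.zeros(H)
w2, b2 = rng.normal(size=H), 0.0

video = rng.normal(size=(30, D))            # 30 frames of D-dim features
probs = np.array([ffn_score(f, W1, b1, w2, b2) for f in video])
described = video[probs > 0.5]              # frames handed to the description model
```

In the patented pipeline these kept frames would then be featurized and fed to the trained description model; here the weights are random, so the sketch shows only the data flow.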

Description

Technical field

[0001] The invention relates to technologies in the fields of digital image processing and natural language processing, in particular to a technique for generating natural language descriptions of video content.

Background technique

[0002] Video content description (video captioning) is the task of converting video content into natural language. As early as 2002, Kojima et al. proposed the first video content description system, which described human behavior. Since then, a series of studies on image and video description have followed. Early methods used a bottom-up approach: descriptors were first generated through attribute learning or object detection, and these descriptors were then concatenated into a complete sentence by a language model. With the development of neural networks and deep learning, most modern description systems are based on convolutional neural networks and recurrent neural networks, and adopt an encoder...
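The encoder-decoder scheme mentioned in the background can be illustrated with a toy greedy-decoding loop. Everything here is an assumption for illustration: the weights are random, the recurrence is a plain tanh RNN rather than any particular trained model, and the tiny vocabulary is invented, so only the control flow (encoded video feature initializes the state, words are emitted one at a time and fed back) reflects the scheme.

```python
import numpy as np

def greedy_decode(video_feat, Wxh, Whh, Who, vocab, max_len=5):
    """Toy RNN decoder sketch of encoder-decoder captioning: the encoded
    video feature initializes the hidden state, then words are chosen
    greedily and fed back as one-hot inputs. Illustrative only."""
    h = np.tanh(video_feat)                 # encoder output -> initial state
    x = np.zeros(Wxh.shape[0])              # start token: all-zeros input
    words = []
    for _ in range(max_len):
        h = np.tanh(x @ Wxh + h @ Whh)      # recurrent state update
        w = int(np.argmax(h @ Who))         # greedy word choice
        words.append(vocab[w])
        x = np.eye(Wxh.shape[0])[w]         # feed chosen word back in
    return words

rng = np.random.default_rng(1)
V, H = 4, 6                                 # invented vocab and state sizes
vocab = ["a", "man", "plays", "guitar"]
caption = greedy_decode(rng.normal(size=H), rng.normal(size=(V, H)),
                        rng.normal(size=(H, H)), rng.normal(size=(H, V)), vocab)
```

With random weights the output sequence is meaningless; a real captioner trains these matrices so that the greedy (or beam-search) loop emits a fluent sentence.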


Application Information

IPC(8): G06K9/00; G06K9/62; G06F16/332
CPC: G06V20/46; G06N3/045; G06F18/214
Inventors: 王树徽, 陈扬羽, 黄庆明, 张维刚
Owner INST OF COMPUTING TECH CHINESE ACAD OF SCI