Video question-answering method based on attention model

An attention model and video technology, applied in biological neural network models, character and pattern recognition, instruments, etc., that strengthens the connection between visual and semantic information and between video frames and the question, and thereby achieves a good effect.

Inactive Publication Date: 2018-03-20

TIANJIN UNIV

7 Cites, 32 Cited by

AI Technical Summary

Problems solved by technology

Existing methods ignore the differences between video frames when generating semantic descriptions of a video.

Embodiment Construction

[0027] The present invention will be further described below in conjunction with the accompanying drawings.

[0028] Figure 1 is a general overview diagram of the attention model-based video question answering method of the present invention. The invention is built on an encoder-decoder framework and learns the visual and semantic information of the video end to end, so that it can select a suitable answer from the options for a given video and its corresponding question. First, a video capture tool extracts frames from the video and the frames are sampled; an independently designed frame model then produces the feature vectors of the video; in the encoding stage, these feature vectors are fed into a long short-term memory (LSTM) network to obtain the scene feature representation of the video, which serves as the initialization input of the text model in the decoding stage; the te...
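The sampling and encoding steps described in [0028] can be illustrated with a minimal Python sketch. It is not the patent's implementation: the uniform sampling rule, the random weights, the tiny dimensions, and every name (`sample_frame_indices`, `lstm_step`, `init_params`, `encode_video`) are assumptions made for illustration, and random vectors stand in for the output of the frame model, which the text does not specify.

```python
import math
import random

def sample_frame_indices(num_frames, num_samples):
    """Pick evenly spaced frame indices (assumed sampling rule)."""
    if num_samples >= num_frames:
        return list(range(num_frames))
    step = num_frames / num_samples
    return [int(i * step) for i in range(num_samples)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]

def init_params(input_dim, hidden_dim, seed=0):
    """Random weights for the four LSTM gates (untrained, illustration only)."""
    rng = random.Random(seed)
    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(input_dim + hidden_dim)]
                for _ in range(hidden_dim)]
    return {name: (mat(), [0.0] * hidden_dim) for name in ("i", "f", "o", "g")}

def lstm_step(params, x, h, c):
    """One standard LSTM step on the concatenated vector [x; h]."""
    z = x + h  # list concatenation
    gates = {}
    for name, (weights, bias) in params.items():
        pre = [p + b for p, b in zip(matvec(weights, z), bias)]
        act = math.tanh if name == "g" else sigmoid
        gates[name] = [act(p) for p in pre]
    c_new = [f * c_old + i * g
             for f, c_old, i, g in zip(gates["f"], c, gates["i"], gates["g"])]
    h_new = [o * math.tanh(cn) for o, cn in zip(gates["o"], c_new)]
    return h_new, c_new

def encode_video(frame_features, params, hidden_dim):
    """Run the LSTM over the frame features; the final state is the scene representation."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for x in frame_features:
        h, c = lstm_step(params, x, h, c)
    return h, c

# Usage: 10 frames sampled from a 300-frame video, 4-dim stand-in features.
indices = sample_frame_indices(num_frames=300, num_samples=10)
rng = random.Random(1)
frame_features = [[rng.random() for _ in range(4)] for _ in indices]
params = init_params(input_dim=4, hidden_dim=3)
scene_h, scene_c = encode_video(frame_features, params, hidden_dim=3)
```

In this sketch, `scene_h` and `scene_c` play the role of the scene feature representation that would initialize the decoding-stage text model.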

Abstract

The invention discloses a video question-answering method based on an attention model. The method is built on an encoder-decoder framework and learns the visual and semantic information of a video end to end, which strengthens the connection between the two. An independently designed frame model extracts the feature vectors of the video. In the encoding stage, a long short-term memory (LSTM) network learns the scene feature representation of the video, which is passed as the initial state to the text model in the decoding stage. An attention mechanism is added to the text model; it strengthens the connection between the video frames and the question, so that the semantic information of the video can be analyzed better. The video question-answering method based on the attention model therefore yields good results.
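The idea of relating video frames to the question through attention can be sketched as follows. The abstract does not state the form of the attention function, so this example substitutes plain dot-product attention with a softmax; the vector representations of the question and the answer options, and the names `attend` and `choose_answer`, are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(question_vec, frame_feats):
    """Weight each frame by its similarity to the question (assumed dot-product attention)."""
    weights = softmax([dot(question_vec, f) for f in frame_feats])
    dim = len(frame_feats[0])
    context = [sum(w * f[d] for w, f in zip(weights, frame_feats))
               for d in range(dim)]
    return weights, context

def choose_answer(context, option_vecs):
    """Return the index of the option whose vector best matches the attended context."""
    scores = [dot(context, o) for o in option_vecs]
    return max(range(len(scores)), key=scores.__getitem__)

# Usage with toy 2-dim vectors: the question is most similar to frame 0.
frame_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
question_vec = [2.0, 0.0]
weights, context = attend(question_vec, frame_feats)
option_vecs = [[0.0, 1.0], [1.0, 0.0]]
best = choose_answer(context, option_vecs)
```

Here the attention weights sum to one, so frames that match the question contribute more to the context vector than frames that do not, which is the difference between frames that the method aims to capture.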

Description

Technical field

[0001] The invention relates to the fields of computer vision and multimedia analysis, in particular to a video question answering method based on an attention model.

Background technique

[0002] Video analysis is an important research topic in the field of computer vision and multimedia analysis, and it is also a very challenging hot issue. With the rapid growth of video data, video analysis has attracted people's attention. As a medium of video analysis, video question answering has attracted a lot of attention in recent years.

[0003] Video question answering refers to the process of giving appropriate answers to a given video and a question posed to the video by obtaining their visual information and semantic information. When people watch a video, they will obtain the characters, objects, environment, etc. appearing in the video through the scene information displayed by the video frame. The visual information brought by the scene enables people to h...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06K9/00, G06K9/62, G06N3/04

CPC: G06V20/46, G06V20/41, G06N3/048, G06F18/214

Inventor: 韩亚洪, 高昆

Owner: TIANJIN UNIV
