Video question answering method based on object-oriented double-flow attention network

An object-oriented attention technique, applied to neural learning methods, biological neural network models, digital video signal processing, etc. It addresses the problem that image question-answering approaches cannot be applied directly to video question-answering tasks, and improves the ability to explore intra-modal interactions and inter-modal semantic alignment.

Pending Publication Date: 2022-05-03
HANGZHOU DIANZI UNIV

AI Technical Summary

Benefits of technology

This technology represents video content with two parallel streams: one captures the static appearance of foreground objects, the other their dynamic behavior. By modeling both properties at once, together with each object's spatio-temporal position and scene context, the method captures interactions among objects and their surroundings (such as backgrounds or scenes), yields more accurate representations of real-world events such as human actions and movements, and improves learning from large collections of video data.

Problems solved by technology

Video question answering (video QA) requires understanding not only how objects, backgrounds, and other parts of a scene appear in individual frames, but also how they relate to one another over time during question answering. Existing models have difficulty capturing such complex behaviors involving multiple modalities of interest.



Examples


Embodiment

[0118] As shown in Figure 1 and Figure 2, a video question answering method based on an object-oriented two-stream attention network comprises the following steps:

[0119] Step (1): preprocess the input data. For an input video, first sample the video frames by uniform (average) sampling; in the present invention, T = 10 frames are sampled per video. Next, the Faster R-CNN object detection algorithm is applied to each frame to generate target objects, yielding multiple candidate boxes. In addition, convolutional networks extract static appearance features and dynamic behavior features for each video frame: a ResNet-152 network trained on the ImageNet image library extracts the static appearance features, and an I3D network trained on the Kinetics action-recognition dataset extracts the dynamic behavior features. Finally, the RoIAlign method ...
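The preprocessing in Step (1) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the real pipeline uses Faster R-CNN for boxes, ResNet-152 and I3D for features, and RoIAlign for region pooling, all of which are stubbed here with random placeholders; the function names and feature dimensions are assumptions.

```python
import numpy as np

def uniform_sample_indices(num_frames: int, t: int = 10) -> np.ndarray:
    """Pick T frame indices evenly spaced over the video (average sampling)."""
    return np.linspace(0, num_frames - 1, t).round().astype(int)

def preprocess_video(video: np.ndarray, t: int = 10,
                     app_dim: int = 2048, mot_dim: int = 1024, n_obj: int = 5):
    """Stub of Step (1): sample T frames, then produce per-frame appearance
    and motion features plus per-object region features.

    In the patent, appearance features come from ResNet-152 (ImageNet),
    motion features from I3D (Kinetics), candidate boxes from Faster R-CNN,
    and region features from RoIAlign; here they are random placeholders.
    """
    idx = uniform_sample_indices(len(video), t)
    frames = video[idx]                                # (T, H, W, 3)
    appearance = np.random.randn(t, app_dim)           # ResNet-152 stand-in
    motion = np.random.randn(t, mot_dim)               # I3D stand-in
    object_feats = np.random.randn(t, n_obj, app_dim)  # RoIAlign stand-in
    return frames, appearance, motion, object_feats

video = np.zeros((120, 224, 224, 3))                   # dummy 120-frame clip
frames, app, mot, objs = preprocess_video(video)
print(frames.shape, app.shape, mot.shape, objs.shape)
```

Uniform sampling keeps the first and last frames and spaces the rest evenly, so a 120-frame clip is reduced to T = 10 representative frames before any feature extraction.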


Abstract

The invention discloses a video question answering method based on an object-oriented two-stream attention network. The visual content of a video is represented with a two-stream mechanism: one stream is the static appearance stream of the foreground objects, and the other is the dynamic behavior stream of those objects. In each stream, an object's features include not only the feature of the object itself but also the object's spatio-temporal encoding and the context features of the scene in which it appears, so that the relative spatio-temporal relations and context-aware relations between objects can be explored during the deep feature extraction performed by subsequent graph convolution operations. Meanwhile, the two-stream mechanism addresses the problem that previous video question-answering models consider only the static characteristics of objects and lack analysis of dynamic information. The method improves the ability to explore intra-modal interaction and inter-modal semantic alignment, and obtains better results on related video question answering datasets.
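The per-object representation described above can be sketched as a concatenation of three parts: the object's own feature, a spatio-temporal code, and the scene context feature. This is an illustrative sketch only; the particular six-dimensional box-plus-time encoding, the function names, and the feature dimensions are assumptions, not the patent's exact formulation.

```python
import numpy as np

def spatiotemporal_code(box, t_idx: int, num_frames: int) -> np.ndarray:
    """Encode an object's normalized box (x1, y1, x2, y2), its area, and its
    frame position as a small spatio-temporal vector (an assumed encoding)."""
    x1, y1, x2, y2 = box
    return np.array([x1, y1, x2, y2, (x2 - x1) * (y2 - y1), t_idx / num_frames])

def object_node(obj_feat, box, t_idx, num_frames, scene_feat) -> np.ndarray:
    """One stream's object node: the object's own feature, its spatio-temporal
    code, and the scene (context) feature, concatenated."""
    code = spatiotemporal_code(box, t_idx, num_frames)
    return np.concatenate([obj_feat, code, scene_feat])

obj = np.random.randn(2048)    # e.g. appearance-stream region feature
scene = np.random.randn(2048)  # frame-level context feature
node = object_node(obj, (0.1, 0.2, 0.5, 0.8), t_idx=3, num_frames=10,
                   scene_feat=scene)
print(node.shape)  # (4102,) = 2048 + 6 + 2048
```

The same construction is applied in both streams, with the object feature coming from the appearance features in one stream and from the motion features in the other, so that later relational reasoning can compare objects across space, time, and context.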


Application Information

Owner HANGZHOU DIANZI UNIV