
Dual-channel semantic positioning multi-granularity attention mutual enhancement video question answering method and system

A multi-granularity, dual-channel technique, applied in fields such as semantic analysis, neural learning methods, and biological neural network models, that addresses problems such as interference between relationships, missing links between contextual information, and insufficient multi-channel feature fusion and reasoning mechanisms.

Pending Publication Date: 2022-02-08
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

The present invention addresses three main problems. First, existing models build graphs over the video without attending to the question, so the graphs contain excessive redundant information, relationships interfere with one another, and model performance suffers. Second, existing models lack fine-grained representations and connections between contextual information. Third, existing models have inadequate mechanisms for multi-channel feature fusion and reasoning over video.



Embodiment Construction

[0064] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

[0065] Figure 1 is the general flowchart of the video question answering method with dual-channel semantic positioning and multi-granularity attention mutual enhancement in an embodiment of the present invention. As shown in Figure 1, the method includes:

[0066] S1: build a multi-channel feature extraction module, use a pre-trained network to extract multi-channel features, input video question answering datasets and related data info...


Abstract

The invention discloses a video question answering method and system based on dual-channel semantic positioning and multi-granularity attention mutual enhancement. The method comprises the following steps: constructing a feature extraction and encoding module to extract features; constructing a coarse- and fine-grained global retrieval module to compute attention over the features; constructing a graph attention network module to enhance key information shared between the features; constructing a local positioning module to locate the key feature information of each channel; and constructing a fusion attention module to fuse the enhanced features at multiple levels, after which an answer prediction module produces the answer index. Through this multi-module design, feature information of different granularities in a video is divided into a visual semantic channel and a text semantic channel, an auxiliary positioning mechanism is designed for each channel, and the feature information most relevant to the question is obtained by enhancing shared representations. A graph attention network applies correlated mutual attention to global and local information, so that features related to the question within the same time and space are expressed optimally, easing the current difficulties of video question answering.
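The retrieval, fusion, and answer prediction steps named in the abstract can be illustrated with a minimal sketch. The patent text shown here does not give the actual formulas, so everything below is an assumption made for illustration: scaled dot-product attention stands in for the coarse- and fine-grained retrieval module, concatenation stands in for the fusion attention module, and a dot-product score over candidate answers stands in for the answer prediction module. The function names (`attend`, `coarse_fine_retrieval`, `predict_answer_index`) are hypothetical, and the graph attention and local positioning modules are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(query, features):
    """Scaled dot-product attention of one query vector over feature vectors.

    Returns the attention-weighted sum of the features and the weights.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, f) / scale for f in features])
    dim = len(features[0])
    pooled = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return pooled, weights


def coarse_fine_retrieval(question, clip_feats, frame_feats):
    """Attend with the question over clip-level (coarse) and frame-level (fine)
    features, then fuse the two summaries by concatenation (an assumption)."""
    coarse, clip_weights = attend(question, clip_feats)
    fine, frame_weights = attend(question, frame_feats)
    fused = coarse + fine  # list concatenation, not element-wise addition
    return fused, clip_weights, frame_weights


def predict_answer_index(fused, candidate_embeddings):
    """Return the index of the candidate answer with the highest dot-product score."""
    scores = [dot(fused, c) for c in candidate_embeddings]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy inputs: 2-dimensional embeddings chosen by hand, not real model outputs.
question = [1.0, 0.0]
clip_feats = [[1.0, 0.0], [0.0, 1.0]]
frame_feats = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
candidates = [[0.0, 1.0, 0.0, 1.0], [1.0, 0.0, 1.0, 0.0]]

fused, clip_weights, frame_weights = coarse_fine_retrieval(question, clip_feats, frame_feats)
answer_index = predict_answer_index(fused, candidates)
```

In this toy setup the question vector is aligned with the first clip and the first frame, so those receive the largest attention weights. A real system would replace the hand-written vectors with features from pre-trained visual and text encoders and would learn the projection and scoring parameters.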

Description

Technical field

[0001] The invention relates to the field of video question answering, and in particular to a video question answering method and system based on dual-channel semantic positioning and multi-granularity attention mutual enhancement.

Background

[0002] Video has become the most important carrier of multimedia information and is widely used in daily life. Because it spans both time and space, video has high information density, diverse categories, changeable content, and complex structure. Users cannot quickly locate the video content they are interested in, and these properties also pose new challenges for video understanding tasks such as video description, classification, and recognition. Video question answering (VideoQA) is a fine-grained video understanding task that follows video description. Compared with the general description produced in the video description task, video question answering not only needs to be able...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/332, G06F16/33, G06F16/35, G06F16/783, G06F40/30, G06N3/04, G06N3/08
CPC: G06F16/3329, G06F16/3346, G06F16/35, G06F16/7844, G06F40/30, G06N3/084, G06N3/045
Inventor: 周凡, 张富为, 林格, 王若梅, 林谋广
Owner: SUN YAT SEN UNIV