
A Visual Dialogue Generation Method Based on Context-Aware Graph Neural Network

A context-aware visual dialogue technology in the field of computer vision. It addresses the problems that existing methods do not consider word-level semantics, lack learning of the semantic dependencies between visual objects, and do not consider the interdependence between those objects, so as to obtain more accurate visual semantic information.

Active Publication Date: 2021-06-08
HEFEI UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0005] For example, in 2017, Jiasen Lu et al. published the article "Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model" at the top international conference Conference and Workshop on Neural Information Processing Systems (NIPS 2017), which uses a history-conditioned image attention method: it first performs sentence-level attention over the historical dialogue, and then performs attention learning on the image features based on the processed text features. However, when processing the text of the current question, this method considers only sentence-level semantics and ignores word-level semantics, even though in an actual question usually only a few keywords are most relevant to the predicted answer.
Therefore, this method has certain limitations in practical application.
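The following is a minimal sketch, in Python with NumPy, of the two-stage attention scheme described above: sentence-level attention over the dialogue history, followed by attention over image features guided by the attended text. The tensor shapes, the dot-product scoring, and the additive fusion are illustrative assumptions, not details taken from the cited paper.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def two_stage_attention(question_vec, history_vecs, image_feats):
    # question_vec: (d,) embedding of the current question (sentence level)
    # history_vecs: (T, d) one vector per history sentence
    # image_feats:  (K, d) K image region features
    # Stage 1: sentence-level attention over the dialogue history.
    hist_weights = softmax(history_vecs @ question_vec)      # (T,)
    attended_history = hist_weights @ history_vecs           # (d,)
    # Fuse the question with the attended history into a text query
    # (simple addition is an assumed fusion).
    text_query = question_vec + attended_history             # (d,)
    # Stage 2: attention over image features guided by the text query.
    img_weights = softmax(image_feats @ text_query)          # (K,)
    attended_image = img_weights @ image_feats               # (d,)
    return attended_image

Because the attention operates only on whole-sentence vectors, individual keywords in the question receive no separate weight, which is exactly the word-level limitation noted above.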
[0006] 2. Existing methods lack learning of the semantic dependencies between visual objects when processing image information.
Although the method proposed in that paper effectively models the semantic dependencies between different dialogue fragments, it considers only text-level interdependence and does not consider the interdependence between different visual objects in the image. As a result, the visual semantic information cannot be learned at a finer granularity, which limits the final generation of the predicted answer.
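As an illustration of the interdependence that is missing, the sketch below builds a simple pairwise relation (adjacency) matrix over detected object features, conditioned on a dialogue context vector. The elementwise gating and the row normalization are assumptions made for illustration only; they are not the patent's exact graph construction.

import numpy as np

def object_relation_matrix(object_feats, context_vec):
    # object_feats: (N, d) features of N detected visual objects
    # context_vec:  (d,) dialogue/text context vector
    # Condition each object on the textual context (elementwise gating is
    # an assumed fusion; the patent's fusion may differ).
    conditioned = object_feats * context_vec                 # (N, d)
    # Pairwise similarities become the edge weights of a fully connected
    # graph over the objects.
    scores = conditioned @ conditioned.T                     # (N, N)
    scores = scores - scores.max(axis=-1, keepdims=True)
    adjacency = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
    return adjacency                                         # row-normalized edges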

Method used



Examples


Embodiment Construction

[0086] In this embodiment, as shown in Figure 1, a visual dialogue generation method based on a context-aware graph neural network is carried out as follows:

[0087] Step 1. Preprocessing of text input in visual dialogue and construction of word list:

[0088] Step 1.1. Obtain visual dialogue datasets from the Internet. The main publicly available dataset is VisDialDataset, collected by researchers at the Georgia Institute of Technology; the visual dialogue dataset contains sentence text and images;

[0089] Perform word segmentation processing on all sentence texts in the visual dialogue dataset to obtain segmented words;

[0090] Step 1.2. From the segmented words, screen out all words whose word frequency is greater than a threshold (the threshold can be set to 4) and build the word index table Voc. Method for creating the word index table Voc: the word table can contain words and punctuation marks; count the number of words an...
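A minimal sketch of Step 1.2, assuming a frequency threshold of 4 and a small set of conventional special tokens (the special tokens and the exact counting rules are assumptions; the truncated text above does not specify them):

from collections import Counter

def build_voc(segmented_sentences, threshold=4):
    # segmented_sentences: iterable of token lists (words and punctuation)
    counts = Counter(tok for sent in segmented_sentences for tok in sent)
    # A few conventional special tokens (assumed, not specified in the patent).
    voc = {"<pad>": 0, "<unk>": 1, "<start>": 2, "<end>": 3}
    for token, freq in counts.items():
        if freq > threshold and token not in voc:
            voc[token] = len(voc)
    return voc

For example, calling build_voc on the list of segmented sentences from Step 1.1 returns a token-to-index dictionary that maps each sufficiently frequent word or punctuation mark to an integer index, which can then be used to encode questions and answers.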



Abstract

The invention discloses a visual dialogue generation method based on a context-aware graph neural network, comprising the following steps: 1. preprocessing of the text input in visual dialogue and construction of a word list; 2. feature extraction of dialogue images and dialogue text; 3. obtaining historical dialogue context feature vectors; 4. constructing a context-aware graph; 5. iteratively updating the context-aware graph; 6. attending to context-aware graph nodes based on the current question; 7. multimodal semantic fusion and decoding to generate the answer feature sequence; 8. parameter optimization of the visual dialogue generation network model based on the context-aware graph neural network; 9. predicted answer generation. The invention builds a context-aware graph neural network for visual dialogue, which can use finer-grained text semantic information to infer the implicit relationships between different objects in the image, thereby improving the reasonableness and accuracy of the answers that the agent predicts and generates in response to the question.
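The sketch below illustrates, under assumed shapes and update rules, the general technique behind steps 4 to 6 of the abstract: visual-object nodes connected by context-aware edge weights are updated iteratively by aggregating neighbor information, and the current question then attends over the updated nodes to obtain a fused visual context. It is an illustration of the technique, not the patent's exact network.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def update_context_aware_graph(node_feats, adjacency, num_iters=2):
    # node_feats: (N, d) visual-object node features
    # adjacency:  (N, N) row-normalized, context-aware edge weights
    h = node_feats
    for _ in range(num_iters):
        messages = adjacency @ h        # aggregate information from neighbors
        h = np.tanh(h + messages)       # residual-style update (assumed rule)
    return h

def attend_graph_nodes(question_vec, node_states):
    # Question-guided attention over the updated graph nodes (step 6).
    weights = softmax(node_states @ question_vec)            # (N,)
    return weights @ node_states                             # fused visual context (d,)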

Description

Technical field

[0001] The invention belongs to the technical field of computer vision, relates to technologies such as pattern recognition, natural language processing, and artificial intelligence, and specifically relates to a visual dialogue generation method based on a context-aware graph neural network.

Background technique

[0002] Visual dialogue is a form of human-computer interaction whose purpose is to enable a machine agent and a human to conduct a reasonable and correct natural dialogue, in the form of questions and answers, about a given everyday scene image. Therefore, how to make the agent correctly understand the multi-modal semantic information composed of images and text, so as to give reasonable answers to the questions raised by humans, is the key to the visual dialogue task. Visual dialogue is currently one of the hot research topics in the field of computer vision, and its application scenarios are very extensive, including helping visually impaired...

Claims


Application Information

Patent Type & Authority: Patent (China)
IPC(8): G06F16/332, G06F16/583, G06F40/30, G06F40/211, G06F40/284, G06N3/04, G06N3/08
CPC: G06F16/3329, G06F16/5846, G06N3/08, G06N3/044
Inventor: Guo Dan, Wang Hui, Wang Meng
Owner: HEFEI UNIV OF TECH