A Visual Dialogue Generation Method Based on Context-Aware Graph Neural Network

A context-aware visual dialogue technology, applied in the field of computer vision, that addresses shortcomings of existing methods such as ignoring word-level semantics, not learning the semantic dependencies between visual objects, and not considering their interdependence, so as to obtain more accurate visual semantic information.

Active Publication Date: 2021-06-08

HEFEI UNIV OF TECH

Cites: 6; Cited by: 0


Problems solved by technology

[0005] For example, in 2017, Jiasen Lu et al. published the article "Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model" at the top international Conference and Workshop on Neural Information Processing Systems (NIPS 2017). The image attention method proposed in that article first applies sentence-level attention to the dialogue history and then learns attention over the image features based on the processed text features. However, when processing the text of the current question, this method considers only sentence-level semantics and ignores word-level semantics, whereas in an actual question usually only a few keywords are most relevant to the predicted answer.

Therefore, this method has certain limitations in practical application.
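
The two-stage attention described in [0005] can be illustrated with a minimal sketch. It is not the implementation from Lu et al.: the dot-product scoring, the element-wise sum used to fuse the question with the attended history, and all function names are assumptions made for illustration. Note that the question enters as a single sentence-level vector, with no per-word weighting, which is the limitation discussed above.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, items):
    """Dot-product attention (assumed scoring): pool items by similarity to query."""
    weights = softmax([dot(query, item) for item in items])
    dim = len(items[0])
    pooled = [sum(w * item[d] for w, item in zip(weights, items)) for d in range(dim)]
    return pooled, weights


def history_then_image_attention(question_vec, history_vecs, region_vecs):
    # Stage 1: sentence-level attention over the dialogue history.
    history_ctx, history_w = attend(question_vec, history_vecs)
    # Fuse question and attended history (assumed here: element-wise sum).
    text_vec = [q + h for q, h in zip(question_vec, history_ctx)]
    # Stage 2: attention over image region features, guided by the text feature.
    image_ctx, region_w = attend(text_vec, region_vecs)
    return image_ctx, history_w, region_w
```

With a question vector aligned to the first history sentence and the first image region, both stages place the larger weight on index 0.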

[0006] 2. Existing methods do not learn the semantic dependencies between visual objects when processing image information.

Although the method proposed in that paper effectively models the semantic dependencies between different dialogue segments, it considers interdependence only at the text level and ignores the interdependence between different visual objects in the image. As a result, visual semantic information cannot be learned at a finer granularity, which limits the quality of the final predicted answer.


Embodiment Construction

[0086] In this example, as shown in Figure 1, a visual dialogue generation method based on a context-aware graph neural network is carried out as follows:

[0087] Step 1. Preprocessing of text input in visual dialogue and construction of word list:

[0088] Step 1.1. Obtain a visual dialogue dataset from the Internet. The main publicly available dataset at present is the VisDial dataset, collected by researchers at the Georgia Institute of Technology. The visual dialogue dataset contains sentence text and images;

[0089] Perform word segmentation processing on all sentence texts in the visual dialogue dataset to obtain segmented words;

[0090] Step 1.2. From the segmented words, select all words whose frequency is greater than a threshold (the threshold can be set to 4) and build the word index table Voc. The word index table Voc is created as follows: the word table can contain words and punctuation marks; count the number of words an...
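
Steps 1.1 and 1.2 can be sketched as follows. This is only an illustration under stated assumptions, not the patent's procedure: the regular-expression tokenizer, the reserved `<pad>` and `<unk>` entries, the alphabetical index order, and the function names `tokenize`, `build_voc`, and `encode` are all hypothetical. The only details taken from the text are that words and punctuation marks are kept and that a word must occur more than the threshold (4) times.

```python
import re
from collections import Counter


def tokenize(sentence):
    # Assumed tokenizer: lowercase, then split into word runs and
    # single punctuation marks (punctuation is kept, as in Step 1.2).
    return re.findall(r"[a-z0-9']+|[^\sa-z0-9']", sentence.lower())


def build_voc(sentences, threshold=4, specials=("<pad>", "<unk>")):
    """Build a word index table from words whose frequency exceeds threshold."""
    counts = Counter(tok for s in sentences for tok in tokenize(s))
    kept = sorted(w for w, c in counts.items() if c > threshold)
    # Reserved entries come first (an assumption, not stated in the source).
    voc = {tok: i for i, tok in enumerate(specials)}
    for w in kept:
        voc[w] = len(voc)
    return voc


def encode(sentence, voc):
    """Map a sentence to indices; words outside Voc fall back to <unk>."""
    unk = voc["<unk>"]
    return [voc.get(tok, unk) for tok in tokenize(sentence)]
```

For example, `build_voc(all_sentences, threshold=4)` followed by `encode("is it red?", voc)` yields a list of integer indices that later steps can feed to an embedding layer.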


Abstract

The invention discloses a visual dialogue generation method based on a context-aware graph neural network, comprising the following steps: 1. Preprocessing of the text input in visual dialogue and construction of a word list; 2. Feature extraction of dialogue images and feature extraction of dialogue text; 3. Obtaining the historical dialogue context feature vectors; 4. Constructing the context-aware graph; 5. Iteratively updating the context-aware graph; 6. Attention over the context-aware graph nodes based on the current question; 7. Multimodal semantic fusion and decoding to generate the answer feature sequence; 8. Parameter optimization of the visual dialogue generation network model based on the context-aware graph neural network; 9. Generation of the predicted answer. The invention builds a context-aware graph neural network for visual dialogue, which can use finer-grained text semantic information to infer the implicit relationships between different objects in the image, thereby improving the reasonableness and accuracy of the answers the agent predicts and generates for a question.

Description

Technical field

[0001] The invention belongs to the technical field of computer vision, relates to technologies such as pattern recognition, natural language processing, and artificial intelligence, and specifically relates to a visual dialogue generation method based on a context-aware graph neural network.

Background

[0002] Visual dialogue is a form of human-computer interaction whose purpose is to enable a machine agent and a human to hold reasonable and correct natural dialogues, in question-and-answer form, about a given image of an everyday scene. Therefore, the key to the visual dialogue task is how to make the agent correctly understand the multimodal semantic information composed of images and text so that it can give reasonable answers to the questions raised by humans. Visual dialogue is currently one of the active research topics in the field of computer vision, and its application scenarios are very broad, including: helping visually impaired...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G06F16/332, G06F16/583, G06F40/30, G06F40/211, G06F40/284, G06N3/04, G06N3/08

CPC: G06F16/3329, G06F16/5846, G06N3/08, G06N3/044

Inventors: 郭丹, 王辉, 汪萌

Owner: HEFEI UNIV OF TECH
