Visual dialogue generation system based on semantic alignment

A technology in the field of generation systems and semantics, applied to biological neural network models, natural language data processing, instruments, etc. It can address the problems of not considering the textual quality of visual dialogue, interference, and overlooked effects.

Pending Publication Date: 2020-11-20

HEFEI UNIV OF TECH

0 Cites, 6 Cited by


AI Technical Summary

Problems solved by technology

This processing ignores the gap between the representations of different modalities. If the image features and the semantic information are not well aligned, it remains doubtful whether the extracted image features provide sufficient information to generate a reply.

[0008] 2. Too much reliance on conversation history instead of image information to generate responses

However, although many current models try to extract more, and more targeted, information from images, they do not examine whether the observed improvement is actually caused by interference from adding too much historical information.

[0009] 3. Not considering the textual quality of generative visual dialogue

[0010] From the above analysis, we can see that the traditional visual dialogue generation system needs to be improved.


Embodiment Construction

[0076] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0077] The task of visual dialogue generation is defined as follows: given an image $I$, an image description $C$, a dialogue history of $t-1$ rounds $H_t = \{C, (Q_1, A_1), \ldots, (Q_{t-1}, A_{t-1})\}$, and the current-round question $Q$, generate the answer $A$ to the current-round question $Q$.
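The task inputs defined in [0077] can be pictured as a small data structure. The following is only a minimal sketch: the class name `VisualDialogueInput`, its field names, and the sample dialogue are hypothetical and do not come from the patent, and the image is represented by a placeholder identifier rather than pixel data.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class VisualDialogueInput:
    """Inputs for round t, following the task definition in [0077].

    All names here are illustrative assumptions, not the patent's own.
    """
    image_id: str   # stand-in for the image I
    caption: str    # image description C
    # Completed rounds (Q_1, A_1), ..., (Q_{t-1}, A_{t-1})
    qa_rounds: List[Tuple[str, str]] = field(default_factory=list)
    question: str = ""  # current-round question Q

    @property
    def round_index(self) -> int:
        """Current round t: one more than the number of completed rounds."""
        return len(self.qa_rounds) + 1

    def history(self) -> list:
        """Return H_t as [C, (Q_1, A_1), ..., (Q_{t-1}, A_{t-1})]."""
        return [self.caption, *self.qa_rounds]


# Made-up sample data for illustration only.
example = VisualDialogueInput(
    image_id="img_0001",
    caption="A dog lies on a sofa.",
    qa_rounds=[("What color is the dog?", "Brown."), ("Is it sleeping?", "Yes.")],
    question="Is anyone else in the room?",
)
```

A generation model would then take `example.image_id`, `example.history()`, and `example.question` and produce the answer A for round `example.round_index`.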

[0078] The embodiment of the present invention finds that the problems of the traditional visual dialogue generation system include at least: the...


Abstract

The invention relates to a visual dialogue generation system based on semantic alignment. Image information is extracted at two levels: global and local. A semantics-based global image representation is obtained through semantic alignment, while local dense image descriptions are obtained through dense captioning; representing the image with high-level textual semantics helps the system acquire information more effectively. Together, the two provide image-information clues for generating replies. Reply generation is further guided by a comprehensive constraint covering text fluency, text coherence, and correctness. In addition, the embodiment of the invention provides a keyword constraint method to constrain the correctness of the reply, so as to enrich the forms the generated reply can take.
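The abstract names three constraint aspects (fluency, coherence, correctness) and a keyword constraint but gives no formulas. As a rough illustration only, the sketch below assumes a weighted sum of per-aspect scores and a simple keyword-overlap proxy for correctness. The function names, weights, and score values are all hypothetical and are not the patent's method.

```python
from typing import Dict, Iterable


def keyword_coverage(reply: str, keywords: Iterable[str]) -> float:
    """Fraction of reference keywords found among the reply's tokens.

    A toy stand-in for a keyword-based correctness constraint (assumption).
    """
    tokens = set(reply.lower().split())
    keys = [k.lower() for k in keywords]
    if not keys:
        return 0.0
    return sum(k in tokens for k in keys) / len(keys)


def combined_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-aspect scores; the weighting scheme is an assumption."""
    return sum(weights[name] * scores[name] for name in weights)


reply = "yes the dog is brown"
scores = {
    "fluency": 0.9,     # placeholder value, e.g. from a language model
    "coherence": 0.8,   # placeholder value
    "correctness": keyword_coverage(reply, ["brown", "dog"]),
}
weights = {"fluency": 0.25, "coherence": 0.25, "correctness": 0.5}
total = combined_score(scores, weights)
```

The keyword proxy rewards any reply that mentions the expected keywords regardless of its exact wording, which is one plausible reading of how a keyword constraint could allow varied reply forms.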

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of language processing, and in particular to a visual dialogue generation system based on semantic alignment.

Background

[0002] In recent years, with the rapid development of artificial intelligence and robotics, the multimodal semantic understanding of vision and language has received more and more attention in the fields of computer vision and natural language processing. Human-computer interaction cannot consider only a single modality. In real life, interaction between people is often not limited to text, vision, or hearing alone. Multimodal natural interaction can not only provide a friendlier interface between machines and humans, but is also the only way to achieve strong artificial intelligence.

[0003] Understanding the real world by analyzing vision and language is the primary task of artificial intelligence to achieve human-lik...


Application Information


IPC(8): G06F40/35, G06K9/62, G06N3/04, G06N3/08

CPC: G06F40/35, G06N3/049, G06N3/08, G06N3/047, G06N3/045, G06F18/253, Y02D10/00

Inventor: 孙晓, 王佳敏, 汪萌

Owner: HEFEI UNIV OF TECH
