Multi-round visual dialogue method based on multi-level sorting learning

A multi-level ranking-learning technique, applied in the field of image processing, addresses the insufficient generalization ability of existing models. It improves generalization and avoids redundant information.

Active Publication Date: 2021-09-24

UNIV OF ELECTRONIC SCI & TECH OF CHINA

Cites: 8. Cited by: 0.

Problems solved by technology

Previous methods consider only how to raise the rank of the correct answer in the final ranking list. They ignore the ranking of other options that are not marked as the correct answer but share its semantics, which limits the model's generalization ability.
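A level-aware ranking objective is one way to picture a remedy for this gap. The sketch below is an illustration under stated assumptions, not the patent's loss: the three level labels (2 for the marked answer, 1 for semantically similar but unmarked options, 0 for the rest), the pairwise hinge form, the `margin` value, and the function name `multi_level_ranking_loss` are all hypothetical.

```python
from typing import List


def multi_level_ranking_loss(
    scores: List[float], levels: List[int], margin: float = 1.0
) -> float:
    """Average pairwise hinge loss over all option pairs from different levels.

    Assumed (hypothetical) level labels:
      2 = option marked as the correct answer
      1 = option semantically similar to the correct answer but unmarked
      0 = every other option
    A pair contributes loss when the higher-level option does not outscore
    the lower-level option by at least `margin`.
    """
    if len(scores) != len(levels):
        raise ValueError("scores and levels must have the same length")
    total, pairs = 0.0, 0
    for s_hi, l_hi in zip(scores, levels):
        for s_lo, l_lo in zip(scores, levels):
            if l_hi > l_lo:
                total += max(0.0, margin - (s_hi - s_lo))
                pairs += 1
    return total / pairs if pairs else 0.0


# Four candidate answers: the marked answer, a paraphrase of it, two others.
levels = [2, 1, 0, 0]

# Model A ranks the marked answer first but buries the paraphrase.
loss_a = multi_level_ranking_loss([3.0, 0.0, 1.0, 0.5], levels)

# Model B also ranks the paraphrase above the unrelated options.
loss_b = multi_level_ranking_loss([3.0, 2.0, 0.5, 0.0], levels)

print(loss_a, loss_b)
```

Both score lists place the marked answer first, so an objective that looks only at the marked answer cannot tell them apart. The level-aware loss penalizes only the first list, where the paraphrase falls below unrelated options.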

Embodiment Construction

[0059] Exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be understood that the implementations shown and described in the drawings are only exemplary, intended to explain the principle and spirit of the present invention, rather than limit the scope of the present invention.

[0060] Embodiments of the present invention provide a multi-round visual dialogue method based on multi-level ranking learning. As shown jointly in Figure 1 and Figure 2, the method includes the following steps S1-S13:

[0061] S1. Use a pre-trained Faster R-CNN object detector to extract visual features from the image.

[0062] S2. Use a bidirectional long short-term memory network (BiLSTM) as the text encoder for the question to obtain the question's text features.

[0063] S3. Use a BiLSTM as the text encoder for the dialogue history to obtain the text features of the dialogue history.

[0064] S4. Using the direct an...


Abstract

The invention discloses a multi-round visual dialogue method based on multi-level ranking learning. It provides a context control gate mechanism that adaptively assigns a weight to the dialogue history when answering the current question, avoiding the redundant information introduced by using the dialogue history blindly. It also provides a multi-level ranking learning module that divides all options into three levels and raises the rank of options that are semantically similar to the correct answer but not marked as correct, thereby improving the generalization ability of the model.
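The gating idea can be pictured with a small sketch. The excerpt does not give the patent's gate formula, so everything below is an assumption for illustration: a single scalar sigmoid gate computed from the concatenated question and history features, a residual-style fusion, and the hypothetical names `context_gate` and `fuse`.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def context_gate(
    question: List[float], history: List[float],
    weights: List[float], bias: float,
) -> float:
    """Scalar gate in (0, 1) computed from [question; history].

    Assumed form: sigmoid(w . [q; h] + b). In a real model the weights
    and bias would be learned; here they are supplied by hand.
    """
    joint = question + history  # list concatenation
    if len(weights) != len(joint):
        raise ValueError("weights must match the concatenated feature size")
    return sigmoid(sum(w * x for w, x in zip(weights, joint)) + bias)


def fuse(question: List[float], history: List[float], gate: float) -> List[float]:
    """Add the gated history features to the question features element-wise."""
    if len(question) != len(history):
        raise ValueError("question and history must have the same size")
    return [q + gate * h for q, h in zip(question, history)]


question = [1.0, 0.0]
history = [0.5, 0.5]

# Zero weights and zero bias give a gate of exactly 0.5.
gate = context_gate(question, history, [0.0, 0.0, 0.0, 0.0], 0.0)
fused = fuse(question, history, gate)

# A strongly negative bias closes the gate, so the history is nearly ignored.
closed = context_gate(question, history, [0.0, 0.0, 0.0, 0.0], -50.0)
fused_closed = fuse(question, history, closed)

print(gate, fused, fused_closed)
```

When the gate is near zero the fused features reduce to the question features alone, which corresponds to a question that can be answered without the dialogue history. A gate near one passes the history through in full.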

Description

Technical field

[0001] The invention belongs to the technical field of image processing, and in particular relates to the design of a multi-round visual dialogue method based on multi-level ranking learning.

Background

[0002] With the rapid development of vision-and-language interaction, multi-round visual dialogue methods have received extensive attention and made great progress in recent years. As a branch of traditional visual question answering, multi-round visual dialogue conducts multiple rounds of dialogue about a given picture. Its focus is to analyze the relationship between the current question and the dialogue history, and to use that relationship to support question answering. It has a wide range of applications, such as visual assistants for the visually impaired, big-data analysis assistants for analysts, and search-and-rescue assistants. Compared with traditional visual question answering, multi-round visual dia...


Application Information


IPC(8): G06K9/00, G06F40/30, G06F16/332, G06F16/432, G06F16/48, G06N3/04, G06N3/08

CPC: G06F40/30, G06F16/3329, G06F16/434, G06F16/48, G06N3/049, G06N3/08, G06N3/048, G06N3/044, G06N3/045

Inventors: 高联丽 (Gao Lianli), 陈堂明 (Chen Tangming), 李向鹏 (Li Xiangpeng), 宋井宽 (Song Jingkuan)

Owner: UNIV OF ELECTRONIC SCI & TECH OF CHINA
