Multi-round visual dialogue method based on multi-level sorting learning

A multi-level ranking learning technique, applied in the field of image processing, that addresses the insufficient generalization ability of existing models: it improves generalization and avoids redundant information.

Active Publication Date: 2021-09-24
UNIV OF ELECTRONIC SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

The previous method only considers how to improve the ranking of the correct answer in the final ranking list, but ignores the ranking of other options that ar



Examples


Embodiment Construction

[0059] Exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be understood that the implementations shown and described in the drawings are only exemplary, intended to explain the principle and spirit of the present invention, rather than limit the scope of the present invention.

[0060] Embodiments of the present invention provide a multi-round visual dialogue method based on multi-level ranking learning which, as shown jointly in Figure 1 and Figure 2, includes the following steps S1-S13:

[0061] S1. Use a pre-trained Faster R-CNN object detector to extract visual features from the image.

[0062] S2. Use a bidirectional long short-term memory network (BiLSTM) as the text encoder for the question to obtain the text features of the question.

[0063] S3. Use a BiLSTM as the text encoder for the dialogue history to obtain the text features of the dialogue history.

[0064] S4. Using the direct an...


Abstract

The invention discloses a multi-round visual dialogue method based on multi-level sorting learning. It provides a context control gate mechanism that adaptively assigns a weight to the dialogue history when answering the current question, which avoids the redundant information introduced by using the dialogue history blindly. It also provides a multi-level sorting learning module that divides all answer options into three levels and raises the ranking of options that are semantically similar to the correct answer but not labelled as correct, thereby improving the generalization ability of the model.
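The two ideas in the abstract can be illustrated with a minimal sketch. This excerpt does not give the patent's actual formulas, so everything below is an assumption made for illustration: the gate is taken to be a single scalar `g = sigmoid(w . [question; history] + b)` that scales the history features before they are added to the question features, the three levels are encoded as 2 (labelled correct), 1 (semantically similar) and 0 (other), and the loss is a generic pairwise hinge loss. The names `context_gate` and `multi_level_ranking_loss` are hypothetical.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def context_gate(question: Sequence[float], history: Sequence[float],
                 weights: Sequence[float], bias: float = 0.0) -> List[float]:
    """Fuse question and history features with one scalar gate.

    Assumed form (not taken from the patent text):
        g = sigmoid(w . [question; history] + b)
        fused = question + g * history
    """
    if len(question) != len(history):
        raise ValueError("question and history must have the same length")
    concat = list(question) + list(history)
    if len(weights) != len(concat):
        raise ValueError("weights must cover question and history features")
    gate = sigmoid(sum(w * x for w, x in zip(weights, concat)) + bias)
    # A gate near 0 ignores the history; a gate near 1 uses it fully.
    return [q + gate * h for q, h in zip(question, history)]


def multi_level_ranking_loss(scores: Sequence[float], levels: Sequence[int],
                             margin: float = 1.0) -> float:
    """Average pairwise hinge loss over option pairs with different levels.

    Hypothetical level coding: 2 = labelled correct answer,
    1 = semantically similar option, 0 = any other option.
    Each higher-level option should outscore each lower-level one by `margin`.
    """
    if len(scores) != len(levels):
        raise ValueError("scores and levels must have the same length")
    total, pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if levels[i] > levels[j]:
                total += max(0.0, margin - (scores[i] - scores[j]))
                pairs += 1
    return total / pairs if pairs else 0.0
```

Under these assumptions, a score list that already orders the three levels with at least the margin between them yields zero loss, while a list that scores a semantically similar option below an unrelated one is penalized even when the labelled answer is ranked first. That is the behavior the abstract attributes to the multi-level module.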

Description

Technical field

[0001] The invention belongs to the technical field of image processing, and in particular relates to a multi-round visual dialogue method based on multi-level ranking learning.

Background

[0002] With the rapid development of vision-and-language interaction, multi-round visual dialogue methods have received extensive attention and made great progress in recent years. As a branch of traditional visual question answering, multi-round visual dialogue conducts multiple rounds of dialogue about a given picture. Its focus is to analyze the relationship between the current question and the dialogue history, and to use that relationship to support question answering. It has a wide range of applications, such as visual assistants for the visually impaired, big-data analysis assistants for analysts, and search-and-rescue assistants. Compared with traditional visual question answering, multi-round visual dia...


Application Information

IPC(8): G06K9/00, G06F40/30, G06F16/332, G06F16/432, G06F16/48, G06N3/04, G06N3/08
CPC: G06F40/30, G06F16/3329, G06F16/434, G06F16/48, G06N3/049, G06N3/08, G06N3/048, G06N3/044, G06N3/045
Inventor: 高联丽, 陈堂明, 李向鹏, 宋井宽
Owner UNIV OF ELECTRONIC SCI & TECH OF CHINA