Referring expression comprehension method based on a multi-level expression attention-guiding network

A multi-level attention technology, applied in the field of referring expression comprehension, which addresses the problems that the target region cannot be distinguished from other regions and that similar objects cannot be told apart.

Active Publication Date: 2021-03-12
GUIZHOU UNIV

AI Technical Summary

Problems solved by technology

Existing methods usually use a self-attention mechanism to focus on important words or phrases in the expression, which may lead to the inability to distinguish the target region from other regions.


Examples


Example

[0160] The present invention is tested on three large-scale benchmark data sets: RefCOCO, RefCOCO+ and RefCOCOg. The experimental results show that the present invention is superior to state-of-the-art methods, as shown in Table 1.

[0161]

[0162] Table 1

  Data set    testA     testB     test
  RefCOCO     87.45%    86.93%    -
  RefCOCO+    77.05%    69.65%    -
  RefCOCOg    -         -         80.29%

[0163] It can be concluded from Table 1 that the present invention achieves the best performance on most of the subtasks: on the RefCOCO data set it achieves accuracy rates of 87.45% and 86.93% on the testA and testB test sets respectively; on the RefCOCO+ data set it achieves accuracy rates of 77.05% and 69.65% on the testA and testB test sets respectively; and on the RefCOCOg data set it achieves an accuracy rate of 80.29% on the test set.



Abstract

The invention discloses a referring expression comprehension method based on a multi-level expression attention-guiding network. It designs a new multi-level attention mechanism, the multi-level expression attention-guiding network (MEGA-Net), which comprises a three-level attention network. Under the guidance of expression representations at different levels (the sentence level, the word level and the phrase level), the mechanism generates discriminative image region representations, which helps to determine the target region accurately. In addition, existing methods generally match regions in a single stage, which cannot distinguish similar objects or targets well. To address this problem, the invention designs a two-stage structure that compares similar image regions and identifies the differences between them, so as to match the optimal image region. The method is evaluated on three popular data sets, and the experimental results show that its performance is superior to that of other state-of-the-art models.
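The PyTorch sketch below illustrates the general idea of expression-guided attention at three levels followed by a two-stage match. It is a minimal illustration only: the module names (ExpressionGuidedAttention, MegaNetSketch), the feature dimensions, the averaging of the three attention levels and the pairwise re-ranking in the second stage are assumptions made for clarity, not the patent's actual MEGA-Net implementation.

```python
# Hedged sketch: three-level expression-guided attention over image regions,
# then a two-stage match. All names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ExpressionGuidedAttention(nn.Module):
    """Scores image regions under the guidance of one level of the expression."""

    def __init__(self, region_dim, expr_dim, hidden_dim=512):
        super().__init__()
        self.region_proj = nn.Linear(region_dim, hidden_dim)
        self.expr_proj = nn.Linear(expr_dim, hidden_dim)
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, regions, expr):
        # regions: (N, region_dim); expr: (expr_dim,) vector for one expression level.
        h = torch.tanh(self.region_proj(regions) + self.expr_proj(expr))
        return F.softmax(self.score(h).squeeze(-1), dim=0)  # (N,) attention weights


class MegaNetSketch(nn.Module):
    """Fuses sentence-, phrase- and word-level guidance, then matches in two stages."""

    def __init__(self, region_dim=2048, expr_dim=1024, top_k=3):
        super().__init__()
        self.levels = nn.ModuleList(
            [ExpressionGuidedAttention(region_dim, expr_dim) for _ in range(3)]
        )
        self.rank = nn.Linear(2 * region_dim, 1)  # stage 2: contrast candidates
        self.top_k = top_k

    def forward(self, regions, sent_vec, phrase_vec, word_vec):
        # Stage 1: average the scores produced under the three levels of guidance.
        scores = sum(att(regions, e) for att, e in
                     zip(self.levels, (sent_vec, phrase_vec, word_vec))) / 3.0
        k = min(self.top_k, regions.size(0))
        top_scores, top_idx = scores.topk(k)
        candidates = regions[top_idx]                       # similar-looking candidates

        # Stage 2: contrast each candidate with the mean of its rivals so that
        # fine-grained differences between similar regions decide the final match.
        rivals = (candidates.sum(0, keepdim=True) - candidates) / max(k - 1, 1)
        margin = self.rank(torch.cat([candidates, candidates - rivals], -1)).squeeze(-1)
        best = top_idx[(top_scores + margin).argmax()]
        return best, scores


# Toy usage with random features standing in for real CNN region / RNN text outputs.
model = MegaNetSketch()
regions = torch.randn(10, 2048)                             # 10 candidate regions
sent, phrase, word = (torch.randn(1024) for _ in range(3))
best_region, region_scores = model(regions, sent, phrase, word)
print("best region index:", best_region.item())
```

In this reading, stage one narrows the candidates with the fused multi-level scores, and stage two contrasts each surviving candidate against its rivals, which mirrors the patent's stated goal of separating visually similar regions.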

Description

Technical field

[0001] The present invention belongs to the technical field of Referring Expression Comprehension (REC), and more specifically relates to a referring expression comprehension method based on a multi-level expression guided attention network.

Background technique

[0002] The main task of Referring Expression Comprehension (REC) is to identify the relevant target or region in a given image based on a natural language expression. A typical approach to this task is to first use a recurrent neural network (RNN) to process the expression sentence and obtain a representation of the text, and then use a convolutional neural network (CNN) to extract representations of the image regions; after that, the two representations are mapped into a common semantic space to determine the best matching image region.

[0003] Some existing methods apply a self-attention mechanism to implicitly partition the expression sentence into different phrase representations (subject, pred...
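For context, a minimal sketch of the typical single-stage pipeline described in paragraph [0002] is shown below: an RNN (here an LSTM) encodes the expression, random tensors stand in for CNN region features, both are projected into a common semantic space, and the highest-scoring region is returned. The class name BaselineREC and all sizes (joint_dim, vocab_size, etc.) are illustrative assumptions, not taken from the patent.

```python
# Hedged sketch of the conventional REC pipeline: RNN text encoding, projection of
# text and region features into a common semantic space, cosine-similarity matching.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BaselineREC(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=300, hidden_dim=512,
                 region_dim=2048, joint_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)   # text encoder
        self.text_proj = nn.Linear(hidden_dim, joint_dim)             # -> common space
        self.region_proj = nn.Linear(region_dim, joint_dim)           # -> common space

    def forward(self, token_ids, region_feats):
        # token_ids: (1, T) word indices; region_feats: (N, region_dim) CNN features.
        _, (h_n, _) = self.rnn(self.embed(token_ids))
        text = F.normalize(self.text_proj(h_n[-1]), dim=-1)           # (1, joint_dim)
        regions = F.normalize(self.region_proj(region_feats), dim=-1) # (N, joint_dim)
        scores = regions @ text.t()                                   # cosine similarity
        return scores.squeeze(-1).argmax(), scores


# Toy usage: a 12-token expression and 8 candidate regions.
model = BaselineREC()
tokens = torch.randint(0, 1000, (1, 12))
regions = torch.randn(8, 2048)
best, scores = model(tokens, regions)
print("best matching region:", best.item())
```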

Claims


Application Information

IPC(8): G06K9/32, G06K9/46, G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/08, G06V10/25, G06V10/44, G06N3/045, G06F18/22, G06F18/2415
Inventor: 杨阳, 彭亮
Owner: GUIZHOU UNIV