An Image Paragraph Description Method Based on Relational Coding and Hierarchical Attention Mechanism

A technique of attention and encoding, applied in the field of image processing

Active Publication Date: 2022-08-02

BEIJING UNIV OF POSTS & TELECOMM

Problems solved by technology

However, there is a serious problem with this simple fusion method

Embodiment Construction

[0070] In order to make those skilled in the art better understand the technical solutions of the present invention, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments.

[0071] The details of the image paragraph description method of the present invention (DualRel), based on relational coding and a hierarchical attention mechanism, are shown in Figure 2. The DualRel model contains two main modules: a relational encoding module and a hierarchical attention decoding module. The relational encoding module takes the region features V, the region positions B and the region categories O as input, and generates the spatial relation encoding features V_p and the semantic relation encoding features V_s through the spatial relation encoder and the semantic relation encoder, respectively. To supervise the model in learning prior knowledge about semantic relations, we propose a novel semantic ...
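
As an illustration of the kind of input a spatial relation encoder can work with, the following minimal Python sketch derives pairwise geometry features from the region positions B. It assumes boxes are given as (x1, y1, x2, y2) with positive width and height, and it uses a log-ratio encoding that is common in relation-modeling work. The excerpt above does not specify the encoder's actual inputs or architecture, so this is not the patented encoder, and the names box_geometry and pairwise_geometry are hypothetical.

```python
import math


def box_geometry(box_i, box_j, eps=1e-3):
    """Relative geometry of box_j with respect to box_i.

    Boxes are (x1, y1, x2, y2) tuples with positive width and height
    (an assumed format; the source does not define how B is stored).
    """
    # Centers and sizes of both boxes.
    xi, yi = (box_i[0] + box_i[2]) / 2, (box_i[1] + box_i[3]) / 2
    wi, hi = box_i[2] - box_i[0], box_i[3] - box_i[1]
    xj, yj = (box_j[0] + box_j[2]) / 2, (box_j[1] + box_j[3]) / 2
    wj, hj = box_j[2] - box_j[0], box_j[3] - box_j[1]
    # Log-scaled offsets and size ratios; eps avoids log(0) for aligned centers.
    return [
        math.log(max(abs(xj - xi), eps) / wi),
        math.log(max(abs(yj - yi), eps) / hi),
        math.log(wj / wi),
        math.log(hj / hi),
    ]


def pairwise_geometry(boxes):
    """N x N x 4 nested list of geometry features for every ordered box pair."""
    return [[box_geometry(bi, bj) for bj in boxes] for bi in boxes]
```

In a full model, features of this kind would typically be embedded and combined with the region features V before producing V_p; that step is left out here because the source text is truncated before describing it.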

Abstract

The invention discloses an image paragraph description method based on relational coding and a hierarchical attention mechanism. The model consists of a relational encoding module and a hierarchical attention decoding module. The relational encoding module uses two encoders to capture spatial relational information and semantic relational information; during semantic relational encoding, prior knowledge of semantic relations is learned by training a supervised semantic classifier. The hierarchical attention decoding module uses hierarchical attention with relational gates and visual gates to dynamically fuse relational information and object region features. The relational gates switch between spatial relational information and semantic relational information, and the visual gates decide whether to use visual information for embedding. During paragraph generation, the model fuses visual information with a coarse-to-fine strategy, from coarse-grained regions to fine-grained spatial and semantic relations. Extensive experiments on the Stanford paragraph description dataset show that the method of the present invention is significantly better than existing methods on multiple evaluation metrics in the field.
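
The gated fusion described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented decoder: it uses plain dot-product attention, scalar sigmoid gates, and a simple additive combination, none of which are specified in the text above. The names h (decoder state), V (region features), V_p and V_s (spatial and semantic relation features), w_r, w_v, attend and hierarchical_attention are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, feats):
    """Dot-product attention: attention-weighted average of the feature rows."""
    weights = softmax([dot(query, f) for f in feats])
    dim = len(feats[0])
    return [sum(w * f[i] for w, f in zip(weights, feats)) for i in range(dim)]


def hierarchical_attention(h, V, V_p, V_s, w_r, w_v):
    """Fuse region and relation contexts with a relational and a visual gate."""
    # Coarse level: attend over object region features.
    v_ctx = attend(h, V)
    # Fine level: attend over spatial and semantic relation features.
    p_ctx = attend(h, V_p)
    s_ctx = attend(h, V_s)
    # Relational gate in (0, 1): near 1 favors spatial, near 0 favors semantic.
    g_r = sigmoid(dot(w_r, h + p_ctx + s_ctx))  # "+" concatenates the lists
    r_ctx = [g_r * p + (1.0 - g_r) * s for p, s in zip(p_ctx, s_ctx)]
    # Visual gate in (0, 1): how much visual context reaches the decoder
    # (scaling the summed contexts is an assumption of this sketch).
    g_v = sigmoid(dot(w_v, h + v_ctx))
    ctx = [g_v * (v + r) for v, r in zip(v_ctx, r_ctx)]
    return ctx, g_r, g_v
```

With all-zero gate weights both gates evaluate to 0.5, so the two relation contexts are averaged and half of the combined visual context is passed on; learned weights would let the decoder shift this balance at each generation step.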

Description

Technical field

[0001] The invention relates to the technical field of image processing, in particular to an image paragraph description method based on relational coding and a hierarchical attention mechanism.

Background art

[0002] Image captioning is the task of automatically generating a descriptive sentence for a given image, also known as single-sentence image captioning. This basic cross-modal task has multiple potential applications, such as image/video retrieval, early childhood education, and helping visually impaired people understand image content. The task has therefore attracted a lot of attention from the AI community.

[0003] In the past few years, many studies have made impressive progress on the task of generating one-sentence image descriptions. However, a single sentence is usually not enough to summarize the various details in an image, because "a picture is worth a thousand words". To address the lim...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G06F40/30, G06N3/04, G06N3/08

CPC: G06F40/30, G06N3/049, G06N3/08, G06N3/045

Inventor: 李睿凡, 刘云, 石祎晖, 冯方向, 马占宇, 王小捷

Owner: BEIJING UNIV OF POSTS & TELECOMM
