An Image Paragraph Description Method Based on Relational Coding and Hierarchical Attention Mechanism

A technique of attention and encoding, applied in the field of image processing
CN114186568B · Active · Publication Date: 2022-08-02 · BEIJING UNIV OF POSTS & TELECOMM

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
BEIJING UNIV OF POSTS & TELECOMM
Publication Date
2022-08-02


Abstract

The invention discloses an image paragraph description method based on relational coding and a hierarchical attention mechanism. The model consists of a relational encoding module and a hierarchical attention decoding module. The relational encoding module uses two encoders to capture spatial relation information and semantic relation information; during semantic relation encoding, prior knowledge of semantic relations is learned by training a supervised semantic classifier. The hierarchical attention decoding module uses hierarchical attention with relational gates and visual gates to dynamically fuse relation information and object region features. The relational gates switch between spatial relation information and semantic relation information. To decide whether visual information should be embedded, the model adopts a coarse-to-fine strategy, moving from coarse-grained regions to fine-grained spatial and semantic relations, to fuse visual information during paragraph generation. Extensive experiments on the Stanford paragraph description dataset show that the method significantly outperforms existing methods on multiple evaluation metrics in the field.
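
The gated fusion described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the abstract gives no formulas, so the dot-product attention, the scalar sigmoid gates, the shared feature dimension, and all names (attend, gated_fusion, w_rel, w_vis) are assumptions chosen only to show how a relational gate and a visual gate might combine region features with spatial and semantic relation features.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())  # subtract the max for numerical stability
    return e / e.sum()

def attend(query, feats):
    """Dot-product attention: weighted average of the rows of feats."""
    weights = softmax(feats @ query)
    return weights @ feats

def gated_fusion(h, regions, spatial, semantic, w_rel, w_vis):
    """Fuse visual context for one decoding step (illustrative only).

    h        : decoder hidden state, shape (d,)
    regions  : object-region features, shape (n_regions, d)
    spatial  : spatial-relation features, shape (n_spatial, d)
    semantic : semantic-relation features, shape (n_semantic, d)
    w_rel, w_vis : hypothetical gate parameters, shape (d,)
    """
    # Coarse level: attend over object regions.
    v_region = attend(h, regions)
    # Fine level: attend over each kind of relation feature.
    r_spatial = attend(h, spatial)
    r_semantic = attend(h, semantic)
    # Relational gate (assumed scalar): soft switch between relation types.
    g_rel = sigmoid(w_rel @ h)
    relation = g_rel * r_spatial + (1.0 - g_rel) * r_semantic
    # Visual gate (assumed scalar): how much visual information to embed.
    g_vis = sigmoid(w_vis @ h)
    return g_vis * (v_region + relation), g_rel, g_vis
```

In this sketch a relational gate near 1 favours spatial relations and a value near 0 favours semantic relations, while a visual gate near 0 suppresses visual input for words that need little grounding in the image. A real model would learn the gate parameters jointly with the decoder.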

Description

Technical field

[0001] The invention relates to the technical field of image processing, and in particular to an image paragraph description method based on relational coding and a hierarchical attention mechanism.

Background technique

[0002] Image captioning, also known as single-sentence image captioning, is the task of automatically generating a descriptive sentence for a given image. This basic cross-modal task has many potential applications, such as image/video retrieval, early childhood education, and helping visually impaired people understand image content. The task has therefore attracted a lot of attention from the AI community.

[0003] In the past few years, many studies have made impressive progress on the task of generating single-sentence image descriptions. However, a single sentence is usually not enough to summarize the various details in an image; as the saying goes, "a picture is worth a thousand words". To address the lim...

Claims
