An image description network and technology based on attribute enhancement attention model

An attention model and image description technology in the field of neural networks. It addresses redundant regional features, dispersed attention weights, and features extracted without regard to whether a region contains key information, improving the model's ability to predict attributes and describe images.

Active Publication Date: 2018-12-07
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

[0004] Second, because a convolutional neural network can only extract features from grid-like regions, current attention-based image description methods still have room for improvement. Each feature in the convolutional feature map relates only to the image information in a fixed region, regardless of whether that region contains key information. In addition, the fixed receptive field makes the region features redundant, which disperses the weights of the attention model.
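The weight dispersion described above can be made concrete with a small sketch. It is only an illustration under assumed inputs: the relevance scores are hypothetical, and using the entropy of the softmax weights as a dispersion measure is a choice made here, not something the patent states.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(weights):
    """Shannon entropy in nats; higher means more dispersed weights."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)

# Hypothetical relevance scores for six grid cells (not from the patent).
redundant_scores = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]  # near-identical, redundant cells
focused_scores = [5.0, 0.0, 0.0, 0.0, 0.0, 0.0]    # one clearly relevant cell

dispersed = softmax(redundant_scores)
focused = softmax(focused_scores)

print("entropy with redundant cells:", entropy(dispersed))
print("entropy with one key cell:   ", entropy(focused))
```

When every grid cell looks alike, the attention weights are uniform and the entropy reaches its maximum, log of the number of cells; a single informative region pulls the entropy down.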

Examples


Embodiment 1

[0033] An image description network based on an attribute-enhanced attention model, comprising:

[0034] Attribute prediction model: takes image features as input and applies attention to predict attribute words. The image features are grid region features of the image extracted with a convolutional neural network.

[0035] Sentence generation model: takes the prediction results of the attribute prediction model as input and applies attention to generate sentences. The prediction results include the attribute word distribution and the visual feature corresponding to each attribute word.
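The data flow between the two models can be sketched as follows. This is a toy illustration, not the patent's network: dot-product attention, the sigmoid scoring, the probability scaling in the second stage, and all names (`attend`, `predict_attributes`, `sentence_step_context`, the query vectors) are assumptions introduced here, and nothing is learned.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, features):
    """Dot-product attention: return (weights, weighted sum of features)."""
    weights = softmax([dot(query, f) for f in features])
    dim = len(features[0])
    context = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, context

def predict_attributes(grid_features, attribute_queries):
    """Stage 1 (assumed form): a probability and an attended visual feature per word."""
    result = {}
    for word, query in attribute_queries.items():
        _, visual = attend(query, grid_features)
        prob = 1.0 / (1.0 + math.exp(-dot(query, visual)))  # sigmoid score
        result[word] = (prob, visual)
    return result

def sentence_step_context(decoder_state, attributes):
    """Stage 2 (assumed form): attend over per-attribute features scaled by probability."""
    feats = [[p * v for v in visual] for p, visual in attributes.values()]
    weights, context = attend(decoder_state, feats)
    return dict(zip(attributes.keys(), weights)), context

# Hypothetical 2-dimensional features for three grid cells and two attribute words.
grid = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
queries = {"dog": [2.0, 0.0], "grass": [0.0, 2.0]}

attrs = predict_attributes(grid, queries)
word_weights, ctx = sentence_step_context([1.0, 0.0], attrs)
print(word_weights)
```

The point of the sketch is the interface: the second stage attends over one feature per attribute word rather than over every grid cell, which is how the redundant grid features are kept out of the sentence generator.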

[0036] In the image description network provided in this embodiment, before constructing and training the image description network, it is first necessary to obtain a data set that can be used for image description and perform data preprocessing on the data set. The dataset consists of images and sentences describin...

Embodiment 2

[0049] An image description method based on an attribute-enhanced attention model, comprising:

[0050] S1. Obtain a data set for image description and preprocess it. The data set contains images and the sentences that describe them; preprocessing includes extracting attribute words from images and preprocessing the sentences.

[0051] S2. Use a convolutional neural network and a recurrent neural network to construct the image description network based on the attribute-enhanced attention model described in Embodiment 1. The image description network includes an attribute prediction model and a sentence generation model.

[0052] S3. First, use attention to feed image information and attribute word information into the attribute prediction model, and train the attribute prediction model with the cross-entropy loss function shown below:
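Step S1 can be illustrated with a minimal preprocessing sketch. It assumes attribute words are taken as the most frequent non-stopword tokens in the captions, which is a common choice but is not confirmed by the source; the stopword list, function names, and sample captions are all hypothetical.

```python
from collections import Counter
import re

# Illustrative stopword list only; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "the", "on", "in", "of", "with", "is", "and"}

def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())

def build_attribute_vocab(captions, top_k):
    """Assumed rule: the top_k most frequent non-stopword tokens are attribute words."""
    counts = Counter(w for c in captions for w in tokenize(c) if w not in STOPWORDS)
    return [w for w, _ in counts.most_common(top_k)]

def attribute_labels(image_captions, vocab):
    """Multi-hot label vector: 1 if the attribute word occurs in any caption of the image."""
    words = {w for c in image_captions for w in tokenize(c)}
    return [1 if w in words else 0 for w in vocab]

captions_by_image = {
    "img1": ["A dog runs on the grass.", "A brown dog playing in grass"],
    "img2": ["A man rides a horse."],
}
all_captions = [c for caps in captions_by_image.values() for c in caps]
vocab = build_attribute_vocab(all_captions, top_k=2)
labels = {k: attribute_labels(v, vocab) for k, v in captions_by_image.items()}
print(vocab, labels)
```

The multi-hot vectors are the kind of target an attribute prediction model could be trained against.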

[0053]

[0054] In the above formula, V is the grid area feature extra...
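The formula at [0053] did not survive in this text, so the following is not the patent's loss. It is a generic sketch of a cross-entropy loss for multi-label attribute prediction, assuming one independent probability per attribute word; the function name and the example values are hypothetical.

```python
import math

def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy over attribute words (generic form, assumed here)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

# Hypothetical predictions for three attribute words with ground truth [1, 0, 1].
good = binary_cross_entropy([0.9, 0.1, 0.8], [1, 0, 1])
bad = binary_cross_entropy([0.1, 0.9, 0.2], [1, 0, 1])
print(good, bad)
```

Predictions that agree with the labels give a small loss, and confident wrong predictions are penalized heavily, which is the behavior the training step relies on.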


Abstract

The invention provides an image description network and technology based on an attribute-enhanced attention model. By modeling the correlation and co-occurrence relationships between attribute words, the model predicts attribute words using both the image information and its understanding of how attribute words relate to one another. On top of attribute prediction, image features corresponding to the attributes are introduced, which addresses the redundant image features and the features unrelated to image content found in current attention models, thereby improving the model's attribute prediction and image description ability.
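The co-occurrence relationship between attribute words can be illustrated with simple counting. This sketch only estimates conditional frequencies from hypothetical attribute sets; the patent models these relationships inside the network, and how it does so is not shown in this text.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(attribute_sets):
    """Count how often each word, and each unordered word pair, appears across images."""
    word_counts = Counter()
    pair_counts = Counter()
    for attrs in attribute_sets:
        unique = sorted(set(attrs))
        word_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return word_counts, pair_counts

def conditional_prob(a, b, word_counts, pair_counts):
    """Estimate P(b present | a present) from the counts."""
    pair = tuple(sorted((a, b)))
    return pair_counts[pair] / word_counts[a] if word_counts[a] else 0.0

# Hypothetical attribute words for four images.
images = [
    {"dog", "grass", "frisbee"},
    {"dog", "grass"},
    {"dog", "sofa"},
    {"cat", "sofa"},
]
word_counts, pair_counts = cooccurrence_counts(images)
print(conditional_prob("dog", "grass", word_counts, pair_counts))
```

A predictor that knows "grass" often accompanies "dog" can raise its estimate for one word when the other is likely, which is the intuition behind using attribute relationships in addition to the image.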

Description

Technical field

[0001] The invention belongs to the technical field of neural networks, and in particular relates to an image description network and technology based on an attribute-enhanced attention model.

Background technique

[0002] Attribute-based and attention-based models are two important approaches among existing image description methods. The attribute-based method first identifies key information in the image, such as people, places, and features, encodes it into a vector, and feeds the vector to a decoder built from a recurrent neural network, which decodes it into the final descriptive sentence. This lets the decoder perceive the key information in the image, but it relies heavily on the attribute word prediction model: if the attribute words are not extracted accurately enough, the decoder is misled when generating sentences. The image description method based on the visual attention model solves the proble...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 丁贵广, 陈辉
Owner: TSINGHUA UNIV