An image description method based on a bidirectional double-attention mechanism

An image description and attention technology, applied in the fields of computer components, instruments, and computing, that addresses the problem of reduced description accuracy and achieves high accuracy and good generalization.

Active Publication Date: 2019-06-21
SHANXI UNIV

AI Technical Summary

Problems solved by technology

When the current description is highly correlated with both the preceding and the subsequent information, a model that considers only the image and the previously generated text produces less accurate descriptions.

Embodiment Construction

[0028] A recurrent neural network (RNN) is a type of neural network for processing and predicting sequence data. Figure 1 shows a typical recurrent neural network. At each time step, the input x_t and the previous hidden layer state h_{t-1} are fed into the recurrent neural network, which produces an output o_t and an updated state h_t that is passed to the next time step. Because the variables and operations of the recurrent neural network are the same at every time step, the network can be regarded as the same neural network replicated indefinitely. A represents all other states inside the hidden layer.
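The recurrence described above can be sketched in a few lines of plain Python. This is only an illustrative sketch of a simple tanh recurrent cell, not the network used in the patent; the weight names W_xh, W_hh, and W_ho and the functions rnn_step and run_rnn are assumptions introduced here for illustration.

```python
import math


def rnn_step(x_t, h_prev, W_xh, W_hh, W_ho):
    """One time step of a simple tanh recurrent cell (illustrative only).

    x_t    : input vector at time t
    h_prev : hidden state h_{t-1} from the previous time step
    W_xh, W_hh, W_ho : hypothetical weight matrices (lists of rows)
    Returns (o_t, h_t).
    """
    hidden_size = len(h_prev)
    h_t = []
    for i in range(hidden_size):
        pre = sum(W_xh[i][j] * x_t[j] for j in range(len(x_t)))
        pre += sum(W_hh[i][j] * h_prev[j] for j in range(hidden_size))
        h_t.append(math.tanh(pre))
    # The output is a linear read-out of the updated hidden state.
    o_t = [sum(row[i] * h_t[i] for i in range(hidden_size)) for row in W_ho]
    return o_t, h_t


def run_rnn(xs, h0, W_xh, W_hh, W_ho):
    """Apply the same cell, with the same weights, at every time step."""
    h = h0
    outputs = []
    for x_t in xs:
        o_t, h = rnn_step(x_t, h, W_xh, W_hh, W_ho)
        outputs.append(o_t)
    return outputs, h
```

Because run_rnn reuses the same three weight matrices in every iteration, the loop is the "same network replicated at each time step" view of the recurrence.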

[0029] A recurrent neural network has only a "causal" structure: the state at the current moment can draw information only from past states and the current input. In many application tasks, however, the output is likely to depend on the entire sequence. In order to solve this problem, a Bidirectional Recurrent ...
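The limitation of a purely causal recurrence, and the general idea of reading the sequence in both directions, can be illustrated with a small sketch. It uses a scalar tanh recurrence with made-up weights; scan_states, bidirectional_states, and the default weight values are assumptions for illustration and are not taken from the patent.

```python
import math


def scan_states(xs, w_x, w_h):
    """Run a scalar tanh recurrence over xs and return the state at every step."""
    h = 0.0
    states = []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states


def bidirectional_states(xs, fwd=(1.0, 0.5), bwd=(1.0, 0.5)):
    """Pair each position's forward state with its backward state.

    The forward pass sees only inputs up to position t; the backward pass
    is run on the reversed sequence, so its state at t summarizes inputs
    from t to the end of the sequence.
    """
    forward = scan_states(xs, *fwd)
    backward = scan_states(xs[::-1], *bwd)[::-1]
    return list(zip(forward, backward))
```

For the input [0.0, 0.0, 1.0], the forward state at the first position is exactly zero because nothing informative has been seen yet, while the backward state at the same position is non-zero because it already reflects the final input.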

Abstract

The invention provides an image description method based on a bidirectional double-attention mechanism. The method comprises the following steps: image features are extracted by a convolutional neural network; the image features of the last convolutional layer are taken as the input of an attention mechanism and fed into a bidirectional long short-term memory (LSTM) network that contains the attention mechanism; the attention mechanism obtains the previous hidden layer state of the bidirectional LSTM network, the bidirectional LSTM network predicts the current hidden layer state from the previous hidden layer state, the salient image information, and the current input, and the current hidden layer state is then fed into the attention mechanism to obtain the current salient information; and the bidirectional attention network predicts the description of the image from the forward hidden layer state, the salient image information, the backward hidden layer state, and the salient information.
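How an attention mechanism can turn a hidden layer state and a set of image features into "salient image information" can be sketched as a soft attention step. The abstract does not state how the attention scores are computed, so the dot-product scoring below is an assumption made purely for illustration, as are the function name soft_attention and the toy feature vectors.

```python
import math


def soft_attention(features, h_prev):
    """Soft attention over image region features (illustrative sketch).

    features : list of region feature vectors, e.g. one per spatial location
               of the last convolutional layer
    h_prev   : hidden state vector with the same dimension as each feature
    Returns (weights, context), where context is the attention-weighted
    average of the region features.
    """
    # Assumed scoring function: dot product between each region and h_prev.
    scores = [sum(f_d * h_d for f_d, h_d in zip(f, h_prev)) for f in features]
    # Numerically stable softmax over the regions.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    context = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, context
```

In a bidirectional setup, a step like this would be applied once with the forward hidden state and once with the backward hidden state, giving the two attended summaries that the final prediction combines.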

Description

Technical field

[0001] The invention relates to an image description method.

Background technique

[0002] In recent years, a great deal of research on image description has been carried out on the basis of computer vision and natural language processing. Image description works by feeding an image into an "encoding-decoding model" to generate a language description. Encoding converts the input image into a fixed-length vector, and decoding converts that vector into an output language sequence. The encoder commonly used in image description is a Convolutional Neural Network (CNN), and the decoder is one of the variants of the Recurrent Neural Network (RNN), such as the Long Short-Term Memory network (LSTM). In recent years, Kelvin Xu et al. have introduced the attention mechanism to focus on the salient parts of the image when generating descriptions, thereby improving the accuracy...

Application Information

IPC(8): G06K9/62
Inventors: 张丽红, 陶云松
Owner: SHANXI UNIV