
Picture description method of guiding attention mode on the basis of attribute probability vector

Attribute probability vector technology, applied in the field of deep learning, addresses the problem of inaccurate picture description results.

Pending Publication Date: 2018-01-12
SICHUAN UNIV

AI Technical Summary

Problems solved by technology

[0005] Among attention-based image description models, the Soft Attention (Soft-ATT) model proposed by Xu et al. is highly representative, but the image descriptions it produces are still not accurate enough.



Examples


Embodiment Construction

[0016] The present invention is further described below with reference to the accompanying drawings:

[0017] Figure 1 is a schematic diagram of initializing the attention model with the attribute probability vector. The procedure includes the following steps:

[0018] (1) The input image is passed through a fully convolutional neural network to obtain a feature map, and a multi-instance learning layer then produces the attribute probability vector.

[0019] (2) A threshold is applied to the attribute probability vector, and the result is used to initialize the LSTM states $c_0$ and $h_0$. The initialization is defined as follows:

[0020] $c_0 = f(W_{ini} \odot V_{att})$

[0021] $h_0 = f(W_{ini} \odot V_{att})$

[0022] In the formulas above, $W_{ini}$ denotes the parameters to be learned during training, $V_{att}$ denotes the attribute probability vector, and $\odot$ denotes multiplication of corresponding matrix elements.
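Steps (1) and (2) can be illustrated with a minimal sketch in plain Python. Several details are assumptions, not taken from the patent text: the multi-instance learning layer is modeled as noisy-OR pooling over per-region probabilities, the threshold rule zeroes out probabilities below a cutoff of 0.5, the function f is taken to be tanh, and W_ini is a vector with the same length as V_att so that ⊙ can be applied element by element. The names `noisy_or_pool` and `init_lstm_state` are hypothetical.

```python
import math


def noisy_or_pool(region_probs):
    """Pool per-region probabilities of one attribute into one image-level value.

    Noisy-OR pooling is an ASSUMPTION: the text only names a
    multi-instance learning layer without giving its formula.
    """
    miss = 1.0
    for p in region_probs:
        miss *= 1.0 - p  # probability that no region shows the attribute
    return 1.0 - miss


def init_lstm_state(v_att, w_ini, threshold=0.5, f=math.tanh):
    """Return (c0, h0), each computed as f(w_ini * v_att) element by element.

    ASSUMPTIONS: probabilities below `threshold` are set to zero, f is tanh,
    and w_ini has the same length as v_att.
    """
    if len(v_att) != len(w_ini):
        raise ValueError("v_att and w_ini must have the same length")
    # Keep only attributes whose probability reaches the threshold.
    kept = [p if p >= threshold else 0.0 for p in v_att]
    state = [f(w * p) for w, p in zip(w_ini, kept)]
    # The text gives the same expression for c0 and h0, so both start equal.
    return list(state), list(state)


# Toy input: three attributes, each scored on two image regions.
region_probs = [[0.5, 0.5], [0.1, 0.0], [0.0, 0.6]]
v_att = [noisy_or_pool(probs) for probs in region_probs]
c0, h0 = init_lstm_state(v_att, w_ini=[1.0, 1.0, -0.5])
```

In a trained model, `w_ini` would be learned rather than fixed by hand, and the pooled probabilities would come from the network's feature map rather than a toy list.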

[0023] Figure 2 is a functional block diagram of the guidin...


Abstract

The invention discloses a picture description method in which an attribute probability vector guides the attention mode. The method comprises the following steps: an input image is passed through a fully convolutional neural network to obtain a feature map, and a multi-instance learning layer then produces the attribute probability vector; a threshold is applied to the attribute probability vector, and the result initializes the states c_0 and h_0 of a long short-term memory (LSTM) unit; the attribute probability vector guides the attention mode, which combines it with the state h_{t-1} of the description-sentence LSTM at the previous moment to generate, from the region of the feature map attended at the current moment, the encoding vector that currently needs attention; the description-sentence LSTM outputs the current state h_t from the current encoding vector; the current output state then becomes the previous state, and these operations repeat until the description sentence is complete. Compared with other methods, the proposed method clearly improves results, performs better overall on the evaluation metrics, and is largely adequate for general picture description needs.
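The decoding loop in the abstract can be sketched roughly as follows, in plain Python. This is an illustration under stated assumptions, not the patent's model: the attention score of each region is assumed to be the sum of two dot products (region with h_{t-1}, region with a guidance vector), the guidance vector `guide` is assumed to be the attribute probability vector already projected to the region-feature dimension, and `toy_step` is a stand-in recurrent update, not an LSTM. All function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(regions, h_prev, guide):
    """Soft attention: weight each region, then return the weighted average.

    ASSUMPTION: the score adds a guidance term to the usual h_prev term.
    """
    scores = [dot(r, h_prev) + dot(r, guide) for r in regions]
    weights = softmax(scores)
    dim = len(regions[0])
    context = [sum(w * r[k] for w, r in zip(weights, regions)) for k in range(dim)]
    return context, weights


def toy_step(h_prev, context):
    """Stand-in for the LSTM update (NOT an LSTM): h_t = tanh(h_prev + context)."""
    return [math.tanh(h + c) for h, c in zip(h_prev, context)]


def decode(regions, h0, guide, steps):
    """Repeat attend-then-update; each h_t becomes h_{t-1} of the next step."""
    h = h0
    history = []
    for _ in range(steps):
        context, weights = attend(regions, h, guide)
        h = toy_step(h, context)
        history.append(weights)
    return h, history


# Two regions with 2-dimensional features; the guide favors the first one.
regions = [[1.0, 0.0], [0.0, 1.0]]
h_final, history = decode(regions, h0=[0.0, 0.0], guide=[2.0, 0.0], steps=3)
```

A real decoder would also emit a word at each step and stop at an end-of-sentence token; the fixed `steps` count here only keeps the sketch short.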

Description

Technical field

[0001] The present invention provides a picture description method in which an attribute probability vector guides the attention mode, and relates to the technical fields of deep learning and computer vision.

Background technique

[0002] An important feature of human perception is that people do not process an entire scene at once. Instead, they focus on certain parts of the visual space to obtain information when and where it is needed, and over time they build an internal representation of the scene from the information gathered at different fixation points to guide subsequent cognition and action. Because a part of a scene is simpler than the whole, this mechanism of concentrating limited brain resources on the important parts of a scene directly reduces the complexity of scene processing, because it allows humans to always place the objects of interest i...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06K9/62
Inventor: 何小海, 何榜耕, 张杰, 苏婕, 卿粼波, 吴晓红, 滕奇志
Owner: SICHUAN UNIV