Picture description method of guiding attention mode on the basis of attribute probability vector

AI technical title (built by the PatSnap AI team to summarize the technical point of the patent document): an attribute probability vector technique, applied in the field of deep learning, that addresses the problem of inaccurate picture description results.

Pending Publication Date: 2018-01-12

SICHUAN UNIV

7 Cites, 10 Cited by

AI Technical Summary
Problems solved by technology

[0005] Among attention-based image description models, the Soft Attention (Soft-ATT) model proposed by Xu et al. is highly representative, but the image descriptions it produces are still not accurate enough.

Embodiment Construction

[0016] The present invention is further described below with reference to the accompanying drawings:

[0017] Figure 1 is a schematic diagram of initializing the attention model with the attribute probability vector, comprising the following steps:

[0018] (1) The input image is passed through a fully convolutional neural network to obtain a feature map, and then through a multi-instance learning layer to obtain the attribute probability vector.

[0019] (2) A certain threshold is selected for the obtained attribute probability vector, which is then used to initialize the LSTM states $c_0$ and $h_0$. The initialization is defined as follows:

[0020] $c_0 = f(W_{ini} \odot V_{att})$

[0021] $h_0 = f(W_{ini} \odot V_{att})$

[0022] In the above formulas, $W_{ini}$ denotes the parameters to be learned during training, $V_{att}$ denotes the attribute probability vector, and $\odot$ denotes element-wise multiplication of corresponding matrix entries.
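The initialization in [0019] to [0022] can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the patent's implementation: the text does not define $f$ (tanh is assumed here), does not say how the threshold is applied (entries below it are assumed to be zeroed), and the element-wise product is read literally, so `w_ini` is assumed to have the same shape as `v_att`. The function name `init_lstm_state` and the sample values are hypothetical.

```python
import numpy as np

def init_lstm_state(v_att, w_ini, threshold=0.5, f=np.tanh):
    """Return (c0, h0) = (f(W_ini * V_att), f(W_ini * V_att)).

    Assumptions (not stated in the source): f is tanh, attributes whose
    probability is below `threshold` are zeroed, and w_ini has the same
    shape as v_att so that the product is purely element-wise.
    """
    v_att = np.asarray(v_att, dtype=float)
    w_ini = np.asarray(w_ini, dtype=float)
    if v_att.shape != w_ini.shape:
        raise ValueError("w_ini and v_att must have the same shape")
    # Assumed thresholding: keep only sufficiently probable attributes.
    v_kept = np.where(v_att >= threshold, v_att, 0.0)
    # Both formulas in the source are identical, so c0 and h0 coincide.
    state = f(w_ini * v_kept)
    return state.copy(), state.copy()

# Hypothetical attribute probabilities and weights, for illustration only.
v_att = np.array([0.9, 0.2, 0.6, 0.05])
w_ini = np.array([1.0, -2.0, 0.5, 3.0])
c0, h0 = init_lstm_state(v_att, w_ini)
```

With these sample values, the second and fourth attributes fall below the assumed threshold and contribute zero to both states. In training, `w_ini` would be a learned parameter rather than a fixed array.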

[0023] Figure 2 is a functional block diagram of the guidin...

Abstract

The invention discloses a picture description method that guides an attention model on the basis of an attribute probability vector. The method comprises the following steps: passing the input image through a fully convolutional neural network to obtain a feature map, and then through a multi-instance learning layer to obtain the attribute probability vector; selecting a certain threshold for the obtained attribute probability vector and using it to initialize the states $c_0$ and $h_0$ of a long short-term memory (LSTM) unit; using the attribute probability vector to guide the attention model, which, combined with the state $h_{t-1}$ of the description-sentence LSTM at the previous time step, attends to a region of the feature map at the current time step and generates the encoding vector for that region; having the description-sentence LSTM output the state $h_t$ of the current time step from the current encoding vector; and treating the current output state as the previous state and repeating the preceding operations until the description is fully generated. Compared with other methods, the proposed method markedly improves results, achieves better overall performance on the evaluation metrics, and is largely adequate for general picture description needs.
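The decoding loop described in the abstract can be sketched as below. This is a rough illustration only: the visible text does not give the attention scoring function, so an additive (Soft-ATT style) score with an extra attribute term is assumed, and all parameter names (`W_a`, `W_h`, `W_v`, `w_e`, `W_lstm`, `W_out`), the sizes, the random weights, the `END` token id, and the `max_len` cutoff are hypothetical. The LSTM input is limited to the encoding vector, as the abstract states; a real captioning model would typically also feed in the previous word.

```python
import numpy as np

rng = np.random.default_rng(0)
R, D, A, H, V = 6, 8, 5, 5, 10  # regions, feature dim, attributes, hidden size, vocab size
END = 0                         # hypothetical end-of-sentence token id

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical, randomly initialized parameters (learned in a real model).
W_a = rng.normal(size=(H, D)) * 0.1
W_h = rng.normal(size=(H, H)) * 0.1
W_v = rng.normal(size=(H, A)) * 0.1
w_e = rng.normal(size=H) * 0.1
W_lstm = rng.normal(size=(4 * H, D + H)) * 0.1
b_lstm = np.zeros(4 * H)
W_out = rng.normal(size=(V, H)) * 0.1

def attend(features, h_prev, v_att):
    """Assumed attribute-guided additive attention over R region features."""
    scores = np.tanh(features @ W_a.T + W_h @ h_prev + W_v @ v_att) @ w_e
    alpha = softmax(scores)          # attention weights over regions, sum to 1
    return alpha, alpha @ features   # encoding vector of shape (D,)

def lstm_step(z, h_prev, c_prev):
    """Standard LSTM cell taking the encoding vector z as its input."""
    gates = W_lstm @ np.concatenate([z, h_prev]) + b_lstm
    i, f, o, g = np.split(gates, 4)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c

def describe(features, v_att, h0, c0, max_len=12):
    """Repeat attend -> LSTM step until END is emitted or max_len is reached."""
    h, c, words, alphas = h0, c0, [], []
    for _ in range(max_len):
        alpha, z = attend(features, h, v_att)
        h, c = lstm_step(z, h, c)    # current state becomes the previous state
        word = int(np.argmax(W_out @ h))
        alphas.append(alpha)
        if word == END:
            break
        words.append(word)
    return words, np.array(alphas)

features = rng.normal(size=(R, D))   # stand-in for the CNN feature map
v_att = rng.uniform(size=A)          # stand-in attribute probability vector
h0 = c0 = np.tanh(v_att)             # stand-in for the attribute-based initialization
words, alphas = describe(features, v_att, h0, c0)
```

Each row of `alphas` holds the attention weights over the feature-map regions for one time step, which makes it easy to inspect where the sketch "looks" while emitting each word id.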

Description

Technical field

[0001] The present invention provides a picture description method that guides an attention model on the basis of an attribute probability vector, and relates to the technical fields of deep learning and computer vision.

Background

[0002] An important feature of human perception is that people do not process an entire scene at once; instead, they focus on certain parts of the visual space to obtain the required time and place information. As time progresses, they build an internal representation of the scene from the information at different fixation points, which guides subsequent cognition and action. Because a part of a scene is simpler than the whole scene, this mechanism of concentrating "limited" brain resources on the important parts of the scene directly reduces the complexity of scene processing, because it allows humans to always place the objects of interest i...

Application Information

Patent Type & Authority: Applications (China)

IPC: IPC(8): G06K9/62

Inventors: 何小海, 何榜耕, 张杰, 苏婕, 卿粼波, 吴晓红, 滕奇志

Owner: SICHUAN UNIV
