Deep attention mechanism-based image description generation method

An image description and attention technology, applied in the field of image understanding, which addresses problems such as shallow model depth, complicated processing pipelines, and weak sentence-level semantic information, and achieves the effects of stronger semantic expression, deeper representation capability, and improved overall method performance.

Active Publication Date: 2018-05-18
TONGJI UNIV

AI Technical Summary

Problems solved by technology

However, these methods rely too heavily on earlier visual techniques, the processing pipeline is complicated, and the language model that generates sentences at the back end of the system is insufficiently optimized; when the LSTM unit...

Method used



Examples


Embodiment Construction

[0041] The present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments. This embodiment is carried out on the premise of the technical solution of the present invention, and detailed implementation methods and specific operation processes are given, but the protection scope of the present invention is not limited to the following embodiments.

[0042] This embodiment proposes an image description generation method based on a deep attention mechanism, including:

[0043] A deep long short-term memory (LSTM) network model building step: an attention mechanism function is added between the units of the LSTM network model, and the LSTM network with the added attention mechanism function is trained using the training image features extracted by a convolutional neural network together with the description information of the training images, so as to obtain a deep long short-term memory network model.
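The patent text above gives only this high-level training step. As a rough, non-authoritative illustration of what "an attention mechanism function between the units of the LSTM network" could look like, the following PyTorch sketch stacks two LSTM cells and recomputes soft attention over CNN region features before each cell at every time step. All class, parameter, and variable names (DeepAttentionLSTM, feat_dim, attend, and so on) are hypothetical and are not taken from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DeepAttentionLSTM(nn.Module):
    """Hypothetical sketch: two stacked LSTM cells, each preceded by
    soft attention over CNN region features (an 'attention function
    between the units' of the deep LSTM)."""

    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512, feat_dim=2048):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # layer 1 consumes the word embedding plus an attended image context
        self.lstm1 = nn.LSTMCell(embed_dim + feat_dim, hidden_dim)
        # layer 2 consumes layer 1's hidden state plus a re-attended image context
        self.lstm2 = nn.LSTMCell(hidden_dim + feat_dim, hidden_dim)
        self.att1 = nn.Linear(hidden_dim + feat_dim, 1)  # attention scorer before layer 1
        self.att2 = nn.Linear(hidden_dim + feat_dim, 1)  # attention scorer before layer 2
        self.out = nn.Linear(hidden_dim, vocab_size)

    def attend(self, scorer, h, feats):
        # feats: (batch, regions, feat_dim), h: (batch, hidden_dim)
        h_exp = h.unsqueeze(1).expand(-1, feats.size(1), -1)
        scores = scorer(torch.cat([h_exp, feats], dim=2)).squeeze(2)
        alpha = F.softmax(scores, dim=1)                 # soft attention weights over regions
        return (alpha.unsqueeze(2) * feats).sum(dim=1)   # weighted context vector

    def forward(self, feats, captions):
        # feats: CNN region features, captions: (batch, T) ground-truth token ids
        batch, T = captions.shape
        h1 = c1 = feats.new_zeros(batch, self.lstm1.hidden_size)
        h2 = c2 = feats.new_zeros(batch, self.lstm2.hidden_size)
        logits = []
        for t in range(T):
            w = self.embed(captions[:, t])
            ctx1 = self.attend(self.att1, h1, feats)     # attention before LSTM layer 1
            h1, c1 = self.lstm1(torch.cat([w, ctx1], dim=1), (h1, c1))
            ctx2 = self.attend(self.att2, h2, feats)     # attention again before LSTM layer 2
            h2, c2 = self.lstm2(torch.cat([h1, ctx2], dim=1), (h2, c2))
            logits.append(self.out(h2))
        return torch.stack(logits, dim=1)                # (batch, T, vocab_size)
```

In training, the returned logits would typically be compared with the shifted ground-truth captions via cross-entropy; this is one common way to realize the training step described above, not necessarily the patent's own procedure.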


Abstract

The invention relates to a deep attention mechanism-based image description generation method. The method comprises a deep long short-term memory network model building step, in which an attention mechanism function is added between the units of a long short-term memory network model and the network with the added attention mechanism function is trained using features of training images extracted by a convolutional neural network together with the description information of those images, so as to obtain a deep long short-term memory network model; and an image description generation step, in which the descriptions corresponding to the images to be described are generated by passing the images through the convolutional neural network model and the deep long short-term memory network model in sequence. Compared with the prior art, the method has the advantages of effective information extraction, strong deep expression capability, and accurate description.
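As a hedged sketch of the second step in the abstract (generating a description for a new image by passing it through a CNN and then the trained deep LSTM), greedy decoding might look like the code below. The choice of ResNet-101 as feature extractor, the idx2word vocabulary, the start/end token ids, and the DeepAttentionLSTM model from the earlier sketch are all assumptions for illustration, not the patent's implementation.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image


def generate_caption(model, image_path, idx2word, start_id=1, end_id=2, max_len=20):
    """Greedy decoding sketch: image -> CNN region features -> deep attention LSTM -> words.
    `model` is assumed to be a trained DeepAttentionLSTM from the sketch above."""
    # extract region features with a pretrained CNN (assumed choice: ResNet-101)
    cnn = models.resnet101(weights=models.ResNet101_Weights.DEFAULT)
    extractor = torch.nn.Sequential(*list(cnn.children())[:-2]).eval()  # drop pool and fc
    preprocess = T.Compose([T.Resize((224, 224)), T.ToTensor(),
                            T.Normalize(mean=[0.485, 0.456, 0.406],
                                        std=[0.229, 0.224, 0.225])])
    img = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        fmap = extractor(img)                            # (1, 2048, 7, 7) feature map
        feats = fmap.flatten(2).transpose(1, 2)          # (1, 49, 2048) region features
        words, token = [], start_id
        h1 = c1 = feats.new_zeros(1, model.lstm1.hidden_size)
        h2 = c2 = feats.new_zeros(1, model.lstm2.hidden_size)
        for _ in range(max_len):
            w = model.embed(torch.tensor([token]))
            ctx1 = model.attend(model.att1, h1, feats)
            h1, c1 = model.lstm1(torch.cat([w, ctx1], dim=1), (h1, c1))
            ctx2 = model.attend(model.att2, h2, feats)
            h2, c2 = model.lstm2(torch.cat([h1, ctx2], dim=1), (h2, c2))
            token = model.out(h2).argmax(dim=1).item()   # greedy word choice
            if token == end_id:
                break
            words.append(idx2word[token])
    return " ".join(words)
```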

Description

Technical Field

[0001] The invention relates to the field of image understanding, in particular to an image description generation method based on a deep attention mechanism.

Background Technique

[0002] Image caption generation is a very challenging task, and it has broad application prospects in the fields of early childhood education, visually impaired assistance, and human-computer interaction. It combines the two fields of natural language processing and computer vision to describe a natural image in the form of natural language, or translate the image into natural language. It first requires the system to be able to accurately understand the content in the image, such as identifying the scene in the image, various objects, object attributes, ongoing actions, and the relationships between objects; then, according to grammatical rules and language structure, generate human-understandable sentences.

[0003] A variety of methods have been proposed to solve this pro...

Claims


Application Information

IPC (IPC8): G06F17/28, G06N3/04, G06K9/62
CPC: G06F40/55, G06N3/045, G06F18/214
Inventor: 王瀚漓, 方芳
Owner: TONGJI UNIV