Multiple attention and multiple scale-based image describing method

A technology of image description and attention, applied in the field of image processing, which can solve problems such as extracted image features being under-used and contextual information in the image being ignored

Active Publication Date: 2018-11-23
SHAANXI NORMAL UNIV
View PDF · 8 Cites · 46 Cited by

AI Technical Summary

Problems solved by technology

Although the above method utilizes the context information in the input image, the language decoding model uses only a single attention model to exploit the extracted image features, and the input image only use...

Method used



Examples


Embodiment 1

[0049] Taking 100,000 images from the Microsoft Common Objects in Context (COCO) 2014 dataset as an example, the image description generation method based on multi-attention and multi-scale consists of the following steps:

[0050] (1) Select an image detection model for extracting image features

[0051] Select the region-based convolutional neural network object detection method to construct the object detection model. This is a known method, published in Advances in Neural Information Processing Systems, 2015. Use the PASCAL Visual Object Classes (VOC) 2007 dataset to pre-train the object detection model, and select the model with the best detection performance during training as the object detection model for extracting image features.
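The detector's outputs can be reduced to a fixed set of per-region feature vectors for the captioning model. The excerpt does not state the region count or feature dimension, so the sketch below assumes 36 regions and 2048-dimensional features (a common choice for region-based detectors); `select_region_features` is a hypothetical helper, not the patent's own code:

```python
import numpy as np

def select_region_features(boxes, scores, feats, k=36):
    """Keep the k highest-scoring detections and their feature vectors.

    boxes:  (N, 4) array of [x1, y1, x2, y2] box coordinates
    scores: (N,)   detection confidences from the detector
    feats:  (N, D) per-region feature vectors from the detector head
    """
    order = np.argsort(scores)[::-1][:k]  # indices of the k best scores
    return boxes[order], feats[order]

# Illustrative detector output for one image (random stand-in values).
rng = np.random.default_rng(0)
boxes = rng.uniform(0, 224, size=(100, 4))
scores = rng.uniform(size=100)
feats = rng.normal(size=(100, 2048))

top_boxes, top_feats = select_region_features(boxes, scores, feats, k=36)
top_feats.shape  # (36, 2048)
```

The fixed-size stack of region features then serves as the input that the attention model weights during language decoding.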

[0052] (2) Divide the data into a network training set, validation set, and test set

[0053] Divide the Microsoft Common Objects in Context (COCO) 2014 dataset into network trai...
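The excerpt truncates before the exact split sizes are given, so the 80/10/10 ratio over the 100,000 images below is an assumption for illustration; only the mechanics of a reproducible split are shown:

```python
import random

def split_dataset(image_ids, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle image ids reproducibly and cut them into train/val/test."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split repeatable
    n = len(ids)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(range(100_000))
# 80,000 / 10,000 / 10,000 images
```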

Embodiment 2

[0085] Taking 100,000 images from the Microsoft Common Objects in Context (COCO) 2014 dataset as an example, the image description generation method based on multi-attention and multi-scale consists of the following steps:

[0086] In step (1), selecting an image detection model for extracting image features, the region-based convolutional neural network object detection method is selected to construct the object detection model. This is a known method, published in Advances in Neural Information Processing Systems, 2015. The PASCAL Visual Object Classes (VOC) 2012 dataset is used to pre-train the object detection model, and the model with the best detection performance during training is selected as the object detection model for extracting image features.

[0087] The other steps are the same as in Embodiment 1, completing the image description.

[0088] In order to verify the beneficial effects of the present inve...



Abstract

The invention discloses a multi-attention and multi-scale image description method. The method comprises the steps of selecting an image detection model for extracting image features, dividing the data into a network training set, a validation set and a test set, extracting the image features, constructing an attention recurrent neural network model, training the attention recurrent neural network model, and carrying out image description. The method constructs an image description generation network model composed of original image feature extraction, multi-attention multi-scale feature mapping, recurrent neural network residual connection, and recurrent neural network language decoding, so that the quality of the image description is improved and the detail of the image description is enriched. By means of the method, a high-quality description can be generated by the neural network model when only an image is available.
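The "multi-attention multi-scale feature mapping" is not detailed in this excerpt. As a minimal sketch, the additive soft-attention step below shows how one attention head weights a set of region features given the decoder's hidden state; all shapes and the projection parameters `W_h`, `W_v`, `w` are illustrative assumptions, not the patent's specification:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def soft_attention(h, V, W_h, W_v, w):
    """One additive (Bahdanau-style) attention read over region features.

    h:   (H,)    current decoder hidden state
    V:   (K, D)  K region feature vectors
    W_h: (A, H), W_v: (A, D), w: (A,)  learned projections (assumed shapes)
    Returns the attention weights (K,) and the context vector (D,).
    """
    scores = np.tanh(V @ W_v.T + W_h @ h) @ w  # (K,) unnormalized relevance
    alpha = softmax(scores)                    # (K,) weights summing to 1
    context = alpha @ V                        # (D,) weighted feature mix
    return alpha, context
```

A multi-attention, multi-scale variant would run several such heads over feature sets extracted at different scales and combine their context vectors (e.g. by concatenation) before the recurrent language decoder, with residual connections around the recurrent layers as the abstract describes.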

Description

Technical field

[0001] The invention relates to the technical field of image processing, and specifically to a multi-attention and multi-scale image description method.

Technical background

[0002] In fields such as robot question answering, pedestrian guidance, and children's auxiliary education, we often encounter problems that require understanding the meaning of images and conveying it to people through text. Image description combines the two fields of natural language processing and computer vision, generating language text corresponding to the content of an input natural image.

[0003] Because an image not only contains basic information indicating the type and location of objects, but also carries high-level information such as relationships and emotion, detecting and recognizing only the image objects loses a large amount of contextual information, including mutual relationships and emotion. So how to effectively use...

Claims


Application Information

IPC(8): G06K9/62, G06K9/46
CPC: G06V10/462, G06F18/24, G06F18/214
Inventors: 吴晓军, 张钰, 陈龙杰, 张玉梅
Owner: SHAANXI NORMAL UNIV