Multiple attention and multiple scale-based image describing method

A technology of image description and attention, applied in the field of image processing, which can solve problems such as extracted image features being under-used and contextual information in the image being ignored

Active Publication Date: 2018-11-23
SHAANXI NORMAL UNIV
View PDF · 8 Cites · 46 Cited by

AI Technical Summary

Problems solved by technology

Although the above method utilizes the context information in the input image, the language decoding model uses only a single attention model to exploit the extracted image features, and the input image only use...

Method used



Examples


Embodiment 1

[0049] Taking 100,000 images from the Microsoft Common Objects in Context (COCO) 2014 dataset as an example, the image description generation method based on multi-attention and multi-scale consists of the following steps:

[0050] (1) Select an image detection model for extracting image features

[0051] Select the region-based convolutional neural network object detection method to construct the object detection model. This is a known method, published in Advances in Neural Information Processing Systems, 2015. Use the PASCAL Visual Object Classes (VOC) 2007 dataset to pre-train the object detection model, and select the model with the best detection performance during training as the object detection model for extracting image features.
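The detector's outputs can be reduced to a fixed set of per-region feature vectors for the captioning model. The excerpt does not state the region count or feature dimension, so the sketch below assumes 36 regions and 2048-dimensional features (a common choice for region-based detectors); `select_region_features` is a hypothetical helper, not the patent's own code:

```python
import numpy as np

def select_region_features(boxes, scores, feats, k=36):
    """Keep the k highest-scoring detections and their feature vectors.

    boxes:  (N, 4) array of [x1, y1, x2, y2] box coordinates
    scores: (N,)   detection confidences from the detector
    feats:  (N, D) per-region feature vectors from the detector head
    """
    order = np.argsort(scores)[::-1][:k]  # indices of the k best scores
    return boxes[order], feats[order]

# Illustrative detector output for one image (random stand-in values).
rng = np.random.default_rng(0)
boxes = rng.uniform(0, 224, size=(100, 4))
scores = rng.uniform(size=100)
feats = rng.normal(size=(100, 2048))

top_boxes, top_feats = select_region_features(boxes, scores, feats, k=36)
top_feats.shape  # (36, 2048)
```

The fixed-size stack of region features then serves as the input that the attention model weights during language decoding.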

[0052] (2) Divide the data into a network training set, validation set, and test set

[0053] Divide the Microsoft Common Objects in Context (COCO) 2014 dataset into network trai...
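The excerpt truncates before the exact split sizes are given, so the 80/10/10 ratio over the 100,000 images below is an assumption for illustration; only the mechanics of a reproducible split are shown:

```python
import random

def split_dataset(image_ids, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle image ids reproducibly and cut them into train/val/test."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split repeatable
    n = len(ids)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(range(100_000))
# 80,000 / 10,000 / 10,000 images
```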

Embodiment 2

[0085] Taking 100,000 images from the Microsoft Common Objects in Context (COCO) 2014 dataset as an example, the image description generation method based on multi-attention and multi-scale consists of the following steps:

[0086] In step (1), selecting an image detection model for extracting image features, the region-based convolutional neural network object detection method is selected to construct the object detection model. This is a known method, published in Advances in Neural Information Processing Systems, 2015. The PASCAL Visual Object Classes (VOC) 2012 dataset is used to pre-train the object detection model, and the model with the best detection performance during training is selected as the object detection model for extracting image features.

[0087] The other steps are the same as in Embodiment 1, completing the image description.

[0088] In order to verify the beneficial effects of the present inve...



Abstract

The invention discloses a multi-attention and multi-scale image description method. The method comprises the steps of selecting an image detection model for extracting image features, dividing the data into a network training set, a validation set and a test set, extracting the image features, constructing an attention recurrent neural network model, training the attention recurrent neural network model, and carrying out image description. The method constructs an image description generation network model composed of original image feature extraction, multi-attention multi-scale feature mapping, recurrent neural network residual connection, and recurrent neural network language decoding, so that the quality of the image description is improved and the detail of the image description is enriched. By means of the method, a high-quality description can be generated by the neural network model when only an image is available.
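The "multi-attention multi-scale feature mapping" is not detailed in this excerpt. As a minimal sketch, the additive soft-attention step below shows how one attention head weights a set of region features given the decoder's hidden state; all shapes and the projection parameters `W_h`, `W_v`, `w` are illustrative assumptions, not the patent's specification:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def soft_attention(h, V, W_h, W_v, w):
    """One additive (Bahdanau-style) attention read over region features.

    h:   (H,)    current decoder hidden state
    V:   (K, D)  K region feature vectors
    W_h: (A, H), W_v: (A, D), w: (A,)  learned projections (assumed shapes)
    Returns the attention weights (K,) and the context vector (D,).
    """
    scores = np.tanh(V @ W_v.T + W_h @ h) @ w  # (K,) unnormalized relevance
    alpha = softmax(scores)                    # (K,) weights summing to 1
    context = alpha @ V                        # (D,) weighted feature mix
    return alpha, context
```

A multi-attention, multi-scale variant would run several such heads over feature sets extracted at different scales and combine their context vectors (e.g. by concatenation) before the recurrent language decoder, with residual connections around the recurrent layers as the abstract describes.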

Description

Technical field

[0001] The invention relates to the technical field of image processing, and specifically to a multi-attention and multi-scale image description method.

Technical background

[0002] In fields such as robot question answering, pedestrian guidance, and children's auxiliary education, we often encounter problems that require understanding the meaning of images and conveying it to people through text. Image description combines the two fields of natural language processing and computer vision, generating language text corresponding to the content of an input natural image.

[0003] Because an image not only contains basic information indicating the type and location of objects, but also carries high-level information such as relationships and emotion, detecting and recognizing only the image objects loses a large amount of contextual information, including mutual relationships and emotion. So how to effectively use...

Claims


Application Information

IPC(8): G06K9/62, G06K9/46
CPC: G06V10/462, G06F18/24, G06F18/214
Inventors: 吴晓军, 张钰, 陈龙杰, 张玉梅
Owner: SHAANXI NORMAL UNIV