Multimode recurrent neural network picture description method based on FCN feature extraction

A technology combining recurrent neural networks and image description, applied to neural learning methods, biological neural network models, neural architectures, etc.; it addresses problems such as the inability to generate more complete image descriptions, the loss of image information, and the failure to generate descriptions for parts of an image.

Inactive Publication Date: 2017-06-13
SYSU CMU SHUNDE INT JOINT RES INST +1

AI Technical Summary

Problems solved by technology

[0003] Although M-RNN achieves good results under various evaluation standards, the model can only generate descriptions for targets that occupy a large area of the image. For regions that occupy only a small area, their information has already been lost by the time features are extracted, so the model cannot produce a more complete description of the image.



Examples


Embodiment 1

[0049] As shown in Figures 1-2, a multimodal recurrent neural network image description method based on FCN feature extraction comprises the following steps:

[0050] S1: Construction and training of the fully convolutional network (FCN)

[0051] S1.1 Acquire images: download the PASCAL VOC dataset, a standard, high-quality benchmark for image recognition and image classification, and use it to fine-tune and test the model;
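The patent only states that PASCAL VOC is downloaded and used to fine-tune and test the model. As a minimal sketch, assuming PyTorch/torchvision (not named in the source), the segmentation split could be loaded as follows:

```python
import torchvision
import torchvision.transforms as T

# Images are resized to a fixed size only so they can be batched;
# an FCN itself accepts inputs of arbitrary size.
img_tf = T.Compose([T.Resize((500, 500)), T.ToTensor()])
# Nearest-neighbour resizing keeps the segmentation labels intact.
mask_tf = T.Compose([T.Resize((500, 500), interpolation=T.InterpolationMode.NEAREST),
                     T.PILToTensor()])

# download=True fetches PASCAL VOC 2012 if it is not already under ./data.
voc_train = torchvision.datasets.VOCSegmentation(
    root="./data", year="2012", image_set="train",
    download=True, transform=img_tf, target_transform=mask_tf)
voc_val = torchvision.datasets.VOCSegmentation(
    root="./data", year="2012", image_set="val",
    download=True, transform=img_tf, target_transform=mask_tf)
```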

[0052] S1.2 Adapt the existing pre-trained convolutional neural network model AlexNet to obtain a preliminary fully convolutional network model;

[0053] S1.3 Remove the classification layer of the AlexNet convolutional neural network and convert its fully connected layers into convolutional layers;
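A minimal sketch of steps S1.2-S1.3, assuming PyTorch/torchvision (the patent does not specify a framework): the 1000-way classification layer of a pre-trained AlexNet is dropped, and its two fully connected layers are recast as convolutions so the network produces spatial score maps for inputs of any size.

```python
import torch.nn as nn
import torchvision

NUM_CLASSES = 21  # 20 PASCAL VOC object classes + background (assumed)

alexnet = torchvision.models.alexnet(weights="DEFAULT")  # pre-trained AlexNet

class AlexNetFCN(nn.Module):
    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        self.features = alexnet.features                 # conv1 ... pool5
        # fc6 becomes a 6x6 convolution over pool5's 256 feature maps,
        # fc7 a 1x1 convolution; their weights are copied from the
        # corresponding fully connected layers.
        self.fc6 = nn.Conv2d(256, 4096, kernel_size=6)
        self.fc7 = nn.Conv2d(4096, 4096, kernel_size=1)
        # The original classifier is removed; a 1x1 "score" layer predicts
        # a per-class heat map instead.
        self.score = nn.Conv2d(4096, num_classes, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)
        self.drop = nn.Dropout()
        self._copy_fc_weights()

    def _copy_fc_weights(self):
        fc6, fc7 = alexnet.classifier[1], alexnet.classifier[4]
        self.fc6.weight.data.copy_(fc6.weight.data.view(4096, 256, 6, 6))
        self.fc6.bias.data.copy_(fc6.bias.data)
        self.fc7.weight.data.copy_(fc7.weight.data.view(4096, 4096, 1, 1))
        self.fc7.bias.data.copy_(fc7.bias.data)

    def forward(self, x):
        x = self.features(x)
        x = self.drop(self.relu(self.fc6(x)))
        x = self.drop(self.relu(self.fc7(x)))
        return self.score(x)  # coarse per-class score map from pool5
```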

[0054] S1.4 Perform 2x upsampling on the convolution result of the topmost pooling layer (pool5) to obtain an upsampled prediction from pool5; this prediction carries only coarse image information. ...
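Continuing the sketch above, the 2x upsampling of the pool5 prediction can be done with a learnable transposed convolution. Fusing the result with a prediction from an earlier pooling layer, as in the standard FCN skip architecture, is an assumption here, since the paragraph is truncated in the source.

```python
import torch.nn as nn

class ScoreUpsample2x(nn.Module):
    """Doubles the spatial resolution of a per-class score map."""
    def __init__(self, num_classes=21):
        super().__init__()
        # kernel_size=4, stride=2, padding=1 exactly doubles height and width.
        self.up = nn.ConvTranspose2d(num_classes, num_classes,
                                     kernel_size=4, stride=2, padding=1,
                                     bias=False)

    def forward(self, score_pool5):
        return self.up(score_pool5)  # 2x upsampled, still coarse

# Hypothetical fusion with a pool4 prediction (FCN-16s style, assumed):
# fused = ScoreUpsample2x()(score_pool5) + score_pool4
```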



Abstract

The invention provides a multimode recurrent neural network picture description method based on FCN feature extraction. A multimodal model composed of three parts, namely a recurrent neural network (RNN), a fully convolutional neural network (FCN) and a multimodal layer, is obtained by training on a large number of images labeled with text descriptions, so that a text description can be generated automatically for any input test image. The method effectively extracts image features, retains more detailed image information, and better establishes the relationship between the words of the text description and the image. It has significant advantages in generating semantics-based descriptions of salient targets or scenes in an image.
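A minimal sketch of the three-part model named in the abstract (RNN, FCN, multimodal layer), assuming PyTorch; the dimensions and the exact fusion rule are illustrative assumptions rather than details taken from the patent text.

```python
import torch
import torch.nn as nn

class MultimodalCaptioner(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=256,
                 img_feat_dim=4096, multimodal_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)       # word embedding
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        # The multimodal layer projects the FCN image feature and the RNN
        # state into a shared space and predicts the next word from the fusion.
        self.img_proj = nn.Linear(img_feat_dim, multimodal_dim)
        self.txt_proj = nn.Linear(hidden_dim, multimodal_dim)
        self.out = nn.Linear(multimodal_dim, vocab_size)

    def forward(self, fcn_features, word_ids):
        # fcn_features: (batch, img_feat_dim) vector pooled from FCN feature maps
        # word_ids:     (batch, seq_len) tokens of the partial description
        h, _ = self.rnn(self.embed(word_ids))            # (batch, seq, hidden)
        img = self.img_proj(fcn_features).unsqueeze(1)   # (batch, 1, multimodal)
        fused = torch.tanh(self.txt_proj(h) + img)       # broadcast over time
        return self.out(fused)                           # next-word logits

# Training would minimise cross-entropy between these logits and the
# ground-truth next words of the labelled descriptions.
```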

Description

Technical field

[0001] The present invention relates to the field of artificial intelligence, and more specifically to a multimodal recurrent neural network image description method based on FCN feature extraction.

Background technique

[0002] In recent years, the recurrent neural network (RNN) and the convolutional neural network (CNN) have achieved success in natural language processing and image classification respectively, which has led to the emergence, in the field of machine learning, of methods that combine recurrent and convolutional neural networks for automatic image description. Automatic generation of image descriptions is an important branch of artificial intelligence that can be widely used in image retrieval, navigation for the blind, and so on; it has therefore attracted the attention of more and more researchers. In 2011, Mikolov et al. proposed a recurrent neural network model for natural language processing, which achieved the best res...


Application Information

IPC(8): G06F17/30; G06N3/04; G06N3/08
CPC: G06F16/51; G06F16/5866; G06N3/084; G06N3/045
Inventor: 胡海峰, 王伟轩, 张俊轩, 杨梁, 王腾
Owner: SYSU CMU SHUNDE INT JOINT RES INST