Image description method based on convolutional neural network, computer readable storage medium and electronic equipment

A convolutional neural network and image description technology, applied to neural learning methods, biological neural network models, computation, etc. It solves the problem that existing methods cannot process sequence signals in parallel and are computationally time-consuming, and achieves the effects of accurate description of image content and improved computing efficiency.

Active Publication Date: 2019-09-27
XI'AN INST OF OPTICS & FINE MECHANICS - CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to solve the problem that the existing recurrent neural network method cannot process sequence signals in parallel and is time-consuming in operation.



Embodiment Construction

[0061] The content of the present invention is described in further detail below in conjunction with the accompanying drawings and specific embodiments:

[0062] The invention discloses an image description method (caption or sentence generation) based on a convolutional neural network, which automatically generates a passage of descriptive text for a picture and mainly solves the problem that existing recurrent neural network (Recurrent Neural Network, RNN) methods cannot process sequence signals in parallel. The implementation steps are: (1) pre-train the convolutional neural network on the ImageNet dataset; (2) use the pre-trained convolutional neural network to extract global features and local features from the image-text dataset; (3) input the fused image features and description-sentence features of the image-text training set into the multimodal recurrent neural network to learn the mapping relationship between image and text; (...
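A minimal sketch of steps (1)-(2), assuming a PyTorch/torchvision ResNet-50 as the pre-trained backbone; the patent does not name the backbone or the projection sizes, so those choices are illustrative only:

```python
# Hedged sketch: extract a global feature and a grid of local features with an
# ImageNet-pretrained CNN, then project both into a shared multimodal space.
import torch
import torch.nn as nn
from torchvision import models

class ImageEncoder(nn.Module):
    def __init__(self, embed_dim=512):
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
        # keep all layers up to the last conv block -> local feature map (B, 2048, 7, 7)
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        self.pool = nn.AdaptiveAvgPool2d(1)            # pooled global feature (B, 2048)
        self.proj_global = nn.Linear(2048, embed_dim)  # projection into multimodal space
        self.proj_local = nn.Linear(2048, embed_dim)

    def forward(self, images):                          # images: (B, 3, 224, 224)
        fmap = self.features(images)                    # (B, 2048, 7, 7)
        global_feat = self.pool(fmap).flatten(1)        # (B, 2048)
        local_feat = fmap.flatten(2).transpose(1, 2)    # (B, 49, 2048) regions
        return self.proj_global(global_feat), self.proj_local(local_feat)

# usage: g, l = ImageEncoder()(torch.randn(2, 3, 224, 224))
```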



Abstract

The invention provides an image description method based on a convolutional neural network, a computer readable storage medium and electronic equipment, and solves the problems that an existing recurrent neural network method cannot process sequence signals in parallel and is time-consuming in operation. The method comprises the following steps: 1) pre-training the convolutional neural network; 2) extracting global features and local features of the image, and projecting the global features and the local features of the image into a multi-modal mapping space; 3) performing convolutional coding on the image expression in the multi-modal mapping space of step 2); 4) carrying out word feature expression; 5) carrying out convolutional coding on the description sentence of step 4); 6) calculating the attention, and obtaining the probability of word generation corresponding to the input image; 7) constructing a target loss function between input and output, and performing neural network training by using the loss function to obtain the position parameters of the neural network; and 8) inputting the test image into the trained neural network system to obtain a descriptive natural-language sentence corresponding to the test image.
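A minimal sketch of steps 3)-6): the projected image expression and the embedded description words are each passed through 1-D convolutions, an attention weight over image regions is computed for every word position, and a softmax over the vocabulary gives the word-generation probability. The layer sizes, the causal-convolution choice and the dot-product attention are assumptions made for illustration; the patent's exact architecture may differ.

```python
# Hedged sketch of a convolutional captioning decoder with attention (not the
# patent's exact network): conv-encode image regions and past words, attend,
# and predict next-word logits.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvCaptionDecoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=512, kernel=3):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, embed_dim)         # step 4: word features
        self.img_conv = nn.Conv1d(embed_dim, embed_dim, kernel, padding=kernel // 2)
        self.txt_conv = nn.Conv1d(embed_dim, embed_dim, kernel)       # causal: left padding only
        self.kernel = kernel
        self.out = nn.Linear(2 * embed_dim, vocab_size)

    def forward(self, local_feats, words):
        # local_feats: (B, R, D) projected image regions; words: (B, T) token ids
        img = self.img_conv(local_feats.transpose(1, 2)).transpose(1, 2)  # step 3: (B, R, D)
        txt = self.word_embed(words).transpose(1, 2)                      # (B, D, T)
        txt = F.pad(txt, (self.kernel - 1, 0))                            # look only at past words
        txt = self.txt_conv(txt).transpose(1, 2)                          # step 5: (B, T, D)
        attn = torch.softmax(txt @ img.transpose(1, 2), dim=-1)           # step 6: (B, T, R)
        context = attn @ img                                              # attended image feature
        return self.out(torch.cat([txt, context], dim=-1))                # (B, T, vocab) logits

# step 7 would then train with cross-entropy between these logits and the shifted captions.
```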

Description

Technical field

[0001] The invention relates to image and text multimodal fusion technology, and in particular to an image description method based on a convolutional neural network, a computer readable storage medium, and electronic equipment, which can be used for early childhood education, human-computer interaction, visual assistance for visually impaired people, and the like.

Background technique

[0002] With the development of science and technology, artificial intelligence has gradually become a decisive force pushing mankind into the intelligent age. Artificial intelligence studies how to let machines simulate the human thinking process and intelligent behavior, for example letting a computer automatically generate a descriptive text for a natural image and describe the content of the image in one sentence. In recent years, deep learning has made great breakthroughs in the fields of computer vision, natural language processing, and speech information processing, and has also received widespread...


Application Information

IPC(8): G06T9/00, G06N3/04, G06N3/08
CPC: G06T9/002, G06N3/08, G06N3/045
Inventors: 郑向涛, 卢孝强, 吴思远
Owner: XI'AN INST OF OPTICS & FINE MECHANICS - CHINESE ACAD OF SCI