Image description method based on convolutional neural network, computer readable storage medium and electronic equipment

A convolutional neural network and image description technology, applied to neural learning methods, biological neural network models, computation, etc. It solves the problem that existing methods cannot process sequence signals in parallel and are computationally time-consuming, and achieves the effects of accurate description of image content and improved computing efficiency.

Active Publication Date: 2019-09-27
XI'AN INST OF OPTICS & FINE MECHANICS - CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to solve the problem that the existing recurrent neural network method cannot process sequence signals in parallel and is time-consuming in operation.



Embodiment Construction

[0061] The content of the present invention is described in further detail below in conjunction with the accompanying drawings and specific embodiments:

[0062] The invention discloses an image description method (caption or sentence generation) based on a convolutional neural network, which automatically generates a passage of descriptive text for a picture and mainly solves the problem that existing recurrent neural network (Recurrent Neural Network, RNN) methods cannot process sequence signals in parallel. The implementation steps are: (1) pre-train the convolutional neural network on the ImageNet dataset; (2) use the pre-trained convolutional neural network to extract global features and local features from the image-text dataset; (3) input the fused image features and description-sentence features of the image-text training set into the multimodal recurrent neural network to learn the mapping relationship between image and text; (...
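A minimal sketch of steps (1)-(2), assuming a PyTorch/torchvision ResNet-50 as the pre-trained backbone; the patent does not name the backbone or the projection sizes, so those choices are illustrative only:

```python
# Hedged sketch: extract a global feature and a grid of local features with an
# ImageNet-pretrained CNN, then project both into a shared multimodal space.
import torch
import torch.nn as nn
from torchvision import models

class ImageEncoder(nn.Module):
    def __init__(self, embed_dim=512):
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
        # keep all layers up to the last conv block -> local feature map (B, 2048, 7, 7)
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        self.pool = nn.AdaptiveAvgPool2d(1)            # pooled global feature (B, 2048)
        self.proj_global = nn.Linear(2048, embed_dim)  # projection into multimodal space
        self.proj_local = nn.Linear(2048, embed_dim)

    def forward(self, images):                          # images: (B, 3, 224, 224)
        fmap = self.features(images)                    # (B, 2048, 7, 7)
        global_feat = self.pool(fmap).flatten(1)        # (B, 2048)
        local_feat = fmap.flatten(2).transpose(1, 2)    # (B, 49, 2048) regions
        return self.proj_global(global_feat), self.proj_local(local_feat)

# usage: g, l = ImageEncoder()(torch.randn(2, 3, 224, 224))
```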



Abstract

The invention provides an image description method based on a convolutional neural network, a computer readable storage medium and electronic equipment, and solves the problems that an existing recurrent neural network method cannot process sequence signals in parallel and is time-consuming in operation. The method comprises the following steps: 1) pre-training the convolutional neural network; 2) extracting global features and local features of the image, and projecting the global features and the local features of the image into a multi-modal mapping space; 3) performing convolutional coding on the image expression in the multi-modal mapping space of step 2); 4) carrying out word feature expression; 5) carrying out convolutional coding on the description sentence of step 4); 6) calculating the attention, and obtaining the probability of word generation corresponding to the input image; 7) constructing a target loss function between input and output, and performing neural network training by using the loss function to obtain the position parameters of the neural network; and 8) inputting the test image into the trained neural network system to obtain a descriptive natural-language sentence corresponding to the test image.
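A minimal sketch of steps 3)-6): the projected image expression and the embedded description words are each passed through 1-D convolutions, an attention weight over image regions is computed for every word position, and a softmax over the vocabulary gives the word-generation probability. The layer sizes, the causal-convolution choice and the dot-product attention are assumptions made for illustration; the patent's exact architecture may differ.

```python
# Hedged sketch of a convolutional captioning decoder with attention (not the
# patent's exact network): conv-encode image regions and past words, attend,
# and predict next-word logits.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvCaptionDecoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=512, kernel=3):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, embed_dim)         # step 4: word features
        self.img_conv = nn.Conv1d(embed_dim, embed_dim, kernel, padding=kernel // 2)
        self.txt_conv = nn.Conv1d(embed_dim, embed_dim, kernel)       # causal: left padding only
        self.kernel = kernel
        self.out = nn.Linear(2 * embed_dim, vocab_size)

    def forward(self, local_feats, words):
        # local_feats: (B, R, D) projected image regions; words: (B, T) token ids
        img = self.img_conv(local_feats.transpose(1, 2)).transpose(1, 2)  # step 3: (B, R, D)
        txt = self.word_embed(words).transpose(1, 2)                      # (B, D, T)
        txt = F.pad(txt, (self.kernel - 1, 0))                            # look only at past words
        txt = self.txt_conv(txt).transpose(1, 2)                          # step 5: (B, T, D)
        attn = torch.softmax(txt @ img.transpose(1, 2), dim=-1)           # step 6: (B, T, R)
        context = attn @ img                                              # attended image feature
        return self.out(torch.cat([txt, context], dim=-1))                # (B, T, vocab) logits

# step 7 would then train with cross-entropy between these logits and the shifted captions.
```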

Description

Technical field

[0001] The invention relates to image and text multimodal fusion technology, and in particular to an image description method based on a convolutional neural network, a computer readable storage medium, and electronic equipment, which can be used for early childhood education, human-computer interaction, visual assistance for visually impaired people, and the like.

Background technique

[0002] With the development of science and technology, artificial intelligence has gradually become a decisive force pushing mankind into the intelligent age. Artificial intelligence studies how to let machines simulate the human thinking process and intelligent behavior, for example letting a computer automatically generate a descriptive text for a natural image and describe the content of the image in one sentence. In recent years, deep learning has made great breakthroughs in the fields of computer vision, natural language processing, and speech information processing, and has also received widespread...


Application Information

IPC(8): G06T9/00, G06N3/04, G06N3/08
CPC: G06T9/002, G06N3/08, G06N3/045
Inventors: 郑向涛, 卢孝强, 吴思远
Owner: XI'AN INST OF OPTICS & FINE MECHANICS - CHINESE ACAD OF SCI