An image description method based on deep learning

An image description technique based on deep learning. It addresses low sentence accuracy, simple structure, and slow model convergence, with the effects of reduced training time, high accuracy, and good spatial expression ability.

Active Publication Date: 2019-05-03

SHAANXI NORMAL UNIV

Cites: 7. Cited by: 5.

Problems solved by technology

[0004] The network models used in these methods suffer from slow model convergence, low accuracy of generated sentences, and simple structure.

Examples

Embodiment 1

[0031] The image data set used in this embodiment is the MSCOCO data set, which consists of images and manually annotated sentences corresponding to the images.

[0032] As shown in Figure 1, the image description method based on deep learning of the present embodiment consists of the following steps:

[0033] (1) Select 82,783 images and manually annotated sentences corresponding to the images from the MSCOCO dataset as the training set, and select 4,000 images as the test set;
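
Step (1) can be sketched as a disjoint split over image ids. This is a minimal illustration only: the source does not say how the images are selected, so a seeded random shuffle is assumed here, and `split_dataset` and the placeholder integer ids are hypothetical names, not part of the patent or of any MSCOCO tooling.

```python
import random


def split_dataset(image_ids, n_train, n_test, seed=0):
    """Return disjoint (train, test) lists drawn from image_ids.

    Assumption: selection is a seeded random shuffle; the source text
    only gives the two set sizes.
    """
    if n_train + n_test > len(image_ids):
        raise ValueError("not enough images for the requested split")
    rng = random.Random(seed)
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    return train, test


# Placeholder integer ids stand in for real MSCOCO image ids.
all_ids = list(range(82783 + 4000))
train_ids, test_ids = split_dataset(all_ids, n_train=82783, n_test=4000)
```

Slicing one shuffled list guarantees that no image appears in both sets, which is the property the later training and evaluation steps rely on.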

[0034] (2) Build an image description model

[0035] The image description model is composed of a spatial transformation network, a deformable convolutional residual network, and a bidirectional self-constrained threshold recurrent network. The spatial transformation network and the deformable convolutional residual network are used to extract the features of the image. The bidirectional self-constrained threshold recurrent network is used to construct the language model to generate sentences corresponding to im...
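
The paragraph above names a spatial transformation network as the first feature-extraction stage but does not detail it. As a rough illustration of the resampling such a network usually performs, the sketch below applies a 2x3 affine matrix to a small grayscale image with bilinear interpolation. It assumes the common formulation in which output coordinates are normalized to [-1, 1] with the corner pixels at the extremes. The function name `affine_sample` is hypothetical, and `theta` is supplied by hand here, whereas a real spatial transformer would predict it with a small localization network.

```python
import math


def affine_sample(image, theta):
    """Resample a 2D grayscale image (list of rows) through a 2x3 affine matrix.

    Each output pixel is mapped to normalized coordinates in [-1, 1],
    sent through theta to a source location, and read back with bilinear
    interpolation. Locations outside the image read as 0.0.
    """
    h, w = len(image), len(image[0])

    def pixel(r, c):
        return image[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    out = []
    for i in range(h):
        y_t = 2.0 * i / (h - 1) - 1.0 if h > 1 else 0.0
        row = []
        for j in range(w):
            x_t = 2.0 * j / (w - 1) - 1.0 if w > 1 else 0.0
            # Affine map from output (target) to input (source) coordinates.
            x_s = theta[0][0] * x_t + theta[0][1] * y_t + theta[0][2]
            y_s = theta[1][0] * x_t + theta[1][1] * y_t + theta[1][2]
            # Convert normalized source coordinates back to pixel indices.
            c = (x_s + 1.0) * (w - 1) / 2.0
            r = (y_s + 1.0) * (h - 1) / 2.0
            r0, c0 = math.floor(r), math.floor(c)
            dr, dc = r - r0, c - c0
            # Bilinear blend of the four neighbouring source pixels.
            value = (pixel(r0, c0) * (1 - dr) * (1 - dc)
                     + pixel(r0, c0 + 1) * (1 - dr) * dc
                     + pixel(r0 + 1, c0) * dr * (1 - dc)
                     + pixel(r0 + 1, c0 + 1) * dr * dc)
            row.append(value)
        out.append(row)
    return out


img = [[1.0, 2.0, 3.0],
       [4.0, 5.0, 6.0],
       [7.0, 8.0, 9.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
mirror_x = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
unchanged = affine_sample(img, identity)
flipped = affine_sample(img, mirror_x)
```

With the identity matrix the image passes through unchanged, and negating the first entry mirrors each row, which is a quick way to check that the coordinate conventions are consistent.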

Abstract

The invention discloses an image description method based on deep learning. The method comprises the steps of: dividing an image data set into a training set and a test set; constructing an image description model; training a residual network pre-training model on the ImageNet data set, and loading the pre-trained residual network parameters into the deformable convolutional residual network in the image description model; sending the images in the training set into a spatial transformation network, and sending the output of the spatial transformation network to the deformable convolutional residual network, which outputs the feature vectors of the images; generating a text sequence corresponding to the image; completing the construction of the language model to generate sentences corresponding to the image; training the image description model with the AdamW optimization algorithm; and outputting a description sentence corresponding to the image. The extracted image features have better spatial expression capability, the generated sentences are highly accurate with rich language structure, the model training time is short, and convergence is fast.
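
The abstract names AdamW as the training optimizer without giving details. The sketch below shows one AdamW update on a list of scalar parameters, following the standard formulation in which weight decay is applied directly to the parameters instead of being added to the gradient. The names `init_state` and `adamw_step` and all hyperparameter values are illustrative assumptions, not values taken from the patent.

```python
import math


def init_state(n):
    """Create zeroed first and second moment buffers for n parameters."""
    return {"t": 0, "m": [0.0] * n, "v": [0.0] * n}


def adamw_step(params, grads, state, lr=1e-3, betas=(0.9, 0.999),
               eps=1e-8, weight_decay=1e-2):
    """Apply one AdamW update and return the new parameter list.

    Hyperparameter defaults are assumptions chosen for illustration.
    """
    state["t"] += 1
    t = state["t"]
    b1, b2 = betas
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        # Exponential moving averages of the gradient and its square.
        state["m"][i] = b1 * state["m"][i] + (1 - b1) * g
        state["v"][i] = b2 * state["v"][i] + (1 - b2) * g * g
        # Bias correction for the zero-initialized averages.
        m_hat = state["m"][i] / (1 - b1 ** t)
        v_hat = state["v"][i] / (1 - b2 ** t)
        adam_update = m_hat / (math.sqrt(v_hat) + eps)
        # Decoupled weight decay: the decay term acts on p itself.
        new_params.append(p - lr * (adam_update + weight_decay * p))
    return new_params


# Toy usage: a few steps on f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x = [0.0]
opt_state = init_state(1)
for _ in range(200):
    x = adamw_step(x, [2.0 * (x[0] - 3.0)], opt_state, lr=0.1)
```

Because the decay term is separate from the adaptive gradient term, a parameter with zero gradient still shrinks by a factor of `1 - lr * weight_decay` per step, which is the behaviour that distinguishes AdamW from Adam with L2 regularization.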

Description

Technical field

[0001] The invention belongs to the field of artificial intelligence and deep learning, and in particular relates to an image description method based on deep learning.

Background technique

[0002] Image description is the automatic translation of an image by a machine into a sentence that humans can understand. It is a fundamental problem involving computer vision, natural language processing, and machine learning. The system needs not only to identify the objects in the image, but also to identify the attributes, positions, and relationships of those objects, and then to convert them into sentences with a certain grammatical structure through natural language processing. Image captioning has important applications in assisting people with visual impairments, early childhood education, and image retrieval.

[0003] The traditional image description methods are template-based and semantic-transfer-based, but the sentence structure genera...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06F16/53, G06K9/68, G06N3/04, G06N3/08

Inventor: 郭敏, 张洁庆, 彭亚丽, 肖冰, 裴炤

Owner: SHAANXI NORMAL UNIV
