
An image description method based on deep learning

An image description technology based on deep learning, applied in the field of image description, which solves the problems of low sentence accuracy, simple sentence structure, and slow model convergence, with the effects of reduced training time, high accuracy, and good spatial expression ability.

Active Publication Date: 2019-05-03
SHAANXI NORMAL UNIV

AI Technical Summary

Problems solved by technology

[0004] The network models used in these methods suffer from problems such as slow convergence, low accuracy of the generated sentences, and simple sentence structure.

Examples


Embodiment 1

[0031] The image data set used in this embodiment is MSCOCO, which consists of images and manually annotated sentences corresponding to each image.

[0032] As shown in Figure 1, the image description method based on deep learning of this embodiment consists of the following steps:

[0033] (1) Select 82,783 images and manually annotated sentences corresponding to the images from the MSCOCO dataset as the training set, and select 4,000 images as the test set;
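The patent gives no implementation for this split; as a rough illustration, a minimal Python sketch under the usual MSCOCO 2014 annotation layout (the file names and helper names here are assumptions, not taken from the patent) could look like:

    import json
    import random

    # Minimal sketch of step (1). The annotation file names below follow the
    # standard MSCOCO 2014 release and are assumptions, not taken from the patent.
    def load_captions(path):
        """Return a list of (image file name, list of human-annotated captions)."""
        with open(path, "r", encoding="utf-8") as f:
            coco = json.load(f)
        captions = {}
        for ann in coco["annotations"]:
            captions.setdefault(ann["image_id"], []).append(ann["caption"])
        return [(img["file_name"], captions.get(img["id"], [])) for img in coco["images"]]

    # All 82,783 train2014 images and their annotated sentences form the training set.
    train_set = load_captions("annotations/captions_train2014.json")

    # 4,000 images sampled from val2014 serve as the test set.
    val_images = load_captions("annotations/captions_val2014.json")
    random.seed(0)
    test_set = random.sample(val_images, 4000)

Sampling the 4,000 test images from val2014 is one reasonable reading of step (1); the patent does not specify which portion of MSCOCO the test images are drawn from.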

[0034] (2) Build an image description model

[0035] The image description model is composed of a spatial transformation network, a deformable convolutional residual network, and a bidirectional self-constrained gated recurrent network. The spatial transformation network and the deformable convolutional residual network are used to extract the features of the image, and the bidirectional self-constrained gated recurrent network is used to construct a language model that generates the sentences corresponding to the images.
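The patent describes these three components only at this architectural level. A self-contained PyTorch sketch of how they might be wired together is given below; the layer sizes, the use of torchvision's DeformConv2d, and the plain bidirectional GRU (the patent's "self-constrained" gating is not reproduced) are all assumptions made for illustration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torchvision.ops import DeformConv2d

    class SpatialTransformer(nn.Module):
        """Learns an affine warp of the input image before feature extraction."""
        def __init__(self):
            super().__init__()
            self.loc = nn.Sequential(
                nn.Conv2d(3, 8, 7), nn.MaxPool2d(2), nn.ReLU(),
                nn.Conv2d(8, 10, 5), nn.MaxPool2d(2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(4), nn.Flatten(),
                nn.Linear(10 * 4 * 4, 32), nn.ReLU(), nn.Linear(32, 6),
            )
            # Start from the identity transform.
            self.loc[-1].weight.data.zero_()
            self.loc[-1].bias.data.copy_(torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

        def forward(self, x):
            theta = self.loc(x).view(-1, 2, 3)
            grid = F.affine_grid(theta, x.size(), align_corners=False)
            return F.grid_sample(x, grid, align_corners=False)

    class DeformableResBlock(nn.Module):
        """Residual block whose 3x3 convolution samples at learned offsets."""
        def __init__(self, channels):
            super().__init__()
            self.offset = nn.Conv2d(channels, 2 * 3 * 3, 3, padding=1)
            self.conv = DeformConv2d(channels, channels, 3, padding=1)
            self.bn = nn.BatchNorm2d(channels)

        def forward(self, x):
            return F.relu(x + self.bn(self.conv(x, self.offset(x))))

    class BiGRULanguageModel(nn.Module):
        """Bidirectional GRU decoder mapping image features and word ids to word scores."""
        def __init__(self, vocab_size, feat_dim, embed_dim=256, hidden_dim=512):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.init_h = nn.Linear(feat_dim, hidden_dim)
            self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
            self.out = nn.Linear(2 * hidden_dim, vocab_size)

        def forward(self, feats, captions):
            h0 = torch.tanh(self.init_h(feats)).unsqueeze(0).repeat(2, 1, 1)
            hidden, _ = self.gru(self.embed(captions), h0)
            return self.out(hidden)

    class ImageCaptioner(nn.Module):
        """Overall sketch: spatial transformer -> deformable residual CNN -> BiGRU."""
        def __init__(self, vocab_size, channels=64):
            super().__init__()
            self.stn = SpatialTransformer()
            self.stem = nn.Sequential(
                nn.Conv2d(3, channels, 7, stride=2, padding=3),
                nn.BatchNorm2d(channels), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.blocks = nn.Sequential(*[DeformableResBlock(channels) for _ in range(4)])
            self.decoder = BiGRULanguageModel(vocab_size, feat_dim=channels)

        def forward(self, images, captions):
            x = self.blocks(self.stem(self.stn(images)))
            feats = F.adaptive_avg_pool2d(x, 1).flatten(1)   # image feature vector
            return self.decoder(feats, captions)             # (N, T, vocab) word scores

In this sketch the decoder sees the whole caption at once during training; the patent's bidirectional self-constrained gated recurrent network presumably combines its forward and backward passes in its own way, which is not reproduced here.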



Abstract

The invention discloses an image description method based on deep learning. The method comprises the steps of dividing an image data set into a training set and a test set; constructing an image description model; training a residual network pre-training model on the ImageNet data set and loading its parameters into the deformable convolutional residual network of the image description model; sending the images in the training set into a spatial transformation network, sending the output of the spatial transformation network to the deformable convolutional residual network, and outputting the feature vectors of the images from the deformable convolutional residual network; generating the text sequence corresponding to each image; completing the construction of the language model to generate the sentences corresponding to the images; training the image description model with the AdamW optimization algorithm; and outputting the description sentence corresponding to an image. The extracted image features have good spatial expression capability, the generated sentences are accurate and rich in language structure, the model training time is short, and the convergence speed is high.
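As a rough, hedged illustration of the training procedure summarized in this abstract (the ImageCaptioner class is the hypothetical sketch given under Embodiment 1, the stand-in data loader only produces random tensors, and all hyper-parameters are illustrative rather than taken from the patent):

    import torch
    import torch.nn as nn
    from torchvision.models import resnet50

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = ImageCaptioner(vocab_size=10000).to(device)   # hypothetical class from the sketch above

    # Load ImageNet-pretrained residual-network weights into any layers of the
    # deformable convolutional backbone whose names and shapes match. In the patent
    # the backbone mirrors a standard residual network; in the toy sketch above few
    # keys match, so this only illustrates the mechanism.
    pretrained = resnet50(weights="IMAGENET1K_V1").state_dict()
    own = model.state_dict()
    own.update({k: v for k, v in pretrained.items() if k in own and v.shape == own[k].shape})
    model.load_state_dict(own)

    criterion = nn.CrossEntropyLoss(ignore_index=0)   # assumes 0 is the padding token id
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-2)

    # Stand-in data loader (replace with a real MSCOCO loader); random tensors only.
    train_loader = [(torch.randn(2, 3, 224, 224), torch.randint(1, 10000, (2, 12)))
                    for _ in range(4)]

    for epoch in range(30):
        for images, captions in train_loader:          # captions: (N, T) token ids
            images, captions = images.to(device), captions.to(device)
            scores = model(images, captions[:, :-1])   # predict the next word at every step
            loss = criterion(scores.reshape(-1, scores.size(-1)),
                             captions[:, 1:].reshape(-1))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

AdamW's decoupled weight decay is the only optimizer detail the abstract commits to; the learning rate, batch composition, and epoch count above are placeholders.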

Description

Technical field

[0001] The invention belongs to the field of artificial intelligence and deep learning, and in particular relates to an image description method based on deep learning.

Background technique

[0002] Image description is the automatic translation of an image by a machine into a sentence that humans can understand. It is a fundamental problem involving computer vision, natural language processing, and machine learning. The system not only needs to identify the objects in the image, but also their attributes, positions, and mutual relationships, and then convert them into sentences with a certain grammatical structure through natural language processing. Image captioning is of great significance for assisting people with visual impairments, for early childhood education, and for image retrieval.

[0003] Traditional image description methods are based on templates and on semantic transfer, but the sentence structure genera...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F16/53, G06K9/68, G06N3/04, G06N3/08
Inventor: 郭敏, 张洁庆, 彭亚丽, 肖冰, 裴炤
Owner: SHAANXI NORMAL UNIV