An image description method based on a distributed word vector CNN-RNN network

An image description and word embedding technique, applied to biological neural network models, semantic analysis, instruments, and similar fields, that addresses problems such as difficult training, long training time, and insufficiently expressive explicit semantics.

Active Publication Date: 2021-09-24
GUILIN UNIV OF ELECTRONIC TECH

AI Technical Summary

Problems solved by technology

For example, the network proposed by Mao et al. has a parallel structure in which image embeddings and word embeddings are fused, so that the sentence is built through feature fusion; in another design, the initial hidden-layer states h_0 and c_0 of the LSTM unit are set first, and sentence prediction starts at t=1; the method proposed by You et al. feeds the image embedding directly as the input for the initial state of the LSTM unit; and Liu et al. propose a semantic specification layer to realize a structured training strategy for the two subnetworks. This strategy addresses the problems of difficult training, long training time, and noise interfering with the CNN during training. It also introduces the concept of explicit semantics, which makes the respective tasks of the two subnetworks clear within the network. However, explicit semantics based on a one-hot representation has obvious shortcomings.
The number of words involved in image description runs into the tens of thousands, yet the semantic space formed by a one-hot representation is very limited. Such a semantic space therefore ignores a large amount of semantics and cannot meet the needs of image description tasks.
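
The limitation described above can be illustrated with a small sketch. It is not taken from the patent: the three-word vocabulary and the two-dimensional dense vectors are hand-picked, hypothetical values, used only to show that one-hot vectors of distinct words are always orthogonal, while dense vectors can place related words close together.

```python
import math

def one_hot(index, size):
    """Return a one-hot vector of length `size` with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy vocabulary (an assumption for illustration only).
words = ["dog", "puppy", "car"]
one_hot_vectors = {w: one_hot(i, len(words)) for i, w in enumerate(words)}

# Hypothetical dense vectors chosen by hand, not produced by Word2vec.
dense_vectors = {
    "dog": [0.9, 0.1],
    "puppy": [0.8, 0.2],
    "car": [0.1, 0.9],
}

# Distinct one-hot vectors share no non-zero component, so similarity is zero.
one_hot_sim = cosine(one_hot_vectors["dog"], one_hot_vectors["puppy"])

# Dense vectors can express that "dog" is closer to "puppy" than to "car".
dense_sim_related = cosine(dense_vectors["dog"], dense_vectors["puppy"])
dense_sim_unrelated = cosine(dense_vectors["dog"], dense_vectors["car"])
```

Note also that the one-hot vector length grows with the vocabulary size, whereas the dense dimension is fixed independently of how many words are embedded.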



Examples

Embodiment

[0032] Referring to Figure 1, an image description method based on a distributed word vector CNN-RNN network comprises the following steps:

[0033] 1) Generation of distributed representation word vectors: using the distributed representation word vector generation tool Word2vec, generate, for the words (w_1, w_2, w_3, ...) contained in the natural-sentence labels I_seq-label of the training set images, the distributed representation word vectors (p_1, p_2, p_3, ...); the words w together with their corresponding distributed word vectors p are called the vocabulary;

[0034] 2) Generation of distributed representation labels: referring to Figures 2 and 3, convert the natural-sentence labels of all training set images, that is, the natural-sentence label I_seq-label of each image I, into distributed word vectors one word at a time using the vocabulary from step 1), and arrange them into a distributed representation label matrix Here...
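
Steps 1) and 2) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the pseudo-random vectors stand in for vectors that a trained Word2vec model would supply, and the names `build_vocabulary` and `caption_to_label_matrix`, the whitespace tokenization, the dimension of 4, and the one-row-per-word matrix layout are all hypothetical choices.

```python
import random

def build_vocabulary(captions, dim=4, seed=0):
    """Map every distinct word in `captions` to a dense vector.

    Assumption: the vectors are pseudo-random placeholders. In the method
    described above, Word2vec would provide learned vectors instead.
    """
    rng = random.Random(seed)
    vocabulary = {}
    for caption in captions:
        for word in caption.lower().split():
            if word not in vocabulary:
                vocabulary[word] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    return vocabulary

def caption_to_label_matrix(caption, vocabulary):
    """Replace each word of a caption by its vector, one row per word.

    Assumption: rows follow word order; the text does not specify the layout.
    """
    return [list(vocabulary[word]) for word in caption.lower().split()]

# Hypothetical training captions standing in for the labels I_seq-label.
captions = ["a dog runs on the grass", "a dog catches a ball"]

vocabulary = build_vocabulary(captions)
label_matrices = [caption_to_label_matrix(c, vocabulary) for c in captions]
```

Each entry of `label_matrices` has as many rows as the caption has words, and a repeated word such as "a" maps to the same row vector wherever it occurs.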

Abstract

The invention discloses an image description method based on a distributed word vector CNN-RNN network, comprising the following steps: 1) generation of distributed representation word vectors; 2) generation of distributed representation labels; 3) distributed representation semantic labels; 4) network design; 5) generation of descriptive sentences for images. Introduced into the original CNN-RNN network model, the method lets the model generate more accurate results and lets the CNN subnetwork provide richer semantic content to the RNN subnetwork, while the overall CNN-RNN network model retains the advantages of its structured design. The low-dimensional, dense distributed representation can readily embed a large number of words to form a complete semantic space, so visual content is mapped to the semantic space more effectively. The supervisory signal designed from the distributed representation word vectors summarizes visual content more accurately and makes fuller use of the vector space to supervise the direction of CNN optimization.

Description

Technical field

[0001] The invention relates to the technical field of intelligent image processing, in particular to an image description method based on a distributed word vector CNN-RNN network.

Background art

[0002] In the field of computer vision, breakthroughs continue to be made in basic vision tasks such as image classification, object detection, and semantic segmentation. Interest is gradually turning to image description, a more complex and advanced visual task. The specific task of image description is to generate sentences that describe the semantic information in an image. It is therefore necessary not only to identify and understand the relevant content in the image (including actions), but also to describe it in natural language. In practical applications such as assistive systems for the blind, image retrieval, and intelligent interactive systems, the ability to use images to generate corresponding natural language descriptions is crucia...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/33, G06F40/30, G06F40/44, G06K9/62, G06N3/04
CPC: G06F16/3344, G06F40/44, G06F40/30, G06N3/045, G06F18/214
Inventor: 莫建文, 王少晖, 欧阳宁, 林乐平, 袁华, 首照宇, 张彤, 陈利霞, 肖海林
Owner: GUILIN UNIV OF ELECTRONIC TECH