
Diversified image description statement generation technology based on deep learning

A deep-learning-based image description generation technology, applicable to neural learning methods and to still-image data retrieval and indexing. It addresses the problems that traditional methods ignore image details and generate single, uniform sentences, and achieves good readability while improving the utilization rate of network parameters.

Pending Publication Date: 2021-10-22
BEIHANG UNIV
Cites: 0 · Cited by: 1

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a deep-learning-based technology for generating diversified image description sentences, which improves on the traditional technology's problems of single, generic sentences that ignore image details, and is suitable for generating diverse description sentences from an image.



Examples


Embodiment 1

[0059] Figure 1 is the overall flowchart of image description sentence generation. The specific steps are as follows:

[0060]
1) Obtain a real-world image file.
2) Convert the image file into matrix form. Each matrix element represents the content of the corresponding position in the picture; the number of matrices, and the relationship between values at the same position across different matrices, depend on the color type of the picture (for example, an RGB image yields three matrices, one per color channel).
3) To speed up the convergence of the image description generation model, map the matrixed image data into the range [0, 1] and standardize it.
4) Input the normalized image matrix into a deep convolutional neural network.
5) Obtain the high-dimensional semantic features of the image through the network's multi-level feature extraction.
6) Input the high-dimensional semantic features of the image into the encoder...
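Steps 1)–3) above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes an 8-bit RGB input, and `normalize_image` is a hypothetical helper name chosen for this example.

```python
import numpy as np

def normalize_image(image: np.ndarray) -> np.ndarray:
    """Map 8-bit pixel values into [0, 1] to speed up model convergence.

    `image` is an H x W x C matrix; for an RGB picture C == 3, i.e. one
    matrix per color channel, as described in step 2).
    """
    return image.astype(np.float32) / 255.0

# A stand-in for a real image file: a random 224 x 224 RGB matrix.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)

normalized = normalize_image(image)  # all values now lie in [0, 1]
```

The normalized matrix would then be fed to the deep convolutional neural network of step 4) for multi-level feature extraction.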



Abstract

The invention discloses a deep-learning-based technology for generating diversified image description sentences, and belongs to the technical field of image description sentence generation. It addresses the problems of traditional image description generation technology, namely that the generated sentences are uniform and generic and that image details are ignored, and is suitable for generating diverse description sentences from an image.

Description

Technical field:
[0001] The invention relates to a deep-learning-based technology for generating diversified image description sentences, and belongs to the technical field of image description sentence generation.

Background technique:
[0002] At present, deep-learning-based image description generation is generally realized by building an "encoder-decoder" model. The "encoder" converts the digital matrix of the image into a high-dimensional feature code rich in semantic information, and is implemented by a residual model based on a convolutional neural network. The "decoder" decodes these high-dimensional features, feeding the semantic information into a text generation model to obtain the description sentence. The decoder generally uses one of two structures for text generation: one based on a long short-term memory (LSTM) network, and one based on the self-attention mechanism of the Transformer. The ...
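The self-attention mechanism mentioned in the background can be illustrated with the scaled dot-product attention computation at the core of a Transformer-style decoder. This is a generic numpy sketch of the standard formula softmax(QKᵀ/√d_k)·V, not the patent's specific model; the toy shapes (4 word positions attending over 6 image regions) are assumptions for illustration.

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    Queries come from the text side; keys/values can come from the
    encoder's high-dimensional image features.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (n_q, n_k) similarity matrix
    weights = softmax(scores, axis=-1)  # each query's weights sum to 1
    return weights @ V, weights

# Toy example: 4 word positions attend over 6 image regions, d_k = 8.
rng = np.random.default_rng(1)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(6, 8))
V = rng.normal(size=(6, 8))

out, weights = scaled_dot_product_attention(Q, K, V)
```

Each output row is a weighted mixture of the value vectors, which is what lets the decoder focus on different image regions when emitting each word.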

Claims


Application Information

IPC(8): G06F16/51; G06N3/04; G06N3/08
CPC: G06F16/51; G06N3/08; G06N3/045; Y02T10/40
Inventor: 任磊, 孟子豪, 王涛
Owner: BEIHANG UNIV