Image subtitle generation method based on multi-attention generative adversarial network

A multi-attention network technology, applied in biological neural network models, image communication, neural learning methods, etc., that addresses problems such as the failure to capture global information
CN110135567A · Pending · Publication Date: 2019-08-16 · CHINA UNIV OF PETROLEUM (EAST CHINA)

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
CHINA UNIV OF PETROLEUM (EAST CHINA)
Publication Date
2019-08-16


Abstract

The invention discloses an image caption generation method based on a multi-attention generative adversarial network. It belongs to the technical field of image caption generation and addresses the problem that the features extracted by existing caption generation methods based on generative adversarial networks contain only local information and fail to capture global information. A multi-attention mechanism based on both local and global information is proposed for the first time for image caption generation. On this basis, a multi-attention generative adversarial image caption generation network is proposed, comprising a multi-attention generator and a multi-attention discriminator. The multi-attention generator is used to generate more accurate sentences, and the multi-attention discriminator is used to judge whether a given sentence was written by a human or generated by a machine. The proposed framework is verified through extensive experiments on the MSCOCO benchmark dataset, and highly competitive results are obtained on the MSCOCO captioning challenge evaluation server.
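The idea of attending to local and global information separately can be illustrated with a minimal sketch. This is not the patented network: the dot-product scoring, the concatenation used to fuse the two contexts, and all names (`attend`, `multi_attention_context`, `local_feats`, `global_feats`) are assumptions chosen for illustration, and the toy vectors are made up.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, features):
    """Soft attention: weight each feature vector by softmax(query . feature)."""
    weights = softmax([dot(query, f) for f in features])
    dim = len(features[0])
    context = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return context, weights


def multi_attention_context(query, local_feats, global_feats):
    """Attend to local and global features separately, then fuse them.

    Assumption: fusion is plain concatenation; the source does not say
    how the two kinds of information are combined.
    """
    local_ctx, local_w = attend(query, local_feats)
    global_ctx, global_w = attend(query, global_feats)
    return local_ctx + global_ctx, local_w, global_w


# Toy example: three local region features and one global image feature.
query = [1.0, 0.0]  # hypothetical decoder state
local_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
global_feats = [[0.2, 0.8]]
context, local_w, global_w = multi_attention_context(query, local_feats, global_feats)
```

In this sketch the fused `context` would feed the next step of a caption decoder; a discriminator could apply the same kind of attention when scoring a sentence against the image, but that part is left out here.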

Description

Technical field

[0001] The invention relates to the technical fields of computer vision and natural language processing, and in particular to an image caption generation method based on a multi-attention generative adversarial network.

Background technique

[0002] The goal of image captioning technology is to generate human-friendly description sentences for a given image. Image caption generation has attracted intense research interest in academia and is widely used in fields such as video retrieval and early childhood education. Unlike other computer vision tasks (image classification, object detection, etc.), training an effective image captioning model is more challenging because it requires a comprehensive understanding of the basic entities in an image and the relationships among them. The traditional image caption generation model is built around an encoder-decoder framework, which uses a convolutional neural network-based encoder to encode pixel-level information int...

Claims
