
Image subtitle generation method based on multi-attention generative adversarial network

An attention and neural-network technology, applied to biological neural network models, image communication, and neural learning methods, addressing problems such as the failure to capture global information.

Pending Publication Date: 2019-08-16
CHINA UNIV OF PETROLEUM (EAST CHINA)

AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to solve the problem that, in image caption generation methods based on generative adversarial networks, the extracted features contain only local information and fail to capture global information.




Embodiment Construction

[0080] The accompanying drawings are for illustrative purposes only and should not be construed as limiting the patent.

[0081] The present invention will be further elaborated below in conjunction with the accompanying drawings and embodiments.

[0082] Figure 1 is a schematic diagram of the multi-attention generative adversarial network architecture. As shown in Figure 1, the network comprises two multi-attention generators (the XE-Generator and the RL-Generator) and one multi-attention discriminator. The cross-entropy generator (XE-Generator) and the reinforcement-learning generator (RL-Generator) share the same multi-attention generator structure but use different training strategies; both strategies train the proposed multi-attention generator structure.
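The roles described above can be sketched schematically. This is a minimal illustration, not the patent's implementation: the class names, the stand-in generation and scoring logic, and all parameters are hypothetical; it only shows that two generators share one structure while a discriminator scores their output.

```python
import random

class MultiAttentionGenerator:
    """Shared generator structure; only the training strategy differs per instance."""
    def __init__(self, vocab_size, strategy):
        assert strategy in ("xe", "rl")  # cross-entropy vs. reinforcement learning
        self.vocab_size = vocab_size
        self.strategy = strategy

    def generate(self, image_features, max_len=5, seed=0):
        # Stand-in for attention-guided decoding: emit a sequence of token ids.
        rng = random.Random(seed)
        return [rng.randrange(self.vocab_size) for _ in range(max_len)]

class MultiAttentionDiscriminator:
    """Scores how likely a caption is human-written (near 1) vs. machine-generated."""
    def score(self, image_features, caption):
        # Placeholder score in (0, 1]; a real model would attend over
        # image features and caption words.
        return 1.0 / (1.0 + len(set(caption)))

xe_gen = MultiAttentionGenerator(vocab_size=100, strategy="xe")
rl_gen = MultiAttentionGenerator(vocab_size=100, strategy="rl")
disc = MultiAttentionDiscriminator()

caption = xe_gen.generate(image_features=None)
print(len(caption), 0.0 < disc.score(None, caption) <= 1.0)
```

In an adversarial setup of this kind, the generators try to produce captions the discriminator scores as human-written, while the discriminator is trained to tell the two apart.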

[0083] Figure 2 is a schematic diagram of the multi-attention mechanism network structure. As shown in Figure 2, the top of the figure ...
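One plausible reading of a "multi-attention" mechanism over local and global information can be sketched as follows. This is a hedged illustration assuming scaled dot-product attention, with the global feature stacked alongside the per-region features so the attention weights can trade off local and global cues; the patent's actual mechanism may differ.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_attention(local_feats, global_feat, query):
    """Attend jointly over per-region (local) features and one global feature.

    local_feats: (R, d) region features; global_feat: (d,); query: (d,).
    Returns a (d,) context vector and the (R+1,) attention weights.
    """
    d = query.shape[0]
    # Treat the global feature as an extra "region" so attention can weigh it.
    feats = np.vstack([local_feats, global_feat])   # (R+1, d)
    scores = feats @ query / np.sqrt(d)             # scaled dot-product scores
    weights = softmax(scores)                       # non-negative, sum to 1
    return weights @ feats, weights

rng = np.random.default_rng(0)
context, weights = multi_attention(rng.normal(size=(6, 8)),
                                   rng.normal(size=8),
                                   rng.normal(size=8))
print(context.shape, round(weights.sum(), 6))
```

The last weight is the share of attention placed on the global feature, so the resulting context vector mixes local region detail with image-level information.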


Abstract

The invention discloses an image caption generation method based on a multi-attention generative adversarial network, belonging to the technical field of image caption generation. It solves the problem that features extracted by existing GAN-based image caption generation methods contain only local information and fail to capture global information. A multi-attention mechanism based on local and global information is proposed for the first time for image caption generation, and on this basis a multi-attention generative adversarial image caption generation network is proposed, comprising a multi-attention generator and a multi-attention discriminator. The multi-attention generator is used to generate more accurate sentences, and the multi-attention discriminator is used to judge whether a generated sentence is a human description or machine-generated. The proposed framework is validated through extensive experiments on the MSCOCO benchmark dataset and obtains a very competitive result in the evaluation of the MSCOCO caption challenge evaluation server.

Description

technical field

[0001] The invention relates to the technical fields of computer vision and natural language processing, and in particular to an image caption generation method based on a multi-attention generative adversarial network.

Background technique

[0002] The goal of image captioning technology is to generate human-friendly description sentences for a given image. Image caption generation has attracted intense research interest in academia and is widely applied in fields such as video retrieval and early-childhood education. Unlike other computer vision tasks (image classification, object detection, etc.), training an effective image captioning model is more challenging because it requires a comprehensive understanding of the basic entities in an image and the relationships between them. The traditional image caption generation model uses an encoder-decoder framework as its core, in which a convolutional-neural-network-based encoder encodes pixel-level information int...
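The encoder-decoder loop described above can be sketched in miniature. This is a toy scaffold, not a real model: `encode` and `step` are hypothetical placeholder functions standing in for a CNN encoder and one decoder step, and the toy five-token "vocabulary" is invented for illustration; only the control flow (encode once, then decode token by token) reflects the framework described.

```python
def encode(image):
    # Stand-in for a CNN encoder: map raw pixels to a feature vector.
    return [float(p) / 255.0 for p in image]

def step(state, token, features):
    # Stand-in for one decoder step: return (next_state, next_token).
    nxt = (token + int(sum(features))) % 5  # toy "vocabulary" of 5 tokens
    return state + [nxt], nxt

def caption(image, max_len=4, start_token=0):
    features = encode(image)          # encode the image once
    state, token, out = [], start_token, []
    for _ in range(max_len):          # then decode one token at a time
        state, token = step(state, token, features)
        out.append(token)
    return out

print(caption([255, 128, 0]))
```

A real decoder would condition each step on learned weights and stop at an end-of-sentence token; the point here is only the encode-once, decode-stepwise structure.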


Application Information

IPC(8): G06N3/04; G06N3/08; H04N5/278; H04N21/488; H04N21/81
CPC: G06N3/049; G06N3/08; H04N5/278; H04N21/4884; H04N21/8133; G06N3/045
Inventor: 曹海文, 魏燚伟, 吴春雷, 王雷全, 邵明文
Owner CHINA UNIV OF PETROLEUM (EAST CHINA)