Image description adversarial generation method based on reinforcement learning

A technology for image description and reinforcement learning, applied in neural learning methods, biological neural network models, instruments, etc. It addresses the insufficient uniqueness of generated image descriptions, and achieves the effects of improving uniqueness, ensuring registration fidelity, and increasing diversity.

Pending Publication Date: 2022-02-08

ZHEJIANG LAB +1

Problems solved by technology

Advances in image retrieval have greatly helped research on the uniqueness of image descriptions, but the uniqueness of the descriptions produced by image description generation still needs to be improved.

Embodiment Construction

[0046] Specific embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings. It should be understood that the specific embodiments described here are only used to illustrate and explain the present invention, and are not intended to limit the present invention.

[0047] The present invention first adopts the image retrieval method VSE++, which uses hard negative samples to improve the joint visual-semantic embedding. It is trained on the MSCOCO and Flickr30K data sets, maps images and text descriptions into the same space, and uses a triplet loss to obtain a trained model of the common space of similar images and text descriptions. The invention then relies on a generative adversarial network (GAN) to generate a unique image description: specifically, a generative network extracts features from the image data and generates a description of the input image, and a discriminative network and a discriminative loss are used to distinguish this description from other d...
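The triplet loss with hard negatives mentioned in [0047] can be illustrated with a small sketch. It follows the max-of-hinges form usually associated with VSE++, where each image and each caption is penalised only by its hardest non-matching partner in the batch. The function names, the margin of 0.2, and the toy embeddings are assumptions made for illustration; the patent text shown here does not give these details.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def hard_negative_triplet_loss(img_emb, txt_emb, margin=0.2):
    """Max-of-hinges triplet loss over a batch of paired embeddings.

    img_emb[i] and txt_emb[i] are assumed to be a matching image/caption pair.
    The margin value is an illustrative assumption.
    """
    n = len(img_emb)
    # sim[i][j] = similarity between image i and caption j
    sim = [[cosine(img_emb[i], txt_emb[j]) for j in range(n)] for i in range(n)]
    neg_inf = float("-inf")
    loss = 0.0
    for i in range(n):
        pos = sim[i][i]
        # Hardest non-matching caption for image i.
        hard_txt = max((sim[i][j] for j in range(n) if j != i), default=neg_inf)
        # Hardest non-matching image for caption i.
        hard_img = max((sim[j][i] for j in range(n) if j != i), default=neg_inf)
        loss += max(0.0, margin + hard_txt - pos)
        loss += max(0.0, margin + hard_img - pos)
    return loss

# Toy batch: three images whose captions embed to exactly the same vectors.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned_loss = hard_negative_triplet_loss(identity, identity)

# Same images, but every caption is paired with the wrong image.
shifted = identity[1:] + identity[:1]
misaligned_loss = hard_negative_triplet_loss(identity, shifted)
```

When every pair is well separated from its negatives the loss is zero, and it grows once a non-matching caption or image scores higher than the true pair minus the margin.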

Abstract

The invention discloses an image description adversarial generation method based on reinforcement learning. The method comprises the following steps: S1, retrieving similar images and their text descriptions according to an image to be described; S2, constructing an attention-based image description generation network, which introduces an attention mechanism and a long short-term memory (LSTM) network when generating text from the image to be described, and obtains the generated text description by combining the LSTM output with the extracted image features through the attention mechanism and a loss calculation; S3, constructing a discrimination network for image-description similarity pairing; S4, performing pairing discrimination with the discrimination network on the image to be described, its annotated paired text description, the text description generated by the generative network, and the text descriptions of the similar images, and jointly and iteratively optimizing the generative network and the discrimination network according to the reward value output by the discrimination network; and S5, inputting an image to be described into the trained generative network to generate its text description.
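Step S4 optimizes the generative network from a reward produced by the discrimination network. The abstract does not state the update rule, so the following is only a minimal sketch that assumes a REINFORCE-style policy gradient on a single word choice. The function names, the baseline, the learning rate, and the reward values are all hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_step(logits, sampled_token, reward, baseline=0.0, lr=0.1):
    """One gradient-ascent step on the logits for a single sampled word.

    Assumption: `reward` is a scalar score from the discrimination network
    for the generated description; the baseline reduces variance.
    """
    probs = softmax(logits)
    advantage = reward - baseline
    # d log p(sampled) / d logit_k = 1[k == sampled] - p_k
    return [
        z + lr * advantage * ((1.0 if k == sampled_token else 0.0) - p)
        for k, (z, p) in enumerate(zip(logits, probs))
    ]

# Toy vocabulary of three words with a uniform starting policy.
start = [0.0, 0.0, 0.0]
before = softmax(start)[1]

# Word 1 was sampled and the (hypothetical) discriminator reward beat the baseline.
rewarded = reinforce_step(start, sampled_token=1, reward=0.9, baseline=0.5)
after_good = softmax(rewarded)[1]

# Same word, but the reward fell below the baseline.
penalised = reinforce_step(start, sampled_token=1, reward=0.1, baseline=0.5)
after_bad = softmax(penalised)[1]
```

A reward above the baseline raises the probability of the sampled word and a reward below it lowers that probability, which is how a discriminator score can steer a generator whose discrete word sampling is not differentiable.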

Description

Technical field

[0001] The present invention relates to the intersection of computer vision and natural language processing (NLP), and in particular to an image description adversarial generation method based on reinforcement learning.

Background technique

[0002] Image captioning is an emerging research task. Over the past two decades, the fields of natural language processing (NLP) and computer vision (CV) have made tremendous progress in analyzing and generating text and in understanding images. Although both fields share a set of approaches based on machine learning and artificial intelligence, they were studied separately in the past and had little interaction in the scientific community. In recent years, however, with progress in artificial intelligence and the development of deep learning models, scholars have become more and more interested in the problem of combining language and visual information. At the same time, there is a large amount of data combining...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06V10/74, G06V10/774, G06V10/82, G06K9/62, G06N3/04, G06N3/08

CPC: G06N3/049, G06N3/08, G06N3/045, G06F18/22, G06F18/214

Inventor: 王蕊, 吕飞霄, 李太豪, 裴冠雄

Owner: ZHEJIANG LAB
