
Method of text-to-image

A text-to-image technology, applied in the field of deep learning, that enhances training stability, speeds up convergence, and reduces the difference between the text and image modal distributions

Active Publication Date: 2020-03-06
SHENZHEN GRADUATE SCHOOL TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

[0005] In order to solve the text-to-image problem in the prior art, the present invention provides a text-to-image method.



Examples


Embodiment 1

[0032] In deep learning, the autoencoder is a widely used method for implementing generative models and extracting features from data. Many improvements and variants of the autoencoder exist, such as the denoising autoencoder and the sparse autoencoder. Among them, the most widely used is the variational autoencoder, which is based on variational inference.

[0033] The autoencoder consists of two deep neural networks: an encoder, which compresses high-dimensional input samples into low-dimensional data features, and a decoder, which restores the low-dimensional features back into high-dimensional samples. To achieve this, the autoencoder is trained to minimize the reconstruction loss between its input and output; the L2 loss or the binary cross-entropy loss is generally used as the reconstruction loss.
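The encoder/decoder structure and the L2 reconstruction objective described above can be sketched minimally as follows. This is an illustration only, not the patent's networks: a real autoencoder uses deep nonlinear networks, while here the encoder and decoder are single linear maps and all dimensions and weights are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for illustration only.
D_IN, D_LATENT = 8, 2

# Linear stand-ins for the encoder and decoder networks.
W_enc = rng.normal(scale=0.1, size=(D_LATENT, D_IN))
W_dec = rng.normal(scale=0.1, size=(D_IN, D_LATENT))

def encode(x):
    """Compress a high-dimensional sample into a low-dimensional feature."""
    return W_enc @ x

def decode(z):
    """Restore a low-dimensional feature back to the sample space."""
    return W_dec @ z

def l2_reconstruction_loss(x):
    """Training objective: squared error between input and reconstruction."""
    x_hat = decode(encode(x))
    return float(np.sum((x - x_hat) ** 2))

x = rng.normal(size=D_IN)
loss = l2_reconstruction_loss(x)
print(loss >= 0.0)  # the reconstruction loss is non-negative
```

Training would adjust `W_enc` and `W_dec` (by gradient descent in practice) to drive this loss down, which is exactly the "minimize the reconstruction loss between the input and the output" goal stated above.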

[0034] The variational autoencoder further constrains the dis...



Abstract

The invention provides a method of text-to-image. The method comprises the following steps: S1, training an adversarial visual semantic embedding model comprising an image encoder network, an image decoder network, a generator network, and a discriminator network paired with the generator network; S2, inputting a text into the generator network, which outputs a text feature embedding; and S3, inputting the text feature embedding into the decoder network, which outputs an image conforming to the semantic description of the text. Adversarial training is used to strengthen the visual feature embedding of the existing text data, so that the difference between the distribution of the text modal data and the distribution of the image modal data in the semantic space is reduced.
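The inference steps S2 and S3 above can be sketched with stand-in networks. Everything here is an assumption for illustration: the bag-of-words text encoding, the linear maps standing in for the trained generator and decoder from S1, and all sizes are hypothetical, not the patent's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the patent does not specify architectures.
VOCAB, D_EMBED, IMG_PIXELS = 100, 16, 64

# Linear stand-ins for the trained generator and image decoder (trained in S1).
G = rng.normal(scale=0.1, size=(D_EMBED, VOCAB))       # generator: text -> embedding
Dec = rng.normal(scale=0.1, size=(IMG_PIXELS, D_EMBED))  # decoder: embedding -> image

def text_to_bow(token_ids):
    """Toy text representation: bag-of-words count vector (an assumption)."""
    v = np.zeros(VOCAB)
    for t in token_ids:
        v[t] += 1.0
    return v

def generate_image(token_ids):
    """S2: the generator maps the text to a feature embedding;
    S3: the decoder maps that embedding to an image."""
    text_embedding = G @ text_to_bow(token_ids)  # S2
    image = Dec @ text_embedding                 # S3
    return image

img = generate_image([3, 14, 15])
print(img.shape)  # (64,)
```

The point of the sketch is the data flow: text never goes directly to the image decoder; it passes through the generator's semantic embedding, which adversarial training (against the discriminator, S1) has pushed toward the distribution of image embeddings.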

Description

technical field [0001] The invention relates to the technical field of deep learning, and in particular to a text-to-image method. Background technique [0002] Text-to-image generation has been a hot research topic in computer vision in recent years. Among existing methods, deep generative models based on Generative Adversarial Networks (GANs) are particularly important, because in theory they can generate diverse, realistic images with relatively few model parameters, which suggests an ability to capture the essence of natural images. As a class of generative models, GANs have attracted much attention for their ability to fit the distribution of natural images, and they have been widely used in various image generation tasks such as image inpainting, super-resolution, image-to-image translation, and future frame prediction. [0003] In recent years, many methods have tried to extract semantic embeddings of text, such as the classic word2vec. In the fi...

Claims


Application Information

Patent Timeline
IPC(8): G06T11/00, G06N3/04
CPC: G06T11/001, G06N3/045
Inventor: 袁春, 吴航昊, 贲有成
Owner SHENZHEN GRADUATE SCHOOL TSINGHUA UNIV