Text image generation method and system based on multi-stage generative adversarial network

Image generation and multi-stage technology, applied in biological neural network models, image data processing, 2D image generation, etc.

Active Publication Date: 2021-09-07
SHANDONG NORMAL UNIV

AI Technical Summary

Problems solved by technology

[0005] In summary, no method or system in the prior art can both guarantee the quality of the initial image generation and fully guarantee the semantic expression

Examples

Embodiment 1

[0050] As shown in Figure 1, this embodiment provides a text-to-image generation method based on a multi-stage generative adversarial network. The embodiment is illustrated with the method applied to a server. The method can also be applied to a terminal, or to a system that includes terminals and servers, where it is realized through interaction between the terminals and the servers. The server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The terminal may be a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speak...

Embodiment 2

[0081] This embodiment provides a text-to-image generation system based on a multi-stage generative adversarial network.

[0082] As shown in Figure 2, the multi-stage generative adversarial network with an integrated attention mechanism consists of three parts: text feature extraction, a generative network, and a discriminative network. A text encoder encodes the text description into a sentence vector and word vectors. The sentence vector serves as the initial feature input, and the word vectors are used for both initial image generation and later image refinement. In the image generation stage, text features are added to the initial features through the upward module and the traditional attention module. The discriminative network predicts an adversarial loss to evaluate the visual authenticity and semantic consistency of the generated image features by extracting features from the generated images and spatially splicing them with textual informati...
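
The spatial splicing step in the discriminative network can be illustrated with a minimal Python sketch. It assumes that the image features form a grid of channel vectors and that the same text vector is copied to every grid cell before the two are joined along the channel axis. The names `spatially_splice`, `features`, and `text` are hypothetical and do not come from the patent, and the later convolution is omitted.

```python
from typing import List

Vector = List[float]
FeatureMap = List[List[Vector]]  # indexed as [row][col][channel]


def spatially_splice(image_features: FeatureMap, text_vector: Vector) -> FeatureMap:
    """Append the same text vector to the channel vector of every grid cell."""
    return [[cell + text_vector for cell in row] for row in image_features]


# Toy data (assumed sizes): a 2x2 feature map with 3 channels per cell
# and a 2-dimensional text vector.
features = [[[float(r), float(c), 1.0] for c in range(2)] for r in range(2)]
text = [0.5, -0.5]

joint = spatially_splice(features, text)
# Each cell now carries its 3 image channels followed by the 2 text channels.
print(len(joint), len(joint[0]), len(joint[0][0]))
```

In a real network the joined grid would then pass through convolution layers that score both realism and agreement with the text; the sketch only shows how the two inputs are aligned.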

Embodiment approach

[0088] As one or more implementations, the generation network module includes: an initial image generation module, a first refinement module, and a second refinement module;

[0089] The initial image generation module is configured to: receive the word vectors and the spliced vector, perform word-level deep fusion processing, output an initial image feature vector, and convolve the initial image feature vector to obtain a first resolution image;

[0090] The first refinement module is configured to: receive the word vectors, use a traditional attention mechanism to convert the word vectors into the common semantic space of the image features, compute a word context vector from the initial image feature vector, splice the word context vector with the initial image feature vector, output a first image feature vector, and convolve the first image feature vector to obtain a second resolution image;

[0091] The second refinement module is configured to: receive a word vector, use a traditional ...
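
The attention step in the refinement modules can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes that the "traditional attention mechanism" is dot-product attention with a softmax over words, that a single projection matrix maps word vectors into the image feature space, and that image features are given as one vector per spatial location. All names (`refine_features`, `projection`, and so on) are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def softmax(xs: Vector) -> Vector:
    """Numerically stable softmax."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def refine_features(image_feats: Matrix, word_vecs: Matrix, projection: Matrix) -> Matrix:
    # 1. Map every word vector into the image feature space.
    projected = [matvec(projection, w) for w in word_vecs]
    refined = []
    for h in image_feats:
        # 2. Attention weights of this image location over all words.
        weights = softmax([sum(a * b for a, b in zip(h, p)) for p in projected])
        # 3. Word context vector: attention-weighted sum of the projected words.
        context = [sum(wt * p[d] for wt, p in zip(weights, projected))
                   for d in range(len(h))]
        # 4. Splice the image feature vector with its word context vector.
        refined.append(h + context)
    return refined


# Toy data (assumed sizes): 2 image locations with 2 channels, 3 words with 3 dims.
image_feats = [[1.0, 0.0], [0.0, 1.0]]
word_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
projection = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]

refined = refine_features(image_feats, word_vecs, projection)
print([len(row) for row in refined])
```

Each output row keeps the original image feature in its first half and the word context in its second half, so a following convolution can draw on both.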

Abstract

The invention belongs to the technical field of cross-modal generation and provides a text-to-image generation method and system based on a multi-stage generative adversarial network. The method comprises the following steps: acquiring text information, inputting it into a text encoder, and extracting a sentence vector and word vectors; performing conditional enhancement on the sentence vector to obtain a conditional vector, and splicing the conditional vector with a noise vector to obtain a spliced vector; inputting the word vectors and the spliced vector into a generation network, which outputs a first resolution image, a second resolution image, and a third resolution image through an initial image generation stage, a first refinement stage, and a second refinement stage, respectively; and inputting the first, second, and third resolution images into a discriminative network, extracting image features, spatially splicing the image features with the conditional vector, applying convolution to the resulting vectors, and introducing a target loss function to strengthen the similarity between the generated images and real images.
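
The conditional enhancement and splicing steps can be sketched in Python. The sketch assumes that "conditional enhancement" works like the conditioning augmentation known from earlier text-to-image GANs: a mean and a log-variance are computed from the sentence vector, and the conditional vector is sampled as mean plus standard deviation times Gaussian noise. The weights, sizes, and function names below are hypothetical placeholders, since the abstract does not specify them.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[Vector]


def linear(weights: Matrix, bias: Vector, x: Vector) -> Vector:
    """Affine map weights @ x + bias, written out with plain lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def condition_augment(sentence: Vector, w_mu: Matrix, b_mu: Vector,
                      w_logvar: Matrix, b_logvar: Vector,
                      rng: random.Random) -> Vector:
    """Sample a conditional vector c = mu + sigma * eps with eps ~ N(0, 1)."""
    mu = linear(w_mu, b_mu, sentence)
    logvar = linear(w_logvar, b_logvar, sentence)
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def splice_with_noise(condition: Vector, noise_dim: int, rng: random.Random) -> Vector:
    """Join the conditional vector and a fresh Gaussian noise vector end to end."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return condition + noise


# Hypothetical toy sizes: 3-dim sentence vector, 2-dim condition, 4-dim noise.
rng = random.Random(0)
sentence = [0.2, -0.1, 0.4]
w_mu = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
w_logvar = [[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]]
zero_bias = [0.0, 0.0]

condition = condition_augment(sentence, w_mu, zero_bias, w_logvar, zero_bias, rng)
spliced = splice_with_noise(condition, noise_dim=4, rng=rng)
print(len(condition), len(spliced))
```

Sampling the conditional vector, rather than using the sentence vector directly, gives the generator slightly different conditions for the same text; in a trained model the two linear maps would be learned rather than fixed.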

Description

Technical field

[0001] The invention belongs to the technical field of cross-modal generation, and in particular relates to a text-to-image generation method and system based on a multi-stage generative adversarial network.

Background

[0002] The statements in this section merely provide background information related to the present invention and do not necessarily constitute prior art.

[0003] Automatically generating images from natural language descriptions is a fundamental problem in many applications, such as art generation and computer-aided design. It converts the text modality into the image modality and advances research on multimodal learning and reasoning across vision and language. The use of Generative Adversarial Networks (GAN) to generate images from text has greatly improved the quality of the generated images. The stability of early generative adversarial networks was difficult to guarantee, but w...

Application Information

IPC(8): G06F40/205, G06F40/126, G06F40/30, G06K9/46, G06K9/62, G06T11/00, G06N3/04
CPC: G06F40/205, G06F40/126, G06F40/30, G06T11/001, G06N3/045, G06F18/25, Y02T10/40
Inventor 刘丽, 王泽康, 马跃, 崔怀磊, 张化祥
Owner SHANDONG NORMAL UNIV