Method and apparatus for generating media content, and device and storage medium

The method obtains a generation request, generates intermediate media content from it, and uses a machine learning model to determine the layout of the visual content. This addresses the low generation efficiency and weak relevance of existing technologies and enables efficient, relevant generation of multi-object visual content.

WO2026123372A1 · Publication Date: 2026-06-18 · BEIJING ZITIAO NETWORK TECH CO LTD

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Application
Current Assignee / Owner: BEIJING ZITIAO NETWORK TECH CO LTD
Filing Date: 2024-12-13
Publication Date: 2026-06-18

AI Technical Summary

Technical Problem

Existing technologies generate media content inefficiently and with poor relevance to the user's input. They struggle to make effective use of user-supplied reference text and images when generating high-quality visual content that involves multiple objects.

Method used

A generation request associated with multiple objects is acquired, and intermediate media content is generated from it. A machine learning model then generates the target media content based on the reference text and images. The layout of the multiple pieces of visual content is determined and combined with the objects' image data and preset parameter information to produce efficient, relevant media content.

Benefits of technology

The approach improves the efficiency of media content generation and the relevance of the output to the reference text and images, enhances user interactivity, and produces visual content that is closer to user expectations.

Generated by Eureka AI based on patent content.


Abstract

Embodiments of the present disclosure provide a method and apparatus for generating media content, as well as a device and a storage medium. The method comprises: acquiring a generation request associated with a plurality of objects, wherein the generation request indicates reference text and/or a reference image; and providing target media content generated on the basis of the generation request. The target media content comprises a plurality of pieces of visual content corresponding to the plurality of objects; the pieces of visual content are determined on the basis of image data of the plurality of objects; the layout of the pieces of visual content in the target media content is determined on the basis of intermediate media content; and the intermediate media content is generated on the basis of the reference text and/or the reference image. The present disclosure can effectively improve user interactivity, the efficiency of media content generation, and the correlation between the target media content and the reference text and/or reference image.
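
The flow in the abstract can be illustrated with a minimal Python sketch. It is not the implementation claimed in the filing: every name here (GenerationRequest, ObjectImage, Box, generate_intermediate, fit_into, compose_target) is hypothetical. The intermediate media content is replaced by a simple placeholder that splits the canvas into equal vertical strips, where a real system would presumably derive it from the reference text and/or reference image. The sketch only shows the order of the steps: validate the request, obtain a layout from the intermediate content, then place each object's own image data into that layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ObjectImage:
    """Image data for one object (hypothetical stand-in: a name and a pixel size)."""
    name: str
    width: int
    height: int


@dataclass(frozen=True)
class Box:
    """A rectangle on the canvas: top-left corner plus size, in pixels."""
    x: int
    y: int
    width: int
    height: int


@dataclass(frozen=True)
class GenerationRequest:
    """A request tied to several objects, indicating reference text and/or an image."""
    objects: tuple[ObjectImage, ...]
    reference_text: Optional[str] = None
    reference_image: Optional[bytes] = None

    def __post_init__(self) -> None:
        if len(self.objects) < 2:
            raise ValueError("request must be associated with a plurality of objects")
        if self.reference_text is None and self.reference_image is None:
            raise ValueError("request must indicate reference text and/or a reference image")


def generate_intermediate(request: GenerationRequest, canvas_w: int, canvas_h: int) -> list[Box]:
    """Placeholder for the intermediate media content.

    ASSUMPTION: a real system would generate this from the reference text
    and/or image. Here the canvas is split into equal vertical strips,
    one slot per object, purely to have a layout to work with.
    """
    n = len(request.objects)
    slot_w = canvas_w // n
    return [Box(i * slot_w, 0, slot_w, canvas_h) for i in range(n)]


def fit_into(obj: ObjectImage, slot: Box) -> Box:
    """Scale the object's image to fit its slot, keeping aspect ratio, and center it."""
    scale = min(slot.width / obj.width, slot.height / obj.height)
    w, h = int(obj.width * scale), int(obj.height * scale)
    return Box(slot.x + (slot.width - w) // 2, slot.y + (slot.height - h) // 2, w, h)


def compose_target(request: GenerationRequest,
                   canvas_w: int = 1200, canvas_h: int = 600) -> dict[str, Box]:
    """Take the layout from the intermediate content; take pixels from the object images."""
    layout = generate_intermediate(request, canvas_w, canvas_h)
    return {obj.name: fit_into(obj, slot) for obj, slot in zip(request.objects, layout)}


if __name__ == "__main__":
    request = GenerationRequest(
        objects=(ObjectImage("shoe", 300, 300), ObjectImage("bag", 400, 200)),
        reference_text="summer sale banner",
    )
    for name, box in compose_target(request).items():
        print(name, box)
```

Keeping the layout step separate from the placement step mirrors the split described in the abstract: the intermediate content decides where things go, while the visual content itself comes from the image data of the objects.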