Method and apparatus for generating media content, and device and storage medium
A generation request is obtained and used to generate intermediate media content, and a machine learning model determines the layout of the visual content. This addresses the low generation efficiency and weak relevance of existing technologies and enables efficient, relevant multi-object visual content generation.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- BEIJING ZITIAO NETWORK TECH CO LTD
- Filing Date
- 2024-12-13
- Publication Date
- 2026-06-18
AI Technical Summary
Existing technologies generate media content with low efficiency and weak relevance, and they make poor use of user-supplied reference text and images when producing high-quality multi-object visual content.
A generation request associated with multiple objects is acquired, and intermediate media content is generated from it. A machine learning model then generates the target media content based on the reference text and images. The layout of the multiple visual contents is determined and combined with image data and preset parameter information to produce efficient and relevant media content.
The approach improves the efficiency of media content generation and the relevance of the output to the reference text and images, enhances user interactivity, and yields visual content that is closer to user expectations.
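The flow summarized above (request, intermediate content, layout, target content) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the names `GenerationRequest`, `PlacedObject`, `build_intermediate`, `predict_layout`, and `generate_media_content` are hypothetical, and a fixed grid stands in for the machine learning model that would determine the layout.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple

# (x, y, width, height), normalized to the range [0, 1]
Box = Tuple[float, float, float, float]


@dataclass
class GenerationRequest:
    """Hypothetical request: reference text, reference images, and the objects to depict."""
    reference_text: str
    reference_images: List[str]
    object_names: List[str]


@dataclass
class PlacedObject:
    """One piece of visual content together with its position in the layout."""
    name: str
    prompt: str
    box: Box


def build_intermediate(request: GenerationRequest) -> List[Dict[str, str]]:
    """Create one draft entry per requested object (the 'intermediate media content')."""
    return [
        {"name": name, "prompt": f"{request.reference_text} | {name}"}
        for name in request.object_names
    ]


def predict_layout(drafts: List[Dict[str, str]], columns: int = 2) -> List[PlacedObject]:
    """Assumption: a uniform grid stands in for the layout model described in the summary."""
    if not drafts:
        return []
    rows = -(-len(drafts) // columns)  # ceiling division
    w, h = 1.0 / columns, 1.0 / rows
    return [
        PlacedObject(d["name"], d["prompt"], ((i % columns) * w, (i // columns) * h, w, h))
        for i, d in enumerate(drafts)
    ]


def generate_media_content(request: GenerationRequest,
                           preset_params: Dict[str, Any]) -> Dict[str, Any]:
    """Combine layout, image data, and preset parameters into a target-content description."""
    drafts = build_intermediate(request)
    layout = predict_layout(drafts)
    return {
        "layout": layout,
        "images": list(request.reference_images),
        "params": dict(preset_params),
    }


if __name__ == "__main__":
    request = GenerationRequest("a picnic scene", ["ref1.png"], ["basket", "blanket", "dog"])
    result = generate_media_content(request, {"width": 1024, "height": 1024})
    for placed in result["layout"]:
        print(placed.name, placed.box)
```

In a real system the grid in `predict_layout` would be replaced by a trained model, and the returned description would be passed to an image generator. The sketch only shows how the stages hand data to one another.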