Hierarchical 3D refrigerator magnet digital model generation method based on a single image
By using a hierarchical 3D refrigerator magnet digital model generation method based on a single image, the problems of long modeling cycle and structural instability in existing technologies are solved, and efficient and automated 3D refrigerator magnet model generation is achieved, ensuring close adhesion between the back and the background board and visual quality.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: QILU UNIVERSITY OF TECHNOLOGY (SHANDONG ACADEMY OF SCIENCES)
- Filing Date: 2026-04-14
- Publication Date: 2026-06-16
AI Technical Summary
Existing technologies for generating 3D refrigerator magnets suffer from problems such as long modeling cycles, high labor costs, insufficient three-dimensional depth, and unstable model structures, making it difficult to ensure a tight fit between the back and the background panel. In particular, deformation, breakage, or poor visual quality are prone to occur during the 3D printing process.
The method for generating hierarchical 3D refrigerator magnet digital models based on single images includes generating reference images, separating foreground and background, monocular depth estimation and 3D mesh model construction, and geometric optimization. It utilizes AI image generation and deep learning technologies to automatically complete the separation of foreground and background and 3D reconstruction, and ensures that the back and background are closely attached through geometric optimization.
It enables the automated generation of high-fidelity, manufacturable 3D refrigerator magnet models from single images, maintaining front-view appearance details and ensuring stable adhesion between the back and the background, reducing reliance on manual modeling and improving generation efficiency and structural stability.
Figure CN122023728B_ABST (abstract drawing; image not reproduced)
Description
Technical Field
[0001] This invention relates to the fields of computer vision and additive manufacturing technology, and in particular to a method for generating hierarchical 3D refrigerator magnet digital models based on a single image.
Background Technology
[0002] With the rapid development of additive manufacturing (3D printing) technology and the personalized customization market, the demand for automated 3D modeling technology—which offers low cost, short lead time, high fidelity, and direct printability—is increasing for small cultural and creative decorative items (such as 3D refrigerator magnets and badges). Traditional refrigerator magnet design methods typically rely on manual modeling using 3D modeling software or converting 2D images into simple 2.5D lamination or embossing structures to achieve a 3D effect. However, these methods generally suffer from long modeling cycles, high labor costs, insufficient 3D depth, and difficulty in ensuring the stability of the model structure.
[0003] In recent years, generative artificial intelligence and monocular depth estimation technologies have made significant progress, enabling the automatic generation of 3D models or depth maps from text or images. However, these general 3D generation methods are typically designed for open scenes, and the generated results are difficult to directly apply to 3D refrigerator magnet design in terms of geometry, topological stability, and manufacturability. Specifically, on the one hand, general 3D generation models usually do not provide structural constraints for the thin structure and back-side adhesion requirements of refrigerator magnet products. Due to the irregular back structure, the generated 3D objects are prone to deformation, breakage, or failure to adhere tightly to the background board during 3D printing or assembly. On the other hand, if generically generated 3D objects are simply spliced with the background board, the lack of refined analysis and optimization of mesh vertex visibility and spatial structural relationships can easily lead to local self-intersections, clipping, or geometric distortions, thus affecting the visual quality and structural stability of the final product.
Summary of the Invention
[0004] In view of this, the present invention provides a method for generating hierarchical 3D refrigerator magnet digital models based on a single image, which can maintain the front appearance details and ensure that the back of the 3D model is stably attached to the background plate, so as to meet the requirements of 3D printing manufacturability.
[0005] In a first aspect, the present invention provides a method for generating hierarchical 3D refrigerator magnet digital models based on a single image, the method comprising:
[0006] Step 1: Generate a reference image for the main 3D refrigerator magnet sculpture based on the creative description information;
[0007] Step 2: Separate the main object in the reference image from the background to obtain the foreground RGB image, the foreground 2D mask image, and the background RGB image;
[0008] Step 3: Construct a 3D mesh model based on the foreground RGB image, the foreground 2D mask image, and the background RGB image;
[0009] Step 4: Perform geometric optimization on the 3D mesh model to obtain a hierarchical 3D refrigerator magnet digital model.
[0010] Optionally, step 1 includes:
[0011] First, input the creative prompt word "Prompt". The prompt word contains information about the main object, style description, material characteristics and three-dimensional structure description to enhance the spatial hierarchy and volume of the generated image.
[0012] When using AI-generated image tools to generate images, the following constraints should be applied to the design of the prompts:
[0013] (1) The main subject is prominent, and the main subject is located in the center of the picture; (2) Frontal view, choose the frontal view; (3) Complete outline, clear subject boundary; (4) Clear lighting, the scene lighting is evenly distributed; (5) Three-dimensional expression, add keywords of 3D sculpture style and 3D rendering effect to the prompt words so that the generated image has three-dimensional morphological characteristics.
[0014] Based on the above prompts, the final reference image is the main object with three-dimensional shape features. The reference image serves as the basic input for subsequent foreground-background separation and 3D modeling.
[0015] Optionally, step 2 includes:
[0016] First, the main subject of the reference image is obtained using AI image editing tools or deep learning-based image segmentation models; then, three types of intermediate data are obtained through foreground separation processing:
[0017] (1) Foreground RGB image: that is, a color image containing the central main object, with its background area removed or made transparent, retaining only the main sculpture content. The foreground RGB image is used to generate the three-dimensional mesh model;
[0018] (2) Foreground 2D mask image: It is a binary image with the same resolution as the original image, in which the pixel value of the foreground region is 1 and the pixel value of the background region is 0; the foreground 2D mask image is used for regional constraints and depth estimation range limits in the subsequent 3D modeling process;
[0019] (3) Background RGB image: After the foreground is separated, the main area in the original image will be empty. In order to obtain a complete background image, the empty area is automatically filled by AI image editing tools. The above process uses contextual texture information to generate natural and continuous background content, thereby obtaining a complete background image.
[0020] Optionally, step 3 includes:
[0021] Based on the foreground RGB image, the foreground 2D mask image, and the background RGB image, a 3D mesh model is generated for subsequent geometric optimization and printing manufacturing. This model includes a 3D solid model of the background plate and a 3D mesh model of the foreground object. The two are then aligned in scale and position in 3D space. The workflow is as follows:
[0022] Step 31: Generate the 3D solid model of the background panel;
[0023] First, using the background RGB image as input, the depth information of the background image is generated through a monocular depth estimation method, and the depth values are normalized so that the minimum depth value in the depth map corresponds to the Z = 0 plane in three-dimensional space, thereby establishing a unified spatial reference benchmark.
[0024] Subsequently, based on the mapping relationship between the depth map and the image pixel coordinates, each pixel is converted into vertex coordinates in three-dimensional space, and adjacent pixels are connected into triangular patches according to the image mesh structure, thereby generating the triangular mesh surface on the front of the background panel.
[0025] Next, the background RGB image is mapped onto the triangular mesh surface as a texture map. UV coordinate mapping is used to achieve a one-to-one correspondence between the texture and the geometric structure, so that the background surface retains the color and detail information of the original image.
[0026] Finally, in order to form a background plate structure with solid thickness, the triangular mesh surface is solidified: (1) the four boundary contours of the triangular mesh surface are extracted, including the upper boundary, lower boundary, left boundary and right boundary; (2) the four boundaries are extruded along the negative Z-axis to generate four side wall planes that are approximately perpendicular to the front of the background plate, thus forming the side structure of the background plate; (3) a triangular mesh plane parallel to the front of the background plate is constructed at the bottom of the side wall as the back of the background plate; the triangular mesh on the back is connected to the edges of the four side walls, thus forming a closed three-dimensional solid mesh model together with the front of the background plate.
[0027] Through the above operations, a complete 3D solid model of the background board is obtained;
[0028] Step 32: Generate the 3D mesh model of the foreground object;
[0029] After generating the 3D solid model of the background, the foreground main object is reconstructed in 3D to obtain the 3D mesh model of the foreground object; taking the foreground RGB image as input, the AI tool for generating 3D models from images is called to generate the 3D mesh model of the foreground object, and output in a standard 3D mesh data format, including 3D vertex coordinates (Vertex), triangular facet topology (Face), UV texture coordinates and corresponding texture map images.
[0030] Through the above operations, a three-dimensional mesh model of the foreground object with complete geometric structure and surface texture is obtained, which will be spatially aligned and structurally blended with the background plate in the subsequent process;
[0031] Step 33: Alignment in 3D space;
[0032] Automatically scale and translate 3D objects to align them with the foreground masking area on a 2D projection.
[0033] Optionally, step 33 includes:
[0034] Step 331: Calculate the size of the two-dimensional mask;
[0035] First, a pixel scan is performed on the foreground 2D mask image to extract the boundary range of the foreground mask region in the image coordinate system, including: calculating the minimum and maximum values $U_{min}$, $U_{max}$, $V_{min}$, $V_{max}$ of the foreground pixels in the U and V directions; calculating the mask width $U_{mask} = U_{max} - U_{min}$ and the mask height $V_{mask} = V_{max} - V_{min}$; and simultaneously calculating the two-dimensional center point $O$ of the foreground mask region;
[0036] Step 332: Calculate the dimensions of the three-dimensional object;
[0037] Iterate through the 3D coordinates of all vertices in the 3D mesh model of the foreground object and calculate its spatial extent $X_{max}$, $X_{min}$, $Y_{max}$, $Y_{min}$ in the X and Y directions, to obtain the spatial dimensions of the three-dimensional object, $X_{mesh} = X_{max} - X_{min}$ and $Y_{mesh} = Y_{max} - Y_{min}$; simultaneously calculate the geometric center point $C$ of the three-dimensional object;
[0038] Step 333: Scale and translation alignment;
[0039] To ensure that the 3D object aligns with the foreground masking area in the 2D projection, the proportional relationship between the 3D object and the masking area is calculated. First, the horizontal scaling factor $S_x = U_{mask} / X_{mesh}$ and the longitudinal scaling factor $S_y = V_{mask} / Y_{mesh}$ are calculated. Then, the smaller of the two values is selected as the overall size scaling factor: $S = \min(S_x, S_y)$. Next, a uniform scaling operation is performed on the coordinates of all vertices of the 3D object: $(X, Y, Z) \rightarrow (S \cdot X, S \cdot Y, S \cdot Z)$. After scaling, the two-dimensional translation vector $T$ is calculated so that the projection of the geometric center point $C$ of the three-dimensional object onto the XY plane is aligned with the two-dimensional center point $O$ of the foreground masking area, and the translation is applied to all vertex coordinates.
[0040] Through the above operations, the three-dimensional object is aligned with the foreground masking area in the two-dimensional projection plane, thus providing a spatial basis for subsequent fusion with the background panel;
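Steps 331 to 333 can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the function name `align_to_mask` is hypothetical, the image (U, V) axes are assumed to map directly onto mesh (X, Y) without an axis flip, and both center points are assumed to be bounding-box midpoints, which the text does not specify.

```python
import numpy as np

def align_to_mask(vertices: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Uniformly scale and translate mesh vertices (N x 3) to fit a binary mask."""
    # Step 331: bounding box and center O of the foreground mask region.
    rows, cols = np.nonzero(mask)
    u_min, u_max = cols.min(), cols.max()
    v_min, v_max = rows.min(), rows.max()
    u_mask, v_mask = u_max - u_min, v_max - v_min
    center_o = np.array([(u_min + u_max) / 2.0, (v_min + v_max) / 2.0])

    # Step 332: XY extent of the mesh.
    xy_min = vertices[:, :2].min(axis=0)
    xy_max = vertices[:, :2].max(axis=0)
    x_mesh, y_mesh = xy_max - xy_min

    # Step 333: uniform scale S = min(S_x, S_y), then translate C onto O.
    s = min(u_mask / x_mesh, v_mask / y_mesh)
    aligned = vertices.astype(float) * s
    center_c = (aligned[:, :2].min(axis=0) + aligned[:, :2].max(axis=0)) / 2.0
    aligned[:, :2] += center_o - center_c      # translation vector T
    return aligned

# Example: a 2 x 2 x 2 cube aligned to a mask spanning columns 3..8, rows 2..6.
mask = np.zeros((10, 10), dtype=np.uint8)
mask[2:7, 3:9] = 1
cube = np.array([[x, y, z] for x in (0, 2) for y in (0, 2) for z in (0, 2)], dtype=float)
aligned = align_to_mask(cube, mask)
```

In this example the mask is 5 wide and 4 high, so the smaller factor (4 / 2 = 2) is used and the cube stays inside the mask bounding box in both directions.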
[0041] Step 334: Three-dimensional height compression;
[0042] The height of the foreground 3D object is moderately compressed, including multiplying the Z coordinates of all vertices of the 3D object by a compression factor α: Z'=α Z, where α∈[0.4, 0.6].
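The height compression of Step 334 is a single per-vertex multiplication. The sketch below assumes an N x 3 vertex array; the function name `compress_height` and the default value 0.5 are illustrative choices within the stated interval.

```python
import numpy as np

def compress_height(vertices, alpha: float = 0.5) -> np.ndarray:
    """Apply Z' = alpha * Z to every vertex, with alpha restricted to [0.4, 0.6]."""
    if not 0.4 <= alpha <= 0.6:
        raise ValueError("alpha is expected to lie in [0.4, 0.6]")
    out = np.asarray(vertices, dtype=float).copy()
    out[:, 2] *= alpha                     # X and Y are left unchanged
    return out

compressed = compress_height([[1.0, 2.0, 4.0], [0.0, 1.0, 2.0]], alpha=0.5)
```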
[0043] Optionally, step 4 includes:
[0044] While maintaining the approximate frontal visual appearance of the foreground object, a constrained global optimization of the foreground 3D mesh is performed in the Z direction to ensure that the back of the foreground object fits tightly with the background plate and to guarantee the stability of the mesh topology and surface details. The process includes vertex visibility determination, Z-reference correction, anchor point selection, energy model establishment, and sparse linear system solution.
[0045] Let the Z-coordinates of the mesh vertices form the vector $z \in \mathbb{R}^n$, where $n$ is the number of vertices. Geometric optimization preserves the local differential coordinates of the mesh while applying target constraints in the Z direction to selected vertices. Its mathematical form is equivalent to solving the following least squares problem with soft constraints:
[0046] $\min_{z} \; \lVert L z - \delta \rVert^2 + \sum_{k} w_k \lVert C_k z - t_k \rVert^2$;
[0047] where $L$ is the discrete Laplacian matrix based on the mesh topology, and $\delta = L z^{(0)}$ is the differential coordinate vector of the original Z coordinates $z^{(0)}$; each anchor point group $k$ has a selection matrix $C_k$, a target depth vector $t_k$, and a weight $w_k$. Free points are treated as a constraint group with a weight close to zero;
[0048] Step 41: Vertex visibility determination, i.e., front or back visibility;
[0049] Before selecting anchor points, each vertex is classified by visibility, i.e., whether it is visible from the front or the back, to ensure that anchor points only come from the visible area behind the object.
[0050] (1) Pre-selection: Calculate the Z component Nz of the normal N of each vertex. If Nz is consistent with the viewing direction, then the current vertex is a candidate visible vertex.
[0051] (2) Ray occlusion detection: For each candidate vertex, a ray is generated by offsetting a very small distance ε along the observation direction. The ray triangle intersection acceleration structure is used to detect whether the ray intersects with other mesh faces. If they do not intersect, the current vertex is marked as visible.
[0052] (3) Perform detection in two directions, from front to back and from back to front, respectively, to obtain the front visible point set and the back visible point set; the front visible point set is used for subsequent Z-reference correction; the back visible point set is used for anchor point connected component extraction;
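The two-stage visibility test of Step 41 can be sketched as follows. This is only an illustration under stated assumptions: "consistent with the viewing direction" is read as the vertex normal pointing toward the viewer, the ray test is a brute-force Möller-Trumbore loop over all triangles rather than the acceleration structure mentioned in the text, and the names `ray_hits_mesh` and `visible_vertices` are hypothetical.

```python
import numpy as np

def ray_hits_mesh(origin, direction, tri, eps=1e-9):
    """Moller-Trumbore test of one ray against all triangles (tri: F x 3 x 3)."""
    v0, e1, e2 = tri[:, 0], tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0]
    p = np.cross(direction, e2)
    det = np.einsum("ij,ij->i", e1, p)
    ok = np.abs(det) > eps                               # skip triangles parallel to the ray
    inv = np.where(ok, 1.0 / np.where(ok, det, 1.0), 0.0)
    s = origin - v0
    u = np.einsum("ij,ij->i", s, p) * inv
    q = np.cross(s, e1)
    v = (q @ direction) * inv
    t = np.einsum("ij,ij->i", q, e2) * inv
    return bool(np.any(ok & (u >= 0) & (v >= 0) & (u + v <= 1) & (t > eps)))

def visible_vertices(vertices, faces, normals, toward_viewer, offset=1e-4):
    """Indices of vertices visible to a viewer located in direction `toward_viewer`."""
    d = np.asarray(toward_viewer, dtype=float)
    tri = vertices[faces]
    visible = []
    for i in np.nonzero(normals @ d > 0)[0]:             # (1) pre-selection by normal
        origin = vertices[i] + offset * d                # (2) small offset epsilon, then ray cast
        if not ray_hits_mesh(origin, d, tri):
            visible.append(int(i))
    return visible

# Example scene: a small triangle at z = 0 lying underneath a large triangle at z = 1.
vertices = np.array([[-0.5, -0.5, 0], [0.5, -0.5, 0], [0, 0.5, 0],
                     [-5, -5, 1], [5, -5, 1], [0, 5, 1]], dtype=float)
faces = np.array([[0, 1, 2], [3, 4, 5]])
normals = np.tile([0.0, 0.0, 1.0], (6, 1))
front = visible_vertices(vertices, faces, normals, toward_viewer=(0, 0, 1))
back = visible_vertices(vertices, faces, normals, toward_viewer=(0, 0, -1))
```

Running the function once per direction, as in item (3), gives the front-visible and back-visible sets. Here the small triangle is occluded from the front, and no vertex passes the normal pre-selection from the back.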
[0053] Step 42, Z-reference correction, i.e., initial alignment;
[0054] To ensure that the background and foreground 3D meshes have a unified Z-reference datum, initial Z-reference correction is performed:
[0055] a. Calculate the minimum Z value over the vertices visible from the front: $z_{min} = \min_{i \in V_{front}} z_i$;
[0056] b. Subtract the minimum value from the Z-coordinates of all vertices in the entire foreground 3D mesh: $z_i \leftarrow z_i - z_{min}$;
[0057] c. Truncate Z values that are less than zero after correction: if $z_i < 0$, then let $z_i = 0$;
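The three sub-steps of the Z-reference correction amount to a shift followed by a clamp. A minimal sketch, assuming the front-visible set is given as a list of vertex indices (the function name `correct_z_reference` is hypothetical):

```python
import numpy as np

def correct_z_reference(z, front_visible):
    """Shift Z so the lowest front-visible vertex sits at Z = 0, then clamp at zero."""
    z = np.asarray(z, dtype=float)
    z_min = z[front_visible].min()         # a. minimum over front-visible vertices
    shifted = z - z_min                    # b. subtract it from every vertex
    return np.maximum(shifted, 0.0)        # c. truncate negative values to zero

z_corrected = correct_z_reference([1.0, 0.5, 2.0, 0.2], front_visible=[0, 1, 2])
```

Vertex 3 is not front-visible and lies below the reference, so after the shift it would be negative and is clamped to zero.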
[0058] Step 43: Selection of strong anchor point, weak anchor point, or free point;
[0059] h. Strong anchor points;
[0060] In the vertex set visible on the back side, construct an induced subgraph based on the vertex adjacency relationship of the grid; calculate the connected components on the induced subgraph and sort the connected components in descending order of the number of vertices; select the connected component with the most vertices as the candidate strong anchor point region;
[0061] Within the candidate strong anchor point region, the vertices whose Z coordinates are less than a threshold $\tau$ are selected as the set of strong anchor points. For each strong anchor point, the corresponding pixel on the background depth map is located using the vertex's UV or XY projection position, and the background depth value is read and used as the absolute target depth of that vertex. Strong anchor points are assigned a large weight $w_{strong}$, so that forced adhesion to the background panel is achieved through a strong constraint;
[0062] i. Weak anchor points and free points;
[0063] The vertices in the back-visible set that are not strong anchor points form the set of weak anchor points. For each weak anchor point, the original Z-coordinate is multiplied by a target coefficient $\beta$ to obtain the relative target depth $t_i = \beta z_i$, and a weight $w_{weak}$ is assigned. The target coefficient $\beta$ takes a value between 0.5 and 0.7, meaning that the vertex moves toward the background, but only to a fraction of its original depth. Vertices other than strong and weak anchor points are defined as free points, and their weight $w_{free}$ is 0.001, which allows a smooth transition with the neighborhood through the pull of the mesh topology;
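The classification in Step 43 can be sketched in plain Python as follows: build the subgraph induced by the back-visible vertices, find its connected components by breadth-first search, and split the vertices into strong, weak, and free sets. The function name `classify_vertices` and the example threshold are hypothetical; looking up strong-anchor targets in the background depth map and assigning weights are left out.

```python
from collections import deque

def classify_vertices(z, faces, back_visible, tau):
    """Return (strong, weak, free) vertex index sets for the anchor selection."""
    back = set(back_visible)
    adj = {v: set() for v in back}
    for a, b, c in faces:                      # keep only edges with both ends back-visible
        for p, q in ((a, b), (b, c), (c, a)):
            if p in back and q in back:
                adj[p].add(q)
                adj[q].add(p)

    components, seen = [], set()
    for start in sorted(back):                 # breadth-first search per component
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            v = queue.popleft()
            for nb in adj[v]:
                if nb not in seen:
                    seen.add(nb)
                    comp.add(nb)
                    queue.append(nb)
        components.append(comp)
    components.sort(key=len, reverse=True)     # descending by number of vertices

    largest = components[0] if components else set()
    strong = {v for v in largest if z[v] < tau}
    weak = back - strong
    free = set(range(len(z))) - back
    return strong, weak, free

z = [0.1, 0.2, 0.9, 0.05, 0.0, 0.0, 1.0]
faces = [(0, 1, 2), (1, 2, 3), (4, 5, 6)]
strong, weak, free = classify_vertices(z, faces, back_visible=[0, 1, 2, 3, 4, 5], tau=0.3)
```

Vertices 4 and 5 lie below the threshold but belong to the smaller component, so they become weak anchors rather than strong ones.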
[0064] Step 44: Construct a sparse linear system and solve it;
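[0065] Stack the Laplacian equations and the anchor point soft constraints row by row to form a linear system $A z = b$, solved in the least-squares sense, with $A = \begin{bmatrix} L \\ \sqrt{w_1}\, C_1 \\ \vdots \\ \sqrt{w_K}\, C_K \end{bmatrix}$;
[0066] $b = \begin{bmatrix} \delta \\ \sqrt{w_1}\, t_1 \\ \vdots \\ \sqrt{w_K}\, t_K \end{bmatrix}$;
[0067] Here, the first block consists of the $n \times n$ discrete Laplacian matrix $L$ and the differential coordinates $\delta$. For each anchor group $k$, a sparse selection matrix $C_k$ is constructed, whose number of rows equals the number of vertices in the group and whose number of columns equals $n$, with a 1 at position (row, col) for each anchored vertex; the corresponding target vector is $t_k$;
[0068] By constructing the normal equations $A^{\top} A z = A^{\top} b$, $z$ is solved using a sparse direct solver, and $z$ is then truncated to be non-negative, $z_i \leftarrow \max(z_i, 0)$, which prevents vertices from penetrating behind the background; finally, the new $z$ is written back to the mesh vertices and the mesh is exported.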
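The stacking and solving in Step 44 can be illustrated on a toy mesh. The sketch makes several assumptions that the text does not confirm: a uniform graph Laplacian (degree minus adjacency), constraint rows scaled by the square root of the weight so that the weight multiplies the squared residual, and a dense `numpy.linalg.solve` standing in for the sparse direct solver. The function name `solve_depths` and the strong-anchor weight of 1000 are hypothetical; only the free-point weight of 0.001 comes from the description.

```python
import numpy as np

def solve_depths(z0, edges, groups):
    """groups: list of (vertex_indices, targets, weight). Returns non-negative z."""
    z0 = np.asarray(z0, dtype=float)
    n = len(z0)
    L = np.zeros((n, n))
    for i, j in edges:                          # uniform graph Laplacian (assumption)
        L[i, i] += 1
        L[j, j] += 1
        L[i, j] -= 1
        L[j, i] -= 1
    delta = L @ z0                              # differential coordinates of the original Z

    rows, rhs = [L], [delta]
    for idx, targets, w in groups:
        C = np.zeros((len(idx), n))
        C[np.arange(len(idx)), idx] = 1.0       # one 1 per row at (row, col = vertex index)
        rows.append(np.sqrt(w) * C)
        rhs.append(np.sqrt(w) * np.asarray(targets, dtype=float))

    A, b = np.vstack(rows), np.concatenate(rhs)
    z = np.linalg.solve(A.T @ A, A.T @ b)       # normal equations A^T A z = A^T b
    return np.maximum(z, 0.0)                   # non-negativity truncation

# Toy example: a chain of four vertices, with vertex 0 pulled onto the background (Z = 0).
edges = [(0, 1), (1, 2), (2, 3)]
z0 = [1.0, 1.0, 1.0, 1.0]
groups = [([0], [0.0], 1000.0),                  # strong anchor; weight value is hypothetical
          ([1, 2, 3], [1.0, 1.0, 1.0], 0.001)]   # free points, weight 0.001 as in the text
z_new = solve_depths(z0, edges, groups)
```

Because the Laplacian term favors keeping the original (flat) differential coordinates, the strongly anchored vertex drags the whole chain toward the background rather than bending the mesh sharply at the anchor.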
[0069] In a second aspect, embodiments of the present invention provide a computer-readable storage medium comprising a stored program, wherein, when the program is executed, it controls the device on which the computer-readable storage medium is located to execute the hierarchical 3D refrigerator magnet digital model generation method based on a single image, as described in the first aspect or any possible implementation thereof.
[0070] In a third aspect, embodiments of the present invention provide an electronic device, including: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory, and the one or more computer programs include instructions that, when executed by the device, cause the device to perform the hierarchical 3D refrigerator magnet digital model generation method based on a single image in the first aspect or any possible implementation of the first aspect.
[0071] The technical solution provided by this invention includes a method that generates a reference image of the main sculpted 3D refrigerator magnet based on creative description information; separates the main object in the reference image from the background to obtain a foreground RGB image, a foreground 2D mask image, and a background RGB image; constructs a 3D mesh model based on the foreground RGB image, the foreground 2D mask image, and the background RGB image; and performs geometric optimization on the 3D mesh model to obtain a hierarchical 3D refrigerator magnet digital model. This method takes a single reference image as input and automatically completes foreground separation, monocular depth recovery, 3D model generation, and geometric optimization, thus maintaining the details of the front view appearance and ensuring stable adhesion between the back of the 3D model and the background plate, thereby meeting the requirements of 3D printing manufacturability.
Attached Figure Description
[0072] To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0073] Figure 1 is a flowchart illustrating the method for generating a hierarchical 3D refrigerator magnet digital model based on a single image, as provided in an embodiment of the present invention.
[0074] Figure 2 is a schematic diagram of the 3D refrigerator magnet modeling process provided in an embodiment of the present invention, wherein (a) is a reference image of the main sculpture, (b) is a foreground RGB image, (c) is a complete background image after completion, (d) is a foreground 3D model generated from the foreground RGB image, (e) is a background board 3D model generated from the background image, (f) is a model after three-dimensional spatial alignment, (g) is a compressed model, and (h) is the final 3D refrigerator magnet digital model.
[0075] Figure 3 is a schematic diagram of an electronic device provided in an embodiment of the present invention.
Detailed Implementation
[0076] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0077] The terminology used in the embodiments of this invention is for the purpose of describing particular embodiments only and is not intended to limit the invention. The singular forms “a,” “an,” and “the” used in the embodiments of this invention are also intended to include the plural forms unless the context clearly indicates otherwise.
[0078] It should be understood that the term "and / or" used in this article is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, or B existing alone. Additionally, the character " / " in this article generally indicates that the preceding and following related objects have an "or" relationship.
[0079] Depending on the context, the word "if" as used here can be interpreted as "when," "upon," "in response to determining," or "in response to detecting." Similarly, depending on the context, the phrase "if it is determined" or "if (the stated condition or event) is detected" can be interpreted as "when it is determined," "in response to determining," "when (the stated condition or event) is detected," or "in response to detecting (the stated condition or event)."
[0080] This invention provides a method for generating hierarchical 3D refrigerator magnet digital models based on a single image. As shown in Figure 1 and in panels (a) to (h) of Figure 2, the method includes:
[0081] Step 1: Generate a reference image of the main 3D refrigerator magnet sculpture based on the creative description information.
[0082] In this embodiment of the invention, AI image generation technology and image editing technology are used to generate a reference image of a main sculpture with obvious three-dimensional structural features, as well as to automatically separate and complete the foreground subject and background area, providing standardized input for the subsequent generation of 3D refrigerator magnet digital models.
[0083] In this embodiment of the invention, step 1 includes:
[0084] First, the user enters a creative prompt word, which contains information about the main object, style description, material characteristics, and three-dimensional structure description. For example, it includes descriptions such as sculpture style, three-dimensional ornament, 3D rendering effect, and high level of detail to enhance the spatial hierarchy and volume of the generated image.
[0085] When using AI image generation tools (such as Flux, JiMeng, etc.) to generate images, the following constraints are applied to the design of prompts to ensure that the generated results are suitable for the subsequent 3D reconstruction process:
[0086] (1) Prominent subject: The main subject is located in the center of the image to avoid complex obstruction; (2) Frontal view: Choose a frontal view to make the main structure clearly visible; (3) Complete outline: The main subject has a clear boundary to avoid serious fusion with the background; (4) Clear lighting: The scene lighting is evenly distributed to avoid strong shadows; (5) Three-dimensional expression: Add keywords such as 3D sculpture style and 3D rendering effect to the prompts to make the generated image have three-dimensional morphological characteristics.
[0087] Based on the above prompts, the final reference image is the main object with three-dimensional shape features. The reference image serves as the basic input for subsequent foreground-background separation and 3D modeling.
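As an illustration of how the five prompt constraints might be applied in practice, the sketch below appends them to the user's creative description as fixed keywords. The function name `build_prompt`, the exact keyword wording, and the comma-separated format are assumptions; the description does not prescribe a prompt template, and different image generation tools may respond better to different phrasing.

```python
def build_prompt(subject: str, style: str, material: str) -> str:
    """Combine the creative description with the five layout constraints."""
    constraints = [
        "subject centered in the frame",                                   # (1) prominent subject
        "front view",                                                      # (2) frontal view
        "complete outline with a clear boundary",                          # (3) complete outline
        "even lighting without strong shadows",                            # (4) clear lighting
        "3D sculpture style, 3D rendering effect, high level of detail",   # (5) 3D expression
    ]
    return ", ".join([subject, style, material, *constraints])

prompt = build_prompt("a lighthouse on a cliff", "cartoon sculpture style", "glossy resin")
```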
[0088] Step 2: Separate the main object in the reference image from the background to obtain the foreground RGB image, the foreground 2D mask image, and the background RGB image.
[0089] In this embodiment of the invention, after obtaining the reference image of the main sculpture, the main object in the image is separated from the background so that the foreground and background can be reconstructed in three dimensions respectively.
[0090] In this embodiment of the invention, step 2 includes:
[0091] First, the main subject of the reference image is obtained using AI image editing tools (such as Jimeng, Nano Banana2, etc.) or deep learning-based image segmentation models (such as Segment Anything); through foreground separation processing, three types of intermediate data are obtained:
[0092] (1) Foreground RGB image: that is, a color image containing the central main object, with its background area removed or made transparent, retaining only the main sculpture content. The foreground RGB image is used to generate the three-dimensional mesh model;
[0093] (2) Foreground 2D mask image: It is a binary image with the same resolution as the original image, in which the pixel value of the foreground region is 1 and the pixel value of the background region is 0; the foreground 2D mask image is used for regional constraints and depth estimation range limits in the subsequent 3D modeling process;
[0094] (3) Background RGB image: After the foreground is separated, the main area in the original image will be empty. In order to obtain a complete background image, the empty area is automatically filled by AI image editing tools. The above process uses contextual texture information to generate natural and continuous background content, thereby obtaining a complete background image.
[0095] Through the reference image generation stage, the user's creative description is automatically transformed into structured visual data, laying the foundation for the subsequent generation of 3D refrigerator magnet models.
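The first two intermediate outputs of Step 2 can be derived from a cut-out image once a segmentation tool has produced one. The sketch below assumes the cut-out is available as an H x W x 4 RGBA array whose alpha channel marks the foreground; the function name `split_foreground` and the alpha threshold of 128 are illustrative. Background completion (inpainting) depends on an external AI editing tool and is not shown.

```python
import numpy as np

def split_foreground(rgba: np.ndarray, alpha_threshold: int = 128):
    """Return (foreground_rgb, mask) from an H x W x 4 uint8 RGBA cut-out."""
    alpha = rgba[..., 3]
    mask = (alpha >= alpha_threshold).astype(np.uint8)    # 1 = foreground, 0 = background
    foreground_rgb = rgba[..., :3] * mask[..., None]      # zero out background pixels
    return foreground_rgb, mask

# Tiny 2 x 2 example: only the top-left pixel is opaque.
rgba = np.zeros((2, 2, 4), dtype=np.uint8)
rgba[0, 0] = (200, 100, 50, 255)
fg, mask = split_foreground(rgba)
```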
[0096] Step 3: Construct a 3D mesh model based on the foreground RGB image, the foreground 2D mask image, and the background RGB image.
[0097] In this embodiment of the invention, step 3 includes:
[0098] Based on the foreground RGB image, the foreground 2D mask image, and the background RGB image, a 3D mesh model is generated for subsequent geometric optimization and printing manufacturing. This model includes a 3D solid model of the background plate and a 3D mesh model of the foreground object. The two are then aligned in scale and position in 3D space. The workflow is as follows:
[0099] Step 31: Generate the 3D solid model of the background panel;
[0100] First, using the background RGB image as input, the depth information of the background image is generated by monocular depth estimation methods (such as AI models like DepthAnythingV2, MogeV2, DepthPro, MonoReliefV2, etc.), and the depth values are normalized so that the minimum depth value in the depth map corresponds to the Z = 0 plane in three-dimensional space, thereby establishing a unified spatial reference benchmark.
[0101] Subsequently, based on the mapping relationship between the depth map and the image pixel coordinates, each pixel is converted into vertex coordinates in three-dimensional space, and adjacent pixels are connected into triangular patches according to the image grid structure, thereby generating the triangular grid surface on the front of the background panel; this triangular grid surface can retain the slight height undulations in the background image, giving the background panel a certain three-dimensional appearance effect.
[0102] Next, the background RGB image is mapped onto the triangular mesh surface as a texture map. UV coordinate mapping is used to achieve a one-to-one correspondence between the texture and the geometric structure, so that the background surface retains the color and detail information of the original image.
[0103] Finally, in order to form a background plate structure with solid thickness, the triangular mesh surface is solidified: (1) the four boundary contours of the triangular mesh surface are extracted, including the upper boundary, lower boundary, left boundary and right boundary; (2) the four boundaries are extruded along the negative Z-axis to generate four side wall planes that are approximately perpendicular to the front of the background plate, thus forming the side structure of the background plate; (3) a triangular mesh plane parallel to the front of the background plate is constructed at the bottom of the side wall as the back of the background plate; the triangular mesh on the back is connected to the edges of the four side walls, thus forming a closed three-dimensional solid mesh model together with the front of the background plate.
[0104] Through the above operations, a complete 3D solid model of the background board is obtained. This structure can maintain the visual effect of the background image and meet the solid structure requirements of subsequent geometric splicing and 3D printing.
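The first part of Step 31 (depth normalization, per-pixel vertices, grid triangulation, and UV coordinates) can be sketched as below. The depth map is assumed to come from any monocular depth estimator as an H x W array; the function name `depth_to_front_mesh`, the `z_scale` parameter, and the choice of one unit per pixel in X and Y are assumptions. Extrusion of the side walls and the back face is omitted.

```python
import numpy as np

def depth_to_front_mesh(depth: np.ndarray, z_scale: float = 1.0):
    """Convert an H x W depth map to (vertices, faces, uv) for the front surface."""
    h, w = depth.shape
    z = (depth - depth.min()) * z_scale              # minimum depth maps to the Z = 0 plane
    v_idx, u_idx = np.mgrid[0:h, 0:w]                # pixel row (v) and column (u) indices
    # One vertex per pixel; Y is flipped so that the image top has the largest Y.
    vertices = np.stack([u_idx, (h - 1) - v_idx, z], axis=-1).reshape(-1, 3).astype(float)
    uv = np.stack([u_idx / max(w - 1, 1), 1.0 - v_idx / max(h - 1, 1)], axis=-1).reshape(-1, 2)

    idx = np.arange(h * w).reshape(h, w)
    a, b = idx[:-1, :-1].ravel(), idx[:-1, 1:].ravel()   # top-left, top-right
    c, d = idx[1:, :-1].ravel(), idx[1:, 1:].ravel()     # bottom-left, bottom-right
    # Two triangles per pixel quad, wound counter-clockwise when seen from +Z.
    faces = np.concatenate([np.stack([a, c, b], axis=1), np.stack([b, c, d], axis=1)])
    return vertices, faces, uv

depth = np.array([[2.0, 3.0, 2.5],
                  [4.0, 2.0, 3.0]])
vertices, faces, uv = depth_to_front_mesh(depth)
```

An H x W depth map yields H·W vertices and 2·(H−1)·(W−1) triangles, so for real images the depth map is usually downsampled before meshing to keep the model printable.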
[0105] Step 32: Generate the 3D mesh model of the foreground object;
[0106] After generating the 3D solid model of the background, the foreground main object is reconstructed in 3D to obtain the 3D mesh model of the foreground object; taking the foreground RGB image as input, an AI tool for generating 3D models from images (such as Tencent Hunyuan 3D, Meshy, Tripo, Luma, etc.) is called to generate the 3D mesh model of the foreground object and output it in a standard 3D mesh data format (such as OBJ, PLY, GLTF, etc.), including 3D vertex coordinates (Vertex), triangular facet topology (Face), UV texture coordinates and corresponding texture map images;
[0107] Through the above operations, a three-dimensional mesh model of the foreground object with complete geometric structure and surface texture is obtained, which will be spatially aligned and structurally blended with the background plate in the subsequent process;
[0108] Step 33: Alignment in 3D space;
[0109] Since the foreground 3D object model is generated by an independent 3D AI tool, its size and spatial position are usually inconsistent with the position of the main subject in the background image. Automatic scaling and spatial translation of the 3D object are performed to align it with the foreground masking area on the 2D projection.
[0110] In this embodiment of the invention, step 33 includes:
[0111] Step 331: Calculate the size of the two-dimensional mask;
[0112] First, a pixel scan is performed on the foreground 2D mask image to extract the boundary range of the foreground mask region in the image coordinate system, including: calculating the minimum and maximum values $U_{\min}$, $U_{\max}$, $V_{\min}$, $V_{\max}$ of the foreground pixels in the U and V directions; calculating the mask width $U_{\text{mask}} = U_{\max} - U_{\min}$ and the mask height $V_{\text{mask}} = V_{\max} - V_{\min}$; and simultaneously calculating the two-dimensional center point O of the foreground mask region;
[0113] Step 332: Calculate the dimensions of the three-dimensional object;
[0114] Iterate through the 3D coordinates of all vertices in the 3D mesh model of the foreground object and calculate its spatial extent $X_{\max}$, $X_{\min}$, $Y_{\max}$, $Y_{\min}$ in the X and Y directions, to obtain the spatial dimensions of the three-dimensional object $X_{\text{mesh}} = X_{\max} - X_{\min}$ and $Y_{\text{mesh}} = Y_{\max} - Y_{\min}$; simultaneously calculate the geometric center point C of the three-dimensional object;
[0115] Step 333: Scale and translation alignment;
[0116] To ensure that the 3D object aligns with the foreground masking area in the 2D projection, the proportional relationship between the 3D object and the masking area is calculated. First, the horizontal scaling factor $S_x = U_{\text{mask}} / X_{\text{mesh}}$ and the vertical scaling factor $S_y = V_{\text{mask}} / Y_{\text{mesh}}$ are calculated. Then, the smaller of the two values is selected as the overall scaling factor: $S = \min(S_x, S_y)$. Next, a uniform scaling operation is performed on the coordinates of all vertices of the 3D object: $(X, Y, Z) \to (S X, S Y, S Z)$. After scaling, the two-dimensional translation vector T is calculated so that the projection of the geometric center point C of the three-dimensional object onto the XY plane is aligned with the two-dimensional center point O of the foreground masking area, and the translation is applied to all vertex coordinates.
[0117] Through the above operations, the three-dimensional object is aligned with the foreground masking area in the two-dimensional projection plane, thus providing a spatial basis for subsequent fusion with the background panel;
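Steps 331 to 333 can be sketched in a few lines. The sketch below is a minimal illustration under stated assumptions: both center points are taken as bounding-box midpoints (the text does not say how O and C are computed), and the mesh X/Y axes are assumed to already point the same way as the image U/V axes (in practice the V axis may need flipping). The function name `align_to_mask` is hypothetical.

```python
import numpy as np

def align_to_mask(vertices, mask):
    """Uniformly scale and translate mesh vertices onto a 2D foreground mask."""
    vertices = np.asarray(vertices, dtype=float)
    vs, us = np.nonzero(mask)                      # rows are V, columns are U
    u_min, u_max, v_min, v_max = us.min(), us.max(), vs.min(), vs.max()
    u_mask, v_mask = u_max - u_min, v_max - v_min  # Step 331: mask size
    center_o = np.array([(u_min + u_max) / 2.0, (v_min + v_max) / 2.0])

    extent = vertices.max(axis=0) - vertices.min(axis=0)
    x_mesh, y_mesh = extent[0], extent[1]          # Step 332: mesh size

    s = min(u_mask / x_mesh, v_mask / y_mesh)      # Step 333: S = min(S_x, S_y)
    scaled = vertices * s                          # (X, Y, Z) -> (SX, SY, SZ)
    center_c = (scaled.min(axis=0) + scaled.max(axis=0))[:2] / 2.0
    t = center_o - center_c                        # 2D translation vector T
    scaled[:, :2] += t
    return scaled, s, t
```

Taking the smaller of the two factors keeps the scaled object inside the mask bounding box in both directions while preserving its aspect ratio.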
[0118] Step 334: Three-dimensional height compression;
[0119] To ensure the generated 3D refrigerator magnets maintain good structural stability during practical use (such as hanging or magnetic attachment), the height of the foreground 3D object is moderately compressed, including multiplying the Z-coordinates of all vertices of the 3D object by a compression factor α: Z'=α Z, where α∈[0.4,0.6]. This compression operation can reduce the thickness ratio of the model while maintaining the overall shape characteristics of the object, thereby improving the stability and durability of the refrigerator magnet in the printing, pasting and hanging states.
[0120] Step 4: Perform geometric optimization on the 3D mesh model to obtain a hierarchical 3D refrigerator magnet digital model.
[0121] In this embodiment of the invention, step 4 includes:
[0122] While maintaining the approximate frontal visual appearance of the foreground object, a constrained global optimization of the foreground 3D mesh is performed in the Z direction to ensure that the back of the foreground object fits tightly with the background plate and to guarantee the stability of the mesh topology and surface details. The process includes vertex visibility determination, Z-reference correction, anchor point selection, energy model establishment, and sparse linear system solution.
[0123] Let the Z-coordinates of the mesh vertices form the vector $z \in \mathbb{R}^n$, where n is the number of vertices. The geometric optimization preserves the local differential coordinates of the mesh while applying target constraints in the Z direction to several vertices. Its mathematical form is equivalent to solving the following least squares problem with soft constraints:
[0124] $$\min_{z} \; \lVert L z - \delta \rVert^2 + \sum_{k} w_k^2 \, \lVert C_k z - t_k \rVert^2;$$
[0125] Where L is the discrete Laplacian matrix based on the mesh topology, and $\delta = L z^{(0)}$ is the differential coordinate of the original Z-coordinates $z^{(0)}$; each anchor point group $G_k$ has a target depth vector $t_k$ and a weight $w_k$ (a larger weight indicates a stronger constraint), $C_k$ is the selection matrix that picks the vertices of $G_k$ out of z, and free points are treated as a constraint group with a weight close to zero;
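As an illustration of the discrete Laplacian and the differential coordinates, the sketch below builds a uniform (graph) Laplacian from the triangle list. The text does not say which weighting is used, so the uniform weighting here is an assumption (cotangent weights are a common alternative), and `uniform_laplacian` is a hypothetical helper name.

```python
import numpy as np
import scipy.sparse as sp

def uniform_laplacian(faces, n):
    """Graph Laplacian L = D - A of a triangle mesh with n vertices."""
    faces = np.asarray(faces)
    e = np.concatenate([faces[:, [0, 1]], faces[:, [1, 2]], faces[:, [2, 0]]])
    e = np.concatenate([e, e[:, ::-1]])            # make adjacency symmetric
    adj = sp.coo_matrix((np.ones(len(e)), (e[:, 0], e[:, 1])), shape=(n, n)).tocsr()
    adj.data[:] = 1.0                              # edges shared by two faces were summed
    deg = np.asarray(adj.sum(axis=1)).ravel()
    return (sp.diags(deg) - adj).tocsr()
```

With this matrix, the differential coordinates of the original depths are simply `delta = L @ z0`; adding a constant to every Z value leaves `delta` unchanged, which is why at least one anchor constraint is needed to fix the absolute depth.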
[0126] Step 41: Vertex visibility determination, i.e., front or back visibility;
[0127] Before selecting anchor points, each vertex is classified by visibility, i.e., whether it is visible from the front or the back, to ensure that anchor points only come from the visible area behind the object and reduce erroneous constraints caused by occluded areas.
[0128] (1) Pre-selection: Calculate the Z component Nz of the normal N of each vertex. If Nz is consistent with the viewing direction (e.g., Nz>0 for front view and Nz<0 for back view), then the current vertex is a candidate visible vertex.
[0129] (2) Ray occlusion detection: For each candidate vertex, a ray is generated by offsetting a small distance ε along the viewing direction (to avoid intersecting with its own face). The ray triangle intersection acceleration structure (such as the BVH accelerator of Trimesh) is used to detect whether the ray intersects with other facets of the mesh. If they do not intersect, the current vertex is marked as visible.
[0130] (3) Perform detection in two directions, from front to back and from back to front, respectively, to obtain the front visible point set and the back visible point set; the front visible point set is used for subsequent Z-reference correction; the back visible point set is used for anchor point connected component extraction;
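The two-stage visibility test can be sketched as below. This is a brute-force illustration, not the accelerated version mentioned in the text: because the rays run along the Z axis, each ray-triangle test reduces to a 2D point-in-triangle check in the XY projection plus a depth comparison. A BVH-backed ray query (for example from Trimesh) would replace the inner loop on real meshes. Function names and the `sign` parameter (+1 for front, -1 for back) are hypothetical.

```python
import numpy as np

def vertex_normals(verts, faces):
    """Area-weighted vertex normals (not normalized; only the sign of Nz is used)."""
    tri = verts[faces]
    face_n = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
    vert_n = np.zeros_like(verts)
    for k in range(3):
        np.add.at(vert_n, faces[:, k], face_n)
    return vert_n

def visible_vertices(verts, faces, sign=1, eps=1e-6):
    """Indices of vertices visible along +Z (sign=1) or -Z (sign=-1)."""
    verts = np.asarray(verts, dtype=float)
    faces = np.asarray(faces)
    # (1) Pre-selection by the sign of the normal's Z component.
    candidates = np.nonzero(sign * vertex_normals(verts, faces)[:, 2] > 0)[0]
    a, b, c = (verts[faces[:, k]] for k in range(3))
    det = (b[:, 1] - c[:, 1]) * (a[:, 0] - c[:, 0]) + (c[:, 0] - b[:, 0]) * (a[:, 1] - c[:, 1])
    keep = np.abs(det) > 1e-12                     # drop triangles edge-on to the ray
    a, b, c, det = a[keep], b[keep], c[keep], det[keep]
    visible = []
    for i in candidates:
        px, py, pz = verts[i]
        # (2) Barycentric coordinates of the vertex in each projected triangle.
        w0 = ((b[:, 1] - c[:, 1]) * (px - c[:, 0]) + (c[:, 0] - b[:, 0]) * (py - c[:, 1])) / det
        w1 = ((c[:, 1] - a[:, 1]) * (px - c[:, 0]) + (a[:, 0] - c[:, 0]) * (py - c[:, 1])) / det
        w2 = 1.0 - w0 - w1
        inside = (w0 >= 0) & (w1 >= 0) & (w2 >= 0)
        z_hit = w0 * a[:, 2] + w1 * b[:, 2] + w2 * c[:, 2]
        # Occluded if any triangle lies more than eps beyond the vertex along the ray.
        if not np.any(inside & (sign * (z_hit - pz) > eps)):
            visible.append(int(i))
    return visible
```

Calling the function once with `sign=1` and once with `sign=-1` yields the front visible point set and the back visible point set. The `eps` margin plays the role of the small ray offset, so a vertex is never occluded by its own faces.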
[0131] Step 42, Z-reference correction, i.e., initial alignment;
[0132] To ensure that the background and foreground 3D meshes have a unified Z-reference datum, initial Z-reference correction is performed:
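[0133] a. Calculate the minimum Z value over the vertices visible from the front: $z_{\min} = \min_{i \in V_{\text{front}}} z_i$, where $V_{\text{front}}$ denotes the front visible point set;
[0134] b. Subtract this minimum value from the Z-coordinates of all vertices in the entire foreground 3D mesh: $z_i \leftarrow z_i - z_{\min}$;
[0135] c. Truncate Z values that are less than zero after correction: if $z_i < 0$, then let $z_i = 0$;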
[0136] This correction places the set of visible points in front of the Z=0 plane, thus providing a reliable reference for subsequent approaching of the target towards the background plate.
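The three correction steps amount to a shift followed by a clamp. A minimal sketch, assuming the front visible point set is given as an index array (the function name is hypothetical):

```python
import numpy as np

def correct_z_reference(z, front_visible):
    """Shift Z so the lowest front-visible vertex sits at Z = 0, then clamp at 0."""
    z = np.asarray(z, dtype=float)
    z_min = z[front_visible].min()      # step a: minimum over front-visible vertices
    return np.maximum(z - z_min, 0.0)   # steps b and c: subtract, then truncate

# Vertex 2 is not front-visible and ends up clamped to the Z = 0 plane.
corrected = correct_z_reference([0.5, 1.0, 0.2, 2.0], [0, 1, 3])
```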
[0137] Step 43: Selection of strong anchor point, weak anchor point, or free point;
[0138] h. Strong anchor points;
[0139] In the set of vertices visible on the back side, construct an induced subgraph based on the vertex adjacency of the mesh (containing only vertices visible on the back side and the connecting edges between them); compute connected components on the induced subgraph and sort the connected components in descending order of the number of vertices; select the connected component with the most vertices as the candidate strong anchor point region; this region usually covers the main area where the model contacts the background plate, which can minimize unnecessary global distortion.
[0140] Within the candidate strong anchor point region, select the vertices whose Z-coordinate is less than a threshold τ (that is, the vertices closest to the background plate) as the set of strong anchor points. For each strong anchor point, locate the corresponding pixel on the background depth map using its UV or XY projection position on the mesh and read the background depth value. Use the background depth value as the absolute target depth of the current vertex. The strong anchor point weight $w_{\text{strong}}$ has a default value of 0.1, which achieves the goal of forcibly adhering to the background plate through a strong constraint;
[0141] i. Weak anchor points and free points;
[0142] The back-visible vertices that are not strong anchor points form the set of weak anchor points. For a weak anchor point, instead of locking directly to the background depth, its original Z-coordinate is multiplied by a compression coefficient β to obtain a relative target $t_i = \beta z_i$, and a weight $w_{\text{weak}}$ is assigned (used to guide but not force the fit); the target coefficient β takes a value between 0.5 and 0.7, which moves the vertex towards the background but only to a fraction of its original depth; vertices other than strong and weak anchor points are defined as free points, whose target weight $w_{\text{free}}$ takes the value 0.001, allowing a smooth transition with the neighborhood through the pull of the mesh topology;
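The strong / weak / free classification can be sketched as follows. This is an illustrative sketch under assumptions: the induced subgraph is built from triangle edges, connected components are found with a plain breadth-first search, and ties between equally large components are broken by the lowest vertex index. The function names are hypothetical, and the threshold `tau` is left as a parameter because the text gives no value for it.

```python
import numpy as np
from collections import deque

def largest_component(faces, members):
    """Largest connected component of the subgraph induced by `members`."""
    members = {int(i) for i in members}
    adj = {i: set() for i in members}
    for f in np.asarray(faces):
        for u, v in ((f[0], f[1]), (f[1], f[2]), (f[2], f[0])):
            u, v = int(u), int(v)
            if u in members and v in members:      # keep only back-visible edges
                adj[u].add(v)
                adj[v].add(u)
    best, seen = [], set()
    for start in sorted(members):
        if start in seen:
            continue
        comp, queue = [], deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            comp.append(u)
            for v in adj[u] - seen:
                seen.add(v)
                queue.append(v)
        if len(comp) > len(best):
            best = comp
    return sorted(best)

def classify_anchors(faces, back_visible, z, tau):
    """Split vertex indices into strong anchors, weak anchors, and free points."""
    region = largest_component(faces, back_visible)
    strong = [i for i in region if z[i] < tau]
    weak = sorted({int(i) for i in back_visible} - set(strong))
    free = sorted(set(range(len(z))) - {int(i) for i in back_visible})
    return strong, weak, free
```

Note that a back-visible vertex close to the plate but outside the largest component is only a weak anchor, which matches the intent of limiting hard constraints to the main contact region.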
[0143] Step 44: Construct a sparse linear system and solve it;
[0144] Stack the Laplacian equation and the anchor point soft constraints row by row to form a linear system $A z = b$;
[0145] $$A = \begin{bmatrix} L \\ w_1 C_1 \\ \vdots \\ w_K C_K \end{bmatrix}, \qquad b = \begin{bmatrix} \delta \\ w_1 t_1 \\ \vdots \\ w_K t_K \end{bmatrix};$$
[0146] Here the first block consists of the n×n discrete Laplacian matrix L and the differential coordinates δ. For each anchor group $G_k$ with weight $w_k$ and target depth vector $t_k$, a sparse matrix $C_k$ is constructed whose number of rows equals the number of vertices in the group and whose number of columns equals n, with a nonzero entry at (row, col) for each anchor vertex, where col is the index of that vertex. The constructed target vector is $w_k t_k$;
[0147] By constructing the normal equations $A^{T} A z = A^{T} b$, z is solved using a sparse direct solver (such as SciPy's spsolve), and a non-negativity truncation is applied to z: $z_i \leftarrow \max(z_i, 0)$. This prevents vertices from penetrating behind the background; finally, the new z-values are written back to the mesh vertices and the mesh is exported in formats such as OBJ / PLY / GLTF for integration with the printing pipeline.
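The stacking and solving step can be sketched with SciPy as follows. This is a sketch under assumptions rather than the patented code: each anchor group is passed as a hypothetical `(indices, targets, weight)` tuple, the weight multiplies both the constraint rows and the targets, and the Laplacian `L` is supplied by the caller.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def solve_soft_constrained_z(L, z0, groups):
    """Solve min ||L z - delta||^2 + sum_k ||w_k (C_k z - t_k)||^2, then clamp z >= 0."""
    z0 = np.asarray(z0, dtype=float)
    n = len(z0)
    blocks, rhs = [L], [L @ z0]                    # Laplacian block and delta
    for idx, target, w in groups:
        idx = np.asarray(idx, dtype=int)
        if len(idx) == 0:
            continue
        rows = np.arange(len(idx))
        # One row per anchor vertex, with the weight at (row, vertex index).
        C = sp.coo_matrix((np.full(len(idx), float(w)), (rows, idx)), shape=(len(idx), n))
        blocks.append(C)
        rhs.append(float(w) * np.asarray(target, dtype=float))
    A = sp.vstack(blocks).tocsr()
    b = np.concatenate(rhs)
    z = spsolve((A.T @ A).tocsc(), A.T @ b)        # normal equations
    return np.maximum(z, 0.0)                      # non-negativity truncation
```

When the anchor targets are consistent with a constant shift of the original depths, the solver returns exactly that shifted shape, since a constant shift leaves the differential coordinates untouched. For large meshes, solving the least squares problem directly (for example with an iterative solver) may be better conditioned than forming $A^{T} A$, but the normal equations are what the text describes.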
[0148] This invention achieves an automated method for recovering a foreground 3D object from a single image, constraining it, and fitting it to a background, using a combination of visibility-guided anchor point selection, shape preservation through differential coordinates, and a soft-constrained sparse least squares solution. Its advantages are as follows: the anchor points and their precise target depths are inferred automatically, using only the image and the monocular depth (background depth map) as references; the use of differential coordinates preserves frontal visual details and avoids significant distortion of the frontal view caused by the fitting; and the connected-component-priority selection together with the weighted hierarchical anchor point strategy balances fitting robustness with local topological stability.
[0149] The method of this invention comprehensively utilizes generative artificial intelligence image generation technology, 3D model generation technology, and mesh geometry optimization technology to automatically generate a high-fidelity 3D refrigerator magnet digital model suitable for full-color 3D printing and easy to assemble with magnetic or hanging structures from a single reference image.
[0150] The method of this invention achieves automatic modeling through the following technical process: First, the foreground and background of the input image are separated, and the background region is automatically completed; second, a solid model of the background plate and a textured 3D mesh model of the foreground are generated respectively through monocular depth estimation and image-to-mesh 3D generation technology; then, the two are scaled and spatially aligned in 3D space, and the height of the 3D model is moderately compressed to improve structural stability; subsequently, the front and back visibility of the 3D mesh vertices are analyzed, and the back vertices are divided into strong anchor points, weak anchor points, and free points according to the visibility results; finally, a sparse linear optimization algorithm with anchor point constraints is constructed to perform global geometric optimization of the vertex Z coordinate, thereby achieving a tight fit between the back of the foreground 3D model and the background plate, and maintaining the stability of the overall mesh topology.
[0151] Through the above technical solution, this invention can significantly reduce the reliance on manual modeling in the design process of 3D refrigerator magnets, and improve the automation and generation efficiency from a single image to a printable 3D model. At the same time, the 3D model generated by this method has good structural stability, ease of assembly, and visual fidelity, thus better meeting the needs of rapid design and manufacturing of personalized cultural and creative products.
[0152] The technical solution provided by this invention generates a reference image of the main 3D refrigerator magnet sculpture based on creative description information; separates the main object in the reference image from the background to obtain a foreground RGB image, a foreground two-dimensional mask image, and a background RGB image; constructs a three-dimensional mesh model based on the foreground RGB image, the foreground two-dimensional mask image, and the background RGB image; and performs geometric optimization on the three-dimensional mesh model to obtain a hierarchical 3D refrigerator magnet digital model. This method takes a single reference image as input and automatically completes foreground separation, monocular depth recovery, three-dimensional model generation, and geometric optimization, which not only maintains the front appearance details but also ensures that the back of the 3D model is stably attached to the background plate, thereby meeting the requirements of 3D printing manufacturability.
[0153] The various steps in the embodiments of the present invention can be performed by an electronic device. This electronic device includes, but is not limited to, tablet computers, portable PCs, and desktop computers.
[0154] This invention provides a computer-readable storage medium including a stored program, wherein, when the program is running, it controls the electronic device containing the computer-readable storage medium to execute the above-described embodiment of the hierarchical 3D refrigerator magnet digital model generation method based on a single image.
[0155] Figure 3 is a schematic diagram of an electronic device provided in an embodiment of the present invention. As shown in Figure 3, the electronic device 21 includes a processor 211, a memory 212, and a computer program 213 stored in the memory 212 and executable on the processor 211. When the computer program 213 is executed by the processor 211, it implements the hierarchical 3D refrigerator magnet digital model generation method based on a single image in the embodiment. To avoid repetition, it will not be described in detail here.
[0156] Electronic device 21 includes, but is not limited to, processor 211 and memory 212. Those skilled in the art will understand that Figure 3 is merely an example of electronic device 21 and does not constitute a limitation on electronic device 21. The device may include more or fewer components than shown, combine certain components, or use different components. For example, the electronic device may also include input / output devices, network access devices, buses, etc.
[0157] The processor 211 may be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general-purpose processor may be a microprocessor or any conventional processor.
[0158] The memory 212 can be an internal storage unit of the electronic device 21, such as a hard disk or RAM of the electronic device 21. The memory 212 can also be an external storage device of the electronic device 21, such as a plug-in hard disk, Smart Media Card (SMC), Secure Digital (SD) card, or FlashCard equipped on the electronic device 21. Furthermore, the memory 212 can include both internal and external storage units of the electronic device 21. The memory 212 is used to store computer programs and other programs and data required by network devices. The memory 212 can also be used to temporarily store data that has been output or will be output.
[0159] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the systems, devices, and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.
[0160] The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A method for generating hierarchical 3D refrigerator magnet digital models based on a single image, characterized in that, The method includes: Step 1: Generate a reference image for the main 3D refrigerator magnet sculpture based on the creative description information; Step 2: Separate the main object in the reference image from the background to obtain the foreground RGB image, the foreground 2D mask image, and the background RGB image; Step 3: Construct a 3D mesh model based on the foreground RGB image, the foreground 2D mask image, and the background RGB image; Step 4: Perform geometric optimization on the 3D mesh model to obtain a hierarchical 3D refrigerator magnet digital model; Step 3 includes: Based on the foreground RGB image, the foreground 2D mask image, and the background RGB image, a 3D mesh model is generated for subsequent geometric optimization and printing manufacturing. This model includes a 3D solid model of the background plate and a 3D mesh model of the foreground object. The two are then aligned in scale and position in 3D space. The workflow is as follows: Step 31: Generate the 3D solid model of the background panel; First, using the background RGB image as input, the depth information of the background image is generated through a monocular depth estimation method, and the depth values are normalized so that the minimum depth value in the depth map corresponds to the Z = 0 plane in three-dimensional space, thereby establishing a unified spatial reference benchmark. Subsequently, based on the mapping relationship between the depth map and the image pixel coordinates, each pixel is converted into vertex coordinates in three-dimensional space, and adjacent pixels are connected into triangular patches according to the image mesh structure, thereby generating the triangular mesh surface on the front of the background panel. Next, the background RGB image is mapped onto the triangular mesh surface as a texture map. 
UV coordinate mapping is used to achieve a one-to-one correspondence between the texture and the geometric structure, so that the background surface retains the color and detail information of the original image. Finally, in order to form a background plate structure with solid thickness, the triangular mesh surface is solidified: (1) the four boundary contours of the triangular mesh surface are extracted, including the upper boundary, lower boundary, left boundary and right boundary; (2) the four boundaries are extruded along the negative Z-axis to generate four side wall planes that are approximately perpendicular to the front of the background plate, thus forming the side structure of the background plate; (3) a triangular mesh plane parallel to the front of the background plate is constructed at the bottom of the side wall as the back of the background plate; the triangular mesh on the back is connected to the edges of the four side walls, thus forming a closed three-dimensional solid mesh model together with the front of the background plate. Through the above operations, a complete 3D solid model of the background board is obtained; Step 32: Generate the 3D mesh model of the foreground object; After generating the 3D solid model of the background, the foreground main object is reconstructed in 3D to obtain the 3D mesh model of the foreground object; taking the foreground RGB image as input, the AI tool for generating 3D models from images is called to generate the 3D mesh model of the foreground object, and output in a standard 3D mesh data format, including 3D vertex coordinates (Vertex), triangular facet topology (Face), UV texture coordinates and corresponding texture map images. 
Through the above operations, a three-dimensional mesh model of the foreground object with complete geometric structure and surface texture is obtained, which will be spatially aligned and structurally blended with the background plate in the subsequent process; Step 33: Alignment in 3D space; Automatically scale and translate the 3D object to align it with the foreground masking area on the 2D projection; Step 4 includes: While maintaining the approximate frontal visual appearance of the foreground object, a constrained global optimization of the foreground 3D mesh is performed in the Z direction to ensure that the back of the foreground object fits tightly with the background plate and to guarantee the stability of the mesh topology and surface details. The process includes vertex visibility determination, Z-reference correction, anchor point selection, energy model establishment, and sparse linear system solution. Let the Z-coordinates of the mesh vertices form the vector $z \in \mathbb{R}^n$, where n is the number of vertices. The geometric optimization preserves the local differential coordinates of the mesh while applying target constraints in the Z direction to several vertices. Its mathematical form is equivalent to solving the following least squares problem with soft constraints: $\min_{z} \lVert L z - \delta \rVert^2 + \sum_{k} w_k^2 \lVert C_k z - t_k \rVert^2$; Where L is the discrete Laplacian matrix based on the mesh topology, and δ is the differential coordinate of the original Z-coordinates; each anchor point group $G_k$ has a target depth vector $t_k$ and a weight $w_k$, $C_k$ is the selection matrix that picks the vertices of $G_k$ out of z, and free points are treated as a constraint group with a weight close to zero; Step 41: Vertex visibility determination, i.e., front or back visibility; Before selecting anchor points, each vertex is classified by visibility, i.e., whether it is visible from the front or the back, to ensure that anchor points only come from the visible area behind the object. (1) Pre-selection: Calculate the Z component Nz of the normal N of each vertex. 
If Nz is consistent with the viewing direction, then the current vertex is a candidate visible vertex. (2) Ray occlusion detection: For each candidate vertex, a ray is generated by offsetting a very small distance ε along the observation direction. A ray-triangle intersection acceleration structure is used to detect whether the ray intersects with other mesh faces. If they do not intersect, the current vertex is marked as visible. (3) Perform detection in two directions, from front to back and from back to front, respectively, to obtain the front visible point set and the back visible point set; the front visible point set is used for subsequent Z-reference correction; the back visible point set is used for anchor point connected component extraction; Step 42, Z-reference correction, i.e., initial alignment; To ensure that the background and foreground 3D meshes have a unified Z-reference datum, initial Z-reference correction is performed: a. Calculate the minimum Z value over the vertices visible from the front: $z_{\min} = \min_{i \in V_{\text{front}}} z_i$, where $V_{\text{front}}$ denotes the front visible point set; b. Subtract this minimum value from the Z-coordinates of all vertices in the entire foreground 3D mesh: $z_i \leftarrow z_i - z_{\min}$; c. Truncate Z values that are less than zero after correction: if $z_i < 0$, then let $z_i = 0$; Step 43: Selection of strong anchor point, weak anchor point, or free point; h. Strong anchor points; In the set of vertices visible on the back side, construct an induced subgraph based on the vertex adjacency of the mesh; compute the connected components of the induced subgraph and sort them in descending order of the number of vertices; select the connected component with the most vertices as the candidate strong anchor point region; Within the candidate strong anchor point region, select the vertices with Z-coordinates less than a threshold τ as the set of strong anchor points. 
For each strong anchor point, locate the corresponding pixel on the background depth map using its UV or XY projection position on the mesh and read the background depth value. Use the background depth value as the absolute target depth of the current vertex. The strong anchor point weight $w_{\text{strong}}$ achieves the goal of forcibly adhering to the background panel through a strong constraint; i. Weak anchor points and free points; The back-visible vertices that are not strong anchor points form the set of weak anchor points. For a weak anchor point, its original Z-coordinate is multiplied by a compression coefficient β to obtain a relative target $t_i = \beta z_i$, and a weight $w_{\text{weak}}$ is assigned; the target coefficient β takes a value between 0.5 and 0.7, which moves the vertex towards the background but only to a fraction of its original depth; vertices other than strong and weak anchor points are defined as free points, whose target weight $w_{\text{free}}$ takes the value 0.001, allowing a smooth transition with the neighborhood through the pull of the mesh topology; Step 44: Construct a sparse linear system and solve it; Stack the Laplacian equation and the anchor point soft constraints row by row to form a linear system $A z = b$; $A = \begin{bmatrix} L \\ w_1 C_1 \\ \vdots \\ w_K C_K \end{bmatrix}$, $b = \begin{bmatrix} \delta \\ w_1 t_1 \\ \vdots \\ w_K t_K \end{bmatrix}$; Here the first block consists of the n×n discrete Laplacian matrix L and the differential coordinates δ. For each anchor group $G_k$ with weight $w_k$ and target depth vector $t_k$, construct a sparse matrix $C_k$ whose number of rows equals the number of vertices in the group and whose number of columns equals n, with a nonzero entry at (row, col) for each anchor vertex, where col is the index of that vertex. The constructed target vector is $w_k t_k$; By constructing the normal equations $A^{T} A z = A^{T} b$, z is solved using a sparse direct solver, and a non-negativity truncation is applied to z: $z_i \leftarrow \max(z_i, 0)$. This prevents vertices from penetrating behind the background; finally, the resulting new z is written back to the mesh vertices and the mesh is exported.
2. The method according to claim 1, characterized in that, Step 1 includes: First, input the creative prompt word "Prompt". The prompt word contains information about the main object, style description, material characteristics and three-dimensional structure description to enhance the spatial hierarchy and volume of the generated image. When using AI-generated image tools to generate images, the following constraints should be applied to the design of the prompts: (1) The main subject is prominent, and the main subject is located in the center of the picture; (2) Frontal view, choose the frontal view; (3) Complete outline, clear subject boundary; (4) Clear lighting, the scene lighting is evenly distributed; (5) Three-dimensional expression, add keywords of 3D sculpture style and 3D rendering effect to the prompt words so that the generated image has three-dimensional morphological characteristics. Based on the above prompts, the final reference image is the main object with three-dimensional shape features. The reference image serves as the basic input for subsequent foreground-background separation and 3D modeling.
3. The method according to claim 2, characterized in that, Step 2 includes: First, the main subject of the reference image is obtained using AI image editing tools or deep learning-based image segmentation models; then, three types of intermediate data are obtained through foreground separation processing: (1) Foreground RGB image: that is, a color image containing the central main object, with its background area removed or made transparent, retaining only the main sculpture content. The foreground RGB image is used to generate the three-dimensional mesh model; (2) Foreground 2D mask image: It is a binary image with the same resolution as the original image, in which the pixel value of the foreground region is 1 and the pixel value of the background region is 0; the foreground 2D mask image is used for regional constraints and depth estimation range limits in the subsequent 3D modeling process; (3) Background RGB image: After the foreground is separated, the main area in the original image will be empty. In order to obtain a complete background image, the empty area is automatically filled by AI image editing tools. The above process uses contextual texture information to generate natural and continuous background content, thereby obtaining a complete background image.
4. The method according to claim 1, characterized in that, Step 33 includes: Step 331: Calculate the size of the two-dimensional mask; First, the foreground two-dimensional mask image is scanned pixel by pixel to extract the boundary range of the foreground mask region in the image coordinate system, including: calculating the minimum and maximum values $U_{\min}$, $U_{\max}$, $V_{\min}$, $V_{\max}$ of the foreground pixels in the U and V directions; calculating the mask width $U_{\text{mask}} = U_{\max} - U_{\min}$ and the mask height $V_{\text{mask}} = V_{\max} - V_{\min}$; and simultaneously calculating the two-dimensional center point O of the foreground mask region; Step 332: Calculate the dimensions of the three-dimensional object; Iterate through the 3D coordinates of all vertices in the 3D mesh model of the foreground object and calculate its spatial extent $X_{\max}$, $X_{\min}$, $Y_{\max}$, $Y_{\min}$ in the X and Y directions, to obtain the spatial dimensions of the three-dimensional object $X_{\text{mesh}} = X_{\max} - X_{\min}$ and $Y_{\text{mesh}} = Y_{\max} - Y_{\min}$; simultaneously calculate the geometric center point C of the three-dimensional object; Step 333: Scale and translation alignment; To ensure that the 3D object aligns with the foreground masking area in the 2D projection, the proportional relationship between the 3D object and the masking area is calculated; first, the horizontal scaling factor $S_x = U_{\text{mask}} / X_{\text{mesh}}$ and the vertical scaling factor $S_y = V_{\text{mask}} / Y_{\text{mesh}}$ are calculated; then, the smaller of the two values is selected as the overall scaling factor: $S = \min(S_x, S_y)$; next, a uniform scaling operation is performed on the coordinates of all vertices of the 3D object: $(X, Y, Z) \to (S X, S Y, S Z)$; after scaling, the two-dimensional translation vector T is calculated so that the projection of the geometric center point C of the three-dimensional object onto the XY plane is aligned with the two-dimensional center point O of the foreground masking area, and the translation is applied to all vertex coordinates. 
Through the above operations, the three-dimensional object is aligned with the foreground masking area in the two-dimensional projection plane, thus providing a spatial basis for subsequent fusion with the background panel; Step 334: Three-dimensional height compression; The height of the foreground 3D object is moderately compressed, including multiplying the Z coordinates of all vertices of the 3D object by a compression factor α: Z'=α Z, where α∈[0.4, 0.6].
5. A computer-readable storage medium, characterized in that, The computer-readable storage medium includes a stored program, wherein, when the program is executed, it controls the device on which the computer-readable storage medium is located to perform the hierarchical 3D refrigerator magnet digital model generation method based on a single image as described in any one of claims 1 to 4.
6. An electronic device, characterized in that it comprises: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory, the one or more computer programs including instructions that, when executed by the device, cause the device to perform the method for generating hierarchical 3D refrigerator magnet digital models based on a single image as described in any one of claims 1 to 4.