Systems and methods for extended PBR materials in image synthesis (EPBR)
By extending PBR materials with transparency and background channels, the technique addresses the challenge of synthesizing complex materials, offering efficient and controlled image synthesis for transparent surfaces.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- FUTUREWEI TECHNOLOGIES INC
- Filing Date
- 2026-03-20
- Publication Date
- 2026-06-25
AI Technical Summary
Existing PBR materials struggle to accurately model and synthesize complex materials like glass, windows, and polished metals due to their high-specular and transparent surfaces, while learning-based approaches lack physical consistency and computational efficiency.
Incorporate a transparency channel into material channels and a background channel into illumination channels, using screen-space ray tracing and image inpainting to model both reflection and transmission properties, enabling accurate synthesis of transparent and translucent materials.
Provides deterministic and interpretable image synthesis with precise control over material properties, reducing computational overhead and enhancing efficiency for real-time rendering applications.
Abstract
Description
SYSTEMS AND METHODS FOR EXTENDED PBR MATERIALS IN IMAGE SYNTHESIS (ePBR)
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This patent application claims priority to U.S. Provisional Application No. 63/775,693, filed on March 21, 2025, entitled “ePBR: Extended PBR Materials in Image Synthesis,” and U.S. Provisional Application No. 63/977,703, filed on February 6, 2026, entitled “SIRR-LMM: Single-image Reflection Removal via Large Multimodal Model,” which applications are hereby incorporated by reference herein as if reproduced in their entirety.
TECHNICAL FIELD
[0002] The present disclosure relates generally to computer vision and graphics, and, in particular embodiments, to systems and methods for image or video synthesis.
BACKGROUND
[0003] Realistic indoor or outdoor image synthesis can be technically challenging in computer vision and graphics. The learning-based approach is easy to use but lacks physical consistency, while traditional Physically Based Rendering (PBR) offers high realism but is computationally expensive. Intrinsic representation offers a well-balanced trade-off, decomposing images into fundamental components (intrinsic channels) such as geometry, materials, and illumination for controllable synthesis. However, existing PBR materials struggle with complex surface models, particularly high-specular and transparent surfaces.
SUMMARY
[0004] Technical advantages are generally achieved by implementations of this disclosure, which describe methods, apparatuses, and systems.
[0005] In accordance with implementations, an apparatus receives image data representing a scene. The apparatus extracts intrinsic channels from the image or video data. The intrinsic channels comprise geometry channels, material channels, and illumination channels. The material channels include a transparency channel. The illumination channels include a mirror reflectance channel and a background channel. The apparatus composites a synthesized image based on the geometry channels, the material channels, and the illumination channels.
FW 6000752PCT03
[0006] In accordance with some implementations, the transparency channel may represent light transmission properties of surfaces in the scene. The mirror reflectance channel may represent reflected scene content. The background channel may represent scene content visible through transparent surfaces in the scene.
[0007] In accordance with some implementations, the mirror reflectance channel may be generated using screen-space ray tracing based on the geometry channels.
[0008] In accordance with some implementations, the background channel may be generated using image inpainting when scene content behind a transparent surface is not directly visible.
[0009] In accordance with some implementations, to composite the synthesized image, the apparatus may compute a diffuse reflection component based on an albedo channel in the material channels and a diffuse irradiance channel in the illumination channels. The apparatus may compute a specular reflection component based on the mirror reflectance channel. The apparatus may compute a transmission component based on the transparency channel and the background channel.
[0010] In accordance with some implementations, to composite the synthesized image, the apparatus may combine the diffuse reflection component weighted by an inverse of the transparency channel, the specular reflection component, and the transmission component weighted by the transparency channel.
[0011] In accordance with some implementations, the apparatus may composite the synthesized image based on: I = (1 - T)(1 - M) Idiff + Ispec + T Itran. I denotes the synthesized image. T denotes the transparency channel. M denotes a metallic channel in the material channels. (1 - T) denotes the inverse of the transparency channel. (1 - M) denotes an inverse of the metallic channel. Idiff denotes the diffuse reflection component. Ispec denotes the specular reflection component. Itran denotes the transmission component.
[0012] In accordance with some implementations, Idiff = A·E, where A denotes the albedo channel and E denotes the diffuse irradiance channel. Ispec = (A·F0 + B)·CONV(K, Amr), where F0 denotes a Fresnel coefficient, B denotes a roughness-dependent value, K denotes a filtering kernel, Amr denotes the mirror reflectance channel, and CONV denotes a convolution operation. Itran = (A·F0 + B)·CONV(K, CONV(K, Abg))·A, where Abg denotes the background channel.
[0013] In accordance with some implementations, the transparency channel enables modeling of thin transparent surfaces in the scene.
[0014] The described technical solutions provide several technical advantages over conventional approaches. First, by introducing a transparency channel to the material channels and a background channel to the illumination channels, the disclosed techniques could extend conventional reflection-only intrinsic representations to support both reflection and transmission properties, enabling more accurate modeling and synthesis of transparent and translucent surfaces such as glass, windows, clear plastic, acrylic, ice, and mirror-like materials (e.g., metals or polished wood / marble surfaces) that conventional PBR materials struggle to handle. Further, the explicit intrinsic compositing framework provides deterministic image synthesis, offering precise and predictable control over material properties, in contrast to learning-based rendering methods that produce stochastic and less controllable outputs. Moreover, the analytical solution for image composition could operate without Monte Carlo sampling, eliminating stochastic sampling noise and reducing computational and memory overhead associated with high sample counts, making the technique more efficient and well-suited for real-time rendering applications, high-resolution image generation, and interactive material editing in applications such as virtual content creation, digital art, photo editing, and photorealistic rendering.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] For a more complete understanding of the present disclosure, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
[0016] FIG. 1 illustrates examples of highly specular and transparent objects that are very common in the real world;
[0017] FIG. 2 shows an example of thin surface assumption, according to some implementations;
[0018] FIG. 3 shows examples of three typical materials, according to some implementations;
[0019] FIG. 4 shows example roughness evaluation, according to some implementations;
[0020] FIG. 5 shows example generation of Amr from SSRT, according to some implementations;
[0021] FIGs. 6A-6D illustrate example ePBR material intrinsic evaluation, according to some implementations;
[0022] FIG. 7 shows example intrinsic channels used in the described model, according to some implementations;
[0023] FIG. 8 shows examples of image composition, according to some implementations;
[0024] FIG. 9 illustrates an example flowchart of operations performed by a device, in accordance with some implementations;
[0025] FIG. 10 illustrates an example of a computing system that may be used for implementing the devices and methods described herein, in accordance with some implementations.
[0026] Corresponding numerals and symbols in the different figures generally refer to corresponding parts unless otherwise indicated. The figures are drawn to clearly illustrate the relevant aspects of the embodiments and are not necessarily drawn to scale.
DETAILED DESCRIPTIONS
[0027] The present disclosure extends intrinsic representations to incorporate both reflection and transmission properties, enabling the synthesis of transparent materials such as glass, clear plastic, acrylic, ice, and mirror-like materials (e.g., metals or polished wood / marble surfaces). This disclosure provides an explicit intrinsic compositing framework that provides deterministic, interpretable image synthesis. With the Extended PBR Materials (ePBR), the described technical solution can effectively edit the materials with precise controls.
[0028] Image synthesis can be a fundamental challenge in computer vision and graphics, with applications ranging from content generation to realistic scene manipulation. Existing synthesis methods can be broadly classified into three main approaches, each offering different levels of control, realism, and computational efficiency.
[0029] Physically Based Rendering (PBR) generates images by simulating the physical interaction of light with scene geometry, materials, and illumination. This method provides highly realistic results by ensuring consistency with real-world physics. However, it requires a complete 3-dimensional (3D) scene representation and is computationally expensive, making it impractical for scenarios where only a single image is available or when real-time performance is needed.
[0030] Learning-based direct image generation leverages deep generative models like diffusion models to synthesize images based on learned priors. These models can efficiently generate high-quality, diverse images, making them suitable for applications like texture synthesis, artistic rendering, and creative content generation. However, since these approaches do not rely on explicit scene representations, they lack precise control over scene properties such as geometry, material consistency, and lighting, often leading to physically implausible results.
[0031] Intrinsic representation for image synthesis offers a hybrid approach that balances realism and controllability. Decomposing an image into intrinsic channels — such as geometry, materials, shading, and illumination — provides a structured representation that enables flexible image manipulation while maintaining a meaningful connection to physical properties. Unlike full PBR rendering, which requires complex simulations, intrinsic representations can synthesize images efficiently by modifying specific scene attributes and recombining them. This approach is widely adopted in real-time rendering applications, such as video games, where intrinsic channels facilitate material and lighting adjustments without expensive re-rendering.
[0032] Despite advances in intrinsic decomposition, current models often struggle to handle complex materials commonly found in real-world environments, particularly in indoor scenes. Traditional reflection models, such as Lambertian reflectance for diffuse surfaces and microfacet models for glossy surfaces, fail to capture highly specular and transparent materials like glass, mirrors, and polished metals, as shown in FIG. 1. This disclosure describes an extended intrinsic representation incorporating reflection and transmission properties to address these technical challenges, allowing for a more comprehensive material synthesis.
[0033] By leveraging these enhanced intrinsic channels, the described technical approach enables more accurate and controllable image synthesis, bridging the gap between physically based methods and artificial intelligence (AI)-driven generative models. This framework provides a structured yet efficient way of synthesizing diverse and realistic images while maintaining control over scene properties. It is well-suited for applications in virtual content creation, digital art, and photorealistic rendering. The described technique can be summarized as below.
[0034] This disclosure extends the intrinsic representation of images from reflection only to transmission, which could model transparent / translucent materials like windows or glass doors.
[0035] This disclosure provides an explicit intrinsic compositing method to render the image. Compared to diffusion-based rendering, the disclosed technique provides more accurate control over highly specular regions, and the output is deterministic. This provides efficient feedback without constraints on GPU memory or image resolution.
[0036] This disclosure focuses on intrinsic channels (X) followed by RGB↔X itself, and to verify the robustness of X, this disclosure provides an analytic solution for image composition. Replacing PBR materials with extended PBR materials may also be implemented.
[0037] Physically Based Rendering (PBR) has been a foundational approach in computer graphics, focusing on simulating the interaction of light with complex scene descriptions including geometry, materials, and illumination. Traditional rendering pipelines heavily rely on Monte Carlo (MC) light transport simulation, which, despite its accuracy, introduces stochastic noise due to limited sampling. Another key challenge in PBR is the representation of materials. A variety of real-world surfaces could be modeled by the Bidirectional Scattering Distribution Function (BSDF). One of the most influential solutions is the Disney Principled BSDF, which is largely grounded in physical principles and empirical observations. While not strictly enforcing physical accuracy, the Disney BSDF provides a practical and intuitive parameterization of plausible materials, prioritizing artistic control and ease of use over strict physical correctness. This approach has significantly influenced material systems across commercial and open-source rendering engines, including Blender’s Principled BSDF, Unreal Engine’s physically based materials, Mitsuba’s Principled and Thin Principled BSDF, etc. Its widespread adoption underscores its effectiveness in balancing realism and creative flexibility within PBR frameworks. However, real-world materials are more than a simple surface model. Due to the complexity of the materials, researchers use specific appearance models to represent them, e.g., layered objects, hair, cloth, and iridescent materials.
[0038] Recent advancements in image synthesis have explored alternative paradigms, mainly through generative models, which diverge from classical rendering methodologies. Notably, large-scale diffusion models have demonstrated remarkable success in producing highly realistic images by iteratively refining noise into structured visual content, e.g., DALL·E and Stable Diffusion. These models extend the neural denoising approach to its extreme, leveraging probabilistic image generation from pure Gaussian noise. Unlike traditional PBR, which necessitates explicit scene descriptions, diffusion-based approaches learn complex visual distributions from extensive datasets, synthesizing diverse and photorealistic imagery without explicit physical simulation. However, such models are hard to train, and generated images are hard to control. Most of the other solutions try to fine-tune the pre-trained models for various domains and conditioning. ControlNet has been widely adopted for tasks requiring precise layout control, including sketch-to-image synthesis, depth-aware image generation, and pose-conditioned rendering. IC-Light changes the illumination of the image but preserves the underlying image details and maintains intrinsic properties, such as albedos, unchanged. More works that use diffusion models for appearance and illumination manipulation are summarized in the related works of IC-Light and in surveys.
[0039] Accurately rendering realistic images from explicit scene descriptions is often computationally expensive and labor-intensive. Conversely, generating images from learned data through deep generative models offers efficiency but lacks precise control over scene structure. Intrinsic representation provides a promising middle ground by balancing accuracy and flexibility, enabling structured yet editable scene decomposition. A well-designed intrinsic representation should capture fundamental physical properties such as geometry, materials, and illumination while remaining intuitive for manipulation. In synthetic image generation, particularly in real-time rendering pipelines, intrinsic properties can be efficiently extracted from render buffers, making them readily available for downstream tasks such as relighting and material editing. These representations have also been widely integrated into modern neural rendering and inverse graphics frameworks, facilitating explicit control over scene components for high-quality synthesis and manipulation. By bridging physically based rendering with data-driven approaches, intrinsic representations play a crucial role in achieving both photorealism and user-controllable scene generation.
[0040] Intrinsic geometry representations focus on recovering structural scene information such as normal maps and depth maps. Traditional methods rely on photometric constraints, while neural approaches use large-scale datasets to predict detailed geometric properties from single images.
[0041] Intrinsic lighting representation aims to estimate scene illumination separately from geometry and reflectance. There are several representations used in different tasks, such as environment maps, spherical harmonics, and lighting fields.
[0042] Reflectance-shading is one of the most studied forms of intrinsic representation. This approach separates an image into reflectance (albedo) and shading components. Reflectance captures the inherent color of surfaces, independent of lighting, while shading encodes the interaction of light with surface geometry. Other techniques further separate the shading into diffuse shading and a specular residual.
[0043] The most common representation in recent research uses parametric microfacet Bidirectional Reflectance Distribution Functions (BRDFs) with PBR. Other than albedo, intrinsic representations also include additional material properties such as roughness, metallicity or specularity, which are also called PBR materials. This representation is widely used in single-object tasks, like 3D reconstruction and texture generation. It is also used in estimating spatially varying BRDFs for complex materials but with metallicity replaced by specularity. The representation has now been expanded to indoor image decomposition and editing, and even to videos.
[0044] Intrinsic representations play a key role in bridging physically based rendering and generative models, enabling interpretable, editable, and physically consistent scene decomposition. Their continued development contributes to advances in inverse rendering, neural rendering, and material estimation, expanding the capabilities of both traditional and learning-based graphics pipelines.
[0045] Limited types of materials with only diffuse and specular reflectance are used in indoor image manipulation. This disclosure can extend the materials to transparent materials for both light reflection and transmittance. Only a few works consider glass-like materials. Materialist supports adding transparent objects in the image. Alchemist can control the transparency of a single object. Some techniques take transparency as an intrinsic map to support fabrics better.
[0046] The rendering equation for a non-emissive surface point p is given below.
$$L(p, \omega_o) = \int_{S^2} f(p, \omega_o, \omega_i)\, L(p, \omega_i)\, |\omega_i \cdot n_p|\, d\omega_i \tag{1}$$
[0047] where $L(p, \omega_o)$ is the outgoing radiance at position p in the viewing direction $\omega_o$ (from p to the camera origin). $L(p, \omega_i)$ is the irradiance at p with the lighting direction $\omega_i$ leaving from p. The radiance at p is the integration of irradiance over all the directions of a sphere, weighted by $f(p, \omega_o, \omega_i)$, which is the BSDF at p, and $n_p$ is the surface normal at p. p may be dropped for simplification.
[0048] Disney Principled BSDF is a widely used appearance model in all kinds of rendering engines since it can support many types of materials and lighting effects, including diffuse, subsurface scattering, retroreflective, specular reflectance, clear coating, transmissive surface with refraction, and sheen compensation for retroreflective.
[0049] The BRDF used in recent research works is a simplified Disney Principled BSDF model, which can only represent a surface’s diffuse and specular reflection. In this section, this disclosure extends it to a thin-surface model that can handle both reflection and transmission,
$$f = k_d f_d + k_s f_s + k_t f_t \tag{2}$$
[0050] where $k_d$, $k_s$ and $k_t$ are the coefficients of the diffuse ($f_d$), specular reflectance ($f_s$) and specular transmittance ($f_t$) terms.
Reflection only surface
[0051] The described technique can use the Lambertian model to estimate the diffuse reflectance (a is the albedo of the surface) without considering subsurface scattering and grazing retroreflection.
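The Lambertian term that later paragraphs cite as Equation (3) is not reproduced in this text. Assuming the standard Lambertian BRDF with albedo a, it would presumably read as follows; the numbering is inferred from the later references and is an assumption.

```latex
f_d = \frac{a}{\pi} \tag{3}
```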
[0052] This disclosure could ignore the clearcoat and only use one microfacet model for specular reflectance.
[0053] $h_r$ is the half vector between $\omega_o$ and $\omega_i$, which is $h_r = (\omega_o + \omega_i) / \|\omega_o + \omega_i\|$.
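The microfacet expression that later paragraphs cite as Equation (4) is also missing from this text. A plausible reconstruction, assuming the usual Cook-Torrance form built from the D, F and G terms defined in the paragraphs that follow, is sketched below; it is not quoted from the source.

```latex
f_s(\omega_o, \omega_i) =
  \frac{D(h_r)\, F(h_r, \omega_o)\, G(\omega_o, \omega_i)}
       {4\, |n \cdot \omega_o|\, |n \cdot \omega_i|} \tag{4}
```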
[0054] Normal distribution function (D), also known as the specular distribution, describes the normal distribution of micro-facets for the surface. This disclosure may use Ground Glass Unknown (GGX) distribution which is defined as follows (r is the surface roughness).
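The GGX expression itself does not survive in this text. The sketch below assumes the common Trowbridge-Reitz form with the remapping alpha = r^2, which matches the k = r^2 / 2 convention used for the geometric term but is not confirmed by the source. The function names are illustrative.

```python
import math


def ggx_ndf(n_dot_h: float, roughness: float) -> float:
    """GGX normal distribution D(h) for a given cos(angle) between n and h.

    Assumption: alpha = roughness**2 (Disney-style remapping).
    """
    alpha = roughness * roughness
    a2 = alpha * alpha
    denom = n_dot_h * n_dot_h * (a2 - 1.0) + 1.0
    return a2 / (math.pi * denom * denom)


def projected_ndf_integral(roughness: float, steps: int = 20000) -> float:
    """Midpoint-rule estimate of the hemisphere integral of D(h) * (n . h)."""
    d_theta = 0.5 * math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        cos_t = math.cos(theta)
        total += ggx_ndf(cos_t, roughness) * cos_t * math.sin(theta) * d_theta
    return 2.0 * math.pi * total
```

A valid normal distribution projects to unit area, so `projected_ndf_integral(0.5)` should come out close to 1; lower roughness concentrates the lobe around the surface normal.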
[0055] Fresnel reflection coefficient (F) describes the amount of light that reflects from a mirror surface given its Index of Refraction (IOR) $\eta$. For the Fresnel term, this disclosure could use Schlick’s approximation, with $F_0 = (1 - \eta)^2 / (1 + \eta)^2$.
$$F(h_r, \omega_o) = F_0 + (1 - F_0)(1 - |\omega_o \cdot h_r|)^5 \tag{6}$$
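As a small worked example of Schlick's approximation, the sketch below computes F0 from an index of refraction and evaluates the Fresnel term. The function names are illustrative; the IOR of 1.5 used in the usage note is a typical value for glass and is not taken from the source.

```python
def f0_from_ior(eta: float) -> float:
    """Reflectance at normal incidence for an interface with relative IOR eta."""
    return ((1.0 - eta) / (1.0 + eta)) ** 2


def schlick_fresnel(f0: float, cos_theta: float) -> float:
    """Schlick's approximation; cos_theta is the cosine between view and half vector."""
    c = min(max(abs(cos_theta), 0.0), 1.0)
    return f0 + (1.0 - f0) * (1.0 - c) ** 5
```

For example, `f0_from_ior(1.5)` evaluates to about 0.04, the dielectric value used for the special cases of the material model, and the Fresnel term rises from F0 at normal incidence to 1 at grazing angles.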
[0056] Geometric attenuation (G) term describes the shadowing from the microfacets. This disclosure could use Smith’s method (independent of h) with Schlick approximation for it ($k = r^2 / 2$).
$$G(\omega_o, \omega_i) = \frac{|n \cdot \omega_o|}{|n \cdot \omega_o|(1 - k) + k} \cdot \frac{|n \cdot \omega_i|}{|n \cdot \omega_i|(1 - k) + k} \tag{7}$$
Transparent thin surface
[0057] For a general transparent object, like a bottle of orange juice, light refracts into the liquid and out from the other side after absorption and scattering. This process relies on the properties of two surfaces and the volumes in between.
[0058] This disclosure may only focus on thin surfaces with two parallel surfaces and zero thickness, which approximate a real transparent surface, such as windows or a glass table.
[0059] FIG. 2 shows an example of thin surface assumption, according to some implementations. Ignoring the internal reflection, light traveling through a transparent thin surface refracts twice as it enters and exits, and reflects once only on the top surface. For a smooth surface, light exits with the same direction as it enters and the offset could be ignored. With the assumption of a thin surface (e.g., shown in FIG. 2), it can be observed that light bending due to refraction approximately cancels, and the offset of incoming and outgoing light could be ignored. The specular transmission could be modeled by the microfacet distribution, the same as the specular lobe ($f_s$), but reflected to the other side. If the internal reflections between two surfaces are not modeled while only considering two roughnesses, the transmission lobe $f_t$ could be written as Equation (8) [equation illegible in the source text].
[0060] D is the Extended Normal Distribution Function (eNDF) and can be estimated by a joint spherical warping strategy. $h_t$ is the half vector between $\omega_o$ and $\omega_i$, which is $h_t = -(\omega_o + \eta\,\omega_i) / \|\omega_o + \eta\,\omega_i\|$.
The ePBR material model
[0061] This disclosure can combine the three terms in Equation (3), Equation (4) and Equation (8) and weight them to get the final BSDF,
$$f = (1 - t)(1 - m) f_d + f_s + t f_t \tag{9}$$
[0062] where m is metallic and t is transparency. Albedo (a) is shared with the incident specular response to support metallic materials. So F0 is modified accordingly, F0 = lerp(F0, a, m).
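The F0 update can be sketched as a per-channel linear interpolation. The helper names below are illustrative, and the default dielectric value of 0.04 is the one the text uses for its special cases.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation from a (t = 0) to b (t = 1)."""
    return a + (b - a) * t


def metallic_f0(albedo_rgb, metallic: float, dielectric_f0: float = 0.04):
    """Blend the dielectric F0 toward the albedo as metallic goes from 0 to 1."""
    return tuple(lerp(dielectric_f0, a, metallic) for a in albedo_rgb)
```

With `metallic = 0` every channel keeps the dielectric F0; with `metallic = 1` the specular response takes on the albedo color, which is what tints metal highlights.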
[0063] FIG. 3 illustrates examples of three typical materials, according to some implementations. From left to right, the examples are metal (t = 0, m = 1), dielectric (t = 0, m = 0) and glass (t = 1, m = 0), respectively. FIG. 3 presents some special cases, and the linear mixture of them can model all the other materials.
[0064] When t = 0 and m = 0, f = fd + fs (F0 = 0.04), which represents most dielectric materials in real life.
[0065] When t = 0 and m = 1, f = fs (F0 = a), which is a conductor / metal.
[0066] When t = 1 and m = 0, f = fs (F0 = 0.04) + ft (F0 = 0.04), which represents transparent glass.
[0067] When t = 1 and m = 1, the result is a transparent metal, which is an invalid material.
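The four corner cases can be checked with a small helper that returns the lobe weights of Equation (9). The function name and the `forbid_transparent_metal` flag are hypothetical; the flag mirrors the option, stated for some implementations, of forcing m = 0 whenever t > 0.

```python
def epbr_weights(t: float, m: float, forbid_transparent_metal: bool = True):
    """Return (k_d, k_s, k_t) = ((1 - t)(1 - m), 1, t) for the ePBR BSDF."""
    if forbid_transparent_metal and t > 0.0:
        m = 0.0  # transparent metal is treated as invalid, so drop the metallic part
    k_d = (1.0 - t) * (1.0 - m)
    return (k_d, 1.0, float(t))
```

Calling it with (t, m) = (0, 0), (0, 1) and (1, 0) reproduces the dielectric, metal and glass cases: only the dielectric keeps a diffuse lobe, and only glass has a transmission lobe.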
[0068] In some implementations, this disclosure automatically sets m = 0 if t > 0 to avoid getting the invalid material.
Screen-space image synthesis
[0069] This section describes how to use intrinsic channels X to synthesize the final image I.
[0070] First, Equation (9) could be substituted back into Equation (1), and the result could be split into three individual components (Equation (10)).
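Equation (10) is not reproduced in this text. Substituting the weighted BSDF of Equation (9) into the rendering equation gives the three-term split below, written with generic weights $k_d$, $k_s$, $k_t$; this is a reconstruction from the surrounding definitions rather than a quotation of the source.

```latex
L(\omega_o) =
    k_d \int_{S^2} f_d\, L(\omega_i)\, |\omega_i \cdot n|\, d\omega_i
  + k_s \int_{S^2} f_s\, L(\omega_i)\, |\omega_i \cdot n|\, d\omega_i
  + k_t \int_{S^2} f_t\, L(\omega_i)\, |\omega_i \cdot n|\, d\omega_i \tag{10}
```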
[0071] Here, kd = (1 - t)(1 - m), ks = 1, kt = t, and p is dropped for simplification. Energy conservation is not considered here.
[0072] There is no analytical solution for the radiance integrals. Monte Carlo simulation with importance sampling is widely used to compute the color of a shading point p seen from a pixel x.
[0073] Inspired by Unreal Engine’s Split-Sum method, and following the BRDF pre-integration factorization used in real-time denoising, each radiance integral in Equation (10) can be demodulated into two independent integrals, material (Fβ) and weighted-lighting (Lβ).
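The factorization equation is missing from this text. One form consistent with the later definition of Fs in Equation (16) and with the description of Ls as BSDF-weighted lighting is sketched below; the exact normalization used by the source is an assumption.

```latex
\int_{S^2} f_\beta\, L(\omega_i)\, |\omega_i \cdot n|\, d\omega_i = F_\beta\, L_\beta,
\qquad
F_\beta = \int_{S^2} f_\beta\, |\omega_i \cdot n|\, d\omega_i,
\qquad
L_\beta = \frac{\int_{S^2} f_\beta\, L(\omega_i)\, |\omega_i \cdot n|\, d\omega_i}
               {\int_{S^2} f_\beta\, |\omega_i \cdot n|\, d\omega_i}
```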
[0074] Here, β ∈ {d, s, t} is short for diffuse reflectance, specular reflectance and specular transmittance, respectively.
Diffuse reflectance
[0075] For the diffuse component, Fd can be calculated directly.
[0076] And its corresponding Ld is actually the diffuse irradiance,
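Assuming the Lambertian BRDF $f_d = a/\pi$ and integration over the upper hemisphere $H^2$, the direct calculation would go as follows. The cosine integral over the hemisphere equals $\pi$, so the material term reduces to the albedo, which is consistent with the diffuse layer being the product of albedo and diffuse irradiance.

```latex
F_d = \int_{H^2} \frac{a}{\pi}\, |\omega_i \cdot n|\, d\omega_i
    = \frac{a}{\pi} \cdot \pi = a
```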
[0077] which is a commonly used geometry buffer (G-buffer) from deferred rendering. It represents the amount of light reaching a shading point integrated over the upper cosine-weighted hemisphere and can be directly estimated from input image.
[0078] So the diffuse reflectance Idiff is written as below.
$$I_{diff} = A E \tag{15}$$
Specular reflectance
[0079] For the specular component, this disclosure can convert Fs into a linear function of F0.
$$F_s = \int f_s(\omega_o, \omega_i)\, |\omega_i \cdot n|\, d\omega_i = A F_0 + B \tag{16}$$
[0080] The values of A and B depend on roughness R. They could be precomputed and saved to a lookup table.
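A lookup-table entry could be precomputed along the lines of the split-sum pre-integration popularized by Unreal Engine. The sketch below is an illustration under several assumptions: alpha = r^2, k = r^2 / 2, GGX importance sampling of the half vector, and an explicit view-angle parameter `n_dot_v` carried over from the Unreal formulation (the text above only mentions a dependence on roughness). All names are hypothetical.

```python
import math


def radical_inverse_base2(i: int) -> float:
    """Van der Corput sequence: mirror the bits of i around the binary point."""
    result, f = 0.0, 0.5
    while i:
        if i & 1:
            result += f
        i >>= 1
        f *= 0.5
    return result


def preintegrate_specular(roughness: float, n_dot_v: float, samples: int = 256):
    """Estimate (A, B) such that the specular material integral is A * F0 + B."""
    alpha = roughness * roughness
    k = alpha / 2.0  # Schlick-Smith parameter, k = r^2 / 2
    # View vector in tangent space; the surface normal is +z.
    vx, vz = math.sqrt(max(0.0, 1.0 - n_dot_v * n_dot_v)), n_dot_v
    a_sum = b_sum = 0.0
    for i in range(samples):
        u1 = (i + 0.5) / samples
        u2 = radical_inverse_base2(i)
        # Importance-sample a half vector from the GGX distribution.
        cos_h = math.sqrt((1.0 - u1) / (1.0 + (alpha * alpha - 1.0) * u1))
        sin_h = math.sqrt(max(0.0, 1.0 - cos_h * cos_h))
        phi = 2.0 * math.pi * u2
        hx, hz = sin_h * math.cos(phi), cos_h
        v_dot_h = vx * hx + vz * hz
        n_dot_l = 2.0 * v_dot_h * hz - vz  # z of the reflected direction
        if n_dot_l <= 0.0 or v_dot_h <= 0.0:
            continue
        g = (n_dot_v / (n_dot_v * (1.0 - k) + k)) * (n_dot_l / (n_dot_l * (1.0 - k) + k))
        g_vis = g * v_dot_h / (hz * n_dot_v)
        fc = (1.0 - v_dot_h) ** 5  # Schlick weight, factored out of F0
        a_sum += (1.0 - fc) * g_vis
        b_sum += fc * g_vis
    return a_sum / samples, b_sum / samples
```

Tabulating the returned pair over a grid of roughness values (and view angles, if used) gives the lookup table; at render time the specular material term is then a single multiply-add with F0.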
[0081] The corresponding Ls is the integral of the incident lighting weighted by the fs |ωi · n| value. Instead of using importance sampling to convolve the environment map with the GGX distribution, this disclosure could define a normalized filtering kernel K(R, d) based on D(h) |ωi · n|. It has been verified that terms other than D have relatively little effect on the shape of the BSDF. The kernel shape varies with the roughness and the shading distance, which can create blurring effects (see FIG. 4) when applying it to the mirror-like reflection image (Amr).
$$L_s = \mathrm{CONV}(K, A_{mr}) \tag{17}$$
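The filtering step can be illustrated with a small, pure-Python convolution. The kernel construction here is an assumption made for illustration only: a pixel offset is mapped to a reflection-angle deviation via `atan(offset / distance)`, the half angle is fed to a GGX weight with alpha = r^2, and the result is normalized. The disclosed kernel may be built differently, and roughness must be greater than zero in this sketch.

```python
import math


def ggx_weight(cos_h: float, roughness: float) -> float:
    """GGX distribution value, assuming alpha = roughness**2."""
    a2 = roughness ** 4
    denom = cos_h * cos_h * (a2 - 1.0) + 1.0
    return a2 / (math.pi * denom * denom)


def make_kernel(roughness: float, distance: float, radius: int = 3):
    """Normalized (2 * radius + 1)^2 kernel; distance is in pixel units (hypothetical mapping)."""
    weights = []
    for dy in range(-radius, radius + 1):
        row = []
        for dx in range(-radius, radius + 1):
            theta_l = math.atan2(math.hypot(dx, dy), distance)
            row.append(ggx_weight(math.cos(0.5 * theta_l), roughness) * math.cos(theta_l))
        weights.append(row)
    total = sum(map(sum, weights))
    return [[w / total for w in row] for row in weights]


def conv(kernel, image):
    """CONV(K, image) for a single-channel image, clamping reads at the borders."""
    h, w = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky, krow in enumerate(kernel):
                for kx, kv in enumerate(krow):
                    sy = min(max(y + ky - r, 0), h - 1)
                    sx = min(max(x + kx - r, 0), w - 1)
                    acc += kv * image[sy][sx]
            out[y][x] = acc
    return out
```

Because the kernel sums to one, a uniformly lit reflection image is left unchanged, while a rougher surface spreads weight away from the kernel center and blurs the reflection more.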
[0082] FIG. 4 shows example roughness evaluation, according to some implementations. The drawings labelled with IMPL (implementations) show the described technique directly applying filtering kernel to the specular reflection image, according to some implementations. The drawings labelled with GT show path tracing with Monte Carlo sampling.
[0083] To get the reflection layer Amr, this disclosure could use a Screen Space Ray Tracer (SSRT) to find the reflection color for each pixel x. This disclosure could take the depth map D and first compute the viewing direction for each point p. With the normal map N, the reflection ray can be obtained. The described technique could then trace the ray in the screen space to find the corresponding color and the distance (see FIG. 5).
$$A_{mr} = \mathrm{SSRT}(D, N) \tag{18}$$
[0084] Amr may have holes, since the reflection ray may hit the back face of an object in the scene or leave the screen space. Image inpainting could be applied here if needed.
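The tracing step can be sketched with a toy ray marcher over a depth buffer. This is a simplified illustration, not the disclosed tracer: it assumes an orthographic-style setup in which the ray advances by a fixed step in (pixel x, pixel y, depth) units, and it does not test surface thickness, so back-face hits are not detected. All names are hypothetical.

```python
import math


def reflect(d, n):
    """Mirror direction r = d - 2 (d . n) n for a unit normal n."""
    dot = sum(di * ni for di, ni in zip(d, n))
    return tuple(di - 2.0 * dot * ni for di, ni in zip(d, n))


def trace_screen_space(depth, start, direction, max_steps=64):
    """March a ray through a depth buffer.

    depth[y][x] is the per-pixel depth (larger means farther from the camera).
    start is (x, y, z) and direction is the per-step increment, both in
    (pixel, pixel, depth) units. Returns (px, py, travelled) on a hit, or
    None when the ray leaves the screen or runs out of steps (a hole).
    """
    h, w = len(depth), len(depth[0])
    x, y, z = start
    dx, dy, dz = direction
    step_len = math.sqrt(dx * dx + dy * dy + dz * dz)
    for i in range(1, max_steps + 1):
        x, y, z = x + dx, y + dy, z + dz
        px, py = int(round(x)), int(round(y))
        if not (0 <= px < w and 0 <= py < h):
            return None
        if z >= depth[py][px]:
            return px, py, i * step_len
    return None
```

On a hit, the reflection color would be read from the input image at `(px, py)`, and the travelled distance can serve as the shading distance for the filtering kernel; `None` results mark the holes that inpainting may fill.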
[0085] FIG. 5 shows example generation of Amr from SSRT, according to some implementations. To get a mirror-like reflection image for a region of interest, SSRT could be used to trace the ray and find the corresponding color.
[0086] At the end, the specular reflectance Ispec can be written as below.
$$I_{spec} = (A F_0 + B)\, \mathrm{CONV}(K, A_{mr}) \tag{19}$$
Specular transmittance
[0087] The way to compute transmittance (Ft) is similar to Fs. The main difference is that the kernel K(R, d) = D(h) |ωi · n| that this disclosure applies to the background image (Abg) has a wider distribution, since the light will go through both top and bottom surfaces. In general, it may be assumed that the two surfaces have the same roughness, so instead of computing the accurate K, this disclosure could apply it twice.
$$L_t = \mathrm{CONV}(K, \mathrm{CONV}(K, A_{bg})) \tag{20}$$
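Why applying the kernel twice widens the blur can be seen in one dimension. Convolution is associative, so filtering twice with K equals filtering once with K convolved with itself, and for a normalized symmetric kernel the variance of that combined kernel is twice the original. The 3-tap kernel below is an arbitrary example, not the disclosed kernel.

```python
def convolve_1d(a, b):
    """Full discrete convolution of two 1D kernels."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out


def variance(kernel):
    """Spread of a normalized kernel around its center tap, in squared pixels."""
    center = (len(kernel) - 1) / 2
    return sum(w * (i - center) ** 2 for i, w in enumerate(kernel))


k = [0.25, 0.5, 0.25]   # example normalized kernel
kk = convolve_1d(k, k)  # effective kernel after two passes
```

Here `kk` is `[0.0625, 0.25, 0.375, 0.25, 0.0625]`: it still sums to one, but its variance is 1.0 compared with 0.5 for a single pass.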
[0088] To add a transparent surface in a scene, the described technique can get Abg, but if an opaque surface is to be changed to transparent, image inpainting can also be utilized here to estimate what is behind the surface.
[0089] So, Itran can be obtained based on Equation (21) below.
$$I_{tran} = (A F_0 + B) \cdot \mathrm{CONV}(K, \mathrm{CONV}(K, A_{bg})) \cdot A \tag{21}$$
[0090] The surface albedo (A) is multiplied here to approximate the absorption after light goes through the thin surface.
[0091] Finally, the three layers can be combined to get the final image, as shown in Equation (22).
$$I = (1 - T)(1 - M) I_{diff} + I_{spec} + T I_{tran} \tag{22}$$
Results
Intrinsic representation for ePBR materials
[0092] This section summarizes the intrinsic channels X used in the described model (see Table 1 in FIG. 7 and FIG. 8 for examples). FIG. 7 shows example intrinsic channels used in the described model, according to some implementations. The geometry channels may include Normal (N) and / or Depth (D) channels. The material channels may include Albedo (A), Roughness (R), Metallic (M), and / or Transparency (T) channels. The lighting channels (illumination channels) may include diffuse irradiance (E), mirror reflectance (Amr), and / or background color (Abg) channels. FIG. 8 shows examples of image composition, according to some implementations. The described implementation technique (IMPL) decomposes reference images into intrinsic channels and then recomposes them. Compared to RGB↔X, the results from the described technique perform better in the highly specular regions (marked with dotted line boxes).
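How these channels feed the final composite can be sketched per pixel. The sketch follows Equations (15), (19), (21) and (22) as written, on a single color channel, and assumes the lighting has already been filtered. The class and parameter names are hypothetical; `scale` and `bias` stand for the roughness-dependent lookup values written as A and B in Equation (16), renamed here to avoid a clash with the albedo.

```python
from dataclasses import dataclass


@dataclass
class EpbrPixel:
    albedo: float        # A
    metallic: float      # M
    transparency: float  # T
    irradiance: float    # E
    spec_light: float    # CONV(K, Amr): filtered mirror reflectance
    trans_light: float   # CONV(K, CONV(K, Abg)): twice-filtered background


def composite(px: EpbrPixel, scale: float, bias: float, dielectric_f0: float = 0.04) -> float:
    """Combine the diffuse, specular and transmission layers for one pixel."""
    # F0 = lerp(F0, a, m): blend toward the albedo for metals.
    f0 = dielectric_f0 + (px.albedo - dielectric_f0) * px.metallic
    spec_factor = scale * f0 + bias
    i_diff = px.albedo * px.irradiance                 # Equation (15)
    i_spec = spec_factor * px.spec_light               # Equation (19)
    i_tran = spec_factor * px.trans_light * px.albedo  # Equation (21)
    t, m = px.transparency, px.metallic
    return (1.0 - t) * (1.0 - m) * i_diff + i_spec + t * i_tran  # Equation (22)
```

Roughness does not appear as a field because it acts through `scale`, `bias` and the filtering kernel, and the geometry channels are consumed earlier by the screen-space tracing. Editing a single field, such as `transparency`, and recompositing is the kind of deterministic material edit the framework is meant to support.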
[0093] To be chosen as an intrinsic channel, some principles may be followed. The X could have the exact resolution of the image I; each X could have its unique physical meaning, which enables the user to precisely edit the image; the value of X could be uniformly distributed. A counterexample that may cause issues: an intrinsic value ranges in [0, 1], but the appearance changes dramatically when the value changes from 0 to 0.1 and barely changes when the value is greater than 0.1.
[0094] A 3-channel image may be used to save PBR material. The Red channel refers to roughness, and the Green channel refers to metallic, while the Blue channel could be always 0, according to some implementations. The described ePBR model may store the transparency map in the Blue channel without additional memory cost.
Intrinsic evaluation
[0095] This section demonstrates how the final appearance is altered when the material intrinsics change. FIGs. 6A-6D illustrate example ePBR material intrinsic evaluation, according to some implementations. FIG. 6A shows only an opaque surface, and as M increases, the reflectance can tint from light color to metal color. FIG. 6B shows that roughness R influences how blurry both reflection and transmission appear. FIG. 6C shows that transparency T controls how clearly the background can be seen. FIG. 6D shows that albedo A indicates the color of the thin slab, which also influences the background color. In FIGs. 6A-6D, the described technique uses a flat surface illuminated by an environment map to avoid light interaction between different surfaces. With a red albedo, the difference can be seen more clearly.
[0096] With respect to metallic M, with the increase in metallicity, the material transitions from non-metal to metal. The diffuse layer gradually disappears, and the highlights tint from light to metal color. Metallic changes may only happen on an opaque surface. In other words, the metallic value could always be zero for a transparent or translucent surface. The described technique could synthesize a surface that looks like transparent metal, but it may be an invalid material in the real world.
[0097] With respect to roughness R, for a metal surface, it can be more accurate to use separate roughness values and direction for the anisotropy effect. For a transparent surface like glass, two roughness values can be used for the top and bottom surfaces. However, for simplicity and consistency with other materials, the described technique may only use a single roughness for all different cases, according to some implementations. As shown in FIGs. 6A-6D, roughness controls the blurriness of both reflection and transmission.
[0098] With respect to transparency T, the transparency controls how much light can pass through the thin slab, which is the opposite of the slab density. As the density increases (transparency decreases), more light is scattered, making the background harder to see. The reflection on the top surface will remain unchanged.
[0099] With respect to albedo A, the albedo indicates non-absorbed light. Transparency becomes ineffective with A = (0, 0, 0), since no light can scatter or go through the surface. And if A = (1, 1, 1), it represents pure glass with no energy loss.
Image composition
[0100] FIG. 8 can validate that the described image composition method is more faithful to the input intrinsic channels compared to the diffusion-based method RGB↔X. The tested examples are from the InteriorVerse dataset, which is not contained in the training set of RGB↔X. Some of the intrinsic channels (N, D, A, R, M) are taken directly from the ground truth. E is estimated using RGB↔X. T is the inverse of the ground truth mask, since the non-masked areas are glass. A_{mr} is generated using SSRT, and out-of-space colors are set to gray.
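The two preprocessing steps mentioned above (inverting the mask to obtain T, and filling mirror-reflectance pixels that have no valid color with gray) can be sketched as follows. The function names, the boolean hit map marking pixels where a reflected color was found, and the gray level of 0.5 are assumptions of this sketch; the disclosure does not specify them.

```python
import numpy as np


def transparency_from_mask(mask):
    """T = 1 - mask, where mask is 1 on non-glass pixels and 0 on glass."""
    return 1.0 - np.asarray(mask, dtype=np.float32)


def fill_missing_reflection(mirror, hit, gray=0.5):
    """Set mirror-reflectance pixels without a valid reflected color to gray.

    mirror: (H, W, 3) reflected colors; hit: (H, W) boolean map (hypothetical)
    that is True where a reflected color was found. gray=0.5 is an assumption.
    """
    hit = np.asarray(hit, dtype=bool)
    return np.where(hit[..., None], mirror, np.float32(gray))
```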
[0101] The results shown in this disclosure match the path-traced reference well regarding high specular regions.
[0102] This disclosure extends intrinsic representations to incorporate both reflection and transmission properties, enabling the synthesis of transparent and translucent materials such as glass and windows. By introducing an explicit intrinsic compositing framework, the described technique achieves deterministic and interpretable image synthesis, offering precise control over material properties. Compared to diffusion-based rendering methods, the described technical solution provides an efficient and memory-friendly solution, making it well-suited for real-time applications and high-resolution image generation. Furthermore, the integration of extended PBR materials allows for flexible material editing while maintaining physical plausibility. The described technique can further improve downstream applications and allow for improved control over low-level properties of objects.
[0103] FIG. 9 illustrates an example of a flowchart of a method 900 performed by an apparatus, in accordance with some implementations. The apparatus may include computer-readable code or instructions executing on one or more processors of the device. Coding of the software for carrying out or performing the method 900 is well within the scope of a person of ordinary skill in the art having regard to the present disclosure. The method 900 may include additional or fewer operations than those shown and described and may be carried out or performed in a different order. Computer-readable code or instructions of the software executable by the one or more processors may be stored on at least one non-transitory computer-readable medium, such as, for example, at least one memory of the apparatus. In some embodiments, the method 900 may be performed by one or more units or modules (e.g., an integrated circuit) of the apparatus, such as field programmable gate arrays (FPGAs) or application-specific integrated circuits (ASICs).
[0104] The method 900 starts at the operation 902, where the apparatus receives image data representing a scene. At the operation 904, the apparatus extracts intrinsic channels from the image data. The intrinsic channels comprise geometry channels, material channels, and illumination channels. The material channels include a transparency channel. The illumination channels include a mirror reflectance channel and a background channel. At the operation 906, the apparatus composites a synthesized image based on the geometry channels, the material channels, and the illumination channels.
[0105] In accordance with some implementations, the transparency channel may represent light transmission properties of surfaces in the scene. The mirror reflectance channel may represent reflected scene content. The background channel may represent scene content visible through transparent surfaces in the scene.
[0106] In accordance with some implementations, the mirror reflectance channel may be generated using screen-space ray tracing based on the geometry channels.
[0107] In accordance with some implementations, the background channel may be generated using image inpainting when scene content behind a transparent surface is not directly visible.
[0108] In accordance with some implementations, to composite the synthesized image, the apparatus may compute a diffuse reflection component based on an albedo channel in the material channels and a diffuse irradiance channel in the illumination channels. The apparatus may compute a specular reflection component based on the mirror reflectance channel. The apparatus may compute a transmission component based on the transparency channel and the background channel.
[0109] In accordance with some implementations, to composite the synthesized image, the apparatus may combine the diffuse reflection component weighted by an inverse of the transparency channel, the specular reflection component, and the transmission component weighted by the transparency channel.
[0110] In accordance with some implementations, the apparatus may composite the synthesized image based on: I = (1 - T)(1 - M) I_{diff} + I_{spec} + T I_{tran}. I denotes the synthesized image. T denotes the transparency channel. M denotes a metallic channel in the material channels. (1 - T) denotes the inverse of the transparency channel. (1 - M) denotes an inverse of the metallic channel. I_{diff} denotes the diffuse reflection component. I_{spec} denotes the specular reflection component. I_{tran} denotes the transmission component.
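The combination in paragraph [0110] can be sketched per pixel as follows. This is a minimal illustration, assuming the three components are already available as (H, W, 3) arrays and that T and M are (H, W) maps in [0, 1]; the function name composite and the array layout are assumptions of this sketch.

```python
import numpy as np


def composite(i_diff, i_spec, i_tran, transparency, metallic):
    """Evaluate I = (1 - T)(1 - M) I_diff + I_spec + T I_tran per pixel.

    i_diff, i_spec, i_tran: (H, W, 3) arrays (assumed layout).
    transparency, metallic: (H, W) maps, broadcast over the color axis.
    """
    t = np.asarray(transparency, dtype=np.float32)[..., None]  # (H, W, 1)
    m = np.asarray(metallic, dtype=np.float32)[..., None]      # (H, W, 1)
    return (1.0 - t) * (1.0 - m) * i_diff + i_spec + t * i_tran
```

Three limiting cases follow directly from the formula: with T = 0 and M = 0 the result is I_diff + I_spec (an opaque non-metal), with T = 0 and M = 1 only I_spec remains (an opaque metal), and with T = 1 the diffuse term vanishes and the result is I_spec + I_tran.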
[0111] In accordance with some implementations, I_{diff} = A·E, where A denotes the albedo channel and E denotes the diffuse irradiance channel. I_{spec} = (A·F_0 + B)·CONV(K, A_{mr}), where F_0 denotes a Fresnel coefficient, B denotes a roughness-dependent value, K denotes a filtering kernel, A_{mr} denotes the mirror reflectance channel, and CONV denotes a convolution operation. I_{tran} = (A·F_0 + B)·CONV(K, CONV(K, A_{bg}))·A, where A_{bg} denotes the background channel.
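The three components in paragraph [0111] can be sketched as follows. This is only an illustrative reading of the formulas as literally stated: the A in (A·F0 + B) is read as the albedo channel because this paragraph defines no other A, F0 and B are taken as given inputs (scalars or arrays that broadcast to (H, W, 3)), and a single 2-D kernel K is applied to the whole image with edge padding. The function names conv and epbr_layers, the padding mode, and the use of one global kernel are assumptions of this sketch.

```python
import numpy as np


def conv(kernel, image):
    """Convolve each color channel of image (H, W, 3) with a 2-D kernel.

    The kernel is assumed to have odd height and width; borders use edge
    padding (an assumption of this sketch).
    """
    image = np.asarray(image, dtype=np.float32)
    kernel = np.asarray(kernel, dtype=np.float32)
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw), (0, 0)), mode="edge")
    h, w = image.shape[:2]
    flipped = kernel[::-1, ::-1]  # flip so this is convolution, not correlation
    out = np.zeros_like(image)
    for dy in range(kh):
        for dx in range(kw):
            out += flipped[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def epbr_layers(albedo, irradiance, mirror, background, f0, b, kernel):
    """Return (I_diff, I_spec, I_tran) following the formulas as written."""
    scale = albedo * f0 + b                     # (A * F0 + B), A read as albedo
    i_diff = albedo * irradiance                # I_diff = A * E
    i_spec = scale * conv(kernel, mirror)       # I_spec = scale * CONV(K, A_mr)
    # I_tran filters the background twice, then multiplies by A (absorption)
    i_tran = scale * conv(kernel, conv(kernel, background)) * albedo
    return i_diff, i_spec, i_tran
```

Applying the kernel twice to the background makes the transmitted content blurrier than the reflected content for the same kernel.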
[0112] In accordance with some implementations, the transparency channel enables modeling of thin transparent surfaces in the scene.
[0113] FIG. 10 is a block diagram of a computing system 1000 that may be used for implementing the devices and methods disclosed herein. For example, the computing system can be any entity of UE, access network (AN), mobility management (MM), session management (SM), user plane gateway (UPGW), or access stratum (AS). Specific devices may utilize all of the components shown or only a subset of the components, and levels of integration may vary from device to device. Furthermore, a device may contain multiple instances of a component, such as multiple processing units, processors, memories, transmitters, receivers, etc. The computing system 1000 includes a processing unit 1002. The processing unit includes a central processing unit (CPU) 1014, memory 1008, and may further include a mass storage device 1004, a video adapter 1010, and an I / O interface 1012 connected to a bus 1020.
[0114] The bus 1020 may be one or more of any type of several bus architectures including a memory bus or memory controller, a peripheral bus, or a video bus. The CPU 1014 may comprise any type of electronic data processor. The memory 1008 may comprise any type of non-transitory system memory such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous DRAM (SDRAM), read-only memory (ROM), or a combination thereof. In an embodiment, the memory 1008 may include ROM for use at boot-up, and DRAM for program and data storage for use while executing programs.
[0115] The mass storage 1004 may comprise any type of non-transitory storage device configured to store data, programs, and other information and to make the data, programs, and other information accessible via the bus 1020. The mass storage 1004 may comprise, for example, one or more of a solid state drive, hard disk drive, a magnetic disk drive, or an optical disk drive.
[0116] The video adapter 1010 and the I/O interface 1012 provide interfaces to couple external input and output devices to the processing unit 1002. As illustrated, examples of input and output devices include a display 1018 coupled to the video adapter 1010 and a mouse, keyboard, or printer 1016 coupled to the I/O interface 1012. Other devices may be coupled to the processing unit 1002, and additional or fewer interface cards may be utilized. For example, a serial interface such as Universal Serial Bus (USB) (not shown) may be used to provide an interface for an external device.
[0117] The processing unit 1002 also includes one or more network interfaces 1006, which may comprise wired links, such as an Ethernet cable, or wireless links to access nodes or different networks. The network interfaces 1006 allow the processing unit 1002 to communicate with remote units via the networks. For example, the network interfaces 1006 may provide wireless communication via one or more transmitters / transmit antennas and one or more receivers / receive antennas. In an embodiment, the processing unit 1002 is coupled to a local-area network 1022 or a wide-area network for data processing and communications with remote devices, such as other processing units, the Internet, or remote storage facilities.
[0118] It should be appreciated that one or more steps of the embodiment methods provided herein may be performed by corresponding units or modules. For example, image data may be received by a receiving unit or a receiving module. The intrinsic channels may be extracted from the image data by an extracting unit or an extracting module. A compositing unit or a compositing module may composite the synthesized image. The respective units or modules may be hardware, software, or a combination thereof. For instance, one or more of the units or modules may be an integrated circuit, such as field programmable gate arrays (FPGAs) or application-specific integrated circuits (ASICs).
[0119] Although the description has been described in detail, it should be understood that various changes, substitutions and alterations can be made without departing from the spirit and scope of this disclosure as defined by the appended claims. Moreover, the scope of the disclosure is not intended to be limited to the particular embodiments described herein, as one of ordinary skill in the art will readily appreciate from this disclosure that processes, machines, manufacture, compositions of matter, means, methods, or steps, presently existing or later to be developed, may perform substantially the same function or achieve substantially the same result as the corresponding embodiments described herein. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or steps.
Claims
WHAT IS CLAIMED IS:
1. A method comprising:
receiving, by an apparatus, image data representing a scene;
extracting, by the apparatus, intrinsic channels from the image data, wherein the intrinsic channels comprise geometry channels, material channels including a transparency channel, and illumination channels including a mirror reflectance channel and a background channel; and
compositing, by the apparatus, a synthesized image based on the geometry channels, the material channels, and the illumination channels.
2. The method of claim 1, wherein the transparency channel represents light transmission properties of surfaces in the scene, the mirror reflectance channel represents reflected scene content, and the background channel represents scene content visible through transparent surfaces in the scene.
3. The method of any of claims 1-2, wherein the mirror reflectance channel is generated using screen-space raytracing based on the geometry channels.
4. The method of any of claims 1-3, wherein the background channel is generated using image inpainting when scene content behind a transparent surface is not directly visible.
5. The method of any of claims 1-4, the compositing the synthesized image comprising:
computing a diffuse reflection component based on an albedo channel in the material channels and a diffuse irradiance channel in the illumination channels,
computing a specular reflection component based on the mirror reflectance channel, and
computing a transmission component based on the transparency channel and the background channel.
6. The method of claim 5, the compositing the synthesized image further comprising:
combining the diffuse reflection component weighted by an inverse of the transparency channel, the specular reflection component, and the transmission component weighted by the transparency channel.
7. The method of claim 6, wherein the compositing the synthesized image is based on:
I = (1 - T)(1 - M) I_{diff} + I_{spec} + T I_{tran},
wherein:
I denotes the synthesized image,
T denotes the transparency channel,
M denotes a metallic channel in the material channels,
(1 - T) denotes the inverse of the transparency channel,
(1 - M) denotes an inverse of the metallic channel,
I_{diff} denotes the diffuse reflection component,
I_{spec} denotes the specular reflection component, and
I_{tran} denotes the transmission component.
8. The method of claim 7,
wherein I_{diff} = A·E, A denoting the albedo channel, E denoting the diffuse irradiance channel,
wherein I_{spec} = (A·F_0 + B)·CONV(K, A_{mr}), F_0 denoting a Fresnel coefficient, B denoting a roughness-dependent value, K denoting a filtering kernel, A_{mr} denoting the mirror reflectance channel, CONV denoting a convolution operation, and
wherein I_{tran} = (A·F_0 + B)·CONV(K, CONV(K, A_{bg}))·A, A_{bg} denoting the background channel.
9. The method of any of claims 1-8, wherein the transparency channel enables modeling of thin transparent surfaces of the scene.
10. An apparatus comprising:
at least one processor; and
a non-transitory computer readable storage medium storing programming, the programming including instructions that, when executed by the at least one processor, cause the apparatus to perform a method according to any of claims 1-9.
11. A non-transitory computer-readable medium having instructions stored thereon that, when executed by an apparatus, cause the apparatus to perform a method according to any of claims 1-9.