Method and apparatus for generating a transparency field
By constructing multidimensional boundary proxy data to generate transparency fields, the problem of boundary information degradation in existing keying methods is solved, and higher quality and more stable transparency field generation is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HUNAN HAPPLY SUNSHINE INTERACTIVE ENTERTAINMENT MEDIA CO LTD
- Filing Date
- 2026-05-22
- Publication Date
- 2026-06-19

Figure CN122244196A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of image processing technology, and in particular to a method and apparatus for generating a transparency field.
Background Technology
[0002] Most existing keying methods rely on color or brightness information to determine boundaries. This boundary information degrades easily when the subject color is close to the background color, when strong reflections occur, or when compression distortion is present. Such methods also cannot distinguish "real foreground boundaries" from "false boundaries caused by lighting or reflection," which results in abnormal expansion of the foreground area or residual white edges. In scenes with fast movement, limb waving, or camera transitions, foreground edges are prone to cut-off, ghosting, and abrupt changes in transparency, so the final generated transparency field is of poor quality.
Summary of the Invention
[0003] In view of this, the present invention provides a method and apparatus for generating a transparency field, which significantly improves the quality and dynamic stability of the transparency field.
[0004] The first aspect of this invention provides a method for generating a transparency field, comprising:
[0005] Boundary proxy data is constructed based on target pixels in the image; wherein, the boundary proxy data includes: local color gradient field, brightness reflection field, micronormal perturbation field and motion boundary prediction field;
[0006] Based on the boundary proxy data of each dimension, a transparency candidate value of the target pixel corresponding to that dimension is generated;
[0007] Based on the transparency candidate values of the target pixel corresponding to all the dimensions, a transparency estimate of the target pixel is generated;
[0008] A transparency field is generated based on the transparency estimates of all the target pixels.
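The four steps above can be pictured with a minimal Python sketch. The text does not specify at this point how the per-dimension candidate values are produced or fused, so the normalized weighted average, the clamping to [0, 1], and all names (`DIMENSIONS`, `fuse_candidates`, `build_transparency_field`) are illustrative assumptions rather than the claimed method.

```python
from typing import Dict, Tuple

Pixel = Tuple[int, int]

# Hypothetical keys, one per boundary proxy field (dimension).
DIMENSIONS = ("color_gradient", "brightness_reflection",
              "micro_normal", "motion_boundary")


def fuse_candidates(candidates: Dict[str, float],
                    weights: Dict[str, float]) -> float:
    """Fuse the per-dimension candidate values of one pixel.

    Assumption: a normalized weighted average, clamped to [0, 1].
    """
    total = sum(weights[d] for d in DIMENSIONS)
    value = sum(weights[d] * candidates[d] for d in DIMENSIONS) / total
    return min(1.0, max(0.0, value))


def build_transparency_field(
        per_pixel_candidates: Dict[Pixel, Dict[str, float]],
        weights: Dict[str, float]) -> Dict[Pixel, float]:
    """Collect the fused estimate of every target pixel into a field."""
    return {p: fuse_candidates(c, weights)
            for p, c in per_pixel_candidates.items()}
```

With equal weights and candidate values 0.2, 0.4, 0.6, and 0.8, this sketch yields an estimate of about 0.5 for that pixel.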
[0009] Optionally, constructing boundary proxy data based on target pixels in the image includes:
[0010] A local color gradient field is constructed based on the local intensity changes of the color components of the target pixel in the horizontal direction and the local intensity changes in the vertical direction.
[0011] A brightness reflection field is constructed based on the diffuse reflection and specular reflection components of the target pixel;
[0012] Based on the local intensity changes of the structural components of the target pixel in the horizontal direction and the local intensity changes in the vertical direction, a micro-normal perturbation field is constructed.
[0013] Based on the position information of the target pixel and the optical flow displacement of the target pixel between adjacent frames, a motion boundary prediction field is constructed.
[0014] Optionally, the method for generating the transparency field further includes:
[0015] Based on the local color gradient field and the average gradient of the target pixel's neighborhood, it is determined whether the target pixel's location is an abnormally compressed region; wherein, the local color gradient field includes the local gradient and the gradient direction;
[0016] If the target pixel location is an abnormally compressed region, then the local gradient is repaired for continuity to obtain the recovered gradient, and the gradient direction is constrained for continuity to obtain the corrected gradient direction.
[0017] Based on the recovered gradient and the corrected gradient direction, the corrected local color gradient field is obtained.
[0018] Optionally, the method for generating the transparency field further includes:
[0019] Based on the brightness gradient of the target pixel, it is determined whether the target pixel is a brightness anomaly point; wherein, the brightness gradient is determined based on the brightness component and the average brightness component of the neighboring pixels;
[0020] If the target pixel is a brightness anomalous point, then the brightness reflection field is corrected based on the brightness of the reflection field to obtain a brightness-corrected brightness reflection field; wherein, the brightness of the reflection field is determined based on the brightness gradient of the target pixel, the diffuse reflection component and the specular reflection component of the target pixel;
[0021] The brightness-corrected brightness reflection field is then processed to become a continuous field, resulting in a corrected brightness reflection field.
[0022] Optionally, the method for generating the transparency field further includes:
[0023] Taking the target pixel as the center, the consistency evaluation is performed on the orientation difference of all neighboring pixels to obtain the consistency evaluation result;
[0024] If the consistency evaluation result indicates that there is directional perturbation at the target pixel position, then the consistency of the neighborhood direction is corrected based on the edge direction of the target pixel to obtain the corrected edge direction; wherein, the edge direction of the target pixel is determined based on the local intensity change in the horizontal direction and the local intensity change in the vertical direction of the structural features of the target pixel.
[0025] Optionally, the method for generating the transparency field further includes:
[0026] Based on the optical flow displacement of the target pixel between adjacent frames, the candidate positions of the motion boundary are determined;
[0027] The position of the first target motion boundary is determined based on the previous frame boundary position and optical flow displacement of the candidate motion boundary position;
[0028] Based on the first target motion boundary position and the motion boundary candidate position, determine whether the motion boundary candidate position is an abnormal boundary region;
[0029] If the candidate position of the motion boundary is an abnormal boundary region, the motion boundary position is reconstructed based on the first target motion boundary position and the second target motion boundary position; wherein, the second target motion boundary position is determined based on the next frame boundary position and optical flow displacement of the candidate position of the motion boundary.
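The motion boundary check and reconstruction described above can be sketched as follows. This is only an illustration: the text does not give the exact formulas, so the additive flow model, the Euclidean distance test, the backward sign used for the next frame, the midpoint reconstruction, and every function name are assumptions.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def predict_from_previous(prev_boundary: Point, flow: Point) -> Point:
    """First target position: previous-frame boundary moved along the flow."""
    return (prev_boundary[0] + flow[0], prev_boundary[1] + flow[1])


def predict_from_next(next_boundary: Point, flow: Point) -> Point:
    """Second target position: next-frame boundary moved back along the flow
    (the sign convention is an assumption)."""
    return (next_boundary[0] - flow[0], next_boundary[1] - flow[1])


def is_abnormal(candidate: Point, first_target: Point,
                threshold: float) -> bool:
    """Flag the candidate when it lies too far from the first target."""
    distance = math.hypot(candidate[0] - first_target[0],
                          candidate[1] - first_target[1])
    return distance > threshold


def reconstruct(first_target: Point, second_target: Point) -> Point:
    """Assumed reconstruction: midpoint of the two target positions."""
    return ((first_target[0] + second_target[0]) / 2.0,
            (first_target[1] + second_target[1]) / 2.0)
```

A candidate that passes the distance test would be kept as is; only flagged candidates would be replaced by the reconstructed position.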
[0030] Optionally, the method for generating the transparency field further includes:
[0031] Based on the local color gradient field and the original gradient, determine whether the target pixel location is a structural fracture region;
[0032] If the target pixel location is a structural fracture region, the restored direction angle is determined based on the boundary direction angle provided by the local color gradient field at the target pixel and the local principal direction angle of the boundary at the target pixel;
[0033] Based on the micronormal vector and unit direction vector provided by the micronormal perturbation field, the direction constraint term is determined; wherein, the unit direction vector is determined based on the local color gradient field;
[0034] Based on the restored direction angle and the direction constraint term, local reconstruction is performed to obtain the reconstructed local boundary region.
[0035] Optionally, the method for generating the transparency field further includes:
[0036] Based on the reflected field brightness and the original brightness, determine whether the target pixel belongs to the bright edge interference area;
[0037] If the target pixel belongs to the bright edge interference area, then the original brightness is corrected based on the reflected field brightness to obtain the brightness-corrected pixel brightness;
[0038] Bright edge stripping is performed based on brightness reflection field and material category weights to obtain the boundary region after bright edge stripping;
[0039] The boundary position is corrected based on the boundary region after the bright edge is stripped, the local color gradient field, and the micro-normal perturbation field to obtain the corrected boundary region.
[0040] Optionally, the method for generating the transparency field further includes:
[0041] Based on the target pixel position and the third target motion boundary position, it is determined whether the target pixel belongs to a motion abnormal region; wherein, the third target motion boundary position is determined by the previous frame boundary position of the target pixel position and the optical flow displacement; the motion abnormal region includes cut-off regions and trailing (ghosting) regions;
[0042] If the target pixel belongs to the cut-off region, the missing edge segment is filled in based on the position of the third target motion boundary to obtain the processed boundary region.
[0043] If the target pixel belongs to the trailing region, the target pixel is removed to obtain the processed boundary region;
[0044] The processed boundary region is then subjected to time continuity correction to obtain the corrected boundary region.
[0045] A second aspect of the present invention provides an apparatus for generating a transparency field, comprising:
[0046] A boundary proxy data construction unit is used to construct boundary proxy data based on target pixels in an image; wherein, the boundary proxy data includes: a local color gradient field, a brightness reflection field, a micro-normal perturbation field, and a motion boundary prediction field;
[0047] The transparency candidate value generation unit is used to generate transparency candidate values for the target pixel corresponding to each dimension based on the boundary proxy data of each dimension.
[0048] A transparency estimation generation unit is used to generate a transparency estimate of the target pixel based on the transparency candidate values of the target pixel corresponding to all the dimensions.
[0049] A transparency field generation unit is used to generate a transparency field based on the transparency estimates of all the target pixels.
[0050] Optionally, the boundary proxy data construction unit includes:
[0051] A local color gradient field construction unit is used to construct a local color gradient field based on the local intensity changes of the color components of the target pixel in the horizontal direction and the local intensity changes in the vertical direction.
[0052] A brightness reflection field construction unit is used to construct a brightness reflection field based on the diffuse reflection component and specular reflection component of the target pixel;
[0053] The micro-normal perturbation field construction unit is used to construct a micro-normal perturbation field based on the local intensity changes of the structural components of the target pixel in the horizontal direction and the local intensity changes in the vertical direction.
[0054] The motion boundary prediction field construction unit is used to construct a motion boundary prediction field based on the position information of the target pixel and the optical flow displacement of the target pixel between adjacent frames.
[0055] Optionally, the apparatus for generating the transparency field further includes:
[0056] An abnormal compression region determination unit is used to determine whether the location of the target pixel is an abnormal compression region based on the local color gradient field and the average gradient of the target pixel's neighborhood; wherein, the local color gradient field includes a local gradient and a gradient direction;
[0057] The gradient recovery determination unit is used to perform continuity repair on the local gradient if the target pixel position is an abnormally compressed region, so as to obtain the recovered gradient.
[0058] A gradient direction correction unit is used to apply continuity constraints to the gradient direction to obtain a corrected gradient direction.
[0059] A local color gradient field correction unit is used to obtain a corrected local color gradient field based on the recovered gradient and the corrected gradient direction.
[0060] Optionally, the apparatus for generating the transparency field further includes:
[0061] A brightness anomaly point determination unit is used to determine whether a target pixel is a brightness anomaly point based on the brightness gradient of the target pixel; wherein the brightness gradient is determined based on the brightness component and the average brightness component of the neighboring pixels;
[0062] A brightness correction unit is used to perform brightness correction on the brightness reflection field based on the brightness of the reflection field if the target pixel is a brightness anomaly point, so as to obtain a brightness-corrected brightness reflection field; wherein, the brightness of the reflection field is determined based on the brightness gradient of the target pixel, the diffuse reflection component and the specular reflection component of the target pixel;
[0063] A brightness reflection field correction unit is used to perform continuous processing on the brightness-corrected brightness reflection field to obtain a corrected brightness reflection field.
[0064] Optionally, the apparatus for generating the transparency field further includes:
[0065] The consistency evaluation unit is used to evaluate the directional differences of all neighboring pixels with the target pixel as the center, and obtain the consistency evaluation result.
[0066] An edge direction correction unit is used to correct the consistency of the neighborhood direction based on the edge direction of the target pixel if the consistency evaluation result indicates that there is a directional disturbance at the target pixel position, thereby obtaining the corrected edge direction; wherein, the edge direction of the target pixel is determined based on the local intensity change in the horizontal direction and the local intensity change in the vertical direction of the structural features of the target pixel.
[0067] Optionally, the apparatus for generating the transparency field further includes:
[0068] The motion boundary candidate position determination unit is used to determine the motion boundary candidate position based on the optical flow displacement of the target pixel between adjacent frames;
[0069] The first target motion boundary position determination unit is used to determine the position of the first target motion boundary based on the previous frame boundary position and optical flow displacement of the motion boundary candidate position;
[0070] An abnormal boundary determination unit is used to determine whether the motion boundary candidate position is an abnormal boundary region based on the first target motion boundary position and the motion boundary candidate position;
[0071] The motion boundary position reconstruction unit is used to reconstruct the motion boundary position based on the first target motion boundary position and the second target motion boundary position if the motion boundary candidate position is an abnormal boundary region; wherein the second target motion boundary position is determined based on the next frame boundary position and optical flow displacement of the motion boundary candidate position.
[0072] Optionally, the apparatus for generating the transparency field further includes:
[0073] The structural fracture region determination unit is used to determine whether the target pixel position is a structural fracture region based on the local color gradient field and the original gradient.
[0074] The direction angle determination unit is used to determine the restored direction angle based on the boundary direction angle provided by the local color gradient field at the target pixel and the local principal direction angle of the boundary at the target pixel if the target pixel position is a structural fracture region.
[0075] The direction constraint term determination unit is used to determine the direction constraint term based on the micronormal vector and unit direction vector provided by the micronormal perturbation field; wherein the unit direction vector is determined based on the local color gradient field.
[0076] The local boundary region reconstruction unit is used to perform local reconstruction based on the restored direction angle and the direction constraint term to obtain the reconstructed local boundary region.
[0077] Optionally, the apparatus for generating the transparency field further includes:
[0078] The bright edge interference determination unit is used to determine whether a target pixel belongs to the bright edge interference area based on the reflected field brightness and the original brightness.
[0079] A pixel brightness correction unit is used to perform brightness correction on the original brightness based on the reflected field brightness if the target pixel belongs to the bright edge interference area, so as to obtain the brightness-corrected pixel brightness.
[0080] The bright edge stripping unit is used to strip bright edges based on the brightness reflection field and material category weights to obtain the boundary region after bright edge stripping.
[0081] The boundary region correction unit is used to correct the boundary position based on the boundary region after the bright edge is stripped, the local color gradient field, and the micro-normal perturbation field, so as to obtain the corrected boundary region.
[0082] Optionally, the apparatus for generating the transparency field further includes:
[0083] A motion anomaly region determination unit is used to determine whether a target pixel belongs to a motion anomaly region based on the target pixel position and the third target motion boundary position; wherein, the third target motion boundary position is determined by the previous frame boundary position of the target pixel position and optical flow displacement; the motion anomaly region includes a cut-off region and a trailing region;
[0084] The completion unit is used to complete the missing edge segment based on the position of the third target motion boundary if the target pixel belongs to the cut-off region, so as to obtain the processed boundary region.
[0085] The rejection unit is used to reject the target pixel if it belongs to the trailing region, so as to obtain the processed boundary region.
[0086] A continuity correction unit is used to perform time continuity correction on the processed boundary region to obtain a corrected boundary region.
[0087] A third aspect of the present invention provides an electronic device, comprising:
[0088] One or more processors;
[0089] A storage device on which one or more programs are stored;
[0090] When the one or more programs are executed by the one or more processors, the one or more processors implement the method for generating a transparency field as described in any one of the first aspects.
[0091] A fourth aspect of the present invention provides a computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method for generating a transparency field as described in any one of the first aspects.
[0092] As can be seen from the above solutions, the present invention provides a method and apparatus for generating a transparency field. While keeping the main structure of the existing keying process unchanged, it constructs multi-dimensional boundary proxy data to provide a replaceable and correctable structured description of the foreground boundary, and generates a transparency field based on this data, which significantly improves the quality and dynamic stability of the transparency field.
Attached Figure Description
[0093] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on the provided drawings without creative effort.
[0094] Figure 1 is a flowchart of a method for generating a transparency field according to an embodiment of the present invention;
[0095] Figure 2 is a schematic diagram of an apparatus for generating a transparency field according to another embodiment of the present invention.
Detailed Implementation
[0096] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0097] The term "comprising" and its variations as used herein are open-ended inclusions, meaning "including but not limited to". The term "based on" means "at least partially based on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Definitions of other terms will be given in the description below.
[0098] It should be noted that the concepts of "first" and "second" mentioned in this invention are only used to distinguish different devices, modules or units, and are not used to limit the order of functions performed by these devices, modules or units or their interdependencies.
[0099] It should be noted that the terms "a" and "a plurality of" used in this invention are illustrative rather than restrictive. Those skilled in the art should understand that, unless otherwise expressly indicated in the context, they should be understood as "one or more".
[0100] This invention provides a method for generating a transparency field. As shown in Figure 1, the specific steps include:
[0101] S101. Construct boundary proxy data based on target pixels in the image.
[0102] The boundary proxy data includes: local color gradient field, brightness reflection field, micronormal perturbation field, and motion boundary prediction field. The target pixel refers to a pixel in the image that is close to the foreground boundary. In the actual application of this invention, pixels close to the foreground boundary can be obtained from the set of pixels in the foreground boundary or edge transition region obtained in existing keying processes; the method of obtaining these pixels does not constitute a limitation of this invention.
[0103] Specifically, local region analysis is performed on the foreground subject of each frame of the input stream (including but not limited to real-time acquisition, video sequences, etc.) to generate the local color gradient field, the brightness reflection field (Reflectance Field), the micro-normal perturbation field (Normal Perturbation Field), and the motion boundary prediction field (Motion-Prediction Field).
[0104] Among them, the local color gradient field is a dataset used to express the color change trend at the real contour, especially providing a back-derived color difference structure when compression, over-sharpening, or anti-aliasing weakens the local gradient; the brightness reflection field is a dataset used to characterize the brightness distribution and reflection behavior of different materials under illumination, used to distinguish between real boundaries and areas affected by illumination interference; the micro-normal perturbation field is a dataset used to express the micro-scale geometric features at the boundary, such as the local principal direction information of hair texture, fabric wrinkles, and weak texture areas, which can provide a structural-level boundary reference when the color is blurred or the gradient is weakened; and the motion boundary prediction field is a dataset used to describe the motion trend of the boundary on the time axis.
[0105] To establish a unified boundary description framework, the boundary proxy data, also referred to as the proxy boundary field (PBF), can be represented in its overall form as follows:
[0106] ;
[0107] Where p is the target pixel position, and the four types of subfields describe the boundary state from four perspectives: color structure, illumination characteristics, local geometry, and temporal continuity.
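The formula itself is not reproduced in this text. One form consistent with the description, using placeholder symbols that are assumptions of this sketch ($G_t$ for the local color gradient field, $R_t$ for the brightness reflection field, $N_t$ for the micro-normal perturbation field, and $M_t$ for the motion boundary prediction field, all at time $t$), would be:

```latex
\mathrm{PBF}_t(p) = \bigl\{\, G_t(p),\; R_t(p),\; N_t(p),\; M_t(p) \,\bigr\}
```

Each element corresponds to one of the four perspectives listed above: color structure, illumination characteristics, local geometry, and temporal continuity.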
[0108] Optionally, in another embodiment of the present invention, the local color gradient field includes a local gradient and a gradient direction. The local color gradient field can be constructed based on the local intensity changes of the color components of the target pixel in the horizontal direction and the local intensity changes in the vertical direction.
[0109] ;
[0110] ;
[0111] where the symbols denote, in order: the local intensity change of the color component of the target pixel in the horizontal direction; the local intensity change of the color component of the target pixel in the vertical direction; the local gradient; and the gradient direction.
[0112] Among them, color components refer to the channel components used to characterize the color information of image pixels. They can be the R, G, and B components in the RGB color space, or the corresponding components in other color spaces such as YUV and HSV.
[0113] It should be noted that in this invention, the color components are not limited to a specific color space. They can be any color channel that can represent the color information of a pixel, depending on the actual application scenario, and the local intensity change is calculated based on the color channel.
[0114] The local intensity change can be calculated through the difference between adjacent pixels, gradient operators, or other equivalent methods, and is not limited here.
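As one possible illustration of the construction described above, the sketch below uses central differences between adjacent pixels for the horizontal and vertical intensity changes, then takes the Euclidean magnitude and the `atan2` angle as the local gradient and gradient direction. The choice of central differences, the magnitude and angle formulas, and the name `local_gradient` are assumptions; the text allows any equivalent operator.

```python
import math
from typing import List, Tuple

Channel = List[List[float]]  # one color channel, indexed as channel[y][x]


def local_gradient(channel: Channel, x: int, y: int) -> Tuple[float, float]:
    """Return (local gradient, gradient direction) at an interior pixel."""
    # Horizontal and vertical local intensity changes (central differences).
    gx = (channel[y][x + 1] - channel[y][x - 1]) / 2.0
    gy = (channel[y + 1][x] - channel[y - 1][x]) / 2.0
    magnitude = math.hypot(gx, gy)   # assumed form of the local gradient
    direction = math.atan2(gy, gx)   # assumed form of the gradient direction
    return magnitude, direction
```

On a channel that increases by one unit per pixel from left to right, the sketch returns a magnitude of 1.0 and a direction of 0.0 radians for any interior pixel.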
[0115] It is understandable that in the gradient weakening region, more stable color difference information can be obtained by expanding the neighborhood or enhancing the brightness contrast, but this is not limited here.
[0116] Because directly computing gradients from the original pixels in regions affected by compression artifacts can produce gradient jumps or breaks, this invention performs neighborhood consistency detection by determining the amount of anomalous drift with the following formula:
[0117] ;
[0118] where, when the anomalous drift exceeds the set threshold, the location of the target pixel is determined to be an abnormally compressed region. The formula also involves the neighborhood of the target pixel, which is used for statistical analysis of local gradient, structure, or brightness features, and the mean gradient over that neighborhood. The size of the neighborhood is not fixed; it is selected based on factors such as the spatial scale of the boundary region, local structural stability, and computational complexity. In specific implementations, the neighborhood range can be adjusted according to the actual image resolution, boundary width, or noise level to balance structural stability against detail preservation. No limitation is made here.
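A minimal sketch of this neighborhood consistency detection is given below. It assumes that the anomalous drift is the absolute difference between the local gradient at the target pixel and the mean gradient of a square neighborhood that includes the pixel itself; the exact formula is not reproduced in this text, and the function names and the default radius are hypothetical.

```python
from typing import List

Field = List[List[float]]  # local gradient magnitudes, indexed as field[y][x]


def neighborhood_mean(grad: Field, x: int, y: int, radius: int = 1) -> float:
    """Mean gradient over a square window clipped to the image bounds."""
    values = []
    for j in range(max(0, y - radius), min(len(grad), y + radius + 1)):
        for i in range(max(0, x - radius), min(len(grad[0]), x + radius + 1)):
            values.append(grad[j][i])
    return sum(values) / len(values)


def anomalous_drift(grad: Field, x: int, y: int, radius: int = 1) -> float:
    """Assumed drift: |local gradient - neighborhood mean gradient|."""
    return abs(grad[y][x] - neighborhood_mean(grad, x, y, radius))


def is_abnormally_compressed(grad: Field, x: int, y: int,
                             threshold: float, radius: int = 1) -> bool:
    """Flag the pixel location when the drift exceeds the set threshold."""
    return anomalous_drift(grad, x, y, radius) > threshold
```

The `radius` parameter stands in for the adjustable neighborhood size discussed above; a larger window gives a more stable mean at the cost of local detail.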
[0119] In practical applications of this invention, for abnormally compressed regions, continuity repair can be performed on the local gradient to obtain the recovered gradient, although the invention is not limited to this approach. Continuity repair or smoothing of image gradients within a local neighborhood is a common engineering technique in image processing for obtaining more stable gradient estimates. Its purpose is to suppress gradient breaks and jumps caused by compression artifacts, noise, or over-sharpening, thereby obtaining a more continuous gradient representation.
[0120] In the practical application of this invention, for pseudo-edges introduced by over-sharpening, false gradient components are eliminated through orientation consistency constraints:
[0121] ; where the consistency measure quantifies the degree of consistency in gradient or boundary direction between pixel p and its neighboring pixels q. The direction angle term represents the local boundary or gradient principal direction angle at pixel q within the frame corresponding to time t, and it characterizes the directional information of the boundary structure at that pixel. In this formula, the difference in direction angle between pixel p and its neighboring pixel q is used to measure the degree of consistency between the two in the boundary direction.
[0122] It should be noted that pseudo-edges introduced by over-sharpening typically exhibit abnormally increased local gradient magnitudes, but their gradient directions lack consistency within the spatial neighborhood and are difficult to form continuous, stable boundary structures. This invention identifies such pseudo-edges by analyzing the gradient direction consistency, structural continuity, and stability characteristics of pixels within their neighborhoods.
[0123] Specifically, when a pixel exhibits a large gradient response, but its gradient direction changes drastically in the neighborhood and cannot form a consistent direction with the surrounding boundary structure, or fails to form a continuous contour between adjacent pixels, it can be identified as a pseudo-edge introduced by over-sharpening.
[0124] A false gradient refers to a gradient magnitude that is large but does not correspond to the true boundary structure, caused by oversharpening, compression artifacts, noise amplification, or other reasons.
[0125] It should be noted that the consistency measure does not directly represent spurious gradient components; rather, it measures the consistency between pixel p and its neighboring pixels q in the gradient or boundary direction. When the local gradient has a large magnitude but its direction varies drastically and lacks consistency within the neighborhood, the corresponding consistency measure is low, indicating that the gradient component may be unreliable or caused by pseudo-edges introduced by factors such as over-sharpening. Therefore, the consistency measure is used to help determine the credibility of gradient components, rather than directly indicating whether a given gradient component is a false gradient.
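To make the idea of a direction consistency measure concrete, the following sketch scores a pixel by the mean cosine of the angle differences to its neighbors, so that aligned directions give a value near 1 and scattered directions give a low value. The cosine form and the name `direction_consistency` are assumptions; the text only states that the measure is based on the direction angle difference between p and q.

```python
import math
from typing import Sequence


def direction_consistency(theta_p: float,
                          neighbor_thetas: Sequence[float]) -> float:
    """Mean of cos(theta_p - theta_q) over the neighboring pixels q.

    Assumed form: 1.0 means fully consistent directions, lower values
    mean the direction at p disagrees with its neighborhood.
    """
    if not neighbor_thetas:
        raise ValueError("at least one neighboring direction is required")
    total = sum(math.cos(theta_p - theta_q) for theta_q in neighbor_thetas)
    return total / len(neighbor_thetas)
```

A large gradient magnitude combined with a low score from such a measure would mark the gradient as less credible, in line with the description of pseudo-edges above.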
[0126] Furthermore, this invention imposes spatial continuity constraints on the direction and magnitude of the color gradient, ensuring smooth local changes in the boundary and reducing abrupt changes in direction caused by noise.
[0127] ; where the direction angle term represents the direction angle of the gradient. This formula searches, within the neighborhood of pixel p, for the direction angle that minimizes the sum of squared differences from the existing direction angles in the neighborhood. The result is the corrected gradient direction.
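If the minimization is the unweighted least-squares form described above and angle wrap-around is ignored, it has a closed-form solution. In the sketch below, $\theta_q$ denotes the existing direction angles, $\mathcal{N}(p)$ the neighborhood of $p$, and $\hat{\theta}(p)$ the corrected gradient direction; these symbol names are assumptions, since the original formula is not reproduced in this text.

```latex
\hat{\theta}(p) = \arg\min_{\theta} \sum_{q \in \mathcal{N}(p)} \bigl(\theta - \theta_q\bigr)^2
\;\Longrightarrow\;
\frac{d}{d\theta} \sum_{q \in \mathcal{N}(p)} \bigl(\theta - \theta_q\bigr)^2
  = 2 \sum_{q \in \mathcal{N}(p)} \bigl(\theta - \theta_q\bigr) = 0
\;\Longrightarrow\;
\hat{\theta}(p) = \frac{1}{\lvert \mathcal{N}(p) \rvert} \sum_{q \in \mathcal{N}(p)} \theta_q
```

Under these assumptions the corrected direction is simply the neighborhood mean of the direction angles. Angles close to $\pm\pi$ would need to be unwrapped, or averaged as unit vectors, before this mean is meaningful.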
[0128] In the practical application of this invention, for highly curved edges, the curvature is estimated by second-order difference and the direction field is corrected to make it conform to the true geometric shape of the edge; this is not limited here.
[0129] Finally, based on the recovered gradient and the corrected gradient direction, the corrected local color gradient field is obtained, namely the field composed of the recovered gradient and the corrected gradient direction.
[0130] Optionally, in another embodiment of the invention, the brightness reflection field can be constructed based on the diffuse reflection component and specular reflection component of the target pixel:
[0131] In this formula, the two component terms denote the diffuse reflection component and the specular reflection (highlight) component, respectively, and the two coefficients are weights set according to the material. The diffuse and specular components can be estimated by distinguishing areas of stable brightness from areas of abnormal brightness. The weights can be set adaptively according to the material characteristics of the pixel region to adjust the relative influence of the different reflection components in brightness modeling; no limitation is made here. For example, the more diffuse the material, the larger the diffuse weight; the more specular the material, the larger the specular weight.
[0132] Optionally, in another embodiment of the present invention, before the brightness reflection field is constructed, potential reflection regions can be determined by, but not limited to, the brightness gradient of the target pixel, which completes the initial brightness screening. The brightness gradient can be defined as:
[0133] ;
[0134] In this formula, the luminance component characterizes the brightness of pixel p in the frame at time t. It can be obtained by performing brightness mapping or color space conversion on the color components of the input image. For example, in a specific implementation, the brightness channel can be extracted from the RGB image through channel weighting or by conversion to a color space such as YUV; this is not limited here. When the brightness gradient exceeds the set threshold, the pixel is marked as a candidate point for brightness interference.
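The initial brightness screening described above can be illustrated with a minimal sketch. The BT.601 channel weights, the central-difference gradient, and the names `luminance`, `brightness_gradient_magnitude`, and `is_interference_candidate` are assumptions made for illustration; the description leaves the brightness mapping, the gradient operator, and the threshold value open.

```python
def luminance(rgb):
    """Channel-weighted luminance of one RGB pixel (BT.601 weights, assumed)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def brightness_gradient_magnitude(lum, x, y):
    """Central-difference gradient magnitude of a 2-D luminance grid at (x, y).

    (x, y) must be an interior position of the grid.
    """
    gx = (lum[y][x + 1] - lum[y][x - 1]) / 2.0
    gy = (lum[y + 1][x] - lum[y - 1][x]) / 2.0
    return (gx * gx + gy * gy) ** 0.5


def is_interference_candidate(lum, x, y, threshold):
    """Mark the pixel as a brightness interference candidate (hypothetical threshold)."""
    return brightness_gradient_magnitude(lum, x, y) > threshold


# Tiny 3x3 frame with one very bright pixel to the right of the center.
image = [
    [(10, 10, 10), (10, 10, 10), (10, 10, 10)],
    [(10, 10, 10), (10, 10, 10), (250, 250, 250)],
    [(10, 10, 10), (10, 10, 10), (10, 10, 10)],
]
lum = [[luminance(px) for px in row] for row in image]
candidate = is_interference_candidate(lum, 1, 1, threshold=50.0)
```

A candidate found this way is only a starting point; the multi-source consistency checks described in the following paragraphs decide whether it is a true contour or reflection interference.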
[0135] In a specific implementation of this invention, the optical flow direction can be combined with the motion boundary prediction field and the local color distribution to further determine whether the brightness change is distributed along the subject outline; the result is used to mark backlight bright bands and highlight areas.
[0136] Here, the motion boundary prediction field is taken at time t, and its value at a given pixel describes the boundary movement direction or displacement information corresponding to that pixel. Local color distribution is a general term for the color feature analysis performed on the foreground boundary neighborhood; it mainly covers the color differences, color change trends, and their distribution within the local neighborhood that are involved in constructing the local color gradient field. It describes the spatial variation of color information within the boundary neighborhood and provides the color basis for gradient calculation, false edge identification, and subsequent boundary repair.
[0137] It should be noted that this invention does not rely solely on sudden changes in brightness for judgment, but rather uses a combined approach of spatial structural consistency and temporal motion consistency for judgment:
[0138] (1) In the spatial dimension, compare the direction of the brightness gradient with the direction of the local boundary. If the direction of the brightness change is consistent with the boundary direction, it indicates that it may be distributed along the main outline.
[0139] (2) In the time dimension, use the optical flow direction and the motion boundary prediction field to determine whether the brightness change region changes continuously with the movement of the subject. If it follows the boundary movement trend between adjacent frames, its credibility as a real outline is increased.
[0140] (3) Combining the local color distribution characteristics, if the brightness change area is supported by color difference and forms a continuous structure in the neighborhood, it is determined to be a real brightness change distributed along the main body outline; otherwise, it is regarded as reflection or specular interference.
[0141] The aforementioned spatial-temporal-color multi-source consistency determination mechanism enables the differentiation of brightness abrupt changes.
[0142] In this invention, the marking of backlight bright bands and highlight areas is not based on a single brightness threshold, but rather on a comprehensive judgment based on the degree of brightness anomaly, color feature changes, and boundary structure consistency. Specifically, this includes:
[0143] (1) Brightness anomaly detection: neighborhood statistics are computed on the luminance component of the pixel; if the brightness is significantly higher than the neighborhood average and exhibits a local peak, the pixel is considered part of a candidate bright area.
[0144] (2) Saturation and color compression characteristics analysis: Highlight areas are usually accompanied by color components tending to be saturated or color distortion. By detecting whether the RGB components are close to the upper limit or the color difference is weakened, the nature of the highlight can be further confirmed.
[0145] (3) Boundary structure consistency judgment: Combine the local color gradient field and micro normal perturbation field to judge whether the brightness change has the support of a real boundary structure; if the brightness change lacks the corresponding color gradient or geometric direction consistency, it is more likely to be a false bright edge formed by illumination interference.
[0146] (4) Time stability judgment: using the optical flow direction together with the motion boundary prediction field, if the bright area does not move with the subject outline over time or exhibits flickering changes, it is determined to be a reflection or highlight area.
[0147] Through the above-mentioned multi-feature joint judgment mechanism, objective marking of backlight bright bands and highlight areas can be achieved.
[0148] Because different materials exhibit significantly different reflective behaviors (e.g., fabric may show diffused bright edges under strong light, while metal surfaces are prone to specular reflection and overexposure), after initial brightness screening, the reflection judgment threshold is further adjusted according to material type (e.g., fabric, skin, metal, plastic, etc.), and the reflection components are decomposed and modeled.
[0149] ;
[0150] This modeling method can distinguish between "normal brightness caused by material" and "bright edge interference caused by lighting".
[0151] Specifically, for a brightness interference candidate point (a pixel p marked as such because its brightness gradient exceeds the set threshold), if the deviation of its original luminance L(p) from the reference reflection response composed of the diffuse and specular components exceeds the brightness deviation judgment threshold, the pixel is identified as a brightness anomaly point; otherwise, the pixel is considered to belong to a normal brightness variation area caused by the material itself. The brightness deviation judgment threshold can be set adaptively based on the local brightness distribution, the material type, or neighborhood statistics, and its specific value is not a limitation of this invention.
[0152] Because the bright edges near the foreground outline may cause edge enlargement during keying if the original brightness is used directly, local normalization or brightness compression is performed on the brightness anomalies to bring the brightness back to the range expected by the material, thus providing a "light-free" brightness reference for subsequent bright edge stripping.
[0153] Brightness correction can be expressed as:
[0154] In this formula, the input is the brightness of the reflection field, the coefficient is the bright edge suppression coefficient, which controls the compression amplitude, and the output is the brightness reflection field after brightness correction.
[0155] The bright edge suppression coefficient is determined based on the brightness gradient of the target pixel, the diffuse reflection component of the target pixel, and the specular reflection component:
[0156] In this formula, the two weights adjust the contributions of the two types of factors; they can be set empirically, determined from statistics of sample data, or calculated adaptively from local image features; no specific limitation is imposed here. The resulting value characterizes the intensity of bright edge artifacts at a pixel: the higher the value, the more likely a bright edge is present. A very small positive number is introduced into the denominator to ensure numerical stability and to prevent abnormal amplification when the brightness or reflection components are weak. In this way, contrast values are suppressed in low-brightness or weak-reflection areas, so that the brightness reflection field responds effectively only where there are significant brightness changes or reflection characteristics, which improves the stability and accuracy of bright edge detection. The formula also uses the global maximum of this value over the current frame.
[0157] This value can be obtained from factors such as the difference between the luminance component and the neighborhood average brightness, the degree of saturation or color compression, the consistency of the local color gradient with the geometric structure, and the consistency with the motion boundary prediction field; no specific limitation is imposed here. For example, when there is a significant abrupt brightness change that lacks support from a real boundary structure, a larger value is taken to strengthen the suppression; when the brightness change is consistent with the real boundary structure, a smaller value is taken.
[0158] In the practical application of this invention, to avoid noise blocks or local jumps in the brightness reflection field, the brightness reflection field can be further processed to achieve spatial continuity, resulting in a smooth brightness distribution of the reflection field on the same material surface.
[0159] ;
[0160] In this formula, the input is the corrected brightness reflection field and q is a pixel in the neighborhood of the target pixel. The weight is a function of spatial distance and brightness similarity, so that the reflection field remains smooth within regions of consistent material. It is defined as:
[0161] ;
[0162] In this formula, the luminance component and the neighborhood window of the target pixel are as defined above. The spatial distance adjustment parameter controls the smoothing range; its value can be determined from the image resolution, the boundary scale, or the neighborhood window size. The brightness difference adjustment parameter preserves material boundaries; its value can be determined from the statistics of the image brightness distribution or the magnitude of local brightness changes. The summation variable ranges over all pixels in the neighborhood and is used for normalization.
[0163] In the formula, the numerator represents the similarity determined jointly by spatial distance and brightness difference, and the denominator is the normalization factor, so that the weights over the neighborhood sum to 1. This weighting keeps the brightness of the reflection field continuous within the same material area while avoiding brightness diffusion across the foreground geometric boundary, which keeps the reflection field consistent with the actual boundary position.
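The normalized weighting described above can be sketched as follows. The Gaussian kernel form and the names `smoothing_weights`, `smooth_reflection`, `sigma_s`, and `sigma_l` are assumptions for illustration; the description only states that the weight depends on spatial distance and brightness difference and is normalized over the neighborhood.

```python
import math


def smoothing_weights(lum, p, radius, sigma_s, sigma_l):
    """Normalized weights over the neighborhood window of p = (x, y).

    Each raw weight decays with squared spatial distance and squared
    luminance difference (Gaussian form, assumed); dividing by the sum
    makes the weights add up to 1.
    """
    px, py = p
    raw = {}
    for qy in range(max(0, py - radius), min(len(lum), py + radius + 1)):
        for qx in range(max(0, px - radius), min(len(lum[0]), px + radius + 1)):
            d2 = (qx - px) ** 2 + (qy - py) ** 2
            dl2 = (lum[qy][qx] - lum[py][px]) ** 2
            raw[(qx, qy)] = math.exp(
                -d2 / (2.0 * sigma_s ** 2) - dl2 / (2.0 * sigma_l ** 2)
            )
    total = sum(raw.values())  # includes p itself, so total >= 1
    return {q: w / total for q, w in raw.items()}


def smooth_reflection(field, lum, p, radius=1, sigma_s=1.0, sigma_l=10.0):
    """Weighted average of the reflection field around p."""
    weights = smoothing_weights(lum, p, radius, sigma_s, sigma_l)
    return sum(w * field[qy][qx] for (qx, qy), w in weights.items())


# Two materials side by side: a dark region and a bright column.
lum = [[10.0, 10.0, 200.0] for _ in range(3)]
field = [row[:] for row in lum]
smoothed_center = smooth_reflection(field, lum, (1, 1))
```

In this example the bright column receives a negligible weight, so the smoothed value at the center stays close to the dark material's brightness instead of being pulled across the boundary.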
[0164] The final brightness reflection field is output in structured form as follows: .
[0165] Optionally, in another embodiment of the present invention, the micronormal perturbation field can be constructed based on the intensity variation of the structural components of the target pixel in the horizontal direction and the local intensity variation in the vertical direction:
[0166] In this formula, the first term is the local intensity variation, in the horizontal direction, of the structural components (such as hair strands, folds, or accessory outlines) at the target pixel, and the second term is the local intensity variation of those structural components in the vertical direction. The eigenvalues of the matrix are computed, and the eigenvector corresponding to the largest eigenvalue is selected as the local principal direction estimate.
[0167] The local intensity change can be calculated through the difference between adjacent pixels, gradient operators, or other equivalent methods, and is not limited here.
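A minimal sketch of the principal direction estimate follows. It assumes that the matrix in question is a 2x2 structure tensor accumulated from horizontal and vertical intensity variations in a local window; the function name `principal_direction` and the sample format are hypothetical.

```python
import math


def principal_direction(samples):
    """Eigen-analysis of a 2x2 structure tensor (assumed matrix form).

    samples: list of (dx, dy) local intensity variations in a window.
    Returns ((lam1, lam2), (vx, vy)) where lam1 >= lam2 and (vx, vy) is
    the unit eigenvector of the largest eigenvalue.
    """
    jxx = sum(dx * dx for dx, _ in samples)
    jxy = sum(dx * dy for dx, dy in samples)
    jyy = sum(dy * dy for _, dy in samples)
    half_trace = (jxx + jyy) / 2.0
    root = math.sqrt(((jxx - jyy) / 2.0) ** 2 + jxy ** 2)
    lam1, lam2 = half_trace + root, half_trace - root
    # Closed-form orientation of the dominant eigenvector of a symmetric 2x2 matrix.
    angle = 0.5 * math.atan2(2.0 * jxy, jxx - jyy)
    return (lam1, lam2), (math.cos(angle), math.sin(angle))


eigenvalues, direction = principal_direction([(1.0, 1.0), (2.0, 2.0)])
```

A large gap between the two eigenvalues indicates a strongly oriented structure, which is the kind of statistic the description later mentions for setting the disturbance amplitude.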
[0168] In practical applications of this invention, local texture statistics are performed on pixels near the foreground boundary to extract high-frequency components or directional features, which are used to identify small structures that may constitute boundaries, such as hair fibers, clothing folds, or jewelry outlines. In areas where color changes are not obvious, this texture information can serve as the only valid boundary clue.
[0169] Local texture statistics are performed on pixels near the foreground boundary to extract high-frequency components and directional features that characterize fine structures (such as hair strands, wrinkles, or jewelry outlines).
[0170] The local texture response can be expressed as:
[0171] When the color change is not obvious, the local texture response provides a texture intensity criterion that can be used for boundary determination.
[0172] Analyze edge directions based on local textures to detect directional perturbations caused by compression, blurring, or sharpening.
[0173] The edge direction can be expressed as:
[0174] ;
[0175] Centered on the target pixel, the directional differences of all neighboring pixels in its neighborhood are assessed for consistency:
[0176] ;
[0177] In this formula, the result is the directional consistency of the pixel within its neighborhood. When it falls below a set threshold, a directional disturbance is determined to exist at that location, and the direction must be corrected using the consistency of the neighborhood directions to obtain the corrected edge direction, so that the edge direction field remains continuous and smooth. Its calculation method is the same as that given above.
[0178] In practical applications of this invention, the correction direction can be obtained by, but is not limited to, minimizing the sum of squared neighborhood direction differences:
[0179] In this formula, the weight term is the neighborhood weight.
[0180] The above optimization can be achieved through weighted averaging or directional fusion based on sine-cosine components, thereby ensuring that the corrected direction is consistent with the neighborhood structure and suppressing anomalous direction jumps.
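The sine-cosine fusion mentioned above can be sketched as a weighted circular mean. This is an illustrative assumption: the circular mean handles angle wrap-around and approximates the least-squares direction when the neighborhood angles are not widely spread; the name `fuse_directions` is hypothetical, and angles are taken as full-circle directions in radians.

```python
import math


def fuse_directions(angles, weights):
    """Weighted fusion of direction angles via sine and cosine components.

    angles: neighborhood direction angles in radians.
    weights: non-negative neighborhood weights of the same length.
    """
    s = sum(w * math.sin(a) for a, w in zip(angles, weights))
    c = sum(w * math.cos(a) for a, w in zip(angles, weights))
    return math.atan2(s, c)


# 350 degrees and 10 degrees fuse to roughly 0 degrees, not 180 degrees,
# which a naive arithmetic mean of the raw angle values would give.
fused = fuse_directions([math.radians(350.0), math.radians(10.0)], [1.0, 1.0])
```

If edge orientations are treated as axial (a direction and its opposite being equivalent), the same approach can be applied to doubled angles and the result halved.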
[0181] In the practical application of this invention, the basic micro-normal estimate can be expressed as:
[0182] ;
[0183] To enhance the geometric distinction of detailed areas, perturbation terms are introduced at locations where the orientation changes drastically:
[0184] ;
[0185] In this formula, the amplitude term is the disturbance amplitude, and the direction term is the disturbance direction, which is orthogonal to the main direction and is used to amplify the changes in normals at detail boundaries, making true contours and background noise easier to distinguish in the geometric description. The result is the basic micro-normal estimate after perturbation.
[0186] It should be noted that the disturbance direction is not preset manually but is calculated from the main boundary direction at pixel p. Specifically, given the main direction vector n, the disturbance direction is defined as a unit vector orthogonal to it. This definition ensures that the disturbance unfolds along the tangential direction of the boundary.
[0187] The disturbance amplitude describes the intensity of local fine structures, and its value can be determined from statistical measures such as local texture intensity, gradient variance, or structure tensor eigenvalues. Regions with more pronounced structural changes have larger disturbance amplitudes, and regions with smooth structures have smaller ones; no limitation is imposed here.
[0188] To avoid spatial jumps in the micro-normals, continuity processing can also be applied to the perturbed micro-normal estimate:
[0189] In this formula, the result is the normal direction after continuity processing, and the weight is a coefficient based on spatial distance and texture similarity, which keeps normal changes consistent within the same detailed structure region.
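The orthogonal disturbance direction can be illustrated with a short sketch. The 90-degree rotation used to build the orthogonal unit vector is standard; the additive form of the perturbation and the final renormalization are assumptions, since the perturbation formula itself is not reproduced in the text. The names `perturbation_direction` and `perturb_normal` are hypothetical.

```python
import math


def perturbation_direction(n):
    """Unit vector orthogonal to the main direction n = (nx, ny).

    Obtained by rotating n by 90 degrees and normalizing.
    """
    nx, ny = n
    norm = math.hypot(nx, ny)
    if norm == 0.0:
        raise ValueError("main direction must be non-zero")
    return (-ny / norm, nx / norm)


def perturb_normal(n, amplitude):
    """Add amplitude * orthogonal direction to n, then renormalize (assumed form)."""
    dx, dy = perturbation_direction(n)
    px, py = n[0] + amplitude * dx, n[1] + amplitude * dy
    norm = math.hypot(px, py)
    return (px / norm, py / norm)


d = perturbation_direction((3.0, 4.0))   # orthogonal to (3, 4)
perturbed = perturb_normal((1.0, 0.0), amplitude=1.0)
```

With zero amplitude the normal is unchanged, and larger amplitudes tilt it further toward the orthogonal direction, matching the statement that regions with stronger fine structure receive larger perturbations.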
[0190] ;
[0191] Here, p is the target pixel position and q is a neighboring pixel of p; the norm used is the L2 norm; the feature vector at pixel p represents the local texture or microstructure extracted there (e.g., it may be composed of the local high-frequency response, a gradient direction histogram, structure tensor features, or a combination thereof). The spatial distance attenuation parameter controls the range of influence from the neighborhood; its value can be set based on the image resolution, the neighborhood window size, or the boundary scale. For example, it can be set in proportion to the neighborhood radius or scaled according to the image size. The texture similarity attenuation parameter controls the sensitivity of the weights to texture differences; its value can be determined from local texture statistics. For example, it can be set based on the variance of the local texture feature vectors, the gradient intensity distribution, or the statistical range of the structure tensor eigenvalues.
[0192] The denominator is the normalization factor, so that the weights over the neighborhood sum to 1.
[0193] Simultaneously considering spatial proximity and texture similarity, neighboring pixels that are closer to p and have more similar textures contribute more to microstructure estimation, thereby suppressing cross-boundary diffusion while preserving detailed structure.
[0194] Finally, the normal direction, amplitude, and perturbation information are structured and encoded to form the micro-normal perturbation field. This field provides a fine-grained geometric reference for subsequent edge structure recovery and transparency calculation.
[0195] Optionally, in another embodiment of the present invention, the motion boundary prediction field can be constructed based on the position information of the target pixel and the optical flow displacement of the target pixel between adjacent frames:
[0196] ;
[0197] In this formula, the displacement term is the optical flow displacement of the target pixel between adjacent frames, i.e., the displacement vector (motion vector) of the pixel from one frame to the next, and its two components are the displacements in the horizontal and vertical directions, respectively.
[0198] Within the set of foreground boundary pixels, high-motion areas are screened by motion amplitude:
[0199] When the motion amplitude exceeds the set threshold, the pixel is marked as a candidate motion boundary location for subsequent time series analysis.
[0200] After the candidate motion boundary locations are identified, the consistency of the boundary locations in adjacent frames is compared. The boundary position from the previous frame is extrapolated to the current frame along its motion vector:
[0201] ;
[0202] In this formula, the input position is that of a boundary point in the set of foreground boundary pixels of the previous frame, where the foreground boundary pixels are obtained from the previous frame using the existing keying process. The result is the predicted position obtained by extrapolation from the previous frame, i.e., the first target motion boundary position.
[0203] The offset of this predicted position relative to the boundary position in the current frame is then calculated:
[0204] ;
[0205] When the offset exceeds the set threshold, the boundary at that location is determined to be missing or temporally discontinuous and must be predicted or repaired; such a location is an abnormal boundary region.
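The extrapolation and offset test described above can be sketched as follows. Measuring the offset as a Euclidean distance is an assumption, as are the names `extrapolate`, `boundary_offset`, and `is_abnormal_boundary` and the threshold values.

```python
import math


def extrapolate(prev_pos, motion):
    """Move a previous-frame boundary point along its motion vector."""
    return (prev_pos[0] + motion[0], prev_pos[1] + motion[1])


def boundary_offset(predicted, current):
    """Distance between predicted and observed boundary positions (Euclidean, assumed)."""
    return math.hypot(predicted[0] - current[0], predicted[1] - current[1])


def is_abnormal_boundary(prev_pos, motion, current, threshold):
    """True when the offset exceeds the (hypothetical) threshold."""
    return boundary_offset(extrapolate(prev_pos, motion), current) > threshold


predicted = extrapolate((10.0, 20.0), (3.0, -1.0))
offset = boundary_offset(predicted, (16.0, 23.0))
```

In this example the predicted point (13, 19) lies 5 pixels from the observed boundary point, so a threshold of 2 pixels flags it as abnormal while a threshold of 6 pixels does not.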
[0206] For regions identified as anomalous, boundary reconstruction is performed using temporal interpolation and local motion trends. The reconstructed boundary position can be expressed as:
[0207] ;
[0208] In this formula, the backward term is the position obtained by backward extrapolation from the next frame, i.e., the second target motion boundary position. Backward extrapolation means using the foreground boundary pixels and their corresponding motion vectors in the next frame to map the boundary position backward along the time axis to the current frame, as a supplementary reference for the boundary prediction of the current frame. The time weighting coefficient balances information from consecutive frames. It can be adaptively determined by, but is not limited to, motion intensity, transparency residuals, or boundary stability. For example, when the motion is intense or the current frame information is more reliable, the coefficient takes a larger value; when the motion is stable or historical information is more stable, it takes a smaller value; no specific limit is imposed here.
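The temporal interpolation can be sketched as a convex combination of the two extrapolated positions. This form, and the choice of which position the coefficient multiplies, are assumptions, because the reconstruction formula is not reproduced in the text; `reconstruct_boundary` and `lam` are hypothetical names.

```python
def reconstruct_boundary(forward_pred, backward_pred, lam):
    """Blend the forward and backward extrapolated boundary positions.

    lam in [0, 1] is the time weighting coefficient; here it is assumed
    to weight the forward-extrapolated position.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return tuple(
        lam * f + (1.0 - lam) * b for f, b in zip(forward_pred, backward_pred)
    )


rebuilt = reconstruct_boundary((10.0, 10.0), (20.0, 30.0), lam=0.25)
```

Setting the coefficient to 1 or 0 reduces the result to one of the two extrapolated positions, which is a convenient sanity check when tuning it adaptively.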
[0209] For the trailing region, this invention reduces repetition or stretched edges by suppressing boundary components that are inconsistent with the main motion direction and retaining only the prediction results that are consistent with the main motion trend.
[0210] In the practical application of this invention, the reconstructed motion boundary position and the optical flow displacement of the target pixel between adjacent frames are structurally encoded to form the motion boundary prediction field:
[0211] ;
[0212] Furthermore, to ensure consistency in space and time, the prediction results are processed to achieve neighborhood continuity:
[0213] ;
[0214] In this formula, the continuity weight of the motion boundary prediction field measures the similarity between pixel p and its neighboring pixels q in terms of spatial location and motion trend. It is defined as follows:
[0215] In this formula, the two motion terms are the motion vectors of pixels p and q in frame t. The spatial distance adjustment parameter controls the spatial range of the continuity processing. It can be set according to spatial scale information such as the neighborhood window size, image resolution, and boundary width, or adjusted adaptively according to the stability of the local structure; the present invention does not limit its specific value. The motion similarity adjustment parameter controls the strength of the motion consistency constraint. It can be determined from the statistical characteristics of motion intensity between video frames, such as the mean or variance of the optical flow vector magnitude or the range of neighborhood motion variation. In scenes with intense motion, it can be increased appropriately to avoid excessive suppression; in scenes with smooth motion, it can be reduced to strengthen motion consistency; this is not limited here. This invention does not fix this parameter to a constant value; it can be obtained empirically or adaptively. The numerator of the formula represents the joint similarity between pixels in terms of spatial distance and motion trend, and the denominator normalizes the weights so that they sum to 1 over the neighborhood.
[0216] The final predicted field of the motion boundary is:
[0217] This prediction field can be used in conjunction with the local color gradient field, material / brightness reflection field, and micronormal perturbation field to provide reliable temporal boundary compensation in fast-moving or feature-degraded scenes, reducing edge misalignment, ghosting, and transparency jumps caused by violent movements.
[0218] It should be noted that the local color gradient field, brightness reflection field, micro-normal perturbation field, and motion boundary prediction field in this invention are designed to be independent of each other and can be calculated independently. They form a complementary and synergistic relationship in the subsequent boundary repair and transparency solution stages. The local color gradient field is used to restore the color structure and determine the main contour change direction; the brightness reflection field is used to remove bright edge interference and calibrate the foreground area; the micro-normal perturbation field is used to enhance details and ensure the geometric consistency of local boundary morphology; and the motion boundary prediction field is used to ensure the temporal continuity of cross-frame boundaries and avoid clipping and ghosting.
[0219] This invention provides a comprehensive boundary description covering the causes of complex edges through the aforementioned boundary proxy data, laying a unified and stable feature foundation for subsequent boundary anomaly repair and transparency calculation.
[0220] It should be noted that the step of constructing boundary proxy data based on target pixels in the image is an independent pre-stage and does not affect the original matting model architecture. It only provides supplementary features required for edge determination in subsequent modules.
[0221] In practical applications of this invention, when abnormal areas such as artifacts, bright edges, clipping, and ghosting are detected, the original pixel features are replaced with the more reliable boundary descriptions from the boundary proxy data. For example, in bright edge areas, the reflection field brightness replaces the pixel brightness; in gradient-weakened regions, the estimated gradient replaces the original gradient; in motion-truncated regions, the predicted boundary position replaces the boundary position of the current frame.
[0222] In practical applications, this invention addresses the problem of boundary structure damage caused by factors such as compression artifacts, over-sharpening, and edge blurring. It combines local color gradient fields and micro-normal perturbation fields to reconstruct and correct the shape of damaged edges, restoring a boundary structure that closely resembles the contour of a real object.
[0223] First, this invention can determine whether the target pixel location is a structural fracture region based on the local color gradient field and the original gradient. Given the original gradient and the local color gradient field, their difference measure is defined as:
[0224] ;
[0225] When the difference measure exceeds the set threshold, the target pixel p is determined to be located in a structural fracture region, and structural restoration processing is required.
[0226] If the target pixel location is a structural fracture region, the restored orientation angle is determined from the boundary orientation angle provided by the local color gradient field and the local principal orientation angle of the boundary at the target pixel:
[0227] In this formula, the first angle is the local principal orientation angle of the boundary at the target pixel, which characterizes the trend of the boundary within the image plane. An orientation angle can be obtained by projecting the gradient direction or normal direction at the corresponding location onto the image plane and is usually expressed in radians or degrees. The second angle is the boundary orientation angle provided by the local color gradient field at a neighboring pixel q; it carries the orientation information of the local color gradient field in the neighborhood. If the gradient vector of the local color gradient field at pixel q is (G_x(q), G_y(q)), the boundary orientation angle can be expressed as arctan2(G_y(q), G_x(q)), where arctan2 is the two-argument arctangent function, which ensures that the orientation angle is computed correctly over the full angular range.
[0228] By minimizing the above formula, the orientation angle that best matches the direction of the surrounding local color gradient field is selected in the neighborhood of pixel p, and used as the principal boundary direction after structure restoration. For pseudo-edges introduced by over-sharpening, false contours are suppressed by removing gradient components with unstable orientations and retaining only edge pixels with high orientation consistency.
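The orientation angle computation and the selection of the best-matching angle can be sketched as follows. The discrete candidate search and the wrapped angle difference are assumptions chosen for illustration; the names `boundary_angle`, `wrapped_diff`, and `restored_angle` are hypothetical.

```python
import math


def boundary_angle(g):
    """Orientation angle of a gradient vector g = (gx, gy) via two-argument arctangent."""
    gx, gy = g
    return math.atan2(gy, gx)


def wrapped_diff(a, b):
    """Signed angle difference a - b wrapped into (-pi, pi]."""
    return math.atan2(math.sin(a - b), math.cos(a - b))


def restored_angle(candidates, neighbor_angles):
    """Pick the candidate angle with the smallest sum of squared
    differences to the neighborhood orientation angles."""
    return min(
        candidates,
        key=lambda th: sum(wrapped_diff(th, q) ** 2 for q in neighbor_angles),
    )


neighbor_angles = [boundary_angle(g) for g in [(1.0, 0.9), (1.0, 1.0), (0.9, 1.0)]]
best = restored_angle([0.0, math.pi / 4, math.pi / 2], neighbor_angles)
```

Here the neighboring gradients all point close to 45 degrees, so the 45-degree candidate is selected as the principal boundary direction.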
[0229] Specifically, let ∇I(p) denote the original gradient vector of the image at a pixel p in frame t. It consists of the original gradient components I_x(p) and I_y(p) of the input image in the horizontal and vertical directions, i.e., ∇I(p) = (I_x(p), I_y(p)).
[0230] The original gradient components I_x(p) and I_y(p) can be obtained by performing first-order difference operations on the input image in the horizontal and vertical directions. Specifically, this can be implemented using the Sobel operator, the Prewitt operator, the Scharr operator, or other equivalent methods. This invention does not limit this.
[0231] Starting from the original gradient components, and to address potential breakage, abnormal compression, noise amplification, or directional instability of the original gradient in the boundary region, continuity repair, anomaly suppression, and directional consistency constraints are applied to the original gradient components, yielding the surrogate gradient vector of the local color gradient field at pixel q.
[0232] Here, G_x(q) and G_y(q) denote the components of the corrected surrogate gradient in the horizontal and vertical directions, respectively, rather than the original gradient components calculated directly from the input image.
[0233] In practical applications of this invention, when gradient information is insufficient to stably describe fine-scale structures (such as hair edges or fabric wrinkles), the micro-normal perturbation field is introduced as a geometric supplement. During edge reconstruction, the micro-normal vector participates in the boundary update through a direction constraint term:
[0234] ;
[0235] In this formula, the color term is a unit direction vector: the gradient components of the local color gradient field at pixel p are taken and then normalized, which yields the corresponding unit direction vector.
[0236] The weighting coefficient balances the influence of color structure and geometric structure. In detailed areas where color changes are not obvious, its value is reduced so that the boundary depends more on the geometric normal trend, which ensures the continuity of the boundary. The coefficient can be set empirically or adaptively based on local structural confidence or feature stability; no specific limitation is imposed here.
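One way to picture the direction constraint term is as a weighted blend of the unit color-gradient direction and the micro-normal direction. The convex combination and renormalization below are assumptions, since the constraint formula is not reproduced in the text; `unit`, `direction_constraint`, and `beta` are hypothetical names.

```python
import math


def unit(v):
    """Normalize a 2-D vector; a zero vector is returned unchanged."""
    norm = math.hypot(v[0], v[1])
    if norm == 0.0:
        return (0.0, 0.0)
    return (v[0] / norm, v[1] / norm)


def direction_constraint(color_grad, micro_normal, beta):
    """Blend color and geometric directions (assumed convex combination).

    beta near 1 trusts the color structure; beta near 0 relies on the
    micro-normal, as suggested for areas with weak color changes.
    """
    u = unit(color_grad)
    n = unit(micro_normal)
    return unit((beta * u[0] + (1.0 - beta) * n[0],
                 beta * u[1] + (1.0 - beta) * n[1]))


blended = direction_constraint((2.0, 0.0), (0.0, 3.0), beta=0.5)
```

Lowering `beta` in low-contrast detail regions moves the result toward the micro-normal direction, which mirrors the behavior described for the weighting coefficient.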
[0237] This invention performs local reconstruction based on the restored orientation angle and orientation constraint terms to obtain the reconstructed local boundary region.
[0238] Local reconstruction refers to: for the boundary pixels identified as structural fracture regions, guided by the main boundary extension direction determined by the recovery direction angle, searching for candidate pixels in the local neighborhood of the target pixel that are consistent with the direction and satisfy the direction constraint, and filling, connecting or interpolating the missing or fractured boundary segments to obtain the reconstructed local boundary region.
[0239] Specifically, the restored direction angle can be used as the main direction of boundary extension, and the direction constraint term determined by the local color gradient field and the micro-normal perturbation field can be used as the constraint condition for boundary update. Candidate pixels with high consistency with the main direction and continuous with the existing boundary structure are selected in the local window and added to the reconstruction result as new boundary pixels. For gap regions that cannot be directly connected, interpolation can be performed along the main direction to restore the connectivity and local morphology of the boundary.
[0240] It should be noted that the reconstructed local boundary region can be represented as an updated set of boundary pixels, a boundary orientation field, or a boundary mask, and its specific output form does not constitute a limitation of this invention.
[0241] After the local reconstruction is complete, the edge direction and amplitude can be made continuous and the curvature smoothed to avoid abrupt changes in direction or abnormal bending. The smoothed edge direction can be expressed as:
[0242] ;
[0243] where the weighting coefficient represents edge-structure continuity and measures the similarity between pixel p and its neighboring pixel q in spatial position and edge-structure direction. It is defined as:
[0244] ;
[0245] where one parameter is the spatial distance adjustment parameter, which controls the spatial range of continuity, and the other is the orientation consistency adjustment parameter, which controls the strength of the edge-orientation constraint. The numerator is the joint similarity between pixels in spatial distance and edge orientation; the denominator normalizes the weights so that, for each pixel p, they sum to 1 over its neighborhood.
[0246] The orientation consistency adjustment parameter can be determined from the dispersion of the neighborhood orientation distribution, for example from the variance of the neighborhood orientation angles, an orientation consistency index, or local structural stability. When the neighborhood directions are highly consistent, a smaller value can be chosen to strengthen the directional constraint; when the directional distribution is relatively dispersed, the value can be increased appropriately to avoid over-suppression. No limitation is imposed here.
[0247] The spatial distance adjustment parameter can be set according to the boundary width, the gradient distribution range, or the neighborhood radius; no specific limitation is imposed here.
[0248] Through the above weight definition, this invention assigns higher weights to pixels that are spatially close and have consistent edge directions during continuity reconstruction. This keeps the boundary orientation smooth during structure restoration and suppresses directional jumps introduced by local noise or pseudo-edges. As a result, the restored boundary preserves the true local structure while having a stable and coherent global shape.
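A minimal Python sketch of this weighted direction smoothing follows. The weight formula itself is not reproduced in the text, so the Gaussian kernels, the parameter names `sigma_s` and `sigma_theta`, and the doubled-angle averaging are assumptions. They only illustrate weights that fall off with spatial distance and orientation difference and sum to 1 over a non-empty neighborhood.

```python
import math

def angle_diff(a, b):
    """Smallest difference between two edge orientations (period pi)."""
    d = (a - b) % math.pi
    return min(d, math.pi - d)

def smooth_orientation(p, neighbors, sigma_s=2.0, sigma_theta=0.5):
    """p is ((x, y), theta); neighbors is a non-empty list of ((x, y), theta).

    Returns the normalized weights and the smoothed orientation.
    The Gaussian form of the kernels is an assumption.
    """
    (px, py), theta_p = p
    raw = []
    for (qx, qy), theta_q in neighbors:
        d2 = (qx - px) ** 2 + (qy - py) ** 2
        a = angle_diff(theta_p, theta_q)
        raw.append(math.exp(-d2 / (2 * sigma_s ** 2))
                   * math.exp(-a * a / (2 * sigma_theta ** 2)))
    total = sum(raw)
    weights = [r / total for r in raw]  # normalized: sum to 1
    # Average on the doubled angle so that theta and theta + pi agree.
    s = sum(w * math.sin(2 * th) for w, (_, th) in zip(weights, neighbors))
    c = sum(w * math.cos(2 * th) for w, (_, th) in zip(weights, neighbors))
    return weights, 0.5 * math.atan2(s, c)
```

A neighbor at the same distance but with a very different orientation receives a much smaller weight, which is the behavior the paragraph above describes.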
[0249] This invention restores the boundary structure by jointly using gradients and normals, resulting in a structure that is coherent, detailed, and effectively suppresses false edges. This restored structure serves as the boundary input for subsequent bright edge stripping and transparency inference, effectively reducing the impact of hard contours and structural damage on the matting quality.
[0250] In practical applications, this invention addresses issues such as bright edge diffusion and false foreground expansion that occur in backlit, highlight, or strongly reflective environments. It uses the brightness reflection field to replace and correct the original brightness signal, performs bright edge stripping on boundary areas affected by illumination interference, and restores the true foreground contour.
[0251] First, within the neighborhood of the foreground boundary, bright edge interference regions are identified by comparing the brightness component with the reflection field brightness. For a pixel p, with its brightness component and the reflection field brightness R(p), the brightness deviation is defined as:
[0252] ;
[0253] When the brightness deviation exceeds a set threshold and the pixel is located within the neighborhood of the foreground boundary, the pixel is determined to belong to the bright edge interference region. In this region, the original brightness is significantly affected by illumination and cannot stably reflect the true boundary structure.
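The detection rule can be sketched in Python as follows. The deviation formula is not reproduced in the text, so taking it as the absolute difference between the brightness component and the reflection field brightness is an assumption, as are the names `luma`, `reflect`, `near_boundary` and the threshold value.

```python
def bright_edge_mask(luma, reflect, near_boundary, threshold=0.2):
    """Flag pixels in the bright edge interference region.

    luma, reflect: 2D lists of brightness values in [0, 1].
    near_boundary: 2D list of booleans (pixel lies near the foreground boundary).
    The absolute-difference deviation is an assumed form.
    """
    mask = []
    for row_l, row_r, row_b in zip(luma, reflect, near_boundary):
        mask.append([bool(b) and abs(l - r) > threshold
                     for l, r, b in zip(row_l, row_r, row_b)])
    return mask
```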
[0254] To distinguish between true contour lighting and specular artifacts, the brightness of the reflection field in the brightness reflection field and material characteristics are introduced for joint judgment.
[0255] Given the reflection field brightness R(p) in the brightness reflection field and the material category weight, the credibility of the bright edge can be expressed as:
[0256] ;
[0257] where the material category weight characterizes differences in the reflectivity of different materials under strong light. It can be obtained from the analysis of local reflection modes, brightness distribution characteristics, and texture characteristics in the material / brightness reflection field. During bright edge stripping, different judgment strengths are used for diffusely reflecting and specularly reflecting materials, which reduces the risk of accidentally stripping the real foreground edge.
[0258] In the practical application of this invention, the differences in reflectivity are characterized by calculable brightness statistics, color saturation variation characteristics, and local morphology / texture directionality characteristics.
[0259] Specifically, during the construction of the brightness reflection field, statistical quantities such as kurtosis / sharpness are calculated for the brightness distribution in the pixel neighborhood to characterize whether the highlights are peaked; changes in saturation or chromaticity components are measured to characterize whitening and color compression under strong light conditions; and the anisotropy of the highlight morphology is characterized by methods such as the ratio of eigenvalues of the structure tensor. Based on these characteristics, a reflection mode score can be formed, and further differenced with the score of the main reference area to obtain the "reflection characteristic difference" index. This index is used to adjust the threshold or intensity of bright edge stripping and brightness correction, thereby achieving adaptive processing of the reflection differences of different materials under strong light conditions.
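Two of the statistics named above can be sketched in Python as follows. This is only an illustration under assumptions: `kurtosis` is the plain fourth standardized moment, and `anisotropy` uses the normalized coherence (l1 - l2) / (l1 + l2) of the structure tensor eigenvalues in place of a raw eigenvalue ratio. How the text combines such statistics into a reflection mode score is not specified, so no score is computed here.

```python
import math

def kurtosis(values):
    """Fourth standardized moment of a brightness neighborhood (0.0 if flat)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return 0.0
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (var ** 2)

def anisotropy(gradients):
    """Coherence of the 2x2 structure tensor built from (gx, gy) pairs.

    Returns a value in [0, 1]: 1 for a single dominant direction,
    0 for an isotropic neighborhood. Assumed stand-in for the
    eigenvalue ratio mentioned in the text.
    """
    jxx = sum(gx * gx for gx, _ in gradients)
    jyy = sum(gy * gy for _, gy in gradients)
    jxy = sum(gx * gy for gx, gy in gradients)
    disc = math.sqrt((jxx - jyy) ** 2 + 4 * jxy * jxy)
    l1 = (jxx + jyy + disc) / 2
    l2 = (jxx + jyy - disc) / 2
    return (l1 - l2) / (l1 + l2) if (l1 + l2) > 0 else 0.0
```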
[0260] Within the bright edge interference region, the original brightness is replaced and compressed using the reflection field brightness. The corrected brightness (the pixel brightness after brightness correction) is defined as:
[0261] ;
[0262] where the brightness suppression coefficient varies with the reflection field brightness and increases as that brightness grows.
[0263] This correction method compresses the brightness overflow caused by excessive reflection back to a range consistent with the main material, thereby suppressing abnormal expansion of the foreground region. For pixels whose brightness significantly exceeds the reflection field trend, their foreground confidence can be further reduced for subsequent foreground separation judgment.
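One possible shape of this correction is sketched below in Python. The correction formula is not reproduced in the text, so the linear pull toward the reflection field brightness, the clamped linear `suppression` map, and the `gain` parameter are all hypothetical. The sketch only preserves the two stated properties: the coefficient grows with the reflection field brightness, and the correction applies only inside the bright edge interference region.

```python
def suppression(r, gain=1.0):
    """Hypothetical monotone map from reflection brightness r to [0, 1]."""
    return min(max(gain * r, 0.0), 1.0)

def correct_brightness(l, r, in_bright_edge):
    """Compress original brightness l toward reflection field brightness r."""
    if not in_bright_edge:
        return l  # pixels outside the interference region are left unchanged
    beta = suppression(r)
    return l - beta * (l - r)  # beta = 1 replaces l by r entirely
```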
[0264] Then, based on the brightness reflection field and material category weights, bright edge stripping is performed to obtain the boundary region after bright edge stripping.
[0265] Specifically, bright edge stripping can be carried out by calculating the deviation between the brightness of each pixel and the reflection field brightness and marking pixels whose deviation exceeds the set threshold as the bright edge interference region. For that region, brightness correction is performed by combining the reflection field brightness and the material category weight to remove the influence of bright edge interference.
[0266] After the bright edge is stripped, the remaining boundary can be locally corrected to ensure the continuity and stability of the stripped boundary with the surrounding boundary.
[0267] It should be noted that the above process can be achieved through techniques such as brightness difference calculation, interpolation repair, and boundary optimization. Specific details can be adjusted according to the actual image characteristics.
[0268] After completing the bright edge stripping, the true outline of the foreground can be restored based on the reflection field intensity gradient and boundary variation trend. By combining the local color gradient field and the micro-normal perturbation field, the boundary region after bright edge stripping is jointly corrected to obtain the corrected boundary region. This ensures that the stripped outline is not affected by bright edge diffusion and also avoids boundary depressions or fragmentation caused by excessive compression.
[0269] The intensity gradient of the reflected field is a local variation calculated based on the brightness changes of the reflected field, used to describe the brightness variation trend of the foreground boundary. It can reflect the shape changes of the boundary and the influence of illumination on the boundary in different areas.
[0270] Specifically, the intensity gradient of the reflection field can be obtained by performing a local difference operation on the reflection field brightness R(p):
[0271] $\nabla R(p) = \left( \dfrac{\partial R(p)}{\partial x},\ \dfrac{\partial R(p)}{\partial y} \right)$;
[0272] where $\partial R(p)/\partial x$ and $\partial R(p)/\partial y$ are the gradient components of the reflection field brightness in the horizontal and vertical directions, respectively. This gradient indicates the intensity and direction of brightness changes within the boundary region and is used for subsequent boundary restoration and foreground recovery.
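The local difference operation can be sketched in Python with central differences, falling back to one-sided differences at the image border. The function name and the choice of central differences are assumptions; the text only requires a local difference on R(p).

```python
import math

def reflection_gradient(R, x, y):
    """Return (gx, gy, magnitude, direction) of R at column x, row y.

    R is a 2D list indexed as R[row][column].
    """
    h, w = len(R), len(R[0])
    x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
    y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
    gx = (R[y][x1] - R[y][x0]) / (x1 - x0) if x1 > x0 else 0.0
    gy = (R[y1][x] - R[y0][x]) / (y1 - y0) if y1 > y0 else 0.0
    return gx, gy, math.hypot(gx, gy), math.atan2(gy, gx)
```

The magnitude can then be compared against a preset threshold to decide whether the boundary has changed significantly, and the direction gives the boundary trend described in the following paragraphs.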
[0273] The boundary change trend refers to the direction and magnitude of change in the boundary of foreground objects in the image, which is determined by the reflection field intensity gradient and local image information. Specifically, the boundary change trend can be determined in the following two ways:
[0274] 1. Gradient change direction: The direction of the boundary trend is determined by the direction of the reflection field intensity gradient. That is, the brightness change trend of the boundary in the adjacent region determines the direction of the boundary.
[0275] 2. Boundary Change Amplitude: The greater the brightness change at the boundary, the more drastic the boundary change. The intensity of the boundary change can be quantified by calculating the amplitude of the reflected field intensity gradient. If the amplitude exceeds a preset threshold, the boundary is considered to have undergone a significant change.
[0276] The methods for restoring the true outline of the foreground based on the reflection field intensity gradient and boundary change trend include: for regions with broken boundaries, using interpolation algorithms (such as bilinear interpolation and cubic spline interpolation) to restore the missing boundary parts; inferring the location of the missing boundary based on the reflection field intensity gradient and boundary change trend; and using neighborhood information for interpolation filling.
[0277] Optionally, in another embodiment of the present invention, after boundary restoration, in order to ensure the smoothness of the foreground contour, a smoothing algorithm (such as Gaussian smoothing) is used to optimize the boundary to avoid unnatural jumps or discontinuous regions.
[0278] This invention imposes constraints on directional consistency and edge continuity, so that the restored foreground outline maintains a natural and coherent structure in both geometric shape and visual representation.
[0279] The final output boundary region has effectively removed bright edge artifacts caused by highlights, backlighting, or reflections. Its brightness distribution is consistent with the characteristics of the main material and matches the true foreground geometry. This result can be directly used as the boundary input for subsequent transparency calculations, avoiding problems such as white edges, halos, or false foreground expansion under complex lighting conditions.
[0280] In practical applications, this invention addresses common issues in fast-moving scenes such as edge clipping, ghosting, and cross-frame transparency jumps by utilizing motion boundary prediction fields to perform temporal compensation and consistency correction on edges, ensuring that boundaries in dynamic scenes remain stable and continuous in the time dimension.
[0281] First, within the neighborhood of the foreground boundary, the motion boundary prediction field is compared with the original optical flow result for consistency in order to locate motion anomaly regions.
[0282] Consider the current-frame boundary position (the target pixel position) and the predicted position (the third target motion boundary position) obtained by extrapolation from the previous frame. The temporal deviation between them is defined as:
[0283] Here, when the temporal deviation exceeds a set threshold, or when the transparency at the corresponding location changes abnormally abruptly between adjacent frames, the pixel is determined to belong to a motion anomaly region and requires temporal compensation. Motion anomaly regions include, but are not limited to, cut-off regions and ghosting regions.
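A minimal Python sketch of this anomaly test follows. The deviation formula is not reproduced in the text, so measuring it as the Euclidean distance between the current and predicted boundary positions is an assumption, and both thresholds are hypothetical values.

```python
import math

def temporal_deviation(current, predicted):
    """Euclidean distance between current and predicted positions (assumed form)."""
    return math.hypot(current[0] - predicted[0], current[1] - predicted[1])

def is_motion_anomaly(current, predicted, alpha_prev, alpha_curr,
                      dev_thresh=3.0, alpha_jump=0.5):
    """True if the position deviation or the frame-to-frame transparency
    change exceeds its (hypothetical) threshold."""
    return (temporal_deviation(current, predicted) > dev_thresh
            or abs(alpha_curr - alpha_prev) > alpha_jump)
```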
[0284] In the cut-off region, missing edge segments are filled in based on the position of the third target's motion boundary, resulting in a processed boundary region. By comparing the predicted boundary positions of previous and subsequent frames, the boundary points are extrapolated along the time axis to ensure that the missing edge segments remain continuous in time. In this process, the predicted boundary is used as the dominant information, and unstable color or brightness features are downweighted to ensure that the cut-off recovery is based on the motion trend.
[0285] Within the cut-off region, the boundary position provided by the motion boundary prediction field is used as the dominant information to fill in the missing edge segments in the time direction.
[0286] The completed boundary position (the processed boundary region) can be expressed as:
[0287] ; where the temporal compensation weight strengthens the dominance of the predicted boundary within the cut-off region. The degree of motion anomaly can be computed from optical flow vector differences, transparency residuals, or boundary position changes, and the value of the weight is adjusted accordingly: when the motion anomaly is large, the weight is increased to strengthen the dominant role of the predicted boundary; when the motion is stable, the weight is reduced to avoid overcompensation. No limitation is imposed here.
[0288] During the temporal completion of missing edge segments, the weight of unstable color or brightness features is reduced, so that boundary restoration is mainly based on motion trends, thereby avoiding false completion caused by the failure of appearance features.
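The temporal completion can be sketched in Python as a blend of the observed and predicted boundary positions. The completion formula is not reproduced in the text, so the convex combination, the clamped linear `completion_weight` map, and its `lo`, `hi`, and `scale` parameters are assumptions. They only illustrate a weight that grows with the degree of motion anomaly.

```python
def completion_weight(anomaly, lo=0.2, hi=0.9, scale=5.0):
    """Hypothetical map from a motion anomaly score to a weight in [lo, hi]."""
    t = min(max(anomaly / scale, 0.0), 1.0)
    return lo + (hi - lo) * t

def complete_boundary(observed, predicted, lam):
    """Blend positions; lam = 1 relies entirely on the predicted boundary."""
    return tuple(lam * pr + (1.0 - lam) * ob
                 for ob, pr in zip(observed, predicted))
```

With a large anomaly score the result moves toward the predicted boundary, and with stable motion it stays close to the observed one.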
[0289] For the trailing regions caused by motion blur, the motion direction and velocity characteristics in the prediction field are analyzed to identify residual edge pixels that are inconsistent with the main motion trend. Target pixels belonging to the trailing regions are then removed to obtain the processed boundary regions.
[0290] In the practical application of this invention, the true boundary position can also be re-estimated by combining the local color gradient field and the micro-normal perturbation field to avoid excessive edge shrinkage. This is not limited here.
[0291] The criteria for determining motion blur can be expressed as:
[0292] ; where the edge extension direction is the local extension direction of the boundary at pixel p within the image plane and characterizes the spatial orientation of the edge. It can be determined from the directional information provided by the local color gradient field or the micronormal perturbation field, and it is usually consistent with the boundary tangent direction. Specifically, the local color gradient field provides the gradient vector at pixel p, whose direction is consistent with the edge normal direction. The edge extension direction (tangential direction) can therefore be obtained by an orthogonal transformation of the gradient direction, that is:
[0293] ; where the two components are the components of the micronormal perturbation field at pixel p, which are calculated from the gradient information of the image and reflect the orientation of the boundary or object surface in the image.
[0294] Specifically, the micronormal perturbation field can be derived from the local color gradient field. The calculation is performed using the following formula:
[0295] ;
[0296] where the gradient components of the image represent the direction of pixel brightness change and are used to extract boundaries, and the normal direction obtained from them describes the position and orientation of the boundary or object surface.
[0297] Similarly, the micronormal perturbation field provides a local geometric normal trend vector. The corresponding extension direction is also obtained through orthogonal transformation.
[0298] It can be understood that when both types of directional information are available, they can be weighted and fused according to their respective confidence levels.
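The orthogonal transformation and the confidence-weighted fusion can be sketched in Python as follows. The 90-degree rotation is one of the two possible sign conventions, and the sign alignment before averaging is an added assumption, needed because a tangent and its negation describe the same edge direction.

```python
import math

def tangent_from_gradient(gx, gy):
    """Rotate the (normal-aligned) gradient by 90 degrees and normalize."""
    n = math.hypot(gx, gy)
    if n == 0:
        return (0.0, 0.0)  # no reliable direction at a flat pixel
    return (-gy / n, gx / n)

def fuse_tangents(t_a, conf_a, t_b, conf_b):
    """Confidence-weighted fusion of two unit tangent estimates."""
    if t_a[0] * t_b[0] + t_a[1] * t_b[1] < 0:
        t_b = (-t_b[0], -t_b[1])  # align signs before averaging
    fx = conf_a * t_a[0] + conf_b * t_b[0]
    fy = conf_a * t_a[1] + conf_b * t_b[1]
    n = math.hypot(fx, fy)
    return (fx / n, fy / n) if n > 0 else t_a
```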
[0299] Temporal continuity correction is performed on the processed boundary region to obtain the corrected boundary region, which maintains the consistency of position, orientation and rate of change between adjacent frames.
[0300] The boundary position after temporal smoothing is defined as:
[0301] ; where the temporal weighting coefficients measure the contribution of adjacent frames to the boundary position of the current frame and are defined as:
[0302] ; where the index is the frame offset relative to the current frame t, used to weight the values of several frames before and after the current frame. The boundary positions entering the formula are the temporally completed boundary position of the offset frame and the completed boundary position of the current frame. The time-distance adjustment parameter controls the range of temporal smoothing, and the boundary position difference adjustment parameter suppresses frames with large spatial offsets. The numerator is the joint similarity composed of temporal distance and boundary position consistency, and the denominator normalizes the weights so that they sum to 1. The temporal neighborhood window size defines the range of adjacent frames that take part in temporal smoothing, so the offset takes only frame indices within this finite window. Setting a finite time window prevents boundary information that is too distant in time from unduly influencing the current frame.
[0303] The time-distance adjustment parameter can be determined from the average motion velocity of the boundary region or from inter-frame motion statistics: when the motion velocity is high, it should be reduced appropriately to avoid cross-frame fusion; when the motion is relatively smooth, it can be increased appropriately. No limitation is imposed here. The boundary position difference adjustment parameter can be determined adaptively from the statistical distribution of boundary position changes or from the variance of optical flow residuals, and is likewise not limited here.
[0304] With the above definition, the temporal smoothing gives higher weight to frames that are closer in time to the current frame and whose boundary positions change less. This avoids cross-frame mispropagation and reduces the boundary jitter and jumps caused by prediction errors or momentary occlusion.
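A minimal Python sketch of this temporal smoothing follows. The weight formula is not reproduced in the text, so the Gaussian kernels and the parameter names `sigma_t` and `sigma_b` are assumptions. The window is represented by the set of offsets passed in, which must include offset 0 for the current frame.

```python
import math

def temporal_smooth(positions, sigma_t=1.5, sigma_b=4.0):
    """positions maps frame offset k -> (x, y) completed boundary position.

    Returns the normalized weights per offset and the smoothed position.
    Gaussian kernels over time distance and position difference are assumed.
    """
    cx, cy = positions[0]
    raw = {}
    for k, (x, y) in positions.items():
        d2 = (x - cx) ** 2 + (y - cy) ** 2
        raw[k] = (math.exp(-k * k / (2 * sigma_t ** 2))
                  * math.exp(-d2 / (2 * sigma_b ** 2)))
    total = sum(raw.values())
    weights = {k: v / total for k, v in raw.items()}  # sum to 1
    sx = sum(weights[k] * positions[k][0] for k in positions)
    sy = sum(weights[k] * positions[k][1] for k in positions)
    return weights, (sx, sy)
```

A frame whose boundary position is far from the current one receives a negligible weight, so a single mispredicted frame barely moves the smoothed position.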
[0305] S102. Based on the boundary proxy data of each dimension, generate candidate transparency values for the target pixels corresponding to each dimension.
[0306] The transparency candidate value of the target pixel can be obtained through local comparison or consistency analysis based on the structural, brightness, or temporal information provided by the corresponding boundary proxy field at pixel p. Its value reflects how strongly that feature supports assigning the pixel to the foreground.
[0307] Specifically, based on the response of the corresponding proxy field and its local neighborhood information, the candidate value can be calculated through structural consistency analysis, brightness contrast analysis, or temporal residual propagation. For example, based on the local color gradient field, a structural transparency candidate value can be obtained by mapping the gradient magnitude; based on the brightness reflection field, a brightness-corrected candidate value can be obtained by mapping the brightness difference; and based on the motion boundary prediction field, a temporal candidate value can be obtained by cross-frame residual propagation. No specific limitation is imposed here.
[0308] S103. Generate the transparency estimate of the target pixel based on the transparency candidate values of the target pixel corresponding to all dimensions.
[0309] In the practical application of this invention, the transparency estimate of the target pixel can be calculated using the following formula: $\alpha(p) = \sum_{i} w_i(p)\,\alpha_i(p)$;
[0310] where $\alpha_i(p)$ is the transparency candidate value of the target pixel p for dimension i, $w_i(p)$ is the weight corresponding to the boundary proxy data of dimension i, and i ranges over the four boundary proxy fields: the local color gradient field, the brightness reflection field, the micronormal perturbation field, and the motion boundary prediction field.
[0311] The weight $w_i(p)$ is obtained by normalizing the confidence scores of the boundary proxy data of each dimension:
[0312] $w_i(p) = \dfrac{c_i(p)}{\sum_{j} c_j(p)}$; where the denominator sums the confidences of all boundary proxy fields for normalization, and $c_i(p)$ is the confidence of the boundary proxy data of dimension i at the target pixel p, which measures the reliability of the boundary information that the proxy field provides at that location. Comparing the magnitudes of the confidences identifies the dominant feature source at the pixel, which is used for the subsequent allocation of transparency estimation weights.
[0313] It should be noted that the confidence levels are not preset constants but are calculated from the response intensity, local consistency, or statistical stability of each proxy field. For example, the confidence of the local color gradient field can be determined from the gradient magnitude and direction consistency; the confidence of the brightness reflection field can be determined from the reflection field brightness and the material weight; the confidence of the micronormal perturbation field can be determined from the local normal stability; and the confidence of the motion boundary prediction field can be determined from motion consistency or optical flow residuals. No limitation is imposed here.
[0314] Through the above, the present invention automatically reduces the influence of the original brightness features in the bright edge region, increases the weight of the motion prediction field in the motion cutoff region, and enhances the weight of the micro-normal perturbation field in the detail region.
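The confidence-normalized fusion can be sketched in Python as follows. The dictionary keys and the uniform fallback for all-zero confidences are assumptions made for illustration; the text specifies only that the weights come from normalizing per-field confidences.

```python
def fuse_transparency(candidates, confidences):
    """Fuse per-field transparency candidates at one pixel.

    candidates, confidences: dicts keyed by proxy field name.
    Returns the normalized weights and the fused transparency in [0, 1].
    """
    total = sum(confidences.values())
    if total <= 0.0:
        # Assumed fallback: no field is trusted, so weight them equally.
        weights = {k: 1.0 / len(candidates) for k in candidates}
    else:
        weights = {k: confidences[k] / total for k in candidates}
    alpha = sum(weights[k] * candidates[k] for k in candidates)
    return weights, min(max(alpha, 0.0), 1.0)
```

In a bright edge region, for example, a low confidence for the brightness-based candidate shrinks its weight automatically, so the other fields dominate the estimate.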
[0315] In the practical application of this invention, when there is a significant conflict between the candidate transparency values given by different features, they can be distinguished by local continuity.
[0316] The continuity cost of a transparency candidate is defined as:
[0317] For features with high continuity costs, their corresponding weights are reduced. This increases the proportion of features with better continuity in the fusion process, thereby avoiding instability or jumps in transparency estimation caused by conflicts between multiple sources of features.
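One way to realize this reweighting is sketched below in Python. The continuity cost formula is not reproduced in the text, so the mean absolute difference to neighboring candidate values, the exponential penalty, and the temperature `tau` are all assumptions. The sketch only preserves the stated behavior: a higher cost lowers the weight, and the weights are renormalized.

```python
import math

def continuity_cost(alpha_p, alpha_neighbors):
    """Mean absolute difference between a candidate and its neighbors (assumed)."""
    return sum(abs(alpha_p - a) for a in alpha_neighbors) / len(alpha_neighbors)

def reweight_by_continuity(weights, costs, tau=0.1):
    """Down-weight fields with high continuity cost, then renormalize."""
    raw = {k: weights[k] * math.exp(-costs[k] / tau) for k in weights}
    total = sum(raw.values())
    return {k: v / total for k, v in raw.items()}
```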
[0318] In the practical application of this invention, after the initial transparency value is obtained, local smoothing can be applied to the edge regions to suppress noise while enhancing the transparency change along the main direction of the boundary.
[0319] The smoothed transparency can be expressed as: ;
[0320] where the transparency smoothing weight can be defined in the same way as the aforementioned spatial weight, that is, set according to the spatial distance between pixels and the consistency of the boundary direction, which is not elaborated here.
[0321] This invention introduces geometric direction information provided by a micro-normal perturbation field in the detail region to guide the transparency variation along the structural direction, preventing excessive averaging of fine structures. After the above processing, the resulting initial transparency map exhibits good continuity and structural consistency in the edge region and remains stable under complex conditions such as specular highlights, compression artifacts, and rapid motion. This initial transparency value serves as the input for subsequent cross-frame stabilization and global transparency fusion, providing a reliable foundation for the final foreground separation result.
[0322] S104. Generate a transparency field based on the transparency estimates of all target pixels.
[0323] Understandably, in dynamic scenes, even if the transparency estimate has high spatial structure accuracy, it may still flicker, jump, or drift in the temporal dimension due to changes in illumination, rapid movements, or local edge instability. This invention introduces cross-frame consistency constraints to perform temporal smoothing and structural correction on the transparency estimate, ensuring that the transparency field remains continuous and stable between adjacent frames.
[0324] Let the current frame be frame t, with its initial transparency value obtained as described above.
[0325] First, calculate the transparency residual between the current frame and the previous frame. By propagating the transparency residual along the direction of motion, the transparency changes continuously with the movement of the subject, reducing transient flicker caused by pixel-level errors.
[0326] Within the edge region, the changes in transparency are smoothed under structural constraints.
[0327] For pixels located near the boundary, transparency is smoothed only in the direction aligned with the boundary normal to avoid cross-edge diffusion that could cause a soft or trailing edge. The directional smoothing can be expressed as:
[0328] ; where the neighborhood is the set of pixels neighboring the target pixel along the boundary normal direction, and the transparency smoothing weight can be defined in the same way as the aforementioned spatial weight, which is not elaborated here.
[0329] When there is a significant abrupt change in transparency between adjacent frames (such as a sudden contraction or expansion of edges due to changes in brightness), multi-frame historical information is introduced to constrain the range of transparency changes:
[0330] The transparency is fused by weighting over the temporal neighborhood:
[0331] ; where the weight suppresses the influence of frames with a large time span or a significant spatial offset on the current transparency. In this way, transparency changes stay on a continuous trajectory, avoiding "jagged" or abrupt contraction and expansion of the boundary.
[0332] For regions where transparency judgment is unstable due to blurring, occlusion, or local reflection, temporal compensation for transparency is performed by combining information from the gradient field, reflection field, and motion boundary prediction field.
[0333] Specifically, for regions where transparency judgment is unstable due to blurring, occlusion, or local reflection, the weight of temporal consistency constraints is increased, making transparency more dependent on cross-frame structural stability rather than single-frame instantaneous features, thereby reducing temporal noise and jitter.
[0334] After the above-mentioned temporal stabilization and residual propagation processing, the transparency field obtained by this invention maintains good continuity and structural consistency in cross-frame changes. Its boundary changes are consistent with the movement trend of the main body, which can effectively suppress transparency flickering, jumping and drifting phenomena caused by fast action or complex lighting conditions, and provide stable and reliable transparency input for real-time keying and subsequent compositing processes.
[0335] Furthermore, the transparency field after time-stabilized processing can be globally fused and optimized at the pixel level, so that the output foreground transparency remains continuous and natural in both spatial and temporal dimensions, and meets the requirements of actual keying compositing.
[0336] Specifically, the initial transparency value, time-series stabilization results, and structural constraint information provided by boundary proxy data are comprehensively utilized to generate the final transparency field through global weight integration. Within different regions, the dominance of various constraints is adaptively adjusted based on the boundary structural characteristics.
[0337] In regions with clear structures, the structural information is mainly provided by the local color gradient field and the micro-normal perturbation field; in regions with high reflectivity or bright edges, the constraints are mainly provided by the brightness reflection field; in regions with rapid movement, the temporal consistency constraint of the motion boundary prediction field is introduced to correct the transparency, but no restrictions are imposed here.
[0338] In this way, different types of boundary problems are supported by the most suitable feature information.
[0339] During the global fusion process, boundary consistency constraints are applied to the transparency fields inferred from different features to keep the transparency distribution aligned with the restored boundary structure.
[0340] For example, when a slight shift or local discontinuity in transparency is detected near the boundary, it is corrected through mechanisms such as directional consistency and curvature smoothing to ensure that the boundary shape is stable and continuous, and to avoid edge skipping or jaggedness.
[0341] For isolated noise pixels or outliers that may exist in the transparency field, they can be detected and suppressed at the pixel level to reduce edge flickering or foreground holes caused by noise. At the same time, in the detailed areas, the directional information provided by the micro-normal perturbation field is combined to moderately enhance the transparency changes, so that fine structures such as hair strands and clothing wrinkles are clearly preserved in the final transparency.
[0342] After completing local optimization, spatial continuity checks can be performed on the entire transparency field to ensure that there are no unnatural abrupt changes in transparency across the entire image. Simultaneously, by combining the transparency change trends of adjacent frames, temporal consistency corrections are applied to the global results, ensuring smooth and stable transparency changes in dynamic scenes and meeting the visual quality requirements for full-scale output.
[0343] After the above global fusion and optimization processing, the resulting transparency field exhibits stable performance in terms of spatial structure, temporal continuity, and boundary visual quality. It can be directly used in application scenarios such as green screen keying, foreground separation, and real-time compositing, significantly improving the keying effect under complex lighting and fast motion conditions.
[0344] As can be seen from the above scheme, the present invention provides a method for generating transparency fields. Under the premise of keeping the main structure of the existing keying process unchanged, by constructing multi-dimensional boundary proxy data, the foreground boundary is given a replaceable and correctable structured description, so that the restoration, separation and transparency inference of the edge region can be completed in a more stable and robust manner, thereby significantly reducing the edge line perception and improving the keying quality in dynamic scenes.
[0345] Another embodiment of the present invention provides an apparatus for generating a transparency field which, as shown in Figure 2, specifically includes:
[0346] A boundary proxy data construction unit 201, configured to construct boundary proxy data based on target pixels in the image.
[0347] The boundary proxy data includes a local color gradient field, a brightness reflection field, a micro-normal perturbation field, and a motion boundary prediction field.
[0348] A transparency candidate value generation unit 202, configured to generate, based on the boundary proxy data of each dimension, a transparency candidate value of the target pixel corresponding to that dimension.
[0349] A transparency estimation generation unit 203, configured to generate a transparency estimate of the target pixel based on the transparency candidate values of the target pixel corresponding to all dimensions.
[0350] A transparency field generation unit 204, configured to generate a transparency field based on the transparency estimates of all target pixels.
[0351] For the specific operation of the units disclosed in the above embodiment of the present invention, refer to the corresponding method embodiment, as shown in Figure 1; the details are not repeated here.
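The division of work among units 201 to 204 can be sketched in Python as below. This is a structural sketch only: the class name, the dimension keys in `DIMENSIONS`, the caller-supplied `build_proxy` function, the clamping used for candidate values, and the weighted average used for fusion are all assumptions introduced for illustration, not the computation defined by the method embodiments.

```python
from typing import Callable, Dict, List

Image = List[List[float]]
Proxy = Dict[str, float]  # one proxy value per dimension for a single pixel

# Hypothetical keys for the four boundary proxy dimensions.
DIMENSIONS = ("color_gradient", "brightness_reflection",
              "micro_normal", "motion_boundary")


def clamp01(v: float) -> float:
    return max(0.0, min(1.0, v))


class TransparencyFieldGenerator:
    def __init__(self, build_proxy: Callable[[Image, int, int], Proxy],
                 weights: Dict[str, float]) -> None:
        self.build_proxy = build_proxy  # assumed to return one value per dimension
        self.weights = weights          # assumed fusion weights, not from the source

    def construct_proxy(self, image: Image, y: int, x: int) -> Proxy:
        """Role of unit 201: boundary proxy data for one target pixel."""
        return self.build_proxy(image, y, x)

    def candidates(self, proxy: Proxy) -> Dict[str, float]:
        """Role of unit 202: one transparency candidate value per dimension."""
        return {d: clamp01(proxy[d]) for d in DIMENSIONS}

    def estimate(self, cands: Dict[str, float]) -> float:
        """Role of unit 203: fuse the candidates (weighted mean is an assumption)."""
        total = sum(self.weights[d] for d in DIMENSIONS)
        return sum(self.weights[d] * cands[d] for d in DIMENSIONS) / total

    def generate(self, image: Image) -> List[List[float]]:
        """Role of unit 204: assemble the per-pixel estimates into a field."""
        return [
            [self.estimate(self.candidates(self.construct_proxy(image, y, x)))
             for x in range(len(image[0]))]
            for y in range(len(image))
        ]
```

Keeping each unit as a separate method mirrors the apparatus structure, so an individual stage could be replaced without touching the others.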
[0352] As can be seen from the above solutions, the present invention provides an apparatus for generating a transparency field. While keeping the main structure of the existing keying process unchanged, the apparatus constructs multi-dimensional boundary proxy data to give the foreground boundary a replaceable and correctable structured description, and generates the transparency field based on that data, which significantly improves the quality and dynamic stability of the transparency field.
[0353] Another embodiment of the present invention provides an electronic device, comprising:
[0354] One or more processors.
[0355] A storage device on which one or more programs are stored.
[0356] When the one or more programs are executed by the one or more processors, the one or more processors implement the method for generating transparency fields as described in the above embodiments.
[0357] Another embodiment of the present invention provides a computer storage medium storing a computer program thereon, wherein the computer program, when executed by a processor, implements the method for generating a transparency field as described in the above embodiments.
[0358] Another embodiment of the present invention provides a computer program product, which, when executed, is used to perform the above-described method for generating a transparency field.
[0359] In particular, according to embodiments of the present invention, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, embodiments of the present invention include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via a communication device, or installed from a storage device, or installed from a ROM. When the computer program is executed by a processing device, it performs the functions defined in the methods of the embodiments of the present invention.
[0360] Although the subject matter has been described using language specific to structural features and / or methodological logic, it should be understood that the subject matter defined in this invention is not necessarily limited to the specific features or actions described above. Rather, the specific features and actions described above are merely exemplary forms for implementing the invention.
[0361] While several specific implementation details are included in the foregoing discussion, these should not be construed as limiting the scope of the invention. Certain features described in the context of individual embodiments may also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment may also be implemented individually or in any suitable sub-combination in multiple embodiments.
[0362] The above description is merely a preferred embodiment of the present invention and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of the invention is not limited to technical solutions formed by the specific combination of the above-described technical features, but also covers other technical solutions formed by any combination of the above-described technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above-described features with (but not limited to) technical features having similar functions disclosed in the present invention.
Claims
1. A method for generating a transparency field, characterized in that it comprises: constructing boundary proxy data based on target pixels in an image, wherein the boundary proxy data includes a local color gradient field, a brightness reflection field, a micro-normal perturbation field, and a motion boundary prediction field; generating, based on the boundary proxy data of each dimension, a transparency candidate value of the target pixel corresponding to that dimension; generating a transparency estimate of the target pixel based on the transparency candidate values of the target pixel corresponding to all the dimensions; and generating a transparency field based on the transparency estimates of all the target pixels.
2. The method for generating a transparency field according to claim 1, characterized in that constructing the boundary proxy data based on the target pixels in the image comprises: constructing the local color gradient field based on the intensity changes of the color components of the target pixel in the horizontal and vertical directions; constructing the brightness reflection field based on the diffuse reflection component and the specular reflection component of the target pixel; constructing the micro-normal perturbation field based on the intensity variation of the structural components of the target pixel in the horizontal direction and the local intensity variation in the vertical direction; and constructing the motion boundary prediction field based on the position information of the target pixel and the optical flow displacement of the target pixel between adjacent frames.
3. The method for generating a transparency field according to claim 2, characterized in that it further comprises: determining, based on the local color gradient field and the average gradient of the neighborhood of the target pixel, whether the target pixel is located in an abnormally compressed region, wherein the local color gradient field includes a local gradient and a gradient direction; if the target pixel is located in an abnormally compressed region, performing continuity repair on the local gradient to obtain a restored gradient, and applying a continuity constraint to the gradient direction to obtain a corrected gradient direction; and obtaining a corrected local color gradient field based on the restored gradient and the corrected gradient direction.
4. The method for generating a transparency field according to claim 2, characterized in that it further comprises: determining, based on the brightness gradient of the target pixel, whether the target pixel is a brightness anomaly point, wherein the brightness gradient is determined based on the brightness component and the average brightness component of the neighboring pixels; if the target pixel is a brightness anomaly point, correcting the brightness reflection field based on the reflection field brightness to obtain a brightness-corrected brightness reflection field, wherein the reflection field brightness is determined based on the brightness gradient of the target pixel and the diffuse reflection component and specular reflection component of the target pixel; and performing continuity processing on the brightness-corrected brightness reflection field to obtain a corrected brightness reflection field.
5. The method for generating a transparency field according to claim 2, characterized in that it further comprises: performing, with the target pixel as the center, a consistency evaluation on the orientation differences of all neighboring pixels to obtain a consistency evaluation result; and if the consistency evaluation result indicates a directional perturbation at the position of the target pixel, performing neighborhood direction consistency correction based on the edge direction of the target pixel to obtain a corrected edge direction, wherein the edge direction of the target pixel is determined based on the local intensity change in the horizontal direction and the local intensity change in the vertical direction of the structural features of the target pixel.
6. The method for generating a transparency field according to claim 2, characterized in that it further comprises: determining a motion boundary candidate position based on the optical flow displacement of the target pixel between adjacent frames; determining a first target motion boundary position based on the previous-frame boundary position of the motion boundary candidate position and the optical flow displacement; determining, based on the first target motion boundary position and the motion boundary candidate position, whether the motion boundary candidate position is an abnormal boundary region; and if the motion boundary candidate position is an abnormal boundary region, reconstructing the motion boundary position based on the first target motion boundary position and a second target motion boundary position, wherein the second target motion boundary position is determined based on the next-frame boundary position of the motion boundary candidate position and the optical flow displacement.
7. The method for generating a transparency field according to claim 2, characterized in that it further comprises: determining, based on the local color gradient field and the original gradient, whether the target pixel is located in a structural fracture region; if the target pixel is located in a structural fracture region, determining a restored direction angle based on the boundary direction angle provided by the local color gradient field at the target pixel and the local principal direction angle of the boundary at the target pixel; determining a direction constraint term based on the micro-normal vector provided by the micro-normal perturbation field and a unit direction vector, wherein the unit direction vector is determined based on the local color gradient field; and performing local reconstruction based on the restored direction angle and the direction constraint term to obtain a reconstructed local boundary region.
8. The method for generating a transparency field according to claim 4, characterized in that it further comprises: determining, based on the reflection field brightness and the original brightness, whether the target pixel belongs to a bright edge interference area; if the target pixel belongs to the bright edge interference area, correcting the original brightness based on the reflection field brightness to obtain a brightness-corrected pixel brightness; performing bright edge stripping based on the brightness reflection field and material category weights to obtain a boundary region after bright edge stripping; and correcting the boundary position based on the boundary region after bright edge stripping, the local color gradient field, and the micro-normal perturbation field to obtain a corrected boundary region.
9. The method for generating a transparency field according to claim 2, characterized in that it further comprises: determining, based on the target pixel position and a third target motion boundary position, whether the target pixel belongs to a motion abnormal region, wherein the third target motion boundary position is determined based on the previous-frame boundary position of the target pixel position and the optical flow displacement, and the motion abnormal region includes a cut-off region and a ghosting region; if the target pixel belongs to the cut-off region, filling in the missing edge segment based on the third target motion boundary position to obtain a processed boundary region; if the target pixel belongs to the ghosting region, removing the target pixel to obtain the processed boundary region; and performing temporal continuity correction on the processed boundary region to obtain a corrected boundary region.
10. An apparatus for generating a transparency field, characterized in that it comprises: a boundary proxy data construction unit, configured to construct boundary proxy data based on target pixels in an image, wherein the boundary proxy data includes a local color gradient field, a brightness reflection field, a micro-normal perturbation field, and a motion boundary prediction field; a transparency candidate value generation unit, configured to generate, based on the boundary proxy data of each dimension, a transparency candidate value of the target pixel corresponding to that dimension; a transparency estimation generation unit, configured to generate a transparency estimate of the target pixel based on the transparency candidate values of the target pixel corresponding to all the dimensions; and a transparency field generation unit, configured to generate a transparency field based on the transparency estimates of all the target pixels.
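For illustration only, and not as part of the claims, the abnormal-boundary handling recited in claim 6 might be sketched in Python as follows. The first target position is taken as the previous-frame boundary position advanced by the optical flow displacement, and the second as the next-frame boundary position moved back by the same displacement. The function names, the Euclidean distance test with `tolerance`, the reuse of one flow vector in both directions, and the averaging of the two predictions are assumptions; the claim does not fix these details.

```python
from math import hypot
from typing import Tuple

Point = Tuple[float, float]  # (x, y) position in pixels


def predict_forward(prev_boundary: Point, flow: Point) -> Point:
    """First target position: previous-frame boundary plus optical flow displacement."""
    return (prev_boundary[0] + flow[0], prev_boundary[1] + flow[1])


def predict_backward(next_boundary: Point, flow: Point) -> Point:
    """Second target position: next-frame boundary minus the same displacement (assumed)."""
    return (next_boundary[0] - flow[0], next_boundary[1] - flow[1])


def reconstruct_boundary(candidate: Point, prev_boundary: Point,
                         next_boundary: Point, flow: Point,
                         tolerance: float = 2.0) -> Point:
    """Keep the candidate if it agrees with the forward prediction, else rebuild it."""
    first = predict_forward(prev_boundary, flow)
    deviation = hypot(candidate[0] - first[0], candidate[1] - first[1])
    if deviation <= tolerance:
        return candidate  # not treated as an abnormal boundary region
    second = predict_backward(next_boundary, flow)
    # Assumed reconstruction rule: midpoint of the two predicted positions.
    return ((first[0] + second[0]) / 2, (first[1] + second[1]) / 2)
```

A candidate that lies within `tolerance` pixels of the forward prediction is returned unchanged; otherwise the position is rebuilt from both temporal neighbors.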