Image recognition-based automatic optimization method and system for clothing matching
By performing semantic segmentation and illumination transformation on user human body images and clothing item images, scene images under different lighting conditions are generated, and color stability and harmony scores are calculated. This solves the problem of insufficient evaluation of lighting environment changes and clothing color interaction in existing technologies, and improves the accuracy of clothing matching recommendations and user experience.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- QINSILK COM
- Filing Date
- 2026-03-20
- Publication Date
- 2026-06-19
AI Technical Summary
Existing clothing matching recommendation systems lack consideration for changes in lighting environment when evaluating the effectiveness of clothing matching, resulting in insufficient universality and robustness of the recommended solutions in real-world application scenarios. Furthermore, they fail to fully consider the color interaction and visual contrast effects of clothing when actually worn on the human body, leading to room for improvement in the accuracy of recommendation results and user experience.
By acquiring user images and candidate clothing item images, semantic segmentation and spatial deformation are performed to generate synthetic clothing images. Multiple color temperature lighting transformations are applied to extract color feature vectors of clothing areas, calculate color stability and coordination scores, and comprehensively evaluate and select clothing matching schemes with high scores.
It enables objective evaluation of clothing colors in complex environments, enhances the personalization, rationality, and practicality of clothing recommendations, and improves the realism of the user experience and the accuracy of matching schemes.
Figure CN122243609A_ABST
Description
Technical Field
[0001] This invention relates to image processing technology, and more particularly to an automatic optimization method and system for clothing matching based on image recognition.
Background Technology
[0002] In existing clothing matching recommendation systems, the conventional approach typically relies on analyzing the static attributes of individual clothing items. These systems often extract features such as color, texture, and style from clothing images, combine them with a pre-set fashion rule library or user historical preference data, perform matching calculations, and generate matching suggestions. Another common approach is to build virtual fitting models, which overlay two-dimensional clothing images onto a standard human body model through simple geometric transformations to generate preliminary composite images for users to preview. These technical solutions constitute the mainstream implementation path in this field.
[0003] However, these conventional approaches have significant limitations. First, when evaluating a clothing combination, existing methods focus on the inherent properties of individual garments or on their appearance under ideal standard lighting, and do not consider the visual stability of the combination under varying lighting conditions. Clothing colors change significantly under different color temperatures; an outfit that appears harmonious under warm indoor lighting may look jarring or discordant under cool outdoor lighting. Current recommendation logic generally ignores this lighting-induced difference in color perception, so the recommended combinations lack universality and robustness in real-world applications. Second, existing technologies judge the overall color harmony of a multi-garment combination mechanically, typically with fixed color theory formulas (such as color wheel angles). This fails to account for the complex color interactions and visual contrasts that arise when clothing is actually worn, caused by contact, occlusion, and body posture between adjacent areas of different garments (such as collars and outerwear, or cuffs and bottoms). Because this evaluation is detached from the actual spatial relationships of worn clothing, it cannot accurately reflect how harmonious the outfit looks in real visual presentation, which leaves room for improvement in recommendation accuracy and user satisfaction.
Summary of the Invention
[0004] This invention provides an automatic clothing matching optimization method and system based on image recognition, which can solve the problems in the prior art.
[0005] A first aspect of this invention provides an automatic clothing matching optimization method based on image recognition, comprising: acquiring a user's human body image and candidate clothing item images, performing semantic segmentation on the user's human body image to obtain the human body contour region, spatially deforming the candidate clothing item images based on the human body contour region, and superimposing the deformed candidate clothing item images onto the user's human body image to generate a composite wearing image; applying lighting transformations of various color temperatures to the composite wearing image to generate scene images under different lighting conditions; extracting the color feature vectors of the clothing area from the composite wearing image and from each scene image, calculating the distance value between the color feature vector of each scene image and the color feature vector of the composite wearing image, and generating a color stability score for each clothing item based on the distance values; extracting the color values of adjacent areas between multiple clothing items in the composite wearing image, calculating the contrast and saturation differences between the color values of adjacent clothing items, and generating a color coordination score for the clothing item combination based on the contrast and saturation differences; and comprehensively evaluating the candidate clothing item images based on the color stability score and the color coordination score, and selecting clothing item combinations with scores higher than a preset value to output clothing matching schemes.
[0006] Acquiring the user's human body image and candidate clothing item images, performing semantic segmentation on the user's human body image to obtain the human body contour region, spatially deforming the candidate clothing item images based on the human body contour region, and superimposing the deformed candidate clothing item images onto the user's human body image to generate a composite wearing image includes: acquiring the user's human body image and the candidate clothing item images, performing pose detection on the user's human body image to identify human body joints, correcting the pose of the user's human body image based on the human body joints to obtain a corrected user human body image, and performing background separation on the candidate clothing item images to obtain the main clothing image; extracting multi-scale feature maps from the corrected user human body image and fusing them to obtain a fused feature map, then performing pixel-level classification on the fused feature map to obtain the human body contour region; extracting the boundary contour line of the human body contour region, performing curvature analysis on the boundary contour line to determine control points, establishing the mapping relationship between the control points and the corresponding points of the main clothing image, calculating the spatial transformation matrix, and using the spatial transformation matrix to perform pixel remapping on the main clothing image to obtain the deformed candidate clothing item image; and generating a depth mask based on the human body contour region, using the depth mask to determine the front-to-back occlusion relationship between the deformed candidate clothing item image and the corrected user human body image, and performing pixel fusion according to the front-to-back occlusion relationship to generate the composite wearing image.
[0007] Curvature analysis is performed on the boundary contour to determine control points. A mapping relationship is established between the control points and corresponding points in the main garment image. The spatial transformation matrix is calculated, and pixel remapping is performed on the main garment image using the spatial transformation matrix to obtain deformed candidate garment images, including: Discrete curvature calculation is performed on the pixels on the boundary contour line to obtain the curvature distribution sequence. The curvature distribution sequence is then filtered and smoothed. Curvature extreme points are identified based on the smoothed curvature distribution sequence. The curvature extreme points are then clustered and grouped to determine the control points that characterize the human contour features. Extract the edge contour of the main body image of the clothing and calculate its curvature. Identify the corresponding control points of the main body image of the clothing. Establish a point-to-point mapping relationship based on the spatial positional relationship between the control points representing the human body contour area and the corresponding control points of the main body image of the clothing. Based on the point-to-point mapping relationship, a deformation energy function reflecting the degree of deformation is constructed. The spatial transformation matrix is solved by minimizing the deformation energy function. The spatial transformation matrix is then applied to the pixel coordinates of each pixel in the main image of the clothing to calculate the transformed target coordinates. Based on the transformed pixel coordinates, pixel values are sampled from the main image of the clothing and filled into the positions corresponding to the transformed pixel coordinates to complete pixel remapping and obtain the deformed candidate clothing item image.
[0008] Applying various color temperature lighting transformations to the synthetic clothing image generates multiple scene images under different lighting conditions, including: Extract the luminance and chrominance components of the synthesized clothing image, and perform semantic segmentation on the synthesized clothing image to identify the clothing region and skin region; Extract the fabric texture features of the clothing area and the surface reflection features of the skin area, and calculate the color temperature sensitivity coefficients of the clothing area and the skin area respectively based on the fabric texture features and surface reflection features; Multiple color temperature parameters are generated based on a preset color temperature range. Each color temperature parameter is multiplied by the color temperature sensitivity coefficient of the clothing area and the color temperature sensitivity coefficient of the skin area to obtain the regional adaptive color temperature parameters of the clothing area and the skin area. The adaptive color temperature parameters of the clothing area and the skin area are applied to the corresponding chromaticity components of the clothing area and the skin area respectively to perform color space transformation, resulting in multiple transformed chromaticity components. The illumination intensity adjustment coefficient is calculated based on the color difference between the transformed chromaticity components and the original chromaticity components. The transformed chromaticity components are then adjusted based on the illumination intensity adjustment coefficient. Finally, the adjusted chromaticity components are fused with the luminance components to reconstruct scene images under different illumination conditions.
[0009] Extract color feature vectors of clothing areas from the synthesized clothing image and each scene image; calculate the distance between the color feature vector of each scene image and the color feature vector of the synthesized clothing image; and generate a color stability score for each garment based on the distance value, including: A multi-scale Gaussian pyramid is constructed for the clothing regions in the synthesized clothing image and various scene images. The hue gradient direction distribution features, saturation region distribution features, and brightness level distribution features are extracted at each level of the multi-scale Gaussian pyramid. The hue gradient direction distribution features, saturation region distribution features, and brightness level distribution features of the multi-level pyramid are fused to form a color feature vector. Extract the deviation values of the color feature vectors of each scene image relative to the color feature vector of the synthesized clothing image in the hue dimension, saturation dimension, and brightness dimension, and normalize the deviation values to obtain the hue weight factor, saturation weight factor, and brightness weight factor. The weight vector is formed by combining the hue weight factor, saturation weight factor, and brightness weight factor. Extract the difference components in the hue, saturation and brightness dimensions between the color feature vector of the synthesized clothing image and the color feature vector of each scene image. Then, perform weighted fusion of the difference components and the weight vector, and sum the weighted fusion results to obtain the distance value of each scene image. The distance values of images in each scene are statistically analyzed to calculate the mean distance and standard deviation of the distance. Based on the mean distance and standard deviation of the distance, a color stability score is generated for each garment item.
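The distance and scoring steps described above can be illustrated with a minimal sketch. It is not the patented implementation: each color feature vector is reduced to a single (hue, saturation, brightness) triple with values in [0, 1], hue is treated as a linear rather than circular quantity, and the final mapping from mean and standard deviation to a score, 1 / (1 + mean + std), is an assumption, since the text does not specify that formula. The function names are hypothetical.

```python
from statistics import mean, pstdev


def weighted_distance(base, scene):
    """Distance between two (hue, saturation, brightness) triples.

    The per-dimension deviations are normalized to sum to 1 and reused as
    the weight vector, following the weighting idea described in the text.
    """
    diffs = [abs(s - b) for b, s in zip(base, scene)]
    total = sum(diffs)
    if total == 0:
        return 0.0  # identical features: no color shift under this lighting
    weights = [d / total for d in diffs]
    return sum(w * d for w, d in zip(weights, diffs))


def stability_score(base, scenes):
    """Score in (0, 1]; higher means the color shifts less across scenes."""
    distances = [weighted_distance(base, s) for s in scenes]
    mu = mean(distances)
    sigma = pstdev(distances)
    # Assumed mapping (not given in the source): penalize both a large
    # average shift and a large spread of shifts.
    return 1.0 / (1.0 + mu + sigma)


if __name__ == "__main__":
    base = (0.5, 0.5, 0.5)
    stable = [(0.51, 0.5, 0.5), (0.5, 0.49, 0.5)]
    unstable = [(0.9, 0.5, 0.5), (0.1, 0.2, 0.5)]
    print(stability_score(base, stable) > stability_score(base, unstable))
```

A garment whose features barely move across the simulated lighting conditions scores close to 1, while large or erratic shifts pull the score down.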
[0010] Extract color values from adjacent areas of multiple clothing items in a composite image, calculate the contrast and saturation differences between adjacent clothing item color values, and generate a color coordination score for the clothing item combination based on the contrast and saturation differences, including: In the synthesized image of clothing, the boundary contours between multiple clothing items are detected. Sampling band regions are formed by extending to both sides along the normal direction of the boundary contours. The hue value, saturation value and brightness value of the pixels are extracted along the normal direction within the sampling band regions. The rate of change of hue values along the normal direction is calculated to form the hue transition gradient; the rate of change of saturation values along the normal direction is calculated to form the saturation transition gradient; and the rate of change of luminance values along the normal direction is calculated to form the luminance transition gradient. The color values of adjacent regions are formed by combining the hue transition gradient, saturation transition gradient and brightness transition gradient. The peak position and direction consistency of the hue transition gradient are extracted from the color values of adjacent regions. The contrast between the color values of adjacent clothing items is calculated. The rate of change and smoothness of the saturation transition gradient are extracted from the color values of adjacent regions. The saturation difference between the color values of adjacent clothing items is calculated. The differences in contrast and saturation are divided into corresponding harmony level intervals. Based on the harmony level intervals corresponding to the differences in contrast and saturation, the harmony scoring table is consulted to generate the color harmony score of the clothing item combination.
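The final lookup step, in which the contrast and saturation differences are assigned to harmony level intervals and a scoring table is consulted, might be sketched as follows. The interval edges and every table value below are hypothetical placeholders chosen only to make the example run; the source does not disclose the actual intervals or scores.

```python
from bisect import bisect_right

# Hypothetical interval edges (not from the source).
CONTRAST_EDGES = [0.2, 0.5, 0.8]   # splits contrast into levels 0..3
SATURATION_EDGES = [0.15, 0.4]     # splits saturation difference into levels 0..2

# Hypothetical scoring table: rows are contrast levels, columns are
# saturation-difference levels.
HARMONY_TABLE = [
    [0.6, 0.5, 0.3],
    [0.9, 0.8, 0.5],
    [1.0, 0.7, 0.4],
    [0.7, 0.4, 0.2],
]


def harmony_score(contrast, saturation_diff):
    """Map the two measurements to level intervals, then look up the score."""
    c_level = bisect_right(CONTRAST_EDGES, contrast)
    s_level = bisect_right(SATURATION_EDGES, saturation_diff)
    return HARMONY_TABLE[c_level][s_level]


if __name__ == "__main__":
    print(harmony_score(0.6, 0.1))
```

Using `bisect_right` keeps the interval assignment in one place, so changing the edges does not require touching the lookup logic.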
[0011] Candidate clothing item images are comprehensively evaluated based on color stability and color harmony scores. Clothing item combinations with scores higher than a preset value are selected to output clothing matching schemes, including: The color stability scores of multiple candidate clothing item images are statistically analyzed to form a stability score distribution, and the color coordination scores of multiple candidate clothing item images are statistically analyzed to form a coordination score distribution. Abnormal stability score samples that deviate from the center of the stability score distribution are identified from the stability score distribution, and abnormal coordination score samples that deviate from the center of the coordination score distribution are identified from the coordination score distribution. The abnormal stability score samples and abnormal coordination score samples are corrected to obtain the corrected color stability score and the corrected color coordination score. The hue and brightness differences between candidate clothing images are extracted to form color complementary features, and the style type and pattern features between candidate clothing images are extracted to form style complementary features. The matching complementarity is calculated based on the color complementary features and style complementary features. The corrected color stability score, corrected color coordination score and matching complementarity are integrated to form a comprehensive score. Select candidate clothing images with a comprehensive score higher than the preset score, and combine the selected candidate clothing images to output clothing matching schemes.
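A minimal sketch of this evaluation step is given below under several assumptions that the text leaves open: an abnormal sample is taken to be one lying more than k standard deviations from the mean, the correction is a clamp to that band, and the comprehensive score is a weighted sum with illustrative weights. The dictionary keys, the weights, and the threshold are all hypothetical.

```python
from statistics import mean, pstdev


def correct_outliers(scores, k=2.0):
    """Clamp scores that lie more than k standard deviations from the mean."""
    mu, sigma = mean(scores), pstdev(scores)
    low, high = mu - k * sigma, mu + k * sigma
    return [min(max(s, low), high) for s in scores]


def select_candidates(candidates, threshold=0.6, weights=(0.4, 0.4, 0.2), k=2.0):
    """Return (name, comprehensive score) pairs above the threshold, best first."""
    stability = correct_outliers([c["stability"] for c in candidates], k)
    coordination = correct_outliers([c["coordination"] for c in candidates], k)
    selected = []
    for cand, s, h in zip(candidates, stability, coordination):
        # Assumed integration rule: weighted sum of the corrected scores
        # and the matching complementarity.
        total = weights[0] * s + weights[1] * h + weights[2] * cand["complementarity"]
        if total > threshold:
            selected.append((cand["name"], round(total, 3)))
    return sorted(selected, key=lambda item: -item[1])


if __name__ == "__main__":
    demo = [
        {"name": "a", "stability": 0.9, "coordination": 0.8, "complementarity": 0.7},
        {"name": "b", "stability": 0.5, "coordination": 0.4, "complementarity": 0.6},
        {"name": "c", "stability": 0.85, "coordination": 0.9, "complementarity": 0.5},
    ]
    print(select_candidates(demo))
```

Clamping rather than discarding keeps every candidate in the ranking while limiting the influence of a single extreme score.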
[0012] A second aspect of this invention provides an image recognition-based automatic clothing matching optimization system, comprising: an image synthesis unit, used to acquire a user's human body image and candidate clothing item images, perform semantic segmentation on the user's human body image to obtain the human body contour region, spatially deform the candidate clothing item images based on the human body contour region, and superimpose the deformed candidate clothing item images onto the user's human body image to generate a composite wearing image; an illumination transformation unit, used to apply lighting transformations of various color temperatures to the composite wearing image to generate scene images under different lighting conditions; a color stability unit, used to extract the color feature vectors of the clothing area from the composite wearing image and from each scene image, calculate the distance value between the color feature vector of each scene image and the color feature vector of the composite wearing image, and generate a color stability score for each clothing item based on the distance values; a color coordination unit, used to extract the color values of adjacent areas between multiple clothing items in the composite wearing image, calculate the contrast and saturation differences between the color values of adjacent clothing items, and generate a color coordination score for the clothing item combination based on the contrast and saturation differences; and a comprehensive evaluation unit, used to comprehensively evaluate the candidate clothing item images based on the color stability score and the color coordination score, and select clothing item combinations with scores higher than a preset score to output clothing matching schemes.
[0013] A third aspect of the present invention provides an electronic device, comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to execute the aforementioned method.
[0014] A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the aforementioned method.
[0015] In this embodiment, semantic segmentation of the user's human body image accurately extracts the human body contour region, so that candidate clothing items can be spatially deformed and superimposed naturally along the human body contour. This produces a realistic composite wearing image and improves the realism of virtual try-on and the user experience. Lighting transformations of various color temperatures are then applied to the composite image to generate scene images under different lighting conditions. Color feature vectors of the clothing areas are extracted, and the distance between the color features under each lighting condition is calculated to generate a color stability score for each clothing item, giving an objective evaluation of clothing color performance in complex environments. The solution also analyzes the color values of adjacent areas of multiple clothing items in the composite image, calculates color contrast and saturation differences, and generates a color coordination score for each clothing combination, so that multi-item combinations are evaluated on measured color relationships. Finally, a comprehensive selection based on the color stability and color coordination scores outputs high-scoring clothing matching schemes, improving the personalization, rationality, and practicality of clothing recommendations.
Attached Figure Description
[0016] Figure 1 is a flowchart of the automatic clothing matching optimization method based on image recognition according to an embodiment of the present invention. Figure 2 is a flowchart of the illumination reconstruction process for synthesized clothing images according to an embodiment of the present invention.
Detailed Implementation
[0017] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0018] The technical solution of the present invention will be described in detail below with reference to specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments.
[0019] Figure 1 is a flowchart of the automatic clothing matching optimization method based on image recognition according to an embodiment of the present invention. As shown in Figure 1, the method includes: acquiring a user's human body image and candidate clothing item images, performing semantic segmentation on the user's human body image to obtain the human body contour region, spatially deforming the candidate clothing item images based on the human body contour region, and superimposing the deformed candidate clothing item images onto the user's human body image to generate a composite wearing image; applying lighting transformations of various color temperatures to the composite wearing image to generate scene images under different lighting conditions; extracting the color feature vectors of the clothing area from the composite wearing image and from each scene image, calculating the distance value between the color feature vector of each scene image and the color feature vector of the composite wearing image, and generating a color stability score for each clothing item based on the distance values; extracting the color values of adjacent areas between multiple clothing items in the composite wearing image, calculating the contrast and saturation differences between the color values of adjacent clothing items, and generating a color coordination score for the clothing item combination based on the contrast and saturation differences; and comprehensively evaluating the candidate clothing item images based on the color stability score and the color coordination score, and selecting clothing item combinations with scores higher than a preset value to output clothing matching schemes.
[0020] Acquiring the user's human body image and candidate clothing item images, performing semantic segmentation on the user's human body image to obtain the human body contour region, spatially deforming the candidate clothing item images based on the human body contour region, and superimposing the deformed candidate clothing item images onto the user's human body image to generate a composite wearing image includes: acquiring the user's human body image and the candidate clothing item images, performing pose detection on the user's human body image to identify human body joints, correcting the pose of the user's human body image based on the human body joints to obtain a corrected user human body image, and performing background separation on the candidate clothing item images to obtain the main clothing image; extracting multi-scale feature maps from the corrected user human body image and fusing them to obtain a fused feature map, then performing pixel-level classification on the fused feature map to obtain the human body contour region; extracting the boundary contour line of the human body contour region, performing curvature analysis on the boundary contour line to determine control points, establishing the mapping relationship between the control points and the corresponding points of the main clothing image, calculating the spatial transformation matrix, and using the spatial transformation matrix to perform pixel remapping on the main clothing image to obtain the deformed candidate clothing item image; and generating a depth mask based on the human body contour region, using the depth mask to determine the front-to-back occlusion relationship between the deformed candidate clothing item image and the corrected user human body image, and performing pixel fusion according to the front-to-back occlusion relationship to generate the composite wearing image.
[0021] In the stage of generating synthetic clothing images, the user's body image and candidate clothing item images are first acquired. The user's body image can be a full-body photo uploaded by the user or an image captured in real time by a camera device. A deep learning-based pose detection network is used to process the user's body image. This network can identify 18 joints of the human body, including key positions such as the head, shoulders, elbows, wrists, hips, knees, and ankles. By analyzing the spatial relationship of the joints, it is determined whether there is any tilt or distortion in the human posture. When a posture deviation is detected, an affine transformation is performed based on the spinal centerline to adjust the human body to a standard frontal standing posture, resulting in the corrected user body image.
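The posture correction described above can be sketched as a rotation about the hip midpoint, which is one simple kind of affine transformation. This is an illustration only: the choice of shoulder and hip joints to define the spinal centerline, the pivot, and the function names are assumptions, and image coordinates are taken with x to the right and y downward.

```python
import math


def spine_correction_matrix(left_shoulder, right_shoulder, left_hip, right_hip):
    """Return (tilt angle in radians, 2x3 affine matrix) that uprights the spine.

    The spine is approximated by the segment from the hip midpoint to the
    shoulder midpoint; the matrix rotates the image plane about the hip
    midpoint so that this segment becomes vertical.
    """
    sx = (left_shoulder[0] + right_shoulder[0]) / 2
    sy = (left_shoulder[1] + right_shoulder[1]) / 2
    hx = (left_hip[0] + right_hip[0]) / 2
    hy = (left_hip[1] + right_hip[1]) / 2
    # Tilt of the spine away from the upward vertical (y grows downward).
    phi = math.atan2(sx - hx, hy - sy)
    c, s = math.cos(phi), math.sin(phi)
    matrix = [
        [c, s, hx - c * hx - s * hy],
        [-s, c, hy + s * hx - c * hy],
    ]
    return phi, matrix


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


if __name__ == "__main__":
    phi, m = spine_correction_matrix((110, 100), (130, 100), (90, 200), (110, 200))
    print(round(math.degrees(phi), 2), apply_affine(m, (120, 100)))
```

After the transform, the shoulder midpoint sits directly above the hip midpoint, and the same matrix could be passed to an image-warping routine to resample the whole photo.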
[0022] For candidate clothing item images, a foreground extraction algorithm based on color and texture features is used for background separation. This algorithm first determines the approximate outline of the clothing through edge detection, and then uses a graph cut algorithm to segment the image foreground and background. During segmentation, pixels in the clothing area are marked as foreground, while other areas are marked as background and set to transparent, thus obtaining an image containing only the main body of the clothing.
[0023] In the semantic segmentation stage, a multi-scale feature extraction network is constructed to process the corrected user human body image. This network contains three parallel branches, which extract feature maps at the original resolution, 1/2 resolution, and 1/4 resolution, respectively. Upsampling is used to unify the feature maps of different scales to the same size, and then they are concatenated along the channel dimension. Convolutional layers and activation functions are applied to the fused feature maps to predict the category of each pixel, classifying pixels into categories such as torso, upper limbs, lower limbs, head, and background, ultimately obtaining an accurate human body contour region.
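The upsample-and-concatenate step of the fusion can be illustrated with plain nested lists. This sketch assumes one channel per branch and nearest-neighbor upsampling, neither of which is specified in the text, and it leaves out the convolutional classification that follows; the function names are hypothetical.

```python
def upsample_nearest(feature_map, factor):
    """Nearest-neighbor upsampling of a 2D feature map (list of rows)."""
    out = []
    for row in feature_map:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_branches(full, half, quarter):
    """Bring all branches to full resolution and stack them per pixel.

    Returns a height x width grid in which each cell holds the channel
    list [full, half, quarter] for that pixel.
    """
    half_up = upsample_nearest(half, 2)
    quarter_up = upsample_nearest(quarter, 4)
    height, width = len(full), len(full[0])
    return [
        [[full[y][x], half_up[y][x], quarter_up[y][x]] for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    full = [[r * 4 + c for c in range(4)] for r in range(4)]
    half = [[10, 20], [30, 40]]
    quarter = [[7]]
    print(fuse_branches(full, half, quarter)[1][2])
```

Each fused pixel then carries fine detail from the full-resolution branch alongside coarser context from the other two, which is what the per-pixel classifier consumes.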
[0024] During spatial deformation, morphological algorithms are used to extract the boundary contour lines of the human body contour region. Curvature calculations are performed on the boundary contour lines; the curvature value represents the degree of bending at that point. Points with curvature values exceeding a set threshold are selected as control points; these control points are typically located at locations where the human body contour changes significantly, such as the shoulders, waist, and hips. Based on the clothing type, corresponding feature points are marked on the main clothing image, such as the shoulder line endpoints, neckline positions, and hem edges of a top. A one-to-one mapping relationship is established between the human body contour control points and the clothing feature points, and the spatial transformation matrix is calculated using thin-plate spline interpolation. This matrix describes the new coordinate position of each pixel in the clothing image after deformation. Pixel remapping is performed on the main clothing image to ensure the shape of the clothing precisely matches the human body contour, generating candidate clothing images after deformation.
[0025] To achieve a realistic wearing effect, the occlusion relationship between the clothing and the human body needs to be addressed. A depth mask map is generated based on the positional information of the human body contour region, and a depth value is assigned to each pixel in this mask map. A smaller depth value is set for the human torso region, indicating its proximity to the camera, while a larger depth value is set for the background region to be occluded. The depth values at the same location are compared between the deformed candidate clothing image and the corrected user body image; pixels with smaller depth values are retained in the final image. For pixels in the boundary regions, a weighted blending strategy is used for fusion, with weights dynamically calculated based on depth differences and pixel color similarity to ensure a natural transition. After pixel fusion, a synthesized wearing image is output, which realistically reflects the visual effect of the user wearing the clothing item.
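The depth comparison and boundary blending might look like the following sketch. It simplifies the description in two labeled ways: the blending weight depends only on the depth difference (the color-similarity term mentioned in the text is omitted), and the width of the blending band is a hypothetical parameter. Images are nested lists of RGB tuples, with `None` marking transparent clothing pixels.

```python
def composite(cloth_rgb, cloth_depth, body_rgb, body_depth, band=0.1):
    """Fuse a deformed clothing layer onto a body image using depth values.

    Smaller depth means closer to the camera. Where the two depths differ
    by at least `band`, the nearer pixel wins outright; inside the band the
    two colors are blended linearly (assumed blending rule).
    """
    result = []
    for y in range(len(body_rgb)):
        row = []
        for x in range(len(body_rgb[0])):
            cloth, body = cloth_rgb[y][x], body_rgb[y][x]
            if cloth is None:  # transparent clothing pixel: keep the body
                row.append(body)
                continue
            diff = body_depth[y][x] - cloth_depth[y][x]  # > 0: clothing in front
            if abs(diff) >= band:
                row.append(cloth if diff > 0 else body)
            else:
                weight = 0.5 + 0.5 * diff / band  # clothing weight in (0, 1)
                row.append(tuple(
                    round(weight * c + (1 - weight) * b)
                    for c, b in zip(cloth, body)
                ))
        result.append(row)
    return result


if __name__ == "__main__":
    cloth = [[(200, 0, 0), (200, 0, 0), (200, 0, 0), None]]
    cloth_d = [[0.2, 0.8, 0.5, 0.0]]
    body = [[(0, 0, 100)] * 4]
    body_d = [[0.5, 0.5, 0.5, 0.5]]
    print(composite(cloth, cloth_d, body, body_d))
```

In the demo row, the first pixel shows clothing, the second shows the body occluding the clothing, the third is an even blend, and the fourth falls back to the body because the clothing layer is transparent there.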
[0026] Curvature analysis is performed on the boundary contour to determine control points. A mapping relationship is established between the control points and corresponding points in the main garment image. The spatial transformation matrix is calculated, and pixel remapping is performed on the main garment image using the spatial transformation matrix to obtain deformed candidate garment images, including: Discrete curvature calculation is performed on the pixels on the boundary contour line to obtain the curvature distribution sequence. The curvature distribution sequence is then filtered and smoothed. Curvature extreme points are identified based on the smoothed curvature distribution sequence. The curvature extreme points are then clustered and grouped to determine the control points that characterize the human contour features. Extract the edge contour of the main body image of the clothing and calculate its curvature. Identify the corresponding control points of the main body image of the clothing. Establish a point-to-point mapping relationship based on the spatial positional relationship between the control points representing the human body contour area and the corresponding control points of the main body image of the clothing. Based on the point-to-point mapping relationship, a deformation energy function reflecting the degree of deformation is constructed. The spatial transformation matrix is solved by minimizing the deformation energy function. The spatial transformation matrix is then applied to the pixel coordinates of each pixel in the main image of the clothing to calculate the transformed target coordinates. Based on the transformed pixel coordinates, pixel values are sampled from the main image of the clothing and filled into the positions corresponding to the transformed pixel coordinates to complete pixel remapping and obtain the deformed candidate clothing item image.
[0027] After semantic segmentation of the user's human body image and obtaining the human body contour region, the candidate clothing item images need to be spatially deformed to naturally conform to the human body contour. Specifically, a discrete curvature calculation method is used for each pixel on the boundary contour line. The curvature value is obtained by calculating the rate of change of the vector angle formed by the adjacent pixels before and after the point, forming a curvature distribution sequence along the contour line. Since the curvature value obtained by direct calculation is easily affected by image noise and fluctuates, a Gaussian filter is used to smooth the curvature distribution sequence. The size of the filter window is adaptively determined according to the total length of the contour line, usually set to 5% to 10% of the total number of contour points. The smoothed curvature distribution sequence can more accurately reflect the geometric features of the human body contour. Curvature extreme points are identified by detecting local maxima and local minima of curvature. These points usually correspond to key turning points of the human body such as shoulders, waist, and hips. The identified curvature extreme points are grouped using a spatial distance-based clustering algorithm. Extreme points with a distance less than a preset threshold are grouped into the same cluster. The centroid of each cluster is used as the final control point, and the number of control points is usually 8 to 15.
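The control-point extraction described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions: the contour is a closed, ordered list of (x, y) points, curvature is approximated by the signed turning angle, the Gaussian sigma is taken as one sixth of the window, and the clustering is a simple greedy grouping. All function names and default values are hypothetical.

```python
import math

def turning_angles(contour):
    """Discrete curvature: signed angle between the incoming and outgoing edge."""
    n = len(contour)
    angles = []
    for i in range(n):
        x0, y0 = contour[i - 1]
        x1, y1 = contour[i]
        x2, y2 = contour[(i + 1) % n]
        ax, ay = x1 - x0, y1 - y0
        bx, by = x2 - x1, y2 - y1
        angles.append(math.atan2(ax * by - ay * bx, ax * bx + ay * by))
    return angles

def gaussian_smooth(values, window):
    """Circular Gaussian smoothing over an odd-sized window."""
    half = window // 2
    sigma = window / 6.0  # assumption: the window spans roughly +/- 3 sigma
    kernel = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in range(-half, half + 1)]
    total = sum(kernel)
    n = len(values)
    return [
        sum(w * values[(i + k - half) % n] for k, w in enumerate(kernel)) / total
        for i in range(n)
    ]

def local_extrema(values):
    """Indices of strict local maxima and minima on a closed sequence."""
    n = len(values)
    hits = []
    for i in range(n):
        prev, cur, nxt = values[i - 1], values[i], values[(i + 1) % n]
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            hits.append(i)
    return hits

def cluster_centroids(points, threshold):
    """Greedy grouping: a point joins the first cluster whose centroid is near."""
    clusters = []
    for p in points:
        for c in clusters:
            cx = sum(q[0] for q in c) / len(c)
            cy = sum(q[1] for q in c) / len(c)
            if math.hypot(p[0] - cx, p[1] - cy) < threshold:
                c.append(p)
                break
        else:
            clusters.append([p])
    return [(sum(q[0] for q in c) / len(c), sum(q[1] for q in c) / len(c)) for c in clusters]

def control_points(contour, window_fraction=0.075, threshold=3.0):
    """Window is 5% to 10% of the contour length (7.5% here), forced to be odd."""
    window = max(3, int(round(len(contour) * window_fraction)) | 1)
    smoothed = gaussian_smooth(turning_angles(contour), window)
    extrema = [contour[i] for i in local_extrema(smoothed)]
    return cluster_centroids(extrema, threshold)
```

On a square contour sampled at unit spacing, the only curvature extrema are the four corners, so `control_points` returns exactly those four points.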
[0028] For candidate garment images, the edge contours of the main garment image are first extracted using edge detection algorithms, commonly including the Canny or Sobel operators. Curvature calculations are then performed on the extracted garment edge contours, identifying the extreme curvature points as corresponding control points. A correspondence between human control points and garment control points is established based on the semantic information of the garment type. For example, for a top, human shoulder control points correspond to garment shoulder control points, and waist control points correspond to garment hem control points. This point-to-point mapping describes how the garment deforms to fit the geometric constraints of the human body contour.
[0029] To achieve a smooth and natural deformation effect, a deformation energy function is constructed to quantify the degree of deformation. The deformation energy function consists of two parts: a fitting term and a smoothing term. The fitting term measures the sum of squared Euclidean distances between the deformed clothing control points and the target human body control points, ensuring accurate alignment of the control points. The smoothing term penalizes excessive distortion by calculating the difference in deformation gradients between adjacent pixels, maintaining the continuity of the deformation. The deformation accuracy and smoothness are balanced by adjusting the weighting coefficients of the two terms; the weighting coefficients are typically set to 1.0 for the fitting term and 0.3 to 0.5 for the smoothing term. The deformation energy function is minimized using gradient descent or conjugate gradient methods, and the spatial transformation matrix is obtained through iterative solution. This matrix is either an affine transformation matrix or a thin-plate spline transformation parameter.
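One way to write the two-term energy described above is given below. The notation is an assumption introduced for illustration and does not appear in the source: $T$ is the spatial transformation, $g_i$ and $h_i$ are the $i$-th garment and human body control points, $\mathcal{N}$ is the set of adjacent pixel pairs, and $w_f$, $w_s$ are the weighting coefficients of the fitting and smoothing terms.

```latex
E(T) = w_f \sum_{i=1}^{K} \left\lVert T(g_i) - h_i \right\rVert^2
     + w_s \sum_{(p,q) \in \mathcal{N}} \left\lVert \nabla T(p) - \nabla T(q) \right\rVert^2,
\qquad w_f = 1.0,\quad w_s \in [0.3,\, 0.5]
```

Minimizing $E(T)$ over the parameters of $T$ trades exact control-point alignment (first term) against continuity of the deformation (second term).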
[0030] The obtained spatial transformation matrix is applied to each pixel coordinate of the clothing subject image to calculate the target coordinates of that pixel in the deformed image. Since the transformed target coordinates may be non-integer coordinates, a bilinear interpolation method is used to sample pixel values from the original clothing subject image. Specifically, for the four nearest integer coordinate positions around the target coordinates, the interpolated pixel values are calculated based on the weighted average of the distances between the target coordinates and these four positions. The interpolated pixel values are then filled into the corresponding positions of the transformed target coordinates, completing the pixel remapping of the entire clothing subject image. Finally, a deformed candidate clothing image that accurately fits the user's body contour is obtained. The deformation process preserves the texture and color information of the clothing, laying the foundation for the subsequent generation of realistic and natural synthetic wearing images.
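The bilinear sampling step can be sketched as follows. This is a minimal single-channel example, assuming the image is a list of rows and that out-of-range coordinates are clamped to the image border; the clamping behavior is an assumption, not something the text specifies.

```python
def bilinear_sample(image, x, y):
    """Sample a single-channel image (list of rows) at a non-integer (x, y)."""
    height, width = len(image), len(image[0])
    # Clamp to the valid range so the four neighbours always exist.
    x = min(max(float(x), 0.0), width - 1.0)
    y = min(max(float(y), 0.0), height - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two bracketing rows, then along y.
    top = image[y0][x0] * (1.0 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1.0 - fx) + image[y1][x1] * fx
    return top * (1.0 - fy) + bottom * fy
```

For a color image the same function would be applied to each channel separately.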
[0031] Figure 2 illustrates the lighting reconstruction process of the synthesized clothing image in this embodiment.
[0032] Applying illumination transformations of multiple color temperatures to the synthesized clothing image to generate scene images under different lighting conditions includes: extracting the luminance and chrominance components of the synthesized clothing image, and performing semantic segmentation on the synthesized clothing image to identify the clothing region and the skin region; extracting the fabric texture features of the clothing region and the surface reflection features of the skin region, and calculating the color temperature sensitivity coefficients of the clothing region and the skin region from the fabric texture features and the surface reflection features, respectively; generating multiple color temperature parameters based on a preset color temperature range, and multiplying each color temperature parameter by the color temperature sensitivity coefficient of the clothing region and by that of the skin region to obtain the region-adaptive color temperature parameters of the two regions; applying the region-adaptive color temperature parameters to the chromaticity components of the corresponding region to perform a color space transformation, yielding multiple transformed chromaticity components; calculating the illumination intensity adjustment coefficient from the color difference between the transformed chromaticity components and the original chromaticity components, and adjusting the transformed chromaticity components based on this coefficient; and fusing the adjusted chromaticity components with the luminance component to reconstruct the scene images under different lighting conditions.
[0033] To simulate scenes under different lighting conditions, the synthesized clothing images were first converted from the RGB color space to the YCbCr color space, where the Y component represents the luminance component, and the Cb and Cr components represent the chrominance components. A semantic segmentation algorithm based on a convolutional neural network was then used to perform pixel-level classification of the synthesized clothing images, marking the pixel positions of clothing and skin regions. Clothing region identification was based on the texture characteristics of the textile material, while skin region identification was based on the distribution range of skin tone in the color space.
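The color space conversion can be sketched as follows. The text does not say which YCbCr variant is used; this sketch assumes the full-range BT.601 coefficients used by JPEG/JFIF, with 8-bit channel values and a chrominance offset of 128.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) conversion; the variant is an assumption."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """Inverse conversion, used when the scene image is rebuilt in RGB."""
    r = y + 1.402 * (cr - 128.0)
    g = y - 0.344136 * (cb - 128.0) - 0.714136 * (cr - 128.0)
    b = y + 1.772 * (cb - 128.0)
    return r, g, b
```

Neutral colors map to Cb = Cr = 128, so a chromaticity shift applied to Cb and Cr leaves the luminance component Y untouched until it is adjusted separately.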
[0034] For the clothing area, the gray-level co-occurrence matrix (GLCM) is used to calculate the roughness and directionality parameters of the fabric texture when extracting its features. Different fabric materials respond differently to color temperature changes; smooth silk fabrics are more sensitive to color temperature changes, while rough cotton fabrics respond more slowly. Based on the fabric texture roughness value and a preset mapping relationship, the color temperature sensitivity coefficient for the clothing area is calculated, with a value range of 0.3 to 1.2, where smooth fabrics correspond to higher coefficients. For the skin area, the ratio of specular reflection to diffuse reflection components of skin pixels is analyzed when extracting surface reflection features. The distribution of oil on the skin surface affects light reflection characteristics; the color temperature sensitivity coefficient for the skin area is calculated based on the proportion of specular reflection, with a value range of 0.5 to 0.9.
[0035] The preset color temperature range covers typical lighting environments in everyday scenarios, set from 2800K to 6500K, generating eight color temperature parameters at 500K intervals. Each color temperature parameter is converted into a chromaticity shift vector in the color space, describing the direction and magnitude of the chromaticity component shift caused by color temperature changes. The color temperature parameters are then multiplied by the color temperature sensitivity coefficients of the clothing and skin regions respectively, yielding region-adaptive color temperature parameters for different areas. This region-adaptive processing ensures that clothing materials and skin surfaces exhibit physically consistent color responses when color temperature changes.
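The parameter generation can be sketched as follows. The function names are hypothetical, and the conversion of each parameter into a chromaticity shift vector is not shown because the text does not give that mapping. Stepping from 2800K in 500K intervals yields eight values, the last being 6300K, which stays within the 6500K upper bound.

```python
def color_temperatures(low=2800, high=6500, step=500):
    """Color temperature parameters in kelvin, from low up to at most high."""
    return list(range(low, high + 1, step))

def region_adaptive(temperatures, sensitivity):
    """Scale each parameter by a region's color temperature sensitivity coefficient."""
    return [t * sensitivity for t in temperatures]
```

For example, `region_adaptive(color_temperatures(), 0.9)` would give the eight parameters for a region whose sensitivity coefficient is 0.9.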
[0036] For each pixel in the clothing area, the region-adaptive color temperature parameter is applied to the corresponding Cb and Cr components, achieving chromaticity shift through a color space transformation matrix. The transformation matrix is constructed based on the linear relationship between color temperature change and chromaticity component shift, ensuring that the hue shifts towards blue when the color temperature increases and towards yellow when the color temperature decreases. The same operation is performed on the skin area, resulting in 8 sets of transformed chromaticity component images.
[0037] The illumination intensity adjustment coefficient compensates for the overall brightness deviation that may occur after the chromaticity transformation. The color difference value between each transformed chromaticity component and the original chromaticity component is calculated using the CIE color difference formula. A larger color difference value indicates a more significant color temperature change, which requires a corresponding brightness adjustment to maintain the visual naturalness of the image. The illumination intensity adjustment coefficient is set as a non-linear function of the color difference value. When the color difference value is within the range of 0 to 30, the adjustment coefficient changes linearly from 1.0 to 1.15, so that brightness is appropriately increased to simulate a warm light effect when the color temperature is low, and the original brightness is maintained to simulate a cool light effect when the color temperature is high. The Y component at the corresponding position of each transformed chromaticity component is multiplied by the adjustment coefficient to complete the brightness adjustment. Finally, the adjusted chromaticity components Cb and Cr are fused with the adjusted brightness component Y and converted back to the RGB color space, generating eight scene images under different lighting conditions that simulate typical lighting environments such as warm morning light, indoor incandescent light, sunlight, an overcast sky, and fluorescent light.
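The linear segment of the adjustment coefficient can be sketched as follows. The behavior outside the 0 to 30 range is not stated in the text; clamping to the end values is an assumption made here, and the function name is hypothetical.

```python
def intensity_coefficient(color_difference, max_difference=30.0, max_gain=1.15):
    """Map a color difference value to a brightness gain.

    Within 0..max_difference the gain rises linearly from 1.0 to max_gain.
    Values outside that range are clamped (assumption, not from the source).
    """
    d = min(max(color_difference, 0.0), max_difference)
    return 1.0 + (max_gain - 1.0) * d / max_difference
```

The adjusted luminance of a pixel is then `y * intensity_coefficient(color_difference)`.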
[0038] Extracting color feature vectors of the clothing region from the synthesized clothing image and each scene image, calculating the distance between the color feature vector of each scene image and that of the synthesized clothing image, and generating a color stability score for each clothing item based on the distance values includes: constructing a multi-scale Gaussian pyramid for the clothing regions in the synthesized clothing image and each scene image; extracting hue gradient direction distribution features, saturation region distribution features, and brightness level distribution features at each level of the pyramid, and fusing the features of all levels to form the color feature vector; extracting the deviation values of the color feature vector of each scene image relative to that of the synthesized clothing image in the hue, saturation, and brightness dimensions, normalizing the deviation values to obtain a hue weight factor, a saturation weight factor, and a brightness weight factor, and combining the three factors into a weight vector; extracting the difference components in the hue, saturation, and brightness dimensions between the color feature vector of the synthesized clothing image and that of each scene image, performing weighted fusion of the difference components with the weight vector, and summing the weighted results to obtain the distance value of each scene image; and statistically analyzing the distance values of all scene images to calculate the distance mean and the distance standard deviation, from which the color stability score of each clothing item is generated.
[0039] In the color feature vector extraction stage, a multi-scale Gaussian pyramid is used to decompose the clothing regions in the synthesized clothing image and various scene images. The Gaussian pyramid is constructed through progressive downsampling, with a 5×5 Gaussian kernel used for smoothing before each downsampling step, and the pyramid has 4 levels. At each level, the color space of the clothing region is converted from RGB to HSV, and features for the hue, saturation, and brightness channels are calculated. The hue gradient direction distribution feature is obtained by calculating the gradient values of the hue channels in the horizontal and vertical directions, and statistically analyzing the histogram distribution of the gradient direction within the range of 0 to 360 degrees. The circumferential direction is divided into 36 intervals, and the statistical value of each interval is used as a feature component. The saturation region distribution feature divides the saturation values into 10 equally spaced intervals, and the pixel proportion within each interval is statistically analyzed to form a distribution vector. The brightness level distribution feature similarly divides the brightness values into 10 equally spaced intervals, and the pixel distribution ratio of each interval is calculated. The hue gradient direction distribution features, saturation region distribution features, and brightness level distribution features extracted from the four levels are concatenated in hierarchical order to form a color feature vector with a dimension of 224.
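The assembly of the 224-dimensional vector can be sketched as follows. The sketch assumes the per-level inputs (hue gradient directions in degrees, saturation and brightness values normalized to 0..1) have already been computed from the pyramid, and that each histogram stores pixel proportions. The function names are hypothetical.

```python
def histogram(values, bins, low, high):
    """Proportion of values falling into each of `bins` equal-width intervals."""
    counts = [0] * bins
    for v in values:
        idx = int((v - low) / (high - low) * bins)
        counts[min(max(idx, 0), bins - 1)] += 1  # the top edge goes in the last bin
    total = len(values) or 1
    return [c / total for c in counts]

def level_features(gradient_directions_deg, saturations, brightnesses):
    """36 + 10 + 10 = 56 feature components for one pyramid level."""
    hue_part = histogram([d % 360.0 for d in gradient_directions_deg], 36, 0.0, 360.0)
    sat_part = histogram(saturations, 10, 0.0, 1.0)
    val_part = histogram(brightnesses, 10, 0.0, 1.0)
    return hue_part + sat_part + val_part

def color_feature_vector(levels):
    """Concatenate the per-level features in pyramid order (4 levels -> 224)."""
    vector = []
    for directions, saturations, brightnesses in levels:
        vector.extend(level_features(directions, saturations, brightnesses))
    return vector
```

With four levels the result has 4 × 56 = 224 components, matching the dimension stated above.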
[0040] The deviation value is calculated for each scene image by extracting the corresponding components of its color feature vector and the color feature vector of the synthesized clothing image in three dimensions. The deviation value for the hue dimension is the sum of the absolute values of the differences between the components of the hue gradient direction distribution feature; the deviation value for the saturation dimension is the sum of the absolute values of the differences between the components of the saturation region distribution feature; and the deviation value for the brightness dimension is the sum of the absolute values of the differences between the components of the brightness level distribution feature. The deviation values for each dimension of all scene images are normalized to the interval between 0 and 1 using maximum and minimum value standardization. The normalized deviation values are directly used as weighting factors for that dimension: hue weighting factor, saturation weighting factor, and brightness weighting factor. These three weighting factors are combined sequentially to form a three-dimensional weight vector.
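The deviation and normalization steps can be sketched as follows. The handling of the degenerate case where all scenes share the same deviation is an assumption; the text does not cover it.

```python
def l1_deviation(scene_part, reference_part):
    """Sum of absolute component differences for one feature type."""
    return sum(abs(s - r) for s, r in zip(scene_part, reference_part))

def min_max_normalize(values):
    """Scale the deviations of all scene images into the interval 0..1."""
    low, high = min(values), max(values)
    if high == low:
        # Assumption: identical deviations carry no ranking information.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]
```

Applying `min_max_normalize` separately to the hue, saturation, and brightness deviations of all scene images gives, for each scene, the three weighting factors of its weight vector.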
[0041] In the distance calculation, the color feature vector of each scene image is subtracted element-wise from the color feature vector of the synthesized clothing image, resulting in a 224-dimensional difference vector. This difference vector is then grouped by feature type: the first 144 dimensions are the difference components of the hue gradient direction distribution, the middle 40 dimensions are the difference components of the saturation region distribution, and the last 40 dimensions are the difference components of the brightness level distribution. The hue difference components are multiplied by the hue weighting factor, the saturation difference components by the saturation weighting factor, and the brightness difference components by the brightness weighting factor. The sum of squares of each of the three weighted groups is calculated, and the square root of the total of the three sums is taken as the distance value for that scene image.
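The weighted distance can be sketched as follows. The sketch assumes the 224 components have already been arranged by feature type (144 hue, 40 saturation, 40 brightness), as the grouping in this paragraph requires; the constant and function names are hypothetical.

```python
import math

HUE_END = 144  # components 0..143: hue gradient direction distribution
SAT_END = 184  # components 144..183: saturation region distribution
               # components 184..223: brightness level distribution

def weighted_distance(reference, scene, weights):
    """Square root of the summed squares of the weighted difference components."""
    w_hue, w_sat, w_val = weights
    total = 0.0
    for i, (r, s) in enumerate(zip(reference, scene)):
        w = w_hue if i < HUE_END else w_sat if i < SAT_END else w_val
        total += (w * (s - r)) ** 2
    return math.sqrt(total)
```

A weighting factor of zero removes that feature group from the distance entirely, which follows from using the normalized deviations directly as weights.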
[0042] After obtaining the distance values for all scene images, the arithmetic mean of these distance values is calculated as the distance mean, and the square root of the variance of each distance value relative to the distance mean is calculated as the distance standard deviation. The color stability score is calculated using a formula: the reciprocal of the sum of the distance mean and the distance standard deviation is multiplied by 100 to obtain the score. The smaller the distance mean and distance standard deviation, the more stable the color performance of the clothing item under different lighting conditions, and the higher the score. When the sum of the distance mean and distance standard deviation is less than 0.01, the score is set to 100.
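The scoring rule can be sketched as follows. The population form of the variance (dividing by the number of scenes) is an assumption, since the text does not say whether the sample or population form is used.

```python
import math

def stability_score(distances):
    """100 / (mean + standard deviation), fixed at 100 when the sum is below 0.01."""
    mean = sum(distances) / len(distances)
    variance = sum((d - mean) ** 2 for d in distances) / len(distances)
    total = mean + math.sqrt(variance)
    if total < 0.01:
        return 100.0
    return 100.0 / total
```

Taken literally, the formula yields values above 100 whenever the sum lies between 0.01 and 1; the text does not state whether a further upper limit is intended, so none is applied here.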
[0043] Extracting color values of adjacent regions between multiple clothing items in the synthesized clothing image, calculating the contrast and saturation difference between the color values of adjacent clothing items, and generating a color coordination score for the clothing item combination based on the contrast and saturation difference includes: detecting the boundary contours between the clothing items in the synthesized clothing image, and extending to both sides along the normal direction of each boundary contour to form a sampling band region; extracting the hue, saturation, and brightness values of the pixels along the normal direction within the sampling band region; calculating the rate of change of the hue values along the normal direction to form the hue transition gradient, the rate of change of the saturation values to form the saturation transition gradient, and the rate of change of the brightness values to form the brightness transition gradient; combining the three transition gradients to form the color values of the adjacent regions; extracting the peak position and direction consistency of the hue transition gradient from the color values of the adjacent regions to calculate the contrast between adjacent clothing items; extracting the rate of change and smoothness of the saturation transition gradient to calculate the saturation difference between adjacent clothing items; assigning the contrast and the saturation difference to their corresponding harmony level intervals; and consulting the harmony scoring table with those intervals to generate the color harmony score of the clothing item combination.
[0044] In the synthesized images of clothing, edge detection algorithms are used to identify the boundary contours between multiple garment items, such as tops and trousers, skirts and coats. Boundary contour extraction employs morphological gradient operations, determining the boundary lines between garment items by the difference between dilation and erosion operations. After determining the boundary contours, the normal direction of each pixel on the contour is calculated. The normal direction is obtained by differentiating the contour curve; positive normals point inwards from the garment item, and negative normals point outwards. A sampling band region with a width of 16 to 30 pixels is formed by extending 8 to 15 pixels to both sides along the normal direction of each pixel. This sampling band region covers the transition areas of the garment items on both sides of the boundary.
[0045] Within the sampling band, the component values of the HSV color space are extracted pixel by pixel along the normal direction. The hue value H ranges from 0 to 360 degrees, representing the angular position of the color on the color wheel. The saturation value S ranges from 0 to 1, representing the purity of the color. The brightness value V ranges from 0 to 255, representing the lightness or darkness of the color. For each normal within the sampling band, a sequence of HSV values arranged along the normal direction is obtained.
[0046] The hue value sequence is differentially analyzed to calculate the hue change between adjacent pixels. The hue transition gradient is defined as the hue change per unit pixel distance, expressed in degrees per pixel. When the absolute value of the hue transition gradient is greater than 15 degrees per pixel, it indicates a significant hue jump between adjacent clothing items. The saturation value sequence is also differentially analyzed, with the saturation transition gradient defined as the saturation change per unit pixel distance. Finally, the brightness value sequence is differentially analyzed, with the brightness transition gradient defined as the brightness change per unit pixel distance, expressed in gray levels per pixel.
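The hue transition gradient can be sketched as follows. Because hue is an angle, this sketch takes the shortest signed difference around the color wheel; that wrap-around handling is an assumption, as the text only speaks of the change between adjacent pixels.

```python
def hue_gradient(hues_deg):
    """Hue change in degrees per pixel along one normal, wrapped to -180..180."""
    gradient = []
    for a, b in zip(hues_deg, hues_deg[1:]):
        gradient.append((b - a + 180.0) % 360.0 - 180.0)
    return gradient

def has_hue_jump(hues_deg, threshold=15.0):
    """True if any step exceeds the 15 degrees-per-pixel threshold."""
    return any(abs(g) > threshold for g in hue_gradient(hues_deg))
```

The saturation and brightness transition gradients are plain adjacent differences of their value sequences and need no wrapping.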
[0047] A three-dimensional gradient vector is formed by combining the hue transition gradient, saturation transition gradient, and brightness transition gradient corresponding to each normal. This vector represents the color value change characteristics of adjacent regions. The gradient vectors of all normals are statistically analyzed, and the peak position of the hue transition gradient is extracted. The peak position refers to the pixel position with the largest hue change, marking the color boundary in visual perception. The distribution consistency of peak positions on different normals is analyzed, and the standard deviation of the peak position deviation is calculated. A standard deviation less than 3 pixels is considered to indicate good directional consistency. The contrast ratio is calculated based on the peak amplitude of the hue transition gradient and the peak amplitude of the brightness transition gradient. The contrast ratio is defined as the weighted sum of the two amplitudes, with weighting coefficients of 0.6 and 0.4, respectively.
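The peak analysis and contrast calculation can be sketched as follows. The function names are hypothetical, the population standard deviation is an assumption, and the sketch takes the two peak amplitudes as given because the text does not say how they are aggregated across normals.

```python
import statistics

def peak_position(gradient):
    """Index along the normal where the absolute gradient is largest."""
    return max(range(len(gradient)), key=lambda i: abs(gradient[i]))

def peaks_consistent(peak_positions, max_std=3.0):
    """Good directional consistency: peak positions scatter by less than 3 pixels."""
    return statistics.pstdev(peak_positions) < max_std

def contrast(hue_peak_amplitude, brightness_peak_amplitude):
    """Weighted sum of the two peak amplitudes with coefficients 0.6 and 0.4."""
    return 0.6 * hue_peak_amplitude + 0.4 * brightness_peak_amplitude
```

For example, a hue peak amplitude of 100 and a brightness peak amplitude of 50 give a contrast of 80.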
[0048] The rate of change of the saturation transition gradient is extracted by taking the second derivative of the saturation transition gradient sequence. The rate of change reflects the abruptness of the saturation transition. The smoothness of the saturation transition gradient sequence is calculated using local variance as a measure; a local variance less than 0.02 is considered a smooth transition. The saturation difference is defined as the product of the peak value of the saturation transition gradient and the rate of change; the larger the peak value and the more abrupt the change, the more significant the saturation difference.
[0049] The preset harmony level range includes five levels: contrast ratio less than 20 corresponds to "very weak contrast," contrast ratio between 20 and 50 corresponds to "weak contrast," contrast ratio between 50 and 100 corresponds to "moderate contrast," contrast ratio between 100 and 150 corresponds to "strong contrast," and contrast ratio greater than 150 corresponds to "very strong contrast." Saturation difference less than 0.15 corresponds to "harmonious," saturation difference between 0.15 and 0.30 corresponds to "basically harmonious," and saturation difference greater than 0.30 corresponds to "disharmonious." The harmony scoring table can be consulted based on the combination of contrast and saturation difference levels. In the harmony scoring table, the combination of "moderate contrast" and "harmonious" scores 90 points, the combination of "strong contrast" and "basically harmonious" scores 75 points, and the combination of "very strong contrast" and "disharmonious" scores 40 points. This score serves as the color harmony score for clothing item combinations.
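The level assignment and table lookup can be sketched as follows. Only the three table entries stated above are included; other combinations return `None` because the text does not list their scores. Treating each interval as inclusive at its lower bound is an assumption.

```python
CONTRAST_LEVELS = [
    (20.0, "very weak contrast"),
    (50.0, "weak contrast"),
    (100.0, "moderate contrast"),
    (150.0, "strong contrast"),
]

# Only the combinations given in the text; the full table is not disclosed.
HARMONY_TABLE = {
    ("moderate contrast", "harmonious"): 90,
    ("strong contrast", "basically harmonious"): 75,
    ("very strong contrast", "disharmonious"): 40,
}

def contrast_level(value):
    for upper, name in CONTRAST_LEVELS:
        if value < upper:
            return name
    return "very strong contrast"

def saturation_level(difference):
    if difference < 0.15:
        return "harmonious"
    if difference < 0.30:
        return "basically harmonious"
    return "disharmonious"

def harmony_score(contrast_value, saturation_difference):
    """Look up the score, or None when the combination is not listed."""
    key = (contrast_level(contrast_value), saturation_level(saturation_difference))
    return HARMONY_TABLE.get(key)
```

A full implementation would need the remaining twelve table entries, which would have to come from the complete scoring table.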
[0050] Comprehensively evaluating the candidate clothing item images based on the color stability score and the color coordination score, and selecting clothing item combinations whose scores exceed a preset value to output clothing matching schemes includes: statistically analyzing the color stability scores of the candidate clothing item images to form a stability score distribution, and the color coordination scores to form a coordination score distribution; identifying abnormal stability score samples that deviate from the center of the stability score distribution, and abnormal coordination score samples that deviate from the center of the coordination score distribution; correcting the abnormal samples to obtain the corrected color stability score and the corrected color coordination score; extracting the hue and brightness differences between candidate clothing item images to form color complementary features, and extracting the style type and pattern features to form style complementary features; calculating the matching complementarity from the color complementary features and the style complementary features; integrating the corrected color stability score, the corrected color coordination score, and the matching complementarity to form a comprehensive score; and selecting the candidate clothing item images whose comprehensive score is higher than the preset score and combining them to output the clothing matching schemes.
[0051] After obtaining the color stability score and color coordination score for each individual garment, a comprehensive evaluation of the candidate garment images is performed. For the set of candidate garments to be evaluated, the color stability score for each garment is calculated, and all stability scores are arranged in ascending order of value, plotting a frequency distribution curve to form a stability score distribution. Simultaneously, the color coordination scores for the candidate garment combinations are calculated, forming a coordination score distribution in the same manner. The mean and standard deviation of the stability score distribution are calculated, and scores deviating from the mean by more than twice the standard deviation are marked as abnormal stability score samples. The same processing is applied to the coordination score distribution, identifying abnormal coordination score samples that deviate from the center of the coordination score distribution by more than a set threshold.
[0052] For abnormal stability score samples, a truncation method is used for correction. If the abnormal score is higher than the mean plus two standard deviations, it is corrected to the mean plus two standard deviations; if the abnormal score is lower than the mean minus two standard deviations, it is corrected to the mean minus two standard deviations. The same correction rule is applied to abnormal color harmony score samples to obtain corrected color stability scores and corrected color harmony scores.
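The truncation rule can be sketched as follows. The use of the population standard deviation is an assumption, and the function name is hypothetical.

```python
import statistics

def truncate_outliers(scores, k=2.0):
    """Clip every score into the interval mean - k*std .. mean + k*std."""
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores)
    low, high = mean - k * std, mean + k * std
    return [min(max(s, low), high) for s in scores]
```

For nine scores of 50 and one score of 100, the mean is 55 and the standard deviation is 15, so the score of 100 is corrected to 85 while the others are unchanged.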
[0053] When extracting complementary color features between candidate clothing images, the clothing images are converted to the HSV color space. The hue difference between the top and bottom garments is calculated separately. When the hue difference is between 150 and 210 degrees, they are considered complementary colors, and a complementary color weighting coefficient of 0.9 is assigned. When the hue difference is between 30 and 60 degrees or between 270 and 330 degrees, they are considered analogous colors, and an analogous color weighting coefficient of 0.7 is assigned. The lightness difference between the top and bottom garments is calculated. When the lightness difference is greater than 0.3 and less than 0.6, it is considered a suitable contrast, and a lightness contrast weighting coefficient of 0.8 is assigned. The weighted sum of the hue difference weight and the lightness difference weight yields the complementary color feature value.
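The color complementary feature can be sketched as follows. Several details are assumptions because the text leaves them open: the hue difference is taken modulo 360 degrees, combinations outside the listed ranges get a weight of 0, lightness values are normalized to 0..1, and the two weights are combined with equal shares (the hypothetical `hue_share` parameter).

```python
def hue_relation_weight(hue_top, hue_bottom):
    """0.9 for complementary colors, 0.7 for analogous colors, else 0 (assumed)."""
    diff = (hue_top - hue_bottom) % 360.0
    if 150.0 <= diff <= 210.0:
        return 0.9
    if 30.0 <= diff <= 60.0 or 270.0 <= diff <= 330.0:
        return 0.7
    return 0.0

def lightness_weight(v_top, v_bottom):
    """0.8 when the lightness difference lies strictly between 0.3 and 0.6."""
    return 0.8 if 0.3 < abs(v_top - v_bottom) < 0.6 else 0.0

def color_complement_feature(hue_top, v_top, hue_bottom, v_bottom, hue_share=0.5):
    """Weighted sum of the hue weight and the lightness weight."""
    hue_w = hue_relation_weight(hue_top, hue_bottom)
    light_w = lightness_weight(v_top, v_bottom)
    return hue_share * hue_w + (1.0 - hue_share) * light_w
```

A top with hue 10 degrees and lightness 0.8 paired with a bottom with hue 190 degrees and lightness 0.4 is complementary with suitable contrast, giving a feature value of 0.85 under these assumptions.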
[0054] When extracting complementary style features, the style type is identified based on the contour features of clothing images, categorizing styles into formal, casual, and sporty types, and the fit features into slim, loose, and straight types. When the style types of the top and bottom belong to the same category, a style consistency weight of 0.85 is assigned; when the fit features present a combination of slim and loose or loose and slim, a fit complement weight of 0.8 is assigned. The complementary style feature value is obtained by multiplying the style type weight and the fit feature weight.
[0055] When calculating the matching complementarity, the color complementary feature value and the style complementary feature value are fused with weights of 0.6 and 0.4, respectively. The comprehensive score is formed by fusing the corrected color stability score, the corrected color harmony score, and the matching complementarity: the corrected color stability score is multiplied by a weight of 0.35, the corrected color harmony score by a weight of 0.4, and the matching complementarity by a weight of 0.25, and the three products are added together. The preset score is set to 0.75, and candidate clothing item combinations with a comprehensive score higher than 0.75 are selected. The selected clothing items are arranged in the order of top, bottom, and accessories to generate the output clothing matching scheme, which includes the clothing item numbers, the synthesized wearing image, and the comprehensive score.
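The final fusion and selection can be sketched as follows. Since the threshold is 0.75, this sketch assumes that the stability and harmony scores have been rescaled to the interval 0..1 before fusion; the text does not describe that rescaling, and the function names are hypothetical.

```python
def matching_complementarity(color_feature, style_feature):
    """Fuse the color and style complementary feature values (0.6 / 0.4)."""
    return 0.6 * color_feature + 0.4 * style_feature

def comprehensive_score(stability, harmony, complementarity):
    """Weights 0.35 / 0.4 / 0.25; all inputs assumed to lie in 0..1."""
    return 0.35 * stability + 0.4 * harmony + 0.25 * complementarity

def select_combinations(candidates, threshold=0.75):
    """Return the names of candidates whose comprehensive score exceeds the threshold.

    Each candidate is a tuple (name, stability, harmony, complementarity).
    """
    return [
        name
        for name, stability, harmony, complementarity in candidates
        if comprehensive_score(stability, harmony, complementarity) > threshold
    ]
```

For instance, a combination with stability 0.9, harmony 0.9, and complementarity 0.782 scores 0.8705 and is selected, while one with 0.5, 0.4, and 0.6 scores 0.485 and is not.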
[0056] A second aspect of the present invention provides an automatic clothing matching optimization system based on image recognition, the system comprising: an image synthesis unit, used to acquire a user human body image and candidate clothing item images, perform semantic segmentation on the user human body image to obtain the human body contour region, spatially deform the candidate clothing item images based on the human body contour region, and superimpose the deformed candidate clothing item images onto the user human body image to generate a synthesized wearing image; an illumination transformation unit, used to apply illumination transformations of multiple color temperatures to the synthesized clothing image to generate scene images under different lighting conditions; a color stabilization unit, used to extract the color feature vectors of the clothing region in the synthesized clothing image and each scene image, calculate the distance value between the color feature vector of each scene image and that of the synthesized clothing image, and generate a color stability score for each clothing item based on the distance values; a color coordination unit, used to extract the color values of adjacent regions between multiple clothing items in the synthesized clothing image, calculate the contrast and saturation difference between the color values of adjacent clothing items, and generate a color coordination score for the clothing item combination based on the contrast and saturation difference; and a comprehensive evaluation unit, used to comprehensively evaluate the candidate clothing item images based on the color stability score and the color coordination score, and select clothing item combinations with scores higher than the preset score to output clothing matching schemes.
[0057] A third aspect of the present invention provides an electronic device, comprising: processor; Memory used to store processor-executable instructions; The processor is configured to invoke instructions stored in the memory to execute the aforementioned method.
[0058] A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the aforementioned method.
[0059] The present invention may be a method, an apparatus, a system, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for performing various aspects of the invention.
[0060] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.
Claims
1. An automatic clothing matching optimization method based on image recognition, characterized in that the method comprises: acquiring a user human body image and candidate clothing item images, performing semantic segmentation on the user human body image to obtain a human body contour region, spatially deforming the candidate clothing item images based on the human body contour region, and superimposing the deformed candidate clothing item images onto the user human body image to generate a composite wearing image; applying illumination transformations of multiple color temperatures to the composite wearing image to generate scene images under different lighting conditions; extracting color feature vectors of the clothing region from the composite wearing image and from each scene image, calculating a distance value between the color feature vector of each scene image and the color feature vector of the composite wearing image, and generating a color stability score for each clothing item based on the distance values; extracting color values of adjacent regions between multiple clothing items in the composite wearing image, calculating the contrast and the saturation difference between the color values of adjacent clothing items, and generating a color coordination score for the clothing item combination based on the contrast and the saturation difference; and comprehensively evaluating the candidate clothing item images based on the color stability score and the color coordination score, and selecting clothing item combinations whose score is higher than a preset score to output a clothing matching scheme.
2. The method according to claim 1, characterized in that acquiring the user human body image and the candidate clothing item images, performing semantic segmentation on the user human body image to obtain the human body contour region, spatially deforming the candidate clothing item images based on the human body contour region, and superimposing the deformed candidate clothing item images onto the user human body image to generate the composite wearing image comprises: acquiring the user human body image and the candidate clothing item images, performing pose detection on the user human body image to identify human body joints, correcting the pose of the user human body image based on the human body joints to obtain a corrected user human body image, and performing background separation on the candidate clothing item images to obtain a main clothing image; extracting multi-scale feature maps from the corrected user human body image, fusing them to obtain a fused feature map, and performing pixel-level classification on the fused feature map to obtain the human body contour region; extracting the boundary contour line of the human body contour region, performing curvature analysis on the boundary contour line to determine control points, establishing a mapping relationship between the control points and corresponding points of the main clothing image, calculating a spatial transformation matrix, and performing pixel remapping on the main clothing image using the spatial transformation matrix to obtain the deformed candidate clothing item images; and generating a depth mask based on the human body contour region, using the depth mask to determine the front-to-back occlusion relationship between the deformed candidate clothing item images and the corrected user human body image, and performing pixel fusion according to the front-to-back occlusion relationship to generate the composite wearing image.
3. The method according to claim 2, characterized in that performing curvature analysis on the boundary contour line to determine control points, establishing a mapping relationship between the control points and corresponding points of the main clothing image, calculating the spatial transformation matrix, and performing pixel remapping on the main clothing image using the spatial transformation matrix to obtain the deformed candidate clothing item images comprises: performing discrete curvature calculation on the pixels of the boundary contour line to obtain a curvature distribution sequence, filtering and smoothing the curvature distribution sequence, identifying curvature extreme points from the smoothed curvature distribution sequence, and clustering the curvature extreme points to determine the control points that characterize the human body contour; extracting the edge contour of the main clothing image, calculating its curvature, and identifying the corresponding control points of the main clothing image; establishing a point-to-point mapping relationship based on the spatial positional relationship between the control points characterizing the human body contour and the corresponding control points of the main clothing image; constructing, based on the point-to-point mapping relationship, a deformation energy function that reflects the degree of deformation, and solving for the spatial transformation matrix by minimizing the deformation energy function; and applying the spatial transformation matrix to the coordinates of each pixel in the main clothing image to calculate transformed pixel coordinates, sampling pixel values from the main clothing image, and filling them into the positions given by the transformed pixel coordinates to complete the pixel remapping and obtain the deformed candidate clothing item images.
4. The method according to claim 1, characterized in that applying illumination transformations of multiple color temperatures to the composite wearing image to generate scene images under different lighting conditions comprises: extracting the luminance component and the chrominance components of the composite wearing image, and performing semantic segmentation on the composite wearing image to identify a clothing region and a skin region; extracting fabric texture features of the clothing region and surface reflection features of the skin region, and calculating a color temperature sensitivity coefficient for the clothing region based on the fabric texture features and a color temperature sensitivity coefficient for the skin region based on the surface reflection features; generating multiple color temperature parameters from a preset color temperature range, and multiplying each color temperature parameter by the color temperature sensitivity coefficient of the clothing region and by the color temperature sensitivity coefficient of the skin region to obtain region-adaptive color temperature parameters for the clothing region and the skin region, respectively; applying the region-adaptive color temperature parameters of the clothing region and the skin region to the chrominance components of the corresponding region to perform a color space transformation, yielding multiple transformed chrominance components; and calculating an illumination intensity adjustment coefficient based on the color difference between the transformed chrominance components and the original chrominance components, adjusting the transformed chrominance components according to the illumination intensity adjustment coefficient, and fusing the adjusted chrominance components with the luminance component to reconstruct the scene images under different lighting conditions.
5. The method according to claim 1, characterized in that extracting color feature vectors of the clothing region from the composite wearing image and from each scene image, calculating a distance value between the color feature vector of each scene image and the color feature vector of the composite wearing image, and generating a color stability score for each clothing item based on the distance values comprises: constructing a multi-scale Gaussian pyramid for the clothing region in the composite wearing image and in each scene image, extracting hue gradient direction distribution features, saturation region distribution features, and brightness level distribution features at each level of the multi-scale Gaussian pyramid, and fusing these features across the pyramid levels to form the color feature vector; extracting the deviation values of the color feature vector of each scene image relative to the color feature vector of the composite wearing image in the hue dimension, the saturation dimension, and the brightness dimension, normalizing the deviation values to obtain a hue weight factor, a saturation weight factor, and a brightness weight factor, and combining the three weight factors into a weight vector; extracting the difference components between the color feature vector of the composite wearing image and the color feature vector of each scene image in the hue, saturation, and brightness dimensions, weighting the difference components by the weight vector, and summing the weighted results to obtain the distance value of each scene image; statistically analyzing the distance values of the scene images to calculate a distance mean and a distance standard deviation; and generating the color stability score for each clothing item based on the distance mean and the distance standard deviation.
6. The method according to claim 1, characterized in that extracting color values of adjacent regions between multiple clothing items in the composite wearing image, calculating the contrast and the saturation difference between the color values of adjacent clothing items, and generating a color coordination score for the clothing item combination based on the contrast and the saturation difference comprises: detecting the boundary contour lines between the multiple clothing items in the composite wearing image, extending to both sides along the normal direction of the boundary contour lines to form sampling band regions, and extracting the hue, saturation, and brightness values of the pixels along the normal direction within the sampling band regions; calculating the rate of change of the hue values along the normal direction to form a hue transition gradient, the rate of change of the saturation values along the normal direction to form a saturation transition gradient, and the rate of change of the brightness values along the normal direction to form a brightness transition gradient, and combining the hue transition gradient, the saturation transition gradient, and the brightness transition gradient to form the color values of the adjacent regions; extracting the peak position and the direction consistency of the hue transition gradient from the color values of the adjacent regions to calculate the contrast between the color values of adjacent clothing items, and extracting the rate of change and the smoothness of the saturation transition gradient from the color values of the adjacent regions to calculate the saturation difference between the color values of adjacent clothing items; and assigning the contrast and the saturation difference to their corresponding coordination level intervals, and looking up a coordination scoring table according to those coordination level intervals to generate the color coordination score of the clothing item combination.
7. The method according to claim 1, characterized in that comprehensively evaluating the candidate clothing item images based on the color stability score and the color coordination score, and selecting clothing item combinations whose score is higher than the preset score to output a clothing matching scheme comprises: statistically analyzing the color stability scores of the multiple candidate clothing item images to form a stability score distribution, and statistically analyzing the color coordination scores of the multiple candidate clothing item images to form a coordination score distribution; identifying, from the stability score distribution, abnormal stability score samples that deviate from the center of the stability score distribution, and identifying, from the coordination score distribution, abnormal coordination score samples that deviate from the center of the coordination score distribution; correcting the abnormal stability score samples and the abnormal coordination score samples to obtain a corrected color stability score and a corrected color coordination score; extracting the hue differences and brightness differences between the candidate clothing item images to form color complementary features, extracting the style types and pattern features of the candidate clothing item images to form style complementary features, and calculating a matching complementarity based on the color complementary features and the style complementary features; fusing the corrected color stability score, the corrected color coordination score, and the matching complementarity to form a comprehensive score; and selecting the candidate clothing item images whose comprehensive score is higher than the preset score, and combining the selected candidate clothing item images to output the clothing matching scheme.
8. An image recognition-based automatic clothing matching optimization system for implementing the method according to any one of claims 1 to 7, characterized in that the system comprises: an image synthesis unit, configured to acquire a user human body image and candidate clothing item images, perform semantic segmentation on the user human body image to obtain a human body contour region, spatially deform the candidate clothing item images based on the human body contour region, and superimpose the deformed candidate clothing item images onto the user human body image to generate a composite wearing image; an illumination transformation unit, configured to apply illumination transformations of multiple color temperatures to the composite wearing image to generate scene images under different lighting conditions; a color stability unit, configured to extract color feature vectors of the clothing region from the composite wearing image and from each scene image, calculate a distance value between the color feature vector of each scene image and the color feature vector of the composite wearing image, and generate a color stability score for each clothing item based on the distance values; a color coordination unit, configured to extract color values of adjacent regions between multiple clothing items in the composite wearing image, calculate the contrast and the saturation difference between the color values of adjacent clothing items, and generate a color coordination score for the clothing item combination based on the contrast and the saturation difference; and a comprehensive evaluation unit, configured to comprehensively evaluate the candidate clothing item images based on the color stability score and the color coordination score, and to select clothing item combinations whose score is higher than the preset score to output a clothing matching scheme.
9. An electronic device, characterized in that the electronic device comprises: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to execute the method according to any one of claims 1 to 7.
10. A computer-readable storage medium having computer program instructions stored thereon, characterized in that the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 7.
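For illustration only, the weighted distance and its statistics recited in claim 5 can be sketched as below. The sketch makes several simplifying assumptions that are not in the source: each color feature vector is reduced to a three-element (hue, saturation, brightness) summary instead of fused pyramid features, hue is treated as a linear rather than circular quantity, the weight factors are obtained by dividing each deviation by the sum of the three deviations, and the final mapping `1 / (1 + mean + std)` is a placeholder, since the text states only that the score is based on the distance mean and standard deviation. The function names are hypothetical.

```python
import numpy as np


def weighted_distances(reference, scenes):
    """Distance of each scene image's feature vector from the reference.

    reference: shape (3,), (hue, saturation, brightness) of the composite image.
    scenes:    shape (n, 3), one row per scene image.
    """
    reference = np.asarray(reference, dtype=float)
    scenes = np.asarray(scenes, dtype=float)

    # Per-scene deviation in each of the three dimensions.
    diffs = np.abs(scenes - reference)

    # Normalize the deviations into weight factors (assumed: divide by their sum).
    # Scenes with zero total deviation fall back to uniform weights.
    totals = diffs.sum(axis=1, keepdims=True)
    weights = np.divide(diffs, totals,
                        out=np.full_like(diffs, 1.0 / 3.0),
                        where=totals > 0)

    # Weight the difference components and sum them into one distance per scene.
    return (weights * diffs).sum(axis=1)


def stability_score(distances):
    # Placeholder mapping (assumption): smaller and more consistent
    # distances give a score closer to 1.
    distances = np.asarray(distances, dtype=float)
    return 1.0 / (1.0 + distances.mean() + distances.std())
```

Under these assumptions a garment whose summary barely moves across color temperatures receives a score near 1, while a garment whose color shifts strongly, or inconsistently from scene to scene, receives a lower score.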