A method for generating cascaded camouflage samples

By using the Cascaded Camouflage Sample Generation Method (SKV-CG) and employing adaptive neighborhood feature filling and style transfer techniques, high-quality and diverse camouflage samples are generated. This addresses the shortcomings of existing sample generation methods, improves camouflage effects and background adaptability, and meets the complex environmental requirements of intelligent detection systems.

CN122244581A · Pending · Publication Date: 2026-06-19 · DALIAN UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
DALIAN UNIV OF TECH
Filing Date
2026-02-05
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies struggle to generate high-quality, diverse camouflage samples with visual consistency and concealment, resulting in poor detection performance of intelligent detection systems in complex environments.

Method used

The Cascaded Camouflage Sample Generation Method (SKV-CG) is adopted. By constructing an adaptive neighborhood feature filling model, adaptive high-dimensional clustering and style transfer technology, camouflage samples are generated to ensure that the samples blend naturally with the background.

Benefits of technology

It significantly improves the quality and diversity of camouflaged samples, enhances the concealment and background adaptability of the camouflage effect, reduces the difficulty of data collection and annotation, and meets the needs of real-time camouflage.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122244581A_ABST
Patent Text Reader

Abstract

This invention discloses a cascaded camouflage sample generation method, comprising the following steps: constructing an adaptive neighborhood feature filling model; generating dynamic camouflage patterns through adaptive high-dimensional clustering; and constructing style transfer and generating camouflage samples. This invention, through multi-step collaborative optimization, combines neighborhood multi-dimensional sampling, intelligent region filling, adaptive high-dimensional clustering, and style transfer techniques. The generated camouflage samples can naturally blend with the background environment, avoiding the shortcomings of simple pattern mapping in traditional methods. This invention significantly improves the concealment and background adaptability of the camouflage effect, making the generated camouflage samples more realistic in complex environments. This invention, through adaptive high-dimensional clustering dynamic camouflage pattern generation combined with transfer learning methods, can effectively improve the efficiency of the generation process and reduce the computational burden; it can quickly generate high-quality camouflage samples in practical applications, meeting real-time camouflage requirements.

Description

Technical Field

[0001] This invention relates to the field of autonomous driving, and in particular to a cascaded camouflage sample generation method (SKV-CG).

Background Technology

[0002] While current intelligent visual perception algorithms have made some progress in image processing and target recognition, the scarcity of camouflaged target samples and the difficulty in collecting and labeling high-quality samples have led to performance bottlenecks in intelligent detection systems. Furthermore, existing methods often struggle to balance the efficiency and diversity of generated samples, resulting in monotonous training data and an inability to effectively address detection needs in complex environments. Therefore, improving the quality of generated camouflage samples, enriching sample diversity, and ensuring a high degree of integration in visual consistency and concealment have become critical issues that urgently need to be addressed.

[0003] Against the backdrop of the rapid development of camouflage technology, Chinese patent "CN116563421A" proposes a method for generating adversarial camouflage patterns based on target camouflage. This method effectively reduces the accuracy of target identification by accurately dividing the target area and background area and optimizing the camouflage pattern through adversarial training. However, in practical applications, insufficient background adaptability remains a problem. This method uses an initial adversarial camouflage pattern, extracts the final adversarial camouflage for the background area, and then updates the final camouflage pattern through 400 to 600 iterations based on the initial adversarial camouflage. While this iterative optimization method can improve the camouflage effect, each iteration mainly optimizes based on the initial camouflage and target area coverage. The generated camouflage pattern is more like a simple pattern map than a natural camouflage effect. The final generated image samples lack realism, appearing as if the camouflage pattern is directly covered on the surface of the target object, rather than like a real car or airplane wearing camouflage. Therefore, the camouflage patterns generated by this method are not suitable for sample generation and cannot provide a natural camouflage effect.

[0004] Currently, most traditional adversarial camouflage generation methods rely on a rule-based optimization process, typically generating camouflage patterns through simple color matching and background overlay. While these methods can be effective in static environments, in complex and dynamic backgrounds the generated camouflage samples fail to blend naturally into the background and lack the necessary concealment and realism, so they cannot be used for high-quality sample generation tasks.

Summary of the Invention

[0005] To address the aforementioned issues, this invention proposes a cascaded camouflage sample generation method (SKV-CG), which significantly improves the quality and diversity of camouflage samples while ensuring that the generated camouflage patterns have high visual realism and concealment. This effectively reduces the difficulty of high-quality data collection and annotation, thereby saving collection and annotation costs and providing richer and more reliable camouflage sample data for the training of intelligent detection models.

[0006] To achieve the above objectives, the technical solution of the present invention is as follows: A method for generating cascaded camouflage samples, comprising the following steps:

[0007] A: Construct an adaptive neighborhood feature filling model; B: Generate dynamic camouflage colors through adaptive high-dimensional clustering; C: Construct style transfer and generate camouflage samples.

[0008] Furthermore, the method for constructing the adaptive neighborhood feature filling model described in step A includes the following steps: A1: Generate the semantic segmentation and bounding boxes. Based on the Segment Anything Model (SAM), unmasked or low-quality masked sample images are processed to obtain the bounding boxes of target regions. SAM first performs global semantic analysis on the image to identify potential targets and generate the corresponding bounding boxes. Once a target is obtained, the polygonal region corresponding to its bounding box is defined as the target region. The outer edge of the target region is then expanded according to preset expansion parameters to generate the target's neighborhood region. The geometric boundaries and coordinate information of the target region and its neighborhood region are output to localize and constrain the subsequent camouflage processing.

[0009] A2: Perform neighborhood multidimensional sampling to obtain color patches. Based on the semantic segmentation results, the target region to be camouflaged is determined and extended outward by $r$ pixels to form the neighborhood of the target region. The neighborhood covers the target itself and the environmental features of the surrounding area. Patches for filling are selected from the pixel library of the target neighborhood by random sampling. Let the number of pixels in the neighborhood be $m$ and the number of invalid pixels in the target region be $n$; then the probability of selecting a patch is $p = 1/(m - n)$. If the number of sampling points selected is $s$, the pixel size of each patch is $s^2$.
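The sampling in step A2 can be illustrated with a minimal Python sketch. It assumes a binary target mask stored as nested lists, a square (Chebyshev) expansion by r pixels, and uniform selection over the neighborhood pixels that lie outside the target; the function names `expand_mask`, `sampling_pool`, and `sample_patch_centers` are hypothetical and are not taken from the patent.

```python
import random

def expand_mask(mask, r):
    """Grow a binary target mask outward by r pixels (square neighborhood)."""
    h, w = len(mask), len(mask[0])
    grown = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for ny in range(max(0, y - r), min(h, y + r + 1)):
                    for nx in range(max(0, x - r), min(w, x + r + 1)):
                        grown[ny][nx] = 1
    return grown

def sampling_pool(mask, r):
    """Neighborhood pixels that lie outside the target (the valid ones)."""
    grown = expand_mask(mask, r)
    return [(y, x) for y, row in enumerate(grown) for x, v in enumerate(row)
            if v and not mask[y][x]]

def sample_patch_centers(mask, r, count, seed=0):
    """Draw patch centers uniformly at random from the valid pool."""
    rng = random.Random(seed)
    pool = sampling_pool(mask, r)
    return [rng.choice(pool) for _ in range(count)]

# Toy example: a single target pixel in a 7x7 image, expanded by r = 2.
mask = [[0] * 7 for _ in range(7)]
mask[3][3] = 1
pool = sampling_pool(mask, 2)
centers = sample_patch_centers(mask, 2, count=5)
```

In this toy case the expanded neighborhood holds 25 pixels and the target holds 1, so under the uniform-sampling assumption each of the 24 valid pixels is drawn with probability 1/24.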

[0010] A3: Generate random patches and stitch them together. Within the neighborhood of the target region, based on the preset patch size range and number threshold, and adjusted according to the resolution of the dataset image, multiple candidate patch regions are randomly sampled. For each patch, its texture / color block is extracted and scaled and edge-trimmed. The patches are then stitched into the target region at random positions and in random order, using an overlapping coverage strategy during stitching, and weighted fusion is performed on the overlapping areas to eliminate seams. After stitching, superpixel-constrained boundary reshaping is performed on the target region boundary, and morphological dilation and erosion are combined to smooth the stitched region and its outer edges, and then filled into the target region to obtain the final filling result.

[0011] Furthermore, the method for generating dynamic camouflage through adaptive high-dimensional clustering described in step B includes the following steps: B1: Determine the initial cluster centers for K-means++. Clustering is performed on the result of the target region filling from the previous step. A genetic algorithm is introduced to optimize the traditional K-means clustering algorithm. The optimized K-means++ algorithm determines the affiliation of each pixel in the image by calculating its Euclidean distance to each cluster center, as shown in the following formula:

$$d(p, c) = \sqrt{\sum_{i=1}^{a} (p_i - c_z)^2} \tag{1}$$

where $d(p, c)$ is the distance from pixel $p$ to cluster center $c$, $a$ is the total number of pixels in the image, $p_i$ denotes the position of the $i$-th sample pixel in the cluster, and $c_z$ denotes the $z$-th cluster center.

[0012] Each pixel is assigned to the cluster center to which its distance is smallest, which generates the membership matrix. The membership degree of each pixel reflects the relationship between that pixel and each cluster center.
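The nearest-center assignment can be sketched as follows. This is only an illustration: it assumes each pixel is represented as an RGB tuple (the text does not fix the feature space) and it encodes the membership matrix as one-hot rows; `euclidean` and `membership_matrix` are hypothetical names.

```python
import math

def euclidean(p, c):
    """Euclidean distance between a pixel vector and a cluster center."""
    return math.sqrt(sum((pi - ci) ** 2 for pi, ci in zip(p, c)))

def membership_matrix(pixels, centers):
    """One row per pixel; a 1 marks the nearest cluster center."""
    matrix = []
    for p in pixels:
        distances = [euclidean(p, c) for c in centers]
        nearest = distances.index(min(distances))
        matrix.append([1 if z == nearest else 0 for z in range(len(centers))])
    return matrix

pixels = [(0, 0, 0), (250, 250, 250)]
centers = [(10, 10, 10), (240, 240, 240)]
membership = membership_matrix(pixels, centers)
```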

[0013] B2: Update cluster centers. During clustering, the cluster centers are updated from the mean of the pixels within each category. This process iterates until a preset convergence condition is met. The cluster center update formula is as follows:

$$c_z^{(t+1)} = \frac{1}{|S_z|} \sum_{p_i \in S_z} p_i \tag{2}$$

where $c_z^{(t+1)}$ is the position of the $z$-th cluster center after iteration $t+1$, and $S_z$ is the set of all pixels contained in the $z$-th cluster.

[0014] If the average distance between a cluster center and all pixels in its cluster is less than a preset threshold, the clustering process is considered complete and converged; if it is greater than the threshold, the clustering update continues.
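A minimal sketch of the update-and-converge loop follows. It assumes the standard K-means update (each center becomes the mean of its member pixels) and the stopping rule stated above (average member-to-center distance below a threshold); the `max_iter` cap and all function names are assumptions added for illustration.

```python
import math

def dist(p, c):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, c)))

def assign(pixels, centers):
    """Index of the nearest center for every pixel."""
    return [min(range(len(centers)), key=lambda z: dist(p, centers[z]))
            for p in pixels]

def update(pixels, labels, centers):
    """Move each center to the mean of its member pixels."""
    new_centers = []
    for z, old in enumerate(centers):
        members = [p for p, l in zip(pixels, labels) if l == z]
        if members:
            new_centers.append(tuple(sum(ch) / len(members) for ch in zip(*members)))
        else:
            new_centers.append(old)  # an empty cluster keeps its center
    return new_centers

def converged(pixels, labels, centers, threshold):
    """True when every cluster's average member distance is below threshold."""
    for z, c in enumerate(centers):
        members = [p for p, l in zip(pixels, labels) if l == z]
        if members and sum(dist(p, c) for p in members) / len(members) >= threshold:
            return False
    return True

def kmeans(pixels, centers, threshold, max_iter=100):
    labels = assign(pixels, centers)
    for _ in range(max_iter):
        centers = update(pixels, labels, centers)
        labels = assign(pixels, centers)
        if converged(pixels, labels, centers, threshold):
            break
    return centers, labels

pixels = [(0, 0, 0), (2, 0, 0), (100, 100, 100), (102, 100, 100)]
centers, labels = kmeans(pixels, [(0, 0, 0), (100, 100, 100)], threshold=2.0)
```

The threshold trades run time against how tightly the extracted dominant colors fit their pixels; the value 2.0 here is arbitrary.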

[0015] B3: Determine the final cluster centers. The quality of a clustering result is evaluated with a fitness function $F$ applied to each individual; a higher fitness value indicates a better clustering result. The fitness function is defined by formula (3), where $k$ is the number of cluster centers in the clustering process and $p_j$ denotes the position of the $j$-th sample pixel within the cluster.
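As an illustration of fitness-based selection, the sketch below scores candidate sets of cluster centers and keeps the best one. The exact fitness formula is not reproduced in this text, so the score used here, the inverse of one plus the within-cluster sum of squared distances, is an assumption chosen only because it grows as clusters tighten; `fitness`, `nearest`, and `best_candidate` are hypothetical names.

```python
def sq_dist(p, c):
    return sum((a - b) ** 2 for a, b in zip(p, c))

def nearest(p, centers):
    return min(range(len(centers)), key=lambda z: sq_dist(p, centers[z]))

def fitness(pixels, labels, centers):
    """Assumed fitness: higher when pixels sit closer to their centers."""
    sse = sum(sq_dist(p, centers[z]) for p, z in zip(pixels, labels))
    return 1.0 / (1.0 + sse)

def best_candidate(pixels, candidates):
    """Pick the candidate center set with the highest fitness."""
    def score(centers):
        labels = [nearest(p, centers) for p in pixels]
        return fitness(pixels, labels, centers)
    return max(candidates, key=score)

pixels = [(0, 0, 0), (2, 0, 0), (100, 100, 100), (102, 100, 100)]
tight = [(1, 0, 0), (101, 100, 100)]
loose = [(50, 50, 50), (60, 60, 60)]
chosen = best_candidate(pixels, [loose, tight])
```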

[0016] Furthermore, the method for constructing style transfer and generating camouflage samples in step C includes the following steps: C1: Extract content and style features. Content features and style features are extracted from the target image using a VGG19 deep convolutional neural network. Content features capture high-level semantic information of the target image, including shape and edges. Style features are low-level features of the target image, including texture and color distribution. Content features are taken from the output of the conv4_2 convolutional layer of the VGG19 network, while style features are obtained by calculating the Gram matrices of different convolutional layers.
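The Gram matrix used for style features can be sketched in plain Python. The sketch assumes a layer's feature map has already been flattened to one list of activations per channel; it does not run VGG19, and `gram_matrix` is a hypothetical helper name.

```python
def gram_matrix(features):
    """Gram matrix G[i][j] = sum over positions of F[i][k] * F[j][k].

    `features` holds one flattened activation list per channel.
    """
    n = len(features)
    return [[sum(a * b for a, b in zip(features[i], features[j]))
             for j in range(n)]
            for i in range(n)]

# Two channels, two spatial positions each.
features = [[1, 2], [3, 4]]
gram = gram_matrix(features)
```

Because each entry is a channel-to-channel correlation summed over all positions, the Gram matrix discards spatial layout and keeps texture statistics, which is why it serves as a style descriptor.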

[0017] C2: Calculate content loss and style loss. Content loss and style loss are the key components in optimizing the target image. Content loss measures the difference in high-level semantic features between the generated and target images. The content loss $L_{content}$ is computed from the squared error as follows:

$$L_{content} = \frac{1}{2} \sum_{i,j} \left(F_{ij}^{l} - P_{ij}^{l}\right)^2 \tag{4}$$

where $F_{ij}^{l}$ and $P_{ij}^{l}$ represent the content information of the generated image and the original image, respectively, at the pixel with horizontal coordinate $i$ and vertical coordinate $j$ in convolutional layer $l$.

[0018] Style loss measures the texture consistency between the generated image and the target image across multi-level features. The difference between Gram matrices is used as the style loss, and the formula is as follows:

$$L_{style} = \sum_{l} w_l \frac{1}{4 N_l^2 M_l^2} \sum_{i,j} \left(G_{ij}^{l} - A_{ij}^{l}\right)^2 \tag{5}$$

where $G_{ij}^{l}$ is the element in row $i$, column $j$ of the Gram matrix of the generated image at layer $l$; $A_{ij}^{l}$ is the element in row $i$, column $j$ of the Gram matrix of the target image at layer $l$; $w_l$ is the weight of layer $l$; and $N_l$ and $M_l$ are the number of channels and the spatial resolution of layer $l$, respectively.

[0019] C3: Weighted combination of content loss and style loss and optimized generation of the image. To minimize the content and style losses, a limited-memory quasi-Newton method, namely the L-BFGS optimization algorithm, is used to optimize the total loss function of the generated image. The generated image is optimized iteratively: after each iteration, its pixels are adjusted so that it approximates the content and texture features of the target image.

[0020] The total loss function combines content features and style features, using the weights $\alpha$ and $\beta$ to balance the importance of the two. The formula is as follows:

$$L_{total} = \alpha L_{content} + \beta L_{style} \tag{6}$$

where $\alpha$ and $\beta$ are the weighting coefficients for content loss and style loss, respectively.
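The three losses can be put together in a short sketch. It assumes the widely used formulation for VGG-based style transfer (content loss as half the summed squared feature difference, style loss as the layer-weighted squared Gram difference normalized by 4·N²·M²); feature maps are given as one flattened list per channel, and the weights passed as `alpha` and `beta` are arbitrary illustration values.

```python
def gram(features):
    n = len(features)
    return [[sum(a * b for a, b in zip(features[i], features[j]))
             for j in range(n)] for i in range(n)]

def content_loss(generated, original):
    """Half the summed squared difference between two feature maps."""
    return 0.5 * sum((f - p) ** 2
                     for row_f, row_p in zip(generated, original)
                     for f, p in zip(row_f, row_p))

def style_loss(gen_layers, target_layers, weights):
    """Layer-weighted squared difference between Gram matrices."""
    total = 0.0
    for gen, tgt, w in zip(gen_layers, target_layers, weights):
        n_l, m_l = len(gen), len(gen[0])  # channels, spatial positions
        g, a = gram(gen), gram(tgt)
        sq = sum((x - y) ** 2 for rg, ra in zip(g, a) for x, y in zip(rg, ra))
        total += w * sq / (4 * n_l ** 2 * m_l ** 2)
    return total

def total_loss(l_content, l_style, alpha, beta):
    return alpha * l_content + beta * l_style

l_c = content_loss([[1, 2], [3, 4]], [[1, 2], [3, 6]])
l_s = style_loss([[[1, 0], [0, 1]]], [[[1, 0], [0, 0]]], weights=[1.0])
l_total = total_loss(l_c, l_s, alpha=1.0, beta=64.0)
```

Raising `beta` relative to `alpha` pushes the result toward the background texture; raising `alpha` preserves more of the target's structure.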

[0021] Compared with the prior art, the present invention has the following beneficial effects: 1. This invention utilizes multi-step collaborative optimization, combining techniques such as neighborhood multidimensional sampling, intelligent region filling, adaptive high-dimensional clustering, and style transfer, to generate camouflage samples that blend naturally with the background environment, avoiding the shortcomings of simple pattern mapping in traditional methods. This invention significantly improves the concealment and background adaptability of the camouflage effect, making the generated camouflage samples more realistic in complex environments.

[0022] 2. This invention utilizes adaptive high-dimensional clustering for dynamic camouflage color generation combined with transfer learning methods, effectively improving the efficiency of the generation process and reducing computational burden. Compared with existing technologies, this invention significantly improves generation efficiency while ensuring sample quality, enabling the rapid generation of high-quality camouflage samples in practical applications to meet real-time camouflage requirements.

Attached Figure Description

[0023] Figure 1 is a flowchart of the present invention.

[0024] Figure 2 is a diagram illustrating the process of neighborhood multidimensional sampling.

[0025] Figure 3 shows generated sample images with high-quality camouflage.

Detailed Implementation

[0026] The present invention is further described below with reference to the accompanying drawings. As shown in Figure 1, a cascaded camouflage sample generation method (SKV-CG) includes the following steps: A: Construct an adaptive neighborhood feature filling model. A1: Generate the semantic segmentation and bounding boxes. Because raw samples (such as unmasked or low-quality masked samples) lend themselves to automated target segmentation, this invention employs a large model for target segmentation to obtain standard target edge contours. These edge contours provide the location information needed for subsequent neighborhood multidimensional sampling, filling, and final sample generation. This eases the acquisition and annotation of high-quality samples, saves labor costs, and improves efficiency, thus enabling large-scale sample generation.

[0027] A2: Perform neighborhood multidimensional sampling to obtain color patches. This invention employs neighborhood multidimensional sampling and random patch stitching, as shown in Figure 2. Based on the semantic segmentation results, the white area in the image is first determined as the target region to be camouflaged, and the black area extending outward by r pixels is the target's neighborhood region. The value of r is adjusted according to the dataset. The resulting neighborhood region includes the target itself and also takes into account the environmental features of the surrounding area. Random pixels are selected within the target's neighborhood region; the red pixel in the middle of Figure 2 is such a sampling center. In this way, a large number of pixels carrying background content and texture features are collected during neighborhood sampling and stored according to the location information of the target area.

[0028] A3: Generate random patches and stitch them together. A random patch generation and stitching method is used to fill the target area to be camouflaged. The sampled pixels carry the background information and texture features around the target area. Patches are selected from these pixels through random sampling, stitched together, and filled into the target area so that its texture blends naturally and consistently with the background. To further enhance the naturalness and visual consistency of the camouflaged area, this invention combines superpixel segmentation with morphological operations. Specifically, the SLIC superpixel algorithm divides the image into irregular regions based on color and spatial information, removing isolated pixels and merging visually similar regions to achieve a smooth transition. This removes rough boundaries in the image and makes the camouflage effect more natural. Next, erosion and dilation operations further refine the image: erosion removes noise and unnatural boundary fragments, while dilation fills the gaps between regions, smoothing the camouflage and the transition between the target and the background. The color blocks in the target area thus become more concealed and natural, overall visual consistency improves, and the filled target area is the final filling result.
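The morphological smoothing step can be illustrated on a binary mask. This is a minimal sketch assuming a 3x3 structuring element and nested-list masks; in it, dilation followed by erosion (closing) fills small holes, and erosion followed by dilation (opening) removes isolated specks. The helper names are hypothetical, and the SLIC superpixel step is not shown.

```python
def _window(mask, y, x):
    """Values in the 3x3 window around (y, x), clipped at the image border."""
    h, w = len(mask), len(mask[0])
    return [mask[ny][nx]
            for ny in range(max(0, y - 1), min(h, y + 2))
            for nx in range(max(0, x - 1), min(w, x + 2))]

def dilate(mask):
    """A pixel becomes 1 if any pixel in its window is 1."""
    return [[int(any(_window(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]

def erode(mask):
    """A pixel stays 1 only if every pixel in its window is 1."""
    return [[int(all(_window(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]

def closing(mask):
    return erode(dilate(mask))

def opening(mask):
    return dilate(erode(mask))

# A 3x3 region with a one-pixel hole in its center.
region = [[0] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        region[y][x] = 1
region[3][3] = 0
closed = closing(region)

# A single isolated pixel.
speck = [[0] * 7 for _ in range(7)]
speck[3][3] = 1
opened = opening(speck)
```

Here `closed` restores the full 3x3 block with the hole filled, while `opened` is empty because the isolated pixel is removed.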

[0029] B: Generate dynamic camouflage through adaptive high-dimensional clustering. B1: Determine the initial cluster centers for K-means++. The traditional K-means clustering algorithm selects its initial cluster centers with high randomness and low efficiency. To address this, a genetic algorithm is introduced to optimize the clustering of the target area after color patch filling. The genetic algorithm generates an initial population and calculates the fitness function so that the dominant colors represented by the cluster centers are highly distinguishable. Individuals with high fitness are selected, and new cluster centers are generated through crossover and mutation, which improves the accuracy and efficiency of color extraction and helps the generated camouflage adapt to complex background environments. The optimized K-means++ algorithm determines each pixel's affiliation by calculating its distance to every cluster center. Each pixel is assigned to the cluster center with the smallest distance, generating a membership matrix that reflects the relationship between each pixel and each cluster center. This keeps the color of each image region consistent with its dominant color, so that the generated camouflage sample blends naturally with the surrounding environment and avoids the color inconsistencies or missing visual transitions seen in traditional methods.
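A genetic search over initial cluster centers might look like the sketch below. Every detail here is an assumption made for illustration: individuals are lists of k centers drawn from the pixels, fitness is the inverse of one plus the within-cluster squared error, the fitter half survives unchanged, crossover picks each center from one of two parents, and mutation adds uniform jitter. The population size, generation count, and jitter range are arbitrary.

```python
import random

def sse(pixels, centers):
    """Sum of squared distances from each pixel to its nearest center."""
    return sum(min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
               for p in pixels)

def fitness(pixels, centers):
    return 1.0 / (1.0 + sse(pixels, centers))

def evolve_centers(pixels, k, pop_size=8, generations=20, jitter=5.0, seed=0):
    rng = random.Random(seed)
    # Initial population: each individual is k centers sampled from the pixels.
    population = [rng.sample(pixels, k) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda ind: fitness(pixels, ind), reverse=True)
        parents = population[: pop_size // 2]  # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: take each center from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: perturb every channel of every center slightly.
            child = [tuple(v + rng.uniform(-jitter, jitter) for v in c)
                     for c in child]
            children.append(child)
        population = parents + children
    return max(population, key=lambda ind: fitness(pixels, ind))

pixels = [(0, 0, 0), (2, 0, 0), (100, 100, 100), (102, 100, 100)]
best = evolve_centers(pixels, k=2)
```

The returned centers would then seed the K-means iterations, replacing a purely random initialization.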

[0030] B2: Update cluster centers. During clustering, the cluster centers are updated by calculating the average distance from the pixels within each category to the cluster centers. This process iterates until a preset convergence condition is met. A distance threshold is set, with a value that depends on the required quality of the generated camouflage: if the pixel distance to a cluster center is less than the threshold, the clustering process is considered to have converged; if the distance is greater than the threshold, the update continues until convergence. This mechanism ensures that the cluster centers gradually stabilize, resulting in camouflage patterns with high color consistency and detail fidelity.

[0031] B3: Determine the final cluster centers. To evaluate the quality of clustering results, this invention uses a fitness function; a higher fitness value indicates a better clustering result. The fitness function allows the optimal cluster centers to be selected more precisely, so that the generated camouflage colors blend closely with the background and the camouflage effect is optimized.

[0032] C: Construct style transfer and generate camouflage samples. C1: Extract content and style features. In this implementation, a VGG19 deep convolutional neural network extracts features from the target image. Content features capture high-level semantic information of the target image, such as shape and edges, while style features capture low-level information, such as texture and color distribution. Extracting the two separately ensures that the generated camouflage samples not only structurally resemble the original image in the target region but also achieve a high degree of matching between the background texture and the target image, thereby enhancing the naturalness and realism of the camouflage samples.

[0033] C2: Calculate content loss and style loss. To ensure that the generated image maintains both structural consistency with the target region and textural consistency with the background, we calculate content loss and style loss. Content loss measures the difference in semantic features between the generated and target images, while style loss measures the consistency in texture features between them. By optimizing these two loss functions, we can ensure that the generated camouflage samples are structurally faithful to the target image while effectively blending into the background in terms of texture, resulting in a more natural camouflage effect.

[0034] C3: Weighted combination of content loss and style loss and optimized generation of the image. During optimization, we balance the impact of content loss and style loss on the generated image by weighting and combining them. This weighted optimization ensures that the generated image retains the structural features of the target region while fully representing the texture features of the target image. Iterative optimization is performed using the limited-memory quasi-Newton L-BFGS optimization algorithm. Each iteration adjusts the pixel values of the generated image so that the final image conforms as closely as possible to the target and style requirements. This process ensures that the final camouflage sample provides effective concealment in a real-world environment while maintaining a natural transition of texture details, avoiding abrupt camouflage effects.

[0035] Through this process, the present invention can generate camouflage samples with high-fidelity colors and textures. Results before and after optimization were compared for four common scenarios, as shown in Figure 3.

[0036] Trench scene: Before generation, the green background with white stripes is clearly defined; after generation, the black and white digital camouflage blends into the mud and snow background, greatly reducing its recognizability.

[0037] In open fields: Before generation, the dark brown paint has sharp edges, while after generation, the mesh camouflage softens the outline and blends naturally with the vegetation and soil.

[0038] Jungle environment: Before generation, the dark vehicle body contrasts sharply with the green leaves; after generation, the green and brown camouflage patches blend into the vegetation.

[0039] Factory area hard ground: Before generation, solid color vehicles / containers have prominent outlines, after generation, cement gray camouflage breaks the regular outlines, significantly improving concealment.

[0040] In summary, the comparison of results before and after optimization shows that the SKV-CG model achieves a more natural fusion between the target region and the background and improves the adaptability and concealment of the generated samples, providing reliable data support for target detection technology.

[0041] This invention is not limited to this embodiment. Any equivalent concept or modification within the technical scope disclosed in this invention shall be included within the protection scope of this invention.

Claims

1. A method for generating cascaded camouflage samples, characterized in that it includes the following steps: A: Construct an adaptive neighborhood feature filling model; B: Generate dynamic camouflage colors through adaptive high-dimensional clustering; C: Construct style transfer and generate camouflage samples.

2. The method for generating cascaded camouflage samples according to claim 1, characterized in that the method for constructing the adaptive neighborhood feature filling model described in step A includes the following steps: A1: Generate the semantic segmentation and bounding boxes. Based on the Segment Anything Model (SAM), images of unmasked or low-quality masked samples are input for processing to obtain the bounding boxes of target regions; SAM first performs global semantic analysis on the image to identify potential targets and generate the corresponding bounding boxes; once a target is obtained, the polygonal region corresponding to its bounding box is defined as the target region; the outer edge of the target region is expanded according to preset expansion parameters to generate the target neighborhood region; the geometric boundary and coordinate information of the target region and its neighborhood region are output to localize and constrain the subsequent camouflage processing; A2: Perform neighborhood multidimensional sampling to obtain color patches. Based on the semantic segmentation results, the target region to be camouflaged is determined and extended outward by $r$ pixels to form the neighborhood of the target region; the neighborhood covers the target itself and the environmental features of the surrounding area; patches for filling are selected from the pixel library of the target neighborhood by random sampling; let the number of pixels in the neighborhood be $m$ and the number of invalid pixels in the target region be $n$, then the probability of selecting a patch is $p = 1/(m - n)$; if the number of sampling points selected is $s$, the pixel size of each patch is $s^2$; A3: Generate random patches and stitch them together. Within the neighborhood of the target region, based on the preset patch size range and number threshold, and adjusted according to the resolution of the dataset image, multiple candidate patch regions are randomly sampled. For each patch, its texture / color block is extracted, scaled, and edge-trimmed. The patches are then stitched into the target region at random positions and in random order, using an overlapping coverage strategy during stitching, and weighted fusion is performed on the overlapping areas to eliminate seams. After stitching, superpixel-constrained boundary reshaping is performed on the target region boundary, morphological dilation and erosion are combined to smooth the stitched region and its outer edges, and the result is filled into the target region to obtain the final filling result.

3. The method for generating cascaded camouflage samples according to claim 1, characterized in that the method for generating dynamic camouflage colors through adaptive high-dimensional clustering described in step B includes the following steps: B1: Determine the initial cluster centers for K-means++. Clustering is performed on the result of the target region filling from the previous step. A genetic algorithm is introduced to optimize the traditional K-means clustering algorithm. The optimized K-means++ algorithm determines the affiliation of each pixel in the image by calculating its Euclidean distance to each cluster center, as shown in the following formula:

$$d(p, c) = \sqrt{\sum_{i=1}^{a} (p_i - c_z)^2} \tag{1}$$

where $d(p, c)$ is the distance from pixel $p$ to cluster center $c$, $a$ is the total number of pixels in the image, $p_i$ denotes the position of the $i$-th sample pixel in the cluster, and $c_z$ denotes the $z$-th cluster center; each pixel is assigned to the cluster center to which its distance is smallest, which generates the membership matrix; the membership degree of each pixel reflects the relationship between that pixel and each cluster center; B2: Update cluster centers. During clustering, the cluster centers are updated from the mean of the pixels within each category; this process is iterated until a preset convergence condition is met; the cluster center update formula is as follows:

$$c_z^{(t+1)} = \frac{1}{|S_z|} \sum_{p_i \in S_z} p_i \tag{2}$$

where $c_z^{(t+1)}$ is the position of the $z$-th cluster center after iteration $t+1$, and $S_z$ is the set of all pixels contained in the $z$-th cluster; if the average distance between a cluster center and all pixels in its cluster is less than a preset threshold, the clustering process is considered complete and converged; if it is greater than the threshold, the clustering update continues; B3: Determine the final cluster centers. The quality of a clustering result is evaluated with a fitness function $F$ applied to each individual; a higher fitness value indicates a better clustering result. The fitness function is defined by formula (3), where $k$ is the number of cluster centers in the clustering process and $p_j$ denotes the position of the $j$-th sample pixel within the cluster.

4. The method for generating cascaded camouflage samples according to claim 1, characterized in that the method for constructing style transfer and generating camouflage samples described in step C includes the following steps: C1: Extract content and style features. Content features and style features are extracted from the target image using a VGG19 deep convolutional neural network. Content features capture high-level semantic information of the target image, including shape and edges. Style features are low-level features of the target image, including texture and color distribution. Content features are taken from the output of the conv4_2 convolutional layer of the VGG19 network, while style features are obtained by calculating the Gram matrices of different convolutional layers. C2: Calculate content loss and style loss. Content loss and style loss are the key components in optimizing the target image; content loss measures the difference in high-level semantic features between the generated and target images. The content loss $L_{content}$ is computed from the squared error as follows:

$$L_{content} = \frac{1}{2} \sum_{i,j} \left(F_{ij}^{l} - P_{ij}^{l}\right)^2 \tag{4}$$

where $F_{ij}^{l}$ and $P_{ij}^{l}$ represent the content information of the generated image and the original image, respectively, at the pixel with horizontal coordinate $i$ and vertical coordinate $j$ in convolutional layer $l$; style loss measures the texture consistency between the generated image and the target image across multi-level features. The difference between Gram matrices is used as the style loss, and the formula is as follows:

$$L_{style} = \sum_{l} w_l \frac{1}{4 N_l^2 M_l^2} \sum_{i,j} \left(G_{ij}^{l} - A_{ij}^{l}\right)^2 \tag{5}$$

where $G_{ij}^{l}$ is the element in row $i$, column $j$ of the Gram matrix of the generated image at layer $l$; $A_{ij}^{l}$ is the element in row $i$, column $j$ of the Gram matrix of the target image at layer $l$; $w_l$ is the weight of layer $l$; and $N_l$ and $M_l$ are the number of channels and the spatial resolution of layer $l$, respectively; C3: Weighted combination of content loss and style loss and optimized generation of the image. To minimize the content and style losses, the L-BFGS optimization algorithm, a limited-memory quasi-Newton method, is used to optimize the total loss function of the generated image. The generated image is optimized iteratively: after each iteration, its pixels are adjusted so that it approximates the content and texture features of the target image. The total loss function combines content features and style features, using the weights $\alpha$ and $\beta$ to balance the importance of the two. The formula is as follows:

$$L_{total} = \alpha L_{content} + \beta L_{style} \tag{6}$$

where $\alpha$ and $\beta$ are the weighting coefficients for content loss and style loss, respectively.