Robustness enhancement method and system for adaptive directional smoothing
By employing an adaptive directional smoothing method that combines a saliency-guided semantic masking module with a noise scaling matrix, the invention addresses the difficulty existing technologies have in balancing model recognition accuracy against adversarial defense capability. It achieves a stable training process and an expanded certified defense range while providing formal mathematical security guarantees.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHANGHAI UNIV
- Filing Date
- 2026-03-27
- Publication Date
- 2026-06-19
Figure CN122244535A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of deep learning technology, and specifically to a robustness enhancement method and system for adaptive directional smoothing.
Background Technology
[0002] In recent years, deep learning models in artificial intelligence have been widely used in critical fields such as autonomous driving, intelligent security, and medical image recognition. However, deep learning models suffer from serious security vulnerabilities and are highly susceptible to adversarial attacks. Adversarial attacks refer to attackers adding minute interferences that are imperceptible to the human eye to the input image, which can cause the AI model to produce completely incorrect recognition results. This poses a significant risk to the practical application of AI models in security-critical scenarios.
[0003] To defend against adversarial attacks, the industry widely employs randomized smoothing, a technique that provides theoretically provable adversarial robustness for AI models. Its core principle is to add random Gaussian noise to the input image so that the AI model is trained and makes predictions on noisy input. This enables the model to adapt to random interference in the input, thereby resisting malicious adversarial perturbations. Traditional randomized smoothing methods use isotropic noise, applying the same level of random noise at all locations and in all dimensions of the image, without distinguishing differences in image content.
[0004] Existing randomized smoothing techniques suffer from the following critical shortcomings. First, "one-size-fits-all" uniform noise addition severely damages key image information, leading to a significant drop in model recognition accuracy. In an image, foreground subjects such as faces, vehicles, and target objects are the core regions that determine the classification result, while background regions such as sky, grass, and walls have minimal impact. Traditional methods indiscriminately add strong noise to critical foreground regions, damaging the image's core discriminative features and significantly reducing the model's accuracy on clean inputs. This creates the core bottleneck of "sacrificing basic recognition performance for attack defense", failing to balance robustness and accuracy. Second, existing dynamic noise addition schemes lack clear semantic guidance, resulting in unstable model training and no guarantee on the certified defense range. A few existing studies have attempted a data-driven approach in which the model autonomously learns the noise intensity at different image locations. However, without an explicit semantic guidance mechanism, such a model cannot clearly distinguish the foreground subject from the background region. It can only optimize blindly through trial and error on the data, which easily produces erroneous results such as "adding high-intensity noise to key foreground regions and low-intensity noise to background regions". This not only fails to improve the defense effect, but also causes the training process to oscillate and fail to converge, and may even cause the provable defense range of the model to shrink unexpectedly, thus losing the core security value of randomized smoothing.
[0005] Therefore, there is an urgent need for an adaptive smoothing technique that balances model recognition accuracy with adversarial robustness, keeps the training process stable, and retains strict mathematical security guarantees, in order to address the aforementioned shortcomings of existing randomized smoothing techniques.
Summary of the Invention
[0006] To address the aforementioned technical shortcomings, this invention provides an adaptive directional smoothing robustness enhancement method and system. It addresses the problems of existing randomized smoothing techniques, namely the inability to balance model recognition accuracy and adversarial defense capability, unstable training, and a limited certified defense range. While maintaining strict formal mathematical security guarantees, it improves accuracy and robustness at the same time.
[0007] This invention is achieved through the following technical solution. An adaptive directional smoothing robustness enhancement method is provided, comprising the following steps:
Step S10: Feed the input image into the saliency-guided semantic masking module, extract the intermediate feature representation of the image and generate a soft mask, and attenuate the effective signal amplitude of the background region through the soft mask to provide structural priors for subsequent steps.
Step S20: Feed the extracted intermediate features into the input-adaptive directional noise module to learn and generate a noise scaling matrix. The noise scaling matrix assigns a larger noise tolerance to the background region while limiting the noise scale of the foreground region.
Step S30: Construct an overall objective function that combines the main classification loss, the auxiliary guidance loss, and the scaling regularization term, and perform end-to-end joint training of the entire network based on the overall objective function.
Step S40: In the inference and robustness certification phase, sample base isotropic Gaussian noise, transform it with the learned noise scaling matrix to generate input-dependent directional perturbations, and output the smoothed prediction results after noise injection.
Step S50: Calculate the initial certified robust radius in the weighted Lp space corresponding to the noise scaling matrix, take the minimum noise scaling coefficient as the lower bound, convert the weighted robust region through analytical derivation into the equivalent certified radius in the standard Lp space, and output the final mathematical robustness certification result.
[0008] Preferably, the saliency-guided semantic masking module in step S10 includes a feature encoder and an auxiliary classification head, and the specific steps for generating the soft mask include: The intermediate feature representation of the input image is extracted by the feature encoder, and the intermediate features are input into the auxiliary classification head to complete the classification prediction. A gradient-driven class activation mapping method is used to generate a saliency map related to the input image category. The value of the saliency map corresponds to the importance of the image region to the classification result. The saliency map is transformed into a soft mask, which separates the foreground region, which is crucial for classification, from the redundant background features. The soft mask is calculated according to Equation (1):
m = β + (1-β)σ(m')   (1)
where m is the soft mask, β is the preset lower bound coefficient of the mask, σ is the Sigmoid activation function, and m' is the class activation mapping result generated by Grad-CAM++.
[0009] Preferably, the feature encoder adopts a U-Net encoder structure, the gradient-driven class activation mapping method adopts the Grad-CAM++ method, the feature decoder adopts a U-Net decoder structure corresponding to the feature encoder, and the base classifier of the main classification branch adopts a convolutional neural network classification model.
[0010] Preferably, the input-adaptive directional noise module in step S20 includes a feature decoder, and the specific steps for generating the noise scaling matrix include: The input intermediate features are upsampled and restored by the feature decoder, and a noise scaling matrix matching the size of the input image is generated. The noise scaling matrix is activated using the Softplus activation function to ensure that every noise scaling factor is non-negative. The activated noise scaling matrix is bounded by a preset clipping mechanism to constrain the range of the noise scaling amplitude and avoid excessive amplification or suppression of local noise.
[0011] Preferably, in step S30, the overall objective function is calculated using the formula L_total = L_main + α·L_aux + γ·L_reg, where L_total is the overall objective function, L_main is the cross-entropy loss of the main classification branch, L_aux is the auxiliary guidance loss of the auxiliary classification head, L_reg is the scaling regularization term, and α and γ are the preset weight coefficients of the auxiliary guidance loss and the scaling regularization term, respectively. The input of the main classification branch is the image after element-wise reweighting by the soft mask and injection of adaptively scaled directional noise. The scaling regularization term constrains the overall magnitude of the noise scaling matrix to avoid unbounded amplification of the noise scale.
[0012] Preferably, the specific steps for generating directional perturbations and completing noise injection in step S40 include: Base isotropic Gaussian noise z ~ N(0, σ²I) with the same size as the input image is sampled, where σ is the preset standard deviation of the base noise, I is the identity matrix, and N denotes a normal distribution. The isotropic Gaussian noise z is transformed by the noise scaling matrix S to generate a directional perturbation S⊙z, where ⊙ is the element-wise product. The input image x is reweighted by the soft mask to obtain x_mask = m⊙x, where x_mask is the mask-reweighted image and m is a soft mask with the same size as the input image x. The directional perturbation is injected into the reweighted image to obtain the final image x' = x_mask + S⊙z that is input to the main classifier.
[0013] Preferably, the specific steps of robust radius conversion in step S50 include: Based on the certification framework of randomized smoothing, the initial certified robust radius R_w is calculated in the weighted Lp space defined by the noise scaling matrix. The minimum value s_min in the noise scaling matrix is extracted as the lower bound coefficient for the robust radius conversion. The initial certified robust radius R_w in the weighted Lp space is converted to the equivalent certified radius R_std in the standard Lp space through analytical derivation, using the conversion formula R_std = s_min · R_w. The equivalent certified radius in the standard Lp space is output as the final verifiable robustness metric, providing a formal mathematical security guarantee.
[0014] Furthermore, to achieve the above objectives, the present invention also proposes an adaptive directional smoothing robustness enhancement system, comprising:
Saliency-guided semantic masking unit: used to feed the input image into the saliency-guided semantic masking module, extract the intermediate feature representation of the image and generate a soft mask, and attenuate the effective signal amplitude of the background region through the soft mask, providing structural priors for subsequent steps.
Input-adaptive directional noise unit: used to feed the extracted intermediate features into the input-adaptive directional noise module, learn and generate a noise scaling matrix, and assign a larger noise tolerance to the background region through the noise scaling matrix while limiting the noise scale of the foreground region.
End-to-end joint training unit: used to construct an overall objective function that combines the main classification loss, the auxiliary guidance loss, and the scaling regularization term, and to perform end-to-end joint training of the entire network based on the overall objective function.
Directional perturbation and inference unit: used in the inference and robustness certification phase to sample base isotropic Gaussian noise, transform it with the learned noise scaling matrix, generate input-dependent directional perturbations, and output the smoothed prediction results after noise injection.
Robust radius certification and conversion unit: used to calculate the initial certified robust radius in the weighted Lp space corresponding to the noise scaling matrix, take the minimum noise scaling coefficient as the lower bound, convert the weighted robust region through analytical derivation into the equivalent certified radius in the standard Lp space, and output the final mathematical robustness certification result.
[0015] Furthermore, to achieve the above objectives, the present invention also proposes an adaptive directional smoothing robustness enhancement device, the device comprising: a memory, a processor, and an adaptive directional smoothing robustness enhancement program stored in the memory and executable on the processor, wherein the program, when executed by the processor, implements the steps of the adaptive directional smoothing robustness enhancement method described above.
[0016] Furthermore, to achieve the above objectives, the present invention also provides a computer program product that includes an adaptive directional smoothing robustness enhancement program which, when executed by a processor, implements the adaptive directional smoothing robustness enhancement method described above.
[0017] The advantages and effects of this invention are as follows. First, accuracy is preserved alongside robustness. Through the saliency-guided semantic masking module, the invention explicitly distinguishes the key foreground region from the redundant background region of an image. In the foreground region, which is crucial for classification, the noise intensity is limited and the core discriminative features are preserved. In the background region, the noise intensity is increased to enhance the defense capability. This addresses the problem that traditional uniform noise addition destroys key image features and causes a significant drop in model accuracy, so that high robustness is achieved while the model's recognition accuracy on clean inputs is maintained. Second, the provable defense range of the model is expanded. Traditional randomized smoothing methods can only generate a fixed spherical defense region whose size is limited by the weakest direction. This invention instead generates a noise scaling matrix that matches the image content through the adaptive directional noise module, so that the defense region can fit the decision boundary of the model. This avoids the defense range wasted by the weakest-direction limitation of traditional methods and expands the range of adversarial perturbations the model can provably resist. Third, the training process is stable and strict formal mathematical security guarantees are retained. The invention introduces an explicit semantic guidance mechanism that provides a clear optimization direction for noise allocation, avoiding the blind trial and error of purely data-driven dynamic noise addition schemes, so that training is stable and converges faster. At the same time, the invention converts the robust radius from the weighted space to the standard Lp space through rigorous analytical derivation. The output defense range therefore carries a formal mathematical guarantee, ensuring the reliability of the defense mechanism and meeting the provable-security requirements of AI models in safety-critical scenarios.
Attached Figure Description
[0018] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0019] Figure 1 is a flowchart of the adaptive directional smoothing robustness enhancement method according to the present invention.
[0020] Figure 2 is a schematic diagram of the structure of the adaptive directional smoothing robustness enhancement system according to the present invention.
[0021] Figure 3 is a schematic block diagram of the structure of the adaptive directional smoothing robustness enhancement electronic device according to the present invention.
[0022] Figure 4 is an overview diagram of the framework of the adaptive directional smoothing robustness enhancement system according to the present invention.
[0023] Figure 5 is a comparison diagram of the decision boundaries for the adaptive directional smoothing robustness enhancement method according to the present invention.
Detailed Implementation
[0024] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0025] As shown in Figure 1, in one embodiment of the present invention, an adaptive directional smoothing robustness enhancement method includes the following steps: Step S10: Feed the input image into the saliency-guided semantic masking module, extract the intermediate feature representation of the image and generate a soft mask, and attenuate the effective signal amplitude of the background region through the soft mask to provide structural priors for subsequent steps.
[0026] Specifically, the saliency-guided semantic masking module in step S10 includes a feature encoder and an auxiliary classification head, and the specific steps for generating the soft mask include: The intermediate feature representation of the input image is extracted by the feature encoder, and the intermediate features are input into the auxiliary classification head to complete the classification prediction. A gradient-driven class activation mapping method is used to generate a saliency map related to the input image category. The value of the saliency map corresponds to the importance of the image region to the classification result. The saliency map is transformed into a soft mask, which separates the foreground region, which is crucial for classification, from the redundant background features. The soft mask is calculated according to Equation (1):
m = β + (1-β)σ(m')   (1)
where m is the soft mask, β is the preset lower bound coefficient of the mask, σ is the Sigmoid activation function, and m' is the class activation mapping result generated by Grad-CAM++.
[0027] Specifically, the feature encoder adopts the U-Net encoder structure, the gradient-driven class activation mapping method adopts the Grad-CAM++ method, the feature decoder adopts the U-Net decoder structure corresponding to the feature encoder, and the base classifier of the main classification branch adopts the convolutional neural network classification model.
[0028] Specifically, the saliency-guided semantic masking module uses the U-Net encoder as the feature encoder. The input image size is H×W×3, where H is the image height, W is the image width, and 3 is the number of RGB image channels. The input image undergoes multiple convolutional and downsampling operations in the U-Net encoder to extract intermediate feature representations with dimensions C × (H/16) × (W/16), where C is the number of feature channels. These intermediate features are then input into an auxiliary classification head, which consists of a global average pooling layer and a fully connected layer. The output class prediction result is used to calculate the auxiliary guidance loss L_aux.
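As a rough illustration of the shapes and of the auxiliary head described above, the following sketch computes the C × (H/16) × (W/16) feature shape and applies global average pooling followed by a fully connected layer. The function names, the 224×224 input, the channel count 512, and the toy weights are assumptions made for illustration only; the source does not specify them.

```python
def encoder_output_shape(height, width, channels, downsample_factor=16):
    """Shape (C, H/16, W/16) of the intermediate features for an H x W x 3 input."""
    if height % downsample_factor or width % downsample_factor:
        raise ValueError("H and W are assumed divisible by the downsampling factor")
    return (channels, height // downsample_factor, width // downsample_factor)


def global_average_pool(features):
    """Average each channel's 2-D map down to one value (features: C x h x w lists)."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in features]


def fully_connected(vector, weights, biases):
    """Linear layer: logits[k] = sum_c weights[k][c] * vector[c] + biases[k]."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


# Illustrative values only: a 224 x 224 image and 512 feature channels.
shape = encoder_output_shape(224, 224, channels=512)

# Toy auxiliary head on a single-channel 2 x 2 feature map with two classes.
pooled = global_average_pool([[[1.0, 3.0], [5.0, 7.0]]])
aux_logits = fully_connected(pooled, weights=[[2.0], [-1.0]], biases=[0.5, 0.0])
```

The resulting logits would feed the cross-entropy that defines L_aux.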
[0029] The Grad-CAM++ method is used to generate a class activation map m' related to the category of the input image based on the prediction results and intermediate features of the auxiliary classification head. The size of the class activation map is consistent with that of the input image, which is H×W×1. The higher the value of each position in the map, the greater the contribution of that region to the classification result, that is, the higher the probability of belonging to the foreground key region.
[0030] The class activation map m' is normalized to the 0-1 interval using the Sigmoid activation function, and then the final soft mask m is generated using the soft mask calculation formula: m = β + (1-β)σ(m'). In this embodiment, β is set to 0.05 to avoid the background region features being completely attenuated to 0, ensuring the integrity of the features. The input image is then reweighted element-wise using the soft mask to obtain x_mask = m⊙x, where ⊙ is the element-wise multiplication operation. This attenuates the effective signal amplitude in the background region, providing a clear structural prior for subsequent noise allocation and explicitly separating the foreground and background regions.
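The soft mask computation can be sketched as follows, using plain nested lists in place of tensors. The helper names and the toy activation values are hypothetical; only the formula m = β + (1-β)σ(m'), the value β = 0.05, and the reweighting x_mask = m⊙x come from the text. The single-channel mask is assumed to be shared across the colour channels of each pixel.

```python
import math


def sigmoid(v):
    """Logistic function mapping a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def soft_mask(cam, beta=0.05):
    """Equation (1): m = beta + (1 - beta) * sigmoid(m'), applied per pixel."""
    return [[beta + (1.0 - beta) * sigmoid(v) for v in row] for row in cam]


def apply_mask(image, mask):
    """x_mask = m (element-wise) x, for an H x W x C image given as nested lists."""
    return [[[m * c for c in pixel] for pixel, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


# Toy 1 x 2 class activation map: a salient pixel followed by a background pixel.
cam = [[4.0, -4.0]]
mask = soft_mask(cam)
x_mask = apply_mask([[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]], mask)
```

Because of the β floor, the background pixel is attenuated but never zeroed, which matches the stated purpose of β.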
[0031] Step S20: Feed the extracted intermediate features into the input-adaptive directional noise module to learn and generate a noise scaling matrix. The noise scaling matrix assigns a larger noise tolerance to the background region while limiting the noise scale of the foreground region.
[0032] Specifically, the input-adaptive directional noise module in step S20 includes a feature decoder, and the specific steps for generating the noise scaling matrix include: The input intermediate features are upsampled and restored by the feature decoder, and a noise scaling matrix matching the size of the input image is generated. The noise scaling matrix is activated using the Softplus activation function to ensure that every noise scaling factor is non-negative. The activated noise scaling matrix is bounded by a preset clipping mechanism to constrain the range of the noise scaling amplitude and avoid excessive amplification or suppression of local noise.
[0033] In step S20, the intermediate features extracted in step S10 are input into the input-adaptive directional noise module to learn and generate a noise scaling matrix S. The module uses the U-Net decoder corresponding to the U-Net encoder as the feature decoder. The intermediate features undergo multi-layer upsampling and convolution operations in the decoder and are restored to the size H×W×1, matching the spatial size of the input image, to generate the initial noise scaling parameter matrix.
[0034] The initial noise scaling parameter matrix is activated using the Softplus activation function, calculated as Softplus(x) = ln(1 + e^x). This ensures that all noise scaling factors are non-negative, consistent with the physical meaning of noise scaling. A clipping mechanism is then used to constrain the upper and lower bounds of the activated matrix. In this embodiment, the lower bound of the noise scaling factor is set to 0.2 and the upper bound to 5.0, to prevent excessive amplification of local noise that could completely overwhelm the model features, or excessive suppression that could lead to insufficient defense capability. The result is the final noise scaling matrix S.
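A minimal sketch of the activation and clipping step follows, assuming the decoder output is available as a nested list of raw values. The function names are illustrative; the Softplus formula and the bounds 0.2 and 5.0 are those of this embodiment. Softplus is written in an algebraically equivalent form that avoids overflow for large inputs.

```python
import math


def softplus(v):
    """Softplus(v) = ln(1 + e^v), in a numerically stable equivalent form."""
    return max(v, 0.0) + math.log1p(math.exp(-abs(v)))


def noise_scaling_matrix(raw, lower=0.2, upper=5.0):
    """Activate the raw decoder output with Softplus, then clip to [lower, upper]."""
    return [[min(max(softplus(v), lower), upper) for v in row] for row in raw]


# Toy raw decoder outputs: strongly negative, zero, strongly positive.
S = noise_scaling_matrix([[-10.0, 0.0, 10.0]])
```

Here the first entry is raised to the lower bound, the middle entry keeps its Softplus value ln 2, and the last entry is capped at the upper bound.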
[0035] The noise scaling matrix and the soft mask are negatively correlated. That is, for key foreground regions with high soft mask values, the noise scaling factor is smaller, which limits the noise intensity; for background regions with low soft mask values, the noise scaling factor is larger, which improves noise tolerance and achieves adaptive directional noise allocation.
[0036] Step S30: Construct an overall objective function that combines the main classification loss, the auxiliary guidance loss, and the scaling regularization term, and perform end-to-end joint training on the entire network based on the overall objective function.
[0037] Specifically, in step S30, the overall objective function is calculated using the formula L_total = L_main + α·L_aux + γ·L_reg, where L_total is the overall objective function, L_main is the cross-entropy loss of the main classification branch, L_aux is the auxiliary guidance loss of the auxiliary classification head, L_reg is the scaling regularization term, and α and γ are the preset weight coefficients of the auxiliary guidance loss and the scaling regularization term, respectively. The input of the main classification branch is the image after element-wise reweighting by the soft mask and injection of adaptively scaled directional noise. The scaling regularization term constrains the overall magnitude of the noise scaling matrix to avoid unbounded amplification of the noise scale.
[0038] Specifically, the formula for calculating the overall objective function is L_total = L_main + α·L_aux + γ·L_reg. In this embodiment, α is 0.5 and γ is 1e-4; both can be adjusted according to the dataset and application scenario.
[0039] L_main is the cross-entropy loss of the main classification branch. The input of the main classification branch is the image x' = x_mask + S⊙z obtained after soft mask reweighting and injection of adaptive noise, where z is the sampled base Gaussian noise, z ~ N(0, σ²I). In this embodiment, the standard deviation σ of the base noise is 0.25 (this σ denotes the noise standard deviation, not the Sigmoid function σ(·) of Equation (1)). The main classifier adopts the ResNet50 convolutional neural network model and outputs the final classification prediction result. The cross-entropy loss L_main is calculated from the prediction result and the true label to ensure the core classification ability of the model.
[0040] L_aux is the cross-entropy loss of the auxiliary classification head, calculated based on the prediction results of the auxiliary classification head and the true label. It ensures that the intermediate features extracted by the feature encoder have sufficient class discrimination power, thus guaranteeing the accuracy of the saliency map and soft mask generation.
[0041] L_reg is the scaling regularization term, using L2 regularization, and is calculated as L_reg = ||S||₂². It is used to constrain the overall magnitude of the noise scaling matrix, avoid unbounded amplification of the noise scale, and balance the model's defense capability and recognition performance.
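The three loss terms can be combined as in the sketch below, shown for a single sample. It assumes that both classification losses are standard softmax cross-entropy on raw logits and that L_reg is the plain sum of squared entries of S (no averaging); the source states only the formula, so these details and the function names are assumptions.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy of one sample, using the log-sum-exp trick."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[label]


def scaling_regularizer(S):
    """L_reg = ||S||_2^2, taken here as the sum of squared scaling factors."""
    return sum(v * v for row in S for v in row)


def total_loss(main_logits, aux_logits, label, S, alpha=0.5, gamma=1e-4):
    """L_total = L_main + alpha * L_aux + gamma * L_reg."""
    l_main = cross_entropy(main_logits, label)
    l_aux = cross_entropy(aux_logits, label)
    l_reg = scaling_regularizer(S)
    return l_main + alpha * l_aux + gamma * l_reg


# Toy example: uninformative logits for two classes and a 1 x 2 scaling matrix.
loss = total_loss([0.0, 0.0], [0.0, 0.0], label=0, S=[[1.0, 2.0]])
```

With α = 0.5 and γ = 1e-4 as in this embodiment, the regularizer contributes only a small penalty unless S grows large.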
[0042] Based on the overall objective function mentioned above, the Adam optimizer is used to perform end-to-end joint training on all parameters of the entire network. The training batch size is set to 32, the initial learning rate is set to 1e-4, and the training epochs are set to 120. During the training process, a cosine decay strategy of the learning rate is adopted to complete the convergence optimization of the entire model.
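The learning-rate schedule can be sketched as below. The source states only that cosine decay is used with an initial rate of 1e-4 over 120 epochs; the per-epoch update granularity and the final rate of 0 are assumptions, and the function name is illustrative.

```python
import math


def cosine_decay_lr(epoch, total_epochs=120, base_lr=1e-4, min_lr=0.0):
    """Cosine-annealed learning rate for a given epoch index (0 .. total_epochs)."""
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Learning rates at the start, midpoint, and end of training.
schedule = [cosine_decay_lr(e) for e in (0, 60, 120)]
```

The rate starts at the base value, reaches half of it at the midpoint, and decays to the assumed minimum at the final epoch.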
[0043] Step S40: In the inference and robustness certification phase, sample base isotropic Gaussian noise, transform it with the learned noise scaling matrix to generate input-dependent directional perturbations, and output the smoothed prediction results after noise injection.
[0044] Specifically, the steps in step S40 for generating directional perturbations and completing noise injection include: Base isotropic Gaussian noise z ~ N(0, σ²I) with the same size as the input image is sampled, where σ is the preset standard deviation of the base noise, I is the identity matrix, and N denotes a normal distribution. The isotropic Gaussian noise z is transformed by the noise scaling matrix S to generate a directional perturbation S⊙z, where ⊙ is the element-wise product. The input image x is reweighted by the soft mask to obtain x_mask = m⊙x, where x_mask is the mask-reweighted image and m is a soft mask with the same size as the input image x. The directional perturbation is injected into the reweighted image to obtain the final image x' = x_mask + S⊙z that is input to the main classifier.
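The noise injection x' = x_mask + S⊙z can be sketched as follows. The sketch assumes that the single-channel mask m and scaling matrix S are shared across the colour channels of each pixel and that noise is drawn independently per channel. The `smoothed_predict` helper uses a majority vote over noisy copies, which is the usual randomized smoothing prediction rule but is not spelled out in the source; `classify` is a hypothetical stand-in for the main classifier.

```python
import random
from collections import Counter


def inject_directional_noise(image, mask, S, sigma=0.25, rng=None):
    """Return x' = m*x + S*z with z ~ N(0, sigma^2), element-wise per pixel."""
    if rng is None:
        rng = random.Random()
    noisy = []
    for img_row, m_row, s_row in zip(image, mask, S):
        out_row = []
        for pixel, m, s in zip(img_row, m_row, s_row):
            out_row.append([m * c + s * rng.gauss(0.0, sigma) for c in pixel])
        noisy.append(out_row)
    return noisy


def smoothed_predict(classify, image, mask, S, n_samples=100, sigma=0.25, rng=None):
    """Majority vote of `classify` over n_samples noisy copies (assumed rule)."""
    if rng is None:
        rng = random.Random()
    votes = Counter(
        classify(inject_directional_noise(image, mask, S, sigma, rng))
        for _ in range(n_samples)
    )
    return votes.most_common(1)[0][0]


# Toy 1 x 1 image and a threshold "classifier" standing in for the real model.
image, mask, S = [[[1.0, 1.0, 1.0]]], [[1.0]], [[0.2]]
toy_classifier = lambda img: 1 if img[0][0][0] > 0.5 else 0
label = smoothed_predict(toy_classifier, image, mask, S,
                         n_samples=25, rng=random.Random(0))
```

Passing a seeded `random.Random` makes the sampling reproducible, which is convenient when comparing certification runs.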
[0045] Step S50: Calculate the initial certified robust radius in the weighted Lp space corresponding to the noise scaling matrix, take the minimum noise scaling coefficient as the lower bound, convert the weighted robust region through analytical derivation into the equivalent certified radius in the standard Lp space, and output the final mathematical robustness certification result.
[0046] Specifically, the robust radius conversion steps in step S50 include: Based on the certification framework of randomized smoothing, the initial certified robust radius R_w is calculated in the weighted Lp space defined by the noise scaling matrix. The minimum value s_min in the noise scaling matrix is extracted as the lower bound coefficient for the robust radius conversion. The initial certified robust radius R_w in the weighted Lp space is converted to the equivalent certified radius R_std in the standard Lp space through analytical derivation, using the conversion formula R_std = s_min · R_w. The equivalent certified radius in the standard Lp space is output as the final verifiable robustness metric, providing a formal mathematical security guarantee.
[0047] Specifically, based on the Neyman-Pearson lemma and the standard certification framework of randomized smoothing, the initial certified robust radius R_w is first calculated in the weighted L2 space defined by the noise scaling matrix S. This radius is the maximum size of adversarial perturbation, measured in the weighted space, within which the model's prediction remains unchanged.
[0048] The minimum value s_min in the noise scaling matrix S is extracted as the lower bound coefficient for the robust radius conversion. Through rigorous analytical derivation, the initial certified robust radius R_w in the weighted L2 space is converted into the equivalent certified radius R_std in the standard L2 space, using the conversion formula R_std = s_min · R_w.
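The role of s_min as a lower bound can be made explicit with a short derivation. The source does not define the weighted norm; the sketch below assumes it is the L2 norm after dividing each perturbation component by its scaling factor, written here as ‖δ‖_S, and it treats S as fixed at the point being certified.

```latex
\begin{aligned}
\|\delta\|_{S}^{2} &= \sum_{i} \frac{\delta_i^{2}}{s_i^{2}}
  \;\le\; \frac{1}{s_{\min}^{2}} \sum_{i} \delta_i^{2}
  = \frac{\|\delta\|_2^{2}}{s_{\min}^{2}},
  \qquad s_i \ge s_{\min} > 0, \\
\|\delta\|_2 \le s_{\min} R_w
  &\;\Longrightarrow\; \|\delta\|_{S} \le R_w, \\
R_{\mathrm{std}} &= s_{\min}\, R_w .
\end{aligned}
```

Under these assumptions, every perturbation inside the standard L2 ball of radius s_min · R_w also lies inside the weighted region of radius R_w. With the clipping bounds of this embodiment, s_min is at least 0.2.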
[0049] The equivalent certified radius is a provable robust radius in the standard L2 space. That is, when the L2 norm of the adversarial perturbation is less than or equal to R_std, the model's classification prediction remains unchanged, providing a formal mathematical security guarantee for the model. The equivalent certified radius is output as the final robustness metric.
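The radius conversion can be sketched numerically as below. The source does not give a formula for R_w; this sketch assumes the commonly used randomized smoothing form R_w = σ·Φ⁻¹(p_A), where p_A is a lower confidence bound on the top-class probability under noise and Φ⁻¹ is the standard normal quantile function. The function names are illustrative, and estimating p_A (for example by Monte Carlo sampling) is outside the scope of the sketch.

```python
from statistics import NormalDist


def weighted_radius(p_a_lower, sigma=0.25):
    """Assumed form R_w = sigma * Phi^{-1}(p_A); returns 0.0 when not certifiable."""
    if not 0.5 < p_a_lower < 1.0:
        return 0.0  # the top class is not confidently in the majority: abstain
    return sigma * NormalDist().inv_cdf(p_a_lower)


def standard_radius(S, r_w):
    """R_std = s_min * R_w, with s_min the smallest entry of the scaling matrix."""
    s_min = min(v for row in S for v in row)
    return s_min * r_w


# Toy scaling matrix within the clipping bounds [0.2, 5.0] of this embodiment.
S = [[0.2, 5.0], [1.0, 0.7]]
r_w = weighted_radius(0.975)
r_std = standard_radius(S, r_w)
```

Because s_min multiplies the whole radius, a single small entry of S limits R_std, which is one reason the lower clipping bound matters.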
[0050] In addition, as shown in Figure 2, in one embodiment of the present invention, an adaptive directional smoothing robustness enhancement system is proposed, the system comprising:
Saliency-guided semantic masking unit: used to feed the input image into the saliency-guided semantic masking module, extract the intermediate feature representation of the image and generate a soft mask, and attenuate the effective signal amplitude of the background region through the soft mask, providing structural priors for subsequent steps.
Input-adaptive directional noise unit: used to feed the extracted intermediate features into the input-adaptive directional noise module, learn and generate a noise scaling matrix, and assign a larger noise tolerance to the background region through the noise scaling matrix while limiting the noise scale of the foreground region.
End-to-end joint training unit: used to construct an overall objective function that combines the main classification loss, the auxiliary guidance loss, and the scaling regularization term, and to perform end-to-end joint training of the entire network based on the overall objective function.
Directional perturbation and inference unit: used in the inference and robustness certification phase to sample base isotropic Gaussian noise, transform it with the learned noise scaling matrix, generate input-dependent directional perturbations, and output the smoothed prediction results after noise injection.
Robust radius certification and conversion unit: used to calculate the initial certified robust radius in the weighted Lp space corresponding to the noise scaling matrix, take the minimum noise scaling coefficient as the lower bound, convert the weighted robust region through analytical derivation into the equivalent certified radius in the standard Lp space, and output the final mathematical robustness certification result.
[0051] This application provides an adaptive directional smoothing robustness enhancement system, employing an adaptive directional smoothing robustness enhancement method as described in the above embodiments. This system addresses the technical problems of existing random smoothing techniques, such as the inability to simultaneously achieve model recognition accuracy and adversarial defense capabilities, unstable training processes, and limited security defense scope. Compared to existing technologies, the beneficial effects of the adaptive directional smoothing robustness enhancement system provided in this application are the same as those of the adaptive directional smoothing robustness enhancement method provided in the above embodiments. Furthermore, other technical features of the adaptive directional smoothing robustness enhancement system are the same as those disclosed in the methods of the above embodiments, and will not be repeated here.
[0052] This application provides an adaptive directional smoothing robustness enhancement device, which includes: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, which are executed by the at least one processor to enable the at least one processor to perform an adaptive directional smoothing robustness enhancement method as described in Embodiment 1 above.
[0053] Figure 3 shows, for one embodiment of the present invention, a structural schematic diagram of an adaptive directional smoothing robustness enhancement device suitable for implementing an embodiment of the present application. An adaptive directional smoothing robustness enhancement device in an embodiment of the present application may include, but is not limited to, mobile terminals such as mobile phones, laptops, digital broadcast receivers, PDAs (Personal Digital Assistants), PADs (tablet computers), PMPs (Portable Media Players), etc., as well as fixed terminals such as digital TVs, desktop computers, etc. The adaptive directional smoothing robustness enhancement device shown in Figure 3 is merely an example and should not impose any limitation on the functionality and scope of use of the embodiments of this application.
[0054] The adaptive directional smoothing robustness enhancement device shown in Figure 3 may include a processor 1001 (e.g., a central processing unit, graphics processing unit, etc.) that can perform various appropriate actions and processes based on a program stored in read-only memory (ROM) 1002 or a program loaded from storage device 1003 into random access memory (RAM) 1004. The RAM 1004 also stores various programs and data required for the operation of the adaptive directional smoothing robustness enhancement device. The processor 1001, ROM 1002, and RAM 1004 are interconnected via a bus 1005. An input/output (I/O) interface 1006 is also connected to the bus. Typically, the following devices can be connected to the I/O interface 1006: input devices 1007 including, for example, touchscreens, touchpads, keyboards, mice, image sensors, microphones, accelerometers, gyroscopes, etc.; output devices 1008 including, for example, liquid crystal displays (LCDs), speakers, vibrators, etc.; storage devices 1003 including, for example, magnetic tapes, hard disks, etc.; and a communication unit 1009. The communication unit 1009 allows the adaptive directional smoothing robustness enhancement device to exchange data with other devices over wireless or wired connections. Although the figure shows an adaptive directional smoothing robustness enhancement device with various devices, it should be understood that it is not required to implement or possess all the devices shown. More or fewer devices may alternatively be implemented.
[0055] Specifically, according to the embodiments disclosed in this application, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, embodiments disclosed in this application include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via a communication unit, or installed from storage device 1003, or installed from read-only memory 1002. When the computer program is executed by processor 1001, it performs the functions defined in the methods of the embodiments disclosed in this application.
[0056] Figure 4 shows, for one embodiment of the present invention, an overview diagram of the robustness enhancement system framework for adaptive directional smoothing. The semantic masking module in the lower left generates a mask, the directional noise module in the upper right calculates the noise level at each position, and the main branches (upper and lower right) complete the noise addition and output the final classification result.
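The data flow of this framework can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the network of the application: the soft mask `m` and the raw decoder output `raw` are hand-made placeholders instead of outputs of the masking and noise modules, the clipping bounds `s_low` and `s_high` are hypothetical, and `toy_classifier` stands in for the base classifier of the main branch. The smoothed prediction is taken here as a majority vote over noisy samples, which is the usual Monte Carlo approximation in random smoothing.

```python
import numpy as np

def softplus(a):
    # Numerically stable softplus: log(1 + exp(a)).
    return np.logaddexp(0.0, a)

def noise_scaling_matrix(raw, s_low=0.25, s_high=2.0):
    # Softplus activation followed by clipping to [s_low, s_high] (hypothetical bounds).
    return np.clip(softplus(raw), s_low, s_high)

def perturb(x, m, S, sigma, rng):
    z = rng.normal(0.0, sigma, size=x.shape)   # base noise z ~ N(0, sigma^2 I)
    return m * x + S * z                       # x' = m ⊙ x + S ⊙ z

def smoothed_predict(classify, x, m, S, sigma, n_samples, num_classes, rng):
    # Majority vote of the classifier over n_samples noisy copies of the input.
    counts = np.zeros(num_classes, dtype=int)
    for _ in range(n_samples):
        counts[classify(perturb(x, m, S, sigma, rng))] += 1
    return int(np.argmax(counts)), counts

def toy_classifier(img):
    # Placeholder classifier: class 1 if the central patch is bright, else class 0.
    return int(img[2:6, 2:6].mean() > 0.5)

rng = np.random.default_rng(0)
x = np.zeros((8, 8)); x[2:6, 2:6] = 1.0            # bright foreground on dark background
m = np.full((8, 8), 0.2); m[2:6, 2:6] = 1.0        # soft mask: background attenuated
raw = np.full((8, 8), 2.0); raw[2:6, 2:6] = -2.0   # large background noise, small foreground noise
S = noise_scaling_matrix(raw)
pred, counts = smoothed_predict(toy_classifier, x, m, S, sigma=0.25,
                                n_samples=200, num_classes=2, rng=rng)
```

In this toy setting the foreground receives the smallest allowed noise scale and the background the largest, mirroring the roles of the masking module and the directional noise module described above.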
[0057] Figure 5 shows, for one embodiment of the present invention, a comparison diagram of decision boundaries for the adaptive directional smoothing robustness enhancement method, demonstrating how the present invention expands the defense range. The defense region of the traditional uniform noise addition method (panel a) is a regular circle whose radius is limited by the weakest point; the adaptive noise addition method of the present invention (panel b) allows the defense region to fit the decision boundary, significantly expanding the defense range.
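A two-dimensional toy calculation gives a feel for the difference between the two kinds of region. It is only an analogy to the figure, assuming a diagonal scaling with per-axis coefficients `s1` and `s2` and a weighted region of the form ||δ / S||_2 ≤ R_w, which is an axis-aligned ellipse with semi-axes s1·R_w and s2·R_w. The largest circle inscribed in that ellipse has radius min(s1, s2)·R_w. The function name and the values are hypothetical.

```python
import math

def region_areas(s1, s2, r_w):
    """Return (ellipse_area, inscribed_circle_area) for per-axis scales s1 and s2."""
    ellipse = math.pi * (s1 * r_w) * (s2 * r_w)       # weighted (directional) region
    circle = math.pi * (min(s1, s2) * r_w) ** 2       # isotropic region limited by s_min
    return ellipse, circle

ellipse_area, circle_area = region_areas(0.5, 2.0, 1.0)
ratio = ellipse_area / circle_area   # equals max(s1, s2) / min(s1, s2)
```

In this example the directional region covers four times the area of the inscribed circle, because the ratio of the two areas equals the ratio of the larger to the smaller scaling coefficient.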
[0058] This application provides an adaptive directional smoothing robustness enhancement device, employing an adaptive directional smoothing robustness enhancement method as described in the above embodiments. This addresses the technical problems of existing random smoothing techniques, such as the inability to simultaneously achieve model recognition accuracy and adversarial defense capabilities, unstable training processes, and limited security defense scope. Compared to existing technologies, the beneficial effects of the adaptive directional smoothing robustness enhancement device provided in this application are the same as those of the adaptive directional smoothing robustness enhancement method provided in the above embodiments. Furthermore, other technical features of this adaptive directional smoothing robustness enhancement device are the same as those disclosed in the method of the previous embodiments, and will not be repeated here.
[0059] The various parts disclosed in this application can be implemented using hardware, software, firmware, or a combination thereof. In the description of the above embodiments, specific features, structures, materials, or characteristics can be combined in any suitable manner in one or more embodiments or examples.
[0060] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the steps of the adaptive directional smoothing robustness enhancement method described above.
[0061] The computer program product provided in this application can solve the technical problems of existing random smoothing techniques, such as the inability to balance model recognition accuracy and adversarial defense capabilities, unstable training processes, and limited security defense scope. Compared with the prior art, the beneficial effects of the computer program product provided in this application are the same as those of the robustness enhancement method for adaptive directional smoothing provided in the above embodiments, and will not be repeated here.
[0062] Obviously, those skilled in the art can make various modifications and variations to this invention without departing from its spirit and scope. Therefore, if these modifications and variations fall within the scope of the claims of this invention and their equivalents, this invention also intends to include these modifications and variations.
Claims
1. A robustness enhancement method for adaptive directional smoothing, characterized in that, The method includes the following steps: Step S10: Input the input image into the saliency-guided semantic masking module, extract the intermediate feature representation of the image and generate a soft mask, and attenuate the effective signal amplitude of the background region through the soft mask; Step S20: Input the extracted intermediate features into the input adaptive directional noise module to learn and generate a noise scaling matrix, the noise scaling matrix being used to assign noise tolerance to the background region while limiting the noise scale of the foreground region; Step S30: Construct an overall objective function that combines the main classification loss, the auxiliary guidance loss, and the scaling regularization term, and perform end-to-end joint training on the entire network based on the overall objective function; Step S40: In the inference and robustness certification phase, sample the base isotropic Gaussian noise, apply the learned noise scaling matrix to transform the Gaussian noise, generate input-dependent directional perturbations, and output the smoothed prediction results after noise injection; Step S50: Calculate the initial certified robust radius in the weighted Lp space corresponding to the noise scaling matrix, extract the minimum noise scaling coefficient as the lower bound through analytical derivation, convert the weighted robust region into the equivalent certification radius in the standard Lp space, and output the final mathematical robustness certification result.
2. The robustness enhancement method for adaptive directional smoothing according to claim 1, characterized in that, The saliency-guided semantic masking module in step S10 includes a feature encoder and an auxiliary classification head. The specific steps for generating the soft mask include: The intermediate feature representation of the input image is extracted by the feature encoder, and the intermediate features are input into the auxiliary classification head to complete the classification prediction. A gradient-driven class activation mapping method is used to generate a saliency map related to the input image category. The value of the saliency map corresponds to the importance of the image region to the classification result. The saliency map is converted into a soft mask to separate the foreground region from redundant background features.
3. The robustness enhancement method for adaptive directional smoothing according to claim 2, characterized in that, The feature encoder adopts the U-Net encoder structure, the gradient-driven class activation mapping method adopts the Grad-CAM++ method, the feature decoder adopts the U-Net decoder structure corresponding to the feature encoder, and the base classifier of the main classification branch adopts the convolutional neural network classification model.
4. The robustness enhancement method for adaptive directional smoothing according to claim 1, characterized in that, The input adaptive directional noise module in step S20 includes a feature decoder, and the specific steps for generating the noise scaling matrix include: The input intermediate features are upsampled and restored by the feature decoder, and a noise scaling matrix matching the size of the input image is generated; The values of the noise scaling matrix are activated using the Softplus activation function; The activated noise scaling matrix is constrained by upper and lower bounds through a preset clipping mechanism, which limits the range of the noise scaling amplitude.
5. The robustness enhancement method for adaptive directional smoothing according to claim 1, characterized in that, In step S30, the overall objective function is calculated as L_total = L_main + α·L_aux + γ·L_reg, where L_total is the overall objective function, L_main is the cross-entropy loss of the main classification branch, L_aux is the auxiliary guidance loss of the auxiliary classification head, L_reg is the scaling regularization term, and α and γ are the preset weight coefficients of the auxiliary guidance loss and the scaling regularization term, respectively. The input of the main classification branch is the image after element-wise reweighting with a soft mask and injection of adaptive scaling directional noise. The scaling regularization term is used to constrain the overall magnitude of the noise scaling matrix.
6. The robustness enhancement method for adaptive directional smoothing according to claim 1, characterized in that, The specific steps for generating directional perturbations and completing noise injection in step S40 include: Base isotropic Gaussian noise z ~ N(0, σ²I) with the same size as the input image is sampled, where σ is the preset standard deviation of the base noise, I is the identity matrix, and N denotes the normal distribution; The isotropic Gaussian noise z is transformed by the noise scaling matrix S to generate a directional perturbation S⊙z, where ⊙ is the element-wise product operation; The input image x is reweighted by the soft mask to obtain x_mask = m⊙x, where x_mask is the image reweighted by the soft mask and m is a soft mask with the same size as the input image x; The directional perturbation is injected into the reweighted image to obtain the final image x' = x_mask + S⊙z that is input to the main classifier.
7. The robustness enhancement method for adaptive directional smoothing according to claim 1, characterized in that, The specific steps of robust radius conversion in step S50 include: Based on the certification framework of random smoothing, the initial certified robust radius R_w is calculated in the weighted Lp space defined by the noise scaling matrix; The minimum value s_min in the noise scaling matrix is extracted as the lower bound coefficient for the robust radius conversion; The initial certified robust radius R_w in the weighted Lp space is converted to the equivalent certification radius R_std in the standard Lp space through analytical derivation, the conversion formula being R_std = s_min·R_w; The equivalent certification radius in the standard Lp space is output as the final verifiable robustness metric.
8. A robustness enhancement system for adaptive directional smoothing, characterized in that, The system implements the robustness enhancement method for adaptive directional smoothing as described in claim 1, comprising: Saliency-guided semantic masking unit: used to input the input image into the saliency-guided semantic masking module, extract the intermediate feature representation of the image and generate a soft mask, and attenuate the effective signal amplitude of the background region through the soft mask; Input adaptive directional noise unit: used to input the extracted intermediate features into the input adaptive directional noise module, learn to generate a noise scaling matrix, assign noise tolerance to the background region through the noise scaling matrix, and limit the noise scale of the foreground region; End-to-end joint training unit: used to construct an overall objective function that combines the main classification loss, the auxiliary guidance loss, and the scaling regularization term, and to perform end-to-end joint training on the entire network based on the overall objective function; Directional perturbation and inference unit: used in the inference and robustness certification stages to sample the base isotropic Gaussian noise, apply the learned noise scaling matrix to transform the Gaussian noise, generate input-dependent directional perturbations, and output the smoothed prediction results after noise injection; Robust radius certification and conversion unit: used to calculate the initial certified robust radius in the weighted Lp space corresponding to the noise scaling matrix, extract the minimum noise scaling coefficient as the lower bound through analytical derivation, convert the weighted robust region into the equivalent certification radius in the standard Lp space, and output the final mathematical robustness certification result.
9. A robustness enhancement device for adaptive directional smoothing, characterized in that, The device includes: a memory, a processor, and an adaptive directional smoothing robustness enhancement program stored in the memory and executable on the processor, wherein the adaptive directional smoothing robustness enhancement program, when executed by the processor, implements an adaptive directional smoothing robustness enhancement method as described in any one of claims 1 to 7.
10. A computer program product, characterized in that, The computer program product includes an adaptive directional smoothing robustness enhancement program, which, when executed by a processor, implements an adaptive directional smoothing robustness enhancement method as described in any one of claims 1 to 7.