An unsupervised PCB image reconstruction and defect detection method and system
By combining unsupervised learning methods with wavelet transform, multi-scale dilated convolution, and the Mamba state-space model, the problems of high annotation cost, limited receptive field, and insufficient reconstruction quality in PCB defect detection are solved, achieving efficient detection and repair of minute defects.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CHENGDU UNIV OF INFORMATION TECH
- Filing Date
- 2026-05-14
- Publication Date
- 2026-06-12
AI Technical Summary
Existing PCB defect detection methods rely on large numbers of labeled defect samples, which makes them hard to adapt to real production scenarios where defect samples are scarce and diverse. Traditional convolutional neural networks have limited receptive fields, making it difficult to balance global continuity with the representation of minute defects. Existing unsupervised reconstruction methods repair high-frequency details poorly and do not fully use frequency-domain information, resulting in poor detection accuracy and repair quality.
An unsupervised learning method is adopted to extract high-frequency features through discrete wavelet transform and multi-scale dilated convolution, and low-frequency modeling is performed by combining the Mamba state space model. Anomaly repair is performed in the frequency domain, and the texture and structural consistency of the reconstructed image are enhanced by fast Fourier transform. A joint framework for reconstruction and discrimination is constructed for defect detection.
The method significantly improves the detection accuracy of minute defects and the quality of anomaly repair, reduces annotation costs, and improves the accuracy of anomaly region localization and of segmentation boundaries. It is suitable for PCB online inspection systems and has good engineering usability.
Figure CN122199541A_ABST
Description
Technical Field
[0001] This invention relates to the field of industrial visual inspection technology, specifically to an unsupervised PCB image reconstruction and defect detection method and system.
Background Technology
[0002] In the PCB (Printed Circuit Board) production and quality inspection process, the actual acquired PCB images often suffer from strong noise interference, blurred details, low contrast, and missing information in local areas due to various factors such as imaging equipment noise, uneven lighting, surface reflection, contamination, and complex circuit structures. These imaging quality problems significantly weaken the grayscale or texture differences between defective and normal areas, making it difficult to accurately identify fine-grained defects such as micro-cracks, broken lines, burrs, holes, and cold solder joints. This seriously affects the accuracy and stability of subsequent defect detection and localization. This type of problem is particularly prominent in PCB online inspection scenarios, because online inspection usually faces high-frequency imaging, complex background noise, and reflection interference simultaneously. If the defective area cannot maintain sufficiently significant structural and texture differences in the image, it is very easy to miss or falsely detect defects.
[0003] Currently, PCB defect detection methods mainly fall into three categories: the first is supervised learning methods based on convolutional neural networks, which typically require a large number of labeled defect samples for training to achieve defect classification or segmentation; the second is deep learning methods based on Transformer or attention mechanisms, which enhance the characterization of long-range dependencies by introducing global modeling capabilities; and the third is unsupervised anomaly detection methods based on image reconstruction, such as autoencoders and DRAEM (Discriminatively Trained Reconstruction Anomaly Embedding Model), whose core idea is to train the reconstruction network using only normal samples and achieve anomaly detection during the inference stage through reconstruction errors or reconstruction differences.
[0004] While the aforementioned methods have improved PCB defect detection performance to some extent, they still have many shortcomings in practical industrial applications. Specifically, supervised learning-based methods heavily rely on a large number of labeled defect samples. However, in real industrial scenarios, defect samples are usually scarce and diverse, making manual labeling costly, time-consuming, and difficult to cover all defect types, thus limiting the model's generalization ability. Methods based on Transformer or attention mechanisms have certain advantages in global modeling, but their computational and deployment costs are usually high, making it difficult to balance detection accuracy and inference efficiency in high-resolution PCB image scenarios. Unsupervised methods based on ordinary reconstruction networks are prone to irreversible loss of high-frequency detail information during multiple downsampling and upsampling processes, making it difficult for the reconstructed image to truly reflect the detailed features of defect-free PCBs.
[0005] Furthermore, existing methods generally do not fully utilize frequency domain information, failing to effectively mine the directional texture features, periodic trace structure features, and noise distribution features contained in PCB images, making it difficult to achieve global consistency repair of abnormal regions at the frequency domain level. At the same time, the receptive field of traditional convolutional networks is limited, making it difficult to simultaneously take into account the overall continuity of PCB trace structure over a large scale and the local fine-grained representation of minute defects. Even if some information loss is mitigated by ordinary skip connections, high-frequency detail information may still be weakened during the encoding and decoding process, resulting in fine-grained defects such as microcracks, broken lines, and edge burrs being over-smoothed or completely erased in the reconstruction results. This makes the reconstruction difference between abnormal and normal regions insufficiently significant, affecting the accuracy of subsequent anomaly detection.
[0006] For the reasons mentioned above, there is an urgent need for a new network structure and method that can simultaneously integrate frequency domain information and long-range dependency modeling capabilities, and is applicable to unsupervised or self-supervised PCB defect detection, so as to improve the accuracy of micro-defect detection and the quality of anomaly repair while reducing annotation costs.
Summary of the Invention
[0007] The purpose of this invention is to provide an unsupervised PCB image reconstruction and defect detection method to at least solve the following problems existing in the prior art: First, it relies heavily on a large number of labeled defect samples, making it difficult to adapt to real production scenarios where defect samples are scarce, diverse, and unevenly distributed; second, traditional convolutional neural networks have limited receptive fields, making it difficult to simultaneously take into account the global continuity of PCB trace structures and the local fine-grained representation of minute defects; third, existing unsupervised reconstruction methods have insufficient ability to restore high-frequency details and micro-cracks, and the reconstruction results are prone to texture blurring and structural distortion; fourth, frequency domain information and long-range dependencies are not fully utilized, making it difficult to achieve global consistency repair of abnormal areas at the frequency domain level.
[0008] To achieve the above objectives, a first aspect of the present invention provides an unsupervised PCB image reconstruction and defect detection method, comprising:
A normal PCB image sample set is obtained, and normalization, size alignment, and pseudo-anomaly construction are performed on the normal PCB image sample set to generate a training input image, a normal reference image, and an anomaly mask.
The training input image is input into the reconstruction subnetwork. In the encoding stage, multi-band decomposition based on discrete wavelet transform, high-frequency feature extraction based on multi-scale dilated convolution, and low-frequency feature modeling based on the Mamba state-space model are performed on the input features. In the decoding stage, frequency-domain decomposition, frequency-domain enhancement, and frequency-domain reconstruction based on fast Fourier transform are performed on the encoded features to generate the reconstructed image.
The training input image and the reconstructed image are concatenated along the channel dimension, and the concatenation result is input into the discriminative subnetwork to generate an anomaly probability map.
Based on the normal reference image, the reconstructed image, the anomaly mask, and the anomaly probability map, the joint loss is calculated and the reconstruction subnetwork and the discriminative subnetwork are jointly optimized to obtain an unsupervised PCB image reconstruction and defect detection model.
The PCB image to be detected is input into the jointly optimized unsupervised PCB image reconstruction and defect detection model, which outputs a defect mask image, and the location, outline, and area information of the defect region are determined based on the defect mask image.
[0009] Optionally, obtaining a normal PCB image sample set and performing normalization, size alignment, and pseudo-anomaly construction on it to generate a training input image, a normal reference image, and an anomaly mask includes:
Normal images containing only defect-free PCBs are obtained as the normal PCB image sample set;
Normalization and size alignment are performed on each normal PCB image in the normal PCB image sample set to generate a preprocessed normal image;
In the preprocessed normal image, a target region is randomly selected, and at least one of the following processes is performed on the target region to generate a pseudo-anomaly region: noise injection, local occlusion, texture replacement, brightness anomaly processing, contrast anomaly processing, local blurring processing, or local sharpening processing;
The pseudo-anomaly region is mapped back to the corresponding preprocessed normal image to generate a training input image, and an anomaly mask representing the spatial location of the pseudo-anomaly region is generated at the same time;
The preprocessed normal image is used as the normal reference image, and the training input image, the normal reference image, and the anomaly mask form a training sample pair.
[0010] Optionally, inputting the training input image into the reconstruction subnetwork and, during the encoding stage, performing multi-band decomposition based on discrete wavelet transform, high-frequency feature extraction based on multi-scale dilated convolution, and low-frequency feature modeling based on the Mamba state-space model on the input features includes:
The training input image is mapped to an input feature map $X$;
A discrete wavelet transform is performed on the input feature map $X$ to obtain the low-frequency subband features and the high-frequency subband features, expressed as follows:
$$\{X_{LL},\, X_{LH},\, X_{HL},\, X_{HH}\} = \mathrm{DWT}(X)$$
where $X_{LL}$ denotes the low-frequency subband features, $X_{LH}$, $X_{HL}$, and $X_{HH}$ denote the high-frequency subband features, and $\mathrm{DWT}(\cdot)$ denotes the discrete wavelet transform operator;
The high-frequency subband features are input into the high-frequency local branch for multi-scale local feature extraction;
The low-frequency subband features are input into the low-frequency global branch for long-range dependency modeling;
Feature fusion is performed on the output of the low-frequency global branch and the output of the high-frequency local branch, and the fusion result is transmitted to the decoding stage.
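The multi-band decomposition described above can be sketched with a single-level 2-D Haar transform. The choice of the Haar wavelet, the function names `haar_dwt2` and `haar_idwt2`, and the labeling of the three detail bands are assumptions made for illustration; the text does not state which wavelet basis is used.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2-D Haar DWT of an array with even height and width."""
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a + b - c - d) / 2.0  # top minus bottom: responds to horizontal edges
    hl = (a - b + c - d) / 2.0  # left minus right: responds to vertical edges
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution array."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

feature = np.arange(16, dtype=np.float64).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(feature)
restored = haar_idwt2(ll, lh, hl, hh)  # matches `feature` up to rounding
```

Each subband has half the spatial resolution of the input, and the transform is invertible, which is why the two branches can process the bands separately without discarding information before fusion.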
[0011] Optionally, inputting the high-frequency subband features into the high-frequency local branch for multi-scale local feature extraction includes:
A 1×1 convolutional mapping is performed on each of the high-frequency subband features $X_{LH}$, $X_{HL}$, and $X_{HH}$ to generate the corresponding high-frequency mapping features;
Multiple groups of depthwise separable dilated convolutions are performed on the high-frequency mapping features to obtain multi-scale high-frequency local features, expressed as follows:
$$F_i = \mathrm{DWConv}_{k_i,\,d_i}\big(\mathrm{Conv}_{1\times 1}(X_H)\big), \quad i = 1, 2, 3$$
$$F_{ms} = \mathrm{Concat}(F_1, F_2, F_3)$$
where $X_H$ denotes any one of the high-frequency subband features, $F_i$ denotes the output of the $i$-th group of depthwise separable dilated convolutions, $\mathrm{DWConv}_{k_i,\,d_i}(\cdot)$ denotes the depthwise separable dilated convolution operator with kernel size $k_i$ and dilation rate $d_i$, $\mathrm{Conv}_{1\times 1}(\cdot)$ denotes the 1×1 convolution mapping operator, $\mathrm{Concat}(\cdot)$ denotes the channel concatenation operator, and $F_{ms}$ denotes the multi-scale high-frequency local features; $k_i$ takes the values 3, 5, and 7, and $d_i$ takes the values 1, 2, and 3, respectively;
A 1×1 convolution fusion is performed on the multi-scale high-frequency local features to generate the high-frequency fused features.
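To make the multi-scale effect of these settings concrete, the short calculation below gives the spatial extent covered by one kernel in each group. It uses the standard relation for dilated convolutions, extent = k + (k − 1)(d − 1), and assumes the kernel sizes and dilation rates are paired in the order listed (3 with 1, 5 with 2, 7 with 3). The helper name is illustrative.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent (in pixels) spanned by one dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

# Assumed pairing of kernel size and dilation rate, in the order listed.
branches = [(3, 1), (5, 2), (7, 3)]
extents = [effective_kernel_size(k, d) for k, d in branches]
print(extents)  # [3, 9, 19]
```

Under that pairing, the three groups see neighborhoods of 3, 9, and 19 pixels on the half-resolution subband, so fine edges and wider trace segments are covered by different groups.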
[0012] Optionally, inputting the low-frequency subband features into the low-frequency global branch for long-range dependency modeling, and performing feature fusion on the output of the low-frequency global branch and the output of the high-frequency local branch, includes:
Linear mapping and normalization are performed on the low-frequency subband features $X_{LL}$;
The processed low-frequency subband features are input into the Mamba state-space model to obtain the low-frequency global features, expressed as follows:
$$F_L = \mathrm{Mamba}\big(\mathrm{Norm}(\mathrm{Linear}(X_{LL}))\big)$$
where $\mathrm{Linear}(\cdot)$ denotes the linear mapping operator, $\mathrm{Norm}(\cdot)$ denotes the normalization operator, $\mathrm{Mamba}(\cdot)$ denotes the state-space modeling operator, and $F_L$ denotes the low-frequency global features;
The low-frequency global features and the high-frequency fused features are concatenated along the channel dimension, and convolutional fusion and inverse wavelet transform are performed to obtain the encoded output features, satisfying the following:
$$Y = \mathrm{IDWT}\big(\mathrm{Conv}_{3\times 3}(\mathrm{Concat}(F_L, F_H))\big)$$
where $F_H$ denotes the high-frequency fused features, $\mathrm{Conv}_{3\times 3}(\cdot)$ denotes the 3×3 convolution fusion operator, $\mathrm{IDWT}(\cdot)$ denotes the inverse wavelet transform operator, and $Y$ denotes the encoded output features;
The encoded output features are downsampled and then transmitted to the next encoding layer or to the bottleneck layer.
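The idea behind state-space modeling of long-range dependencies can be illustrated with the simplest member of that family: a scalar linear recurrence scanned over a sequence. This is only a sketch. It uses fixed coefficients `a`, `b`, `c` (hypothetical names and values), whereas Mamba makes its parameters depend on the input; it is not the operator used in the low-frequency global branch.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Scalar linear state-space recurrence.

    h_t = a * h_{t-1} + b * x_t   (state update)
    y_t = c * h_t                 (readout)
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys

# A single impulse at position 0 keeps influencing later outputs,
# with the influence decaying geometrically at rate `a`.
impulse = [1.0] + [0.0] * 9
response = ssm_scan(impulse, a=0.5)
```

Because the state carries information forward across the whole sequence in one linear-time pass, a flattened low-frequency feature map can be related across distant positions without the quadratic cost of full attention.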
[0013] Optionally, performing frequency-domain decomposition, frequency-domain enhancement, and frequency-domain reconstruction based on fast Fourier transform on the encoded features during the decoding stage to generate a reconstructed image includes:
A fast Fourier transform is performed on the encoded features $Z$ to obtain a complex spectrum, and the complex spectrum is decomposed into an amplitude spectrum and a phase spectrum, satisfying:
$$\mathcal{F} = \mathrm{FFT}(Z)$$
$$A = |\mathcal{F}|$$
$$\Phi = \angle(\mathcal{F})$$
where $Z$ denotes the encoded features, $\mathcal{F}$ denotes the complex spectrum, $\mathrm{FFT}(\cdot)$ denotes the fast Fourier transform operator, $A$ denotes the amplitude spectrum, $\Phi$ denotes the phase spectrum, $|\cdot|$ denotes the modulus operator, and $\angle(\cdot)$ denotes the phase operator;
Convolutional enhancement is performed on the amplitude spectrum, and linear mapping, normalization, and Mamba state-space modeling are performed on the phase spectrum to obtain the enhanced amplitude features and the global phase features, satisfying:
$$A' = \sigma\big(\mathrm{Conv}(A)\big)$$
$$\Phi' = \mathrm{Mamba}\big(\mathrm{Norm}(\mathrm{Linear}(\Phi))\big)$$
where $A'$ denotes the enhanced amplitude features, $\Phi'$ denotes the global phase features, and $\sigma(\cdot)$ denotes the activation function;
Based on the enhanced amplitude features and the global phase features, the complex spectrum is reconstructed, and an inverse fast Fourier transform is performed to obtain the reconstructed features, satisfying:
$$\mathcal{F}' = A' \cdot e^{j\Phi'}$$
$$\hat{Z} = \mathrm{IFFT}(\mathcal{F}')$$
where $\mathcal{F}'$ denotes the reconstructed complex spectrum, $e^{j\Phi'}$ denotes the phase complex exponential term generated from the global phase features, $j$ denotes the imaginary unit, $\mathrm{IFFT}(\cdot)$ denotes the inverse fast Fourier transform operator, and $\hat{Z}$ denotes the reconstructed features;
The reconstructed features are residually fused with the skip-connection features from the encoding stage to generate the reconstructed image.
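The amplitude and phase split and the recombination step can be checked numerically with NumPy. In the sketch below the learned enhancement of the amplitude and the learned modeling of the phase are both replaced by the identity, so the round trip should return the input. The function names are illustrative and not taken from the text.

```python
import numpy as np

def fft_amp_phase(z):
    """Split a real 2-D feature map into amplitude and phase spectra."""
    spectrum = np.fft.fft2(z)
    return np.abs(spectrum), np.angle(spectrum)

def rebuild_from_amp_phase(amp, phase):
    """Recombine amplitude and phase into a complex spectrum and invert it."""
    spectrum = amp * np.exp(1j * phase)
    # For unmodified amp/phase the imaginary part is rounding noise only.
    # After a learned modification it may not be, and taking the real part
    # is then a design choice rather than an exact inverse.
    return np.fft.ifft2(spectrum).real

rng = np.random.default_rng(0)
z = rng.random((8, 8))
amp, phase = fft_amp_phase(z)
z_back = rebuild_from_amp_phase(amp, phase)  # identity stands in for the learned steps
```

Every spectral coefficient depends on all spatial positions, so a local change to the amplitude or phase affects the whole feature map. This is the property that frequency-domain repair relies on for global consistency.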
[0014] Optionally, concatenating the training input image and the reconstructed image along the channel dimension and inputting the concatenation result into the discriminative subnetwork to generate an anomaly probability map includes:
The training input image or the PCB image to be detected is denoted as the input image $I$, and the input image and the reconstructed image $\hat{I}$ are concatenated along the channel dimension to obtain the discriminative input features, expressed as:
$$D_{in} = \mathrm{Concat}(I, \hat{I})$$
where $D_{in}$ denotes the discriminative input features, and $\mathrm{Concat}(\cdot)$ denotes the channel concatenation operator;
The discriminative input features are input into a convolution-based U-Net discriminative subnetwork, which outputs an anomaly probability map $P$ with the same spatial size as the input image;
During the inference phase, the PCB image to be detected is input sequentially into the reconstruction subnetwork and the discriminative subnetwork to generate the corresponding anomaly probability map.
[0015] Optionally, calculating the joint loss based on the normal reference image, the reconstructed image, the anomaly mask, and the anomaly probability map, and jointly optimizing the reconstruction subnetwork and the discriminative subnetwork, includes:
The reconstructed image is denoted as $\hat{I}$, the normal reference image as $I_n$, the anomaly probability map as $P$, and the anomaly mask as $M$;
The reconstruction loss is calculated based on the reconstructed image and the normal reference image, expressed as follows:
$$L_{rec} = \alpha \cdot \mathrm{MSE}(\hat{I}, I_n) + \beta \cdot \big(1 - \mathrm{SSIM}(\hat{I}, I_n)\big)$$
where $L_{rec}$ denotes the reconstruction loss, $\mathrm{MSE}(\cdot)$ denotes the mean squared error loss function, $\mathrm{SSIM}(\cdot)$ denotes the structural similarity function, and $\alpha$ and $\beta$ denote the loss weighting coefficients;
The segmentation loss is calculated based on the anomaly probability map and the anomaly mask, satisfying the following:
$$L_{seg} = \mathrm{Focal}(P, M)$$
where $L_{seg}$ denotes the segmentation loss, and $\mathrm{Focal}(\cdot)$ denotes the focal loss function;
The joint loss is calculated based on the reconstruction loss and the segmentation loss, satisfying the following:
$$L = L_{rec} + \gamma \cdot L_{seg}$$
where $L$ denotes the joint loss and $\gamma$ denotes the weighting coefficient of the segmentation loss;
Backpropagation updates are performed on the reconstruction subnetwork and the discriminative subnetwork based on the joint loss to obtain an unsupervised PCB image reconstruction and defect detection model.
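A minimal NumPy sketch of a joint loss of this kind follows. Several details are assumptions rather than statements of the patented formula: the SSIM term is computed over the whole image as a single window and enters as 1 − SSIM, the focal loss uses a focusing exponent named `focus` (chosen to avoid confusion with the segmentation weight γ), and `alpha`, `beta`, and all default values are placeholders.

```python
import numpy as np

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def global_ssim(a, b, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM over the whole image as one window (simplified; inputs in [0, 1])."""
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return float(num / den)

def focal_loss(prob, mask, focus=2.0, eps=1e-7):
    """Binary focal loss between a probability map and a 0/1 mask."""
    p = np.clip(prob, eps, 1.0 - eps)
    p_t = np.where(mask == 1, p, 1.0 - p)  # probability given to the true class
    return float(np.mean(-((1.0 - p_t) ** focus) * np.log(p_t)))

def joint_loss(recon, ref, prob, mask, alpha=1.0, beta=1.0, gamma=1.0):
    """Reconstruction loss plus gamma-weighted segmentation loss."""
    l_rec = alpha * mse(recon, ref) + beta * (1.0 - global_ssim(recon, ref))
    l_seg = focal_loss(prob, mask)
    return l_rec + gamma * l_seg
```

With a perfect reconstruction and a probability map equal to the mask, the value is close to zero; errors in either branch raise it, which is what lets one backward pass update both subnetworks.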
[0016] Optionally, determining the location, outline, and area information of the defect region based on the defect mask image specifically includes:
Thresholding is performed on the anomaly probability map to obtain a binary defect mask that satisfies:
$$B(x,y) = \begin{cases} 1, & P(x,y) \geq \tau \\ 0, & \text{otherwise} \end{cases}$$
where $B(x,y)$ denotes the binary defect mask value at coordinate $(x,y)$, $P(x,y)$ denotes the anomaly probability value at coordinate $(x,y)$, and $\tau$ denotes the threshold parameter;
Morphological filtering and connected component analysis are performed on the binary defect mask to remove noise regions and extract candidate defect regions;
The location, outline, and area information of the defect region are output based on the candidate defect regions.
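The post-processing chain can be sketched as follows. To keep the example dependency-light, morphological filtering is swapped for a simple minimum-area filter, the outline is reduced to a bounding box, and the threshold comparison is assumed to be inclusive. The function name, `tau`, and `min_area` are hypothetical.

```python
from collections import deque

import numpy as np

def extract_defects(prob, tau=0.5, min_area=2):
    """Threshold a probability map and list 4-connected regions.

    Returns one dict per region with its pixel area and bounding box
    (min_row, min_col, max_row, max_col). Regions smaller than
    `min_area` are dropped as noise.
    """
    binary = prob >= tau
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    regions = []
    for y in range(h):
        for x in range(w):
            if not binary[y, x] or seen[y, x]:
                continue
            # Breadth-first flood fill over one connected component.
            queue = deque([(y, x)])
            seen[y, x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_area:
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                regions.append({"area": len(pixels),
                                "bbox": (min(ys), min(xs), max(ys), max(xs))})
    return regions
```

In a production system a library routine for labeling and morphological opening would normally replace the hand-written flood fill; the version here only shows the order of operations.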
[0017] A second aspect of the present invention provides an unsupervised PCB image reconstruction and defect detection system, the system being used to perform the above-described unsupervised PCB image reconstruction and defect detection method, the system comprising:
A construction unit, used to acquire a normal PCB image sample set and perform normalization, size alignment, and pseudo-anomaly construction on the normal PCB image sample set to generate a training input image, a normal reference image, and an anomaly mask;
A reconstruction unit, used to input the training input image into the reconstruction subnetwork; in the encoding stage, it performs multi-band decomposition based on discrete wavelet transform, high-frequency feature extraction based on multi-scale dilated convolution, and low-frequency feature modeling based on the Mamba state-space model; in the decoding stage, it performs frequency-domain decomposition, frequency-domain enhancement, and frequency-domain reconstruction based on fast Fourier transform to generate the reconstructed image;
A discrimination unit, used to concatenate the training input image and the reconstructed image along the channel dimension and input the concatenation result into the discriminative subnetwork to generate an anomaly probability map;
An optimization unit, used to calculate the joint loss based on the normal reference image, the reconstructed image, the anomaly mask, and the anomaly probability map, and to jointly optimize the reconstruction subnetwork and the discriminative subnetwork, so as to obtain an unsupervised PCB image reconstruction and defect detection model;
An output unit, used to input the PCB image to be detected into the jointly optimized unsupervised PCB image reconstruction and defect detection model, output a defect mask image, and determine the location, outline, and area information of the defect region based on the defect mask image.
[0018] Beneficial technical effects of the present invention: Through the above technical solution, low-frequency structural information and high-frequency texture information are explicitly separated by wavelet transform during the encoding stage. Multi-scale dilated convolution modeling is then applied to the high-frequency subband features, effectively enhancing the expressive power of PCB trace edges, pad contours, and minute defects such as cracks, burrs, and broken lines. This avoids the problem of small-target defect features being submerged during multiple downsampling processes in traditional convolutional networks, thus significantly improving the robust perception capability for defects of different scales and morphologies. Simultaneously, this invention uses the Mamba state-space model to perform long-range dependency modeling of low-frequency features, ensuring that the overall continuity and topological consistency of PCB traces are more fully maintained over a large scale.
[0019] Furthermore, this invention introduces an FFTMB frequency domain anomaly repair module based on Fast Fourier Transform (FFT) in the decoding stage. It models the amplitude and phase spectra separately and uses the Mamba state-space model to perform global dependency modeling of phase features. This achieves structural consistency repair of anomaly regions in the frequency domain, maintaining overall trace layout continuity while preserving local texture details, and reducing texture blurring, structural distortion, and broken lines in the reconstructed image. In addition, this invention employs an unsupervised or self-supervised reconstruction-discrimination joint framework, training only on normal PCB images. The original and reconstructed images are jointly input into the discriminant subnetwork for anomaly detection, effectively improving the accuracy of anomaly region localization and segmentation boundaries, and significantly reducing defect sample annotation costs. Simultaneously, by introducing the Mamba state-space model and depthwise separable dilated convolution, it reduces the number of parameters and computational complexity while maintaining modeling capabilities. Compared to Transformer-based methods, it has lower memory usage and higher inference efficiency, making it more suitable for deployment in PCB online inspection systems and edge computing devices, and possessing good engineering usability.
[0020] Other features and advantages of the embodiments of the present invention will be described in detail in the following detailed description section.
Attached Figure Description
[0021] The accompanying drawings are provided to further illustrate embodiments of the present invention and form part of the specification. They are used together with the following detailed description to explain the embodiments of the present invention, but do not constitute a limitation thereof. In the drawings:
Figure 1 is a flowchart of the steps of an unsupervised PCB image reconstruction and defect detection method provided in one embodiment of the present invention;
Figure 2 is an overall structure diagram of an unsupervised PCB image reconstruction and defect detection network provided in one embodiment of the present invention;
Figure 3 is a schematic diagram of the WMCB module structure provided in one embodiment of the present invention;
Figure 4 is a schematic diagram of the FFTMB module structure provided in one embodiment of the present invention;
Figure 5 is a schematic diagram of the structure of an unsupervised PCB image reconstruction and defect detection system provided in one embodiment of the present invention.
Detailed Implementation
[0022] The specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are for illustration and explanation only and are not intended to limit the present invention.
[0023] In this invention, the so-called feature channel refers to a set of features extracted from an image, used to represent different attributes or information of the image; the so-called encoder-decoder structure refers to a deep learning network architecture consisting of an encoder and a decoder; the so-called unsupervised learning refers to a learning method that does not rely on real defect-labeled samples during training, but only uses normal samples or synthetic abnormal samples for model training; the so-called self-supervised learning refers to a training method that guides the network to learn the ability to suppress and locate anomalies by constructing pseudo-labels or synthetic abnormal samples without manual annotation; the so-called synthetic anomaly refers to a pseudo-abnormal region constructed by applying noise perturbation, partial occlusion, texture replacement, brightness or contrast anomalies, local blurring or sharpening, etc., to normal images, and is used to replace real defect samples for training.
[0024] As shown in Figure 1 and Figure 2, this invention provides an unsupervised PCB defect detection method based on the collaborative operation of a reconstruction subnetwork and a discriminative subnetwork. The method adopts a joint reconstruction-discrimination framework to address the loss of high-frequency detail, the difficulty of detecting small-scale defects, the insufficient structural-consistency repair capability, and the poor adaptability to unsupervised scenarios found in existing PCB defect detection methods. This invention is not a simple replacement of the existing DRAEM structure. Instead, it inherits the basic idea of training using only normal samples and performing joint learning by constructing pseudo-abnormal samples, and introduces wavelet transform, multi-scale dilated convolution, fast Fourier transform, and the Mamba state-space model to form a network structure and training process more suitable for PCB scenarios. This significantly improves the detection accuracy and anomaly repair quality for small PCB defects while keeping the model lightweight and deployable. Specifically, the method includes the following steps:
Step S10: Obtain a normal PCB image sample set, and perform normalization processing, size alignment processing, and pseudo-anomaly construction processing on the normal PCB image sample set to generate a training input image, a normal reference image, and an anomaly mask.
[0025] In this embodiment of the invention, normal images containing only defect-free PCBs are first acquired as the training data. It is readily understood that real defect samples in PCB industrial inspection scenarios are often scarce, unevenly distributed, and morphologically diverse. Relying entirely on manually labeled defect samples is not only costly but also difficult to cover all defect types in real production. Therefore, this invention preferably employs unsupervised or self-supervised training methods, using normal samples as the primary training data source, and generating the required anomaly supervision signals during training by performing pseudo-anomaly construction operations on the normal samples. This design not only maintains the basic advantages of unsupervised anomaly detection but also enables the reconstruction subnetwork to learn anomaly suppression capabilities and the discriminant subnetwork to learn anomaly localization capabilities.
[0026] Furthermore, in one executable implementation, step S10 can be refined into steps S101 to S104.
[0027] Step S101: Obtain a normal PCB image sample set and perform preprocessing.
[0028] Specifically, step S101 is used to obtain a normal PCB image sample set from the acquisition system, historical inspection data, or a normal sample library. The normal PCB image sample set preferably contains only defect-free PCB images, so that the network can establish a stable representation of normal PCB texture distribution, trace structure, pad boundaries, hole layout, and brightness distribution during training.
[0029] In this embodiment of the invention, normalization and size alignment are performed on each normal PCB image in the normal PCB image sample set to generate preprocessed normal images. The normalization process reduces the impact of different batch acquisition conditions, exposure conditions, and lighting environments on the grayscale distribution; the size alignment process ensures that each image maintains consistency in spatial size and input format, thereby facilitating subsequent batch training and multi-scale feature extraction. It is easy to understand that this preprocessing is not limited to a single fixed algorithm, as long as it ensures that the images input to the encoder have a uniform scale and a relatively stable statistical distribution.
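Since the preprocessing is not tied to a single algorithm, the following is only one possible realization, written as a sketch: min-max normalization to [0, 1] followed by center-cropping or zero-padding to a fixed size. The function name, the target size, and the choice of crop/pad instead of resampling are all assumptions.

```python
import numpy as np

def preprocess(image, size=(8, 8)):
    """Min-max normalize to [0, 1], then center-crop or zero-pad to `size`."""
    img = image.astype(np.float64)
    lo, hi = img.min(), img.max()
    # A constant image has no dynamic range; map it to all zeros.
    img = (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)

    th, tw = size
    h, w = img.shape
    ch, cw = min(h, th), min(w, tw)            # size of the region actually copied
    sy, sx = (h - ch) // 2, (w - cw) // 2      # crop offsets in the source
    dy, dx = (th - ch) // 2, (tw - cw) // 2    # paste offsets in the output
    out = np.zeros((th, tw), dtype=np.float64)
    out[dy:dy + ch, dx:dx + cw] = img[sy:sy + ch, sx:sx + cw]
    return out

raw = np.arange(50).reshape(5, 10)   # stand-in for a grayscale PCB image
aligned = preprocess(raw)            # shape (8, 8), values in [0, 1]
```

Per-image min-max scaling removes exposure offsets between batches, and a fixed output size lets images from different acquisition setups be stacked into one training batch.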
[0030] Step S102: Perform pseudo-anomaly construction processing on the preprocessed normal image.
[0031] Specifically, after preprocessing, a target region is randomly selected from the preprocessed normal image, and at least one of the following processes is performed on the target region: noise injection, local occlusion, texture replacement, brightness anomaly processing, contrast anomaly processing, local blurring, or local sharpening processing, to generate a pseudo-anomaly region. It should be noted that the purpose of the above pseudo-anomaly construction method is to simulate as closely as possible the abnormal manifestations that may occur in actual PCB industrial scenarios, such as cracks, stains, scratches, occlusion, local brightness anomalies, edge blurring, or abrupt changes in detail.
[0032] It's easy to understand that the purpose of constructing pseudo-anomalies is not to obtain the exact same geometric shape as real defects, but rather to allow the network to perceive the mapping relationship between anomalous inputs, normal outputs, and anomalous masks during the training phase, thereby learning the difference propagation rules of anomalous regions in the reconstruction and discrimination branches. Furthermore, in one feasible implementation, the spatial extent, perturbation intensity, and perturbation type of the pseudo-anomaly region can be randomly combined to improve the model's robustness to different anomaly patterns.
[0033] Step S103: Generate training input image and anomaly mask.
[0034] Specifically, the pseudo-anomaly regions are mapped back to their corresponding preprocessed normal images to generate training input images, and anomaly masks representing the spatial locations of the pseudo-anomaly regions are generated simultaneously. The training input images serve as input to the reconstruction sub-network, and the anomaly masks serve as supervisory signals for the discriminative sub-network in learning anomaly segmentation.
[0035] It is easy to understand that in this invention, the anomaly mask does not originate from manually labeled real defects, but rather from the known locations of anomaly regions during the pseudo-anomaly construction step, falling under the category of unsupervised or self-supervised learning. In this way, on the one hand, the reconstruction sub-network is guided to learn how to restore an image containing anomalies to an image without anomalies; on the other hand, the discriminative sub-network can learn to recover the locations of anomaly regions from the differences between the original and reconstructed images, thus forming a joint learning mechanism for reconstruction and discrimination.
[0036] Step S104: Generate normal reference images and form training sample pairs.
[0037] Specifically, the preprocessed normal image is used as the normal reference image, and the training input image, the normal reference image, and the anomaly mask constitute a training sample pair. It should be noted that in this training sample pair, there is a clear data correspondence between the training input image and the normal reference image, and there is also a clear spatial correspondence between the anomaly mask and the pseudo-anomaly region in the training input image. Therefore, the subsequent reconstruction loss and segmentation loss can be jointly calculated around the same batch of training samples.
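As a non-limiting illustration of steps S102 to S104, the following minimal sketch builds one training sample from a preprocessed normal image. It assumes a single-channel floating-point image in [0, 1], a rectangular target region, and noise injection as the only perturbation; the function name `make_training_sample` and the parameters `max_frac` and `noise_std` are hypothetical and do not appear in the original disclosure.

```python
import numpy as np

def make_training_sample(normal, rng, max_frac=0.25, noise_std=0.2):
    """Return (training_input, normal_reference, anomaly_mask) for one image.

    normal: float array in [0, 1] with shape (H, W), assumed already preprocessed.
    """
    h, w = normal.shape
    # Randomly choose the size and position of a rectangular target region.
    rh = int(rng.integers(1, max(2, int(h * max_frac) + 1)))
    rw = int(rng.integers(1, max(2, int(w * max_frac) + 1)))
    top = int(rng.integers(0, h - rh + 1))
    left = int(rng.integers(0, w - rw + 1))

    # The mask is known by construction, so no manual labeling is needed.
    mask = np.zeros((h, w), dtype=np.uint8)
    mask[top:top + rh, left:left + rw] = 1

    # Noise injection is only one of the perturbation types listed in the text.
    corrupted = normal.copy()
    noise = rng.normal(0.0, noise_std, size=(rh, rw))
    corrupted[top:top + rh, left:left + rw] = np.clip(
        corrupted[top:top + rh, left:left + rw] + noise, 0.0, 1.0)
    return corrupted, normal, mask

rng = np.random.default_rng(0)
normal = np.full((32, 32), 0.5)
x_in, x_ref, m = make_training_sample(normal, rng)
```

Outside the masked region the training input is identical to the normal reference, which is the spatial correspondence that the reconstruction loss and the segmentation loss rely on.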
[0038] Step S20: Input the training input image into the reconstruction sub-network. In the encoding stage, perform multi-band decomposition based on discrete wavelet transform, high-frequency feature extraction based on multi-scale dilated convolution, and low-frequency feature modeling based on Mamba state space model on the input features. In the decoding stage, perform frequency domain decomposition, frequency domain enhancement, and frequency domain reconstruction based on fast Fourier transform on the encoded features to generate the reconstructed image.
[0039] In this embodiment of the invention, the reconstruction sub-network is used to repair and reconstruct the anomalous input image. Its goal is to reconstruct an image containing anomalous defects into an anomaly-free image, thereby explicitly weakening the texture and structural representation of the anomalous region. The reconstruction sub-network adopts an encoder-decoder structure. As shown in Figure 3, the encoder backbone is composed of multiple WMCB modules (Wave Mamba Convolution Block, i.e. wavelet-Mamba multi-scale dilated convolution modules), and the decoder backbone consists of multiple FFTMB modules (Fourier Frequency Transform Mamba Block). It is easy to understand that this invention does not simply stack ordinary convolutional encoders, but rather uses the WMCB module to simultaneously integrate wavelet frequency domain decomposition characteristics, multi-scale spatial receptive field characteristics, and long-range dependency modeling capabilities during the encoding stage, thereby enhancing the joint representation of high-frequency details, minute defects, and global trace structures in PCB images.
[0040] Furthermore, in one executable implementation, the process of performing modeling in step S20 can be refined into steps S201 to S204.
[0041] Step S201: Map the training input image to the input feature map and perform discrete wavelet transform.
[0042] Specifically, the training input image is mapped to an input feature map $F_{in}$. Then, a discrete wavelet transform is performed on the input feature map to obtain low-frequency sub-band features and high-frequency sub-band features, satisfying: $[F_{LL}, F_{LH}, F_{HL}, F_{HH}] = \mathrm{DWT}(F_{in})$; where $F_{LL}$ represents the low-frequency sub-band features, mainly used to characterize the overall brightness distribution, coarse-scale structure, and trace topology information in PCB images; $F_{LH}$, $F_{HL}$, and $F_{HH}$ represent the high-frequency sub-band features, mainly used to characterize edge contours, texture details, and high-frequency details such as micro-cracks, burrs, and broken lines; and DWT(·) represents the discrete wavelet transform operator.
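For illustration only, the decomposition into one low-frequency sub-band and three high-frequency sub-bands can be sketched with a single-level Haar transform. The choice of the Haar wavelet is an assumption made here for brevity (the original text does not name a wavelet basis), and the function name `haar_dwt2` is hypothetical.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2-D orthonormal Haar DWT of an (H, W) array with even H and W."""
    a = x[0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[0::2, 1::2]  # top-right sample
    c = x[1::2, 0::2]  # bottom-left sample
    d = x[1::2, 1::2]  # bottom-right sample
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a + b - c - d) / 2.0  # difference between the two rows of each block
    hl = (a - b + c - d) / 2.0  # difference between the two columns of each block
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

x = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(x)
flat_ll, flat_lh, flat_hl, flat_hh = haar_dwt2(np.full((4, 4), 3.0))
```

Each sub-band has half the spatial resolution of the input. For a constant image all three high-frequency sub-bands are zero, which matches the description that they carry edges and fine detail while the low-frequency sub-band carries the coarse structure.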
[0043] Step S202: Perform multi-scale local modeling on the high-frequency subband features.
[0044] Specifically, the high-frequency sub-band features are input into the high-frequency local branch for multi-scale local feature extraction. More specifically, a 1×1 convolution mapping is performed separately on each of the high-frequency sub-band features $F_{LH}$, $F_{HL}$, and $F_{HH}$ to generate the corresponding high-frequency mapping features; multiple groups of depthwise separable dilated convolutions are then performed on the high-frequency mapping features to obtain multi-scale high-frequency local features, satisfying: $Y_i = \mathrm{DWConv}_{k_i, d_i}(\mathrm{Conv}_{1\times1}(F_h))$, $i = 1, 2, 3$; $F_{ms} = \mathrm{Concat}(Y_1, Y_2, Y_3)$; where $F_h \in \{F_{LH}, F_{HL}, F_{HH}\}$ represents any one of the high-frequency sub-band features, $Y_i$ represents the output of the $i$-th group of depthwise separable dilated convolutions, $\mathrm{DWConv}_{k_i, d_i}(\cdot)$ represents the depthwise separable dilated convolution operator with kernel size $k_i$ and dilation rate $d_i$, $\mathrm{Conv}_{1\times1}(\cdot)$ represents the 1×1 convolution mapping operator, and Concat(·) represents the channel concatenation operator. $k_i$ takes the values 3, 5, and 7 respectively, and $d_i$ takes the values 1, 2, and 3 respectively. Subsequently, 1×1 convolution fusion is performed on the multi-scale high-frequency local features $F_{ms}$ to generate the high-frequency fused features.
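To make the effect of the kernel sizes 3, 5, 7 and dilation rates 1, 2, 3 concrete: a kernel of size k with dilation rate d spans d·(k − 1) + 1 pixels along each axis, so the three groups cover 3, 9, and 19 pixels respectively. The sketch below is a simplified single-channel illustration with fixed kernels and zero padding; a depthwise convolution applies such a single-channel kernel to each channel independently. Learned weights and the 1×1 convolutions are omitted, and all function names are hypothetical.

```python
import numpy as np

def effective_extent(kernel_size, dilation):
    """Spatial extent covered by a dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1

def dilated_conv2d(x, kernel, dilation):
    """Single-channel dilated cross-correlation with zero 'same' padding (odd kernel)."""
    k = kernel.shape[0]
    pad = dilation * (k - 1) // 2
    xp = np.pad(x, pad)
    h, w = x.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(k):
        for j in range(k):
            # Each kernel tap reads the input shifted by a multiple of the dilation.
            out += kernel[i, j] * xp[i * dilation:i * dilation + h,
                                     j * dilation:j * dilation + w]
    return out

def multi_scale_branch(x, kernels, dilations):
    """Apply each (kernel, dilation) pair and stack the results as channels."""
    return np.stack([dilated_conv2d(x, k, d) for k, d in zip(kernels, dilations)])

x = np.arange(36, dtype=float).reshape(6, 6)
kernels = []
for k in (3, 5, 7):
    delta = np.zeros((k, k))
    delta[k // 2, k // 2] = 1.0  # identity kernel, used here only as a sanity check
    kernels.append(delta)
features = multi_scale_branch(x, kernels, dilations=(1, 2, 3))
extents = [effective_extent(k, d) for k, d in zip((3, 5, 7), (1, 2, 3))]
```

Pairing larger kernels with larger dilation rates widens the context seen by each group without reducing resolution, which is why the concatenated output can represent both very small defects and longer edge segments.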
[0045] Step S203: Perform Mamba long-range dependency modeling on the low-frequency subband features.
[0046] Specifically, long-range dependency modeling is performed on the low-frequency sub-band features $F_{LL}$ using the low-frequency global branch. More specifically, linear mapping and normalization are performed on the low-frequency sub-band features $F_{LL}$, and the processed low-frequency sub-band features are input into the Mamba state-space model to obtain low-frequency global features, satisfying: $F_{low} = \mathrm{Mamba}(\mathrm{Norm}(\mathrm{Linear}(F_{LL})))$; where Linear(·) represents the linear mapping operator, Norm(·) represents the normalization operator, Mamba(·) represents the state-space modeling operator, and $F_{low}$ represents the low-frequency global features.
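The way a state-space model propagates information over long distances can be illustrated with a heavily simplified linear recurrence. The sketch below is not the Mamba model itself: Mamba uses input-dependent (selective) parameters and vector-valued states, whereas this example assumes fixed scalar parameters `a`, `b`, `c` and a row-major flattening of the feature map, all of which are illustrative assumptions.

```python
import numpy as np

def linear_ssm_scan(x, a, b, c):
    """Run h_t = a * h_{t-1} + b * x_t and y_t = c * h_t over a 1-D sequence."""
    h = 0.0
    y = np.empty(len(x), dtype=float)
    for t, x_t in enumerate(x):
        h = a * h + b * x_t  # the hidden state accumulates all earlier inputs
        y[t] = c * h
    return y

feature_map = np.zeros((4, 4))
feature_map[0, 0] = 1.0          # a single impulse in the top-left corner
seq = feature_map.reshape(-1)    # row-major flattening into a sequence
y = linear_ssm_scan(seq, a=0.9, b=1.0, c=1.0).reshape(4, 4)
```

The impulse at the first position still influences the last position (with weight 0.9 to the power 15), which shows how a recurrent state carries context across the whole feature map at a cost that is linear in the sequence length.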
[0047] Step S204: Perform fusion reconstruction of low-frequency features and high-frequency features.
[0048] Specifically, the low-frequency global features and the high-frequency fused features are concatenated along the channel dimension, and convolutional fusion and inverse wavelet transform are performed to obtain the encoded output features, satisfying: $F_{enc} = \mathrm{IDWT}(\mathrm{Conv}_{3\times3}(\mathrm{Concat}(F_{low}, F_{high})))$; where $F_{low}$ represents the low-frequency global features, $F_{high}$ represents the high-frequency fused features, $\mathrm{Conv}_{3\times3}(\cdot)$ represents the 3×3 convolution fusion operator, IDWT(·) represents the inverse wavelet transform operator, and $F_{enc}$ represents the encoded output features. Subsequently, the encoded output features are downsampled and transmitted to the next encoding layer or to the bottleneck layer.
[0049] Furthermore, in this embodiment of the invention, the multi-scale fusion features from the encoder bottleneck layer are fed into the FFTMB frequency domain anomaly repair module, as shown in Figure 4. It is easy to understand that the goal of the decoding stage is not only to restore spatial resolution but, more importantly, to suppress and repair anomalous regions during reconstruction, making the output image as close as possible to a defect-free state. This creates differences between the original image and the reconstructed image in the anomalous regions that can be utilized by the discriminative sub-network. Therefore, this invention introduces the fast Fourier transform in the decoding stage to map features from the spatial domain to the frequency domain, and decomposes the resulting complex spectrum into an amplitude spectrum and a phase spectrum, which are used to model local texture details and global structural consistency, respectively.
[0050] Furthermore, in one executable implementation, the repair process in step S20 can be further refined into steps S205 to S208.
[0051] Step S205: Perform a fast Fourier transform on the encoded output features and decompose the amplitude spectrum and phase spectrum.
[0052] Specifically, a fast Fourier transform is performed on the encoded output features $F_{enc}$ to obtain a complex spectrum, and then amplitude spectrum decomposition and phase spectrum decomposition are performed on the complex spectrum, satisfying: $Z = \mathrm{FFT}(F_{enc})$; $A = |Z|$; $\Phi = \angle(Z)$; where $Z$ represents the complex spectrum; FFT(·) represents the fast Fourier transform operator; $A$ represents the amplitude spectrum, used to characterize the energy distribution of the different frequency components; $\Phi$ represents the phase spectrum, used to characterize the structural layout and texture arrangement of the image; |·| represents the modulus operator; and ∠(·) represents the phase operator. It should be noted that the amplitude spectrum and the phase spectrum express different information in the image frequency domain. The amplitude spectrum mainly reflects energy and texture intensity distribution, while the phase spectrum mainly reflects spatial layout and structural relationships. Therefore, modeling them separately is more beneficial for balancing local detail restoration with the preservation of overall structural consistency.
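As a minimal sketch of step S205, the amplitude spectrum and phase spectrum of a single-channel feature map can be obtained with NumPy's 2-D FFT. The function name `amplitude_phase` and the random test input are illustrative assumptions.

```python
import numpy as np

def amplitude_phase(features):
    """Return the amplitude and phase spectra of an (H, W) feature map."""
    spectrum = np.fft.fft2(features)   # complex spectrum
    amplitude = np.abs(spectrum)       # modulus of each frequency component
    phase = np.angle(spectrum)         # argument of each component, in radians
    return amplitude, phase

x = np.random.default_rng(0).normal(size=(8, 8))
A, Phi = amplitude_phase(x)
```

The two spectra together carry the same information as the complex spectrum, since multiplying the amplitude by the complex exponential of the phase recovers it exactly; this is what allows the two parts to be processed by separate branches.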
[0053] Step S206: Perform local frequency enhancement on the amplitude spectrum.
[0054] Specifically, convolution enhancement is performed on the amplitude spectrum to obtain enhanced amplitude features, satisfying: $A' = \mathrm{Proj}(\sigma(\mathrm{Conv}_{loc}(A)))$; where $A'$ represents the enhanced amplitude features, $\mathrm{Conv}_{loc}(\cdot)$ represents the local convolution enhancement operator, σ(·) represents the activation function, and $\mathrm{Proj}(\cdot)$ represents the channel mapping operator.
[0055] Step S207: Perform Mamba global dependency modeling on the phase spectrum.
[0056] Specifically, linear mapping, normalization, and Mamba state-space modeling are performed on the phase spectrum to obtain global phase features, satisfying: $\Phi' = \mathrm{Mamba}(\mathrm{Norm}(\mathrm{Linear}(\Phi)))$; where $\Phi'$ represents the global phase features.
[0057] Step S208: Perform frequency domain fusion, inverse Fourier transform, and skip connection residual fusion.
[0058] Specifically, the complex spectrum is reconstructed based on the enhanced amplitude features and the global phase features, and an inverse fast Fourier transform is performed to obtain the reconstructed features, satisfying: $Z' = A' \cdot e^{j\Phi'}$; $F_{rec} = \mathrm{IFFT}(Z')$; where $Z'$ represents the reconstructed complex spectrum, $e^{j\Phi'}$ represents the complex exponential phase term generated from the global phase features, $j$ represents the imaginary unit, IFFT(·) represents the inverse fast Fourier transform operator, and $F_{rec}$ represents the reconstructed features. Subsequently, residual fusion is performed between the reconstructed features and the skip connection features from the encoding stage to generate the reconstructed image.
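The frequency domain fusion and inverse transform of step S208 can be sketched as follows. Two points are assumptions not stated in the original text: the real part of the inverse transform is kept (a modified spectrum is generally no longer conjugate-symmetric, so the inverse transform may be complex), and the residual fusion is taken to be a simple addition. The function name `frequency_reconstruct` is hypothetical.

```python
import numpy as np

def frequency_reconstruct(amplitude, phase, skip=None):
    """Rebuild a complex spectrum from amplitude and phase, then invert it."""
    spectrum = amplitude * np.exp(1j * phase)   # amplitude times the phase term
    features = np.fft.ifft2(spectrum).real      # assumption: keep the real part
    if skip is None:
        return features
    return features + skip                      # assumption: additive residual fusion

x = np.random.default_rng(1).normal(size=(8, 8))
spec = np.fft.fft2(x)
restored = frequency_reconstruct(np.abs(spec), np.angle(spec))
fused = frequency_reconstruct(np.abs(spec), np.angle(spec), skip=np.ones((8, 8)))
```

When the amplitude and phase are left unchanged, the round trip returns the original features, so any change in the output is attributable to the enhancement applied to the amplitude branch and the modeling applied to the phase branch.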
[0059] Step S30: Concatenate the training input image and the reconstructed image along the channel dimension, and input the concatenation result into the discriminant sub-network to generate an anomaly probability map. Step S40: Based on the normal reference image, the reconstructed image, the anomaly mask, and the anomaly probability map, calculate the joint loss and jointly optimize the reconstruction sub-network and the discriminant sub-network to obtain an unsupervised PCB image reconstruction and defect detection model.
[0060] In this embodiment of the invention, the discriminant subnetwork is used for segmentation and detection of abnormal regions. Essentially, it is an anomaly segmentation network used to learn the difference distribution patterns between the original image and the reconstructed image. To reduce the number of parameters and improve inference efficiency, this invention employs a convolution-based U-Net structure as the discriminant subnetwork. It is easy to understand that the purpose of setting up the discriminant subnetwork is not to replace the reconstruction network for direct segmentation, but rather to utilize the structural, textural, and semantic differences between the original and reconstructed images to further transform the reconstruction residual into a more accurate pixel-level anomaly probability map, thereby improving the accuracy of anomaly region localization and segmentation boundaries.
[0061] Furthermore, in one executable implementation, step S30 can be refined into the following process: Step S301: Construct the input features for discrimination and output the anomaly probability map.
[0062] Specifically, the training input image or the PCB image to be detected is denoted as $I$, and the reconstructed image is denoted as $\hat{I}$. The two are then concatenated along the channel dimension to obtain the discriminative input features, satisfying: $F_d = \mathrm{Concat}(I, \hat{I})$; where $F_d$ represents the discriminative input features and Concat(·) represents the channel concatenation operator. Subsequently, the discriminative input features are fed into a convolution-based U-Net discriminant subnetwork to output an anomaly probability map $P$ with the same spatial dimensions as the input image.
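The construction of the discriminative input features amounts to stacking the two images along the channel axis. The sketch below assumes a channel-first (C, H, W) layout, which the original text does not specify; the function name `discriminator_input` is hypothetical.

```python
import numpy as np

def discriminator_input(image, reconstruction):
    """Concatenate an image and its reconstruction along the channel axis."""
    if image.shape != reconstruction.shape:
        raise ValueError("image and reconstruction must have the same shape")
    return np.concatenate([image, reconstruction], axis=0)  # (2C, H, W)

image = np.zeros((3, 16, 16))
recon = np.ones((3, 16, 16))
d_in = discriminator_input(image, recon)
```

Supplying both images rather than only their difference leaves it to the discriminant subnetwork to learn which differences indicate a defect, in line with the statement that it exploits structural, textural, and semantic differences.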
[0063] Furthermore, in one executable implementation, step S40 can be refined into the following process: Step S401: Calculate the reconstruction loss.
[0064] Specifically, the reconstructed image is denoted as $\hat{I}$ and the normal reference image is denoted as $I_n$, and the reconstruction loss is calculated based on the reconstructed image and the normal reference image, satisfying: $L_{rec} = \lambda_1 \cdot \mathrm{MSE}(\hat{I}, I_n) + \lambda_2 \cdot (1 - \mathrm{SSIM}(\hat{I}, I_n))$; where $L_{rec}$ represents the reconstruction loss, MSE(·) represents the mean squared error loss function, SSIM(·) represents the structural similarity function, and $\lambda_1$ and $\lambda_2$ represent the loss weighting coefficients.
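A reconstruction loss that combines a mean squared error term with a structural similarity term can be sketched as below. For brevity the structural similarity is computed from global image statistics, whereas practical implementations usually average it over local windows; the constants 0.01 and 0.03 are the commonly used defaults. The use of one minus the similarity as the penalty, the weights `w_mse` and `w_ssim`, and the function names are assumptions made for this illustration.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Structural similarity from global statistics (a single window)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def reconstruction_loss(recon, ref, w_mse=1.0, w_ssim=1.0):
    """Weighted sum of the MSE and (1 - SSIM)."""
    mse = np.mean((recon - ref) ** 2)
    return float(w_mse * mse + w_ssim * (1.0 - global_ssim(recon, ref)))

ref = np.linspace(0.0, 1.0, 64).reshape(8, 8)
loss_same = reconstruction_loss(ref, ref)
loss_flipped = reconstruction_loss(ref[::-1], ref)
```

The error term penalizes pixel-wise intensity deviations, while the similarity term penalizes changes in local structure; a perfect reconstruction yields a loss of zero.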
[0065] Step S402: Calculate the segmentation loss and the joint loss.
[0066] Specifically, let $P$ be the anomaly probability map and $M$ be the anomaly mask, and calculate the segmentation loss based on the anomaly probability map and the anomaly mask, satisfying: $L_{seg} = \mathrm{Focal}(P, M)$; where $L_{seg}$ represents the segmentation loss and Focal(·) represents the focal loss function. Further, a joint loss is calculated based on the reconstruction loss and the segmentation loss, satisfying: $L = L_{rec} + \gamma \cdot L_{seg}$; where $L$ represents the joint loss, $L_{rec}$ represents the reconstruction loss, and $\gamma$ represents the weighting coefficient of the segmentation loss.
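The segmentation loss and the joint loss can be sketched as follows, using the standard binary focal loss. The focusing parameter 2.0 and the class-balancing factor 0.25 are the commonly cited defaults and are not values given in the original text. To avoid confusion with the weighting coefficient γ of the joint loss, the focusing parameter is named `focusing` here; all function names are hypothetical.

```python
import numpy as np

def focal_loss(prob, mask, focusing=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss averaged over all pixels."""
    p = np.clip(prob, eps, 1.0 - eps)
    p_t = np.where(mask == 1, p, 1.0 - p)            # probability of the true class
    alpha_t = np.where(mask == 1, alpha, 1.0 - alpha)
    # Well-classified pixels (p_t close to 1) are down-weighted by (1 - p_t)^focusing.
    return float(np.mean(-alpha_t * (1.0 - p_t) ** focusing * np.log(p_t)))

def joint_loss(l_rec, l_seg, gamma=1.0):
    """Reconstruction loss plus gamma times the segmentation loss."""
    return l_rec + gamma * l_seg

mask = np.array([[0, 1], [0, 1]])
good = np.array([[0.1, 0.9], [0.2, 0.8]])
bad = np.array([[0.9, 0.1], [0.8, 0.2]])
l_good, l_bad = focal_loss(good, mask), focal_loss(bad, mask)
total = joint_loss(0.2, 0.5, gamma=2.0)
```

Because defect pixels usually occupy a small fraction of a PCB image, a loss that down-weights easy background pixels is a natural fit for the segmentation branch.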
[0067] Step S403: Perform joint optimization of the reconstruction subnetwork and the discriminant subnetwork based on the joint loss.
[0068] Specifically, backpropagation updates are performed on the reconstruction subnetwork and the discriminant subnetwork based on the joint loss to obtain an unsupervised PCB image reconstruction and defect detection model.
[0069] Step S50: Input the PCB image to be detected into the jointly optimized unsupervised PCB image reconstruction and defect detection model, output a defect mask map, and determine the location, outline and area information of the defect region based on the defect mask map.
[0070] In this embodiment of the invention, the overall flow of the inference phase is consistent with the data flow of the training phase, except that a normal reference image and anomaly mask are no longer required. Specifically, the inference phase includes the following processing steps: First, the PCB image to be detected is input into the reconstruction sub-network to obtain a reconstructed image; second, the PCB image to be detected and the reconstructed image are concatenated along the channel dimension and used as input to the discriminant sub-network; third, the discriminant sub-network outputs a predicted anomaly mask map or anomaly probability map based on the differences between the original image and the reconstructed image at the structural, texture, and semantic levels; finally, the anomaly probability map is thresholded, morphologically filtered, and analyzed by connected components to obtain the final location, contour, and area information of the defect region.
[0071] Furthermore, in one executable implementation, step S50 can be refined into steps S501 to S503.
[0072] Step S501: Input the PCB image to be detected and generate an anomaly probability map.
[0073] Specifically, the PCB image to be detected is sequentially input into the reconstruction subnetwork and the discrimination subnetwork to generate the corresponding anomaly probability map.
[0074] Step S502: Perform thresholding to obtain a binary defect mask.
[0075] Specifically, thresholding is performed on the anomaly probability map to obtain a binary defect mask, satisfying: $B(x,y) = \begin{cases} 1, & P(x,y) \ge \tau \\ 0, & P(x,y) < \tau \end{cases}$; where $B(x,y)$ represents the binary defect mask value at coordinate $(x,y)$, $P(x,y)$ represents the anomaly probability value at coordinate $(x,y)$, and $\tau$ represents the threshold parameter.
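The thresholding step can be written in one line. The threshold value 0.5 and the convention that a probability equal to the threshold counts as a defect are assumptions for this illustration.

```python
import numpy as np

def binarize(prob_map, tau=0.5):
    """Mark pixels whose anomaly probability reaches the threshold."""
    return (prob_map >= tau).astype(np.uint8)

P = np.array([[0.10, 0.50],
              [0.90, 0.49]])
B = binarize(P)
```

In practice the threshold would be tuned on held-out data to balance missed defects against false alarms.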
[0076] Step S503: Perform post-processing and output defect area information.
[0077] Specifically, morphological filtering and connected component analysis are performed on the binary defect mask to remove noise regions and extract candidate defect regions; then, the position, contour and area information of the defect regions are output based on the candidate defect regions.
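The post-processing of step S503 can be sketched with a breadth-first connected component search. Two simplifications are assumed here: small components are removed by a minimum-area filter, used as a simple stand-in for morphological filtering, and the outline of each region is summarized by its bounding box. The 4-connectivity, the parameter `min_area`, and the function name `defect_regions` are illustrative choices.

```python
import numpy as np
from collections import deque

def defect_regions(binary_mask, min_area=2):
    """Return bounding box and area of each 4-connected region in a binary mask."""
    h, w = binary_mask.shape
    seen = np.zeros((h, w), dtype=bool)
    regions = []
    for sy in range(h):
        for sx in range(w):
            if binary_mask[sy, sx] == 0 or seen[sy, sx]:
                continue
            # Breadth-first search collects all pixels connected to the seed.
            queue = deque([(sy, sx)])
            seen[sy, sx] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary_mask[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_area:  # drop isolated noise pixels
                ys, xs = zip(*pixels)
                regions.append({"bbox": (min(ys), min(xs), max(ys), max(xs)),
                                "area": len(pixels)})
    return regions

mask = np.zeros((6, 6), dtype=np.uint8)
mask[1:3, 1:4] = 1   # a 2x3 candidate defect
mask[5, 5] = 1       # an isolated noise pixel
regions = defect_regions(mask)
```

Each returned entry gives the position (bounding box as top, left, bottom, right) and the area in pixels of one candidate defect region.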
[0078] As shown in Figure 5, embodiments of the present invention also provide an unsupervised PCB image reconstruction and defect detection system, the system being used to perform the aforementioned method. The system includes: a construction unit, used to acquire a normal PCB image sample set and perform normalization processing, size alignment processing, and pseudo-anomaly construction processing on the normal PCB image sample set to generate a training input image, a normal reference image, and an anomaly mask; a reconstruction unit, used to input the training input image into the reconstruction sub-network, where in the encoding stage it performs multi-band decomposition based on discrete wavelet transform, high-frequency feature extraction based on multi-scale dilated convolution, and low-frequency feature modeling based on the Mamba state-space model, and in the decoding stage it performs frequency domain decomposition, frequency domain enhancement, and frequency domain reconstruction based on fast Fourier transform to generate the reconstructed image; a discrimination unit, used to concatenate the training input image and the reconstructed image along the channel dimension and input the concatenation result into the discriminant subnetwork to generate an anomaly probability map; an optimization unit, used to calculate the joint loss based on the normal reference image, the reconstructed image, the anomaly mask, and the anomaly probability map, and to jointly optimize the reconstruction subnetwork and the discriminant subnetwork, so as to obtain an unsupervised PCB image reconstruction and defect detection model; and an output unit, used to input the PCB image to be detected into the jointly optimized unsupervised PCB image reconstruction and defect detection model, output a defect mask map, and determine the location, outline, and area information of the defect region based on the defect mask map.
[0079] Those skilled in the art will understand that all or part of the steps in the methods of the above embodiments can be implemented by a program instructing related hardware. This program is stored in a storage medium and includes several instructions to cause a microcontroller, chip, or processor to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a portable hard drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
[0080] The optional embodiments of the present invention have been described in detail above with reference to the accompanying drawings. However, the embodiments of the present invention are not limited to the specific details described above. Within the scope of the technical concept of the embodiments of the present invention, various simple modifications can be made to the technical solutions of the embodiments of the present invention, and these simple modifications all fall within the protection scope of the embodiments of the present invention. It should also be noted that the various specific technical features described in the above specific embodiments can be combined in any suitable manner without contradiction. To avoid unnecessary repetition, the embodiments of the present invention will not further describe the various possible combinations.
[0081] Furthermore, various different embodiments of the present invention can be combined in any way, as long as they do not violate the spirit of the embodiments of the present invention, they should also be regarded as the content disclosed by the embodiments of the present invention.
Claims
1. An unsupervised PCB image reconstruction and defect detection method, characterized in that it comprises: acquiring a normal PCB image sample set, and performing normalization, size alignment, and pseudo-anomaly construction on the normal PCB image sample set to generate a training input image, a normal reference image, and an anomaly mask; inputting the training input image into a reconstruction sub-network, wherein in the encoding stage, multi-band decomposition based on discrete wavelet transform, high-frequency feature extraction based on multi-scale dilated convolution, and low-frequency feature modeling based on a Mamba state-space model are performed on the input features, and in the decoding stage, frequency domain decomposition, frequency domain enhancement, and frequency domain reconstruction based on fast Fourier transform are performed on the encoded features to generate a reconstructed image; concatenating the training input image and the reconstructed image along the channel dimension, and inputting the concatenation result into a discriminant sub-network to generate an anomaly probability map; calculating a joint loss based on the normal reference image, the reconstructed image, the anomaly mask, and the anomaly probability map, and jointly optimizing the reconstruction sub-network and the discriminant sub-network to obtain an unsupervised PCB image reconstruction and defect detection model; and inputting a PCB image to be detected into the jointly optimized unsupervised PCB image reconstruction and defect detection model, outputting a defect mask map, and determining the location, outline, and area information of the defect region based on the defect mask map.
2. The unsupervised PCB image reconstruction and defect detection method according to claim 1, characterized in that acquiring the normal PCB image sample set, and performing normalization, size alignment, and pseudo-anomaly construction on the normal PCB image sample set to generate the training input image, the normal reference image, and the anomaly mask comprises: acquiring images containing only defect-free PCBs as the normal PCB image sample set; performing normalization and size alignment on each normal PCB image in the normal PCB image sample set to generate a preprocessed normal image; randomly selecting a target region in the preprocessed normal image, and performing at least one of the following processes on the target region to generate a pseudo-anomaly region: noise injection, local occlusion, texture replacement, brightness anomaly processing, contrast anomaly processing, local blurring, or local sharpening; mapping the pseudo-anomaly region back to the corresponding preprocessed normal image to generate the training input image, and simultaneously generating an anomaly mask representing the spatial location of the pseudo-anomaly region; and using the preprocessed normal image as the normal reference image, the training input image, the normal reference image, and the anomaly mask forming a training sample pair.
3. The unsupervised PCB image reconstruction and defect detection method according to claim 1, characterized in that inputting the training input image into the reconstruction sub-network and performing, in the encoding stage, multi-band decomposition based on discrete wavelet transform, high-frequency feature extraction based on multi-scale dilated convolution, and low-frequency feature modeling based on the Mamba state-space model on the input features comprises: mapping the training input image to an input feature map $F_{in}$; performing a discrete wavelet transform on the input feature map $F_{in}$ to obtain low-frequency sub-band features and high-frequency sub-band features, satisfying: $[F_{LL}, F_{LH}, F_{HL}, F_{HH}] = \mathrm{DWT}(F_{in})$; where $F_{LL}$ represents the low-frequency sub-band features, $F_{LH}$, $F_{HL}$, and $F_{HH}$ represent the high-frequency sub-band features, and DWT(·) represents the discrete wavelet transform operator; inputting the high-frequency sub-band features into a high-frequency local branch for multi-scale local feature extraction; and inputting the low-frequency sub-band features into a low-frequency global branch for long-range dependency modeling, performing feature fusion on the output of the low-frequency global branch and the output of the high-frequency local branch, and transmitting the fusion result to the decoding stage.
4. The unsupervised PCB image reconstruction and defect detection method according to claim 3, characterized in that inputting the high-frequency sub-band features into the high-frequency local branch for multi-scale local feature extraction comprises: performing a 1×1 convolution mapping separately on each of the high-frequency sub-band features $F_{LH}$, $F_{HL}$, and $F_{HH}$ to generate the corresponding high-frequency mapping features; performing multiple groups of depthwise separable dilated convolutions on the high-frequency mapping features to obtain multi-scale high-frequency local features, satisfying: $Y_i = \mathrm{DWConv}_{k_i, d_i}(\mathrm{Conv}_{1\times1}(F_h))$, $i = 1, 2, 3$; $F_{ms} = \mathrm{Concat}(Y_1, Y_2, Y_3)$; where $F_h \in \{F_{LH}, F_{HL}, F_{HH}\}$ represents any one of the high-frequency sub-band features, $Y_i$ represents the output of the $i$-th group of depthwise separable dilated convolutions, $\mathrm{DWConv}_{k_i, d_i}(\cdot)$ represents the depthwise separable dilated convolution operator with kernel size $k_i$ and dilation rate $d_i$, $\mathrm{Conv}_{1\times1}(\cdot)$ represents the 1×1 convolution mapping operator, Concat(·) represents the channel concatenation operator, $k_i$ takes the values 3, 5, and 7 respectively, and $d_i$ takes the values 1, 2, and 3 respectively; and performing 1×1 convolution fusion on the multi-scale high-frequency local features $F_{ms}$ to generate high-frequency fused features.
5. The unsupervised PCB image reconstruction and defect detection method according to claim 4, characterized in that inputting the low-frequency sub-band features into the low-frequency global branch for long-range dependency modeling, and performing feature fusion on the output of the low-frequency global branch and the output of the high-frequency local branch comprises: performing linear mapping and normalization on the low-frequency sub-band features $F_{LL}$; inputting the processed low-frequency sub-band features into the Mamba state-space model to obtain low-frequency global features, satisfying: $F_{low} = \mathrm{Mamba}(\mathrm{Norm}(\mathrm{Linear}(F_{LL})))$; where Linear(·) represents the linear mapping operator, Norm(·) represents the normalization operator, Mamba(·) represents the state-space modeling operator, and $F_{low}$ represents the low-frequency global features; concatenating the low-frequency global features and the high-frequency fused features along the channel dimension, and performing convolutional fusion and inverse wavelet transform to obtain encoded output features, satisfying: $F_{enc} = \mathrm{IDWT}(\mathrm{Conv}_{3\times3}(\mathrm{Concat}(F_{low}, F_{high})))$; where $F_{high}$ represents the high-frequency fused features, $\mathrm{Conv}_{3\times3}(\cdot)$ represents the 3×3 convolution fusion operator, IDWT(·) represents the inverse wavelet transform operator, and $F_{enc}$ represents the encoded output features; and downsampling the encoded output features and transmitting them to the next encoding layer or to the bottleneck layer.
6. The unsupervised PCB image reconstruction and defect detection method according to claim 1, characterized in that performing, in the decoding stage, frequency domain decomposition, frequency domain enhancement, and frequency domain reconstruction based on fast Fourier transform on the encoded features to generate the reconstructed image comprises: performing a fast Fourier transform on the encoded features $F_{enc}$ to obtain a complex spectrum, and performing amplitude spectrum decomposition and phase spectrum decomposition on the complex spectrum, satisfying: $Z = \mathrm{FFT}(F_{enc})$; $A = |Z|$; $\Phi = \angle(Z)$; where $Z$ denotes the complex spectrum, FFT(·) denotes the fast Fourier transform operator, $A$ denotes the amplitude spectrum, $\Phi$ denotes the phase spectrum, |·| denotes the modulus operator, and ∠(·) denotes the phase operator; performing convolution enhancement on the amplitude spectrum, and performing linear mapping, normalization, and Mamba state-space modeling on the phase spectrum, to obtain enhanced amplitude features and global phase features, satisfying: $A' = \mathrm{Proj}(\sigma(\mathrm{Conv}_{loc}(A)))$; $\Phi' = \mathrm{Mamba}(\mathrm{Norm}(\mathrm{Linear}(\Phi)))$; where $A'$ represents the enhanced amplitude features, $\Phi'$ represents the global phase features, $\mathrm{Conv}_{loc}(\cdot)$ represents the local convolution enhancement operator, $\mathrm{Proj}(\cdot)$ represents the channel mapping operator, and σ(·) represents the activation function; reconstructing the complex spectrum based on the enhanced amplitude features and the global phase features, and performing an inverse fast Fourier transform to obtain reconstructed features, satisfying: $Z' = A' \cdot e^{j\Phi'}$; $F_{rec} = \mathrm{IFFT}(Z')$; where $Z'$ represents the reconstructed complex spectrum, $e^{j\Phi'}$ represents the complex exponential phase term generated from the global phase features, $j$ represents the imaginary unit, IFFT(·) represents the inverse fast Fourier transform operator, and $F_{rec}$ represents the reconstructed features; and performing residual fusion between the reconstructed features and the skip connection features from the encoding stage to generate the reconstructed image.
7. The unsupervised PCB image reconstruction and defect detection method according to claim 1, characterized in that concatenating the training input image and the reconstructed image along the channel dimension, and inputting the concatenation result into the discriminant sub-network to generate the anomaly probability map comprises: denoting the training input image or the PCB image to be detected as $I$ and the reconstructed image as $\hat{I}$, and concatenating the two along the channel dimension to obtain discriminative input features, satisfying: $F_d = \mathrm{Concat}(I, \hat{I})$; where $F_d$ represents the discriminative input features and Concat(·) represents the channel concatenation operator; inputting the discriminative input features into a convolution-based U-Net discriminant sub-network to output an anomaly probability map $P$ with the same spatial size as the input image; and, during the inference phase, sequentially inputting the PCB image to be detected into the reconstruction sub-network and the discriminant sub-network to generate the corresponding anomaly probability map.
8. The unsupervised PCB image reconstruction and defect detection method according to claim 1, characterized in that calculating the joint loss based on the normal reference image, the reconstructed image, the anomaly mask, and the anomaly probability map, and jointly optimizing the reconstruction sub-network and the discriminant sub-network comprises: denoting the reconstructed image as $\hat{I}$, the normal reference image as $I_n$, the anomaly probability map as $P$, and the anomaly mask as $M$; calculating the reconstruction loss based on the reconstructed image and the normal reference image, satisfying: $L_{rec} = \lambda_1 \cdot \mathrm{MSE}(\hat{I}, I_n) + \lambda_2 \cdot (1 - \mathrm{SSIM}(\hat{I}, I_n))$; where $L_{rec}$ represents the reconstruction loss, MSE(·) represents the mean squared error loss function, SSIM(·) represents the structural similarity function, and $\lambda_1$ and $\lambda_2$ represent the loss weighting coefficients; calculating the segmentation loss based on the anomaly probability map and the anomaly mask, satisfying: $L_{seg} = \mathrm{Focal}(P, M)$; where $L_{seg}$ represents the segmentation loss and Focal(·) represents the focal loss function; calculating the joint loss based on the reconstruction loss and the segmentation loss, satisfying: $L = L_{rec} + \gamma \cdot L_{seg}$; where $L$ represents the joint loss and $\gamma$ represents the weighting coefficient of the segmentation loss; and performing backpropagation updates on the reconstruction sub-network and the discriminant sub-network based on the joint loss to obtain the unsupervised PCB image reconstruction and defect detection model.
9. The unsupervised PCB image reconstruction and defect detection method according to claim 1, characterized in that determining the location, outline, and area information of the defect region based on the defect mask map comprises: performing thresholding on the anomaly probability map to obtain a binary defect mask, satisfying: $B(x,y) = \begin{cases} 1, & P(x,y) \ge \tau \\ 0, & P(x,y) < \tau \end{cases}$; where $B(x,y)$ represents the binary defect mask value at coordinate $(x,y)$, $P(x,y)$ represents the anomaly probability value at coordinate $(x,y)$, and $\tau$ represents the threshold parameter; performing morphological filtering and connected component analysis on the binary defect mask to remove noise regions and extract candidate defect regions; and outputting the location, outline, and area information of the defect region based on the candidate defect regions.
10. An unsupervised PCB image reconstruction and defect detection system, characterized in that the system is used to execute the unsupervised PCB image reconstruction and defect detection method according to any one of claims 1-9, and the system includes: a construction unit, used to acquire a normal PCB image sample set and perform normalization processing, size alignment processing, and pseudo-anomaly construction processing on the normal PCB image sample set to generate a training input image, a normal reference image, and an anomaly mask; a reconstruction unit, used to input the training input image into the reconstruction sub-network, where in the encoding stage it performs multi-band decomposition based on discrete wavelet transform, high-frequency feature extraction based on multi-scale dilated convolution, and low-frequency feature modeling based on the Mamba state-space model, and in the decoding stage it performs frequency domain decomposition, frequency domain enhancement, and frequency domain reconstruction based on fast Fourier transform to generate the reconstructed image; a discrimination unit, used to concatenate the training input image and the reconstructed image along the channel dimension and input the concatenation result into the discriminant sub-network to generate an anomaly probability map; an optimization unit, used to calculate the joint loss based on the normal reference image, the reconstructed image, the anomaly mask, and the anomaly probability map, and to jointly optimize the reconstruction sub-network and the discriminant sub-network, so as to obtain an unsupervised PCB image reconstruction and defect detection model; and an output unit, used to input the PCB image to be detected into the jointly optimized unsupervised PCB image reconstruction and defect detection model, output a defect mask map, and determine the location, outline, and area information of the defect region based on the defect mask map.