Anomaly detection method and apparatus, electronic device, and storage medium
By normalizing and reducing the dimensions of the target image features and using a preset deviation threshold for anomaly detection, the problem of difficulty in selecting a threshold due to the complex feature distribution in the target space is solved, and more accurate anomaly detection is achieved.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- SUZHOU MEGAROBO TECH CO LTD
- Filing Date
- 2025-09-29
- Publication Date
- 2026-06-18
AI Technical Summary
When performing anomaly detection based on features of the target space, it is difficult to select a threshold to determine whether an anomaly exists, especially when the feature distribution is complex and the distribution pattern is not obvious.
By normalizing the features of the target image using preset normalization parameters, and using the feature distribution information of the first positive sample image, dimensionality reduction operation and inverse Laplacian transform, the degree of deviation of the target image relative to the first positive sample image is determined, and anomaly detection is performed based on the degree of deviation and the preset deviation threshold.
This makes the distribution pattern of target image features predictable, allows for accurate setting of appropriate preset deviation thresholds, reduces background information interference, and improves the accuracy of anomaly detection.
Abstract
Description
Anomaly detection method and apparatus, electronic device, and storage medium
[0001] This application claims priority to Chinese Patent Application No. 202411829987.8, filed on December 12, 2024, entitled "Anomaly Detection Method, Apparatus, Electronic Device, Storage Medium", the entire contents of which are incorporated herein by reference.
Technical Field
[0002] This application relates to the field of industrial inspection, and more specifically, to an anomaly detection method, apparatus, electronic device, storage medium, and computer program product.
Background Technology
[0003] Anomaly detection is widely used in industry, and normalizing flow-based methods are one important approach. Such methods map the image features of a sample image to a target space, where anomaly detection is then performed. However, because the feature distribution in the target space is complex and its patterns are not obvious, it is difficult to select an appropriate threshold for deciding whether an anomaly is present when detection is based on features of the target space.
Summary of the Invention
[0004] This application is made in consideration of the above-mentioned problems. This application provides an anomaly detection method, apparatus, electronic device, storage medium, and computer program product, which makes the distribution pattern of target image features predictable at least when approaching a first positive sample image, thereby making it possible to set a preset deviation threshold based on this predictable distribution pattern.
[0005] According to a first aspect of this application, an anomaly detection method is provided, comprising: inputting a target image acquired for a target object into a trained feature extraction network to obtain target image features of the target image; normalizing the target image features based on preset normalization parameters to obtain normalized features of the target image, wherein the preset normalization parameters are statistical information obtained by statistically analyzing the distribution of feature vector values of multiple sets of first sample image features, the multiple sets of first sample image features are image features obtained by inputting multiple first positive sample images into the trained feature extraction network, and the first positive sample images are images acquired for a non-abnormal object; determining the degree of deviation of the target image relative to the first positive sample images based on the normalized features; and determining an anomaly detection result of the target image based on the degree of deviation and a preset deviation threshold, wherein the anomaly detection result is used to indicate whether there is an abnormal region in the target object.
[0006] In one possible implementation, determining the degree of deviation of the target image relative to the first positive sample image based on normalized features includes: performing a dimensionality reduction operation by averaging the feature vector values of each channel of the normalized features to obtain an abnormal heatmap, wherein the pixel values in the abnormal heatmap are used to represent the degree of deviation of the target image relative to the first positive sample image; and the pixel values of each heatmap region are used to represent the degree of deviation of the image region in the target image corresponding to that heatmap region relative to the image region in the first positive sample image corresponding to that heatmap region.
[0007] In one possible implementation, before performing dimensionality reduction by averaging the feature vector values of each channel of the normalized features to obtain the abnormal heatmap, determining the degree of deviation of the target image from the first positive sample image based on the normalized features further includes: converting the scale of the feature map contained in the normalized features to be consistent with the scale of the target image by using an inverse Laplacian transform.
[0008] In one possible implementation, before determining the anomaly detection result of the target image based on the degree of deviation and a preset deviation threshold, the method further includes setting a preset deviation threshold based on a preset normalization parameter.
[0009] In one possible implementation, the preset normalization parameters include one or more of the following: mean, standard deviation, variance, range, median absolute deviation, and interquartile range.
[0010] In one possible implementation, the training process of the trained feature extraction network is as follows: acquiring multiple second positive sample images, which include one or more sample pairs, each sample pair including two positive sample images; inputting each sample pair into the initial feature extraction network to obtain the second sample image features and third sample image features corresponding one-to-one with the two positive sample images in each sample pair; calculating the loss value based on the second sample image features and third sample image features corresponding to each sample pair, and using the loss value to optimize the initial feature extraction network until the preset training requirements are met, so as to obtain the trained feature extraction network.
[0011] In one possible implementation, the plurality of second positive sample images are identical to the plurality of first positive sample images.
[0012] In one possible implementation, inputting the target image acquired for the target object into a trained feature extraction network to obtain the target image features of the target image includes: extracting multi-scale image features of the target image in the trained feature extraction network, and converting the multi-scale image features into target image features of a preset scale through normalizing flow processing.
[0013] According to a second aspect of this application, an anomaly detection device is also provided, comprising: an input module for inputting a target image acquired for a target object into a trained feature extraction network to obtain target image features of the target image; a processing module for normalizing the target image features based on preset normalization parameters to obtain normalized features of the target image, wherein the preset normalization parameters are statistical information obtained by statistically analyzing the distribution of feature vector values of multiple sets of first sample image features, the multiple sets of first sample image features are image features obtained by inputting multiple first positive sample images into the trained feature extraction network, and the first positive sample images are images acquired for non-abnormal objects; a first determination module for determining the degree of deviation of the target image relative to the first positive sample images based on the normalized features; and a second determination module for determining the anomaly detection result of the target image based on the degree of deviation and a preset deviation threshold, wherein the anomaly detection result is used to indicate whether there is an abnormal region in the target object.
[0014] According to a third aspect of this application, an electronic device is also provided, comprising: a processor and a memory, wherein the memory stores computer program instructions, which are executed by the processor to perform the above-described anomaly detection method.
[0015] According to a fourth aspect of this application, a storage medium is also provided, on which program instructions are stored, which are used to execute the above-described anomaly detection method during runtime.
[0016] According to a fifth aspect of this application, a computer program product is also provided, including computer program instructions that, when run, are used to perform the anomaly detection method as described above.
[0017] The aforementioned technical solution normalizes the target image features using preset normalization parameters. Based on the statistical information of the first sample image features from the first positive sample image, it tends to converge the feature vector values of target image features with high dispersion, large distribution range, and no obvious distribution pattern to a relatively small numerical range. This reduces the dispersion of the feature vector values of the obtained normalized features and makes the distribution pattern more obvious. This processing method makes the distribution pattern of the target image features predictable, at least when approaching the first positive sample image. Therefore, it becomes possible to set a preset deviation threshold based on this predictable distribution pattern. In other words, this processing method helps to set an appropriate preset deviation threshold for anomaly detection. Furthermore, the target image features output by this solution can reduce interference from background information, such as texture interference from materials like metal.
[0018] The above description is only an overview of the technical solution of this application. In order to better understand the technical means of this application and to implement it in accordance with the contents of the specification, and to make the above and other objects, features and advantages of this application more obvious and understandable, the following are specific embodiments of this application.
Attached Figure Description
[0019] The above and other objects, features, and advantages of this application will become more apparent from the more detailed description of the embodiments of this application in conjunction with the accompanying drawings. The accompanying drawings are used to provide a further understanding of the embodiments of this application and form part of the specification. They are used together with the embodiments of this application to explain this application and do not constitute a limitation thereof. In the accompanying drawings, the same reference numerals generally represent the same components or steps.
[0020] Figure 1 shows a schematic flowchart of an anomaly detection method according to an embodiment of this application;
[0021] Figure 2 shows a schematic diagram comparing a target image and a corresponding anomaly heatmap according to an embodiment of this application;
[0022] Figure 3 shows a schematic flowchart of the normalization process of target image features obtained after inputting a target image into a trained feature extraction network according to an embodiment of the present application;
[0023] Figure 4 shows a schematic diagram of the training process of an initial feature extraction network according to an embodiment of this application;
[0024] Figure 5 shows a schematic block diagram of an anomaly detection device according to an embodiment of this application;
[0025] Figure 6 shows a schematic block diagram of an electronic device according to an embodiment of the present application.
Detailed Implementation
[0026] To make the objectives, technical solutions, and advantages of this application more apparent, exemplary embodiments according to this application will be described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are merely some embodiments of this application, and not all embodiments of this application. It should be understood that this application is not limited to the exemplary embodiments described herein. Based on the embodiments of this application described herein, all other embodiments obtained by those skilled in the art without inventive effort should fall within the protection scope of this application.
[0027] As mentioned above, in related technologies, selecting a threshold for determining the presence of an anomaly when performing anomaly detection based on features of the target space is relatively difficult. To at least partially solve the above-mentioned technical problems, embodiments of this application provide an anomaly detection method, apparatus, electronic device, storage medium, and computer program product. This solution makes the distribution pattern of target image features predictable at least when approaching a first positive sample image, thereby making it possible to set a preset deviation threshold based on this predictable distribution pattern.
[0028] Please refer to Figure 1, which is a schematic flowchart of an anomaly detection method according to an embodiment of this application. According to a first aspect of this application, an anomaly detection method is provided, the method including: steps S110, S120, S130 and S140.
[0029] In step S110, the target image acquired for the target object is input into the trained feature extraction network to obtain the target image features.
[0030] In one possible implementation, the target object can be any type of object, such as a wafer, chip, or electronic component. An image acquisition device (e.g., a camera) can be used to acquire an image of the target object to obtain a target image. The target image can be the acquired raw image, or it can be an image obtained after image preprocessing (e.g., smoothing, denoising) of the raw image. Inputting the target image into a trained feature extraction network yields the target image features. Specifically, in the feature extraction network, features can be extracted from the target image to obtain multiple image features at different scales, and normalizing flow processing and an inverse Laplacian transform are performed on these image features. The normalizing flow processing can include mapping the multiple image features at different scales to a target space (also known as a "latent space"). The multiple image features mapped to the target space can be merged after normalization processing and the inverse Laplacian transform to obtain the target image features. More specifically, the feature extraction network can be a normalizing flow-based neural network, such as a neural network based on PyramidFlow, a neural network based on FastFlow, or a neural network based on multi-scale normalizing flows (MSFlow).
[0031] In step S120, the target image features are normalized based on preset normalization parameters to obtain normalized features of the target image. The preset normalization parameters are statistical information obtained by statistically analyzing the distribution of feature vector values of multiple sets of first sample image features. The multiple sets of first sample image features are image features obtained by inputting multiple first positive sample images into a trained feature extraction network. The first positive sample images are images acquired for non-abnormal objects.
[0032] In one possible implementation, after obtaining the target image features, the target image features can be normalized based on a preset normalization parameter. The process of obtaining the preset normalization parameter may include: inputting multiple first positive sample images into a trained feature extraction network to obtain multiple sets of first sample image features corresponding one-to-one with the multiple first positive sample images; and using statistical information obtained by statistically analyzing the distribution of feature vector values of the multiple sets of first sample image features as the preset normalization parameter. In this embodiment, the feature vector values of an image feature can be understood as the elements of the feature vector representing that image feature. For example, if an image feature is represented by a feature vector of size H×W×C, the feature vector includes H×W×C elements, or feature vector values. The statistical information may be any one or more parameters that describe the data distribution, such as the mean, variance, standard deviation, or quantiles. Taking the mean and standard deviation of the feature vector values of the multiple sets of first sample image features as the preset normalization parameters, and denoting the feature vector value of the normalized feature as x', the normalized feature can be determined in one possible implementation using the formula x' = (x - μ) / σ, where x is the feature vector value of the target image feature, and μ and σ are the mean and standard deviation of the feature vector values of the multiple sets of first sample image features, respectively. It can be understood that the larger the absolute value of the difference between the feature vector value of the target image feature and the mean of the feature vector values of the multiple sets of first sample image features, the larger the absolute value of the feature vector value of the obtained normalized feature. 
Therefore, the feature vector value of the normalized feature can reflect the deviation of the target image feature from the first sample image feature to a certain extent. It should be noted that the preset normalization parameters used in the normalization processing of this application embodiment are not limited to the mean and standard deviation; other parameters such as variance and quantiles can also be used to obtain the normalized feature of the target image. In one possible implementation, the first positive sample image can be an image acquired by an image acquisition device for a non-abnormal object. The objects contained in the first positive sample image can be any type of object, such as wafers, chips, electronic components, etc., and the type of objects contained in the first positive sample image can be the same as the type of the target object.
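The mean and standard deviation normalization described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the function names, the (N, H, W, C) array layout, the small `eps` guard, and the choice to pool statistics over all feature vector values (rather than per channel or per position) are assumptions, and random arrays stand in for features produced by a trained feature extraction network.

```python
import numpy as np


def fit_normalization_params(positive_features):
    """Return (mu, sigma) pooled over all feature vector values.

    positive_features: array of shape (N, H, W, C), one H x W x C feature
    per first positive sample image (layout is an assumption).
    """
    feats = np.asarray(positive_features, dtype=np.float64)
    return feats.mean(), feats.std()


def normalize(target_features, mu, sigma, eps=1e-12):
    """Apply x' = (x - mu) / sigma; eps only guards against sigma == 0."""
    x = np.asarray(target_features, dtype=np.float64)
    return (x - mu) / (sigma + eps)


# Stand-in data: 8 positive-sample features of size 4 x 4 x 3.
rng = np.random.default_rng(0)
positive = rng.normal(loc=5.0, scale=2.0, size=(8, 4, 4, 3))

mu, sigma = fit_normalization_params(positive)
normalized_positive = normalize(positive, mu, sigma)
```

Normalizing the positive features with their own statistics gives values with mean close to 0 and standard deviation close to 1, which is what makes a fixed threshold on x' meaningful.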
[0033] In step S130, the degree of deviation of the target image relative to the first positive sample image is determined based on the normalized features.
[0034] In one possible implementation, the normalized features are determined based on preset normalization parameters. The normalization parameters themselves are statistical information obtained by statistically analyzing the distribution of feature vector values of multiple sets of first sample image features. The normalized features can reflect the deviation of the target image features from the first sample image features to a certain extent. Therefore, the degree of deviation of the target image from the first positive sample image can be determined based on the normalized features.
[0035] In step S140, the anomaly detection result of the target image is determined based on the degree of deviation and the preset deviation threshold. The anomaly detection result is used to indicate whether there is an abnormal region in the target object.
[0036] In one possible implementation, the preset deviation threshold can be selected by the user based on the numerical distribution of the feature vector values of the normalized features, or it can be determined using a model-based method. The model could be, for example, a one-class support vector machine (One-Class SVM), an isolation forest, or an autoencoder. The preset deviation threshold can also be set based on the preset normalization parameters. It is understood that different preset normalization parameters can correspond to different preset deviation thresholds. For example, when the preset normalization parameters are the mean and standard deviation, the preset deviation threshold can be a preset multiple of the standard deviation. The preset multiple can be determined based on statistical critical value setting rules. The preset multiple is applied with both a positive and a negative sign (the sign indicates the direction of deviation), which yields a positive threshold and a negative threshold. When the feature vector value of the normalized feature is greater than the positive value of the preset multiple or less than the negative value of the preset multiple, it can be determined that an abnormal region exists in the target object. As another example, when the preset normalization parameter is the range, the preset deviation threshold can include a first threshold and a second threshold, with the first threshold being less than the second threshold. The first threshold can be, for example, 0, or the difference between 0 and a preset empirical constant. The second threshold can be, for example, 1, or the sum of 1 and the preset empirical constant. The preset empirical constant can be selected according to the actual situation, and this application embodiment does not impose a specific limitation. 
When the deviation is less than the first threshold or greater than the second threshold, it can be determined that there is an abnormal area in the target object.
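The two threshold rules above can be sketched as follows. The function names, the default multiple `k=3.0`, and the `margin` parameter (standing in for the preset empirical constant) are illustrative assumptions rather than values fixed by the application.

```python
import numpy as np


def has_anomaly_zscore(normalized, k=3.0):
    """Mean/std case: anomalous if any value is > +k or < -k."""
    values = np.asarray(normalized, dtype=np.float64)
    return bool(np.any((values > k) | (values < -k)))


def has_anomaly_range(normalized, margin=0.0):
    """Range case: anomalous if any value is below the first threshold
    (0 - margin) or above the second threshold (1 + margin)."""
    values = np.asarray(normalized, dtype=np.float64)
    first_threshold = 0.0 - margin
    second_threshold = 1.0 + margin
    return bool(np.any((values < first_threshold) | (values > second_threshold)))


# Example calls on small hand-written inputs.
zscore_normal = has_anomaly_zscore([0.5, -2.9])       # all within [-3, 3]
zscore_abnormal = has_anomaly_zscore([0.1, 3.5])      # 3.5 exceeds +3
range_normal = has_anomaly_range([1.05], margin=0.1)  # within [-0.1, 1.1]
range_abnormal = has_anomaly_range([1.2], margin=0.1)
```

Both functions return a single image-level flag; locating the abnormal region requires keeping the per-pixel comparison instead of reducing it with `np.any`.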
[0037] The aforementioned technical solution normalizes the target image features using preset normalization parameters. Based on the statistical information of the first sample image features from the first positive sample image, it tends to converge the feature vector values of target image features with high dispersion, large distribution range, and no obvious distribution pattern to a relatively small numerical range. This reduces the dispersion of the feature vector values of the obtained normalized features and makes the distribution pattern more obvious. This processing method makes the distribution pattern of the target image features predictable, at least when approaching the first positive sample image. Therefore, it becomes possible to set a preset deviation threshold based on this predictable distribution pattern. In other words, this processing method helps to set an appropriate preset deviation threshold for anomaly detection. Furthermore, the target image features output by this solution can reduce interference from background information, such as texture interference from materials like metal.
[0038] In one possible implementation, determining the degree of deviation of the target image relative to the first positive sample image based on normalized features includes: performing a dimensionality reduction operation by averaging the feature vector values of each channel of the normalized features to obtain an abnormal heatmap, wherein the pixel values in the abnormal heatmap are used to represent the degree of deviation of the target image relative to the first positive sample image; and the pixel values of each heatmap region are used to represent the degree of deviation of the image region in the target image corresponding to that heatmap region relative to the image region in the first positive sample image corresponding to that heatmap region.
[0039] In one possible implementation, the target image may have one or more channels. For example, if the target image is an RGB image, it may have R, G, and B channels; if the target image is a YCbCr image, it may have Y, Cb, and Cr channels. After the target image is input into a trained feature extraction network, the resulting target image features include feature vector values for each channel. Averaging the feature vector values of the target image features across all channels reduces the feature dimension of the target image features, resulting in an anomaly heatmap. Each heatmap region in the anomaly heatmap corresponds one-to-one with each image region in the target image, and the image position of each heatmap region in the anomaly heatmap is consistent with the image position of the corresponding image region in the target image. Each image region in the target image corresponds one-to-one with each image region in the first positive sample image, and the image position of each image region in the target image is consistent with the image position of the corresponding image region in the first positive sample image. The pixel values of each heatmap region in the anomaly heatmap can be used to represent the degree of deviation of the corresponding image region in the target image relative to the corresponding image region in the first positive sample image. Please refer to Figure 2, which is a schematic diagram comparing a target image and a corresponding abnormal heatmap according to an embodiment of this application. Figure 2(a) shows the target image, and Figure 2(b) shows the corresponding abnormal heatmap. In the embodiment shown in Figure 2, the target image is a single-channel grayscale image. In this case, the average value of the feature vector values of the target image features in each channel can be the feature vector values of the normalized features of the target image in the grayscale channel. 
The normalized features of the target image are obtained by normalizing the target image features using the mean and variance. Based on the feature vector values of the normalized features in the grayscale channel, the anomaly heatmap shown in Figure 2(b) can be obtained. In this anomaly heatmap, the higher the pixel value of a heatmap region, the greater the deviation of the corresponding image region in the target image from the corresponding image region in the first positive sample image. The preset deviation threshold can be a preset pixel value threshold. Segmenting the anomaly heatmap with this threshold yields two white regions whose pixel values are greater than or equal to the preset pixel value threshold (shown as a dotted region and a ring region, respectively, in Figure 2(b)). It can be understood that two abnormal regions of the target object can be determined based on these two white regions.
[0040] The above technical solution can fuse the information of the normalized features in each channel by averaging the feature vector values of each channel. The resulting anomaly heat map can visually and intuitively show the degree of deviation of each region of the target image from each region of the first sample image, thereby accurately obtaining the location of the abnormal region in the target object.
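The channel averaging and threshold-based segmentation described above can be sketched as follows. The (H, W, C) layout, the function names, and the threshold value are assumptions for illustration; the text does not state whether signed or absolute values are averaged, so the sketch averages the values as they are.

```python
import numpy as np


def anomaly_heatmap(normalized_features):
    """Reduce an (H, W, C) normalized feature to an (H, W) heatmap by
    averaging the feature vector values over the channel axis."""
    feats = np.asarray(normalized_features, dtype=np.float64)
    return feats.mean(axis=-1)


def segment(heatmap, pixel_threshold):
    """Boolean mask of pixels whose value is >= the preset threshold."""
    return heatmap >= pixel_threshold


# Toy normalized feature: all zeros except one strongly deviating position.
normalized = np.zeros((4, 4, 2))
normalized[1, 2, :] = [4.0, 6.0]

heatmap = anomaly_heatmap(normalized)
mask = segment(heatmap, pixel_threshold=3.0)
anomalous_pixels = np.argwhere(mask)  # (row, col) of flagged pixels
```

Because each heatmap position corresponds to the same position in the target image, the coordinates in `anomalous_pixels` can be read directly as the location of the abnormal region.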
[0041] In one possible implementation, before performing dimensionality reduction by averaging the feature vector values of each channel of the normalized features to obtain the abnormal heatmap, determining the degree of deviation of the target image from the first positive sample image based on the normalized features further includes: converting the scale of the feature map contained in the normalized features to be consistent with the scale of the target image by using an inverse Laplacian transform.
[0042] Please refer to Figure 3, which is a schematic flowchart illustrating the normalization process of target image features obtained after inputting a target image into a trained feature extraction network according to an embodiment of this application. In the embodiment shown in Figure 3, before inputting the target image into the trained feature extraction network, the initial number of channels of the target image can be increased, and the image with the increased number of channels can be used as the new target image. The trained feature extraction network shown in Figure 3 may include a feature extraction module (L_dec) and a mapping module (NF) connected to the feature extraction module. The feature extraction module extracts features from the target image and decomposes them using a Laplacian pyramid to obtain four feature maps at different scales, denoted as the first feature maps. The mapping module maps the first feature maps input from the feature extraction module to the target space to obtain second feature maps that correspond one-to-one with the first feature maps. The scale of each second feature map is the same as the scale of the corresponding first feature map. For example, if the scale of a certain first feature map is 4×4, then the scale of the corresponding second feature map is also 4×4. In this embodiment, the second feature maps can represent the target image features. Normalizing the second feature maps based on the preset normalization parameters yields the normalized features of the target image. Similarly, the scale of each feature map in the normalized features is the same as the scale of the corresponding second feature map. In this embodiment, the normalized features can include four feature maps at different scales. 
The scale of each feature map can be transformed using an inverse Laplacian transform so that the scale-transformed feature map is consistent with the scale of the target image. After the feature maps contained in the normalized features have been scale-transformed, they can be fused together (L_com). The fusion method can be to average, in each channel, the feature vector values of the scale-transformed feature maps to obtain the anomaly heatmap.
[0043] The above technical solution can restore feature maps at different scales to the same scale as the target image through inverse Laplacian transform, which can achieve feature alignment and help ensure the comparability of features.
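The scale alignment and fusion step can be sketched as follows. Note the substitution: the inverse Laplacian transform itself is not specified in enough detail here to reproduce, so the sketch uses plain nearest-neighbor upsampling as a stand-in that only demonstrates bringing every feature map to the target image scale before averaging. The function names and the (H, W, C) layout are assumptions.

```python
import numpy as np


def upsample_nearest(feature_map, out_h, out_w):
    """Nearest-neighbor upsampling of an (h, w, C) map to (out_h, out_w, C).

    Stand-in for the inverse Laplacian transform; requires integer factors.
    """
    h, w = feature_map.shape[:2]
    if out_h % h != 0 or out_w % w != 0:
        raise ValueError("output size must be an integer multiple of input size")
    rows = np.repeat(feature_map, out_h // h, axis=0)
    return np.repeat(rows, out_w // w, axis=1)


def fuse_to_heatmap(feature_maps, out_h, out_w):
    """Bring every scale to (out_h, out_w), average over channels,
    then average the per-scale maps into one heatmap."""
    per_scale = [
        upsample_nearest(np.asarray(fm, dtype=np.float64), out_h, out_w).mean(axis=-1)
        for fm in feature_maps
    ]
    return np.mean(per_scale, axis=0)


# Four normalized feature maps at different scales, constant-valued for clarity.
maps = [np.full((s, s, 2), float(s)) for s in (1, 2, 4, 8)]
heatmap = fuse_to_heatmap(maps, out_h=8, out_w=8)
```

With the constant-valued inputs above, every heatmap pixel is the average of 1, 2, 4, and 8, which makes it easy to check that all four scales contribute equally after alignment.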
[0044] In one possible implementation, before determining the anomaly detection result of the target image based on the degree of deviation and a preset deviation threshold, the method further includes setting a preset deviation threshold based on a preset normalization parameter.
[0045] In one possible implementation, the preset deviation threshold can be determined based on the parameter type of the preset normalization parameters. For example, when the preset normalization parameters are the mean and standard deviation, the preset deviation threshold can be a preset multiple of the standard deviation. The preset multiple can be determined based on statistical critical value rules, such as the three sigma criterion. This criterion assumes that the target image features are normally distributed and that data falling outside a certain number of standard deviations from the mean is gross error rather than random error. Specifically, the probability that the difference between a data value and the mean of the dataset is less than or equal to 3 times the standard deviation of the dataset is approximately 99.73%, so data outside this range can be judged as gross error. Therefore, when the preset normalization parameters are the mean and standard deviation, the preset deviation thresholds can be set to -3 and 3 based on the three sigma criterion, or to -3 minus a preset empirical constant and 3 plus the preset empirical constant. As another example, when the preset normalization parameter is the range, and the feature vector value of the normalized feature is denoted as x', then in one possible implementation the normalized feature can be determined using the formula x' = (x - I_min) / (I_max - I_min), where x is the feature vector value of the target image feature, and I_max and I_min are the maximum and minimum values, respectively, of the feature vector values of the multiple sets of first sample image features. It can be understood that when the target image features are identical to the image features of any one of the first positive sample images, x' lies in the range [0, 1]. In this case, the preset deviation thresholds can be set to 0 and 1, or to 0 minus a preset empirical constant and 1 plus the preset empirical constant. Normalization can scale the feature vector values of the target image features to a relatively small numerical range. When different preset normalization parameters are used to normalize the target image features, the numerical range in which the feature vector values of the normalized features are concentrated will differ. Correspondingly, the preset deviation thresholds set according to different preset normalization parameters can be different.
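As a minimal sketch of the threshold selection described above, the following Python maps a normalization parameter type to a pair of lower and upper thresholds and widens them by an optional empirical constant. The function names, the string keys "mean_std" and "range", and the default constant of 0 are illustrative assumptions and do not come from the application.

```python
def preset_deviation_thresholds(param_type, empirical_constant=0.0):
    """Return (lower, upper) thresholds for a normalization parameter type.

    param_type and its two keys are hypothetical names for this sketch.
    """
    base = {
        "mean_std": (-3.0, 3.0),  # three sigma criterion
        "range": (0.0, 1.0),      # min-max normalized positive samples lie in [0, 1]
    }
    if param_type not in base:
        raise ValueError(f"unsupported normalization parameter type: {param_type}")
    lower, upper = base[param_type]
    # Widen the interval symmetrically by the preset empirical constant.
    return lower - empirical_constant, upper + empirical_constant


def exceeds_thresholds(deviation, thresholds):
    """True when a deviation value falls outside the (lower, upper) interval."""
    lower, upper = thresholds
    return deviation < lower or deviation > upper
```

For instance, a deviation of 3.5 under the mean and standard deviation setting falls outside (-3, 3) and would be flagged, whereas 0.5 would not.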
[0046] In the above technical solution, the preset deviation threshold is set according to the preset normalization parameters. Because the feature vector values of the normalized features have different expected distributions when different preset normalization parameters are used to normalize the target image features, the preset deviation threshold can be set adaptively so that it conforms to the expected distribution of the feature vector values of the normalized features.
[0047] In one possible implementation, the preset normalization parameters include one or more of the following: mean, standard deviation, variance, range, median absolute deviation, and interquartile range.
[0048] In one possible implementation, the preset normalization parameters may include the mean and standard deviation, or they may include the range. Normalization formulas based on the mean and standard deviation, and normalization formulas based on the range, can be found in the foregoing embodiments and will not be repeated here. The preset normalization parameters may also include the mean and variance. Let the feature vector value of the normalized feature be denoted as x'. In one possible implementation, the normalized feature can be determined using the formula x' = (x - μ) / s, where x is the feature vector value of the target image feature, and μ and s are the mean and variance, respectively, of the feature vector values of the multiple sets of first sample image features. The Welford algorithm can be used to calculate the mean and variance. The preset normalization parameters may also include the median absolute deviation. In one possible implementation, the normalized feature can be determined using the formula x' = (x - median(X)) / (MAD * k), where x is the feature vector value of the target image feature, median(X) is the median of the feature vector values of the multiple sets of first sample image features, k is a constant (e.g., 1.4826), and MAD is the median of the absolute differences between each feature vector value of the multiple sets of first sample image features and median(X). The preset normalization parameters may also include the interquartile range. In one possible implementation, the normalized feature can be determined using the formula x' = (x - Q1) / (Q3 - Q1), where x is the feature vector value of the target image feature, and Q1 and Q3 are the first and third quartiles, respectively, of the feature vector values of the multiple sets of first sample image features.
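The three formulas in this paragraph can be illustrated with a short Python sketch built on the standard library. The function names are hypothetical, and two choices are assumptions not stated in the application: the variance is taken as the population variance, and the quartiles use the "inclusive" interpolation method. The sketch does not guard against a zero denominator, which would occur if all sample values were identical.

```python
import statistics


def normalize_mean_variance(x, samples):
    """x' = (x - mu) / s, with s the variance as written in the text."""
    mu = statistics.fmean(samples)
    s = statistics.pvariance(samples)  # assumption: population variance
    return (x - mu) / s


def normalize_mad(x, samples, k=1.4826):
    """x' = (x - median(X)) / (MAD * k)."""
    med = statistics.median(samples)
    # MAD: median of the absolute differences from the median.
    mad = statistics.median([abs(v - med) for v in samples])
    return (x - med) / (mad * k)


def normalize_iqr(x, samples):
    """x' = (x - Q1) / (Q3 - Q1)."""
    # assumption: "inclusive" quartile interpolation
    q1, _, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    return (x - q1) / (q3 - q1)
```

Here `samples` stands for the feature vector values collected from the first sample image features, flattened into one sequence of numbers.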
[0049] The preset normalization parameters used in the above technical solution can make the distribution pattern of the feature vector values of the obtained normalized features more obvious, making it easier to distinguish between concentrated and discrete data, and making it easier to accurately determine the preset deviation threshold.
[0050] In one possible implementation, the training process of the trained feature extraction network is as follows: acquiring multiple second positive sample images, which include one or more sample pairs, each sample pair including two positive sample images; inputting each sample pair into the initial feature extraction network to obtain the second sample image features and third sample image features corresponding one-to-one with the two positive sample images in each sample pair; calculating the loss value based on the second sample image features and third sample image features corresponding to each sample pair, and using the loss value to optimize the initial feature extraction network until the preset training requirements are met, so as to obtain the trained feature extraction network.
[0051] In one possible implementation, the second positive sample image can be an image acquired by an image acquisition device for a non-abnormal object. The object contained in the second positive sample image can be any type of object, such as a wafer, chip, or electronic component. The type of object contained in the second positive sample image can be the same as the type of object contained in the first positive sample image, and can also be the same as the type of the target object. Multiple second positive sample images can constitute one or more sample pairs, each sample pair including two positive sample images. Inputting each sample pair into the initial feature extraction network can obtain two image features corresponding one-to-one with the two positive sample images contained in the sample pair, which are denoted as the second sample image feature and the third sample image feature, respectively. The loss value can be calculated based on the second sample image feature and the third sample image feature to optimize the initial feature extraction network until the preset training requirements are met. Please refer to Figure 4, which is a schematic diagram of the training process of the initial feature extraction network according to an embodiment of this application. For one of the sample pairs of multiple second positive sample images, the two positive sample images included in the sample pair are denoted as I(i) and I(j), respectively. Before inputting the two positive sample images into the initial feature extraction network, the initial number of channels in each of the two positive sample images can be increased to obtain new positive sample images I'(i) and I'(j). In the embodiment shown in Figure 4, a feature extraction module (Ldec) and a mapping module (NF) connected to the feature extraction module are included. The feature extraction module and the mapping module can share weights. 
The feature extraction module can perform feature extraction on the positive sample images I'(i) and I'(j) respectively. For the positive sample image I'(i), the feature extraction module can use a Laplacian pyramid expansion to obtain four feature maps of different scales, denoted as the third feature map x_d. The mapping module maps the third feature map input from the feature extraction module to the target space to obtain a fourth feature map that corresponds one-to-one with the third feature map. The scale of each feature map in the fourth feature map is the same as the scale of its corresponding feature map in the third feature map. For example, if the scale of a certain feature map in the third feature map is 4×4, then the scale of the corresponding feature map in the fourth feature map is also 4×4. In this embodiment, the fourth feature map can represent the second sample image feature z_d(i). The extraction process for the third sample image feature z_d(j) is similar to that for the second sample image feature and will not be described again here. In this embodiment, the loss value is the distance in the target space between the second sample image feature z_d(i) and the third sample image feature z_d(j), which can be denoted as Δz_d. When Δz_d is less than or equal to the preset distance value, the preset training requirements can be considered met.
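The loss computation for one sample pair can be sketched as follows. The application does not state which distance metric is used in the target space, so the Euclidean distance here is an assumption, as are the function names and the representation of each multi-scale feature as a list of flattened feature maps (one per scale). The network itself and its optimization are omitted.

```python
import math


def pair_distance(z_i, z_j):
    """Distance between two multi-scale features in the target space.

    z_i and z_j are lists of feature maps, one per scale; each feature map
    is a flat list of floats. Euclidean distance is an assumption.
    """
    if len(z_i) != len(z_j):
        raise ValueError("both features must have the same number of scales")
    total = 0.0
    for map_i, map_j in zip(z_i, z_j):
        if len(map_i) != len(map_j):
            raise ValueError("feature maps at the same scale must match in size")
        total += sum((a - b) ** 2 for a, b in zip(map_i, map_j))
    return math.sqrt(total)


def training_requirement_met(pair_distances, preset_distance):
    """True when every pair distance is at or below the preset distance."""
    return all(d <= preset_distance for d in pair_distances)
```

In a training loop, `pair_distance` would play the role of the loss for each sample pair, and optimization would stop once `training_requirement_met` returns True.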
[0052] The above technical solution can train the initial feature extraction network based on the differences between the features of the sample images corresponding to each of the multiple second positive sample images. In this way, the feature extraction network obtained by training can reduce the interference of background information, such as the texture interference of materials such as metal.
[0053] In one possible implementation, the plurality of second positive sample images are identical to the plurality of first positive sample images.
[0054] Referring to Figure 4, after obtaining the trained feature extraction network, each second positive sample image can be input into the feature extraction network to obtain the features of each group of fourth sample images in the target space. Statistical analysis of the features of each group of fourth sample images in the target space yields the mean and variance of the feature vector values for each group of fourth sample images. The obtained mean and variance can be used as preset normalization parameters. In other words, a second positive sample image can be a first positive sample image; that is, multiple second positive sample images are identical to multiple first positive sample images.
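A minimal sketch of how the mean and variance could be accumulated with the Welford algorithm mentioned earlier is given below. The class and function names are hypothetical. The sketch pools the feature vector values of all groups into a single mean and variance and uses the population variance; both choices are assumptions, since the application does not specify them.

```python
class RunningStats:
    """Welford's online algorithm for mean and variance."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        # Uses the already-updated mean, as Welford's method requires.
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        if self.count == 0:
            raise ValueError("no values have been added")
        return self._m2 / self.count  # assumption: population variance


def preset_normalization_parameters(feature_groups):
    """Return (mean, variance) over all values in all feature groups."""
    stats = RunningStats()
    for group in feature_groups:
        for value in group:
            stats.update(value)
    return stats.mean, stats.variance
```

Because the statistics are updated one value at a time, the features of each positive sample image can be processed as they are produced by the network, without holding all of them in memory.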
[0055] The above technical solution re-inputs the positive sample images used to train the feature extraction network into the trained feature extraction network. The resulting sample image features are close to one another in the target space, so the preset normalization parameters derived from them are highly representative of the image features of images collected for non-abnormal objects. The normalized features obtained with these preset normalization parameters are therefore a reliable reference, and the anomaly detection results can accurately indicate the regions in which the target object differs from a non-abnormal object, that is, the abnormal regions of the target object.
[0056] In one possible implementation, inputting the target image acquired for the target object into the trained feature extraction network to obtain the target image features of the target image includes: extracting multi-scale image features of the target image in the trained feature extraction network, and converting the multi-scale image features into target image features of a preset scale through normalizing flow processing.
[0057] In one possible implementation, the Laplacian pyramid algorithm can be used in the feature extraction network to extract features from the target image to obtain multiple image features at different scales. Normalizing flow processing is then applied to these image features at different scales to convert them into image features of the same scale. These image features of the same scale are the target image features, and the scale of each target image feature is a preset scale. The preset scale can be, for example, the scale of the target image.
[0058] In the above technical solution, by performing normalizing flow processing on multi-scale image features, the feature extraction network can achieve anomaly detection at high resolution and can also capture anomaly detection results at the different feature scales of the target image.
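To illustrate the multi-scale decomposition referred to above, the following Python sketch builds a simplified Laplacian pyramid from a 2-D image and inverts it. It is not the network described in the application: the normalizing flow step is omitted, a 2×2 box average stands in for the usual Gaussian smoothing, upsampling is nearest-neighbour, and all function names are hypothetical. The image height and width are assumed to be divisible by 2 for every level that is built.

```python
def downsample(img):
    """Halve each dimension by averaging 2x2 blocks (box filter assumption)."""
    h, w = len(img), len(img[0])
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]


def upsample(img):
    """Double each dimension by repeating every pixel (nearest neighbour)."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def _combine(a, b, sign):
    """Element-wise a + sign * b for two equally sized 2-D lists."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def laplacian_pyramid(img, levels):
    """Return [residual_0, ..., residual_{levels-2}, coarsest_image]."""
    pyramid, current = [], img
    for _ in range(levels - 1):
        smaller = downsample(current)
        # Residual: detail lost when going to the coarser scale.
        pyramid.append(_combine(current, upsample(smaller), -1.0))
        current = smaller
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Inverse transform: add residuals back from coarse to fine."""
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        current = _combine(residual, upsample(current), 1.0)
    return current
```

Each pyramid level has half the height and width of the previous one, and adding the residuals back from the coarsest level upward restores an image at the original scale.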
[0059] Please refer to Figure 5, which is a schematic block diagram of an anomaly detection device according to one embodiment of this application. According to a second aspect of this application, an anomaly detection device 500 is also provided, comprising:
[0060] The input module 510 is used to input the target image collected for the target object into the trained feature extraction network to obtain the target image features of the target image;
[0061] The processing module 520 is used to normalize the target image features based on preset normalization parameters to obtain the normalized features of the target image. The preset normalization parameters are statistical information obtained by statistically analyzing the distribution of feature vector values of multiple sets of first sample image features. The multiple sets of first sample image features are image features obtained by inputting multiple first positive sample images into a trained feature extraction network. The first positive sample images are images acquired for non-abnormal objects.
[0062] The first determining module 530 is used to determine the degree of deviation of the target image relative to the first positive sample image based on the normalized features;
[0063] The second determining module 540 is used to determine the anomaly detection result of the target image based on the degree of deviation and a preset deviation threshold. The anomaly detection result is used to indicate whether there is an abnormal region in the target object.
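The division of work among modules 510 to 540 can be sketched as one Python function. This is an illustrative outline rather than the device itself: `extract_features` is a hypothetical callable standing in for the trained feature extraction network, normalization uses the mean and standard deviation, the degree of deviation is obtained by averaging the normalized channels into an anomaly heatmap as in the dimensionality reduction described for the method, and the inverse Laplacian transform is omitted.

```python
def detect_anomaly(image, extract_features, mean, std, thresholds=(-3.0, 3.0)):
    """Return (heatmap, abnormal_pixels) for one target image."""
    # Input module 510: obtain target image features
    # (a list of channels, each channel a 2-D list of floats).
    features = extract_features(image)

    # Processing module 520: normalize with the preset parameters.
    normalized = [[[(v - mean) / std for v in row] for row in channel]
                  for channel in features]

    # First determining module 530: average across channels to get the
    # degree of deviation per pixel (the anomaly heatmap).
    n = len(normalized)
    height, width = len(normalized[0]), len(normalized[0][0])
    heatmap = [[sum(ch[r][c] for ch in normalized) / n for c in range(width)]
               for r in range(height)]

    # Second determining module 540: compare with the preset thresholds.
    lower, upper = thresholds
    abnormal = [(r, c) for r in range(height) for c in range(width)
                if not lower <= heatmap[r][c] <= upper]
    return heatmap, abnormal
```

An empty `abnormal` list would indicate that no abnormal region was found in the target object under the chosen thresholds.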
[0064] Please refer to Figure 6, which is a schematic block diagram of an electronic device according to an embodiment of this application. According to a third aspect of this application, an electronic device 600 is also provided, including: a processor 610 and a memory 620, wherein the memory 620 stores computer program instructions, which are executed by the processor 610 to perform the above-described anomaly detection method.
[0065] According to a fourth aspect of this application, a storage medium is also provided, on which program instructions are stored. When the program instructions are executed by a computer or processor, the computer or processor performs the corresponding steps of the anomaly detection method described in the embodiments of this application, and the program instructions are used to implement the corresponding modules in the anomaly detection device described in the embodiments of this application. The storage medium may, for example, include a memory card of a smartphone, a storage component of a tablet computer, a hard disk of a personal computer, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, or any combination of the above storage media. A computer-readable storage medium may be any combination of one or more computer-readable storage media.
[0066] According to a fifth aspect of this application, a computer program product is also provided, including computer program instructions that, when run, are used to perform the anomaly detection method as described above.
[0067] Those skilled in the art can understand the specific implementation and beneficial effects of the above-described anomaly detection device by reading the detailed description of the anomaly detection method above, and for the sake of brevity, they will not be described in detail here.
[0068] Although exemplary embodiments have been described herein with reference to the accompanying drawings, it should be understood that the above exemplary embodiments are merely illustrative and are not intended to limit the scope of this application. Various changes and modifications can be made therein by those skilled in the art without departing from the scope and spirit of this application. All such changes and modifications are intended to be included within the scope of this application as claimed in the appended claims.
[0069] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
[0070] In the several embodiments provided in this application, it should be understood that the disclosed devices and methods can be implemented in other ways. For example, the device embodiments described above are merely illustrative. For instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not executed.
[0071] Numerous specific details are set forth in the specification provided herein. However, it will be understood that embodiments of this application may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this specification.
[0072] Similarly, it should be understood that, in order to streamline this application and aid in understanding one or more of the various inventive aspects, features of this application may sometimes be grouped together in a single embodiment, figure, or description thereof in the description of exemplary embodiments of this application. However, this approach should not be construed as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as reflected in the corresponding claims, its inventive point lies in solving the corresponding technical problem with features fewer than all features of a single disclosed embodiment. Therefore, the claims following the detailed description are hereby expressly incorporated into that detailed description, wherein each claim itself is a separate embodiment of this application.
[0073] Those skilled in the art will understand that, apart from the mutual exclusion of features, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or apparatus so disclosed can be combined in any combination. Unless otherwise expressly stated, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature that serves the same, equivalent, or similar purpose.
[0074] Furthermore, those skilled in the art will understand that although some embodiments herein include certain features included in other embodiments but not others, combinations of features from different embodiments are intended to be within the scope of this application and form different embodiments. For example, in the claims, any of the claimed embodiments can be used in any combination.
[0075] The various component embodiments of this application can be implemented in hardware, or as software modules running on one or more processors, or a combination thereof. Those skilled in the art will understand that microprocessors or digital signal processors (DSPs) can be used in practice to implement some or all of the functions of some modules in the anomaly detection apparatus according to embodiments of this application. This application can also be implemented as an apparatus program (e.g., a computer program and computer program product) for performing part or all of the methods described herein. Such an implementation of this application can be stored on a computer-readable medium, or can be in the form of one or more signals. Such signals can be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
[0076] It should be noted that the above embodiments are illustrative of this application and not restrictive, and that those skilled in the art can devise alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses should not be construed as limiting the claims. The word "comprising" does not exclude the presence of elements or steps not listed in the claims. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. This application can be implemented by means of hardware comprising several different elements and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by the same item of hardware. The use of the words first, second, and third, etc., does not indicate any order. These words can be interpreted as names.
[0077] The above are merely specific embodiments or descriptions of specific embodiments of this application. The scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. The scope of protection of this application shall be determined by the scope of the claims.
Claims
1. An anomaly detection method, characterized by comprising: inputting a target image collected for a target object into a trained feature extraction network to obtain target image features of the target image; normalizing the target image features based on preset normalization parameters to obtain normalized features of the target image, wherein the preset normalization parameters are statistical information obtained by statistically analyzing the distribution of feature vector values of multiple sets of first sample image features, the multiple sets of first sample image features are image features obtained by inputting multiple first positive sample images into the trained feature extraction network, and the first positive sample images are images collected for non-abnormal objects; determining the degree of deviation of the target image relative to the first positive sample image based on the normalized features; and determining the anomaly detection result of the target image based on the degree of deviation and a preset deviation threshold, wherein the anomaly detection result is used to indicate whether there is an abnormal region in the target object.
2. The method according to claim 1, characterized in that, Determining the degree of deviation of the target image relative to the first positive sample image based on the normalized features includes: Dimensionality reduction is performed by averaging the feature vector values of each channel of the normalized feature to obtain an anomaly heatmap, wherein the pixel values in the anomaly heatmap are used to represent the degree of deviation of the target image from the first positive sample image. The pixel value of each heatmap region is used to represent the degree of deviation of the image region in the target image corresponding to that heatmap region relative to the image region in the first positive sample image corresponding to that heatmap region.
3. The method according to claim 2, characterized in that, Before performing dimensionality reduction by averaging the feature vector values of each channel of the normalized features to obtain the anomaly heatmap, determining the degree of deviation of the target image relative to the first positive sample image based on the normalized features further includes: The scale of the feature map contained in the normalized features is converted to be consistent with the scale of the target image by inverse Laplacian transformation.
4. The method according to any one of claims 1-3, characterized in that, Before determining the anomaly detection result of the target image based on the degree of deviation and a preset deviation threshold, the method further includes: The preset deviation threshold is set according to the preset normalization parameter.
5. The method according to any one of claims 1-3, characterized in that, The preset normalization parameters include one or more of the following: mean, standard deviation, variance, range, median absolute deviation, and interquartile range.
6. The method according to any one of claims 1-3, characterized in that, The training process of the trained feature extraction network is as follows: Acquire multiple second positive sample images, wherein the multiple second positive sample images include one or more sets of sample pairs, and each set of sample pairs includes two positive sample images; Each pair of samples is input into the initial feature extraction network to obtain the second and third sample image features corresponding to the two positive sample images in each pair of samples. The loss value is calculated based on the second sample image features and the third sample image features corresponding to each sample pair, and the loss value is used to optimize the initial feature extraction network until the preset training requirements are met, so as to obtain the trained feature extraction network.
7. The method according to claim 6, characterized in that, The plurality of second positive sample images are the same as the plurality of first positive sample images.
8. The method according to any one of claims 1-3, characterized in that, The step of inputting the target image acquired for the target object into a trained feature extraction network to obtain the target image features of the target image includes: In the trained feature extraction network, multi-scale image features of the target image are extracted and the multi-scale image features are converted into target image features of a preset scale through normalizing flow processing.
9. An anomaly detection device, characterized by comprising: an input module, used to input a target image collected for a target object into a trained feature extraction network to obtain target image features of the target image; a processing module, used to normalize the target image features based on preset normalization parameters to obtain normalized features of the target image, wherein the preset normalization parameters are statistical information obtained by statistically analyzing the distribution of feature vector values of multiple sets of first sample image features, the multiple sets of first sample image features are image features obtained by inputting multiple first positive sample images into the trained feature extraction network, and the first positive sample images are images acquired for non-abnormal objects; a first determining module, used to determine the degree of deviation of the target image relative to the first positive sample image based on the normalized features; and a second determining module, used to determine the anomaly detection result of the target image based on the degree of deviation and a preset deviation threshold, wherein the anomaly detection result is used to indicate whether there is an abnormal region in the target object.
10. An electronic device comprising a processor and a memory, characterized in that, The memory stores computer program instructions, which, when executed by the processor, are used to perform the anomaly detection method as described in any one of claims 1-8.
11. A storage medium on which program instructions are stored, characterized in that, The program instructions are used to execute the anomaly detection method as described in any one of claims 1-8 when the program is run.
12. A computer program product comprising computer program instructions, characterized in that, The computer program instructions are used to execute the anomaly detection method as described in any one of claims 1-8 when the program is run.