Method for detecting appearance defects of flour products based on image recognition
By using image recognition technology, combined with frequency domain enhancement and multimodal feature extraction, high-precision and stable detection of appearance defects in flour products has been achieved, solving the problems of insufficient adaptability and accuracy in existing technologies. This technology is suitable for intelligent detection in flour product processing enterprises.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHANDONG FU SHIKANG BIOTECHNOLOGY CO LTD
- Filing Date
- 2026-04-30
- Publication Date
- 2026-06-19
AI Technical Summary
Existing flour product appearance inspection technologies are insufficient in terms of adaptability, accuracy, robustness, and automation, making it difficult to meet the quality control requirements of high-speed production lines.
An image recognition-based method for detecting appearance defects in flour products is adopted, including image acquisition and frequency domain enhancement preprocessing, multi-scale morphological candidate region extraction, multi-modal feature extraction and spatial correlation modeling, local-global joint optimization and defect judgment. Combined with adaptive threshold segmentation and sub-pixel precision reconstruction, high-precision and stable defect detection is achieved.
It significantly reduces false detection and false negative rates, improves the accuracy and consistency of defect identification, and has environmental adaptability and robustness, making it suitable for the intelligent upgrading of flour product processing enterprises of all sizes.
Figure CN122244004A_ABST
Abstract
Description
TECHNICAL FIELD
[0001] The present application relates to the technical field of appearance defect detection, and more particularly to a flour product appearance defect detection method based on image recognition.
BACKGROUND
[0002] As a daily staple food, the appearance quality of flour products is an important indicator of quality control in the production and processing process. With the continuous expansion of the scale of food industrial production and the continuous improvement of the running speed of the production line, the traditional method of relying on manual visual inspection for appearance defect screening has been difficult to match the continuous production rhythm. Manual detection is easily affected by subjective judgment, fatigue state and working environment, and cannot meet the quality control needs of large-scale and standardization.
[0003] Under this background, automated detection technology based on machine vision and image recognition has been gradually applied in the field of food appearance quality inspection. Existing related detection technologies are mostly based on general target recognition framework. In view of the scene characteristics such as complex surface texture of flour products, small difference in defect morphology and uneven illumination reflection, related algorithms and processes are still developed around general ideas such as conventional image preprocessing, feature extraction and region determination.
[0004] With the continuous maturity of deep learning and visual feature modeling technology, the demand for fine detection of food appearance continues to increase, and the industry has increasingly prominent demand for detection methods that take into account texture interference suppression, multi-morphology defect adaptation and spatial correlation analysis. Related technology systems continue to develop in the direction of multi-modal feature fusion, global optimization decision and high-precision boundary positioning to meet the intelligent detection needs of flour product industrial production.
[0005] However, existing technologies still have some disadvantages in actual use, such as:
[0006] 1. The existing flour product appearance detection technology mostly uses conventional visual processing method, which has poor adaptability to the natural texture of the product itself and uneven illumination, and is easy to misjudge normal texture as defects, resulting in insufficient stability of the detection result. At the same time, the detection algorithm has weak universality, and it is difficult to adapt to different morphologies and sizes of defects, which is easy to miss detection and false detection on large-scale high-speed production lines, and it is difficult to meet the quality inspection accuracy requirements of continuous production;
[0007] 2. Traditional detection methods mostly rely on single feature or simple threshold segmentation, lack comprehensive utilization of multi-dimensional features, and have limited recognition ability for small defects and edge fuzzy defects. Most schemes only make local region judgment without considering the spatial correlation between defects, which is easy to cause regional fragmentation judgment, resulting in insufficient logic of the overall detection result, and the detection accuracy is difficult to further improve;
[0008] 3. Existing automated inspection systems generally lack a fine-grained boundary reconstruction process, and defect localization is mostly limited to pixel-level accuracy, which cannot meet the requirements of high-standard quality assessment. At the same time, the inspection process is mostly a simplified serial process, lacking a global optimization decision-making mechanism, which is prone to judgment bias in complex scenarios. Furthermore, the subsequent data quantification and grade assessment functions are imperfect, which is not conducive to production data traceability and quality analysis.
[0009] 4. Traditional manual inspection is easily affected by factors such as personnel experience, fatigue, and subjective judgment, making it difficult to standardize inspection criteria. Existing automated equipment mostly uses fixed models and parameters, which cannot adaptively adjust the inspection strategy. It lacks robustness in the face of product batch differences and changes in ambient light, and the system deployment and maintenance costs are high, making it difficult to widely apply in small and medium-sized flour processing enterprises.
SUMMARY OF THE INVENTION
[0010] In order to overcome the above-mentioned defects of the prior art, the present invention provides a method for detecting appearance defects of flour products based on image recognition, which solves the problems mentioned in the background art through the following scheme.
[0011] To achieve the above objectives, the present invention provides the following technical solution: a method for detecting appearance defects in flour products based on image recognition, comprising:
[0012] S1: Image acquisition and frequency domain enhancement preprocessing: Acquire images of the surface of flour products and perform grayscale conversion and noise reduction in sequence. Decompose the image into different frequency domain sub-bands through discrete wavelet transform, process each frequency domain sub-band separately, and then reconstruct the preprocessed image through inverse wavelet transform.
[0013] S2: Multi-scale morphological candidate region extraction: Based on the preprocessed image, multi-scale and multi-directional structural elements are constructed. Bright and dark defect responses are extracted by top-hat and bottom-hat transformations respectively. After normalizing the multi-scale responses, the maximum values are fused to obtain a comprehensive defect response map. Then, noise is removed by adaptive threshold segmentation and connected component analysis to obtain defect candidate regions.
[0014] S3: Multimodal feature extraction and spatial correlation modeling: For each defect candidate region, extract morphological geometry, local texture and gradient field topology multimodal features. After weighted optimization by channel attention mechanism, use dual path network to obtain regional local discrimination results. At the same time, construct hypergraph model to characterize the spatial correlation between candidate regions and form global spatial constraints.
[0015] S4: Local-Global Joint Optimization and Defect Judgment: A joint energy function is constructed by combining local discriminative data terms and hypergraph global smoothing terms. Local feature evidence and global spatial constraints are balanced by adaptive weights. The graph cut algorithm is used to minimize energy to complete the global optimal defect judgment. The level set model is used to perform sub-pixel precision fine reconstruction of the real defect boundary.
[0016] S5: Defect Quantification and Output: Converts the pixel-level geometric parameters of defects into actual physical dimensions, completes the defect classification assessment based on preset level thresholds, and finally outputs an inspection report containing defect information and overall product judgment results.
[0017] The technical effects and advantages of this invention are as follows:
[0018] 1. This solution utilizes frequency domain layering processing technology to effectively suppress interference from the inherent texture of flour products and the effects of uneven illumination, significantly reducing false detection and false negative rates. Combined with multi-scale, multi-directional structural element design, it can adapt to appearance defects of different shapes and sizes, exhibiting excellent identification capabilities for cracks, defects, stains, etc., and maintaining stable and reliable detection performance even in high-speed production scenarios.
[0019] 2. This solution integrates multimodal features of morphology, geometry, texture, and gradient field topology, and strengthens key features through a channel attention mechanism to enhance defect representation capabilities. Simultaneously, it introduces a dual-path network and a hypergraph model to combine accurate local discrimination with global spatial constraints, making defect judgment more logical and holistic, effectively improving the accuracy and consistency of complex defect identification.
[0020] 3. This solution employs a level set model to achieve sub-pixel precision reconstruction of defect boundaries, resulting in significantly higher positioning accuracy than traditional pixel-level detection methods. By combining the energy function and graph cut algorithm, it achieves globally optimal decision-making, adaptively optimizing detection results. It also possesses complete physical dimension conversion and defect level assessment functions, facilitating quality traceability and production control.
[0021] 4. This solution operates entirely automatically, eliminating the influence of subjective human factors. Testing standards are standardized and reproducible. The system possesses strong environmental adaptability and robustness, capable of handling complex scenarios such as changes in lighting and product batch variations. The overall process is compact and efficient, can be directly embedded into existing production lines, and has low deployment and maintenance costs, making it suitable for the intelligent upgrading of flour product processing enterprises of all sizes.
DESCRIPTION OF THE DRAWINGS
[0022] Figure 1 is a schematic diagram of the overall structure of the present invention.
[0023] Figure 2 is a schematic diagram of the S1 process of the present invention.
[0024] Figure 3 is a schematic diagram of the S2 process of the present invention.
[0025] Figure 4 is a schematic diagram of the S3 process of the present invention.
[0026] Figure 5 is a schematic diagram of the S4 process of the present invention.
DETAILED IMPLEMENTATION
[0027] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0028] Referring to Figures 1 to 5, the image recognition-based method for detecting appearance defects in flour products includes:
[0029] S1: Image acquisition and frequency domain enhancement preprocessing:
[0030] By standardizing image acquisition and wavelet frequency domain decomposition, problems such as uneven illumination and natural texture interference on the surface of flour products are eliminated. Differential enhancement processing is applied to different frequency domain components of the image to reconstruct a preprocessed image with prominent defect features and suppressed background noise, providing a high-quality image foundation for subsequent defect region extraction. The specific steps are as follows:
[0031] S101: Image Acquisition and Basic Preprocessing:
[0032] An industrial area scan camera is used to acquire surface images of flour products under standardized lighting conditions. The camera resolution is no less than 1280×720 pixels, and a ring-shaped shadowless light source with a stable illumination intensity of 300-500 lux is used to avoid specular reflections and localized shadows on the product surface, thus obtaining the original image.
[0033] The original color image is converted to grayscale using the ITU-R BT.601 weighted grayscale formula: $I_{gray}(x, y) = 0.299\,R(x, y) + 0.587\,G(x, y) + 0.114\,B(x, y)$, where R, G, and B are the pixel values of the red, green, and blue channels, respectively, and (x, y) are the pixel coordinates. This eliminates color information interference and unifies the dimensions of the image data.
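As a minimal sketch of the weighted grayscale conversion, the following Python snippet applies the BT.601 luma weights to an image stored as nested lists of (R, G, B) tuples. The function name `to_grayscale` and the list-based image layout are illustrative assumptions, not part of the described method.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to grayscale with BT.601 weights."""
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in rgb_image
    ]


# Tiny 2x2 test image: white, pure red, pure green, pure blue.
rgb = [[(255, 255, 255), (255, 0, 0)],
       [(0, 255, 0), (0, 0, 255)]]
gray = to_grayscale(rgb)
```

Because the three weights sum to 1, a white pixel keeps its full brightness, while pure green contributes more to the gray value than pure red or pure blue.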
[0034] A median filter with a 3×3 or 5×5 window is applied to the grayscale image for noise reduction: the gray value of the center pixel is replaced with the median of all pixel gray values within the window, which effectively removes sensor salt-and-pepper noise and flour particle noise, and the noise-reduced image is output.
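The median filtering step can be sketched in plain Python as follows. The border handling (using only the part of the window that lies inside the image) is an assumption made for this example; the text does not specify how image borders are treated.

```python
from statistics import median


def median_filter(image, k=3):
    """Apply a k x k median filter to a 2-D list of gray values.

    Border pixels use only the window cells that fall inside the image
    (an assumption of this sketch).
    """
    h, w = len(image), len(image[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = median(window)
    return out


# A single salt-noise pixel in a uniform patch is removed.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
denoised = median_filter(noisy, k=3)
```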
[0035] Simultaneously, pixel physical equivalent calibration is completed. The conversion relationship between pixels and physical size is obtained through a standard size calibration board: s=0.1mm / pixel, that is, a single pixel corresponds to an actual physical size of 0.1mm, which is used for the physical quantity conversion of all subsequent geometric parameters. This calibration is completed once during system deployment. If the camera installation height, lens, or field of view changes, the calibration needs to be re-executed and the s value updated.
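As a small worked example of the pixel equivalent, the sketch below converts pixel lengths and pixel areas to physical units using s = 0.1 mm/pixel. The helper names are hypothetical. Note that areas scale with s squared, so 100 pixels correspond to 1 mm².

```python
PIXEL_SIZE_MM = 0.1  # pixel equivalent s from the calibration step (mm per pixel)


def length_mm(length_px, s=PIXEL_SIZE_MM):
    """Convert a length in pixels to millimetres."""
    return length_px * s


def area_mm2(area_px, s=PIXEL_SIZE_MM):
    """Convert an area in pixels to square millimetres (scales with s squared)."""
    return area_px * s * s


crack_length = length_mm(100)   # a 100-pixel crack, about 10 mm
spot_area = area_mm2(100)       # a 100-pixel region, about 1 mm^2
```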
[0036] S102: Wavelet Transform and Frequency Domain Decomposition:
[0037] A discrete wavelet transform is performed on the denoised image using the Daubechies4 or Symlet4 wavelet basis, which possesses compact support, regularity, and good time-frequency localization, making it suitable for frequency domain decomposition of food surface images. The decomposition level is set to 3-4 levels; too low a level fails to effectively distinguish natural textures from defect edges, while too high a level loses subtle defect information. Each wavelet decomposition level generates one low-frequency approximation subband and three high-frequency detail subbands in three directions, namely the horizontal, vertical, and diagonal subbands. The low-frequency approximation subband represents the overall brightness, contour, and large uniform areas of the image; the first- and second-level high-frequency subbands correspond to subtle defect features such as cracks and the edges of scorch spots; the second- and third-level mid-frequency subbands correspond to natural texture features such as dough pores, powdering, and surface wrinkles. This achieves effective separation of the different frequency domain components and provides a foundation for the subsequent band-specific processing.
[0038] After wavelet decomposition, the coefficients in all detail subbands (mid frequency and high frequency) are uniformly denoted as C. C is the original wavelet coefficient at the corresponding position, which represents the intensity and direction of the detail response at that position in the corresponding frequency band. It can be positive or negative, and the larger the amplitude, the more significant the detail.
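To illustrate the structure of a multi-level decomposition (one approximation subband plus three signed detail subbands per level), the sketch below uses the Haar wavelet. This is a deliberately simpler stand-in for the Daubechies4 or Symlet4 bases named in the text, chosen only because it fits in a few lines of plain Python; the function names and the requirement that the image sides be divisible by 2 at every level are assumptions of the example.

```python
def haar_level(image):
    """One level of a 2-D Haar transform (height and width must be even).

    Returns the approximation subband and three detail subbands that
    capture change across columns, across rows, and along the diagonal.
    """
    approx, d_cols, d_rows, d_diag = [], [], [], []
    for y in range(0, len(image), 2):
        row_a, row_c, row_r, row_d = [], [], [], []
        for x in range(0, len(image[0]), 2):
            a, b = image[y][x], image[y][x + 1]
            c, d = image[y + 1][x], image[y + 1][x + 1]
            row_a.append((a + b + c + d) / 2)  # local average (low frequency)
            row_c.append((a - b + c - d) / 2)  # change across columns
            row_r.append((a + b - c - d) / 2)  # change across rows
            row_d.append((a - b - c + d) / 2)  # diagonal change
        approx.append(row_a)
        d_cols.append(row_c)
        d_rows.append(row_r)
        d_diag.append(row_d)
    return approx, (d_cols, d_rows, d_diag)


def haar_decompose(image, levels=3):
    """Repeat the one-level transform on the approximation subband."""
    approx, details = image, []
    for _ in range(levels):
        approx, bands = haar_level(approx)
        details.append(bands)  # details[0] is the finest level
    return approx, details


# A perfectly flat 8x8 image has no detail at any level.
flat = [[4] * 8 for _ in range(8)]
approx, details = haar_decompose(flat, levels=3)
step = haar_level([[1, 3], [1, 3]])  # brightness changes across columns only
```

The detail coefficients are signed, which matches the description of the coefficient C as a value that can be positive or negative.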
[0039] S103: Low-frequency subband illumination correction:
[0040] Contrast-limited adaptive histogram equalization (CLAHE) is performed on the low-frequency approximation subband to remap the pixel grayscale values within the subband to the full dynamic range of 0-255, correcting uneven lighting caused by the curvature of flour products and placement angle deviations. The slope of the grayscale mapping is limited during equalization to prevent excessive amplification of local noise, ultimately resulting in an enhanced low-frequency subband with uniform brightness distribution and no areas that are too dark or too bright. This ensures that subsequent defect detection is not affected by brightness variations.
[0041] S104: Mid-frequency subband texture suppression:
[0042] The mid-frequency detail subband mainly corresponds to the natural texture of flour products, and it needs to be specifically suppressed. Within the mid-frequency subband, a 7×7 sliding window is used to calculate the local variance Var region by region. The local variance reflects the degree of fluctuation of the wavelet coefficients within the window. The coefficients in the natural texture region fluctuate evenly and have small variance, while the coefficients in the defect edge region fluctuate violently and have large variance.
[0043] The texture discrimination threshold $T_{var}$ is determined by offline statistical analysis of the variance of the mid-frequency subband coefficients in images of defect-free flour products. To avoid negative attenuation factors that could cause abnormal magnification of texture areas, a non-negative piecewise attenuation strategy is adopted, in which $\alpha$ is the wavelet coefficient attenuation factor, whose value is strictly limited to [0, 1]. The coefficient update rule is: if $Var \le T_{var}$, the area is determined to be a natural texture region, and the coefficient is attenuated as $C' = \alpha \cdot C$; if $Var > T_{var}$, the region is identified as a suspected defect edge area, and the original coefficient is retained directly, $C' = C$. This avoids excessive suppression of defect information.
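The variance-gated suppression rule can be sketched as below. The attenuation factor is passed in as a fixed constant `alpha` and the threshold as `var_threshold`; both values in the example are hypothetical, since the text derives the threshold from offline statistics and its exact expression for the attenuation factor is not reproduced here. The example uses a 3-wide window instead of 7×7 only to keep the data small.

```python
def local_variance(coeffs, y, x, win=7):
    """Variance of the coefficients in a win x win window centred at (y, x)."""
    h, w = len(coeffs), len(coeffs[0])
    r = win // 2
    vals = [coeffs[j][i]
            for j in range(max(0, y - r), min(h, y + r + 1))
            for i in range(max(0, x - r), min(w, x + r + 1))]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def suppress_texture(coeffs, var_threshold, alpha, win=7):
    """Attenuate low-variance (texture) coefficients, keep high-variance ones."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    out = []
    for y, row in enumerate(coeffs):
        out_row = []
        for x, c in enumerate(row):
            if local_variance(coeffs, y, x, win) <= var_threshold:
                out_row.append(alpha * c)  # natural texture: attenuate
            else:
                out_row.append(c)          # suspected defect edge: keep as is
        out.append(out_row)
    return out


# One row of coefficients with a strong response near the right end.
band = [[1, 1, 1, 1, 20, 1]]
suppressed = suppress_texture(band, var_threshold=10.0, alpha=0.5, win=3)
```

Variances are always computed from the original coefficients, so attenuating one position does not change the decision for its neighbours.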
[0044] S105: High-frequency subband defect edge enhancement:
[0045] The high-frequency detail subband mainly corresponds to the edge features of defects such as cracks, scorch marks, and mold spots. A nonlinear gain function is used to differentiate the coefficients to achieve weak edge enhancement and noise suppression.
[0046] $C' = \mathrm{sign}(C) \cdot |C|^{\gamma}$
[0047] where $\mathrm{sign}(C)$ is the sign function, used to preserve the sign of the coefficient, which corresponds to the direction of edge brightness; $|C|$ is the absolute value of the coefficient, representing the magnitude of the edge response; $\gamma$ is the nonlinear enhancement exponent, which takes a value greater than 1 so as to amplify large-amplitude defect edge coefficients and suppress small-amplitude noise coefficients; and $C'$ is the enhanced wavelet coefficient, which yields clearer defect edge features.
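A minimal sketch of a sign-preserving power-law gain is given below, assuming the gain has the form sign(C)·|C|^γ, which is consistent with the three ingredients described (sign function, absolute value, exponent greater than 1). The default exponent 1.5 is a hypothetical value. Under this form, magnitudes above 1 are amplified and magnitudes below 1 are reduced, so the sketch also assumes the coefficients are scaled such that noise falls below 1.

```python
import math


def enhance(c, gamma=1.5):
    """Signed power-law gain: keeps the sign of c and raises |c| to gamma."""
    if gamma <= 1:
        raise ValueError("gamma must be greater than 1")
    return math.copysign(abs(c) ** gamma, c)


strong_edge = enhance(4.0)    # large magnitude is amplified
dark_edge = enhance(-4.0)     # sign (edge polarity) is preserved
weak_noise = enhance(0.25)    # small magnitude is suppressed
```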
[0048] S106: Inverse Wavelet Transform and Image Reconstruction
[0049] The corrected low-frequency subband, the mid-frequency subbands after texture suppression, and the high-frequency subbands after edge enhancement are reconstructed layer by layer according to the inverse process of wavelet decomposition to obtain the final enhanced image. The interference from natural textures in the image is significantly suppressed, and the edge features of defects such as cracks, scorch marks, mold spots, and dents are highlighted, which improves the image signal-to-noise ratio. This provides a stable and reliable input for the subsequent multi-scale morphological defect extraction.
[0050] S2: Multi-scale morphological candidate region extraction:
[0051] To address the diverse morphologies and large size range of defects in flour products, a multi-scale, multi-directional set of structural elements is constructed. Bright and dark defect responses are extracted through morphological top-hat and bottom-hat transformations, respectively. After normalization and maximum value fusion, a comprehensive defect response map is obtained. Adaptive threshold segmentation and connected component analysis are then used to remove noisy regions, resulting in a set of valid defect candidate regions that meets the requirements of subsequent feature calculations. The specific steps are as follows:
[0052] S201: Construction of Multi-Scale, Multi-Directional Structural Elements
[0053] The size of the structural element is determined based on the physical size and then converted into a pixel value using the pixel equivalent s=0.1mm / pixel, which avoids strong coupling with the image resolution and improves the versatility of the solution.
[0054] Let $B_i$ denote the i-th structuring element, where i is the structuring element index used to distinguish structuring elements of different sizes, shapes, and orientations. Three sets of structuring elements are constructed:
[0055] Linear structural elements: physical length covers 1-10mm, corresponding to pixel length 10-100 pixels, and direction includes 0°, 45°, 90°, and 135°, realizing omnidirectional detection of slender cracks;
[0056] Circular structural element: physical radius covers 0.5-4mm, corresponding to pixel radius 5-40 pixels, suitable for regional defects such as blocky scorch spots and mold spots;
[0057] Cross-shaped structural element: size range 3×3-9×9 pixels, used to detect small, clustered defects and dense flaws.
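The three families of structuring elements can be sketched as sets of (dy, dx) offsets around a central pixel, with physical sizes converted to pixels through s = 0.1 mm/pixel. The representation as offset sets, the helper names, and the choice to give line elements an odd number of samples so that they have a center are all assumptions of this example.

```python
import math

PIXEL_SIZE_MM = 0.1  # pixel equivalent s from the text (mm per pixel)


def mm_to_px(size_mm, s=PIXEL_SIZE_MM):
    """Convert a physical size to a whole number of pixels (at least 1)."""
    return max(1, round(size_mm / s))


def line_element(length_px, angle_deg):
    """Centered line of offsets (dy, dx); the image y axis points down."""
    half = length_px // 2
    theta = math.radians(angle_deg)
    return {(round(-t * math.sin(theta)), round(t * math.cos(theta)))
            for t in range(-half, half + 1)}


def disk_element(radius_px):
    """All offsets within the given radius of the center."""
    span = range(-radius_px, radius_px + 1)
    return {(dy, dx) for dy in span for dx in span
            if dy * dy + dx * dx <= radius_px * radius_px}


def cross_element(size_px):
    """Horizontal and vertical arms of a size_px x size_px cross."""
    r = size_px // 2
    return ({(0, d) for d in range(-r, r + 1)}
            | {(d, 0) for d in range(-r, r + 1)})


# A small bank: 1 mm lines in four directions, a 0.5 mm disk, a 3x3 cross.
bank = [line_element(mm_to_px(1.0), a) for a in (0, 45, 90, 135)]
bank.append(disk_element(mm_to_px(0.5)))
bank.append(cross_element(3))
```

Defining sizes in millimetres and converting at construction time means that only `PIXEL_SIZE_MM` has to change when the camera setup is recalibrated.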
[0058] S202: Multi-scale top-hat and bottom-hat transformation:
[0059] For each structuring element $B_i$, morphological top-hat and bottom-hat transformations are performed separately on the preprocessed image $I$ to achieve targeted extraction of different types of defects; specifically, this includes:
[0060] Top-hat transformation formula: $T_i(x, y) = I(x, y) - (I \circ B_i)(x, y)$;
[0061] where $I \circ B_i$ is the image opening operation, which is erosion followed by dilation and is used to eliminate bright areas in the image that are smaller than the structuring element. The top-hat transform ultimately extracts defects that are brighter than the surrounding areas, such as scorch marks, mold spots, and raised blemishes.
[0062] Bottom-hat transformation formula: $D_i(x, y) = (I \bullet B_i)(x, y) - I(x, y)$;
[0063] where $I \bullet B_i$ is the image closing operation, which is dilation followed by erosion and is used to eliminate dark areas in the image that are smaller than the structuring element. The bottom-hat transform ultimately extracts defects that are darker than the surrounding area, such as cracks, dents, and damage.
[0064] By iterating through all structural elements at all scales and in all directions, n sets of top-hat response maps and bottom-hat response maps are obtained.
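The top-hat and bottom-hat transforms can be sketched with flat structuring elements given as sets of (dy, dx) offsets. This is a plain-Python illustration under two assumptions: the element contains its own center (0, 0), and offsets that fall outside the image are ignored at the borders.

```python
def _morph(img, se, pick, sign):
    """Apply min (erosion) or max (dilation) over the offsets in se."""
    h, w = len(img), len(img[0])
    return [[pick(img[y + sign * dy][x + sign * dx] for dy, dx in se
                  if 0 <= y + sign * dy < h and 0 <= x + sign * dx < w)
             for x in range(w)] for y in range(h)]


def erode(img, se):
    return _morph(img, se, min, 1)


def dilate(img, se):
    return _morph(img, se, max, -1)  # uses the reflected element


def top_hat(img, se):
    """Image minus its opening: keeps small bright structures."""
    opened = dilate(erode(img, se), se)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(img, opened)]


def bottom_hat(img, se):
    """Closing minus image: keeps small dark structures."""
    closed = erode(dilate(img, se), se)
    return [[b - a for a, b in zip(r1, r2)] for r1, r2 in zip(img, closed)]


se = {(0, -1), (0, 0), (0, 1)}   # 3-pixel horizontal line
bright = [[5, 5, 9, 5, 5]]       # one bright pixel on a flat background
dark = [[5, 5, 1, 5, 5]]         # one dark pixel on a flat background
bright_resp = top_hat(bright, se)
dark_resp = bottom_hat(dark, se)
```

Each transform responds only to its own polarity: the bright pixel produces no bottom-hat response and the dark pixel produces no top-hat response.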
[0065] S203: Response map normalization and maximum value fusion:
[0066] To eliminate numerical differences between response maps at different scales and avoid subjectivity and instability introduced by manual intervention, all response maps are independently normalized, compressing the numerical range to [0, 1]. Subsequently, a maximum value fusion strategy is employed to retain the strongest defect response at each pixel location.
[0067] $R_{bright}(x, y) = \max_i \hat{T}_i(x, y)$
[0068] $R_{dark}(x, y) = \max_i \hat{D}_i(x, y)$
[0069] where $\hat{T}_i$ and $\hat{D}_i$ are the top-hat and bottom-hat response maps of the i-th structuring element after normalization to [0, 1]. The bright defect response $R_{bright}$ and the dark defect response $R_{dark}$ are superimposed to obtain the comprehensive defect response map $R(x, y)$. This fusion method maximizes the preservation of the response characteristics of various defects while suppressing random noise interference.
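The normalization and maximum-value fusion can be sketched as follows. Each response map is scaled to [0, 1] independently, and the strongest response is kept at every pixel. Combining the bright and dark maps with a second pixel-wise maximum is an assumption of this example; the text only states that the two responses are overlaid.

```python
def normalize(resp):
    """Min-max scale one response map to [0, 1] (all zeros if it is flat)."""
    lo = min(min(row) for row in resp)
    hi = max(max(row) for row in resp)
    if hi == lo:
        return [[0.0 for _ in row] for row in resp]
    return [[(v - lo) / (hi - lo) for v in row] for row in resp]


def max_fuse(maps):
    """Keep the largest value at each pixel across a list of maps."""
    h, w = len(maps[0]), len(maps[0][0])
    return [[max(m[y][x] for m in maps) for x in range(w)] for y in range(h)]


# Two response maps on very different numeric scales.
resp_a = [[0, 2], [4, 0]]
resp_b = [[0, 0], [0, 10]]
fused = max_fuse([normalize(resp_a), normalize(resp_b)])
```

Normalizing before fusion prevents the map with the larger raw range (here `resp_b`) from dominating every pixel.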
[0070] S204: Adaptive Threshold Segmentation and Candidate Region Generation
[0071] The Otsu adaptive thresholding algorithm is applied to the comprehensive defect response map R(x, y) to automatically calculate the optimal segmentation threshold T. This threshold is determined by maximizing the inter-class variance of gray levels between the foreground defect region and the background region, without manual intervention. A binarized candidate region map is then generated based on the threshold.
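[0072] $M(x, y) = 1$ if $R(x, y) > T$, and $M(x, y) = 0$ otherwise, where $M$ is the binarized candidate region map.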
[0073] The areas with a pixel value of 1 are suspected defect candidate areas, and the areas with a pixel value of 0 are background areas.
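A compact sketch of Otsu thresholding on a response map with values in [0, 1] is shown below. The 256-bin quantization, the choice to return the upper edge of the last background bin as the threshold, and the strict comparison used for binarization are assumptions of this example.

```python
def otsu_threshold(values, bins=256):
    """Return the threshold in [0, 1] that maximizes between-class variance."""
    hist = [0] * bins
    for v in values:
        hist[min(bins - 1, int(v * bins))] += 1
    total = len(values)
    sum_all = sum(i * hist[i] for i in range(bins))
    w0, sum0 = 0, 0.0
    best_var, best_bin = -1.0, 0
    for t in range(bins):
        w0 += hist[t]          # pixels in the background class (bins <= t)
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0        # pixels in the foreground class
        if w1 == 0:
            break
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_bin = var_between, t
    return (best_bin + 1) / bins


def binarize(resp, threshold):
    """1 marks a suspected defect pixel, 0 marks background."""
    return [[1 if v > threshold else 0 for v in row] for row in resp]


resp = [[0.1, 0.1, 0.9],
        [0.1, 0.9, 0.9],
        [0.1, 0.1, 0.1]]
T = otsu_threshold([v for row in resp for v in row])
mask = binarize(resp, T)
```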
[0074] S205: Connectivity Analysis and Noise Removal
[0075] 8-neighborhood connected component labeling is performed on the binarized candidate region map, and the pixel area of each region is calculated.
[0076] To ensure stable calculation of subsequent texture features such as LBP and gray-level co-occurrence matrix, a minimum effective area threshold of 100 pixels (corresponding to a physical area of 1 mm²) is set to eliminate tiny noisy connected components with an area less than 100 pixels.
[0077] For candidate regions with an area between 100 and 200 pixels and a thinness greater than 5, morphological dilation of 1-2 pixels is applied before feature extraction to ensure sufficient window for texture feature calculation; for extremely thin regions (thinness greater than 10), geometric features are used as the main feature and the weight of texture features is automatically reduced in the subsequent attention mechanism.
[0078] Finally, a set of valid candidate regions $\{C_k\}$, k = 1, 2, ..., m, is obtained, where m is the total number of valid candidate regions retained after connected component analysis, and each candidate region $C_k$ corresponds to a single potential defect location.
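The connectivity analysis and small-region removal can be sketched with a breadth-first search over the 8-neighborhood. The function name and the representation of a region as a list of (y, x) pixels are assumptions of this example; the default `min_area` follows the 100-pixel threshold in the text, and the demo uses a smaller value only to keep the mask tiny.

```python
from collections import deque


def connected_regions(mask, min_area=100):
    """Return 8-connected regions of 1-pixels with at least min_area pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            queue, pixels = deque([(y, x)]), []
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if ((dy or dx) and 0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(pixels) >= min_area:  # drop tiny noisy components
                regions.append(pixels)
    return regions


mask = [[1, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0],
        [1, 0, 0, 0]]
regions = connected_regions(mask, min_area=3)
```

The pixel at (1, 2) touches (0, 1) only diagonally, so it joins the first region under 8-connectivity, while the isolated pixel at (3, 0) is discarded as noise.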
[0079] S3: Multimodal Feature Extraction and Spatial Correlation Modeling
[0080] Three complementary features—morphological geometry, local texture, and gradient field topology—are extracted in parallel from each candidate defect region. These features are then normalized and concatenated to obtain a fused feature vector. A channel attention mechanism is used to weight and optimize these features, strengthening discriminative features and suppressing redundant features. Based on a dual-path network structure, local discrimination probabilities and reconstruction errors are calculated to obtain regional-level defect discrimination evidence. Simultaneously, a hypergraph model is constructed to characterize the high-order spatial relationships between candidate regions. The hypergraph Laplacian matrix and global correlation strength are calculated to form global spatial constraints, providing data support for subsequent joint optimization and judgment. The specific steps are as follows:
[0081] S301: Multimodal Feature Initialization and Extraction:
[0082] For each candidate region Ck, three complementary features—morphological geometry, local texture, and gradient field topology—are extracted in parallel to construct a high-dimensional feature vector, which comprehensively characterizes the attributes of the defect region.
[0083] (1) Extraction of morphological geometric features:
[0084] Morphological geometric features are used to describe the macroscopic contour shape of the defect region, specifically including:
[0085] Area A: Total number of pixels within the area;
[0086] Perimeter P: Total number of pixels at the outer boundary of the region;
[0087] Thinness E: The ratio of the long side to the short side of the minimum bounding rectangle, E = L / W, where L is the pixel length of the long side of the minimum bounding rectangle and W is the pixel length of the short side; the larger the thinness, the closer the defect shape is to a thin crack.
[0088] Roundness C: Characterizes the degree to which a region approximates a circle, $C = 4\pi A / P^2$. The value range is [0, 1], and the closer it is to 1, the closer the shape is to a circle;
[0089] Concavity / convexity H: The ratio of the actual area of the region to the area of its convex hull, $H = A / A_{hull}$, where $A_{hull}$ is the pixel area of the convex hull of the defect region; it reflects the irregularity and degree of concavity / convexity of the defect edge.
[0090] Compactness S: It characterizes the compactness of the region's shape.
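A few of the geometric features can be sketched as below for a region given as a list of (y, x) pixels. Three simplifications are assumptions of this example: thinness uses the axis-aligned bounding box rather than the minimum (rotated) bounding rectangle, the perimeter is measured as the number of exposed pixel edges rather than the count of boundary pixels, and roundness is taken as the common circularity measure 4πA/P², which stays within [0, 1] under this perimeter definition.

```python
import math


def geometric_features(pixels):
    """Area, perimeter, thinness and roundness of a pixel region."""
    region = set(pixels)
    area = len(region)
    # Count pixel edges that border a pixel outside the region.
    perimeter = sum((y + dy, x + dx) not in region
                    for (y, x) in region
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)))
    ys = [y for y, _ in region]
    xs = [x for _, x in region]
    side_a = max(ys) - min(ys) + 1
    side_b = max(xs) - min(xs) + 1
    thinness = max(side_a, side_b) / min(side_a, side_b)
    roundness = 4 * math.pi * area / perimeter ** 2
    return {"area": area, "perimeter": perimeter,
            "thinness": thinness, "roundness": roundness}


crack = geometric_features([(0, x) for x in range(10)])               # 1x10 line
spot = geometric_features([(y, x) for y in range(3) for x in range(3)])  # 3x3 block
```

The line-like region yields a high thinness and low roundness, while the compact block yields a thinness of 1 and a much higher roundness, which is the contrast these features are meant to capture.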
[0091] (2) Local texture feature extraction:
[0092] Local texture features are used to depict the gray-level distribution patterns within defect areas, distinguishing defects from natural textures:
[0093] Rotation-invariant local binary pattern (RI-LBP) features: Local binary patterns with radii of 1 and 2 and 8 sampling points are used to extract rotation-invariant texture histograms, which are robust to changes in illumination and product placement angle.
[0094] Gray-level co-occurrence matrix (GLCM) features: The GLCM is calculated with a pixel distance of 1 in four directions (0°, 45°, 90°, and 135°), and four statistical features (contrast, correlation, energy, and homogeneity) are extracted. If the number of pixels in a candidate region is less than 50, the GLCM features are not calculated; instead, the local gray-level histogram entropy and local gray-level variance of the region in the enhanced image are used as substitutes.
[0095] Local gray-scale statistical characteristics:
[0096] Regional grayscale mean: The average grayscale value of all pixels within the defective area, representing the overall brightness of the area;
[0097] Regional grayscale variance: The degree of dispersion of grayscale values within a defective region, reflecting the uniformity of texture within the region.
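As an illustration of the GLCM statistics, the sketch below builds a symmetric, normalized co-occurrence matrix at distance 1 for one direction and derives contrast, energy, and homogeneity. The definitions used are assumptions of this example, since several conventions exist: energy is taken as the sum of squared entries, homogeneity uses the weight 1/(1 + (i − j)²), and correlation is omitted for brevity. The patch is assumed to be already quantized to integer gray levels.

```python
# (dy, dx) offsets for the four directions at distance 1 (image y axis down).
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(patch, levels, angle=0):
    """Symmetric, normalized gray-level co-occurrence matrix."""
    dy, dx = OFFSETS[angle]
    h, w = len(patch), len(patch[0])
    counts = [[0] * levels for _ in range(levels)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                a, b = patch[y][x], patch[ny][nx]
                counts[a][b] += 1
                counts[b][a] += 1  # count each pair in both orders
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def glcm_features(p):
    """Contrast, energy and homogeneity of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


smooth = glcm_features(glcm([[1, 1], [1, 1]], levels=2))
checker = glcm_features(glcm([[0, 1], [1, 0]], levels=2))
```

A uniform patch gives zero contrast and maximal energy and homogeneity, whereas the checkerboard patch gives high contrast and lower energy and homogeneity.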
[0098] (3) Gradient field topological feature extraction:
[0099] Gradient field topological features are used to describe the spatial distribution and connectivity of defect edges:
[0100] Edge direction entropy: The gradient direction of all edge points in the defect area is statistically analyzed to calculate the information entropy of the direction distribution; the lower the entropy value, the more concentrated the edge direction (typically linear cracks), and the higher the entropy value, the more chaotic the edge direction (typically natural textures).
[0101] Edge connectivity: Edge points within a region are treated as graph nodes, and 8-neighborhood connections are treated as graph edges. The number of edge connected branches and the size of the largest connected subgraph are counted to distinguish between continuous defects and discrete noise points.
[0102] Skeleton topological features: Morphological refinement of the defect area yields the central skeleton of the region, and the number of skeleton endpoints and branch points is counted; cracks usually exhibit a small number of endpoints and simple branching structures, while natural textures present complex network branches.
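The edge direction entropy can be sketched as the Shannon entropy of a histogram of edge orientations. The use of 8 bins, the folding of angles into the range [0°, 180°), and the base-2 logarithm are assumptions of this example; the gradient directions are taken as given inputs in degrees.

```python
import math


def direction_entropy(angles_deg, bins=8):
    """Shannon entropy (in bits) of edge orientations folded into [0, 180)."""
    hist = [0] * bins
    for a in angles_deg:
        hist[int((a % 180) / 180 * bins) % bins] += 1
    n = len(angles_deg)
    return -sum((c / n) * math.log2(c / n) for c in hist if c)


# A straight crack: every edge point has nearly the same orientation.
crack_entropy = direction_entropy([30.0, 31.0, 29.5, 30.5, 30.0])
# Texture-like edges: orientations spread evenly over all 8 bins.
texture_entropy = direction_entropy([k * 22.5 for k in range(8)])
```

With 8 bins the entropy ranges from 0 bits (all edges in one orientation bin) to 3 bits (orientations spread uniformly), matching the low-entropy crack and high-entropy texture cases described above.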
[0103] (4) Feature normalization and splicing:
[0104] For the three types of features mentioned above, min-max normalization is applied to map each dimension of the features to the interval [0, 1]. The normalization formula is as follows: $f' = (f - f_{min}) / (f_{max} - f_{min})$, where f is the original value of the feature in the current dimension; $f_{min}$ is the minimum value of this feature dimension across all candidate regions; and $f_{max}$ is the maximum value of this feature dimension across all candidate regions.
[0105] Let the morphological geometric features be a $d_1$-dimensional vector $F_{geo}$, the texture features a $d_2$-dimensional vector $F_{tex}$, and the gradient topological features a $d_3$-dimensional vector $F_{top}$. The three normalized features are concatenated end to end to obtain the initial fused feature vector $F_0 = [F_{geo}, F_{tex}, F_{top}]$, with total feature dimension after fusion $d = d_1 + d_2 + d_3$;
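The normalization and concatenation can be sketched as follows, with one row per candidate region. Mapping a feature dimension that is constant across all regions to 0 is an assumption of this example, added to avoid division by zero; the feature values are made up for illustration.

```python
def min_max_normalize(features):
    """Scale each column (feature dimension) to [0, 1] across all regions."""
    columns = list(zip(*features))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)]
            for row in features]


def fuse(geo, tex, top):
    """Normalize each feature group, then concatenate per region."""
    groups = (min_max_normalize(geo), min_max_normalize(tex),
              min_max_normalize(top))
    return [g + t + p for g, t, p in zip(*groups)]


# Three candidate regions with 1 + 2 + 1 = 4 illustrative feature dimensions.
geo = [[1.0], [3.0], [5.0]]
tex = [[10.0, 0.0], [10.0, 2.0], [10.0, 4.0]]
top = [[2.0], [2.0], [6.0]]
fused = fuse(geo, tex, top)
```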
[0106] S302: Feature Attention Weighted Optimization
[0107] Suppose there are m candidate regions and the feature dimension is d; then the feature matrix is $F \in \mathbb{R}^{m \times d}$. A channel attention mechanism is employed to weight and optimize the fused features, thereby strengthening discriminative features and suppressing redundant features.
[0108] $F' = F \odot \sigma(\mathrm{FC}(\mathrm{GAP}(F)))$
[0109] where $\mathrm{GAP}(\cdot)$ performs global average pooling over the region dimension, compressing it to obtain a global description of the features; $\mathrm{FC}(\cdot)$ is a two-layer fully connected network that learns the feature channel weights; $\sigma(\cdot)$ is the Sigmoid activation function, which maps the weights to [0, 1]; $\odot$ denotes column-wise weighting of the feature matrix F; and the output $F'$ is the weighted and optimized discriminative feature matrix.
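The attention weighting can be sketched as a forward pass in plain Python: average over regions, pass the result through two fully connected layers, squash with a sigmoid, and scale each feature column. The weight matrices `W1` and `W2` below are placeholder values (in practice they would be learned), and the ReLU between the two layers and the omission of bias terms are assumptions of this example.

```python
import math


def channel_attention(F, W1, W2):
    """Return the column-reweighted feature matrix and the channel weights.

    F  : m x d feature matrix (one row per candidate region)
    W1 : hidden x d weights of the first fully connected layer
    W2 : d x hidden weights of the second fully connected layer
    """
    m, d = len(F), len(F[0])
    # Global average pooling over the region dimension.
    gap = [sum(F[k][j] for k in range(m)) / m for j in range(d)]
    # First layer with ReLU, second layer back to d channels.
    hidden = [max(0.0, sum(w * g for w, g in zip(row, gap))) for row in W1]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in W2]
    weights = [1.0 / (1.0 + math.exp(-z)) for z in logits]  # sigmoid
    weighted = [[F[k][j] * weights[j] for j in range(d)] for k in range(m)]
    return weighted, weights


F = [[1.0, 2.0],
     [3.0, 4.0]]
W1 = [[0.0, 0.0]]     # placeholder weights: one hidden unit
W2 = [[0.0], [0.0]]
F_weighted, w = channel_attention(F, W1, W2)
```

With all-zero placeholder weights every channel receives the neutral weight 0.5; trained weights would instead push informative channels toward 1 and redundant ones toward 0.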
[0110] S303: Dual-path adversarial local discrimination:
[0111] A dual-path discrimination network is constructed, and local class probabilities and confidence scores are output based on optimized features to form a regional defect discrimination criterion, as detailed below:
[0112] The discrimination path consists of three fully connected layers, with input features... Output the probability distribution of defect categories:
[0113]
[0114] Reconstruction path: consists of an autoencoder. The encoder network maps the input features to a low-dimensional hidden feature space, and the decoder network reconstructs the input features from the hidden features. The normalized reconstruction error between the input features and their reconstruction is then calculated.
[0115]
[0116] The reconstruction error r is normalized to [0, 1]. The smaller r is, the closer the region is to normal texture. The larger r is, the more likely it is to be a real defect.
[0117] Dimensionality-compatible local discrimination score:
[0118]
[0119] Where the error attenuation coefficient is 2.0; it ensures that the score always lies in the range [0, 1], without negative values or numerical anomalies.
[0120] Local confidence calculation:
[0121]
[0122] The local confidence lies in [0, 1]. A higher value indicates that the local discrimination result for the region is more reliable.
[0123] S304: Hypergraph Space Association Modeling
[0124] A hypergraph model is built to model higher-order spatial relationships between candidate regions:
[0125] Node set V: all valid candidate regions; the total number of nodes is m;
[0126] Hyperedge set E: hyperedges are constructed according to three types of rules, with M hyperedges in total, to achieve higher-order association modeling:
[0127] Spatial proximity hyperedges: regions with a spatial Euclidean distance of less than 20 pixels are grouped into the same hyperedge;
[0128] Orientation consistency hyperedges: let $\theta_i$ and $\theta_j$ be the main extension direction angles of the i-th and j-th slender candidate regions, and calculate the minimum angle difference between them; only when this difference is below the preset threshold are the slender regions, which then share a consistent direction, grouped into the same hyperedge;
[0129] Texture similarity hyperedges: regions with a texture feature Euclidean distance of less than 0.2 are grouped into the same hyperedge. Define $w_e$ as the weight coefficient of the e-th hyperedge, which characterizes the relative importance of the different hyperedge types; each hyperedge is assigned a weight of 0.8 for spatial proximity, 0.7 for orientation consistency, and 0.6 for texture similarity. The higher the correlation strength, the greater the weight.
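A minimal sketch of the hyperedge construction is given below for the spatial proximity and texture similarity rules. It assumes that "grouped into the same hyperedge" means taking connected groups of regions linked by the pairwise threshold, which is only one possible reading. The orientation rule is omitted because its angle threshold is not recoverable from the text, and the helper name and sample data are hypothetical.

```python
import math

def group_hyperedges(items, close_enough):
    """Group indices linked by a pairwise test; keep groups with at least 2 nodes."""
    parent = list(range(len(items)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if close_enough(items[i], items[j]):
                parent[find(i)] = find(j)
    groups = {}
    for i in range(len(items)):
        groups.setdefault(find(i), []).append(i)
    return sorted(g for g in groups.values() if len(g) > 1)

# Hypothetical candidate regions: centroids in pixels and 2-D texture descriptors.
centroids = [(10, 10), (22, 10), (200, 50), (205, 58)]
textures = [(0.10, 0.50), (0.90, 0.40), (0.15, 0.55), (0.80, 0.90)]

spatial = group_hyperedges(centroids, lambda a, b: math.dist(a, b) < 20)
texture = group_hyperedges(textures, lambda a, b: math.dist(a, b) < 0.2)

# Each hyperedge carries the weight of its rule type (0.8 spatial, 0.6 texture).
hyperedges = [(e, 0.8) for e in spatial] + [(e, 0.6) for e in texture]
```

With this data, regions 0 and 1 and regions 2 and 3 are spatial neighbors, while regions 0 and 2 are linked only by texture similarity, so the hypergraph connects all four regions through different relation types.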
[0130] S305: Global constraint generation:
[0131] (1) Construction of the hypergraph incidence matrix:
[0132] The hypergraph incidence matrix H is an m×M matrix. The element H(i, e) = 1 indicates that the i-th node belongs to the e-th hyperedge; otherwise H(i, e) = 0.
[0133] The hyperedge weight diagonal matrix W is an M×M diagonal matrix with diagonal elements W(e, e) = $w_e$, where $w_e$ is the weight of the e-th hyperedge; all off-diagonal elements are 0;
[0134] Hyperedge degree diagonal matrix $D_e$: an M×M diagonal matrix with diagonal elements $D_e(e, e) = \sum_{i=1}^{m} H(i, e)$, that is, the number of nodes contained in the e-th hyperedge;
[0135] Node degree diagonal matrix $D_v$:
[0136] An m×m diagonal matrix with diagonal elements $D_v(i, i) = \sum_{e=1}^{M} w_e H(i, e)$, that is, the sum of the weights of all hyperedges associated with the i-th node.
[0137] The Laplacian matrix of the hypergraph is then calculated from the incidence matrix, the hyperedge weights, and the two degree matrices.
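The matrices defined above can be assembled as in the following sketch. The Laplacian formula itself did not survive in the text, so the code assumes the widely used normalized form $I - D_v^{-1/2} H W D_e^{-1} H^{T} D_v^{-1/2}$, built from the incidence matrix H, the hyperedge weights, the hyperedge degrees, and the node degrees; this choice, the function name, and the sample hyperedges are assumptions.

```python
import math

def hypergraph_laplacian(m, hyperedges):
    """hyperedges: list of (node_indices, weight). Returns H, d_e, d_v, L."""
    M = len(hyperedges)
    # Incidence matrix: H[i][e] = 1 if node i belongs to hyperedge e.
    H = [[1 if i in nodes else 0 for nodes, _ in hyperedges] for i in range(m)]
    w = [weight for _, weight in hyperedges]
    # Hyperedge degree: number of nodes in each hyperedge.
    d_e = [sum(H[i][e] for i in range(m)) for e in range(M)]
    # Node degree: sum of weights of the hyperedges containing the node.
    d_v = [sum(w[e] * H[i][e] for e in range(M)) for i in range(m)]
    L = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            theta = 0.0
            if d_v[i] > 0 and d_v[j] > 0:
                theta = sum(H[i][e] * w[e] * H[j][e] / d_e[e]
                            for e in range(M)) / math.sqrt(d_v[i] * d_v[j])
            # Assumed normalized form: identity minus the normalized adjacency.
            L[i][j] = (1.0 if i == j else 0.0) - theta
    return H, d_e, d_v, L

# Hypothetical hypergraph with 4 nodes and 3 hyperedges.
edges = [([0, 1], 0.8), ([2, 3], 0.8), ([0, 2], 0.6)]
H, d_e, d_v, L = hypergraph_laplacian(4, edges)
```

For this normalized form, the vector of square roots of the node degrees lies in the null space of L, which gives a quick sanity check on an implementation.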
[0138] (2) Calculation of global correlation strength:
[0139] The global association strength of each candidate region k is calculated from the set of hyperedges containing node k and the number of nodes contained in each such hyperedge e. It reflects the spatial correlation strength between region k and its surrounding regions; the larger the value, the stronger the spatial constraint effect.
[0140] S4: Local-Global Joint Optimization and Defect Assessment
[0141] A joint energy function integrating local discriminative data terms and hypergraph global smoothing terms is constructed. Adaptive weights are introduced to dynamically balance local feature evidence and global spatial constraints. A graph cut algorithm is used to minimize the energy and obtain the globally optimal defect label assignment result. For regions determined to be real defects, sub-pixel precision boundary refinement is performed using a level set active contour model to improve the defect contour localization accuracy. The specific steps are as follows:
[0142] S401: Construction of the local-global joint energy function:
[0143] Construct a joint energy function that includes local data terms and a global smoothing term:
[0144]
[0145] Where: L is the set of defect category labels of all candidate regions;
[0146] $l_k$ is the defect category label of the k-th candidate region; its value can be real crack, real scorch mark, real mold spot, or pseudo-defect;
[0147] $l_i$ and $l_j$ are the defect category labels of the i-th and j-th candidate regions within the same hyperedge, respectively;
[0148] The indicator function takes the value 1 when $l_i$ and $l_j$ belong to different categories and 0 when they belong to the same category; it penalizes inconsistent categories in spatially associated regions;
[0149] Local data term: the cost of classifying the k-th candidate region as category $l_k$; the higher the probability of that category, the lower the energy cost;
[0150] Adaptive weighting coefficient: the weight is adjusted in two ways: the higher the local discrimination confidence, the smaller the global constraint weight, and the higher the spatial correlation strength between regions, the larger the global constraint weight. Discrimination results in high-confidence regions are thus retained, while spatial context corrects the classification in low-confidence regions.
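Because the energy formula itself is missing from the text, the following is only a sketch of one form consistent with the description. It assumes a data term of negative log probability and an adaptive weight of (1 − confidence) × association strength, averaged over each region pair; both forms, and all names and numbers, are assumptions rather than the application's formula.

```python
import math
from itertools import combinations

def joint_energy(labels, probs, conf, strength, hyperedges):
    """Local data terms plus hyperedge-weighted penalties for label disagreement."""
    # Data term (assumed form): a higher class probability gives a lower cost.
    data = sum(-math.log(max(probs[k][labels[k]], 1e-12))
               for k in range(len(labels)))
    smooth = 0.0
    for nodes, w_e in hyperedges:
        for i, j in combinations(nodes, 2):
            if labels[i] != labels[j]:  # indicator function
                # Assumed adaptive weight: shrinks with confidence,
                # grows with global association strength.
                lam = 0.5 * ((1 - conf[i]) * strength[i]
                             + (1 - conf[j]) * strength[j])
                smooth += w_e * lam
    return data + smooth

# Hypothetical case: region 0 is a confident crack, region 1 is uncertain.
probs = [{"crack": 0.9, "pseudo": 0.1}, {"crack": 0.4, "pseudo": 0.6}]
conf = [0.9, 0.2]
strength = [2.0, 2.0]
edges = [([0, 1], 0.8)]

e_same = joint_energy(["crack", "crack"], probs, conf, strength, edges)
e_mixed = joint_energy(["crack", "pseudo"], probs, conf, strength, edges)
```

Here the consistent labeling has the lower energy, so the low-confidence region 1 would be pulled toward the label of its confident neighbor, which is the behavior the adaptive weight is described as producing.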
[0151] S402: Global optimization solution of graph cut algorithm:
[0152] The joint energy function is minimized using the maximum flow / minimum cut graph cut algorithm, and the steps are as follows:
[0153] Construct a network flow graph, where source nodes represent real defects and sink nodes represent pseudo-defects;
[0154] Each candidate region is mapped to a node in the graph;
[0155] The connection weights between each node and the source and sink nodes are set to the local data terms;
[0156] The connection weights between adjacent nodes are set to the hyperedge-weighted smoothing term;
[0157] The maximum flow is computed to obtain the minimum cut, and the globally optimal label assignment is output, completing defect type determination and pseudo-defect elimination.
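The steps above can be sketched for the binary real-defect versus pseudo-defect split with a small maximum flow routine. The sketch uses a basic breadth-first augmenting path search (Edmonds-Karp) on a dense capacity matrix, which is an illustrative choice, not necessarily the solver used in the application; the costs and pairwise weights are hypothetical.

```python
from collections import deque

def min_cut_labels(cost_real, cost_pseudo, pairwise):
    """Label each region 'real' (source side) or 'pseudo' (sink side) by min cut."""
    n = len(cost_real)
    s, t = n, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for k in range(n):
        cap[s][k] = cost_pseudo[k]  # cut when k ends on the sink (pseudo) side
        cap[k][t] = cost_real[k]    # cut when k stays on the source (real) side
    for (i, j), w in pairwise.items():
        cap[i][j] += w              # smoothing term between associated regions
        cap[j][i] += w

    def bfs():
        # Breadth-first search in the residual graph; returns parent pointers.
        parent = [-1] * (n + 2)
        parent[s] = s
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = bfs()
        if parent[t] == -1:
            break                   # no augmenting path left: flow is maximal
        v, flow = t, float("inf")
        while v != s:               # find the bottleneck capacity on the path
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:               # push flow and update residual capacities
            cap[parent[v]][v] -= flow
            cap[v][parent[v]] += flow
            v = parent[v]
    # Nodes still reachable from the source lie on the real-defect side.
    return ["real" if parent[k] != -1 else "pseudo" for k in range(n)]

# Hypothetical data terms for two regions and one pairwise smoothing weight.
with_context = min_cut_labels([0.1, 0.9], [2.3, 0.5], {(0, 1): 0.7})
without_context = min_cut_labels([0.1, 0.9], [2.3, 0.5], {})
```

In this example region 1 would be labeled a pseudo-defect on its local evidence alone, but the smoothing edge to the confident region 0 makes the all-real labeling the cheaper cut.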
[0158] S403: Refined Reconstruction of Defect Boundaries
[0159] For regions identified as genuine defects, an edge-guided active contour model (level set method) is used for sub-pixel precision boundary reconstruction.
[0160]
[0161] Where:
[0162] ϕ is the level set function, used to characterize the spatial location of the defect contour;
[0163] δ(ϕ) is the Dirac delta function, which acts near the contour to constrain the contour length and avoid excessive fragmentation;
[0164] Length regularization coefficient: controls the smoothness of the contour;
[0165] Edge driving coefficient: controls the strength with which the contour converges towards high-gradient edges in the image;
[0166] The Heaviside step function of ϕ is used to distinguish between the regions inside and outside the contour;
[0167] Image gradient terms: the magnitude of the image gradient, and the maximum gradient magnitude over the entire image;
[0168] Ω represents the image computation domain;
[0169] The energy function of the level set is minimized iteratively so that the contour converges accurately onto the defect boundary.
[0170] The binary defect region output by the graph cut is converted into a signed distance function (SDF) and used as the initial level set contour ϕ0, whose value is negative inside the region, positive outside, and zero at the boundary. Iteration stops after 50–100 iterations or once the energy change is less than $10^{-3}$, and the final output is a sub-pixel precision defect contour.
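The initialization described above can be illustrated with a brute-force signed distance function on a small binary mask. The half-pixel offset, which places the zero level between inside and outside pixels, is an assumption, and the quadratic search is only suitable for tiny examples; a real implementation would use a distance transform.

```python
import math

def signed_distance(mask):
    """Negative inside the region, positive outside (brute force, small masks only)."""
    rows, cols = len(mask), len(mask[0])
    inside = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    outside = [(r, c) for r in range(rows) for c in range(cols) if not mask[r][c]]
    # The mask must contain both inside and outside pixels.
    phi = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            others = outside if mask[r][c] else inside
            # Distance to the nearest pixel of the opposite class, shifted by
            # half a pixel so the zero level falls on the boundary (assumption).
            dist = min(math.dist((r, c), p) for p in others) - 0.5
            phi[r][c] = -dist if mask[r][c] else dist
    return phi

# Hypothetical 5x5 mask with a 3x3 defect region in the middle.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
phi0 = signed_distance(mask)
```

The resulting array can serve as ϕ0 for the iterative energy minimization, which is then stopped by the iteration count or energy change criterion given in the text.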
[0171] S5: Defect Quantification and Output:
[0172] The pixel-level geometric parameters of the defective area are converted into actual physical dimensions. Combined with preset thresholds and region division rules, a defect level assessment is completed, outputting a complete inspection report containing defect location, category, quantification parameters, level, and overall product judgment results. This provides a standardized basis for production line quality control. The specific steps are as follows:
[0173] S501: Physical Quantification of Defect Geometric Parameters:
[0174] Pixel-level geometric parameters are converted into actual physical dimensions to eliminate the influence of image resolution and shooting distance. The definitions and calculation methods of each parameter are as follows:
[0175] $A_{pix}$: the pixel area of the defect region, i.e., the total number of pixels contained in the defect connected region;
[0176] $A_{phy}$: the actual physical area of the defect, calculated as $A_{phy} = A_{pix} \cdot s^{2}$;
[0177] $L_{pix}$: the pixel length of the long side of the minimum bounding rectangle of the defect;
[0178] $L_{phy}$: the actual physical length of the long side of the defect, calculated as $L_{phy} = L_{pix} \cdot s$;
[0179] $W_{pix}$: the pixel length of the short side of the minimum bounding rectangle of the defect;
[0180] $W_{phy}$: the actual physical length of the short side of the defect, calculated as $W_{phy} = W_{pix} \cdot s$;
[0181] s: the physical equivalent of a pixel, in mm/pixel, with a value of 0.1 mm/pixel.
[0182] Dimensionless shape parameters such as roundness, slenderness, concavity, and compactness are only related to the region proportion and can be directly calculated at the pixel level without the need for physical scale conversion.
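A short sketch of the conversion follows: areas scale with the square of the pixel equivalent and lengths scale linearly. The function and key names are hypothetical; the value 0.1 mm/pixel is the one stated in the text.

```python
PIXEL_SIZE_MM = 0.1  # s: physical equivalent of one pixel, in mm/pixel

def to_physical(area_px, long_px, short_px, s=PIXEL_SIZE_MM):
    """Convert pixel-level geometry to physical units."""
    return {
        "area_mm2": area_px * s ** 2,   # area scales with s squared
        "length_mm": long_px * s,       # long side of the bounding rectangle
        "width_mm": short_px * s,       # short side of the bounding rectangle
    }

# Hypothetical defect: 1500 pixels in area, 80 x 25 pixel bounding rectangle.
q = to_physical(area_px=1500, long_px=80, short_px=25)
```

For this example the defect measures roughly 15 mm² in area, 8 mm in length, and 2.5 mm in width, while ratios such as roundness are unchanged by the conversion.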
[0183] S502: Defect Level Assessment:
[0184] Based on preset size thresholds and regional distribution rules, defects are divided into three levels:
[0185] Minor defects: the physical area of the defect is less than 5 mm², or, for elongated defects, the physical length is less than 3 mm; the impact on product quality is negligible;
[0186] Moderate defects: the physical area of the defect is between 5 mm² and 20 mm², or the number of defects of the same type on a single product exceeds 3; the impact on the appearance and quality of the product is significant;
[0187] Severe defects: the physical area of the defect is greater than 20 mm², or the defect is located in the key central area of the product. The key central area is defined as a rectangle centered on the product's center of mass that spans 40% of the product's total length in the length direction and 40% of its total width in the width direction. Defects located within this area are graded as severe.
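The grading rules can be sketched as a small decision function. The order of precedence (severe first, then moderate, then minor) and the treatment of the boundary values 5 mm² and 20 mm² are assumptions, and the elongated-length rule is left out because its interaction with the area thresholds is not spelled out in the text.

```python
def in_central_area(cx, cy, mass_center, length, width):
    """True if (cx, cy) lies in the rectangle covering 40% of length and width."""
    mx, my = mass_center
    # 40% of the full extent means 20% on each side of the center of mass.
    return abs(cx - mx) <= 0.2 * length and abs(cy - my) <= 0.2 * width

def grade_defect(area_mm2, in_center, same_type_count):
    """Assign one of three levels; severe takes precedence (assumption)."""
    if area_mm2 > 20 or in_center:
        return "severe"
    if area_mm2 >= 5 or same_type_count > 3:
        return "moderate"
    return "minor"

# Hypothetical product of 100 mm x 40 mm with its center of mass at (50, 20).
center_hit = in_central_area(55, 22, (50, 20), 100, 40)
level = grade_defect(area_mm2=3.0, in_center=center_hit, same_type_count=1)
```

In this example a small 3 mm² defect is still graded as severe because it falls inside the key central area.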
[0188] S503: Complete output of test results:
[0189] Output a detection report containing all key information:
[0190] Defect location information: physical coordinates of the defect center and physical coordinates of the defect's minimum bounding rectangle;
[0191] Defect category information: The type of defect, including cracks, scorch marks, mold spots, dents, etc.;
[0192] Defect quantification information: physical area, physical length, physical width of the defect, as well as shape parameters such as roundness and slenderness;
[0193] Defect rating information: Defect rating level, including minor defect, moderate defect, and severe defect;
[0194] Overall product assessment: The product is deemed qualified or unqualified. For unqualified products, the reasons for non-compliance are indicated, including the type of defect, the number of defects, the level of defects, and the distribution of key areas.
[0195] In addition: the accompanying drawings of the embodiments disclosed in this invention involve only the structures relevant to those embodiments; other structures may follow common designs. Where no conflict arises, features of the same embodiment and of different embodiments of this invention may be combined with each other.
[0196] In conclusion, the above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.
Claims
1. A method for detecting appearance defects in flour products based on image recognition, characterized in that it comprises:
S1: Image acquisition and frequency domain enhancement preprocessing: images of the surface of flour products are acquired, and grayscale conversion and noise reduction are performed in sequence. The image is decomposed into different frequency domain sub-bands through discrete wavelet transform, each frequency domain sub-band is processed separately, and the preprocessed image is then reconstructed through inverse wavelet transform.
S2: Multi-scale morphological candidate region extraction: based on the preprocessed image, multi-scale and multi-directional structural elements are constructed. Bright and dark defect responses are extracted by top-hat and bottom-hat transformations respectively. After the multi-scale responses are normalized, they are fused by taking the maximum value to obtain a comprehensive defect response map. Noise is then removed by adaptive threshold segmentation and connected component analysis to obtain defect candidate regions.
S3: Multimodal feature extraction and spatial correlation modeling: for each defect candidate region, multimodal features covering morphological geometry, local texture, and gradient field topology are extracted. After weighted optimization by a channel attention mechanism, a dual-path network is used to obtain region-level local discrimination results. At the same time, a hypergraph model is constructed to characterize the spatial correlation between candidate regions and form global spatial constraints.
S4: Local-global joint optimization and defect judgment: a joint energy function is constructed by combining local discriminative data terms and hypergraph global smoothing terms. Local feature evidence and global spatial constraints are balanced by adaptive weights. A graph cut algorithm is used to minimize the energy and complete the globally optimal defect judgment. A level set model is used to perform sub-pixel precision refined reconstruction of the real defect boundary.
S5: Defect Quantification and Output: Converts the pixel-level geometric parameters of defects into actual physical dimensions, completes the defect classification assessment based on preset level thresholds, and finally outputs an inspection report containing defect information and overall product judgment results.
2. The method for detecting appearance defects in flour products based on image recognition according to claim 1, characterized in that: S1 includes: Images of flour products are acquired under standardized illumination conditions. The images are then processed in sequence by grayscale conversion and median filtering for noise reduction, and the correspondence between pixels and physical size is calibrated. A multi-level discrete wavelet transform is performed on the denoised images using a specified wavelet basis, decomposing them into low-frequency approximation sub-bands and high-frequency detail sub-bands in multiple directions. For the low-frequency approximation sub-bands, contrast-limited adaptive histogram equalization is used to correct uneven illumination. For the mid-frequency detail sub-bands, natural texture suppression is performed based on local variance. For the high-frequency detail sub-bands, a nonlinear gain method is used to enhance defect edge features. Finally, the processed sub-bands are reconstructed using the inverse wavelet transform to obtain the preprocessed image.
3. The method for detecting appearance defects in flour products based on image recognition according to claim 1, characterized in that: The construction of multi-scale, multi-directional structural elements includes: Based on pixel physical equivalents, physical dimensions are converted into corresponding pixel scales, and multi-directional linear structural elements, disk-shaped structural elements, and cross-shaped structural elements are constructed respectively, forming a multi-scale, multi-directional set of structural elements covering different defect sizes and shapes.
4. The method for detecting appearance defects in flour products based on image recognition according to claim 1, characterized in that: The comprehensive defect response map is obtained as follows: Based on the multi-scale, multi-directional structural elements, top-hat transformation and bottom-hat transformation are performed on the preprocessed image respectively. The top-hat transformation includes performing an opening operation on the preprocessed image, subtracting the opening operation result from the original image, and extracting the defect response where the brightness is higher than the background. The bottom-hat transformation includes performing a closing operation on the preprocessed image, subtracting the original image from the closing operation result, and extracting the defect response where the brightness is lower than the background. The bright defect responses and dark defect responses at multiple scales are normalized and then fused by taking the maximum value. The two types of fused responses are then superimposed to obtain the comprehensive defect response map.
5. The method for detecting appearance defects in flour products based on image recognition according to claim 1, characterized in that: The extracted morphological geometry, local texture, and gradient field topological multimodal features include: The area, perimeter, slenderness, roundness, concavity, and compactness of the defect candidate region are extracted as morphological geometric features; rotation-invariant local binary pattern features, gray-level co-occurrence matrix features, and local gray-level statistical features are extracted as local texture features; edge direction entropy, edge connectivity, and skeleton topology features are extracted as gradient field topology features; the above three types of features are normalized and then concatenated to form an initial fused feature vector.
6. The method for detecting appearance defects in flour products based on image recognition according to claim 5, characterized in that: After weighted optimization via the channel attention mechanism, a dual-path network is used to obtain regional-level local discrimination results, including: Global average pooling is used to extract global feature descriptions from the initial fused feature vector. The weights of each feature channel are learned through a two-layer fully connected network. After the weights are mapped to a reasonable range by an activation function, the initial fused feature vector is optimized by channel weighting to obtain the optimized fused features. A dual-path network is used to process the optimized fused features. One path outputs the defect category probability through a fully connected network, and the other path reconstructs the fused features through an autoencoder and calculates the normalized reconstruction error. The defect category probability and the normalized reconstruction error are combined to obtain the regional defect discrimination score and local confidence, which is the regional local discrimination result.
7. The method for detecting appearance defects in flour products based on image recognition according to claim 6, characterized in that: The construction of the hypergraph model to characterize the spatial relationships between candidate regions includes: Using all candidate defect regions as the node set of the hypergraph, a hyperedge set is constructed according to preset rules to form a hypergraph model. The construction rules of the hyperedge set include three categories: candidate regions whose spatial Euclidean distance meets the preset condition are grouped into the same hyperedge to form a spatially adjacent hyperedge; slender candidate regions whose main extension direction angle difference meets the preset condition are grouped into the same hyperedge to form a directionally consistent hyperedge; and candidate regions whose texture feature similarity meets the preset condition are grouped into the same hyperedge to form a texture similarity hyperedge. Corresponding weights are assigned to each type of hyperedge to characterize the spatial relationship between candidate regions through the hypergraph model.
8. The method for detecting appearance defects in flour products based on image recognition according to claim 7, characterized in that: The construction of the joint energy function includes: The local discriminant data terms of each candidate region are used as the basic terms of the energy function, and the global smoothing terms corresponding to each hyperedge in the hypergraph are used as the constraint terms. Combined with the adaptive weights determined by the local confidence and the global correlation strength, the local data terms and the global smoothing terms are fused to construct a joint energy function.
9. The method for detecting appearance defects in flour products based on image recognition according to claim 1, characterized in that: The use of a level set model to perform sub-pixel precision refined reconstruction of the real defect boundary includes: The determined real defect region is transformed into a signed distance function as the initial level set contour. A level set energy function is constructed, which includes a contour length regularization term and an edge driving term. The level set energy function is iteratively optimized until the energy change is less than a preset threshold or the preset number of iterations is reached. Finally, a refined defect contour with sub-pixel precision is output.
10. The method for detecting appearance defects in flour products based on image recognition according to claim 1, characterized in that: S5 includes: The pixel-level geometric parameters of the defect candidate area are converted into actual physical dimensions according to the preset pixel-to-physical-size conversion relationship; each defect is graded according to the preset defect area, length, and regional distribution thresholds and assigned to one of three defect levels: minor, moderate, or severe; the location, category, physical quantification parameters, and grade information of each defect are summarized, a qualified or unqualified judgment is given in combination with the overall defect situation of the product, and finally a complete inspection report containing all the above information is output.