Intelligent detection method and system for surface defects of pump body castings
By generating a texture complexity index and applying piecewise modulation to brightness residuals, combined with weighted multi-scale feature fusion and channel compression, the method addresses texture misjudgment and boundary ambiguity in the surface defect detection of pump body castings, aiming at high-precision and high-efficiency detection.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- WUHAN SPECIAL IND PUMP FACTORY
- Filing Date
- 2026-05-28
- Publication Date
- 2026-06-23
AI Technical Summary
Existing technologies struggle to distinguish the gradient differences between surface texture and subtle defects in pump body casting surface defect detection. Furthermore, the lack of a dynamic weight allocation mechanism based on image content during multi-scale feature fusion results in limited detection accuracy and blurred boundary positioning.
The texture complexity index is generated by calculating the local gradient magnitude and direction consistency. Combined with the brightness residual, segmented suppression and gain compensation are performed to construct the texture constraint map. Then, the multi-scale features are weighted and fused and the channels are grouped and compressed to generate defect discrimination features.
It effectively distinguishes between surface texture and minor defects, improving detection accuracy and boundary positioning accuracy while ensuring inference efficiency, thus solving the problem of balancing detection accuracy and efficiency in existing technologies.
Figure CN122265291A_ABST
Description
Technical Field
[0001] This invention relates to the fields of machine vision and industrial non-destructive testing. More specifically, it relates to an intelligent detection method and system for surface defects in pump body castings.
Background Technology
[0002] Pump body castings are core components of fluid transport equipment, and their surface quality directly determines the equipment's sealing performance, pressure bearing capacity, and service life. During casting and subsequent machining, pump body castings are prone to various defects such as porosity, sand holes, cracks, and scratches. These defects vary in shape and size, and the casting surface itself carries casting and machining textures, which poses significant challenges to automated inspection. Traditional manual visual inspection is inefficient and labor-intensive, and its results are easily affected by the inspector's experience and condition, making it difficult to meet the demands of modern industrial production for high-volume, high-precision inspection.
[0003] In recent years, detection technologies based on machine vision and deep learning have gradually become the mainstream technology in the field of casting surface defect detection. Among them, the technical approach of combining multi-scale feature pyramid networks with lightweight convolutional models has been widely used. This type of technology extracts image features at different resolution levels to cope with changes in defect scale and introduces basic attention mechanisms or channel weighting to suppress background interference, improving the network's focus on defect areas while maintaining model inference efficiency. For example, Chinese patent application CN118196059A discloses a lightweight detection method for casting surface defects. This method uses EfficientViT_b0 as the backbone network to extract multi-scale feature maps and enhances the multi-path feature extraction capability within a single scale through a CIRB module. This reduces the number of model parameters while improving multi-scale defect detection performance and can adapt to the deployment requirements of edge computing devices.
[0004] However, the above technical solutions still have the following shortcomings: First, in the early stages of image processing and feature input, the gradient differences between surface texture and weak defects are not analyzed and distinguished, making the low-level features susceptible to high-frequency noise contamination. This can easily lead to normal textures being misjudged as defects, or weak real defects being submerged in background noise. Second, when performing multi-scale feature fusion, most methods use simple addition or fixed-weight splicing, without fully exploring the difference response of the central neighborhood and the deep structural representation features of directional segmentation statistics. This makes it impossible to achieve the allocation of weights within the pyramid layer and between adjacent layers based on image content, resulting in the loss of defect discrimination information during cross-layer transmission. Finally, no channel grouping compression and rearrangement mechanism for multi-scale features is set before outputting the detection results. This fails to extract discrimination features while taking into account computational lightweighting, resulting in a large deviation between the generated position parameters and the boundary correction amount, causing problems such as limited detection accuracy and blurred defect boundary positioning.
[0005] Therefore, there is an urgent need to develop a surface defect detection technology for pump body castings that can effectively distinguish between surface texture and weak defects, achieve adaptive multi-scale feature fusion, and simultaneously balance detection accuracy and inference efficiency.
Summary of the Invention
[0006] To address the problems of existing detection methods failing to distinguish gradient differences between surface texture and subtle defects during feature extraction, lacking a dynamic weight allocation mechanism based on image content during multi-scale fusion, and failing to perform channel grouping compression and rearrangement of multi-scale features before output, resulting in loss of defect discrimination information and limited boundary positioning accuracy, this invention provides the following technical solution.
[0007] In a first aspect, the present invention provides an intelligent detection method for surface defects of pump body castings, comprising: acquiring a surface image of the pump body casting; calculating the absolute values of local gradient magnitude, directional consistency, and brightness residual; linearly fusing the normalized local gradient magnitude and directional consistency with preset weight coefficients to generate a texture complexity index; and using the texture complexity index to perform segmented suppression and gain compensation on the absolute value of the brightness residual to obtain a texture constraint map; inputting the surface image of the pump body casting into a backbone convolutional network to extract multi-scale features; combining the texture constraint map to perform region-weighted sampling of each layer of multi-scale features to extract multi-layer structure representation features; based on the texture constraint map and the multi-layer structure representation features, calculating the intra-layer distribution coefficient of each pyramid level and the inter-layer distribution coefficient between adjacent layers; using the intra-layer distribution coefficient and the inter-layer distribution coefficient to perform weighted fusion of multi-scale features to obtain a defect candidate feature map; performing channel grouping compression and rearrangement on the defect candidate feature map to obtain defect discrimination features; and inputting the defect discrimination features into a detection head to generate a surface defect detection result for the pump body casting.
[0008] This invention constructs a texture complexity index by calculating image gradient and directional features, and generates a texture constraint map by segmenting the brightness residual. The principle behind this is to distinguish normal textures from real defects, thereby suppressing background interference and enhancing weak defect signals. In the feature extraction stage, multi-scale features extracted by the network are combined with the constraint map for weighted sampling to extract multi-layer structural representation features. Then, the allocation coefficients within and between pyramid layers are dynamically calculated for weighted fusion. This mechanism avoids the loss of subtle defects during cross-layer transmission. Finally, channel grouping compression and rearrangement are performed on the feature map. This reduces computational load, ensures inference efficiency, and achieves interactive fusion of cross-group channel information, improving defect classification accuracy and boundary localization precision.
[0009] Preferably, before calculating the local gradient magnitude, the method further includes: mapping the pixel values of each channel of the pump casting surface image to a preset interval through linear transformation to complete grayscale normalization preprocessing; setting a bilateral filtering window, calculating the spatial distance weight and grayscale difference weight between the center pixel and neighboring pixels respectively within the filtering window; multiplying the spatial distance weight and the grayscale difference weight to obtain a comprehensive filtering weight, and using the comprehensive filtering weight to perform a weighted summation on the neighboring pixels within the filtering window to output a smooth filtered image that retains edge features.
[0010] First, a linear transformation is used to map pixel values to a uniform range, eliminating baseline drift caused by non-uniform lighting. Then, a bilateral filtering mechanism is employed, comprehensively considering the spatial distance and grayscale differences between pixels to calculate weights and perform weighted updates. The principle behind this processing method is to increase smoothness in smooth regions and decrease filtering intensity in edge regions, effectively filtering out high-frequency random noise interference on the casting surface while fully preserving the true geometric gradient features of the defect contour, laying a solid image foundation for subsequent defect feature extraction.
[0011] Preferably, generating a texture complexity index based on the local gradient magnitude and the directional consistency includes: using a two-dimensional edge detection operator to calculate the horizontal and vertical gradients of image pixels respectively, and obtaining the local gradient magnitude by taking the square root of the sum of the squares of the two gradients; constructing a structure tensor matrix in the neighborhood of each pixel, calculating the eigenvalues of the structure tensor matrix, and using the difference between 1 and the ratio of the smaller eigenvalue to the sum of the larger eigenvalue and a preset minimal constant as the directional consistency; and linearly fusing the local gradient magnitude and the directional consistency according to a preset weighting coefficient to generate a texture complexity index representing the texture complexity of a local region.
[0012] Preferably, the calculation of the absolute value of the brightness residual includes: acquiring an image of the surface of the pump body casting, and obtaining a background image by applying a large kernel mean filter to the image of the surface of the pump body casting; the absolute value of the difference between the two is used as the absolute value of the basic brightness residual.
[0013] By adopting the above scheme, high-frequency signals that deviate from the basic brightness can be effectively extracted, highlighting the texture undulations and potential abnormal defects on the surface of the casting, providing reliable basic data support for further modulation and identification of real weak defects by combining the texture complexity index.
[0014] Preferably, the absolute value of the brightness residual is segmented and suppressed and gain-compensated using the texture complexity index to obtain a texture constraint map, including: setting a first threshold and a second threshold, wherein the second threshold is greater than the first threshold; when the texture complexity index of a pixel is less than the first threshold, the absolute value of the brightness residual of the pixel is amplified and compensated using a preset gain coefficient; when the texture complexity index of a pixel is between the first threshold and the second threshold, the absolute value of the brightness residual of the pixel remains unchanged; when the texture complexity index of a pixel is greater than the second threshold, the absolute value of the brightness residual of the pixel is multiplied and suppressed using a preset attenuation coefficient; and integrating the absolute values of the brightness residuals of each pixel after segmentation processing to generate the texture constraint map.
[0015] Preferably, the region-weighted sampling of the multi-scale features at each layer in conjunction with the texture constraint map includes: using an interpolation algorithm to spatially downsample the texture constraint map so that the spatial resolution matches the scale of the multi-scale features at each layer output by the backbone convolutional network; normalizing the downsampled texture constraint map into a spatial weight matrix; and multiplying the spatial weight matrix with the multi-scale features at the corresponding scale pixel by pixel in the spatial dimension to obtain a texture constraint-weighted activation feature map.
[0016] By interpolating and downsampling, the previously generated texture constraint map is adjusted to a resolution consistent with the features of different layers in the backbone network. After being converted into a standardized spatial weight matrix, it is multiplied pixel-by-pixel with the network features at the corresponding scale. Introducing prior information about defects at the image level as spatial attention into the deep feature extraction process can specifically modulate feature responses at different scales, effectively suppressing interference from normal background features and enhancing the feature weight of potential defect areas, thereby reducing the risk of information loss during the transmission of subtle defects across network layers.
[0017] Preferably, the extraction of multi-layer structure representation features includes: performing region pooling operation on the activation feature map to sample it, and combining the texture constraint map to calculate the center neighborhood difference response, directional segmentation statistics and connected region features after threshold segmentation, and concatenating them to obtain the multi-layer structure representation features.
[0018] Preferably, the step of calculating the intra-layer allocation coefficients of each pyramid level and the inter-layer allocation coefficients between adjacent layers, and using the intra-layer allocation coefficients and inter-layer allocation coefficients to perform weighted fusion of multi-scale features, includes: inputting the texture constraint map and multi-layer structure representation features into a spatial attention module, calculating the two-dimensional spatial weight matrix of each feature scale as the intra-layer allocation coefficient; inputting the features of adjacent layers into a multilayer perceptron after global average pooling, calculating the channel dimension weight vector as the inter-layer allocation coefficient; performing layer-by-layer dot multiplication and addition aggregation of multi-scale features according to the intra-layer allocation coefficients and inter-layer allocation coefficients, and generating a fused defect candidate feature map through bottom-up feature pyramid network information transmission.
[0019] By adopting the above scheme, dynamic allocation of feature weights based on image content is achieved, which can specifically strengthen the response of important features and effectively avoid the loss of subtle defect signals during cross-layer fusion transmission, thereby improving the accuracy of the final defect candidate feature representation.
[0020] Preferably, performing channel grouping compression and rearrangement on the defect candidate feature map to obtain defect discrimination features includes: dividing the defect candidate feature map into multiple feature groups according to the channel dimension; applying a convolutional layer with a kernel size of 1×1 to each feature group for channel dimensionality reduction compression to obtain a corresponding compressed feature group; rearranging and merging all compressed feature groups in an interleaved channel order to achieve cross-group channel information interaction and generate the defect discrimination features.
[0021] In a second aspect, the present invention provides an intelligent detection system for surface defects of pump body castings, including a processor and a memory. The memory stores computer program instructions, and when the computer program instructions are executed by the processor, the above-mentioned intelligent detection method for surface defects of pump body castings is implemented.
[0022] By adopting the above technical solution, a computer program is generated from the above-mentioned intelligent detection method for surface defects of pump body castings and stored in the memory so that it can be loaded and executed by the processor. In this way, a terminal device can be made based on the memory and the processor for convenient use.
[0023] The technical solution of the present invention has the following beneficial technical effects: This scheme generates a texture complexity index by combining edge-preserving filtering with local gradient magnitude and direction consistency, and constructs a texture constraint map by segmenting the brightness residual. Multi-scale features are weighted and sampled using the constraint map, and adaptive allocation coefficients are calculated based on multi-layer structure representation features to achieve content-based feature fusion. This approach enhances weak defect signals and suppresses false defect interference by selectively modulating residual signals in different texture regions, improving the identifiability of defect features. Simultaneously, the adaptive weighted fusion strategy prevents the loss of subtle defect features during network transmission, effectively solving the problems of existing technologies where low-level features are easily contaminated by high-frequency noise, normal textures are easily misjudged, and weak defects are easily missed.
[0024] Furthermore, channel grouping, compression, and rearrangement are performed on the defect candidate feature map to extract high-discrimination features, which are then input into the detection head to output the detection results. This mechanism, through intra-group dimensionality reduction and cross-group information interaction, reduces computational redundancy and ensures model inference speed while refining discriminative features. It solves the problems of existing technologies being unable to balance detection accuracy and inference efficiency, and the large deviation between position parameters and boundary correction amounts leading to ambiguous defect boundary localization, thereby improving the classification accuracy and boundary precision of the detection.
Attached Figure Description
[0025] Figure 1 is a flowchart of the intelligent detection method for surface defects in pump body castings according to the present invention; Figure 2 is a schematic diagram of the segmented constraints based on the texture complexity index; Figure 3 is a schematic diagram of multi-scale hierarchical downsampling; Figure 4 is a schematic diagram comparing the results of the ablation experiment.
Detailed Implementation
[0026] The technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are some embodiments of the present invention, but not all embodiments.
[0027] The specific embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
[0028] This invention discloses an intelligent detection method for surface defects in pump body castings. Referring to Figure 1, the method includes the following steps:
S1. Image grayscale normalization and edge-preserving filtering preprocessing.
[0029] The specific steps for acquiring and preprocessing images of the pump body casting surface are as follows: First, images of the pump body casting surface are acquired using an area array camera. The image matrix is read and converted into a single-channel grayscale image. The minimum and maximum gray values over all pixels are computed, and a linear mapping rescales the pixel gray values to the range 0 to 1, which achieves grayscale normalization and eliminates baseline drift caused by non-uniform lighting. The grayscale normalization formula is:
$$I_{norm}(x, y) = \frac{I(x, y) - I_{min}}{I_{max} - I_{min}}$$
where $I(x, y)$ is the gray value of a pixel in the original image; $I_{min}$ is the minimum gray value over all pixels of the original image; $I_{max}$ is the maximum gray value over all pixels of the original image; and $I_{norm}(x, y)$ is the normalized pixel gray value.
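The normalization step can be illustrated with a minimal sketch in plain Python. The function name `normalize_gray` and the handling of a constant image (returning zeros) are assumptions made for illustration and are not stated in the source.

```python
def normalize_gray(image):
    """Min-max normalize a 2-D list of gray values into the range [0, 1]."""
    flat = [value for row in image for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Assumption: a constant image has no contrast, so return zeros
        # instead of dividing by zero.
        return [[0.0 for _ in row] for row in image]
    span = hi - lo
    return [[(value - lo) / span for value in row] for row in image]


# Example: an 8-bit patch whose darkest pixel is 50 and brightest is 250.
example = normalize_gray([[50, 100], [150, 250]])
```

In practice an array library would replace the nested lists; the arithmetic is the same.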
[0030] Then, a bilateral filtering function is used for edge-preserving filtering, and the specific steps are as follows: A bilateral filtering window with a size of 5×5 or 7×7 is set. During the window sliding process, the spatial distance weight between the center pixel and neighboring pixels is calculated using a Gaussian function. The preferred range for the spatial Gaussian standard deviation parameter is 1.5 to 3, and 2 is used in this embodiment. The gray value difference weight is calculated based on the normalized gray value difference between the center pixel and neighboring pixels. The preferred range for the gray value Gaussian standard deviation parameter is 0.05 to 0.2, and 0.1 is used in this embodiment. The spatial distance weight and gray value difference weight corresponding to each neighboring pixel are multiplied point by point to generate a comprehensive filtering weight, and all comprehensive weights within the window are summed and normalized. The normalized comprehensive weight is multiplied by the corresponding neighboring pixel gray value and a weighted summation operation is performed to update the center pixel value.
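The bilateral filtering described above can be sketched as follows, assuming a 5×5 window (`radius=2`), the embodiment's standard deviations (2 for space, 0.1 for gray value), and replicated borders. The border handling and the function name `bilateral_filter` are illustrative assumptions, since the source does not specify them.

```python
import math


def bilateral_filter(image, radius=2, sigma_s=2.0, sigma_r=0.1):
    """Edge-preserving smoothing of a 2-D list of normalized gray values."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            center = image[y][x]
            weight_sum = 0.0
            value_sum = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Assumption: clamp coordinates so borders reuse the nearest pixel.
                    ny = min(max(y + dy, 0), h - 1)
                    nx = min(max(x + dx, 0), w - 1)
                    neighbor = image[ny][nx]
                    # Spatial distance weight (Gaussian of the pixel offset).
                    w_space = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma_s ** 2))
                    # Gray value difference weight (Gaussian of the intensity gap).
                    w_range = math.exp(-((neighbor - center) ** 2) / (2.0 * sigma_r ** 2))
                    weight = w_space * w_range
                    weight_sum += weight
                    value_sum += weight * neighbor
            # Dividing by the summed weights normalizes the comprehensive weights.
            out[y][x] = value_sum / weight_sum
    return out


# A vertical step edge: left half 0.2, right half 0.8.
step = [[0.2] * 4 + [0.8] * 4 for _ in range(6)]
smoothed = bilateral_filter(step)
```

Because the gray value weight collapses across the 0.6 step, pixels on either side of the edge keep essentially their original values, which is the edge-preserving behavior the paragraph describes.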
[0031] This step increases smoothness in smooth areas and reduces filtering intensity in edge areas, thereby filtering out high-frequency random interference such as fine sand holes and noise on the casting surface while fully preserving the true geometric gradient features of the pump body defect contour.
[0032] S2. Calculate the texture complexity and generate a texture constraint graph.
[0033] The absolute values of local gradient magnitude, orientation consistency, and brightness residual are calculated. The local gradient magnitude and orientation consistency are linearly fused using preset weighting coefficients to generate a texture complexity index. The texture complexity index is then used to perform piecewise suppression and gain compensation on the absolute value of the brightness residual to obtain a texture constraint map.
[0034] Specifically, the Sobel operator is used to calculate the pixel gradients in the horizontal and vertical directions, and the local gradient magnitude is the square root of the sum of their squares. A structure tensor matrix is constructed from the gradient components, and its larger and smaller eigenvalues are calculated. The directional consistency is 1 minus the ratio of the smaller eigenvalue to the sum of the larger eigenvalue and a minimal constant. The normalized local gradient magnitude and the directional consistency are linearly fused using preset weighting coefficients to generate a texture complexity index representing the texture complexity of the local region. The calculation formula is:
$$T = \alpha \hat{G} + \beta C$$
where $T$ is the texture complexity index; $\hat{G}$ is the normalized local gradient magnitude; $C$ is the directional consistency; $\alpha$ is the weighting coefficient for the local gradient magnitude, preferably in the range 0.5 to 0.7 and taken as 0.6 in this embodiment; $\beta$ is the weighting coefficient for the directional consistency, preferably in the range 0.3 to 0.5 and set to 0.4 in this embodiment; and the coefficients satisfy $\alpha + \beta = 1$.
[0035] The formula for calculating directional consistency is:
$$C = 1 - \frac{\lambda_2}{\lambda_1 + \varepsilon}$$
where $\lambda_1$ is the larger eigenvalue of the structure tensor matrix; $\lambda_2$ is the smaller eigenvalue of the structure tensor matrix; and $\varepsilon$ is a minimal constant used to prevent division by zero (its numerical value is missing from this text).
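A minimal sketch of the texture complexity index follows, assuming a 3×3 summation neighborhood for the structure tensor, replicated borders, gradient magnitudes normalized by their maximum, and `EPS = 1e-6` for the minimal constant. All four choices, and the function names, are assumptions, because the source does not give them.

```python
import math

EPS = 1e-6  # assumed value; the source text does not state the constant


def sobel(image):
    """Return (gx, gy) Sobel gradients of a 2-D list, replicating borders."""
    h, w = len(image), len(image[0])

    def px(y, x):
        return image[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx[y][x] = (px(y - 1, x + 1) + 2 * px(y, x + 1) + px(y + 1, x + 1)
                        - px(y - 1, x - 1) - 2 * px(y, x - 1) - px(y + 1, x - 1))
            gy[y][x] = (px(y + 1, x - 1) + 2 * px(y + 1, x) + px(y + 1, x + 1)
                        - px(y - 1, x - 1) - 2 * px(y - 1, x) - px(y - 1, x + 1))
    return gx, gy


def direction_consistency(jxx, jxy, jyy, eps=EPS):
    """C = 1 - lambda_2 / (lambda_1 + eps) for the tensor [[jxx, jxy], [jxy, jyy]]."""
    half_trace = 0.5 * (jxx + jyy)
    root = math.sqrt((0.5 * (jxx - jyy)) ** 2 + jxy ** 2)
    lam_large = half_trace + root
    lam_small = max(half_trace - root, 0.0)  # guard against rounding below zero
    return 1.0 - lam_small / (lam_large + eps)


def texture_complexity(image, alpha=0.6, beta=0.4, radius=1):
    """T = alpha * normalized gradient magnitude + beta * direction consistency."""
    h, w = len(image), len(image[0])
    gx, gy = sobel(image)
    mag = [[math.hypot(gx[y][x], gy[y][x]) for x in range(w)] for y in range(h)]
    peak = max(max(row) for row in mag) or 1.0  # avoid dividing by zero on flat images
    index = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            jxx = jxy = jyy = 0.0
            # Sum gradient products over the neighborhood to build the structure tensor.
            for ny in range(max(y - radius, 0), min(y + radius, h - 1) + 1):
                for nx in range(max(x - radius, 0), min(x + radius, w - 1) + 1):
                    jxx += gx[ny][nx] ** 2
                    jxy += gx[ny][nx] * gy[ny][nx]
                    jyy += gy[ny][nx] ** 2
            c = direction_consistency(jxx, jxy, jyy)
            index[y][x] = alpha * mag[y][x] / peak + beta * c
    return index


# A vertical step edge: gradients point in a single direction along the edge.
edge = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
index = texture_complexity(edge)
```

On the edge column the index reaches 1.0 (maximum gradient and fully consistent direction). In a perfectly flat neighborhood both eigenvalues are zero, so the formula as written gives $C = 1$ and the index there equals $\beta$.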
[0036] The absolute value of the brightness residual is obtained as follows: a background image is produced by applying a 15×15 large-kernel mean filter to the original image, and the absolute value of the difference between the original image and this background image is taken as the basic brightness residual matrix.
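The residual computation can be sketched as below. Averaging only over in-bounds pixels at the image border is an assumption, as is the function name `brightness_residual`; the source specifies only the 15×15 mean filter and the absolute difference.

```python
def brightness_residual(image, kernel=15):
    """Return |image - mean_filter(image)| for a kernel x kernel mean filter."""
    h, w = len(image), len(image[0])
    r = kernel // 2
    residual = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            # Assumption: near borders, average over the in-bounds part of the window.
            for ny in range(max(y - r, 0), min(y + r, h - 1) + 1):
                for nx in range(max(x - r, 0), min(x + r, w - 1) + 1):
                    total += image[ny][nx]
                    count += 1
            background = total / count
            residual[y][x] = abs(image[y][x] - background)
    return residual


# A flat 20x20 patch with one bright pixel standing in for a small defect.
patch = [[0.5] * 20 for _ in range(20)]
patch[10][10] = 0.9
res = brightness_residual(patch)
```

The large kernel keeps the background estimate close to 0.5, so the isolated bright pixel retains almost its full 0.4 deviation in the residual while flat areas stay near zero.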
[0037] The texture constraint map is obtained by using the texture complexity index to perform piecewise suppression and gain compensation on the absolute value of the brightness residual. The specific steps are as follows: The absolute value of the brightness residual is normalized to the [0,1] interval to obtain the normalized brightness residual. A first threshold $T_1$ is set to 0.25, corresponding to a smooth or slightly undulating normal casting surface; a second threshold $T_2$ is set to 0.7, corresponding to threaded holes or rough structural areas, with $T_2 > T_1$. The texture complexity index $T$ of each pixel is then judged point by point. When $T < T_1$, the background is flat and the potential defect signal is extremely weak, so a preset gain coefficient $k_g$ is used to amplify and compensate the normalized brightness residual of the pixel; the preferred range of $k_g$ is 1.5 to 2.5, and 1.8 is used in this embodiment. When $T_1 \le T \le T_2$, the brightness residual of the pixel remains unchanged. When $T > T_2$, the area contains high-frequency natural texture or abrupt changes in normal structure, which can easily lead to false positives, so the brightness residual of the pixel is multiplied by a preset attenuation coefficient $k_a$ to suppress it; the preferred range of $k_a$ is 0.2 to 0.5, and 0.3 is used in this embodiment. Finally, the processed brightness residuals of all pixels are assembled into a two-dimensional texture constraint map matrix with the same resolution and size as the original image.
[0038] As shown in Figure 2, the three-segment residual signal modulation rule adopted in this invention is as follows: when the texture complexity index is lower than the first threshold, a fixed gain coefficient is used to amplify the brightness residual; when the texture complexity index is between the first threshold and the second threshold, the brightness residual remains unchanged; when the texture complexity index is higher than the second threshold, a fixed attenuation coefficient is used to suppress the brightness residual.
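The three-segment rule can be written as a short sketch using the embodiment's values (thresholds 0.25 and 0.7, gain 1.8, attenuation 0.3). The function name `modulate_residual` and the treatment of values exactly equal to a threshold as "unchanged" are assumptions.

```python
def modulate_residual(residual, complexity, t1=0.25, t2=0.7, gain=1.8, attenuation=0.3):
    """Apply the piecewise gain/attenuation rule to a normalized residual map."""
    constraint = []
    for residual_row, complexity_row in zip(residual, complexity):
        row = []
        for r, t in zip(residual_row, complexity_row):
            if t < t1:
                row.append(gain * r)         # flat background: amplify weak signals
            elif t > t2:
                row.append(attenuation * r)  # strong texture: suppress likely false alarms
            else:
                row.append(r)                # intermediate texture: leave unchanged
        constraint.append(row)
    return constraint


# The same residual of 0.1 under low, medium, and high texture complexity.
constraint = modulate_residual([[0.1, 0.1, 0.1]], [[0.1, 0.5, 0.9]])
```

The three outputs are approximately 0.18, 0.1, and 0.03, showing how an identical residual is boosted, kept, or suppressed depending on the local texture complexity.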
[0039] S3. Input the backbone network to extract multi-scale features.
[0040] The surface image of the pump body casting is input into the backbone convolutional network to extract multi-scale features, specifically: A ResNet50 backbone convolutional network is constructed on the PyTorch deep learning framework, and the surface image of the pump body casting is converted into a three-dimensional tensor as the network input. The backbone comprises an initial convolutional layer and a max-pooling layer, followed by multiple stacked residual layers. Each residual layer contains multiple convolutional layers and batch normalization layers, and its input is added to its output through an identity skip connection.
[0041] Intermediate feature maps from three different downsampling stages are extracted to construct the multi-scale features. The spatial dimensions of the three feature maps are 1/8, 1/16, and 1/32 of the original input image resolution, respectively. If the input image resolution is 1024×1024, the corresponding feature maps are 128×128, 64×64, and 32×32. Building the multi-scale feature pyramid through hierarchical downsampling adapts feature extraction and detection to defects of different sizes.
[0042] Figure 3 illustrates how the spatial size of the feature maps at each level changes as the input image is downsampled layer by layer through the backbone convolutional network. It presents the construction of the multi-scale feature pyramid and shows how features at different levels suit defects of different sizes.
[0043] S4. Combine the constraint graph with weighted fusion to generate candidate feature maps.
[0044] By combining the texture constraint map, region-weighted sampling is performed on the features of each layer of multi-scale features to extract multi-layer structure representation features; based on the texture constraint map and multi-layer structure representation features, intra-layer and inter-layer allocation coefficients are calculated to perform weighted fusion of multi-scale features to obtain defect candidate feature maps.
[0045] The 1024×1024 texture constraint map generated in the above steps is spatially downsampled at three ratios using bilinear interpolation, producing sub-constraint maps that match the three feature map resolutions given above. Using Softmax or Min-Max scaling, the values of the three sub-constraint maps are mapped to the normalized range 0.0 to 1.0 to obtain the spatial weight matrix at each scale.
[0046] The spatial weight matrix of each scale is broadcast along the channel dimension so that it matches the channel count of the corresponding multi-scale feature map (256 in this embodiment). The two are then multiplied pixel by pixel (Hadamard product) to obtain the texture-constraint-weighted activation feature map. The calculation formula is:
$$F'_i = W_i \odot F_i$$
where $F'_i$ is the activation feature map at scale $i$; $W_i$ is the spatial weight matrix at that scale; $F_i$ is the multi-scale feature map at the corresponding scale; and $\odot$ denotes the Hadamard product.
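The downsample, normalize, and multiply sequence can be sketched in plain Python. The half-pixel-center bilinear mapping, the choice of Min-Max scaling over Softmax, and the function names are assumptions for illustration; features are represented as a list of channels, each an h×w list.

```python
def bilinear_resize(src, out_h, out_w):
    """Resize a 2-D list with bilinear interpolation (half-pixel centers)."""
    in_h, in_w = len(src), len(src[0])
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        fy = min(max((i + 0.5) * in_h / out_h - 0.5, 0.0), in_h - 1.0)
        y0 = int(fy)
        y1 = min(y0 + 1, in_h - 1)
        wy = fy - y0
        for j in range(out_w):
            fx = min(max((j + 0.5) * in_w / out_w - 0.5, 0.0), in_w - 1.0)
            x0 = int(fx)
            x1 = min(x0 + 1, in_w - 1)
            wx = fx - x0
            top = src[y0][x0] * (1 - wx) + src[y0][x1] * wx
            bottom = src[y1][x0] * (1 - wx) + src[y1][x1] * wx
            out[i][j] = top * (1 - wy) + bottom * wy
    return out


def min_max(matrix):
    """Scale a 2-D list into [0, 1]; a constant matrix maps to zeros."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0
    return [[(v - lo) / span for v in row] for row in matrix]


def apply_constraint(features, constraint_map):
    """Multiply every channel by the resized, normalized constraint map."""
    h, w = len(features[0]), len(features[0][0])
    weights = min_max(bilinear_resize(constraint_map, h, w))
    # Reusing the same weight matrix for each channel is the broadcast step.
    return [[[channel[y][x] * weights[y][x] for x in range(w)] for y in range(h)]
            for channel in features]


# A 4x4 constraint map applied to a 2-channel, 2x2 feature map.
cmap = [[0, 0, 4, 4], [0, 0, 4, 4], [8, 8, 12, 12], [8, 8, 12, 12]]
features = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
activated = apply_constraint(features, cmap)
```

With Min-Max scaling the location with the lowest constraint value is zeroed out entirely, which may or may not be desirable; that trade-off between Softmax and Min-Max is left open by the source.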
[0047] Max pooling with a 3×3 sliding window and stride 1 is applied to the activation feature map to capture the maximum response points of defects, while average pooling preserves the smooth semantics of the region. The two pooling results are then concatenated along the channel dimension and sampled to extract feature information representing local structural characteristics.
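A minimal sketch of the dual pooling step, assuming windows are truncated at the image border rather than padded and that concatenation simply appends the average-pooled channels after the max-pooled ones. Both choices and the function names are assumptions.

```python
def pool3x3(channel, reducer):
    """3x3, stride-1 pooling of a 2-D list using the given reducer function."""
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Assumption: the window is clipped to the image at the borders.
            window = [channel[ny][nx]
                      for ny in range(max(y - 1, 0), min(y + 1, h - 1) + 1)
                      for nx in range(max(x - 1, 0), min(x + 1, w - 1) + 1)]
            out[y][x] = reducer(window)
    return out


def mean(values):
    return sum(values) / len(values)


def pooled_features(features):
    """Concatenate max-pooled and average-pooled maps along the channel axis."""
    max_maps = [pool3x3(channel, max) for channel in features]
    avg_maps = [pool3x3(channel, mean) for channel in features]
    return max_maps + avg_maps  # C channels in, 2C channels out


# One channel with a single strong response at the center.
spike = [[[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]]
pooled = pooled_features(spike)
```

The max-pooled channel spreads the peak response to all neighbors, while the average-pooled channel dilutes it, so the pair carries both "strongest response nearby" and "typical level nearby".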
[0048] Further calculations of the center neighborhood difference response, directional segmentation statistics, and connected region features after threshold segmentation yield multi-layer structure representation features: A 3×3 convolution kernel is used to calculate and sum the grayscale differences between the center pixel and its eight neighboring pixels in the weighted sampled feature map, obtaining the center neighborhood difference response; the directional gradient histogram algorithm is used to calculate the sum of gradient magnitudes across nine angle intervals from 0° to 180°, obtaining the directional segmentation statistics; Otsu's method is applied to calculate a global threshold for binarization segmentation, extracting the pixel area and bounding rectangle size of the connected regions as connected region features; to address the alignment issue of features with different dimensions, the directional segmentation statistics and connected region features are mapped and expanded into a tensor with the same spatial resolution as the activated feature map; finally, the expanded features and the center neighborhood difference response are concatenated along the channel dimension to obtain the multi-layer structure representation features.
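Of the three descriptors above, the center neighborhood difference response is the simplest to illustrate. The sketch below sums (center minus neighbor) over the eight neighbors, which is equivalent to a 3×3 kernel with 8 at the center and -1 elsewhere. The sign convention, the replicated borders, and the function name are assumptions, since the source states only that the differences are summed.

```python
def center_neighborhood_difference(channel):
    """Sum of (center - neighbor) over the 8-neighborhood of every pixel."""
    h, w = len(channel), len(channel[0])

    def px(y, x):
        # Assumption: replicate border pixels outside the image.
        return channel[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]
    return [[sum(channel[y][x] - px(y + dy, x + dx) for dy, dx in offsets)
             for x in range(w)]
            for y in range(h)]


# A single bright pixel on a dark background.
spot = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
response = center_neighborhood_difference(spot)
```

An isolated point gives a strong positive response at its center and small negative responses around it, while a uniform region gives zero everywhere.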
[0049] Subsequently, the texture constraint map and the multi-layer structure representation features are input into the spatial attention module, and the two-dimensional spatial weight matrix of each feature scale is calculated as the intra-layer allocation coefficient. The features of adjacent layers are then global average pooled and input into a multi-layer perceptron containing two fully connected layers, and the channel dimension weight vector is calculated as the inter-layer allocation coefficient.
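The inter-layer allocation coefficient can be sketched as global average pooling followed by a two-layer perceptron. The text fixes only the pooling and the two fully connected layers; concatenating the pooled descriptors of the two adjacent layers, the ReLU hidden activation, the sigmoid output, and the hidden width are all assumptions of this sketch, as is the function name.

```python
import numpy as np

def inter_layer_coefficients(feat_a, feat_b, w1, b1, w2, b2):
    """Channel weight vector for a pair of adjacent pyramid levels.

    feat_a, feat_b: (H, W, C) feature maps (spatial sizes may differ)
    w1: (2C, hidden), b1: (hidden,)  first fully connected layer
    w2: (hidden, C),  b2: (C,)       second fully connected layer
    Returns a (C,) vector with entries in (0, 1).
    """
    # Global average pooling collapses each map to one value per channel.
    pooled = np.concatenate([feat_a.mean(axis=(0, 1)), feat_b.mean(axis=(0, 1))])
    hidden = np.maximum(pooled @ w1 + b1, 0.0)   # ReLU (assumption)
    logits = hidden @ w2 + b2
    return 1.0 / (1.0 + np.exp(-logits))         # sigmoid (assumption)
```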
[0050] Finally, based on the intra-layer and inter-layer allocation coefficients, the multi-scale features are aggregated through layer-by-layer dot multiplication and addition. After information transmission through the bottom-up feature pyramid network, a fused defect candidate feature map is generated. This step realizes the dynamic allocation of intra-layer and inter-layer weights based on image content, avoiding the loss of subtle defect features during network transmission.
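A simplified sketch of the weighted aggregation for one pair of adjacent levels is given here. It assumes a factor-of-two resolution gap between levels and uses 2×2 mean pooling to pass the finer level upward; the actual bottom-up pyramid in the embodiment would normally use learned convolutions, which are not reproduced. All function names are hypothetical.

```python
import numpy as np

def weight_level(feat, spatial_w, channel_w):
    """Apply intra-layer (H, W) and inter-layer (C,) coefficients to (H, W, C)."""
    return feat * spatial_w[..., None] * channel_w

def downsample2x_mean(x):
    """Halve the spatial resolution of (H, W, C) by 2x2 mean pooling (H, W even)."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def fuse_bottom_up(weighted_fine, weighted_coarse):
    """Add the downsampled finer level to the coarser level.

    weighted_fine:   (2H, 2W, C) already weighted by weight_level
    weighted_coarse: (H, W, C)   already weighted by weight_level
    """
    return weighted_coarse + downsample2x_mean(weighted_fine)
```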
[0051] S5: Channel compression and rearrangement, and output of defect detection results.
[0052] Channel grouping, compression, and rearrangement are performed on the defect candidate feature map to obtain defect discrimination features, which are then input into the detection head to generate the surface defect detection results of the pump body casting. The specific steps are as follows. Assume the defect candidate feature map output by the preceding fusion stage has size H×W×C, where the total number of channels C is 256. The feature map is split evenly along the channel dimension into G feature groups. The preferred value of G is 4 or 8; in this embodiment G is 4, so each feature group contains 64 consecutive channels. For the four H×W×64 feature groups, four 1×1 pointwise convolutional layers with unshared weights are instantiated, each with 32 output channels, to compress the channel dimension. Each convolution is followed by batch normalization and a ReLU activation, producing four compressed feature groups of size H×W×32. A channel shuffle is then executed: the four feature groups are concatenated into an H×W×128 tensor; the channel dimension is reshaped into a matrix of 4 rows and 32 columns; the matrix is transposed into 32 rows and 4 columns; and the result is flattened back into a single 128-channel dimension. This operation interleaves the channels of the originally isolated feature groups, achieving parameter-free fusion of cross-group channel information and generating structured defect discrimination features.
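The grouping, compression, and shuffle steps map directly onto array operations, as in the NumPy sketch below. The channels-last layout and function names are assumptions, a 1×1 convolution is written as a per-pixel matrix multiply, and batch normalization is omitted for brevity.

```python
import numpy as np

def grouped_pointwise_compress(x, weights):
    """x: (H, W, C); weights: list of G arrays, each of shape (C // G, C_out).

    Each group gets its own 1x1 convolution (a per-pixel matrix multiply)
    followed by ReLU. Batch normalization is left out of this sketch.
    """
    groups = np.split(x, len(weights), axis=-1)
    compressed = [np.maximum(g @ w, 0.0) for g, w in zip(groups, weights)]
    return np.concatenate(compressed, axis=-1)

def channel_shuffle(x, num_groups):
    """Interleave channels across groups: reshape to (G, C/G), transpose, flatten."""
    h, w, c = x.shape
    x = x.reshape(h, w, num_groups, c // num_groups)
    x = x.transpose(0, 1, 3, 2)  # swap the group axis and the per-group axis
    return x.reshape(h, w, c)
```

With G = 4 and 32 channels per compressed group, output channel k comes from concatenated channel (k mod 4)·32 + ⌊k/4⌋, so the first eight output channels are 0, 32, 64, 96, 1, 33, 65, 97 of the concatenated tensor.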
[0053] The defect discrimination features are input into the detection head, and the outputs are defect category scores, target location parameters, and boundary correction amounts: A single-stage target detection head network containing classification and regression branches is constructed, and the defect discrimination features are input into the two branches respectively; the classification branch uses the Softmax activation function to calculate the predicted probability values of various surface defect categories, including pores, sand holes, and cracks, as the defect category scores; the regression branch uses a linear feature mapping layer to calculate the center point coordinates and length and width dimensions of the prior anchor box as the target location parameters; at the same time, the regression network is used to predict the horizontal and vertical offset coordinates of the real defect bounding box relative to the prior anchor box as the boundary correction amounts.
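The classification and box-correction outputs can be illustrated with a short NumPy sketch. The class tuple mirrors the categories named in the text; the decoding convention (additive pixel offsets applied to the anchor center, anchor width and height kept unchanged) is an assumption, since the text does not give the exact parameterization.

```python
import numpy as np

CLASSES = ("pore", "sand hole", "crack")  # categories named in the text

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def decode_boxes(anchors, offsets):
    """Shift anchor centers by predicted offsets and return corner boxes.

    anchors: (N, 4) as [cx, cy, w, h]
    offsets: (N, 2) as [dx, dy] in pixels (assumed additive)
    Returns (N, 4) as [x1, y1, x2, y2].
    """
    cx = anchors[:, 0] + offsets[:, 0]
    cy = anchors[:, 1] + offsets[:, 1]
    w, h = anchors[:, 2], anchors[:, 3]
    return np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)
```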
[0054] Finally, invalid predicted boxes with scores below the set confidence threshold are filtered out. The non-maximum suppression (NMS) algorithm from the Torchvision library is then used to eliminate redundant overlapping candidate boxes according to their intersection over union (IoU). The retained target predicted boxes and their corresponding class labels are overlaid on the original input pump casting surface image to generate a visualized pump casting surface defect detection result.
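The filtering and suppression logic is sketched here in plain NumPy as a stand-in for the library routine (in Torchvision the corresponding call is `torchvision.ops.nms(boxes, scores, iou_threshold)`). The threshold values 0.5 and the function names are illustrative assumptions; the text does not state the thresholds used.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def filter_and_nms(boxes, scores, conf_thresh=0.5, iou_thresh=0.5):
    """Return indices (into the input arrays) of the boxes that are kept."""
    # Step 1: drop low-confidence predictions.
    candidates = np.flatnonzero(scores >= conf_thresh)
    # Step 2: visit the remaining boxes from highest to lowest score.
    order = candidates[np.argsort(-scores[candidates])]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Discard boxes that overlap the kept box by more than the threshold.
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_thresh]
    return keep
```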
[0055] This ablation experiment was conducted using a dataset of surface defects from industrial pump castings. The dataset contained 5000 training images and 1000 test images, with all input images normalized to 1024×1024 resolution. The backbone feature extraction network employed a residual network architecture, and the optimizer was set to stochastic gradient descent. The initial learning rate was set to 0.01, the momentum parameter to 0.9, and the batch size to 16, with a total of 200 training iterations. The model performance was evaluated using mean precision and frames per second (fps).
[0056] As shown in Figure 4, the detection performance of four schemes is compared: the baseline model, the scheme with only the filtering and constraint algorithms added, the scheme with the multi-scale feature fusion module added, and the scheme using the complete process of this invention. The experimental results are as follows. The baseline model, which uses the backbone network for defect detection, achieves an average accuracy of 78.5% at 45 frames per second. After adding only edge-preserving filtering and the segmented constraint map based on the texture complexity index to the baseline model, the average accuracy increases to 83.2%, while the frame rate decreases to 41 frames per second. After adding the multi-scale feature region-weighted sampling module, the average accuracy reaches 87.6% at 38 frames per second. With the complete scheme, which includes the channel grouping, compression, and rearrangement mechanism for the defect candidate feature map, the average accuracy reaches its maximum of 91.3%, and the frame rate recovers to 42 frames per second. This mechanism effectively enhances the weight response of local geometric features, reduces the false negative rate, and maintains inference efficiency while improving detection accuracy.
[0057] This invention also discloses an intelligent detection system for surface defects of pump body castings, including a processor and a memory. The memory stores computer program instructions, and when the computer program instructions are executed by the processor, an intelligent detection method for surface defects of pump body castings according to the present invention is implemented.
[0058] The above description represents the preferred embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principles of the present invention, and these improvements and modifications should also be considered within the scope of protection of the present invention.
Claims
1. A method for intelligent detection of surface defects in pump body castings, characterized in that it comprises: acquiring a surface image of the pump body casting, and calculating the local gradient magnitude, the direction consistency, and the absolute value of the brightness residual; linearly fusing the normalized local gradient magnitude and direction consistency using preset weighting coefficients to generate a texture complexity index; using the texture complexity index to perform segmented suppression and gain compensation on the absolute value of the brightness residual to obtain a texture constraint map; inputting the surface image of the pump body casting into the backbone convolutional network to extract multi-scale features; performing, in combination with the texture constraint map, region-weighted sampling on the features of each layer of the multi-scale features to extract multi-layer structure representation features; calculating, based on the texture constraint map and the multi-layer structure representation features, the intra-layer allocation coefficient of each pyramid level and the inter-layer allocation coefficient between adjacent layers; performing weighted fusion of the multi-scale features using the intra-layer allocation coefficient and the inter-layer allocation coefficient to obtain the defect candidate feature map; performing channel grouping, compression, and rearrangement on the defect candidate feature map to obtain defect discrimination features; and inputting the defect discrimination features into the detection head to generate the surface defect detection results of the pump body casting.
2. The intelligent detection method for surface defects of pump body castings according to claim 1, characterized in that, Before calculating the local gradient magnitude, the method further includes: mapping the pixel values of each channel of the surface image of the pump body casting to a preset range through linear transformation to complete the grayscale normalization preprocessing. Set up a bilateral filtering window, and calculate the spatial distance weight and gray value difference weight between the center pixel and its neighboring pixels respectively within the filtering window; The spatial distance weight and the gray value difference weight are multiplied to obtain the comprehensive filtering weight. The comprehensive filtering weight is then used to perform a weighted summation on the neighboring pixels within the filtering window to output a smooth filtered image that retains edge features.
3. The intelligent detection method for surface defects of pump body castings according to claim 1, characterized in that, The texture complexity index is generated based on the local gradient magnitude and the direction consistency, including: using a two-dimensional edge detection operator to calculate the horizontal and vertical gradients of the image pixels respectively, and obtaining the local gradient magnitude by taking the square root of the sum of the squares of the two. Construct a structure tensor matrix within the neighborhood of each pixel, calculate the eigenvalues of the structure tensor matrix, and use the difference between 1 and the ratio of the smaller eigenvalue to the sum of the larger eigenvalue and a preset minimal constant as the direction consistency. The local gradient magnitude and the direction consistency are linearly fused according to a preset weighting coefficient to generate a texture complexity index that represents the texture complexity of a local region.
4. The intelligent detection method for surface defects of pump body castings according to claim 1, characterized in that, The calculation of the absolute value of the luminance residual includes: The surface image of the pump body casting is obtained, and a background image is obtained by applying a large kernel mean filter to the surface image of the pump body casting; the absolute value of the difference between the two is used as the absolute value of the basic brightness residual.
5. The intelligent detection method for surface defects of pump body castings according to claim 1, characterized in that, The texture constraint map is obtained by segmenting and suppressing the absolute value of the brightness residual using the texture complexity index and performing gain compensation. This includes: setting a first threshold and a second threshold, where the second threshold is greater than the first threshold; when the texture complexity index of a pixel is less than the first threshold, a preset gain coefficient is used to amplify and compensate the absolute value of the brightness residual of the pixel; when the texture complexity index of a pixel is between the first threshold and the second threshold, the absolute value of the brightness residual of the pixel remains unchanged; when the texture complexity index of a pixel is greater than the second threshold, a preset attenuation coefficient is used to multiply and suppress the absolute value of the brightness residual of the pixel; and integrating the absolute values of the brightness residuals of each pixel after segmentation processing to generate the texture constraint map.
6. The intelligent detection method for surface defects of pump body castings according to claim 1, characterized in that, The texture constraint map is used to perform region-weighted sampling of features at each layer of the multi-scale features, including: An interpolation algorithm is used to spatially downsample the texture constraint map so that the spatial resolution matches the scale of the multi-scale features output by each layer of the backbone convolutional network. The downsampled texture constraint map is normalized into a spatial weight matrix; The spatial weight matrix is multiplied pixel-by-pixel with the corresponding multi-scale features in the spatial dimension to obtain an activation feature map weighted by texture constraints.
7. The intelligent detection method for surface defects of pump body castings according to claim 6, characterized in that, The extraction of multi-layer structure representation features includes: The activation feature map is sampled by performing region pooling operation, and the center neighborhood difference response, directional segmentation statistics and connected region features after threshold segmentation are calculated by combining the texture constraint map, and then spliced to obtain the multi-layer structure representation feature.
8. The intelligent detection method for surface defects of pump body castings according to claim 1, characterized in that, The calculation of the intra-layer allocation coefficients of each pyramid level and the inter-layer allocation coefficients between adjacent layers, and the weighted fusion of multi-scale features using the intra-layer allocation coefficients and inter-layer allocation coefficients, includes: inputting the texture constraint map and multi-layer structure representation features into the spatial attention module, and calculating the two-dimensional spatial weight matrix of each feature scale as the intra-layer allocation coefficients. The features of adjacent layers are globally averaged and then input into the multilayer perceptron. The weight vector of the channel dimension is calculated as the inter-layer allocation coefficient. Based on the intra-layer and inter-layer allocation coefficients, multi-scale features are aggregated by layer-by-layer dot multiplication and addition, and a fused defect candidate feature map is generated through bottom-up feature pyramid network information transmission.
9. The intelligent detection method for surface defects of pump body castings according to claim 1, characterized in that, Perform channel grouping compression and rearrangement on the defect candidate feature map to obtain defect discrimination features, including: dividing the defect candidate feature map into multiple feature groups according to the channel dimension; For each feature group, a convolutional layer with a kernel size of 1×1 is applied to perform channel dimensionality reduction compression to obtain the corresponding compressed feature group; All compressed feature groups are rearranged and merged in an interleaved channel order to achieve cross-group channel information exchange and generate the defect discrimination features.
10. An intelligent detection system for surface defects in pump body castings, characterized in that it comprises: a processor and a memory, wherein the memory stores computer program instructions that, when executed by the processor, implement the intelligent detection method for surface defects of pump body castings according to any one of claims 1-9.
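For illustration only, and not as part of the claims, the direction consistency of claim 3 and the piecewise rule of claim 5 can be written as a short NumPy sketch. The threshold, gain, and attenuation values used in any call are hypothetical; the claims state only that they are preset and that the second threshold exceeds the first. The function names are likewise assumptions.

```python
import numpy as np

def direction_consistency(lam_small, lam_large, eps=1e-6):
    """1 - (smaller eigenvalue) / (larger eigenvalue + small constant)."""
    return 1.0 - lam_small / (lam_large + eps)

def texture_constraint_map(residual_abs, tci, t1, t2, gain, atten):
    """Piecewise scaling of the absolute brightness residual.

    tci < t1        -> multiply by gain (amplify in low-texture regions)
    t1 <= tci <= t2 -> leave unchanged
    tci > t2        -> multiply by atten (suppress in high-texture regions)
    """
    if not t2 > t1:
        raise ValueError("second threshold must exceed the first")
    scale = np.where(tci < t1, gain, np.where(tci > t2, atten, 1.0))
    return residual_abs * scale
```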