A method and device for detecting defects on the inner surface of steel
By introducing a cross-fusion module and CIoU loss function into the steel inner surface detection model, the problem of insufficient feature fusion in the existing technology is solved, thereby improving the accuracy of steel inner surface defect detection and the ability to identify defects in complex environments.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: JIANGNAN UNIV
- Filing Date: 2025-05-15
- Publication Date: 2026-06-26
AI Technical Summary
Existing steel internal surface defect detection models suffer from insufficient feature fusion during the feature fusion stage, leading to reduced detection accuracy.
A cross-fusion module is used to construct the neck network. The self-attention mechanism is used to fuse deep semantic features to shallow features during the upsampling process. The CIoU loss function is combined for model training to improve the overall integrity and accuracy of feature fusion.
It can effectively acquire overall information about the defect area on the inner surface of steel, reduce information loss, and improve detection accuracy, especially the ability to identify defects in complex scenarios.
Figure CN120495253B_ABST
Description
Technical Field
[0001] This invention relates to the field of steel production technology, and in particular to a method and apparatus for detecting defects on the inner surface of steel.
Background Technology
[0002] Steel, as a crucial material in modern industry and construction, faces varying requirements for its specific strength, durability, and versatility under different circumstances. The overall quality of steel depends on factors such as mechanical properties, shape accuracy, and surface quality. During the production process, limitations at different stages and with the equipment used can lead to various internal surface defects in steel products, including scratches, pits, patches, and corrosion. These defects negatively impact the overall performance of the material, shorten its service life, and pose significant safety risks. Therefore, accurately detecting and assessing internal surface defects in steel is a critical step in maintaining product quality and ensuring safe use.
[0003] In recent years, the development of advanced imaging technology and artificial intelligence algorithms has led to significant breakthroughs in the detection methods of internal surface defects in steel. In particular, models such as CNN, YOLO series and ResNet can automatically extract rich feature information from images, effectively overcoming the problems of low accuracy and time consumption of traditional detection methods.
[0004] However, defects on the inner surface of steel are usually complex, diverse, and variable. Although existing detection models can extract deeper semantic features, they use only simple concatenation or addition when fusing semantic features with shallow details. This makes it difficult to fuse semantic information and detailed features effectively, which easily leads to insufficient feature fusion and reduces the accuracy of defect detection on the inner surface of steel.
Summary of the Invention
[0005] Therefore, the technical problem to be solved by the present invention is that existing detection models fuse features by simple concatenation or addition, which easily leads to insufficient feature fusion and reduces the accuracy of defect detection on the inner surface of steel.
[0006] To solve the above-mentioned technical problems, the present invention provides a method for detecting defects on the inner surface of steel, comprising:
[0007] The image of the inner surface of the steel to be detected is input into the steel inner surface detection model, and the first scale feature P1, the second scale feature P2, the third scale feature P3 and the fourth scale feature P′4 are obtained through the backbone network.
[0008] Input the fourth scale feature P′4 and the third scale feature P3 into the third cross-fusion module to obtain the third fusion feature P′3; input the third fusion feature P′3 and the second scale feature P2 into the second cross-fusion module to obtain the second fusion feature P′2; input the second fusion feature P′2 and the first scale feature P1 into the first cross-fusion module to obtain the first fusion feature P′1;
[0009] In each cross-fusion module, the deep feature $P'_i$ is resized by an upsampling operation to match the size of the shallow feature $P_{i-1}$, yielding the upsampled deep feature $P''_i$; a query sequence is generated from the upsampled deep feature $P''_i$, and key and value sequences are generated from the shallow feature $P_{i-1}$; the similarity between the query and key sequences is calculated; the similarity between the query and key sequences is multiplied by the value sequence and passed through a Sigmoid activation function to obtain a fusion gating matrix; the fusion gating matrix is applied as a weight to the shallow feature $P_{i-1}$ to obtain the fusion feature $P'_{i-1}$;
[0010] The first fusion feature P′1, the second fusion feature P′2, and the third fusion feature P′3 are input into the first detection head, the second detection head, and the third detection head, respectively, to obtain the detection results of defects on the inner surface of the steel.
[0011] Preferably, the backbone network and detection head of the steel inner surface detection model both adopt the YOLOv8 model backbone network and detection head.
[0012] Preferably, the formulas for generating the query sequence from the upsampled deep feature $P''_i$ and the key and value sequences from the shallow feature $P_{i-1}$ include:
[0013] $$Q = P''_i W_Q, \quad K = P_{i-1} W_K, \quad V = P_{i-1} W_V$$
[0014] Where Q represents the query sequence, K represents the key sequence, and V represents the value sequence; $W_Q$, $W_K$, and $W_V$ are the linear projections corresponding to Q, K, and V, respectively.
[0015] Preferably, the similarity between the query sequence and the key sequence is calculated using the following formula:
[0016] C′=σ(Q×K)
[0017] Where Q represents the query sequence, K represents the key sequence, σ represents the Sigmoid activation function, and C′ represents the similarity between the query sequence and the key sequence.
[0018] Preferably, the similarity between the query sequence and the key sequence is multiplied by the value sequence and then passed through a Sigmoid activation function to obtain the fusion gating matrix, as shown in the formula:
[0019] C″=σ(C′×V)
[0020] Where C″ represents the fusion gating matrix, σ represents the Sigmoid activation function, C′ represents the similarity between the query sequence and the key sequence, and V represents the value sequence.
[0021] Preferably, the fusion gating matrix is applied as a weight to the shallow feature $P_{i-1}$ to obtain the fusion feature $P'_{i-1}$ according to the formula:
[0022] $$P'_{i-1} = C'' \odot P_{i-1}$$
[0023] Where C″ represents the fusion gating matrix and ⊙ represents element-wise (pixel-level) multiplication.
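The formulas above leave the tensor shapes implicit. One consistent reading, offered here as an assumption rather than as the patent's stated definition, flattens each feature map into a sequence of $N = H_{i-1} W_{i-1}$ pixels, interprets Q×K as $QK^{\top}$, and lets $W_V$ preserve the shallow channel count so that the gate can be applied element-wise. The projection width $d$ is a hypothetical symbol introduced only for this illustration:

```latex
\begin{aligned}
P''_i &\in \mathbb{R}^{N \times C_i}, & P_{i-1} &\in \mathbb{R}^{N \times C_{i-1}}, \\
W_Q &\in \mathbb{R}^{C_i \times d}, & W_K &\in \mathbb{R}^{C_{i-1} \times d}, \quad W_V \in \mathbb{R}^{C_{i-1} \times C_{i-1}}, \\
Q, K &\in \mathbb{R}^{N \times d}, & V &\in \mathbb{R}^{N \times C_{i-1}}, \\
C' &= \sigma\!\left(Q K^{\top}\right) \in \mathbb{R}^{N \times N}, & C'' &= \sigma\!\left(C' V\right) \in \mathbb{R}^{N \times C_{i-1}}, \\
P'_{i-1} &= C'' \odot P_{i-1} \in \mathbb{R}^{N \times C_{i-1}}.
\end{aligned}
```

Under this reading every entry of C″ lies in (0, 1), so the fused feature is the shallow feature attenuated pixel by pixel.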
[0024] Preferably, when training the steel inner surface detection model, the CIoU loss function is used as the regression loss;
[0025] The CIoU loss function is expressed as follows:
[0026] $$L_{CIoU} = 1 - IoU + \frac{d^2}{c^2} + \alpha v$$
[0027] Where $L_{CIoU}$ represents the CIoU loss function, IoU represents the intersection over union of the predicted box and the ground truth box, d represents the distance between the center points of the predicted box and the ground truth box, c represents the diagonal length of the smallest rectangle enclosing the predicted box and the ground truth box, α represents the balancing parameter, and v represents the correction factor.
[0028] Preferably, the formula for calculating the correction factor is:
[0029] $$v = \frac{4}{\pi^2}\left(\arctan\frac{w_G}{h_G} - \arctan\frac{w_p}{h_p}\right)^2$$
[0030] Where v represents the correction factor, $w_G$ and $h_G$ represent the width and height of the ground truth box, respectively, and $w_p$ and $h_p$ represent the width and height of the predicted box, respectively.
[0031] Preferably, the formula for calculating the balancing parameter is:
[0032] $$\alpha = \frac{v}{(1 - IoU) + v}$$
[0033] Where α represents the balancing parameter, IoU represents the intersection-union ratio of the predicted box and the ground truth box, and v represents the correction factor.
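As a worked illustration, assuming the commonly published form of the loss, $L_{CIoU} = 1 - IoU + d^2/c^2 + \alpha v$ with $v = \frac{4}{\pi^2}\left(\arctan\frac{w_G}{h_G} - \arctan\frac{w_p}{h_p}\right)^2$, consider a hypothetical predicted box with corners (0, 0) and (2, 2) and a ground truth box with corners (1, 1) and (3, 3). These coordinates are made up for the example and do not come from the patent:

```latex
\begin{aligned}
IoU &= \frac{1 \cdot 1}{4 + 4 - 1} = \frac{1}{7}, \\
d^2 &= (2 - 1)^2 + (2 - 1)^2 = 2, \\
c^2 &= (3 - 0)^2 + (3 - 0)^2 = 18, \\
v &= \frac{4}{\pi^2}\left(\arctan\frac{2}{2} - \arctan\frac{2}{2}\right)^2 = 0, \\
L_{CIoU} &= 1 - \frac{1}{7} + \frac{2}{18} + \alpha \cdot 0 = \frac{61}{63} \approx 0.968.
\end{aligned}
```

Because both boxes are square, the aspect-ratio term vanishes and the loss is driven by the overlap and center-distance terms alone.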
[0034] The present invention also provides a device for detecting defects on the inner surface of steel, comprising:
[0035] The feature extraction module is used to input the image of the inner surface of the steel to be detected into the steel inner surface detection model, and obtain the first scale feature P1, the second scale feature P2, the third scale feature P3 and the fourth scale feature P′4 through the backbone network;
[0036] The feature fusion module is used to input the fourth-scale feature P′4 and the third-scale feature P3 into the third cross-fusion module to obtain the third fused feature P′3; input the third fused feature P′3 and the second-scale feature P2 into the second cross-fusion module to obtain the second fused feature P′2; and input the second fused feature P′2 and the first-scale feature P1 into the first cross-fusion module to obtain the first fused feature P′1.
[0037] In each cross-fusion module, the deep feature $P'_i$ is resized by an upsampling operation to match the size of the shallow feature $P_{i-1}$, yielding the upsampled deep feature $P''_i$; a query sequence is generated from the upsampled deep feature $P''_i$, and key and value sequences are generated from the shallow feature $P_{i-1}$; the similarity between the query and key sequences is calculated; the similarity between the query and key sequences is multiplied by the value sequence and passed through a Sigmoid activation function to obtain a fusion gating matrix; the fusion gating matrix is applied as a weight to the shallow feature $P_{i-1}$ to obtain the fusion feature $P'_{i-1}$;
[0038] The detection module is used to input the first fusion feature P′1, the second fusion feature P′2, and the third fusion feature P′3 into the first detection head, the second detection head, and the third detection head, respectively, to obtain the detection results of defects on the inner surface of the steel.
[0039] Compared with the prior art, the above-described technical solution of the present invention has the following advantages:
[0040] The present invention discloses a method for detecting defects on the inner surface of steel. Based on the YOLOv8 model, a neck network is constructed using a cross-fusion module to fuse multi-scale features output by the backbone network. The cross-fusion module utilizes a self-attention mechanism to selectively fuse higher-level information into shallower features based on deep semantic features during the upsampling process. This achieves holistic processing of upsampling and feature fusion, enabling targeted fusion of specific features from contextual information to effectively obtain overall information about the defect region on the inner surface of the steel. Furthermore, it indirectly ensures the principle of minimizing information loss during the fusion stage, reducing information loss in the feature fusion stage and contributing to improving the accuracy of detecting defects on the inner surface of steel.
[0041] Furthermore, this invention employs the CIoU loss function as the regression loss of the model. The CIoU loss function focuses on the complete intersection between target boxes and introduces a correction factor to more accurately measure the similarity between target boxes. This enables the model to better understand the accurate location and specific shape of the target boxes during training, which not only allows for more accurate evaluation of the quality of the target boxes but also improves the accuracy of the model in detecting defects on the inner surface of steel in complex scenarios.
Brief Description of the Drawings
[0042] To make the content of this invention easier to understand, the invention will be further described in detail below with reference to specific embodiments and accompanying drawings, wherein:
[0043] Figure 1 is a structural diagram of the steel inner surface detection model of the present invention;
[0044] Figure 2 is a structural diagram of the cross-fusion module;
[0045] Figure 3 is a schematic diagram of the CIoU loss function;
[0046] Figure 4 is an example image showing the results of detecting internal surface defects in steel using the YOLOv8 model;
[0047] Figure 5 is an example image showing the results of detecting internal surface defects in steel using the C-YOLOv8 model;
[0048] Figure 6 is a flowchart for detecting defects on the inner surface of steel.
Detailed Description of the Embodiments
[0049] The present invention will be further described below with reference to the accompanying drawings and specific embodiments, so that those skilled in the art can better understand and implement the present invention. However, the embodiments described are not intended to limit the present invention.
[0050] Existing detection algorithms commonly suffer from misaligned and insufficient feature fusion when dealing with irregular defects in complex scenarios. This is because traditional algorithms often use simple concatenation or addition in the fusion phase, so semantic information and detailed features are not fused effectively, which in turn degrades subsequent detection results. Furthermore, traditional algorithms use a fusion-then-sampling strategy to compensate for detailed features lost during upsampling. With this strategy, subtle characteristics of the fused features are easily lost during upsampling, and the relationship between the two feature levels is not modeled comprehensively, which further degrades detection results.
[0051] To address the issues of insufficient and misaligned feature fusion in existing detection methods, this invention constructs a steel internal surface detection model based on the YOLOv8 model, referred to as the C-YOLOv8 model. The C-YOLOv8 model is an end-to-end network employing an "encoder-decoder" architecture. As shown in Figure 1, it includes a backbone network, a neck network, and a detection head.
[0052] The backbone network of the steel inner surface detection model adopts the backbone network of the YOLOv8 model, which aims to extract deeper semantic information with a smaller model.
[0053] The backbone network includes an initial convolutional layer, a first feature extraction module, a second feature extraction module, a third feature extraction module, a fourth feature extraction module, and an SPPF module connected in sequence.
[0054] The initial convolutional layer is a 3×3 convolutional block.
[0055] Each n×n convolutional block in the network structure includes an n×n two-dimensional convolutional layer, a normalization layer, and a SiLU activation function connected in sequence.
[0056] Each feature extraction module consists of a 3×3 convolutional block and a C2f module connected in sequence.
[0057] The input features of the C2f module first pass through an input convolutional block. The output features of the input convolutional block are split into two parts by a splitting operation. One part undergoes feature processing through multiple Bottleneck blocks, while the other part is retained as the retained features. The features processed by the multiple Bottleneck blocks are concatenated with the retained features, and then passed through an output convolutional block to obtain the output features of the C2f module. Both the input and output convolutional blocks of the C2f module are 1×1 convolutional blocks.
[0058] The input features of the Bottleneck block are sequentially passed through a 3×3 convolutional block and a 1×1 convolutional block. Then, the output features of the 3×3 convolutional block and the output features of the 1×1 convolutional block are added together via a residual connection to obtain the output features of the Bottleneck block.
[0059] Preferably, the first feature extraction module includes 3 Bottleneck blocks, the second feature extraction module includes 6 Bottleneck blocks, the third feature extraction module includes 6 Bottleneck blocks, and the fourth feature extraction module includes 3 Bottleneck blocks.
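The data flow of the convolutional block, the Bottleneck block, and the C2f module described above can be sketched in NumPy. This is a minimal illustration that follows the wording of the paragraphs above, not the patent's or any library's implementation. Several details are assumptions: the normalization layer is omitted, all convolutions use stride 1 with zero padding, the second half of the split is the one routed through the Bottleneck blocks, and the names `conv_block`, `bottleneck`, and `c2f` are hypothetical.

```python
import numpy as np


def conv_block(x, weight):
    """n x n 'same' convolution (stride 1) followed by SiLU.

    x: (H, W, C_in) feature map; weight: (n, n, C_in, C_out).
    Assumption: the normalization layer is left out to keep the sketch short.
    """
    n = weight.shape[0]
    pad = n // 2
    h, w, _ = x.shape
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    # Sum the contribution of every kernel tap as a per-pixel matrix product.
    y = sum(xp[i:i + h, j:j + w] @ weight[i, j]
            for i in range(n) for j in range(n))
    return y / (1.0 + np.exp(-y))  # SiLU(y) = y * sigmoid(y)


def bottleneck(x, w3, w1):
    """3x3 block, then 1x1 block, then add the two block outputs."""
    y3 = conv_block(x, w3)
    y1 = conv_block(y3, w1)
    return y3 + y1


def c2f(x, w_in, bottleneck_weights, w_out):
    """1x1 input block, split, Bottleneck chain, concatenate, 1x1 output block."""
    y = conv_block(x, w_in)
    kept, branch = np.split(y, 2, axis=-1)  # retained half / processed half
    for w3, w1 in bottleneck_weights:
        branch = bottleneck(branch, w3, w1)
    return conv_block(np.concatenate([branch, kept], axis=-1), w_out)


rng = np.random.default_rng(0)
c = 8
x = rng.normal(size=(5, 5, c))
w_in = rng.normal(size=(1, 1, c, c)) * 0.1
w_out = rng.normal(size=(1, 1, c, c)) * 0.1
# Three Bottleneck blocks, as in the first feature extraction module.
blocks = [(rng.normal(size=(3, 3, c // 2, c // 2)) * 0.1,
           rng.normal(size=(1, 1, c // 2, c // 2)) * 0.1) for _ in range(3)]
out = c2f(x, w_in, blocks, w_out)
print(out.shape)  # (5, 5, 8)
```

The sketch keeps the spatial size unchanged; any downsampling performed by the 3×3 convolutional block of a feature extraction module is outside its scope.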
[0060] The input features of the SPPF module are sequentially passed through an input convolutional block and three cascaded max-pooling layers. The output features of the input convolutional block and the output features of each max-pooling layer are concatenated and then passed through an output convolutional block to obtain the output features of the SPPF module. Both the input and output convolutional blocks of the SPPF module are 1×1 convolutional blocks.
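A minimal NumPy sketch of the SPPF data flow described above follows. The pooling window (5×5), stride 1, and "same" padding are assumptions, since the paragraph does not state them; the normalization layer is again omitted, and the function names are hypothetical.

```python
import numpy as np


def conv1x1(x, weight):
    """1x1 convolutional block on an (H, W, C_in) map: per-pixel matmul + SiLU."""
    y = x @ weight
    return y / (1.0 + np.exp(-y))


def max_pool_same(x, k=5):
    """k x k max pooling with stride 1; padding keeps the spatial size."""
    pad = k // 2
    h, w, _ = x.shape
    padded = np.pad(x, ((pad, pad), (pad, pad), (0, 0)),
                    constant_values=-np.inf)
    windows = [padded[i:i + h, j:j + w] for i in range(k) for j in range(k)]
    return np.max(windows, axis=0)


def sppf(x, w_in, w_out, k=5):
    """Input block, three cascaded max-pooling layers, concatenation, output block."""
    y0 = conv1x1(x, w_in)
    y1 = max_pool_same(y0, k)
    y2 = max_pool_same(y1, k)
    y3 = max_pool_same(y2, k)
    return conv1x1(np.concatenate([y0, y1, y2, y3], axis=-1), w_out)


rng = np.random.default_rng(0)
x = rng.normal(size=(6, 6, 8))
w_in = rng.normal(size=(8, 4)) * 0.1    # 8 -> 4 channels
w_out = rng.normal(size=(16, 8)) * 0.1  # 4 concatenated maps of 4 channels -> 8
out = sppf(x, w_in, w_out)
print(out.shape)  # (6, 6, 8)
```

Cascading three small pooling layers lets each later output see a progressively larger neighborhood while reusing the previous result, which is why the four concatenated maps cover several receptive-field sizes.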
[0061] The neck network of the steel inner surface inspection model consists of three cross fusion modules (CFM). The cross fusion module adopts a self-attention fusion strategy, which integrates the fusion and sampling processes. During the upsampling process, it selectively fuses adjacent features to obtain the complete overall area of defects on the inner surface of the steel.
[0062] Specifically, the structure of the cross-fusion module is shown in Figure 2. In each cross-fusion module, the deep feature $P'_i \in \mathbb{R}^{H_i \times W_i \times C_i}$ first undergoes an upsampling operation so that its size matches that of the shallow feature $P_{i-1} \in \mathbb{R}^{H_{i-1} \times W_{i-1} \times C_{i-1}}$, yielding the upsampled deep feature $P''_i$. Here $H_i$, $W_i$, and $C_i$ represent the height, width, and number of channels of the deep feature, respectively, and $H_{i-1}$, $W_{i-1}$, and $C_{i-1}$ represent the height, width, and number of channels of the shallow feature, respectively.
[0063] The formulas for generating the query sequence from the upsampled deep feature $P''_i$ and the key and value sequences from the shallow feature $P_{i-1}$ are as follows:
[0064] $$Q = P''_i W_Q, \quad K = P_{i-1} W_K, \quad V = P_{i-1} W_V$$
[0065] Where Q represents the query sequence, K represents the key sequence, and V represents the value sequence; $W_Q$, $W_K$, and $W_V$ are the linear projections corresponding to Q, K, and V, respectively.
[0066] The similarity between the query sequence and the key sequence is calculated using the following formula:
[0067] C′=σ(Q×K)
[0068] Where Q represents the query sequence, K represents the key sequence, σ represents the Sigmoid activation function, and C′ represents the similarity between the query sequence and the key sequence.
[0069] The similarity between the query sequence and the key sequence is multiplied by the value sequence and then passed through a Sigmoid activation function to obtain a fusion gating matrix, further improving the pixel-level fusion between adjacent stages. The Sigmoid activation function acts as a gate, filtering out irrelevant factors between the two sequences and focusing on capturing regions of common interest. The formula for generating the fusion gating matrix is expressed as:
[0070] C″=σ(C′×V)
[0071] Where C″ represents the fusion gating matrix, σ represents the Sigmoid activation function, C′ represents the similarity between the query sequence and the key sequence, and V represents the value sequence.
[0072] The fusion gating matrix is applied as a weight to the shallow feature $P_{i-1}$, guiding the shallow feature to focus on small regions after fusion and yielding the fused feature $P'_{i-1}$. The formula is expressed as:
[0073] $$P'_{i-1} = C'' \odot P_{i-1}$$
[0074] Here, ⊙ represents element-wise (pixel-level) multiplication.
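The cross-fusion steps above can be sketched in NumPy as follows. This is an illustrative reading under stated assumptions, not the patented implementation: feature maps are flattened to pixel sequences, Q×K is interpreted as the matrix product of Q with the transpose of K, the value projection keeps the shallow channel count so that the gate matches the shallow feature, and upsampling is nearest-neighbor. The function names and the random projection weights are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def upsample_nearest(x, size):
    """Nearest-neighbor resize of an (H, W, C) map to (size[0], size[1], C)."""
    h, w, _ = x.shape
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return x[rows][:, cols]


def cross_fusion(p_deep, p_shallow, w_q, w_k, w_v):
    """Fuse a deep feature into a shallow one through a sigmoid gate."""
    h, w, c_s = p_shallow.shape
    p_up = upsample_nearest(p_deep, (h, w))   # P''_i
    shallow_seq = p_shallow.reshape(h * w, c_s)
    q = p_up.reshape(h * w, -1) @ w_q         # Q, shape (N, d)
    k = shallow_seq @ w_k                     # K, shape (N, d)
    v = shallow_seq @ w_v                     # V, shape (N, C_shallow)
    sim = sigmoid(q @ k.T)                    # C', shape (N, N); assumes Q K^T
    gate = sigmoid(sim @ v)                   # C'', shape (N, C_shallow)
    return (gate * shallow_seq).reshape(h, w, c_s)  # P'_{i-1}


rng = np.random.default_rng(0)
p_deep = rng.normal(size=(4, 4, 16))    # deep feature: small map, many channels
p_shallow = rng.normal(size=(8, 8, 8))  # shallow feature: larger map
w_q = rng.normal(size=(16, 4)) * 0.1
w_k = rng.normal(size=(8, 4)) * 0.1
w_v = rng.normal(size=(8, 8)) * 0.1
fused = cross_fusion(p_deep, p_shallow, w_q, w_k, w_v)
print(fused.shape)  # (8, 8, 8)
```

Because the gate values lie between 0 and 1, the output has the shape of the shallow feature and each of its entries is a scaled-down copy of the corresponding shallow entry.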
[0075] This invention constructs a cross-fusion module and combines it with the backbone network of the YOLOv8 model. The cross-fusion module employs a self-attention fusion strategy, using deep features as a benchmark to holistically process the fusion and upsampling processes. It selectively fuses specific features from contextual information, thereby effectively acquiring overall information about the defect region on the inner surface of steel and indirectly ensuring the principle of minimizing information loss during the fusion stage. This innovation provides a novel and efficient solution for the research of steel inner surface defect detection algorithms in complex environments, and has significant practical application value.
[0076] The detection head of the steel inner surface inspection model adopts the YOLOv8 model detection head, which includes three detection heads, used to detect large targets, medium targets and small targets respectively.
[0077] The detection head includes a regression branch and a classification branch. Each branch consists of two 3×3 convolutional blocks and a 1×1 two-dimensional convolutional layer connected in sequence. The regression branch outputs the position and size of the predicted bounding box; the classification branch outputs the probability that the predicted target belongs to each category.
[0078] The specific steps for detecting internal surface defects in steel using the steel inner surface detection model include:
[0079] The image of the inner surface of the steel to be detected is input into the backbone network of the steel inner surface detection model. The initial features are obtained after passing through the initial convolutional layer. The initial features are then processed by the first feature extraction module to obtain the first scale feature P1. The first scale feature P1 is processed by the second feature extraction module to obtain the second scale feature P2. The second scale feature P2 is processed by the third feature extraction module to obtain the third scale feature P3. The third scale feature P3 is then processed by the fourth feature extraction module and the SPPF module to obtain the fourth scale feature P′4.
[0080] Input the fourth scale feature P′4 and the third scale feature P3 into the third cross-fusion module to obtain the third fusion feature P′3; input the third fusion feature P′3 and the second scale feature P2 into the second cross-fusion module to obtain the second fusion feature P′2; input the second fusion feature P′2 and the first scale feature P1 into the first cross-fusion module to obtain the first fusion feature P′1;
[0081] The first fusion feature P′1, the second fusion feature P′2, and the third fusion feature P′3 are input into the first detection head, the second detection head, and the third detection head, respectively, to obtain the detection results of defects on the inner surface of the steel.
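The routing of features through the three cross-fusion modules and the three detection heads can be traced with a short Python sketch. The stand-in functions below only record the data flow as strings; they are hypothetical placeholders, not model components.

```python
def detect(image, backbone, cfm1, cfm2, cfm3, head1, head2, head3):
    """Top-down routing: backbone features -> cross-fusion modules -> heads."""
    p1, p2, p3, p4_deep = backbone(image)
    p3_fused = cfm3(p4_deep, p3)   # third cross-fusion module
    p2_fused = cfm2(p3_fused, p2)  # second cross-fusion module
    p1_fused = cfm1(p2_fused, p1)  # first cross-fusion module
    return head1(p1_fused), head2(p2_fused), head3(p3_fused)


# Stand-ins that describe what each stage received.
def fake_backbone(image):
    return "P1", "P2", "P3", "P'4"


def fake_cfm(deep, shallow):
    return f"CFM({deep}, {shallow})"


def fake_head(feature):
    return f"head[{feature}]"


results = detect("image", fake_backbone, fake_cfm, fake_cfm, fake_cfm,
                 fake_head, fake_head, fake_head)
for line in results:
    print(line)
# head[CFM(CFM(CFM(P'4, P3), P2), P1)]
# head[CFM(CFM(P'4, P3), P2)]
# head[CFM(P'4, P3)]
```

The nesting in the printed trace shows that each shallower fused feature already contains the information fused at every deeper stage.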
[0082] Current models use only a simple loss function to evaluate the difference between predicted and actual bounding boxes during training. However, when the environment is complex, the difference in defect size is small, or the defect dimensions are irregular, this simple loss function cannot obtain the optimal weight values for the model. Therefore, it is necessary to introduce a loss function more suited to the specific environment to train the model and achieve optimal results.
[0083] Preferably, when training the steel inner surface detection model, the present invention uses the CIoU loss function as the regression loss. The CIoU loss function considers the complete intersection between target boxes and introduces a correction factor to more accurately measure the similarity between target boxes, enabling the steel inner surface detection model to better understand the accurate position and shape of the target boxes during training.
[0084] As shown in Figure 3, the CIoU loss function is expressed as:
[0085] $$L_{CIoU} = 1 - IoU + \frac{d^2}{c^2} + \alpha v$$
[0086] Where $L_{CIoU}$ represents the CIoU loss function, IoU represents the intersection over union of the predicted box and the ground truth box, d represents the distance between the center points of the predicted box and the ground truth box, c represents the diagonal length of the smallest rectangle enclosing the predicted box and the ground truth box, α represents the balancing parameter, and v represents the correction factor.
[0087] The formula for calculating the correction factor is:
[0088] $$v = \frac{4}{\pi^2}\left(\arctan\frac{w_G}{h_G} - \arctan\frac{w_p}{h_p}\right)^2$$
[0089] Where $w_G$ and $h_G$ represent the width and height of the ground truth box, respectively, and $w_p$ and $h_p$ represent the width and height of the predicted box, respectively.
[0090] The formula for calculating the balancing parameter is:
[0091] $$\alpha = \frac{v}{(1 - IoU) + v}$$
[0092] During model training, to address the issue of incomplete overlap between predicted and ground truth bounding boxes, this invention employs the CIoU loss function, with its correction factor, as the regression loss, enabling the model to measure the similarity between the predicted box and the ground truth box more accurately. This addresses the inability of the traditional IoU loss function to capture the accurate position and specific shape of the target box. It not only evaluates the quality of the target box more accurately but also improves the model's performance in complex scenes.
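For a single pair of boxes, the CIoU loss can be sketched in plain Python. The sketch assumes the commonly published form of the loss, $1 - IoU + d^2/c^2 + \alpha v$, axis-aligned boxes given as (x1, y1, x2, y2) corners with positive width and height, and a guard that sets α to 0 when v is 0; the function name `ciou_loss` is hypothetical.

```python
import math


def ciou_loss(pred, gt):
    """CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    wp, hp = px2 - px1, py2 - py1
    wg, hg = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    iou = inter / (wp * hp + wg * hg - inter)

    # Squared distance between box centers (d^2).
    d2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
        + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    # Squared diagonal of the smallest enclosing rectangle (c^2).
    c2 = (max(px2, gx2) - min(px1, gx1)) ** 2 \
        + (max(py2, gy2) - min(py1, gy1)) ** 2

    # Correction factor v and balancing parameter alpha.
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(wp / hp)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0  # guard against 0 / 0

    return 1 - iou + d2 / c2 + alpha * v


print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # 0.0 for identical boxes
print(ciou_loss((0, 0, 4, 2), (1, 0, 3, 2)))  # same center, different aspect ratio
```

In the second call the centers coincide, so the distance term is zero and the loss exceeds the plain IoU loss only through the aspect-ratio term αv.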
[0093] Figures 4 and 5 show example results of detecting internal surface defects in steel using the YOLOv8 model and the C-YOLOv8 model, respectively. As can be seen from the figures, the C-YOLOv8 model proposed in this invention performs significantly better than the YOLOv8 model. These results highlight the effectiveness of the C-YOLOv8 model in addressing the challenges posed by defect diversity, irregular shapes, and complex backgrounds, providing reliable technical support for the field of steel internal surface defect detection algorithms.
[0094] In summary, the steel internal surface defect detection method of this invention, based on the YOLOv8 model, constructs a neck network using a cross-fusion module to fuse multi-scale features output by the backbone network. This cross-fusion module utilizes a self-attention mechanism to selectively fuse higher-level information into shallower features during upsampling, based on deep semantic features. This achieves holistic processing of upsampling and feature fusion, enabling targeted fusion of specific features from contextual information to effectively acquire overall information about the steel internal surface defect region. Furthermore, it indirectly ensures the principle of minimizing information loss during the fusion stage, reducing information loss in the feature fusion stage and contributing to improved accuracy in steel internal surface defect detection.
[0095] Furthermore, this invention employs the CIoU loss function as the regression loss of the model. The CIoU loss function focuses on the complete intersection between target boxes and introduces a correction factor to more accurately measure the similarity between target boxes. This enables the model to better understand the accurate location and specific shape of the target boxes during training, which not only allows for more accurate evaluation of the quality of the target boxes but also improves the accuracy of the model in detecting defects on the inner surface of steel in complex scenarios.
[0096] In practical applications, as shown in Figure 6, the main steps for detecting defects on the inner surface of steel are as follows:
[0097] Dataset preprocessing includes: annotating the collected steel inner surface defects with roLabelImg to construct the dataset; generating the dataset's XML file; converting the dataset's XML file to a TXT file; expanding the dataset through data augmentation operations such as flipping and mirroring; and dividing the dataset into training and testing sets.
[0098] Model construction includes: building the steel inner surface detection model from the backbone network of the YOLOv8 model, the neck network constructed from the cross-fusion modules, and the detection head of the YOLOv8 model;
[0099] Model training includes: determining the weights required for training the steel inner surface detection model; setting model parameters; training the weights of the steel inner surface detection model using the training set; and obtaining the trained steel inner surface detection model.
[0100] The model prediction includes: inputting the image of the inner surface of the steel to be detected into the trained steel inner surface detection model to obtain the detection results of the defects on the inner surface of the steel; randomly sampling within a certain area based on the detection results to calculate various indicators; extracting the color features of the detection results; and assigning different RGB values to different types of defects to represent the type of defects and their specific locations.
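The data-handling parts of the steps above (flip augmentation, the training/testing split, and per-class colors) can be sketched in plain Python. Everything specific here is an assumption made for illustration: labels are taken to be axis-aligned TXT rows of the form (class, x_center, y_center, width, height) normalized to [0, 1], with any rotation angle from the annotation tool ignored; the 80/20 split ratio, the class names, and the RGB values are hypothetical.

```python
import random


def hflip_labels(labels):
    """Mirror boxes left-right: only the x center changes."""
    return [(cls, 1.0 - xc, yc, w, h) for cls, xc, yc, w, h in labels]


def vflip_labels(labels):
    """Flip boxes top-bottom: only the y center changes."""
    return [(cls, xc, 1.0 - yc, w, h) for cls, xc, yc, w, h in labels]


def split_dataset(items, train_ratio=0.8, seed=0):
    """Shuffle reproducibly, then cut into training and testing sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_ratio)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical mapping from defect type to the RGB value used when drawing results.
DEFECT_COLOURS = {
    "scratch": (255, 0, 0),
    "pit": (0, 255, 0),
    "patch": (0, 0, 255),
    "corrosion": (255, 255, 0),
}


def colour_for(defect_type):
    return DEFECT_COLOURS[defect_type]


labels = [(0, 0.25, 0.5, 0.2, 0.1)]
print(hflip_labels(labels))  # [(0, 0.75, 0.5, 0.2, 0.1)]
train, test = split_dataset(range(10))
print(len(train), len(test))  # 8 2
```

The corresponding image must be flipped in the same direction as its labels, otherwise the boxes no longer line up with the defects.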
[0101] Based on the above-described method for detecting internal surface defects in steel, this embodiment also provides a device for detecting internal surface defects in steel, comprising:
[0102] The feature extraction module is used to input the image of the inner surface of the steel to be detected into the steel inner surface detection model, and obtain the first scale feature P1, the second scale feature P2, the third scale feature P3 and the fourth scale feature P′4 through the backbone network;
[0103] The feature fusion module is used to input the fourth-scale feature P′4 and the third-scale feature P3 into the third cross-fusion module to obtain the third fused feature P′3; input the third fused feature P′3 and the second-scale feature P2 into the second cross-fusion module to obtain the second fused feature P′2; and input the second fused feature P′2 and the first-scale feature P1 into the first cross-fusion module to obtain the first fused feature P′1.
[0104] In each cross-fusion module, the deep feature $P'_i$ is resized by an upsampling operation to match the size of the shallow feature $P_{i-1}$, yielding the upsampled deep feature $P''_i$; a query sequence is generated from the upsampled deep feature $P''_i$, and key and value sequences are generated from the shallow feature $P_{i-1}$; the similarity between the query and key sequences is calculated; the similarity between the query and key sequences is multiplied by the value sequence and passed through a Sigmoid activation function to obtain a fusion gating matrix; the fusion gating matrix is applied as a weight to the shallow feature $P_{i-1}$ to obtain the fusion feature $P'_{i-1}$;
[0105] The detection module is used to input the first fusion feature P′1, the second fusion feature P′2, and the third fusion feature P′3 into the first detection head, the second detection head, and the third detection head, respectively, to obtain the detection results of defects on the inner surface of the steel.
[0106] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0107] This application is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce means for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0108] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0109] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0110] Obviously, the above embodiments are merely illustrative examples for clear explanation and are not intended to limit the implementation. Those skilled in the art will recognize that other variations or modifications can be made based on the above description. It is neither necessary nor possible to exhaustively list all possible implementations. However, obvious variations or modifications derived therefrom are still within the scope of protection of this invention.
Claims
1. A method for detecting defects on the inner surface of steel, characterized in that it comprises:
inputting an image of the inner surface of the steel to be detected into a steel inner surface detection model, and obtaining first-scale features $P_1$, second-scale features $P_2$, third-scale features $P_3$ and fourth-scale features $P_4$ through the backbone network;
inputting the fourth-scale features $P_4$ and the third-scale features $P_3$ into the third cross-fusion module to obtain the third fusion feature $P'_3$; inputting the third fusion feature $P'_3$ and the second-scale features $P_2$ into the second cross-fusion module to obtain the second fusion feature $P'_2$; inputting the second fusion feature $P'_2$ and the first-scale features $P_1$ into the first cross-fusion module to obtain the first fusion feature $P'_1$;
wherein, in each cross-fusion module, the fusion feature $P'_i$ is resized by an upsampling operation to match the size of the shallow feature $P_{i-1}$, yielding the upsampled deep feature $P''_i$; the upsampled deep feature $P''_i$ generates the query sequence, and the shallow feature $P_{i-1}$ generates the key sequence and the value sequence; the similarity between the query sequence and the key sequence is calculated; the similarity is multiplied by the value sequence and passed through a Sigmoid activation function to obtain a fusion gating matrix; the fusion gating matrix is applied as a weight to the shallow feature $P_{i-1}$ to obtain the fusion feature $P'_{i-1}$;
wherein the formulas by which the upsampled deep feature $P''_i$ generates the query sequence and the shallow feature $P_{i-1}$ generates the key sequence and the value sequence are: $Q = W_Q P''_i$, $K = W_K P_{i-1}$, $V = W_V P_{i-1}$; where $Q$ denotes the query sequence, $K$ denotes the key sequence, and $V$ denotes the value sequence; $W_Q$, $W_K$ and $W_V$ are the linear projections corresponding to $Q$, $K$ and $V$, respectively;
the formula for calculating the similarity between the query sequence and the key sequence is: [formula missing in the source text]; where $\sigma$ denotes the Sigmoid activation function and $S$ denotes the similarity between the query sequence and the key sequence;
the similarity between the query sequence and the key sequence is multiplied by the value sequence and passed through the Sigmoid activation function to obtain the fusion gating matrix, according to the formula: $G = \sigma(S V)$; where $G$ denotes the fusion gating matrix;
the fusion gating matrix is applied as a weight to the shallow feature $P_{i-1}$ to obtain the fusion feature $P'_{i-1}$, according to the formula: $P'_{i-1} = G \odot P_{i-1}$; where $\odot$ denotes the pixel-level (element-wise) product;
inputting the first fusion feature $P'_1$, the second fusion feature $P'_2$ and the third fusion feature $P'_3$ into the first detection head, the second detection head and the third detection head, respectively, to obtain the detection results of defects on the inner surface of the steel.
2. The method for detecting defects on the inner surface of steel according to claim 1, characterized in that the backbone network and the detection heads of the steel inner surface detection model both adopt those of the YOLOv8 model.
3. The method for detecting defects on the inner surface of steel according to claim 1, characterized in that, when training the steel inner surface detection model, the CIoU loss function is used as the regression loss; the CIoU loss function is expressed as: $L_{CIoU} = 1 - IoU + \frac{\rho^2}{c^2} + \alpha v$; where $L_{CIoU}$ denotes the CIoU loss function, $IoU$ denotes the intersection over union between the predicted bounding box and the ground-truth bounding box, $\rho$ denotes the distance between the center points of the predicted bounding box and the ground-truth bounding box, $c$ denotes the diagonal length of the smallest rectangle enclosing the predicted bounding box and the ground-truth bounding box, $\alpha$ denotes the balance parameter, and $v$ denotes the correction factor.
4. The method for detecting defects on the inner surface of steel according to claim 3, characterized in that the formula for calculating the correction factor is: $v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2$; where $v$ denotes the correction factor, $w^{gt}$ and $h^{gt}$ denote the width and height of the ground-truth bounding box, respectively, and $w$ and $h$ denote the width and height of the predicted bounding box, respectively.
5. The method for detecting defects on the inner surface of steel according to claim 4, characterized in that the formula for calculating the balance parameter is: $\alpha = \frac{v}{(1 - IoU) + v}$; where $\alpha$ denotes the balance parameter, $IoU$ denotes the intersection over union between the predicted bounding box and the ground-truth bounding box, and $v$ denotes the correction factor.
6. A device for detecting defects on the inner surface of steel, characterized in that the device is used to implement the method for detecting defects on the inner surface of steel according to any one of claims 1 to 5, and comprises:
a feature extraction module, used to input the image of the inner surface of the steel to be detected into the steel inner surface detection model, and to obtain first-scale features $P_1$, second-scale features $P_2$, third-scale features $P_3$ and fourth-scale features $P_4$ through the backbone network;
a feature fusion module, used to input the fourth-scale features $P_4$ and the third-scale features $P_3$ into the third cross-fusion module to obtain the third fusion feature $P'_3$; to input the third fusion feature $P'_3$ and the second-scale features $P_2$ into the second cross-fusion module to obtain the second fusion feature $P'_2$; and to input the second fusion feature $P'_2$ and the first-scale features $P_1$ into the first cross-fusion module to obtain the first fusion feature $P'_1$; wherein, in each cross-fusion module, the fusion feature $P'_i$ is resized by an upsampling operation to match the size of the shallow feature $P_{i-1}$, yielding the upsampled deep feature $P''_i$; the upsampled deep feature $P''_i$ generates the query sequence, and the shallow feature $P_{i-1}$ generates the key sequence and the value sequence; the similarity between the query sequence and the key sequence is calculated; the similarity is multiplied by the value sequence and passed through a Sigmoid activation function to obtain a fusion gating matrix; the fusion gating matrix is applied as a weight to the shallow feature $P_{i-1}$ to obtain the fusion feature $P'_{i-1}$;
a detection module, used to input the first fusion feature $P'_1$, the second fusion feature $P'_2$ and the third fusion feature $P'_3$ into the first detection head, the second detection head and the third detection head, respectively, to obtain the detection results of defects on the inner surface of the steel.
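As an illustration of the regression loss in claims 3 to 5, the following is a minimal Python sketch of the standard CIoU loss for a single pair of boxes. It is not taken from the patent: the corner format `(x1, y1, x2, y2)`, the function name `ciou_loss`, and the small constant `eps` that guards against division by zero are all assumptions made for this example.

```python
import math


def ciou_loss(pred, gt, eps=1e-9):
    """CIoU loss for two boxes given as (x1, y1, x2, y2); eps is an assumed guard."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    w, h = px2 - px1, py2 - py1            # predicted box width and height
    w_gt, h_gt = gx2 - gx1, gy2 - gy1      # ground-truth box width and height

    # Intersection over union.
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = w * h + w_gt * h_gt - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centers (rho squared).
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2

    # Squared diagonal of the smallest rectangle enclosing both boxes (c squared).
    enc_w = max(px2, gx2) - min(px1, gx1)
    enc_h = max(py2, gy2) - min(py1, gy1)
    c2 = enc_w ** 2 + enc_h ** 2

    # Correction factor v (aspect-ratio consistency) and balance parameter alpha.
    v = (4 / math.pi ** 2) * (math.atan(w_gt / (h_gt + eps))
                              - math.atan(w / (h + eps))) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / (c2 + eps) + alpha * v


if __name__ == "__main__":
    print(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10)))   # identical boxes
    print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))       # disjoint boxes
```

For identical boxes every term vanishes and the loss is close to zero. For disjoint boxes with the same aspect ratio, $IoU = 0$ and $v = 0$, so the loss reduces to $1 + \rho^2 / c^2$, which still provides a gradient that pulls the predicted box toward the ground truth.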