Laser vision real-time weld seam tracking method based on lightweight target detection network

By constructing a lightweight target detection network and combining knowledge distillation and convolution kernel pruning techniques, the weld seam tracking method was optimized, solving the problem of low real-time performance in weld seam tracking and achieving efficient welding on resource-constrained equipment.

CN116652387B · Active · Publication Date: 2026-06-26 · SOUTH CHINA UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SOUTH CHINA UNIV OF TECH
Filing Date
2023-06-19
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing weld seam tracking methods based on laser vision systems have low real-time performance under resource constraints, resulting in reduced welding accuracy.

Method used

A lightweight object detection network is adopted. By constructing teacher network models and student network models, and combining knowledge distillation and convolution kernel pruning techniques, the student network model is optimized to reduce computational overhead and improve real-time performance.

Benefits of technology

While maintaining high welding accuracy, the number of model parameters and the computational resource requirements are reduced, and the real-time performance and computing speed of weld seam tracking are improved, making the method suitable for equipment with limited computing resources.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses a laser vision real-time weld seam tracking method based on a lightweight target detection network, and the method comprises the following steps: before welding, an industrial camera collects an initial weld seam image and transmits the initial weld seam image to an embedded industrial computer, pixel coordinate values of initial weld seam feature points are acquired, and a starting position of the weld seam is obtained; a lightweight target neural network is constructed; during welding, the industrial camera continuously collects weld seam images and sends the weld seam images to the embedded industrial computer, and pixel coordinate values of weld seam feature points in the weld seam images are extracted by using the lightweight target neural network; a welding position to be reached by a robot is obtained through the pixel coordinate values of the weld seam feature points; a difference between the welding position of the robot and a current position of the robot is calculated, and the resulting deviation value is sent to a control cabinet, so that the welding gun of the robot is controlled to automatically track and move along the weld seam. The weld seam image is detected by using the lightweight target neural network, which solves the problem that existing weld seam tracking methods have low real-time performance on resource-limited devices.

Description

Technical Field

[0001] This invention belongs to the field of weld seam tracking technology, specifically relating to a laser vision real-time weld seam tracking method, storage medium, and system based on a lightweight target detection network.

Background Technology

[0002] Welding, as a crucial component of industrial manufacturing, is widely used in industries such as automotive processing, shipbuilding, and general machinery. With the development of industrial automation, high-efficiency, automated robotic welding has been widely adopted in industrial production, solving problems such as harsh working environments, high labor intensity, and inconsistent weld quality associated with traditional manual welding. Traditional welding robots primarily use a teach-and-playback programming mode. This method is not only inflexible but also struggles to meet the high precision requirements of welding processes. Therefore, with the rapid development of deep learning and laser vision technologies, welding methods that combine robots with laser vision sensors for weld seam tracking are gradually replacing the traditional teach-and-playback welding approach.

[0003] The weld seam tracking method based on a laser vision system first detects the position of the weld seam from the image data acquired by the laser vision sensor using an algorithm. Then, based on the position of the weld seam, it controls the robot to complete the welding work efficiently and accurately.

[0004] However, due to the high real-time requirements of weld seam tracking, the existing weld seam tracking methods based on laser vision systems require a large computational overhead. Under resource constraints, this leads to a reduction in the real-time performance of weld seam tracking, thus failing to meet the real-time requirements of weld seam tracking and affecting the accuracy of welding.

Summary of the Invention

[0005] The purpose of this invention is to overcome the shortcomings of the existing technology and provide a laser vision real-time weld seam tracking method based on a lightweight target detection network. This method effectively solves the problem that existing weld seam tracking methods require large computational overhead, which leads to low real-time performance of weld seam tracking under resource-constrained equipment, thus affecting welding accuracy.

[0006] Meanwhile, another objective of the present invention is to provide a storage medium.

[0007] Meanwhile, another objective of the present invention is to provide a laser vision real-time weld seam tracking system based on a lightweight target detection network.

[0008] The objective of this invention is achieved through the following technical solution:

[0009] A laser vision real-time weld seam tracking method based on a lightweight target detection network includes the following steps:

[0010] S1. Before the welding work begins, the industrial camera in the laser vision sensor acquires an initial weld seam image characterized by laser stripes and transmits it to the embedded industrial control computer. The embedded industrial control computer initializes the initial weld seam image, obtains the pixel coordinate values of the initial weld seam feature points in the initial weld seam image, and converts them into three-dimensional coordinate values in the base coordinate system of the welding robot to obtain the starting position of the weld seam.

[0011] S2. Construct a teacher network model and a student network model. Train the teacher network model using the preprocessed weld seam image dataset to obtain an optimized teacher network model. Use knowledge distillation to extract the knowledge obtained from training the optimized teacher network model and transfer it to the student network model to obtain a preliminary optimized student network model. Finally, apply the convolution kernel pruning method to remove redundant parameters in the preliminary optimized student network model to obtain the optimal student network model. Use the optimal student network model as the lightweight object detection network.

[0012] S3. When the welding work begins, the industrial camera continuously acquires weld seam images and sends them to the embedded industrial control computer. The lightweight target detection network is used to process the acquired weld seam images and extract the pixel coordinate values of the weld seam feature points in the acquired weld seam images.

[0013] S4. Convert the pixel coordinate values of the weld feature points obtained in step S3 into three-dimensional coordinate values in the base coordinate system of the welding robot to obtain the welding position that the welding robot needs to reach. Calculate the difference between the welding position and the current position of the welding robot, and send the obtained deviation value to the welding robot control cabinet in real time. The welding robot control cabinet then controls the welding torch of the welding robot to track the weld seam, thereby completing the real-time automatic tracking of the weld seam.
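The deviation computation in step S4 can be sketched as follows. This is only a minimal illustration, assuming that positions are plain Cartesian points in the robot base coordinate system; `Point3D` and `deviation` are hypothetical names, and torch orientation and the communication with the control cabinet are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point3D:
    """A position in the welding robot's base coordinate system (assumed units: mm)."""
    x: float
    y: float
    z: float


def deviation(target: Point3D, current: Point3D) -> Point3D:
    """Offset the torch still has to travel to reach the target weld position."""
    return Point3D(target.x - current.x, target.y - current.y, target.z - current.z)


# Illustrative values only, not taken from the patent.
target = Point3D(10.5, 2.0, 3.0)
current = Point3D(10.0, 2.5, 3.0)
d = deviation(target, current)
print(d)
```

In a real system this offset would be recomputed for every processed frame and sent to the control cabinet.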

[0014] Preferably, step S2 includes the following steps:

[0015] S21. Construct a teacher network model and a student network model based on the SSD algorithm; both the teacher network model and the student network model adopt deep convolutional neural networks, and the number of parameters of the student network model is smaller than that of the teacher network model.

[0016] S22. Obtain weld seam images from historical welding processes, construct the weld seam image dataset, preprocess the weld seam image dataset, and train the teacher network model using the preprocessed weld seam image dataset to obtain an optimized teacher network model.

[0017] S23. Knowledge distillation is used to extract the knowledge obtained from training the optimized teacher network model and transfer it to the student network model. The student network model is then trained using the weld seam image dataset to obtain a preliminary optimized student network model.

[0018] S24. Apply the convolution kernel pruning method to remove redundant parameters in the initially optimized student network model, and fine-tune the pruned student network model using the weld seam image dataset to obtain the optimal student network model, and use the optimal student network model as the lightweight object detection network.

[0019] Preferably, the training process of the teacher network model in step S22 is as follows:

[0020] The optimization method used for training is gradient descent. Before training begins, weld image training data is obtained from the preprocessed weld image dataset. At the start of training, the weight parameters of the teacher network model are randomly initialized. Then, the weld image training data is input into the teacher network model for inference, and the output of the teacher network model is compared with the real weld to obtain the loss function value. Then, gradient backpropagation is performed to update the weights of the teacher network model, and the model continues to be trained iteratively. When the loss function value no longer decreases and the model test accuracy meets the welding requirements, training is stopped, and the optimized teacher network model is saved.

[0021] Preferably, the knowledge distillation operation described in step S23 is as follows:

[0022] For knowledge distillation, the distillation loss is defined as follows:

[0023] $$L_{distill} = \frac{1}{N} \sum_{c=1}^{C} \sum_{w=1}^{W} \sum_{h=1}^{H} \left( F^{T}_{c,w,h} - \phi(F^{S})_{c,w,h} \right)^{2}$$

[0024] where $L_{distill}$ is the distillation loss; $\phi$ is a convolution operation that matches the number of channels of the student's feature layer to that of the teacher's feature layer; $F^{S}$ and $F^{T}$ are the feature maps of the student network model and the teacher network model, respectively; $N = C \times W \times H$ is the total number of elements in the distilled feature layer, which serves as the knowledge transfer layer; and $C$, $W$, and $H$ are the number of channels, width, and height of the feature map, respectively.

[0025] Then, to enable the student network model to learn the knowledge of the optimized teacher network model, a distillation operation is performed on the student network model. Both the distillation loss $L_{distill}$ and the original loss of the SSD algorithm $L_{SSD}$ participate in the model training process, and the total loss is expressed as:

[0026] $$L_{total} = L_{SSD} + \lambda L_{distill}$$

[0027] where $L_{distill}$ is the distillation loss, $L_{SSD}$ is the original loss of the SSD algorithm, and $\lambda$ is an adjustable hyperparameter that balances the loss terms;

[0028] During training, the weights of the teacher network model are frozen and do not participate in gradient updates, while the student network model continuously learns from the teacher network model and updates its own weights to minimize the total loss function $L_{total}$, thereby obtaining the optimized student network model.
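To make the distillation step concrete, the sketch below computes a feature-map distillation loss and a combined loss on nested Python lists. It is an illustration under stated assumptions rather than the patented implementation: the loss is assumed to be a mean squared error over the channel-adapted student feature map, the total loss is assumed to be the SSD loss plus a weighted distillation term, and the names `adapt_channels`, `distillation_loss`, `total_loss`, and `lam` are hypothetical.

```python
def adapt_channels(student, weights):
    """1x1 convolution: mix the student's channels into the teacher's channel count.

    student: [C_s][H][W] feature map; weights: [C_t][C_s] mixing coefficients.
    """
    height, width = len(student[0]), len(student[0][0])
    return [
        [[sum(wt[c] * student[c][y][x] for c in range(len(student)))
          for x in range(width)]
         for y in range(height)]
        for wt in weights
    ]


def distillation_loss(student, teacher, weights):
    """Mean squared difference between teacher and adapted student features."""
    adapted = adapt_channels(student, weights)
    total, count = 0.0, 0
    for a_channel, t_channel in zip(adapted, teacher):
        for a_row, t_row in zip(a_channel, t_channel):
            for a, t in zip(a_row, t_row):
                total += (t - a) ** 2
                count += 1
    return total / count


def total_loss(ssd_loss, distill_loss, lam):
    """Assumed combination: detection loss plus weighted distillation loss."""
    return ssd_loss + lam * distill_loss


# Toy data: a 1-channel student map adapted to a 2-channel teacher map.
student_map = [[[1.0, 2.0], [3.0, 4.0]]]
adapter = [[1.0], [2.0]]
teacher_map = [[[1.0, 2.0], [3.0, 4.0]], [[2.0, 4.0], [6.0, 9.0]]]
l_distill = distillation_loss(student_map, teacher_map, adapter)
l_total = total_loss(1.0, l_distill, lam=2.0)
print(l_distill, l_total)
```

In an actual training loop only the student weights and the adapter weights would receive gradients, since the teacher is frozen.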

[0029] Preferably, the convolution kernel pruning process of the initially optimized student network model in step S24 is as follows:

[0030] First, let the pruning ratio of the convolution kernels in the i-th layer of the initially optimized student network model be $P_i$. The number of convolution kernels in this layer is then reduced from $n_i$ to $(1 - P_i)\,n_i$, and the size of the output feature tensor $X_i$ becomes $(1 - P_i)\,n_i \times h_i \times w_i$;

[0031] Generally, a smaller norm of a convolutional kernel leads to smaller activation outputs, thus having a smaller impact on the final model's predictions. Based on this understanding, the $L_p$ norm is used to evaluate the importance of each convolutional kernel. The $L_p$ norm of the j-th convolutional kernel in the i-th convolutional layer of the model is represented as follows:

[0032] $$\|F_{i,j}\|_{p} = \left( \sum_{c=1}^{C_i} \sum_{w=1}^{W_i} \sum_{h=1}^{H_i} \left| F_{i,j}(c, w, h) \right|^{p} \right)^{1/p}$$

[0033] where $F_{i,j}$ is the j-th convolutional kernel of the i-th convolutional layer in the model; $\|F_{i,j}\|_{p}$ is the $L_p$ norm of that kernel; $c$, $w$, and $h$ are the indices over the kernel's channels, width, and height, respectively; $C_i$, $W_i$, and $H_i$ are the number of channels, width, and height of the kernel, respectively; and $p$ takes the value 2, i.e., the $L_2$ norm is used;

[0034] In each convolutional layer, the kernels are sorted according to their norm values, and kernels with smaller norm values are removed according to a pre-set pruning ratio.

[0035] The convolutional kernels to be removed from the i-th layer of the preliminarily optimized student network model are identified and removed. Then, the pruned student network model is fine-tuned using the preprocessed weld seam image dataset to obtain the optimal student network model. The optimal student network model is then used as the lightweight object detection network.
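The norm-based selection described above can be sketched in a few lines. This is a minimal illustration, assuming that each kernel is a nested list indexed as [channel][row][column] and that the number of kernels to remove is the pruning ratio times the kernel count, rounded down; `l2_norm` and `kernels_to_keep` are hypothetical names.

```python
import math


def l2_norm(kernel):
    """L2 norm of one convolution kernel given as [channel][row][col]."""
    return math.sqrt(sum(v * v for channel in kernel for row in channel for v in row))


def kernels_to_keep(kernels, prune_ratio):
    """Indices of the kernels that survive after dropping the smallest-norm ones."""
    n_remove = int(len(kernels) * prune_ratio)
    ranked = sorted(range(len(kernels)), key=lambda j: l2_norm(kernels[j]))
    return sorted(ranked[n_remove:])


# Four toy single-channel 2x2 kernels; values are illustrative only.
layer = [
    [[[0.1, 0.0], [0.0, 0.1]]],
    [[[1.0, 2.0], [2.0, 1.0]]],
    [[[0.5, 0.5], [0.5, 0.5]]],
    [[[0.0, 0.01], [0.0, 0.0]]],
]
kept = kernels_to_keep(layer, prune_ratio=0.5)
print(kept)
```

After removing kernels from a layer, the matching input channels of the next layer have to be removed as well, which this sketch does not show.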

[0036] Preferably, the specific process of step S3 is as follows:

[0037] When welding begins, the industrial camera continuously acquires images at a sampling frequency of 60 frames per second and sends them to the embedded industrial control computer for image processing. The embedded industrial control computer crops out the regions containing weld stripe features from the acquired images and inputs them into the lightweight target detection network. The lightweight target detection network processes the continuously input weld images and finally outputs a single feature map with a scale of 7×7 based on the input weld images. The feature map contains the position coordinates of the weld. Then, a non-maximum suppression algorithm is performed on the feature map to finally obtain the center coordinates of the target candidate box, which are the pixel coordinate values of the weld feature points in the input weld image.
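The final post-processing step, non-maximum suppression followed by taking the box center, can be illustrated as below. This is a hedged sketch: the corner format `(x1, y1, x2, y2)`, the IoU threshold of 0.5, and the function names are assumptions, and decoding the network's 7×7 output into candidate boxes is not shown.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop boxes overlapping it too much, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


def box_center(box):
    """Center of a box, used here as the weld feature point in pixels."""
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


# Illustrative candidates: two overlapping boxes and one distant low-score box.
boxes = [(100, 100, 164, 164), (104, 102, 168, 166), (300, 300, 364, 364)]
scores = [0.9, 0.8, 0.3]
kept_boxes = nms(boxes, scores)
feature_point = box_center(boxes[kept_boxes[0]])
print(kept_boxes, feature_point)
```

Because a frame is expected to contain one weld feature point, the center of the highest-scoring surviving box is taken as the result here.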

[0038] Preferably, the lightweight object detection network includes a first convolutional block, a second convolutional block, a third convolutional block, a fourth convolutional block, a fifth convolutional block, a sixth convolutional block, and a seventh convolutional block connected in sequence. Each of the first, second, third, and fourth convolutional blocks is followed by a max pooling layer. Each of the first, second, third, fourth, fifth, sixth, and seventh convolutional blocks contains a batch normalization layer and an activation function layer.

[0039] Preferably, step S1 specifically includes the following steps:

[0040] S11. Before the welding work begins, place the workpiece to be welded on the worktable, adjust the position and posture of the welding robot's robotic arm so that the end of the welding torch is above the weld seam of the workpiece to be welded, and make the laser vision sensor fixed on the welding torch in the optimal working position.

[0041] S12. The industrial camera in the laser vision sensor acquires an initial weld image characterized by laser stripes and sends it to an embedded industrial control computer. The embedded industrial control computer performs thresholding by calling the library function of Halcon software and performs initialization processing using morphological methods to obtain the pixel coordinate values of the initial weld feature points.

[0042] S13. Using a calibration algorithm, the pixel coordinate values of the initial weld feature points are converted into three-dimensional coordinate values in the base coordinate system of the welding robot to obtain the starting position of the weld.
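The text does not specify the calibration algorithm used in step S13. As one common way to do such a conversion, the sketch below assumes a pinhole camera with known intrinsics, a calibrated laser plane expressed in the camera frame, and a single 4×4 homogeneous transform from the camera frame to the robot base frame. All names and numeric values are hypothetical.

```python
def pixel_to_camera(u, v, fx, fy, cx, cy, plane):
    """Intersect the camera ray through pixel (u, v) with the laser plane.

    plane = (a, b, c, d) describes a*x + b*y + c*z + d = 0 in the camera frame.
    """
    a, b, c, d = plane
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)
    denom = a * ray[0] + b * ray[1] + c * ray[2]
    if abs(denom) < 1e-12:
        raise ValueError("camera ray is parallel to the laser plane")
    t = -d / denom
    return (t * ray[0], t * ray[1], t * ray[2])


def transform(matrix, point):
    """Apply a 4x4 homogeneous transform (row-major nested lists) to a 3D point."""
    x, y, z = point
    return tuple(
        matrix[r][0] * x + matrix[r][1] * y + matrix[r][2] * z + matrix[r][3]
        for r in range(3)
    )


# Assumed calibration: laser plane z = 200 in the camera frame, and a
# camera-to-base transform that is a pure translation.
camera_to_base = [
    [1.0, 0.0, 0.0, 500.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 100.0],
    [0.0, 0.0, 0.0, 1.0],
]
p_cam = pixel_to_camera(420, 240, 1000.0, 1000.0, 320.0, 240.0, (0.0, 0.0, 1.0, -200.0))
p_base = transform(camera_to_base, p_cam)
print(p_cam, p_base)
```

With the sensor mounted on the torch, the camera-to-base transform would normally combine a fixed hand-eye calibration with the robot's current pose.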

[0043] A storage medium for storing non-transitory computer instructions, which, when executed, perform the laser vision real-time weld seam tracking method based on a lightweight target detection network.

[0044] A laser vision real-time weld seam tracking system based on a lightweight target detection network includes a welding robot, a welding torch, a robot control cabinet, supporting welding equipment, a laser vision sensor, an embedded industrial computer, and a workbench. The embedded industrial computer includes a processor and a memory. The memory stores non-transitory computer instructions. When the non-transitory computer instructions are executed by the processor, the laser vision real-time weld seam tracking method based on the lightweight target detection network is executed.

[0045] The present invention has the following advantages over the prior art:

[0046] (1) The laser vision real-time weld seam tracking method based on a lightweight target detection network of the present invention extracts features of the weld seam through a lightweight target detection network. Compared with the original SSD network, the lightweight target detection network of the present invention reduces the number of model parameters by knowledge distillation and convolution kernel pruning while maintaining high welding accuracy. This results in the lightweight target detection network model occupying less memory, having a faster operation speed, and requiring and consuming less computing resources. It can be widely used in various application scenarios of target detection, especially embedded platforms and mobile devices with limited computing resources. This effectively solves the problem that the existing weld seam tracking method based on laser vision system requires a large computing overhead, which leads to a decrease in the real-time performance of weld seam tracking under resource constraints, thus failing to meet the real-time requirements of weld seam tracking and affecting the welding accuracy.

[0047] (2) The laser vision real-time weld tracking system based on a lightweight target detection network of the present invention collects weld images and weld feature points through the laser vision sensor for detection, and performs subsequent communication, calculation and processing through an embedded industrial control computer. It has a high degree of automation and high welding efficiency.

Attached Figure Description

[0048] Figure 1 is a schematic diagram of the overall structure of the laser vision real-time weld seam tracking system based on a lightweight target detection network according to an embodiment of the present invention;

[0049] Figure 2 is a schematic diagram of the structure of the laser vision sensor in the laser vision real-time weld seam tracking system based on a lightweight target detection network according to an embodiment of the present invention;

[0050] Figure 3 is a schematic diagram of the laser vision real-time weld seam tracking method based on a lightweight target detection network according to an embodiment of the present invention;

[0051] Figure 4 is a schematic diagram of typical noise in the laser vision real-time weld seam tracking method based on a lightweight target detection network according to an embodiment of the present invention;

[0052] Figure 5 is a schematic diagram of knowledge distillation and transfer based on a lightweight target detection network according to an embodiment of the present invention;

[0053] Figure 6 is a schematic diagram of the structure of a lightweight target detection network according to an embodiment of the present invention;

[0054] In the diagram: 1-Welding equipment, 2-Welding robot, 3-Welding torch, 4-Laser vision sensor; 41-Sensor protective housing, 42-Industrial camera, 43-Optical filter, 44-Protective glass, 45-Three-line laser generator, 5-Workpiece, 6-Workbench, 7-Embedded industrial computer, 8-Welding robot control cabinet, 9-Laser stripe, 10-Feature points of weld, 11-Arc light.

Detailed Implementation

[0055] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are some embodiments of the present invention, but not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0056] The technical solution of the present invention will be further described below with reference to the accompanying drawings and specific embodiments.

[0057] Example 1

[0058] As shown in Figures 1-6, the laser vision real-time weld seam tracking method based on a lightweight target detection network includes the following steps:

[0059] S1. Before the welding work begins, the industrial camera 42 in the laser vision sensor 4 acquires an initial weld seam image characterized by laser stripes 9 and transmits it to the embedded industrial control computer 7. The embedded industrial control computer 7 initializes the initial weld seam image, obtains the pixel coordinate values of the initial weld seam feature points in the initial weld seam image, and converts them into three-dimensional coordinate values in the base coordinate system of the welding robot 2 to obtain the starting position of the weld seam.

[0060] Specifically, step S1 includes the following steps:

[0061] S11. Before the welding work begins, the workpiece to be welded is placed on the worktable 6. The position and posture of the robotic arm of the welding robot 2 are adjusted so that the end of the welding torch 3 is above the weld seam of the workpiece to be welded, and the laser vision sensor 4 fixed on the welding torch 3 is in the optimal working position, that is, it can capture a clear image during the welding process, and will not cause interference between the laser vision sensor 4 and the workpiece to be welded and the fixture.

[0062] S12. The industrial camera 42 in the laser vision sensor 4 acquires an initial weld seam image characterized by laser stripes and sends it to the embedded industrial control computer 7. The embedded industrial control computer 7 performs threshold processing by calling the library function of Halcon software and performs initialization processing using morphological methods to obtain the pixel coordinate values of the initial weld seam feature points.

[0063] S13. Using a calibration algorithm, the pixel coordinate values of the initial weld feature points are converted into three-dimensional coordinate values in the base coordinate system of the welding robot 2 to obtain the starting position of the weld.

[0064] S2. Construct a teacher network model and a student network model. Train the teacher network model using the preprocessed weld seam image dataset to obtain an optimized teacher network model. Use knowledge distillation to extract the knowledge obtained from training the optimized teacher network model and transfer it to the student network model to obtain a preliminary optimized student network model. Finally, apply the convolution kernel pruning method to remove redundant parameters in the preliminary optimized student network model to obtain the optimal student network model. Use the optimal student network model as the lightweight object detection network.

[0065] Specifically, step S2 includes the following steps:

[0066] S21. Construct a teacher network model and a student network model based on the SSD algorithm; both the teacher network model and the student network model adopt deep convolutional neural networks, and the number of parameters of the student network model is smaller than that of the teacher network model.

[0067] As shown in Figure 5, the backbone network of the teacher network model is a VGG-16 convolutional neural network model. Compared with the original VGG-16 convolutional neural network, the teacher network model removes the last convolutional block of the original VGG-16 network model, adds three convolutional layers as feature extraction heads, and sets the stride of the middle layer of the three added layers to 2. At the same time, the 6-scale feature map prediction method of the original SSD algorithm is changed in the teacher network model to a single feature map prediction method, and the number of default boxes at each position is changed to 3, with sizes of 64, 112, and 150, respectively. This setting reduces the number of network parameters, lowers the computational load of the model, and speeds up model inference.
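The single-feature-map default box layout can be illustrated as follows. This sketch assumes a 7×7 prediction grid (the scale stated elsewhere in this document), square default boxes whose side lengths in pixels are 64, 112, and 150, and a hypothetical input size of 224 pixels, which the text does not give.

```python
def default_boxes(grid=7, image_size=224, sizes=(64, 112, 150)):
    """Default boxes as (center_x, center_y, width, height) in input pixels.

    One box per size is placed at the center of every grid cell.
    image_size is an assumed value used only for illustration.
    """
    step = image_size / grid
    boxes = []
    for row in range(grid):
        for col in range(grid):
            center_x, center_y = (col + 0.5) * step, (row + 0.5) * step
            for size in sizes:
                boxes.append((center_x, center_y, size, size))
    return boxes


anchors = default_boxes()
print(len(anchors), anchors[0])
```

With one feature map and three boxes per position, the detector scores 7 × 7 × 3 = 147 candidates per image, far fewer than the 8732 default boxes of the original SSD300 configuration.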

[0068] Specifically, the teacher network model has a total of 13 convolutional layers and 4 max pooling layers, and a batch normalization layer and an activation function layer are added after each convolution operation.

[0069] Specifically, as shown in Figure 5, the student network model has a structure similar to that of the teacher network model. Compared with the teacher network model, the student network model removes one convolutional layer from each of the first, third, and fourth convolutional blocks.

[0070] S22. Obtain weld seam images from historical welding processes, construct a weld seam image dataset, preprocess the weld seam image dataset, and train the teacher network model using the preprocessed weld seam image dataset to obtain an optimized teacher network model.

[0071] Specifically, before model training begins, weld seam images from historical welding processes are acquired. These images are weld seam pictures with arc noise. The weld seam images from historical welding processes are used to construct a weld seam image dataset and are used as the training dataset to train the model. In addition, to increase the diversity of training samples and to conform to real welding scenarios, random offsets and regional noise are added to the images during training.
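The two augmentations mentioned above, random offsets and regional noise, might look like the sketch below. It is an assumption-laden illustration on grayscale images stored as nested lists: the shift range, the square noise region, the uniform noise distribution, and the function names are all hypothetical.

```python
import random


def random_offset(image, max_shift, rng):
    """Shift the image by a random (dx, dy), filling uncovered pixels with 0."""
    height, width = len(image), len(image[0])
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    shifted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = image[src_y][src_x]
    return shifted, (dx, dy)


def add_region_noise(image, region_size, rng, max_value=255):
    """Overwrite one random square region with uniform random intensities."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - region_size)
    left = rng.randint(0, width - region_size)
    noisy = [row[:] for row in image]
    for y in range(top, top + region_size):
        for x in range(left, left + region_size):
            noisy[y][x] = rng.randint(0, max_value)
    return noisy


# Toy 8x8 image with a single bright pixel standing in for the feature point.
img = [[0] * 8 for _ in range(8)]
img[4][4] = 255
shifted, (dx, dy) = random_offset(img, 2, random.Random(0))
noisy = add_region_noise(img, 3, random.Random(1))
```

When an image is shifted, the labelled feature point coordinates have to be shifted by the same `(dx, dy)` so that the annotation stays aligned.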

[0072] The optimization method used for training is gradient descent. The SGD optimizer in the PyTorch framework is used to optimize the network model parameters, and the optimized teacher network model is saved. The training process of the teacher network model is as follows:

[0073] Before training begins, weld image training data is obtained from the preprocessed weld image dataset. At the start of training, the weight parameters of the teacher network model are randomly initialized. Then, the weld image training data is input into the teacher network model for inference. The output of the teacher network model is compared with the real weld to obtain the loss function value. Then, gradient backpropagation is performed to update the weights of the teacher network model. The model is iteratively trained. When the loss function value no longer decreases and the model test accuracy meets the welding requirements, training is stopped and the optimized teacher network model is saved.

[0074] Specifically, the training conditions for the teacher network model described in step S22 are set as follows: learning rate of 0.3, number of iterations of 100, and model testing accuracy of error within 0.5 mm.
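The stopping rule (loss no longer decreasing and test error within 0.5 mm) can be expressed as a small helper. This is a sketch only: the 0.5 mm limit comes from the text, while `patience` and `min_delta`, which define what "no longer decreases" means here, are hypothetical parameters.

```python
def should_stop(losses, test_error_mm, max_error_mm=0.5, patience=5, min_delta=1e-4):
    """Return True when the loss has plateaued and the accuracy target is met.

    losses: per-iteration loss values recorded so far.
    The loss counts as plateaued if the last `patience` values did not improve
    on the earlier best by at least `min_delta`.
    """
    if test_error_mm > max_error_mm:
        return False
    if len(losses) <= patience:
        return False
    best_before = min(losses[:-patience])
    recent_best = min(losses[-patience:])
    return best_before - recent_best < min_delta


plateau = [1.0, 0.5, 0.3, 0.3, 0.3, 0.3]
improving = [1.0, 0.8, 0.6, 0.4, 0.2, 0.1]
print(should_stop(plateau, 0.4, patience=3))
print(should_stop(improving, 0.4, patience=3))
```

A fixed budget, such as the 100 iterations stated above, would act as an upper bound alongside this check.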

[0075] S23. Knowledge distillation is used to extract the knowledge obtained from training the optimized teacher network model and transfer it to the student network model. The student network model is then trained using the weld seam image dataset to obtain a preliminary optimized student network model.

[0076] As shown in Figure 5, the knowledge distillation operation described in step S23 is as follows:

[0077] For knowledge distillation, the distillation loss is defined as follows:

[0078] $$L_{distill} = \frac{1}{N} \sum_{c=1}^{C} \sum_{w=1}^{W} \sum_{h=1}^{H} \left( F^{T}_{c,w,h} - \phi(F^{S})_{c,w,h} \right)^{2}$$

[0079] where $L_{distill}$ is the distillation loss; $\phi$ is a convolution operation that matches the number of channels of the student's feature layer to that of the teacher's feature layer; $F^{S}$ and $F^{T}$ are the feature maps of the student network model and the teacher network model, respectively; $N = C \times W \times H$ is the total number of elements in the distilled feature layer, which serves as the knowledge transfer layer; and $C$, $W$, and $H$ are the number of channels, width, and height of the feature map, respectively.

[0080] Then, to enable the student network model to learn the knowledge of the optimized teacher network model, a distillation operation is performed on the student network model. Both the distillation loss $L_{distill}$ and the original loss of the SSD algorithm $L_{SSD}$ participate in the model training process, and the total loss is expressed as:

[0081] $$L_{total} = L_{SSD} + \lambda L_{distill}$$

[0082] where $L_{distill}$ is the distillation loss, $L_{SSD}$ is the original loss of the SSD algorithm, and $\lambda$ is an adjustable hyperparameter that balances the loss terms;

[0083] During training, the weights of the teacher network model are frozen and do not participate in gradient updates, while the student network model continuously learns from the teacher network model and updates its own weights to minimize the total loss function $L_{total}$, thereby obtaining the optimized student network model.

[0084] S24. Apply the convolution kernel pruning method to remove redundant parameters in the initially optimized student network model, and fine-tune the pruned student network model using the weld seam image dataset to obtain the optimal student network model, and use the optimal student network model as the lightweight object detection network.

[0085] Specifically, the convolution kernel pruning process of the initially optimized student network model described in step S24 is as follows:

[0086] First, let the pruning ratio of the convolution kernels in the i-th layer of the initially optimized student network model be $P_i$. The number of convolution kernels in this layer is then reduced from $n_i$ to $(1 - P_i)\,n_i$, and the size of the output feature tensor $X_i$ becomes $(1 - P_i)\,n_i \times h_i \times w_i$;

[0087] Generally, a smaller norm of a convolutional kernel leads to smaller activation outputs, thus having a smaller impact on the final model's predictions. Based on this understanding, the $L_p$ norm is used to evaluate the importance of each convolutional kernel. The $L_p$ norm of the j-th convolutional kernel in the i-th convolutional layer of the model is represented as follows:

[0088] $$\|F_{i,j}\|_{p} = \left( \sum_{c=1}^{C_i} \sum_{w=1}^{W_i} \sum_{h=1}^{H_i} \left| F_{i,j}(c, w, h) \right|^{p} \right)^{1/p}$$

[0089] where $F_{i,j}$ is the j-th convolutional kernel of the i-th convolutional layer in the model; $\|F_{i,j}\|_{p}$ is the $L_p$ norm of that kernel; $c$, $w$, and $h$ are the indices over the kernel's channels, width, and height, respectively; $C_i$, $W_i$, and $H_i$ are the number of channels, width, and height of the kernel, respectively; and $p$ takes the value 2, i.e., the $L_2$ norm is used;

[0090] In each convolutional layer, the kernels are sorted according to their norm values, and kernels with smaller norm values are removed according to a pre-set pruning ratio.

[0091] The definition of kernel removal is as follows:

[0092] The convolution operation of the i-th layer of the preliminarily optimized student network model is represented as follows:

[0093] $$X_{i,j} = m_{i,j} \left( F_{i,j} * X_{i-1} \right)$$

[0094] where $X_{i,j}$ is the j-th channel of the feature map $X_i$ of the i-th layer, $X_{i-1}$ is the feature map of the (i-1)-th layer, $F_{i,j}$ is the j-th convolutional kernel of the i-th convolutional layer, $*$ denotes the convolution operation, and $m_{i,j}$ is the pruning factor;

[0095] When the j-th convolution kernel is pruned, let $m_{i,j} = 0$; the corresponding output $X_{i,j}$ is then also 0, which means the convolution kernel has been pruned; otherwise, let $m_{i,j} = 1$;
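The effect of the pruning factor can be shown on already-computed per-kernel outputs. This is a minimal sketch, assuming each kernel's output channel is available as a nested list and that the factor is a 0/1 value per kernel; `pruning_mask`, `apply_mask`, and `drop_pruned` are hypothetical names.

```python
def pruning_mask(num_kernels, pruned_indices):
    """One factor per kernel: 0 if the kernel is pruned, 1 otherwise."""
    pruned = set(pruned_indices)
    return [0 if j in pruned else 1 for j in range(num_kernels)]


def apply_mask(channel_outputs, mask):
    """Multiply each kernel's output channel by its factor (zeroing pruned ones)."""
    return [[[m * v for v in row] for row in channel]
            for channel, m in zip(channel_outputs, mask)]


def drop_pruned(channel_outputs, mask):
    """Physically remove pruned channels, which is what shrinks the model."""
    return [channel for channel, m in zip(channel_outputs, mask) if m == 1]


# Three toy output channels of shape 1x2; kernel 1 is marked as pruned.
outputs = [[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]]
mask = pruning_mask(3, [1])
masked = apply_mask(outputs, mask)
compact = drop_pruned(outputs, mask)
print(mask, masked, compact)
```

Masking alone leaves the computation cost unchanged; the savings come from actually deleting the kernels whose factor is 0.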

[0096] The convolutional kernels to be removed from the i-th layer of the preliminarily optimized student network model are identified and removed. The pruned student network model is then fine-tuned on the preprocessed weld seam image dataset to obtain the optimal student network model, which is used as the lightweight object detection network.
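The removal step can be sketched as below, where a cutting factor of 0 marks a pruned kernel and 1 marks a kept one. This is a sketch under assumptions: weights are nested lists indexed [out_channel][in_channel][row][col], and the matching input channels of the following layer are dropped as well (a standard consequence of kernel pruning that the text does not spell out). The name `prune_layer` and the toy weights are hypothetical.

```python
def prune_layer(weights, next_weights, drop):
    """Remove the kernels listed in `drop` from one layer.

    weights:      [out][in][kh][kw] kernels of layer i
    next_weights: [out2][out][kh][kw] kernels of layer i+1
    Returns (cutting factors, pruned layer i, pruned layer i+1).
    """
    drop = set(drop)
    factors = [0 if j in drop else 1 for j in range(len(weights))]
    keep = [j for j, f in enumerate(factors) if f == 1]
    pruned = [weights[j] for j in keep]
    # Layer i+1 loses the input channels fed by the removed kernels.
    pruned_next = [[kernel[j] for j in keep] for kernel in next_weights]
    return factors, pruned, pruned_next

# Toy example: layer i has 4 kernels (1 input channel, 1x1),
# layer i+1 has 2 kernels with 4 input channels each.
layer_i = [[[[float(j)]]] for j in range(4)]
layer_next = [[[[1.0]] for _ in range(4)] for _ in range(2)]

factors, new_i, new_next = prune_layer(layer_i, layer_next, drop=[1, 3])
print(factors, len(new_i), len(new_next[0]))
```

After pruning, the kernel count of layer i and the input-channel count of layer i+1 both shrink by the same amount, which is why the output feature tensor of the pruned layer becomes smaller. Fine-tuning then runs on this reduced model.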

[0097] Specifically, the fine-tuning is run with a learning rate of 0.2 for 30 iterations.

[0098] S3. When the welding work begins, the industrial camera 42 continuously acquires weld seam images and sends them to the embedded industrial control computer 7. The lightweight target detection network processes the acquired weld seam images and extracts the pixel coordinate values of the weld seam feature points in them.

[0099] Specifically, the process of step S3 is as follows:

[0100] When welding begins, the industrial camera continuously acquires images at a sampling frequency of 60 frames per second and sends them to the embedded industrial control computer for image processing. The embedded industrial control computer crops out the regions containing weld stripe features from the acquired images and inputs them into the lightweight target detection network. The lightweight target detection network processes the continuously input weld seam images and outputs a single feature map with a scale of 7×7 for each input image. The feature map contains the position coordinates of the weld seam. A non-maximum suppression algorithm is then applied to the feature map to obtain the center coordinates of the target candidate box, which are the pixel coordinate values of the weld seam feature points in the input weld seam image.
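The post-processing at the end of step S3 can be illustrated with a minimal greedy non-maximum suppression in plain Python. This is a sketch, not the patented implementation: the box format (x1, y1, x2, y2, score), the IoU threshold of 0.5, and the candidate values are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(detections, iou_threshold=0.5):
    """Keep the highest-scoring boxes and drop boxes that overlap them too much."""
    kept = []
    for det in sorted(detections, key=lambda d: d[4], reverse=True):
        if all(iou(det, k) <= iou_threshold for k in kept):
            kept.append(det)
    return kept

def box_center(box):
    """Center of a box, used here as the weld seam feature point in pixels."""
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)

# Hypothetical candidates decoded from the network output.
candidates = [
    (100, 90, 140, 130, 0.92),
    (104, 92, 144, 132, 0.85),   # heavy overlap with the first box
    (20, 20, 60, 60, 0.30),      # separate low-score box
]
kept = nms(candidates)
feature_point = box_center(kept[0])
print(kept, feature_point)
```

The center of the highest-scoring surviving box is taken as the feature point. In practice a score threshold would usually be applied as well, which the text does not mention.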

[0101] As shown in Figure 6, the lightweight object detection network includes a first convolutional block, a second convolutional block, a third convolutional block, a fourth convolutional block, a fifth convolutional block, a sixth convolutional block, and a seventh convolutional block connected in sequence. Each of the first, second, third, and fourth convolutional blocks is followed by a max pooling layer. Each of the first to seventh convolutional blocks contains a batch normalization layer and an activation function layer. The numbers of channels in the first to seventh convolutional blocks of the lightweight object detection network are 8, 16, 32, 62, 62, 32, and 32, respectively.
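To give a rough sense of how small this backbone is, the sketch below counts convolution weights for the seven channel widths listed above. Several details are assumptions, not taken from the text: one convolution per block, 3×3 kernels, no bias terms, and a 3-channel input image. The detection outputs and batch normalization parameters are left out.

```python
def conv_weight_count(in_ch, out_ch, kernel=3):
    """Number of weights in one bias-free convolution layer."""
    return in_ch * out_ch * kernel * kernel

def backbone_weight_counts(channels, in_ch=3, kernel=3):
    """Per-block weight counts for a chain of convolution blocks."""
    counts = []
    for out_ch in channels:
        counts.append(conv_weight_count(in_ch, out_ch, kernel))
        in_ch = out_ch
    return counts

CHANNELS = [8, 16, 32, 62, 62, 32, 32]  # channel widths from the text

counts = backbone_weight_counts(CHANNELS)
total = sum(counts)
print(counts, total)
```

The result is only an order-of-magnitude estimate under the stated assumptions; the actual figures are those reported in Table 1.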

[0102] The lightweight target detection network works as follows:

[0103] A 224×224 pixel weld seam image is input into the lightweight object detection network for inference. During inference, each convolutional kernel slides across the feature map with a fixed stride, performs a convolution operation at each position, and outputs the calculated value to form the feature map of the next convolutional layer. The lightweight object detection network finally outputs a single feature map with a scale of 7×7, which contains the position coordinates of the weld seam. A non-maximum suppression algorithm is then applied to this feature map to obtain the center coordinates of the target candidate box, which are the pixel coordinates of the weld seam feature points in the input weld seam image.
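How the spatial size changes as the kernel slides with a fixed stride follows the usual output-size formula, floor((n + 2·padding − kernel) / stride) + 1. The sketch below applies it to a 224-pixel input, assuming size-preserving 3×3 convolutions and 2×2 max pooling with stride 2 in the four pooled blocks; these kernel, stride, and padding values are assumptions, since this section does not state them.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (n + 2 * padding - kernel) // stride + 1

def after_pooled_blocks(n, num_blocks=4):
    """Size after `num_blocks` blocks of (3x3 conv, pad 1) + (2x2 max pool, stride 2)."""
    for _ in range(num_blocks):
        n = out_size(n, kernel=3, stride=1, padding=1)  # assumed size-preserving conv
        n = out_size(n, kernel=2, stride=2)             # assumed 2x2 max pooling
    return n

size_after_pools = after_pooled_blocks(224)
total_factor = 224 // 7  # overall reduction implied by a 7x7 output
print(size_after_pools, total_factor)
```

Under these assumptions the four pooled blocks reduce 224 to 14, so a 7×7 output implies one further 2× reduction (an overall factor of 32) at a point this section does not specify.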

[0104] Specifically, the comparison table between the lightweight object detection network and the original SSD detection network is as follows:

[0105]

[0106] Table 1 Comparison between Lightweight Target Detection Network and Original SSD Detection Network

[0107] As can be seen from the table above, the lightweight object detection network has fewer parameters and therefore uses less memory, computes faster, and requires fewer computational resources. It can be widely used in embedded platforms and mobile devices with limited computing resources.

[0108] S4. The pixel coordinate values of the weld seam feature points obtained in step S3 are converted into three-dimensional coordinate values in the base coordinate system of the welding robot 2 to obtain the welding position that the welding robot 2 needs to reach. The difference between the welding position and the current position of the welding robot 2 is calculated, and the resulting deviation value is sent to the welding robot control cabinet 8 in real time. The welding robot control cabinet 8 then controls the welding torch 3 of the welding robot 2 to move along the weld seam, thereby completing the real-time automatic tracking of the weld seam.
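The last part of step S4 can be sketched as a frame transformation followed by a subtraction. The sketch assumes the weld feature point has already been recovered as a 3D point in the camera frame (the pixel-to-3D calibration is not detailed in this section) and that a 4×4 homogeneous transform from the camera frame to the robot base frame is available. The matrix `T_BASE_CAM`, the point values, and the function names are hypothetical.

```python
def transform_point(T, p):
    """Apply a 4x4 homogeneous transform T to a 3D point p."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(T[r][c] * v[c] for c in range(4)) for r in range(3))

def deviation(target, current):
    """Per-axis difference between the target and current positions."""
    return tuple(t - c for t, c in zip(target, current))

# Hypothetical camera-to-base transform (pure translation, in mm).
T_BASE_CAM = [
    [1.0, 0.0, 0.0, 500.0],
    [0.0, 1.0, 0.0, 20.0],
    [0.0, 0.0, 1.0, 300.0],
    [0.0, 0.0, 0.0, 1.0],
]

weld_point_cam = (2.0, -1.5, 150.0)        # feature point in the camera frame
target_base = transform_point(T_BASE_CAM, weld_point_cam)
current_base = (500.0, 20.0, 450.0)        # current torch position
dev = deviation(target_base, current_base)
print(target_base, dev)
```

The deviation vector is what would be sent to the control cabinet each cycle. A real transform also contains a rotation and depends on the current robot pose.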

[0109] Example 2

[0110] A storage medium for storing non-transitory computer instructions, which, when executed, perform the laser vision real-time weld seam tracking method based on a lightweight target detection network as described in Embodiment 1.

[0111] Example 3

[0112] As shown in Figures 1-2, a laser vision real-time weld seam tracking system based on a lightweight target detection network includes a welding robot 2, a welding torch 3, a welding robot control cabinet 8, supporting welding equipment 1, a laser vision sensor 4, an embedded industrial control computer 7, and a worktable 6. The embedded industrial control computer 7 includes a processor and a memory. The memory stores non-transitory computer instructions which, when executed by the processor, carry out the laser vision real-time weld seam tracking method based on a lightweight target detection network described in Embodiment 1.

[0113] Specifically, the laser vision sensor 4 is fixedly installed in front of the welding torch 3. The fixing device mainly consists of a clamping base and bolts and nuts. The welding torch 3 is placed in the clamping base, fastened with the bolts and nuts, and installed on the end of the welding robot 2. The embedded industrial control computer 7 is connected to the laser vision sensor 4 via an Ethernet cable. The welding torch 3 and the supporting welding equipment 1 are connected to the welding power supply via cables. The embedded industrial control computer 7 is connected to the welding robot control cabinet 8 via an Ethernet cable. The welding robot control cabinet 8, the welding robot 2, and the supporting welding equipment 1 are connected via cables. The laser vision sensor 4 and the welding torch 3 change their spatial positions through the movement of the welding robot 2. The worktable 6 is equipped with G-type clamps. The workpiece 5 is placed on the worktable 6 and clamped and positioned by two or more G-type clamps.

[0114] Specifically, the welding robot 2 is a Yaskawa MA1440 arc welding robot. The supporting welding equipment 1 includes a welding machine and a shielding gas cylinder. The welding machine is a Yaskawa MOTOWELD-RD350, which handles wire feeding and wire retraction for the welding robot 2. The shielding gas cylinder contains CO2 (20%) and N2 (80%).

[0115] As shown in Figure 2, the laser vision sensor 4 includes a black anodized sensor protective housing 41, an industrial camera 42, an optical filter 43, a protective glass 44, and a three-line laser generator 45. The industrial camera 42 and the three-line laser generator 45 are fixedly disposed inside the sensor protective housing 41. The optical filter 43 is installed at the front end of the industrial camera 42. The protective glass 44 is fixedly installed on the sensor protective housing 41 and located at the front end of the industrial camera 42 and the three-line laser generator 45. The three-line laser generator 45 is fastened to the sensor protective housing 41 by bolts and nuts, and forms a 30° angle with the industrial camera 42.

[0116] The working principle of the laser vision real-time weld seam tracking system based on a lightweight target detection network is as follows:

[0117] Before welding begins, the workpiece 5 is fixed on the worktable 6 using a fixture. The laser vision sensor 4 continuously acquires weld seam images and transmits them to the embedded industrial control computer 7. The processor in the embedded industrial control computer 7 extracts the coordinate values of the weld seam feature points from the acquired weld seam images. The welding position that the welding robot 2 needs to reach is obtained from the coordinate values of the weld seam feature points. The difference between this welding position and the current position of the welding robot 2 is then calculated, and the resulting deviation value is transmitted to the welding robot control cabinet 8. The welding robot control cabinet 8 outputs a signal to control the movement trajectory of the welding torch 3, realizing automatic tracking of the weld seam of the workpiece to be welded on the worktable 6 and thereby completing the automated welding of the welding robot 2.

[0118] In the description of this invention, it should be noted that, unless otherwise explicitly specified and agreed, the terms "set," "install," and "connect" should be interpreted broadly. For example, they can refer to a fixed connection, a detachable connection, or an integral connection; they can refer to a mechanical connection or an electrical connection; they can refer to a direct connection or an indirect connection through an intermediate medium; and they can refer to the internal connection of two components. Those skilled in the art can understand the specific meaning of the above terms in this invention according to the specific circumstances.

[0119] The above embodiments are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to the above embodiments. Any changes, modifications, substitutions, combinations, or simplifications made without departing from the spirit and principle of the present invention shall be considered equivalent substitutions and shall be included within the protection scope of the present invention.

Claims

1. A laser vision real-time weld seam tracking method based on a lightweight target detection network, characterized in that a laser vision real-time weld seam tracking system based on a lightweight target detection network is adopted, the system including: a welding robot, a welding torch, a welding robot control cabinet, supporting welding equipment, a laser vision sensor, an embedded industrial control computer, and a worktable, the embedded industrial control computer including a processor and a memory; the method includes the following steps:

S1. Before the welding work begins, the industrial camera in the laser vision sensor acquires an initial weld seam image characterized by laser stripes and transmits it to the embedded industrial control computer. The embedded industrial control computer performs initialization processing on the initial weld seam image, obtains the pixel coordinate values of the initial weld seam feature points in the initial weld seam image, and converts them into three-dimensional coordinate values in the base coordinate system of the welding robot to obtain the starting position of the weld seam.

S2. Construct a teacher network model and a student network model. Train the teacher network model using the preprocessed weld seam image dataset to obtain an optimized teacher network model. Use knowledge distillation to extract the knowledge obtained from training the optimized teacher network model and transfer it to the student network model to obtain a preliminarily optimized student network model. Finally, apply the convolution kernel pruning method to remove redundant parameters in the preliminarily optimized student network model to obtain the optimal student network model, and use the optimal student network model as the lightweight object detection network. Step S2 includes the following steps:

S21. Construct a teacher network model and a student network model based on the SSD algorithm; both the teacher network model and the student network model adopt deep convolutional neural networks, and the number of parameters of the student network model is smaller than that of the teacher network model. The teacher network model uses an improved VGG-16 convolutional neural network model as the backbone network. Compared with the original VGG-16 network model, the improved VGG-16 convolutional neural network model removes the last convolutional block and adds three convolutional layers as feature extraction heads. At the same time, the feature map prediction method of the original SSD algorithm is changed to a single feature map prediction method.

S22. Obtain weld seam images from historical welding processes, construct the weld seam image dataset, preprocess the weld seam image dataset, and train the teacher network model using the preprocessed weld seam image dataset to obtain an optimized teacher network model. The training process of the teacher network model described in step S22 is as follows: the optimization method used for training is gradient descent. Before training begins, weld seam image training data is obtained from the preprocessed weld seam image dataset. At the start of training, the weight parameters of the teacher network model are randomly initialized. The weld seam image training data is then input into the teacher network model for inference, and the output of the teacher network model is compared with the ground-truth weld seam to obtain the loss function value. Gradient backpropagation is then performed to update the weights of the teacher network model, and the model is trained iteratively. When the loss function value no longer decreases and the model test accuracy meets the welding requirements, training is stopped and the optimized teacher network model is saved.

S23. Knowledge distillation is used to extract the knowledge obtained from training the optimized teacher network model and transfer it to the student network model. The student network model is then trained using the weld seam image dataset to obtain a preliminarily optimized student network model. The knowledge distillation operation described in step S23 is as follows: a distillation loss is defined between the feature maps of the student network model and the teacher network model at the knowledge transfer layer, where a convolution operation matches the number of channels of the student's feature layer to that of the teacher's feature layer, N is the total number of elements in the distilled feature layer, and C, W, and H are the number of channels, width, and height of the feature map, respectively. Then, to enable the student network model to learn the knowledge of the optimized teacher network model, a distillation operation is performed on the student network model: the distillation loss and the original loss of the SSD algorithm both participate in model training, and the total loss is expressed in terms of the distillation loss, the original loss of the SSD algorithm, and an adjustable hyperparameter that balances the loss terms. During training, the weights of the teacher network model are frozen and do not participate in gradient updates, while the student network model continuously learns from the teacher network model and updates its own weights, ultimately minimizing the total loss function to obtain the preliminarily optimized student network model.

S24. Apply the convolution kernel pruning method to remove redundant parameters in the preliminarily optimized student network model, fine-tune the pruned student network model using the weld seam image dataset to obtain the optimal student network model, and use the optimal student network model as the lightweight object detection network. The convolution kernel pruning process described in step S24 is as follows: first, a pruning ratio is set for the convolution kernels of the i-th layer of the preliminarily optimized student network model, so that the number of convolution kernels in this layer is reduced in proportion to that ratio and the tensor size of the output feature layer shrinks accordingly. Generally, a smaller norm of a convolutional kernel leads to smaller activation outputs and thus a smaller impact on the final model's predictions. On this basis, the norm is used to evaluate the importance of each convolutional kernel: the p-norm of the j-th convolutional kernel in the i-th convolutional layer is computed over the kernel's channel, width, and height indices, up to its number of channels, length, and width, with p taking a value of 2 (the L2 norm). In each convolutional layer, the kernels are sorted by their norm values, and the kernels with the smallest norm values are removed according to a preset pruning ratio. The convolutional kernels to be removed from the i-th layer of the preliminarily optimized student network model are identified and removed. The pruned student network model is then fine-tuned on the preprocessed weld seam image dataset to obtain the optimal student network model, which is used as the lightweight object detection network.

S3. When the welding work begins, the industrial camera continuously acquires weld seam images and sends them to the embedded industrial control computer. The lightweight target detection network processes the acquired weld seam images and extracts the pixel coordinate values of the weld seam feature points in them.

S4. Convert the pixel coordinate values of the weld seam feature points obtained in step S3 into three-dimensional coordinate values in the base coordinate system of the welding robot to obtain the welding position that the welding robot needs to reach. Calculate the difference between the welding position and the current position of the welding robot, and send the resulting deviation value to the welding robot control cabinet in real time. The welding robot control cabinet then controls the welding torch of the welding robot to track the weld seam, thereby completing the real-time automatic tracking of the weld seam.

2. The laser vision real-time weld seam tracking method based on a lightweight target detection network according to claim 1, characterized in that the specific process of step S3 is as follows: when welding begins, the industrial camera continuously acquires images at a sampling frequency of 60 frames per second and sends them to the embedded industrial control computer for image processing. The embedded industrial control computer crops out the regions containing weld stripe features from the acquired images and inputs them into the lightweight target detection network. The lightweight target detection network processes the continuously input weld seam images and outputs a single feature map with a scale of 7×7 for each input image. The feature map contains the position coordinates of the weld seam. A non-maximum suppression algorithm is then applied to the feature map to obtain the center coordinates of the target candidate box, which are the pixel coordinate values of the weld seam feature points in the input weld seam image.

3. The laser vision real-time weld seam tracking method based on a lightweight target detection network according to claim 1, characterized in that the lightweight object detection network includes a first convolutional block, a second convolutional block, a third convolutional block, a fourth convolutional block, a fifth convolutional block, a sixth convolutional block, and a seventh convolutional block connected in sequence. Each of the first, second, third, and fourth convolutional blocks is followed by a max pooling layer. Each of the first to seventh convolutional blocks contains a batch normalization layer and an activation function layer.

4. The laser vision real-time weld seam tracking method based on a lightweight target detection network according to claim 1, characterized in that step S1 specifically includes the following steps:

S11. Before the welding work begins, place the workpiece to be welded on the worktable, adjust the position and posture of the welding robot's robotic arm so that the end of the welding torch is above the weld seam of the workpiece to be welded, and bring the laser vision sensor fixed on the welding torch into its optimal working position.

S12. The industrial camera in the laser vision sensor acquires an initial weld seam image characterized by laser stripes and sends it to the embedded industrial control computer. The embedded industrial control computer performs thresholding by calling library functions of the Halcon software and performs initialization processing using morphological methods to obtain the pixel coordinate values of the initial weld seam feature points.

S13. Using a calibration algorithm, the pixel coordinate values of the initial weld seam feature points are converted into three-dimensional coordinate values in the base coordinate system of the welding robot to obtain the starting position of the weld seam.

5. A storage medium, characterized in that it stores non-transitory computer instructions which, when executed, perform the laser vision real-time weld seam tracking method based on a lightweight target detection network as described in any one of claims 1-4.