A path motion control method and system for a welding robot

By combining a third-order Bézier curve with a periodically updated residual policy network, the method addresses the poor adaptability of welding-robot path planning in narrow canyon environments, enabling adaptive transitions for the welding robot and improving the safety and stability of the welding process.

CN121973249B · Active · Publication Date: 2026-06-23 · HUNAN UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: HUNAN UNIV
Filing Date: 2026-04-07
Publication Date: 2026-06-23

AI Technical Summary

Technical Problem

Existing welding robots struggle to achieve adaptive transitions during obstacle avoidance control, making it difficult to adapt to environmental changes in real time. This results in collisions and interference between the welding torch and the workpiece edge, leading to unsatisfactory welding process results.

Method used

An initial collision-free baseline path is constructed using a third-order Bézier curve, and the path is corrected in real time by updating the residual policy network cycle by cycle. The welding path is optimized by combining environmental perception with motion reward scores, thereby achieving adaptive transition of the welding robot.

Benefits of technology

It achieves real-time adaptive transition of the welding robot path, effectively avoids collision and interference between the welding torch and the workpiece edge, ensures that the path motion satisfies dynamic constraints and remains smooth, and improves the safety and process stability of the welding process.

✦ Generated by Eureka AI based on patent content.

Abstract

The application discloses a path motion control method and system for a welding robot, relating to the technical field of intelligent control of welding robots. The method includes the steps of: constructing an initial collision-free reference path; periodically updating a residual policy network and the current real-time position point, correcting the remaining path based on the updated residual policy network and the current state set corresponding to the current real-time position point to obtain a remaining corrected planning path, and extracting the current local displacement segment from the remaining corrected planning path based on the interpolation step size; and sequentially splicing the current local displacement segments corresponding to each control cycle to form the target walking path optimized from the initial collision-free reference path. Through periodic adaptive correction and dynamic parameter adjustment, the method overcomes the limitation of fixed welding control parameters and solves the prior-art problems of difficulty adapting to environmental changes in real time and poor adaptive transition, which lead to unsatisfactory welding process results.

Description

Technical Field

[0001] This invention relates to the field of intelligent control technology for welding robots, and in particular, to a path motion control method and system for a welding robot.

Background Technology

[0002] Referring to Figure 1, in shipbuilding, bridge engineering, and the welding of large building steel structures, process through-weld holes are generally reserved at the connection between stiffening ribs and main plates to release residual welding stress. These through-weld holes are usually fan-shaped with small openings. As a typical confined-space, narrow-channel scenario, a through-weld hole makes path planning much harder than general free-space obstacle avoidance. In such discontinuous weld seam scenarios, the welding robot must cross the through-weld hole after completing one section of welding and connect accurately to the next weld seam.

[0003] In the robot's configuration space, a narrow canyon environment is one in which two large feasible regions are connected only by a narrow, elongated passage. General random sampling algorithms (such as RRT) have difficulty sampling valid points inside such a passage, which leads to planning failures or severely distorted paths. Existing path planning technologies have several shortcomings when facing obstacles in narrow canyon environments. First, idle travel is inefficient: current technologies often employ manual teaching or gate-shaped (lift-translate-lower) obstacle avoidance trajectories based on fixed geometric rules, and to ensure safety they reserve excessive safety margins, producing overly long non-welding travel that severely slows the overall production cycle. Second, there is impact and accuracy loss: traditional hard-coded transition trajectories have abrupt curvature changes at corner points, subjecting the robotic arm joints to large jerk impacts. This dynamic discontinuity not only accelerates equipment wear but also causes residual end-effector vibration when the robot reaches the target point, directly affecting the positioning accuracy and arc initiation success rate of the next weld segment. Third, adaptability to unstructured environments is poor: in actual working conditions, the dimensional accuracy of weld holes, edge burrs, and assembly errors are highly uncertain. Strategies based on traditional offline programming (OLP) or rigid obstacle avoidance algorithms (such as artificial potential field methods) struggle to adapt to real-time environmental changes, making interference and collisions between the welding torch and workpiece edges highly likely.

[0004] In summary, existing welding robots struggle to achieve adaptive transitions based on mechanical factors during obstacle avoidance control. Therefore, it is necessary to provide a path motion control method and system for welding robots to solve or at least partially solve the aforementioned technical problems.

Summary of the Invention

[0005] The path motion control method and system for a welding robot provided by this invention solve the technical problem that, because welding control parameters are fixed, existing welding robots have difficulty achieving adaptive transitions and adapting to environmental changes in real time during obstacle avoidance control, so that collision and interference between the welding torch and the workpiece edge are hard to avoid and the welding process results are unsatisfactory.

[0006] To achieve the above objectives, the technical solution adopted by the present invention is as follows:

[0007] A path motion control method for a welding robot, comprising the following steps:

[0008] S10, Construct an initial collision-free reference path. The initial collision-free reference path is a third-order Bézier curve and has a path start point and a path end point. The collision-free reference path also has reference control points for constraining the bending shape and tangent direction.

[0009] S20, update the residual policy network and the current real-time position point periodically, correct the remaining path based on the updated residual policy network and the current state set corresponding to the current real-time position point to obtain the remaining corrected planning path, and extract the current local displacement segment from the remaining corrected planning path based on the interpolation step size; specifically including:

[0010] During control period t, construct the remaining path P(t) from the current real-time position to the path end point, and extract the real-time path control points carried by the remaining path P(t) itself, where t≥1 and t is a positive integer;

[0011] The updated residual policy network is obtained when the motion reward score of control period t converges. The motion reward score is used to characterize the obstacle avoidance safety, motion smoothness adaptation and motion acceleration of the motion process.

[0012] Based on the current state set, the updated residual policy network predicts and outputs the spatial position correction vector and the time scaling factor corresponding to control period t. The current state set includes the distance vector between the current real-time position and the obstacle surface at control period t, as well as the obstacle surface normal vector;

[0013] The spatial position correction vector is superimposed onto the real-time path control points of the remaining path P(t) to correct the remaining path, thus obtaining the remaining corrected planning path;

[0014] Based on the current real-time position, the time scaling factor, and the preset speed, the interpolation step size and the target landing point of control period t are calculated, and the current local displacement segment is determined from the interpolation step size;

[0015] The current local displacement segment is executed to reach the target landing point, and then the current local displacement segment is removed from the remaining corrected planning path;

[0016] Update the target landing point to the current real-time location and enter the next control cycle. Repeat the above steps until the current real-time location moves to the end point of the path.

[0017] S30, the current local displacement segments corresponding to each control cycle t are sequentially spliced together to form the target walking path optimized based on the initial collision-free reference path.

[0018] Furthermore, step S20 also includes the following step:

[0019] Based on the output of the residual policy network updated in each control period t and on the preset single-cycle duration, the physical motion control parameters corresponding to each discrete time step are obtained. The physical motion control parameters include the motion velocity parameters of the target landing point and / or the landing point direction vector.

[0020] Furthermore, step S10 specifically includes:

[0021] S11, Obtain obstacle geometry information and weld start and end point information based on environmental perception;

[0022] S12, construct an axis-aligned bounding box based on a preset safety margin and the geometric information of the obstacle;

[0023] S13, using the axis-aligned bounding box and the weld start and end point information, an initial collision-free reference path is generated based on a third-order Bézier curve. The collision-free reference path has corresponding reference control points, including a first control point and a second control point for controlling the bending shape and the tangent direction.

[0024] Furthermore, the motion reward score is calculated using a formula (not reproduced in this text) whose terms are: the score of the path space safety evaluation item at the assumed landing point position; the score of the motion time efficiency evaluation item at the assumed landing point position; the score of the dynamic smoothness evaluation item at the assumed landing point position; the score of the attitude alignment evaluation item at the assumed landing point position; the arrival indicator function, which is defined in terms of the path end point and the landing point tolerance threshold; and four coefficients, namely the safety evaluation coefficient, the time efficiency evaluation coefficient, the dynamic smoothness evaluation coefficient, and the endpoint consistency evaluation coefficient. If the motion reward score converges, the assumed landing point is updated to the target landing point.

[0025] Furthermore, the score of the path space safety evaluation item is calculated using a formula (not reproduced in this text) whose terms are: the distance incentive weight; the collision penalty weight; the Euclidean distance between the assumed landing point position at control period t and the obstacle surface; a small smoothing term that prevents the denominator from becoming singular; and the collision indicator function at control period t, which is 1 if a collision occurs at the assumed landing point position and 0 if no collision occurs there.

[0026] Furthermore, the score of the motion time efficiency evaluation item is calculated using a formula (not reproduced in this text) whose terms are: the current real-time position; the instantaneous composite velocity corresponding to the assumed landing point position at control period t; and the preset speed.

[0027] Furthermore, the score of the dynamic smoothness evaluation item is calculated using the formula

[0028] (formula not reproduced in this text)

[0029] whose terms are: the assumed landing point position at control period t; the current real-time positions of the three periods preceding control period t; and the preset single-cycle duration;

[0030] where, if control period t does not have three preceding periods, the corresponding value is taken as 0.
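The smoothness formula itself is not reproduced in this text, but the quantities it names (the assumed landing point, the three preceding positions, and the single-cycle duration) are exactly what a third-order backward difference needs. The sketch below assumes that form for the jerk estimate. The function names and the choice of scoring by the negative jerk norm are hypothetical illustrations, not the patent's definition.

```python
import numpy as np

def backward_difference_jerk(p_t, p_tm1, p_tm2, p_tm3, dt):
    """Third-order backward difference: jerk estimate from four equally spaced positions."""
    p_t, p_tm1, p_tm2, p_tm3 = (
        np.asarray(p, dtype=float) for p in (p_t, p_tm1, p_tm2, p_tm3)
    )
    return (p_t - 3.0 * p_tm1 + 3.0 * p_tm2 - p_tm3) / dt ** 3

def smoothness_score(candidate, history, dt):
    """Assumed scoring rule: negative jerk magnitude at the candidate landing point.

    history holds past real-time positions, oldest first. With fewer than
    three preceding periods the score is 0, as stated in the text.
    """
    if len(history) < 3:
        return 0.0
    jerk = backward_difference_jerk(candidate, history[-1], history[-2], history[-3], dt)
    return -float(np.linalg.norm(jerk))
```

For positions sampled from a constant-velocity motion the estimated jerk is zero, so the score takes its maximum value of 0. Any change in acceleration lowers it.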

[0031] Furthermore, the score of the attitude alignment evaluation item is calculated using a formula (not reproduced in this text) whose terms are: the terminal velocity vector at the assumed landing point position, and the standard tangential direction vector of the second target weld at its starting point.
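Because the reward formulas are not reproduced in this text, the following is only a sketch of how the four item scores, their coefficients, and the arrival indicator might fit together. The weighted-sum form, the gating of the alignment term by the arrival indicator, the `<=` comparison against the tolerance, and all function and parameter names are assumptions made for illustration.

```python
import numpy as np

def arrival_indicator(p_landing, p_end, tolerance):
    """1.0 if the assumed landing point lies within `tolerance` of the path end point, else 0.0."""
    gap = np.linalg.norm(np.asarray(p_landing, dtype=float) - np.asarray(p_end, dtype=float))
    return 1.0 if gap <= tolerance else 0.0

def motion_reward(r_safe, r_time, r_smooth, r_align, arrived,
                  coefficients=(1.0, 1.0, 1.0, 1.0)):
    """Assumed combination: weighted sum, with the alignment item counted only on arrival."""
    k_safe, k_time, k_smooth, k_end = coefficients
    return (k_safe * r_safe + k_time * r_time + k_smooth * r_smooth
            + k_end * arrived * r_align)
```

Passing the item scores in as arguments keeps the sketch independent of how each item is actually computed.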

[0032] The present invention also provides a path motion control system for a welding robot, including a processing device for implementing the steps of the above-described path motion control method for a welding robot.

[0033] The present invention has the following beneficial effects:

[0034] The path motion control method and system for a welding robot of the present invention construct, for a predefined welding task, an initial collision-free reference path based on a third-order Bézier curve. The welding task defines at least a path start point and a path end point. The initial collision-free reference path has reference control points located off the curve, which constrain the welding path so that it avoids collisions while following the obstacle as closely as possible. The residual policy network is updated cycle by cycle for control period t until the motion reward score of control period t converges; the walking welding point information (the current real-time position) is updated in real time, and the target landing point and the current local displacement segment are predicted based on the updated residual policy network. At each control period t, the remaining path P(t) between the current real-time position and the path end point is locally optimized based on the updated residual policy network to obtain the remaining corrected planning path, and the position is updated based on the interpolation step size. The motion reward score characterizes the obstacle avoidance safety, motion smoothness adaptation, and motion acceleration of the motion process. The current state set includes the distance vector between the current real-time position and the obstacle surface at control period t, as well as the obstacle surface normal vector. By updating the residual policy network period by period based on the current state set, correcting the initial collision-free reference path segment by segment, and progressively extracting target landing points and local displacement segments, the solution of this invention breaks the limitation of traditional fixed welding control parameters and realizes real-time adaptive transition of the welding robot path. It effectively avoids collision and interference between the welding torch and the workpiece edge, and ensures that the path motion satisfies dynamic constraints and is smooth and stable. It solves the prior-art problems of difficulty adapting to environmental changes in real time and poor adaptive transition, which result in unsatisfactory welding process results, and it significantly improves the safety, motion stability, and process stability of the welding process.

[0035] In addition to the objectives, features, and advantages described above, the present invention has other objectives, features, and advantages. The invention will now be described in further detail with reference to the figures.

Description of the Drawings

[0036] The accompanying drawings, which form part of this application, are used to provide a further understanding of the invention. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not constitute an undue limitation of the invention. In the drawings:

[0037] Figure 1 is a schematic diagram of a prior-art welding scenario in which stiffening ribs are connected to the main plate;

[0038] Figure 2 is a flowchart of the path motion control method for a welding robot according to one embodiment of the present invention;

[0039] Figure 3 is a schematic diagram of a Markov decision process in an optional embodiment of the present invention.

Detailed Description of the Embodiments

[0040] It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

[0041] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of the present invention.

[0042] Research has revealed that, in shipbuilding and large steel structure welding, the industry mainly employs three mainstream technical methods for crossing and planning paths around discontinuous obstacles such as the through-weld holes of stiffening ribs. While these methods address basic obstacle avoidance to some extent, they have significant drawbacks in efficiency, accuracy, and process adaptability.

Solution 1: Fixed geometric obstacle avoidance based on manual teaching or offline programming (OLP). Technical principle: Fixed geometric motion sequences are preset through manual teaching or offline programming software, most commonly producing standard gate-shaped (or trapezoidal) trajectories. The welding torch is first raised vertically to a safety height, then moved horizontally across the obstacle, and finally lowered vertically to the next welding point. Drawbacks: (1) Extremely large idle-travel redundancy. To prevent collisions caused by workpiece assembly errors or installation deviations, a large safety margin is usually required (for example, the lifting height often far exceeds the actual height of the obstacle), resulting in a large amount of ineffective idle travel time that severely impacts production cycle time. (2) Lack of adaptability. The trajectory is fixed and cannot perceive changes in the on-site environment. If the size and position of the actual workpiece's weld hole deviate from the CAD model, or if edge burrs exist, welding torch collisions or insufficient safety distances are highly likely.

Solution 2: Online planning based on traditional optimization algorithms (such as the artificial potential field (APF) method and convex optimization). Technical principle: Sensors acquire local environmental information, a repulsive field is constructed (APF method) or geometric constraint equations are established (convex optimization, SQP, etc.), and obstacle avoidance paths are calculated online. Some advanced solutions attempt to find paths that closely follow obstacles through optimization algorithms. Drawbacks: (1) Poor real-time computation (computing power bottleneck). To achieve extreme wall-hugging for irregular weld holes (for example, 5 mm gaps), complex non-convex constraint equations must be solved on extremely high-resolution meshes. As accuracy requirements increase, the computational load grows exponentially, making it difficult to meet the millisecond-level real-time control response requirements of industrial robots and easily leading to motion stuttering. (2) Prone to local optima. Algorithms such as the APF method are prone to getting trapped in local minima when facing narrow or U-shaped obstacles such as stiffening ribs, causing the robot to hesitate or planning to fail.

Solution 3: Path search based on discrete sampling (such as RRT, PRM). Technical principle: A large number of points are randomly sampled in the configuration space (C-Space), and collision-free connecting paths are filtered out through collision detection. Drawbacks: (1) Low path smoothness (poor manufacturability). Paths generated by random sampling algorithms are usually polylines with many small abrupt curvature changes (sawtooth jitter). Although theoretically collision-free, this jitter subjects the robot motors to large jerk impacts and causes vibration of the end-effector welding torch. (2) Post-processing compromises safety. To eliminate jitter, complex spline smoothing post-processing must be performed. However, smoothing operations often change the geometry of the original path, which may cause an originally safe path to re-enter the obstacle area after smoothing (that is, a collision risk after smoothing). It is difficult to balance smoothness against the wall-hugging gap.

In summary, existing traditional solutions generally decouple path geometry from motion speed. That is, they only plan how to walk (the path) and often ignore the constraints of how to run (speed and dynamics). As a result, the robot usually needs to decelerate or even stop when crossing obstacles (start-stop motion) and cannot use momentum to cross smoothly. Furthermore, it cannot compensate for lost time by automatically accelerating when a detour lengthens the path. Ultimately, this results in low overall efficiency of welding operations and difficulty in guaranteeing arc initiation accuracy.

[0043] As shown in Figure 1, Figure 2 and Figure 3, the present invention provides a path motion control method for a welding robot, comprising the following steps:

[0044] S10, Construct an initial collision-free reference path. The initial collision-free reference path is a third-order Bézier curve with a path start point and a path end point, and it also has reference control points that constrain its bending shape and tangent direction. It can be understood that a third-order Bézier curve is a polynomial curve and therefore has high-order continuity along its length. In the solution of the present invention, using a third-order Bézier curve keeps the curvature of the generated trajectory smooth, without corner points. Furthermore, when the curve is updated segment by segment, flexible path deformation and obstacle avoidance can be achieved at low cost by fine-tuning the curve control points (there are two such control points in the solution of the present invention).
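As a minimal illustration of how the two control points shape the curve, the sketch below evaluates a third-order Bézier curve and its first derivative using the standard Bernstein form. The function names and the sample coordinates are illustrative assumptions and are not taken from the patent.

```python
import numpy as np

def cubic_bezier(p0, p1, p2, p3, u):
    """Evaluate a third-order Bezier curve at parameter value(s) u in [0, 1]."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    u = np.atleast_1d(np.asarray(u, dtype=float))[:, None]
    return ((1 - u) ** 3 * p0 + 3 * (1 - u) ** 2 * u * p1
            + 3 * (1 - u) * u ** 2 * p2 + u ** 3 * p3)

def cubic_bezier_tangent(p0, p1, p2, p3, u):
    """First derivative dB/du; at u=0 it is 3*(p1-p0), at u=1 it is 3*(p3-p2)."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    u = np.atleast_1d(np.asarray(u, dtype=float))[:, None]
    return (3 * (1 - u) ** 2 * (p1 - p0) + 6 * (1 - u) * u * (p2 - p1)
            + 3 * u ** 2 * (p3 - p2))

# Illustrative points: start, first control point, second control point, end.
start, c1, c2, end = [0, 0, 0], [2, 0, 6], [8, 0, 6], [10, 0, 0]
midpoint = cubic_bezier(start, c1, c2, end, 0.5)[0]
```

The curve passes through the start and end points only. The departure direction is set by the first control point and the arrival direction by the second, which is why moving those two points is enough to reshape the transition.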

[0045] S20, update the residual policy network and the current real-time position point periodically, correct the remaining path based on the updated residual policy network and the current state set corresponding to the current real-time position point to obtain the remaining corrected planning path, and extract the current local displacement segment from the remaining corrected planning path based on the interpolation step size; specifically including:

[0046] During control period t, construct the remaining path P(t) from the current real-time position to the path end point, and extract the real-time path control points carried by the remaining path P(t) itself. The real-time path control points are obtained by evolving the baseline control points, where t≥1 and t is a positive integer.

[0047] When the motion reward score of the control period t converges, the updated residual policy network is obtained. The motion reward score is used to characterize the obstacle avoidance safety, motion smoothness adaptation and motion acceleration of the motion process. It can be understood that in each control period t, the residual policy network continuously attempts to output the proposed landing point and calculates the motion reward score corresponding to the proposed landing point, and continuously updates the residual policy network until the motion reward score of the control period t converges.

[0048] Based on the current state set, the updated residual policy network predicts and outputs the spatial position correction vector and the time scaling factor corresponding to control period t. The current state set includes the distance vector between the current real-time position and the obstacle surface at control period t, as well as the obstacle surface normal vector;

[0049] The spatial position correction vector is superimposed onto the real-time path control points of the remaining path P(t) to correct the remaining path, thus obtaining the remaining corrected planning path;

[0050] Based on the current real-time position, the time scaling factor, and the preset speed, the interpolation step size and the target landing point of control period t are calculated, and the current local displacement segment is determined from the interpolation step size;

[0051] The current local displacement segment is executed to reach the target landing point, and the current local displacement segment is removed from the remaining corrected planning path. It can be understood that the current local displacement segment is the trajectory curve between two points.

[0052] Update the target landing point to the current real-time location and enter the next control cycle. Repeat the above steps until the current real-time location moves to the end point of the path.

[0053] S30, the current local displacement segments corresponding to each control cycle t are sequentially spliced together to form the target walking path optimized based on the initial collision-free reference path.
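Steps S20 and S30 can be sketched as a loop over control cycles. The version below is a simplified illustration under stated assumptions: `stub_policy` is a hypothetical placeholder for the residual policy network (it returns a zero correction and a time scaling factor of 1), the interpolation step is assumed to be time scaling factor × preset speed × cycle duration, the landing point is found by sampled arc length, and the remaining path is obtained by de Casteljau subdivision. None of these choices is confirmed by the patent text.

```python
import numpy as np

def bezier(ctrl, u):
    """Evaluate a cubic Bezier curve (ctrl: 4x3 array) at parameter values u."""
    u = np.atleast_1d(u)[:, None]
    p0, p1, p2, p3 = ctrl
    return ((1 - u) ** 3 * p0 + 3 * (1 - u) ** 2 * u * p1
            + 3 * (1 - u) * u ** 2 * p2 + u ** 3 * p3)

def split_tail(ctrl, u):
    """de Casteljau subdivision: control points of the curve portion on [u, 1]."""
    p0, p1, p2, p3 = ctrl
    a, b, c = p0 + u * (p1 - p0), p1 + u * (p2 - p1), p2 + u * (p3 - p2)
    d, e = a + u * (b - a), b + u * (c - b)
    return np.array([d + u * (e - d), e, c, p3])

def stub_policy(state):
    """Hypothetical stand-in for the residual policy network.

    Returns (correction for the two inner control points, time scaling factor).
    """
    return np.zeros((2, 3)), 1.0

def run_control_cycles(baseline_ctrl, policy=stub_policy, v_preset=50.0,
                       dt=0.01, samples=201, max_cycles=100_000):
    ctrl = np.asarray(baseline_ctrl, dtype=float)
    us = np.linspace(0.0, 1.0, samples)
    spliced = [ctrl[:1]]                       # begin at the path start point
    for _ in range(max_cycles):
        # A real state set would also hold the obstacle distance and normal vectors.
        state = {"position": ctrl[0], "control_points": ctrl[1:3]}
        delta, scale = policy(state)
        ctrl = ctrl.copy()
        ctrl[1:3] += delta                     # superimpose correction on inner control points
        step = scale * v_preset * dt           # assumed interpolation step length
        pts = bezier(ctrl, us)
        seg_len = np.linalg.norm(np.diff(pts, axis=0), axis=1)
        arc = np.concatenate(([0.0], np.cumsum(seg_len)))
        k = int(np.clip(np.searchsorted(arc, step), 1, samples - 1))
        spliced.append(pts[1:k + 1])           # current local displacement segment
        if k == samples - 1:                   # landing point is the path end point
            break
        ctrl = split_tail(ctrl, us[k])         # remaining path starts at the landing point
    return np.vstack(spliced)                  # S30: spliced target walking path
```

With the zero-correction stub, the spliced path simply retraces the baseline curve. Replacing `stub_policy` with a trained network is what would deform the remaining path cycle by cycle.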

[0054] The path motion control method and system for a welding robot of the present invention construct, for a predefined welding task, an initial collision-free reference path based on a third-order Bézier curve. The welding task defines at least a path start point and a path end point. The initial collision-free reference path has reference control points located off the curve, which constrain the welding path so that it avoids collisions while following the obstacle as closely as possible. The residual policy network is updated cycle by cycle for control period t until the motion reward score of control period t converges; the walking welding point information (the current real-time position) is updated in real time, and the target landing point and the current local displacement segment are predicted based on the updated residual policy network. At each control period t, the remaining path P(t) between the current real-time position and the path end point is locally optimized based on the updated residual policy network to obtain the remaining corrected planning path, and the position is updated based on the interpolation step size. The motion reward score characterizes the obstacle avoidance safety, motion smoothness adaptation, and motion acceleration of the motion process. The current state set includes the distance vector between the current real-time position and the obstacle surface at control period t, as well as the obstacle surface normal vector. By updating the residual policy network period by period based on the current state set, correcting the initial collision-free reference path segment by segment, and progressively extracting target landing points and local displacement segments, the solution of this invention breaks the limitation of traditional fixed welding control parameters and realizes real-time adaptive transition of the welding robot path. It effectively avoids collision and interference between the welding torch and the workpiece edge, and ensures that the path motion satisfies dynamic constraints and is smooth and stable. It solves the prior-art problems of difficulty adapting to environmental changes in real time and poor adaptive transition, which result in unsatisfactory welding process results, and it significantly improves the safety, motion stability, and process stability of the welding process.

[0055] It can be understood that, in the scheme of the present invention, a time scaling factor is output in each control cycle t. Based on the time scaling factor, the preset speed, and the preset single-cycle duration, the time trajectory points and accelerations corresponding to each discrete time step can be calculated. These drive the welding robot along the target walking path and the corresponding time trajectory points, achieving a collision-free, smooth, and continuous transition from the end point of the first weld (the path start point) to the start point of the second weld (the path end point).

[0056] The relevant terms in the present invention are explained as follows:

Running state set: The relative geometric relationship between the robot end effector (TCP) and the surrounding environment is selected as the network input. Specifically, the input vector of the residual policy network includes the current TCP coordinates of the welding torch (the current real-time position), the Euclidean distance vector between the robot and the obstacle surface, and the normal vector of the obstacle surface. These physical quantities accurately characterize the robot's current obstacle avoidance state and potential collision trends.

Output parameter set: The output of the residual policy network is defined as the spatial position correction vector relative to the control points of the reference Bézier curve. By superimposing the correction vector onto the control point coordinates of the baseline path, local deformation and fine-tuning of the path shape are achieved, thereby adjusting the path height without changing the path continuity. In a specific embodiment of the present invention, a spatial position correction vector and a time scaling factor are output based on the running state set and the motion reward score.

Jerk: The rate of change of acceleration. In robot motion, excessive jerk can severely damage motors.

Residual correction-based adaptive optimization (RRL): A hybrid algorithm combining a baseline with a correction. It acquires obstacle distance data from sensors; generates baseline trajectory commands based on preset geometric rules; inputs the distance data into a pre-trained residual correction model, which outputs a position compensation signal; and superimposes the position compensation signal onto the baseline trajectory command to generate the target position command that drives the servo motor.

Axis-aligned bounding box (AABB): A simplified geometric bounding technique. Its core feature is that all edges remain parallel to the coordinate axes of the world coordinate system, so the box is uniquely determined by recording only its minimum and maximum corner vertices. Its main advantages are simple construction and extremely fast collision detection, and it is often used in the broad-phase (primary screening) stage of physics engines. Its limitations are that it cannot rotate with the object, and for tilted or irregular objects the box is not tight and leaves large redundant gaps.

Momentum preservation: A strategy that uses inertia to optimize motion. It eliminates the start-stop losses caused by right-angle turns in traditional obstacle avoidance and instead generates a smooth, continuous spiral trajectory (similar to a teardrop shape). This allows the robot to maintain a smooth speed transition without decelerating when crossing the apex of an obstacle, effectively eliminating motor shock and residual vibration, and achieves efficient, smooth, continuous obstacle crossing while ensuring accuracy.

Third-order Bézier curve: A parametric smooth curve defined by four points (a start point, two intermediate control points, and an end point). The two intermediate control points do not lie on the curve but act like magnets, guiding the curve's curvature and tangent direction. The curve has high-order continuity, giving a smooth trajectory without corner points. By fine-tuning the two intermediate points, flexible path deformation and obstacle avoidance can be achieved at low cost.

[0057] Furthermore, step S20 specifically includes the following steps:

[0058] Combining the output of the residual policy network updated in each control period t, the physical motion control parameters corresponding to each discrete time step are obtained based on a preset single-period duration. These physical motion control parameters include the motion velocity parameter and/or the landing point direction vector of the target landing point. In a preferred embodiment of the present invention, they include both the motion velocity parameter and the landing point direction vector of the target landing point. By constraining the landing point direction vector with the motion reward score, zero-wait welding can be achieved upon reaching the end of the path.

[0059] It will be understood that the real-time path control points are the baseline control points during the first control cycle. The motion reward score is obtained from the results of the dynamic constraints and is used to characterize the obstacle avoidance safety, the motion smoothness, and the motion acceleration range of the remaining path P(t).

[0060] Furthermore, step S10 specifically includes:

[0061] S11, Obtain obstacle geometry information and weld start and end point information based on environmental perception;

[0062] S12, construct an axis-aligned bounding box based on a preset safety margin and the geometric information of the obstacle;

[0063] S13, using the axis-aligned bounding box and the weld start and end point information, generate an initial collision-free reference path based on a third-order Bézier curve. The collision-free reference path has corresponding reference control points, including a first control point and a second control point that control the bending shape and the tangent direction.
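Steps S11 to S13 can be sketched as follows. This is only one possible reading: the rule of lifting both intermediate control points in fixed increments until every sampled curve point lies outside the bounding box is an assumption made for illustration, as are the function names, the sample count, and the millimetre units.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]
Box = Tuple[Point, Point]  # (minimum corner, maximum corner)


def cubic_bezier(p0, p1, p2, p3, u):
    """Evaluate a third-order Bezier curve at parameter u in [0, 1]."""
    w = ((1 - u) ** 3, 3 * (1 - u) ** 2 * u, 3 * (1 - u) * u ** 2, u ** 3)
    return tuple(w[0] * a + w[1] * b + w[2] * c + w[3] * d
                 for a, b, c, d in zip(p0, p1, p2, p3))


def build_aabb(points: Sequence[Point], margin: float) -> Box:
    """S12: axis-aligned box around the obstacle points, grown by a safety margin."""
    lo = tuple(min(p[i] for p in points) - margin for i in range(3))
    hi = tuple(max(p[i] for p in points) + margin for i in range(3))
    return lo, hi


def point_in_aabb(p: Point, box: Box) -> bool:
    lo, hi = box
    return all(l <= c <= h for c, l, h in zip(p, lo, hi))


def baseline_path(start: Point, end: Point, box: Box,
                  lift_step: float = 5.0, samples: int = 200,
                  max_iter: int = 1000):
    """S13 (assumed rule): raise both intermediate control points above the
    box until no sampled curve point falls inside it."""
    height = box[1][2]
    for _ in range(max_iter):
        height += lift_step
        c1 = (start[0], start[1], height)
        c2 = (end[0], end[1], height)
        pts = (cubic_bezier(start, c1, c2, end, i / samples)
               for i in range(samples + 1))
        if not any(point_in_aabb(p, box) for p in pts):
            return start, c1, c2, end
    raise ValueError("no collision-free baseline found; check start/end points")
```

Sampling is a coarse collision check; a finer sample count or an exact curve-box test would be needed where clearances are tight.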

[0064] It will be understood that the initial remaining path P(t) is obtained from the initial collision-free baseline path, and the real-time path control points of the remaining path P(t) are third-order Bézier curve control points that evolve from the baseline control points. In a specific embodiment of the present invention, the real-time path control points include a first real-time control point and a second real-time control point, and the spatial position correction vector […] is used to correct the positions of the first and second real-time control points to obtain the remaining corrected planning path.

[0065] Furthermore, the motion reward score is calculated from a formula whose terms are: the motion reward score itself; the score of the path space safety evaluation item at the assumed landing point position; the score of the motion time efficiency evaluation item at the assumed landing point position; the score of the dynamic smoothness evaluation item at the assumed landing point position; the score of the attitude alignment evaluation item at the assumed landing point position; the arrival indicator function; the end point of the path; the landing point tolerance threshold; and four coefficients, namely the safety evaluation coefficient, the time efficiency evaluation coefficient, the dynamic smoothness evaluation coefficient, and the endpoint consistency evaluation coefficient. If the motion reward score converges, the assumed landing point is updated to the target landing point. In the solution of this invention, the path space safety score reflects the degree to which the spatial constraints between the welding torch tip (the assumed landing point) and obstacles are satisfied; the motion time efficiency score reflects the impact of the transition path on the work cycle time; the dynamic smoothness score constrains changes in jerk during motion; and the attitude alignment score characterizes the consistency between the welding torch tip velocity direction and the tangential direction of the target weld. The arrival indicator function indicates that, if the distance between the target landing point and the end point of the path is within the landing point tolerance threshold, the last local displacement segment has been reached. At that point, the zero-wait welding start is constrained through reinforcement learning to achieve a fast welding start.

[0066] Optionally, during the offline training of the residual policy network, an early stopping mechanism based on a multi-objective reward threshold is adopted. When the collision-free success rate of the transition trajectory reaches 100% over consecutive random verification rounds, and the average cumulative score per round of the multi-objective performance evaluation function converges to a stable interval (i.e., the variance of the ratio of the summed score to the number of windows is below a preset threshold), the network model is considered to have completed training and the update of the internal weight parameters is stopped. For example, the sum of the large-window reward scores obtained for each complete trajectory and the number of large windows are recorded, and the early stopping mechanism is triggered when the ratio S of the sum of the large-window reward scores to the number of large windows converges to a stable interval. In the corresponding formula, N is the threshold number of steps for truncating the time domain, K is the current window number, and a Markov future-return discount factor is applied. The path motion control method for the welding robot provided in this solution thus accumulates rewards over a counted number of steps (within a preset time window) to avoid non-convergence caused by sparse rewards.
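The early stopping condition described above might be checked as in the following sketch. The number of consecutive rounds, the variance threshold, and the function names are assumptions chosen for illustration; the application does not state them, and the discounted windowed return itself is not reproduced here.

```python
from statistics import pvariance
from typing import Sequence


def window_average(window_scores: Sequence[float]) -> float:
    """Ratio S: sum of the window reward scores divided by the number of windows."""
    if not window_scores:
        raise ValueError("at least one window score is required")
    return sum(window_scores) / len(window_scores)


def should_stop(collision_free: Sequence[bool],
                round_averages: Sequence[float],
                rounds: int = 5,
                var_threshold: float = 1e-2) -> bool:
    """Stop when the last `rounds` validation rounds were all collision-free
    and the per-round average score has a variance below the threshold."""
    if len(collision_free) < rounds or len(round_averages) < rounds:
        return False
    all_safe = all(collision_free[-rounds:])
    stable = pvariance(round_averages[-rounds:]) < var_threshold
    return all_safe and stable
```

A training loop would append one flag and one `window_average` value per validation round and freeze the network weights the first time `should_stop` returns `True`.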

[0067] Furthermore, the score of the path space safety evaluation item is calculated from a formula whose terms are: the distance incentive weight; the collision penalty weight; the Euclidean distance between the assumed landing point and the obstacle surface in control cycle t; a smoothing term that prevents a singular denominator; and the collision indicator function for control cycle t, which is 1 if a collision occurs at the assumed landing position and 0 if no collision occurs there.
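Since the formula image is not reproduced in this text, the following is only a plausible reconstruction from the listed terms: a negative reciprocal of the distance plus a collision penalty. The exact functional form and the value of the smoothing term are assumptions; the default weights of 1 and 1000 follow the values given later in the description.

```python
def safety_score(distance: float, collided: bool,
                 w_dist: float = 1.0, w_col: float = 1000.0,
                 eps: float = 1e-3) -> float:
    """Assumed form: -w_dist / (distance + eps) - w_col * collision_indicator.

    distance : Euclidean distance from the assumed landing point to the
               obstacle surface in the current control cycle
    eps      : smoothing term keeping the denominator away from zero
               (hypothetical value)
    """
    indicator = 1.0 if collided else 0.0
    return -w_dist / (distance + eps) - w_col * indicator
```

Under this form the score rises as the clearance grows and drops sharply on collision, so the pressure to stay close to the obstacle would have to come from the other evaluation items, such as time efficiency.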

[0068] Furthermore, the score of the motion time efficiency evaluation item is calculated from a formula whose terms are: the current real-time position; the instantaneous composite velocity corresponding to the assumed landing point position in control cycle t; and the preset speed.

[0069] Furthermore, using the formula

[0070]

[0071] The score of the dynamic smoothness evaluation item is obtained. Its terms are: the assumed landing position in period t; the current real-time positions calculated in the three periods preceding period t; and the preset single-cycle duration;

[0072] where, if period t does not have three preceding periods, the value is taken as 0.
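The terms listed above (one new position, three preceding positions, and the cycle duration) match a third-order backward finite difference, which is the usual discrete estimate of jerk. The sketch below assumes that reading and assumes a negative squared-norm penalty; neither the exact norm nor the sign convention is stated in this text.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def jerk_estimate(p_t: Point, p_1: Point, p_2: Point, p_3: Point,
                  ts: float) -> Point:
    """Third-order backward difference: (p_t - 3*p_1 + 3*p_2 - p_3) / ts**3,
    where p_1, p_2, p_3 are the positions 1, 2 and 3 cycles earlier."""
    return tuple((a - 3 * b + 3 * c - d) / ts ** 3
                 for a, b, c, d in zip(p_t, p_1, p_2, p_3))


def smoothness_score(history: Sequence[Point], p_t: Point, ts: float) -> float:
    """Assumed penalty: negative squared jerk magnitude; 0 when fewer than
    three earlier positions exist, as stated in the description."""
    if len(history) < 3:
        return 0.0
    j = jerk_estimate(p_t, history[-1], history[-2], history[-3], ts)
    return -sum(c * c for c in j)
```

For positions sampled from x = t³ with a unit cycle time the estimate returns exactly 6, the analytic third derivative, while any constant-velocity motion yields zero.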

[0073] Furthermore, the score of the end-effector attitude alignment evaluation item is calculated from a formula whose terms are: the actual end velocity vector at the assumed landing point, that is, at the last discrete interpolation point; and the standard tangential direction vector at the starting point of the second target weld segment.
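One common way to score the agreement between two directions is their cosine similarity. The sketch below uses that measure as an assumption; the application's own formula is not reproduced in this text and may differ.

```python
import math
from typing import Tuple

Vector = Tuple[float, float, float]


def alignment_score(v_end: Vector, weld_tangent: Vector) -> float:
    """Cosine of the angle between the end velocity vector and the weld
    tangent: 1 when collinear and same direction, -1 when opposite.
    Returns 0 when either vector has zero length (assumed convention)."""
    norm_v = math.sqrt(sum(c * c for c in v_end))
    norm_t = math.sqrt(sum(c * c for c in weld_tangent))
    if norm_v == 0.0 or norm_t == 0.0:
        return 0.0
    dot = sum(a * b for a, b in zip(v_end, weld_tangent))
    return dot / (norm_v * norm_t)
```

The score depends only on direction, so a torch arriving fast or slow along the weld tangent is rated the same; speed is handled by the time efficiency item.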

[0074] This invention provides an optional path motion control method for welding robots that employs a spatiotemporal joint planning architecture based on residual correction-based adaptive optimization. It addresses the large idle travel redundancy, strong start-stop impact, and low efficiency of traditional path planning when welding robots traverse tall obstacles such as stiffening ribs (approximately 100 mm high). Its overall logic does not rely on purely geometric obstacle avoidance or on offline teaching; instead, it decomposes the acquisition of the target walking path into two parts: a baseline strategy and a residual correction strategy that is updated cycle by cycle. The baseline strategy determines a collision-free baseline path, providing basic passability. The residual strategy learns the geometric and dynamic characteristics of obstacles through a deep neural network and, once the constraints are met, outputs spatial and temporal corrections in real time, achieving adaptive traversal with close wall-following and momentum preservation. The main steps are as follows:

[0075] Step 1: Environmental perception. Acquire the geometric information of obstacles (stiffening ribs) and the start and end points of the welds. A depth camera captures point cloud images of the objects to determine information such as the color, shape, size, and relative position of obstacles, and axis-aligned bounding boxes (AABBs) are constructed for the obstacles. To ensure the safety of the baseline path, the initial bounding box adopts a traditional conservative strategy with a large safety margin.

[0076] Step 2: Generate an initial collision-free reference path based on a third-order Bézier curve. The third-order Bézier curve produces a smooth curve connecting the end point of weld A (the first weld) and the starting point of weld B (the second weld). The collision-free reference path has two reference control points. Setting the baseline path at a safe height directly above the obstacle ensures no collisions, but the path is long and rises higher than necessary; it serves as the skeleton for subsequent optimization and as the basis for residual learning.

[0077] Step 3 (see Figure 3): Based on a Markov Decision Process (MDP), rolling correction and control execution are performed cycle by cycle to obtain the current local displacement segment and the physical motion parameters. This abandons the traditional open-loop model of global planning followed by static execution. The task of crossing the weld hole is modeled as an online closed-loop control flow clocked by the bottom-level servo cycle Ts (the preset single-cycle duration). Until the starting point of the next weld segment is reached, the following sub-steps are executed repeatedly in each control cycle t at the current real-time position:

[0078] Transient perception and feature extraction: In the current control period t, the distance vector and local normal vector from the welding torch TCP to the obstacle are calculated in real time and concatenated into the current state set St; the motion reward score is calculated based on the assumed landing point.

[0079] Single-step inference by the residual policy network: The pre-trained residual policy network receives the current state set St and the motion reward score, performs one forward propagation, and, once the motion reward score converges, outputs the spatial position correction and the time scaling factor for the current control period t;

[0080] Instantaneous trajectory deformation and dynamic interpolation: The spatial position correction vector is superimposed onto the third-order Bézier real-time path control points of the remaining path P(t), so that the unexecuted reference trajectory ahead of the current real-time position undergoes micro-deformation, yielding the remaining corrected planning path; simultaneously, the interpolator uses the velocity formula and the period Ts to calculate the interpolation step size used only for the current step;

[0081] Rolling-horizon physical execution: The motor (simulated or actual) is driven by the interpolation step size to move a small step along the deformed remaining corrected planning path. The next control cycle then begins, re-sensing the environment and generating the next instruction, until the assumed landing point of the TCP is the target weld start point B, which immediately triggers zero-wait arc initiation;

[0082] Step 4: Obtain the target walking path. At the current real-time position, the remaining path corresponding to control cycle t is deformed according to the spatial correction vector, and the motion speed is compensated according to the time scaling factor, generating the welding transition trajectory. Throughout trajectory generation and execution, the system does not generate a fixed path statically and then execute it blindly. Instead, in each control cycle the residual policy network receives the latest environmental perception state and outputs a spatial correction vector and a time scaling factor at high frequency, and the underlying interpolator reconstructs the local Bézier curve in real time and dynamically adjusts the current interpolation step size.

[0083] Under the above scheme, the sequence of discrete trajectory interpolation points sent to the robot servo driver is continuously refreshed, so the underlying discrete points evolve dynamically in real time. This enables the robot to make millisecond-level corrections in response to small variations in the edge dimensions of the weld hole and to assembly errors, overcoming the fixed nature of traditional rigid trajectory planning while maintaining physical safety and spatiotemporal optimization during close wall-following motion.

[0084] Step 5: Execute control to drive the welding robot to move segment by segment along the current local displacement segment, so as to achieve a continuous transition from the end point of the first weld to the starting point of the second weld.
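The cycle-by-cycle flow of Steps 3 to 5 can be sketched as a rolling loop. Several parts are assumptions made only for illustration: `stub_policy` stands in for the trained residual policy network, the remaining path is obtained by a de Casteljau split of the current curve, the step size is taken as base speed times (1 + time scaling factor) times Ts, and the curve parameter advance is approximated by step size over remaining curve length.

```python
import math


def lerp(a, b, u):
    return tuple(x + (y - x) * u for x, y in zip(a, b))


def split_right(ctrl, u):
    """De Casteljau split: control points of the curve portion from u to 1.
    The first returned point is the curve point at u."""
    p0, p1, p2, p3 = ctrl
    a, b, c = lerp(p0, p1, u), lerp(p1, p2, u), lerp(p2, p3, u)
    d, e = lerp(a, b, u), lerp(b, c, u)
    return lerp(d, e, u), e, c, p3


def curve_length(ctrl, n=50):
    pts = [split_right(ctrl, i / n)[0] for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))


def stub_policy(state):
    """Hypothetical placeholder for the residual policy network: returns
    (correction for control point 1, correction for control point 2,
    time scaling factor)."""
    zero = (0.0, 0.0, 0.0)
    return zero, zero, 0.0


def run_transition(ctrl, v_base, ts, policy=stub_policy, tol=1.0,
                   max_cycles=10000):
    """Return the executed positions, one per control cycle."""
    executed = [ctrl[0]]
    for _ in range(max_cycles):
        if math.dist(ctrl[0], ctrl[3]) <= tol:      # arrived at weld start
            break
        state = {"tcp": ctrl[0]}  # a real state adds distance and normal vectors
        d1, d2, k = policy(state)
        p0, p1, p2, p3 = ctrl
        p1 = tuple(a + b for a, b in zip(p1, d1))    # deform remaining path
        p2 = tuple(a + b for a, b in zip(p2, d2))
        ctrl = (p0, p1, p2, p3)
        step = v_base * (1.0 + k) * ts               # assumed velocity formula
        length = curve_length(ctrl)
        u = min(1.0, step / length) if length > 0 else 1.0
        ctrl = split_right(ctrl, u)                  # execute one small step
        executed.append(ctrl[0])
    return executed
```

The list returned by `run_transition` corresponds to the spliced local displacement segments; replacing `stub_policy` with a trained network is what would make the path adaptive.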

[0085] In step three:

[0086] Specifically, spatial residual correction based on a compact bounding box uses reinforcement learning to compress the safety margin and solve the problem of redundant idle travel. When the residual policy network is constructed, the running state set takes as input the distance vector and the normal vector between the current welding torch TCP (the assumed landing point) and the obstacle surface, derived from the robot's end-effector pose, the obstacle geometry information, and the degree of reward convergence calculated from the motion reward score; the control parameter set outputs the position correction vector relative to the reference Bézier curve control points. For close wall-following, a negative reciprocal-distance objective function is introduced to calculate the score of the path space safety evaluation item. Its terms are: the distance incentive weight; the collision penalty weight; the Euclidean distance between the welding torch TCP and the obstacle surface in control cycle t; a smoothing term that prevents a singular denominator; and the collision indicator function for the current control cycle t, which is 1 if a collision is assumed at the landing point and 0 if the entire path is safe. As specific values, the distance incentive weight is taken as 1 and the collision penalty weight as 1000. Provided that the collision-free spatial constraints are satisfied, a path optimization evaluation mechanism is constructed with the minimum safe clearance between the end effector and the obstacle as a constraint. This guides the control strategy to reduce the distance between the path and the obstacle contour within the allowable safe range. After offline parameter optimization, the control strategy can spatially correct the control points of the baseline path during operation, causing the transition path to shift downwards along the outer contour of the obstacle in the top region of the stiffening rib. The result is a tight transition trajectory that closely follows the obstacle surface and significantly reduces the wasted lift height during the crossing.

[0087] Specifically, a dynamic velocity planning method with spatiotemporal joint optimization is adopted to solve the loss of time efficiency caused by decoupling path geometry from velocity planning in traditional algorithms. Dynamic residual pipeline construction expands the planning object from a single three-dimensional path to a four-dimensional spatiotemporal pipeline: the adaptive motion controller outputs not only the spatial position correction vector but also the time scaling factor. Its core is a nonlinear coupling model between spatial displacement compensation and the time scaling factor, achieved by constructing a performance index function that includes a time efficiency optimization term, from which the score of the motion time efficiency evaluation item is calculated. The terms of this function are the instantaneous composite velocity on the remaining path P(t) in control cycle t and the preset speed (baseline speed). A larger value and fewer steps indicate higher crossing efficiency. This invention constrains the working time by changing the interpolation step size (i.e., the displacement increment within each servo cycle). It employs a spatiotemporal coupling mapping mechanism and the principle of optimality, following a space-for-time (acceleration compensation) strategy. When the motion controller chooses to detour in space to avoid obstacle corners or to maintain a greater safety margin, so that the path length increases, the network automatically outputs a positive time scaling factor. In that case, with the servo cycle unchanged, the step size to the next position calculated by the interpolator becomes larger. By utilizing the dynamic redundancy of the robot in open space, the time lost to the longer path is compensated by a higher instantaneous speed, so that the total operation cycle is not affected by obstacle avoidance behavior.

[0088] Specifically, a safe approach strategy (deceleration landing strategy) is adopted. When the welding torch tip transitions from the crossing state to the contact state (i.e., the landing area), the spatial constraints tighten and the risk of collision increases, so the network automatically outputs a negative time scaling factor based on its perception of the environmental state. In that case, with the servo cycle unchanged, the step size to the next position calculated by the interpolator becomes smaller. This strategy imitates the way human operators move quickly through open space and slowly on approach. While ensuring safety and high-precision alignment, it completes the transition to the wall with low-speed, fine motion.
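The effect of the signed time scaling factor on the interpolation step can be illustrated as follows. The relation step = base speed × (1 + factor) × Ts and the clamping to a maximum speed are assumptions for illustration; the application's velocity formula is not reproduced in this text.

```python
def interpolation_step(v_base: float, k: float, ts: float, v_max: float) -> float:
    """Displacement for one servo cycle.

    v_base : preset (baseline) speed
    k      : time scaling factor; positive accelerates, negative decelerates
    ts     : servo cycle duration, unchanged by k
    v_max  : hypothetical speed limit of the drive
    """
    speed = v_base * (1.0 + k)
    speed = max(0.0, min(speed, v_max))  # never reverse, never exceed the limit
    return speed * ts


# Hypothetical values: 50 mm/s baseline, 4 ms servo cycle, 200 mm/s limit.
cruise = interpolation_step(50.0, 0.0, 0.004, 200.0)
detour = interpolation_step(50.0, 0.5, 0.004, 200.0)    # positive factor
landing = interpolation_step(50.0, -0.5, 0.004, 200.0)  # negative factor
```

Because the cycle duration is fixed, a larger step is simply a higher instantaneous speed, which is how a detour can be traversed without lengthening the cycle time.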

[0089] Understandably, through the above mechanism, the present invention achieves an upgrade from discrete path points to continuous dynamic trajectories. The robot no longer moves mechanically at a uniform speed, but adopts human-like motion characteristics, flying over the safest area at the top of the obstacle at the maximum speed allowed by the motor, and automatically decelerating at complex edges. This not only ensures no collisions throughout the process, but also achieves global time optimization of the welding operation cycle.

[0090] Specifically, the present invention further addresses the problems of motor impact and residual vibration caused by start-stop motion, ensuring the arc-starting accuracy of the next weld segment. It employs a jerk minimization constraint strategy, introducing a higher-order dynamic penalty term into the reinforcement learning objective function.

[0091]

[0092] where the terms are: the current real-time position calculated in period t, i.e., the actual three-dimensional discrete coordinates calculated in the current period t; the current real-time positions calculated in the three periods preceding period t, i.e., the actual three-dimensional discrete coordinates calculated in those periods; and the preset single-cycle duration. Traditional gate-shaped trajectories experience large jerk at inflection points, which is severely penalized by the objective function. In this invention, adaptive spiral trajectory generation minimizes jerk while maintaining momentum continuity during the robot's crossing. The control strategy introduces curvature continuity constraints when generating the transition path, transforming the path from the traditional straight-line ascent and descent into a spiral trajectory with continuous curvature. Using the robot's inertia, the robot does not decelerate or pause when crossing the highest point of the stiffening rib but traces a smooth arc directly towards the target point.

[0093] Specifically, the landing attitude adjustment strategy uses the residual strategy network to fine-tune the end attitude during the spiral descent, ensuring that the velocity vector of the welding torch at the moment of contact with the workpiece is collinear with the tangential welding direction of the next weld segment, achieving direct arc initiation with zero waiting time. The score of the end-effector attitude alignment evaluation item is calculated from two quantities: the actual end velocity vector of the trajectory at the last discrete interpolation point (i.e., at the instant of contact with the starting point of the second weld segment), and the standard tangential direction vector at the starting point of the second target weld segment.

[0094] In a complete optimization implementation, a multi-objective performance evaluation function for joint optimization of path and motion states is constructed, from which the motion reward score is calculated. Its terms are: the motion reward score; the score of the path space safety evaluation item, which reflects the degree to which the spatial constraints between the welding torch tip and obstacles are met; the score of the motion time efficiency evaluation item, which reflects the impact of the transition path on the work cycle time; the score of the dynamic smoothness evaluation item, which constrains changes in jerk during motion; the score of the end-effector attitude alignment evaluation item, which characterizes the consistency between the welding torch tip velocity direction and the tangential direction of the target weld; the arrival indicator function; the position of the end point of the path; the landing point tolerance threshold; and the safety evaluation coefficient, time efficiency evaluation coefficient, dynamic smoothness evaluation coefficient, and endpoint consistency evaluation coefficient. The four coefficients are taken as 1.0, 0.2, 0.001, and 1.0, respectively, and the landing point tolerance threshold is 1.0 mm.
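A minimal sketch of how the four evaluation items might be combined is given below. The weighted sum, and the choice to gate the alignment item with the arrival indicator, are assumptions, since the formula image is not reproduced in this text. The default coefficients and the 1.0 mm tolerance are the values stated above.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def motion_reward(r_safe: float, r_time: float, r_smooth: float,
                  r_align: float, landing: Point, path_end: Point,
                  coeffs: Sequence[float] = (1.0, 0.2, 0.001, 1.0),
                  tol: float = 1.0) -> float:
    """Assumed combination: weighted sum of the four item scores, with the
    alignment item counted only once the assumed landing point lies within
    `tol` (mm) of the path end point."""
    c_safe, c_time, c_smooth, c_align = coeffs
    arrived = 1.0 if math.dist(landing, path_end) <= tol else 0.0
    return (c_safe * r_safe + c_time * r_time
            + c_smooth * r_smooth + c_align * arrived * r_align)
```

With these coefficients a squared-jerk penalty of 36 costs only 0.036, while a perfectly aligned arrival adds a full 1.0, which suggests how strongly the stated weights favor the zero-wait arc start.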

[0095] It should be specifically stated that the multi-objective joint evaluation model constructed in this invention is by no means a simple linear superposition or mechanical patching together of four isolated physical indicators: spatial obstacle avoidance, time efficiency, dynamic smoothness, and end attitude. In traditional hierarchical control architectures, each indicator is often calculated independently, which can easily lead to local deadlocks in which solving one problem causes another. In contrast, this solution implements a strongly coupled dynamic trade-off based on the analytic geometry of third-order Bézier curves at the level of the underlying mathematical architecture.

[0096] Firstly, spatial distance, time span, jerk, and spatial angle are four completely different physical quantities. However, this system uses the analytical dimensionality reduction properties of third-order Bézier curves to map and bind these four macroscopic physical feedbacks to three microscopic control variables output by the residual strategy network: the control point spatial offsets and the time scaling factor. Any minor adjustment of these three parameters by the network therefore changes all four evaluation dimensions simultaneously, in a linked and nonlinear manner, so that mathematically they form a unified whole in which a change in one affects all.

[0097] Secondly, under the extremely narrow three-dimensional constraints of the weld hole, these four indicators are physically in strong conflict with one another. For example, ensuring spatial safety (raising control point P2) inevitably steepens the angle of entry at landing, lowering the end-alignment score; pursuing maximum time efficiency (a positive, amplifying time scaling factor) inevitably triggers higher-order dynamic shocks, lowering the smoothness score. This invention forces the network to trade these objectives off against one another through trial and error, not maximizing any single metric but autonomously exploring the Pareto-optimal solution under the current physical interference limit, that is, finding the equilibrium with the shortest time, least impact, and smoothest entry angle at the edge of the collision-free limit. Therefore, this scheme is not only a multi-objective scorer but also a closed-loop control framework that gives the robot environmental adaptability and human-like trade-off decision-making, and it constitutes a substantive innovation whose parts cannot be separated or replaced.

[0098] The innovations of the path motion control method for welding robots in this invention include a limit-fitting obstacle avoidance mechanism based on spatial residual learning. Traditional path planning algorithms typically set fixed, conservative safety margins to avoid collisions, causing the robot to lift the path too high when crossing obstacles such as stiffening ribs and producing a large amount of wasted idle travel. Although some methods can achieve close fitting, existing robots mainly rely on high-dimensional online trajectory optimization algorithms (such as convex optimization and the artificial potential field method) for limit-fitting avoidance of complex obstacles such as stiffening ribs. These algorithms have clear drawbacks: 1. Poor real-time performance: to guarantee no collision while maintaining a very small gap (such as 5 mm), the algorithm must solve complex non-convex constraint equations on extremely high-resolution grids. The computational load increases exponentially with the accuracy requirement, making it difficult to meet the millisecond-level real-time control response required of industrial robots. 2. Low path smoothness: paths generated by discrete sampling or local obstacle avoidance often contain many small abrupt curvature changes (jitter) and must be smoothed by complex post-processing before they can be used for welding. The post-processing often destroys the original safety gap, creating a risk of collision after smoothing. This invention abandons the traditional approach of directly planning the entire path and proposes a planning framework of baseline path plus residual correction. By constructing a spatial residual policy network based on deep reinforcement learning, the motion controller can superimpose a small spatial correction on the baseline Bézier curve according to the local geometric features of the obstacle (distance field and normal vector). This mechanism enables the robot to slide closely along the obstacle surface, producing a trajectory that tightly bounds the obstacle and significantly reducing the wasted lift height while avoiding physical collision.

[0099] The innovations of the path motion control method for welding robots in this invention also include the construction of a 4D dynamic residual pipeline to achieve spatiotemporal coupling compensation, resolving the conflict between efficiency and obstacle avoidance. Existing technologies typically handle path planning (space) and velocity planning (time) as separate, sequential steps, which inevitably increases the robot's operation time when avoiding obstacles. This invention establishes a spatiotemporal coupling mechanism that maps spatial deformation to time scaling, combining the two processes. A time scaling factor is introduced at the network output, expanding the planning dimension from three-dimensional space to a four-dimensional spatiotemporal pipeline. When the motion controller chooses to increase the spatial path length to safely avoid sharp corners, it automatically outputs a positive time scaling factor for acceleration compensation. This mechanism overturns the assumption that detours necessarily cost time and keeps the robot's global operation cycle time close to optimal under complex obstacle avoidance conditions.

[0100] The innovations of the path motion control method for the welding robot of this invention also include a momentum-preserving spiral strategy based on dynamic perception, together with zero-wait arc initiation, which solves the problem of start-stop impact. Traditional gate-shaped obstacle avoidance trajectories require deceleration and stopping at inflection points, resulting in significant motor impact and low arc initiation efficiency. To address this, the invention introduces high-order dynamic smoothness constraints and end-effector attitude alignment constraints during path planning, jointly optimizing the geometry and motion state of the transition trajectory. By constraining jerk, the generated crossing trajectory has continuous curvature and forms an arc-shaped spiral transition path. This maintains motion continuity when crossing the highest point of an obstacle and avoids the vibration and impact caused by rigid transitions. At the same time, during the descent phase of the transition trajectory, the attitude of the welding torch end is adjusted in advance so that the velocity direction of the welding torch end is consistent with the tangential direction of the weld seam on contact with the starting point of the next weld segment. This allows arc initiation the instant the welding torch lands, reducing waiting time and improving welding process efficiency.

[0101] Please refer to Table 1, which compares the results and parameters of the technical solution of this application with those of existing mainstream technologies. The path motion control method for welding robots proposed in this invention adopts a spatiotemporal collaborative path planning strategy based on an adaptive optimization method with residual correction. Its beneficial effects are as follows.
First, it significantly shortens the welding idle travel time and achieves global optimization of the work cycle. In the spatial dimension, the spatial residual correction mechanism breaks the conservative limitation of traditional fixed safety margins, generates a compact bounding-box trajectory, eliminates invalid lift height, and greatly shortens the path length. In the temporal dimension, the spatiotemporal coupling mechanism dynamically compensates for detours through acceleration: when the robot must lengthen its path to avoid the sharp corners of an obstacle, the system automatically uses its dynamic redundancy to accelerate. Compared with traditional constant-speed or decelerating obstacle avoidance, this ensures that the total operation time does not increase because of obstacle avoidance, which significantly improves production efficiency.
Second, it eliminates motion shocks, extends equipment lifespan, and improves operational stability. A higher-order dynamic constraint (jerk minimization) is introduced into the multi-objective constraint function, eliminating the start-stop shock caused by the right-angle turns of traditional gate-shaped trajectories. The generated spiral trajectory has continuous curvature and crosses the highest point of an obstacle smoothly by maintaining its momentum. This avoids the risk of motor overload caused by sudden changes in acceleration, extends the service life of the robot's joint reducers, and eliminates residual vibration at the end effector, so that motion remains smooth.
Third, it achieves direct arc ignition with zero waiting time, which safeguards welding process quality. Traditional methods often require a pause to adjust the welding torch posture after the torch lands beyond an obstacle. This invention uses the residual strategy network to fine-tune the end posture in advance during the descent, so that at the instant the welding torch contacts the starting point of the next weld segment, its velocity vector is collinear with the weld tangent. With this flight-relay-style posture alignment mechanism, the arc is ignited directly at the moment of landing. This avoids welding defects caused by arc ignition position deviation or pauses (such as insufficient penetration depth at the arc ignition point) and ensures consistency across multiple weld segments.
Fourth, it offers high real-time computational performance and environmental robustness. Traditional online optimization algorithms (such as convex optimization) rely on high computing power to solve complex non-convex equations. In contrast, the residual strategy network of this invention adopts a correction mode that combines offline training with online inference, reducing the complex planning problem to simple matrix operations. It responds in real time at the millisecond level, meeting the dynamic control requirements of industrial sites. At the same time, the baseline-plus-residual architecture guarantees a safety baseline for the system: even under sensor noise or fluctuations in the network output, the baseline path still provides basic safety constraints, making the system robust and adaptable in unstructured environments (for example, with workpiece assembly errors and edge burrs).
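The baseline-plus-residual idea can be sketched in a few lines. This is an illustration under assumptions rather than the disclosed implementation: the text does not say how the network output is bounded, so the per-axis clipping, the bound `max_residual`, and the names `clamp` and `apply_residual` are hypothetical. The sketch shows how a corrected control point stays near the baseline even when the residual output is wildly off.

```python
def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))

def apply_residual(control_points, residuals, max_residual):
    """Add a per-axis clipped residual to each interior control point.

    The first and last points (path start and end) are left untouched.
    Clipping is an assumed safeguard, not something the source specifies.
    """
    first, *inner, last = control_points
    corrected = [tuple(c + clamp(d, -max_residual, max_residual)
                       for c, d in zip(point, delta))
                 for point, delta in zip(inner, residuals)]
    return [first, *corrected, last]

# Hypothetical baseline control points (metres) and a noisy network output.
baseline = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.2), (0.3, 0.0, 0.2), (0.3, 0.0, 0.0)]
noisy_output = [(0.0, 0.0, -0.05), (0.0, 0.0, -5.0)]
corrected = apply_residual(baseline, noisy_output, max_residual=0.02)
```

With these values, both interior control points drop by at most 0.02 m, so the outlier of -5.0 cannot pull the path into the obstacle region that the baseline was built to clear.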

[0102] Table 1

[0103]

[0104] The present invention also provides a path motion control system for a welding robot, including a processing device for implementing the steps of the above-described path motion control method for a welding robot.

[0105] The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. Various modifications and variations can be made to the present invention by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.

Claims

1. A path motion control method for a welding robot, characterized in that it includes the following steps:
S10, constructing an initial collision-free reference path, wherein the initial collision-free reference path is a third-order Bézier curve having a path start point and a path end point, and further has reference control points for constraining its bending shape and tangent direction; specifically including:
S11, obtaining obstacle geometry information and weld start and end point information based on environmental perception;
S12, constructing an axis-aligned bounding box based on a preset safety margin and the obstacle geometry information;
S13, generating the initial collision-free reference path based on a third-order Bézier curve, using the axis-aligned bounding box and the weld start and end point information, wherein the reference control points of the initial collision-free reference path include a first control point and a second control point for controlling the bending shape and the tangent direction;
S20, periodically updating the residual policy network and the current real-time position point; correcting the remaining path based on the updated residual policy network and the current state set corresponding to the current real-time position point, to obtain the remaining corrected planning path; and extracting the current local displacement segment from the remaining corrected planning path based on the interpolation step size; specifically including:
in control period t, constructing the remaining path from the current real-time position point to the path end point, and extracting the real-time path control points carried by the remaining path itself, where t is a positive integer and t ≥ 1;
obtaining the updated residual policy network when the motion reward score of control period t converges, wherein the motion reward score characterizes the obstacle avoidance safety, motion smoothness adaptation, and motion acceleration of the motion process;
calculating the motion reward score using the formula [formula not reproduced], wherein the symbols of the formula respectively denote: the motion reward score; the score of the path-space safety evaluation item at the assumed landing point position; the score of the motion time efficiency evaluation item at the assumed landing point position; the score of the dynamic smoothness evaluation item at the assumed landing point position; the score of the attitude alignment evaluation item at the assumed landing point position; the arrival indicator function; the path end point; the landing point tolerance threshold; the safety evaluation coefficient, the time efficiency evaluation coefficient, the dynamic smoothness evaluation coefficient, and the endpoint consistency evaluation coefficient; and the assumed landing point position in control period t; and wherein, if the motion reward score converges, the assumed landing point is updated to be the target landing point;
based on the current state set, using the updated residual policy network to predict and output the spatial position correction vector and the time scaling factor corresponding to control period t, wherein the current state set includes, for control period t, the distance vector between the current real-time position point and the obstacle surface and the normal vector of the obstacle surface;
superimposing the spatial position correction vector on the real-time path control points to correct the remaining path, thereby obtaining the remaining corrected planning path;
calculating the interpolation step size and the target landing point of control period t based on the current real-time position point, the time scaling factor, and the preset speed; updating the current local displacement segment according to the interpolation step size; moving along the current local displacement segment to the target landing point; and then removing the current local displacement segment from the remaining corrected planning path;
updating the target landing point to be the current real-time position point and entering the next control period, and repeating the above steps until the current real-time position point reaches the path end point;
based on the output of the residual policy network updated in each control period t, obtaining the physical motion control parameters corresponding to each discrete time step based on the preset single-period duration, wherein the physical motion control parameters include the motion velocity parameters of the target landing point and/or the landing point direction vector;
S30, sequentially splicing the current local displacement segments corresponding to each control period t to form the target walking path optimized based on the initial collision-free reference path.
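The flow of steps S10 to S30 can be sketched as below. This is a simplified illustration built on several assumptions that the claim does not state: the baseline is built by raising both inner control points until sampled points clear the box (`baseline_path`), the remaining path is obtained by de Casteljau subdivision (`split_right`), the step size is taken as preset speed × time scaling factor × period duration, and `zero_policy` is a hypothetical stand-in for the trained residual policy network that outputs no correction. All function names and numeric values are illustrative.

```python
import math

def inflate_aabb(box_min, box_max, margin):
    """S12: axis-aligned bounding box grown by the safety margin on every side."""
    return tuple(v - margin for v in box_min), tuple(v + margin for v in box_max)

def bezier_point(ctrl, u):
    """Position on a cubic Bezier curve at parameter u in [0, 1]."""
    w = ((1 - u) ** 3, 3 * (1 - u) ** 2 * u, 3 * (1 - u) * u ** 2, u ** 3)
    return tuple(sum(wi * p[k] for wi, p in zip(w, ctrl)) for k in range(len(ctrl[0])))

def inside(point, box):
    """True if the point lies within the box on every axis."""
    return all(lo <= v <= hi for v, lo, hi in zip(point, box[0], box[1]))

def baseline_path(start, end, box, lift_step=0.01, samples=50):
    """S13 (assumed construction): raise inner control points until no sample hits the box."""
    height = box[1][2]
    for _ in range(1000):
        ctrl = [start, (start[0], start[1], height), (end[0], end[1], height), end]
        if not any(inside(bezier_point(ctrl, i / samples), box) for i in range(samples + 1)):
            return ctrl
        height += lift_step
    raise ValueError("no clear path found; start or end may lie inside the box")

def split_right(ctrl, u):
    """de Casteljau subdivision: control points of the sub-curve from u to 1."""
    pts, right = list(ctrl), [ctrl[-1]]
    while len(pts) > 1:
        pts = [tuple((1 - u) * a + u * b for a, b in zip(p, q)) for p, q in zip(pts, pts[1:])]
        right.append(pts[-1])
    return right[::-1]

def advance(ctrl, step, n=200):
    """Smallest sampled parameter at which the travelled chord length reaches step."""
    travelled, prev = 0.0, ctrl[0]
    for i in range(1, n + 1):
        cur = bezier_point(ctrl, i / n)
        travelled += math.dist(prev, cur)
        if travelled >= step:
            return i / n
        prev = cur
    return 1.0

def zero_policy(position, t):
    """Hypothetical stand-in for the trained network: no correction, unit time scaling."""
    return (0.0, 0.0, 0.0), 1.0

def run(ctrl, policy, v_preset, dt, tol=1e-9, max_cycles=10000):
    """S20: per-period correction and stepping; returns the local displacement segments."""
    segments, remaining = [], list(ctrl)
    for t in range(1, max_cycles + 1):
        position = remaining[0]
        if math.dist(position, remaining[-1]) <= tol:
            break
        delta, scale = policy(position, t)
        corrected = [remaining[0],
                     *(tuple(c + d for c, d in zip(p, delta)) for p in remaining[1:-1]),
                     remaining[-1]]
        u = advance(corrected, v_preset * scale * dt)  # assumed step-size rule
        remaining = split_right(corrected, u)          # drop the segment just travelled
        segments.append((position, remaining[0]))      # current local displacement segment
    return segments

box = inflate_aabb((0.1, -0.05, 0.0), (0.2, 0.05, 0.05), margin=0.01)
path = baseline_path((0.0, 0.0, 0.0), (0.3, 0.0, 0.0), box)
segments = run(path, zero_policy, v_preset=0.1, dt=0.05)
# S30: splice the segments end to end into the target walking path.
waypoints = [segments[0][0]] + [landing for _, landing in segments]
```

Replacing `zero_policy` with a function that returns a non-zero correction vector or a time scaling factor other than 1 changes the inner control points and the step length in each period, which is where a trained network would act.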

2. The path motion control method for a welding robot according to claim 1, characterized in that the score of the path-space safety evaluation item is calculated using the formula [formula not reproduced], wherein the symbols of the formula respectively denote: the distance incentive weight; the collision penalty weight; the Euclidean distance between the assumed landing point position and the obstacle surface in control period t; a smoothing term that prevents a singularity in the denominator; and the collision indicator function of control period t, which takes the value 1 if a collision occurs at the assumed landing point position and 0 if no collision occurs at the assumed landing point position.

3. The path motion control method for a welding robot according to claim 1, characterized in that the score of the motion time efficiency evaluation item is calculated using the formula [formula not reproduced], wherein the symbols of the formula respectively denote: the current real-time position point; the instantaneous composite velocity corresponding to the assumed landing point position in control period t; and the preset speed.

4. The path motion control method for a welding robot according to claim 1, characterized in that the score of the dynamic smoothness evaluation item is calculated using the formula [formula not reproduced], wherein the symbols of the formula respectively denote: the current real-time position points calculated for the three control periods preceding control period t; and the preset single-period duration; and wherein, if control period t does not have three preceding control periods, the corresponding value is 0.
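The quantities named in claim 4 (positions from consecutive control periods and the single-period duration) are the inputs of a third-order backward finite difference, which is one common way to estimate jerk from sampled positions. The sketch below assumes that form and assumes a negative squared-norm penalty as the score; the claimed formula is not reproduced in this text, so the names `jerk_estimate` and `smoothness_score`, the sign, and the scaling are hypothetical.

```python
def jerk_estimate(p_t, p_1, p_2, p_3, dt):
    """Third-order backward difference (p_t - 3 p_{t-1} + 3 p_{t-2} - p_{t-3}) / dt^3."""
    return tuple((a - 3 * b + 3 * c - d) / dt ** 3
                 for a, b, c, d in zip(p_t, p_1, p_2, p_3))

def smoothness_score(history, dt):
    """Assumed score: negative squared jerk norm.

    history lists positions from oldest to newest. With fewer than four
    samples there are not three preceding periods, so the score is 0.0.
    """
    if len(history) < 4:
        return 0.0
    p_3, p_2, p_1, p_t = history[-4:]
    jerk = jerk_estimate(p_t, p_1, p_2, p_3, dt)
    return -sum(c * c for c in jerk)
```

As a sanity check on the difference formula, positions sampled from x = t³ at t = 0, 1, 2, 3 give an estimate of 6, which matches the exact third derivative, and positions from x = t² give 0.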

5. The path motion control method for a welding robot according to claim 1, characterized in that the score of the attitude alignment evaluation item is calculated using the formula [formula not reproduced], wherein the symbols of the formula respectively denote: the terminal velocity vector at the assumed landing point position; and the standard tangential direction vector of the second target weld at its starting point.
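One plausible way to score agreement between the terminal velocity vector and the weld tangent is the cosine of the angle between them. The sketch below assumes that form only for illustration; the claimed formula is not reproduced in this text, and the name `alignment_score` and the zero-vector convention are hypothetical.

```python
import math

def alignment_score(velocity, tangent):
    """Cosine of the angle between the two vectors; 1.0 means the same direction."""
    norm_v = math.sqrt(sum(c * c for c in velocity))
    norm_t = math.sqrt(sum(c * c for c in tangent))
    if norm_v == 0.0 or norm_t == 0.0:
        return 0.0  # assumption: an undefined direction scores zero
    return sum(a * b for a, b in zip(velocity, tangent)) / (norm_v * norm_t)
```

Under this assumed form, a torch arriving exactly along the weld tangent scores 1.0, a perpendicular arrival scores 0.0, and an arrival in the opposite direction scores -1.0.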

6. A path motion control system for a welding robot, characterized in that it includes a processing device for implementing the steps of the path motion control method for a welding robot as described in any one of claims 1 to 5.