A four-legged robot nonlinear trajectory planning and obstacle avoidance method for narrow space operation

By considering the anisotropy of motion and multimodal perception of quadruped robots, and combining nonlinear trajectory optimization and reinforcement learning, the problem of trajectory planning and obstacle avoidance in confined spaces is solved, achieving efficient, safe, and real-time trajectory planning and obstacle avoidance, which is applicable to a variety of quadruped robots.

CN122308377A · Pending · Publication Date: 2026-06-30 · WUHAN HANYANG POWER SUPPLY POWER ENG INSTALLATION TEAM

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
WUHAN HANYANG POWER SUPPLY POWER ENG INSTALLATION TEAM
Filing Date
2026-04-13
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing quadruped robots suffer from problems such as lack of consideration for motion anisotropy constraints, insufficient perception capabilities, poor obstacle avoidance robustness, difficulty in balancing global planning and local response, and high computational complexity when operating in confined spaces. These issues lead to infeasible trajectory planning, instability, collisions, and difficulty in real-time operation.

Method used

By employing coarse trajectory generation based on motion anisotropy perception, nonlinear trajectory optimization, real-time obstacle avoidance decision-making and hierarchical tracking control based on multimodal perception and reinforcement learning, combined with heterogeneous sensor data fusion and nonlinear model predictive control, dynamic, feasible, smooth trajectory planning and obstacle avoidance that satisfy anisotropic constraints are achieved.

Benefits of technology

It significantly improves safety and agility in confined spaces, achieves highly robust obstacle avoidance, balances global optimization and local real-time response, has high computational efficiency, is suitable for embedded deployment, and is applicable to a variety of quadruped robots.

✦ Generated by Eureka AI based on patent content.

Figure CN122308377A_ABST

Abstract

This application relates to a nonlinear trajectory planning and obstacle avoidance method for quadruped robots operating in confined spaces, comprising the following steps: S1, coarse trajectory generation based on motion anisotropy perception; S2, nonlinear trajectory optimization; S3, real-time obstacle avoidance decision based on multimodal perception and reinforcement learning; S4, hierarchical tracking control. The structure provided in this application significantly improves the safety and agility of passing through confined spaces; achieves highly robust obstacle avoidance in dynamic environments; and balances global optimization with local real-time response. The hierarchical architecture of "global coarse planning (S1) → local perception optimization (S2 and S3) → low-level tracking control (S4)" ensures the feasibility of the global path from the starting point to the end point, while also enabling local fine-tuning of the trajectory and obstacle avoidance based on real-time perception information. The response latency is ≤100ms, and it has strong adaptability to dynamic obstacles and sudden terrain changes.

Description

Technical Field

[0001] This application relates to the field of quadruped robot technology, and in particular to a nonlinear trajectory planning and obstacle avoidance method for quadruped robots operating in confined spaces.

Background Technology

[0002] Quadruped robots, with their strong terrain adaptability, have broad application prospects in search and rescue, inspection, and industrial operations. Among them, working in confined spaces (such as narrow alleys, indoor corners, and gaps in ruins) is one of their core application scenarios. However, confined spaces are characterized by limited space, complex terrain, and dense and potentially dynamic distribution of obstacles, which places extremely high demands on the trajectory planning and obstacle avoidance capabilities of quadruped robots.

[0003] Existing quadruped robot trajectory planning methods have the following main shortcomings:

[0004] Neglect of motion anisotropy: Traditional methods plan the robot's planar position (x, y) and yaw angle (θ) as a coupled state, ignoring the inherent motion anisotropy of quadruped robots (i.e., translation speed and turning ability differ significantly with orientation). The planned trajectory may therefore be dynamically infeasible, which can easily lead to robot instability and collisions.

[0005] Insufficient perception capabilities: Single sensors (such as depth cameras) are prone to blind spots in confined spaces, while ultrasonic sensors suffer from high noise levels and cannot provide accurate environmental perception information for obstacle avoidance decisions.

[0006] Poor obstacle avoidance robustness: Traditional obstacle avoidance algorithms (such as artificial potential field method and dynamic window method) are weakly adaptable to dynamic and unknown narrow environments, while reinforcement learning algorithms have problems such as low training efficiency and poor experience reuse, making it difficult to meet the real-time obstacle avoidance requirements.

[0007] Global planning and local response are difficult to balance: Global planning algorithms (such as RRT) can guarantee the global feasibility of the trajectory, but their real-time performance is poor; local planning methods have good real-time performance, but they are prone to getting trapped in local optima and cannot guarantee a valid global path from the starting point to the target point.

[0008] In addition, the computational complexity of existing methods is high, making it difficult to implement real-time operation on the onboard embedded platform of quadruped robots, which limits their application in practical confined spaces.

[0009] Therefore, we propose a nonlinear trajectory planning and obstacle avoidance method for quadruped robots operating in confined spaces.

Summary of the Invention

[0010] This application provides a nonlinear trajectory planning and obstacle avoidance method for quadruped robots operating in confined spaces, in order to solve the problems mentioned above.

[0011] This application provides a nonlinear trajectory planning and obstacle avoidance method for quadruped robots operating in confined spaces, comprising the following steps:

[0012] S1. Coarse trajectory generation based on motion anisotropy perception: A kinematic model of the quadruped robot that includes the motion anisotropy (OMA) constraint is established. The robot's planar position and orientation are planned separately using a separation planning strategy. An improved sampling-based planning algorithm searches the state space under the OMA constraint to generate a collision-free coarse trajectory from the starting point to the target point.

[0013] S2. Nonlinear trajectory optimization: The collision-free coarse trajectory generated in step S1 is used as the initial guess. The trajectory is parameterized with a high-order polynomial, and an optimization model is constructed that includes the robot dynamics equations, joint torque constraints, foot contact constraints, and real-time obstacle avoidance constraints. Solving this model with a numerical optimization solver yields an optimized trajectory that is dynamically feasible, smooth, and satisfies the anisotropic constraints.

[0014] S3. Real-time obstacle avoidance decision based on multimodal perception and reinforcement learning: This step runs in parallel or alternately with steps S1 and S2. Through modeling of complex, narrow terrain, heterogeneous sensor data fusion, and a waypoint-guided reinforcement learning strategy, it outputs local obstacle avoidance waypoints or speed commands, which are fed as dynamic constraints into the optimization model of step S2.

[0015] S4. Hierarchical tracking control: The optimized trajectory from step S2 is used as the reference trajectory and fed into the low-level nonlinear model predictive controller (NMPC) for high-frequency tracking control, thereby achieving accurate trajectory tracking and a stable body posture.

[0016] Preferably, the mathematical expression of the kinematic model of the quadruped robot with anisotropic motion constraints is as follows:

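[0017] $$v_{x}^{\min}(\theta) \le \dot{x} \le v_{x}^{\max}(\theta), \qquad v_{y}^{\min}(\theta) \le \dot{y} \le v_{y}^{\max}(\theta), \qquad \omega^{\min} \le \dot{\theta} \le \omega^{\max}$$

[0018] where $\dot{x}$ and $\dot{y}$ are the robot's translational velocities in the x and y directions of the world coordinate system, respectively; $\dot{\theta}$ is the yaw rate; $\theta$ is the robot's heading angle; $v_{x}^{\min}(\theta)$ and $v_{x}^{\max}(\theta)$ are the minimum and maximum translational velocities in the x direction for different orientations; $v_{y}^{\min}(\theta)$ and $v_{y}^{\max}(\theta)$ are the minimum and maximum translational velocities in the y direction for different orientations; and $\omega^{\min}$ and $\omega^{\max}$ are the lower and upper limits of the yaw rate.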

[0019] Preferably, the separation planning strategy specifically involves: first planning the planar position trajectory, and then planning the orientation trajectory based on the curvature characteristics of the position trajectory, so that the orientation angle θ matches the tangent direction of the position trajectory and satisfies the OMA constraint.

[0020] Preferably, the objective function for nonlinear trajectory optimization is:

[0021]

[0022] The constraints are:

[0023]

[0024] where the objective comprises a trajectory tracking term, which keeps the optimized trajectory close to the coarse trajectory; a trajectory smoothing term, which minimizes trajectory acceleration to ensure smoothness; and an obstacle avoidance penalty, formed from a penalty coefficient and an indicator function that takes the value 1 when the trajectory collides with an obstacle and 0 otherwise. T denotes the trajectory duration. The remaining symbols denote the coarse trajectory, the optimized trajectory, the robot's dynamic equations, the control input, the joint torque, the height of the i-th foot tip, the normal vector of the contact surface, the distance from the trajectory to the obstacle, and the safe distance.

[0025] Preferably, the method for heterogeneous sensor data fusion is as follows:

[0026] S31. Depth camera data processing: Extract the region of interest (ROI) from the depth image, covering 1.5m in front of the robot and 0.8m to its left and right. A convolutional neural network reduces the dimensionality and extracts the geometric features of the passage.

[0027] S32. Ultrasonic sensor data processing: Perform median filtering and Kalman filtering on the 8-channel sensor data of the ultrasonic array to eliminate noise and obtain distance information in 8 directions around the robot.

[0028] S33. Data fusion: A weighted fusion strategy is used to obtain fused perception features.

[0029] Preferably, the waypoint-guided reinforcement learning strategy defines the agent's state space S, action space A, and reward function R(s,a) as follows:

[0030] State space: $S = \{F, p, v, \theta, p_g\}$, where F represents the fused perception features, p the robot's current position, v the current velocity, θ the current orientation, and $p_g$ the location of the target point;

[0031] Action space: $A = \{v, \omega\}$, where v represents the translational velocity and ω represents the yaw rate.

[0032] Reward function:

[0033] where:

[0034]

[0035] the coefficients are the reward coefficients, the waypoint term refers to the local path points, and the distance term is the distance information from the eight directions around the robot to obstacles.

[0036] Preferably, the nonlinear model predictive controller (NMPC) has a control cycle of 50ms, a prediction horizon of 1.0s, and a control horizon of 0.2s. It uses the robot's position, velocity, attitude angle, and angular velocity as state variables and the joint torques as control variables, and achieves trajectory tracking through receding-horizon optimization.

[0037] Preferably, the improved sampling planning algorithm is an improved RRT algorithm, which incorporates a sampling probability distribution that reflects the OMA constraints in the sampling phase and preferentially selects nodes that satisfy the OMA constraints in the rewiring (reconnection) phase.

[0038] The technical solutions provided in this application have the following advantages compared with the prior art:

[0039] The method provided in this application significantly improves safety and agility in navigating confined spaces: by explicitly modeling and considering the kinematic anisotropy (OMA) of the quadruped robot, motion instability caused by capability coupling is fundamentally avoided. In scenarios involving a straight narrow passage with a width of 0.8m and a right-angle bend with a turning radius of 0.5m, the robot achieves a success rate of 98.7%, reduces the passage time by 22.3% compared to traditional methods, and exhibits no instability or collisions.

[0040] Achieving highly robust obstacle avoidance in dynamic environments: Employing multimodal fusion sensing using a depth camera and ultrasonic array overcomes the blind spots and interference problems of single sensors in confined environments, reducing sensing errors to a minimum. By combining a reinforcement learning obstacle avoidance strategy based on multi-agent experience sharing, the obstacle avoidance success rate reaches 95.2% in dynamic, confined scenarios containing moving obstacles, which is more than 20% higher than the traditional PPO and SAC algorithms.

[0041] Balancing global optimization and local real-time response: The hierarchical architecture of “global coarse planning (S1) → local perception optimization (S2 and S3) → underlying tracking control (S4)” ensures the feasibility of the global path from the starting point to the end point, while also enabling local fine-tuning of the trajectory and obstacle avoidance based on real-time perception information. The response latency is ≤100ms, and it has strong adaptability to dynamic obstacles and sudden terrain changes.

[0042] High computational efficiency, suitable for embedded deployment: The trajectory optimization stage uses the IPOPT solver, warm-started from the coarse trajectory, achieving a solution time of ≤80ms; the obstacle avoidance decision module uses a three-layer perceptron to reduce the dimensionality of high-dimensional perception data, reducing computational complexity by 60%; the low-level NMPC operates at a frequency of 20Hz. The entire framework operates at a frequency of 12Hz on the NVIDIA Jetson Xavier NX onboard platform, meeting the real-time requirements of operations in confined spaces.

[0043] High versatility and scalability to various quadruped robots: The OMA kinematic model of this invention can be adjusted according to the mechanical structure of different quadruped robots, and the multimodal perception fusion and reinforcement learning strategies can also be adapted to different sensor configurations and operating scenarios, demonstrating good versatility and scalability.

Attached Figure Description

[0044] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this application and, together with the description, serve to explain the principles of this application.

[0045] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, for those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0046] Figure 1 is a schematic diagram of the overall principle and structure of the present invention.

Detailed Implementation

[0047] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0048] Various embodiments of this application may exist in the form of a range. It should be understood that the description in the form of a range is merely for convenience and brevity and should not be construed as a rigid limitation on the scope of this application. Therefore, it should be considered that the range description has specifically disclosed all possible sub-ranges and single numerical values within that range. For example, it should be considered that the range description from 1 to 6 has specifically disclosed sub-ranges, such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6, etc., and single numbers within the range, such as 1, 2, 3, 4, 5, and 6, regardless of the range. In addition, whenever a numerical range is indicated in this application, it means including any referenced number (fraction or integer) within the indicated range. Unless otherwise specified, all raw materials, reagents, instruments, and equipment used in this application can be purchased commercially or prepared using existing equipment.

[0049] In this application, unless otherwise stated, directional terms such as "upper" and "lower" specifically refer to the drawing directions in the accompanying drawings. Furthermore, in this application, the terms "comprising," "including," etc., mean "including but not limited to." In this application, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or order between these entities or operations. In this application, "and / or" describes the relationship between related objects, indicating that three relationships may exist. For example, A and / or B can represent: A alone, A and B simultaneously, or B alone. A and B can be singular or plural. In this application, "at least one" means one or more, and "more than one" means two or more. "At least one," "at least one of the following," or similar expressions refer to any combination of these items, including any combination of a single item or a plural item. For example, "at least one of a, b, or c" or "at least one of a, b, and c" can both mean: a, b, c, ab (i.e., a and b), ac, bc, or abc, where a, b, and c can be a single or multiple.

[0050] As shown in Figure 1, this application provides a nonlinear trajectory planning and obstacle avoidance method for quadruped robots operating in confined spaces, comprising the following steps:

[0051] S1. Coarse trajectory generation based on motion anisotropy perception: A kinematic model of the quadruped robot that includes the motion anisotropy (OMA) constraint is established. The robot's planar position and orientation are planned separately using a separation planning strategy. An improved sampling-based planning algorithm searches the state space under the OMA constraint to generate a collision-free coarse trajectory from the starting point to the target point.

[0052] S2. Nonlinear trajectory optimization: The collision-free coarse trajectory generated in step S1 is used as the initial guess. The trajectory is parameterized with a high-order polynomial, and an optimization model is constructed that includes the robot dynamics equations, joint torque constraints, foot contact constraints, and real-time obstacle avoidance constraints. Solving this model with a numerical optimization solver yields an optimized trajectory that is dynamically feasible, smooth, and satisfies the anisotropic constraints.

[0053] S3. Real-time obstacle avoidance decision based on multimodal perception and reinforcement learning: This step runs in parallel or alternately with steps S1 and S2. Through modeling of complex, narrow terrain, heterogeneous sensor data fusion, and a waypoint-guided reinforcement learning strategy, it outputs local obstacle avoidance waypoints or speed commands, which are fed as dynamic constraints into the optimization model of step S2.

[0054] S4. Hierarchical tracking control: The optimized trajectory from step S2 is used as the reference trajectory and fed into the low-level nonlinear model predictive controller (NMPC) for high-frequency tracking control, thereby achieving accurate trajectory tracking and a stable body posture.

[0055] The mathematical expression for the kinematic model of the quadruped robot with anisotropic motion constraints is as follows:

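[0056] $$v_{x}^{\min}(\theta) \le \dot{x} \le v_{x}^{\max}(\theta), \qquad v_{y}^{\min}(\theta) \le \dot{y} \le v_{y}^{\max}(\theta), \qquad \omega^{\min} \le \dot{\theta} \le \omega^{\max}$$

[0057] where $\dot{x}$ and $\dot{y}$ are the robot's translational velocities in the x and y directions of the world coordinate system, respectively; $\dot{\theta}$ is the yaw rate; $\theta$ is the robot's heading angle; $v_{x}^{\min}(\theta)$ and $v_{x}^{\max}(\theta)$ are the minimum and maximum translational velocities in the x direction for different orientations; $v_{y}^{\min}(\theta)$ and $v_{y}^{\max}(\theta)$ are the minimum and maximum translational velocities in the y direction for different orientations; and $\omega^{\min}$ and $\omega^{\max}$ are the lower and upper limits of the yaw rate.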
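The orientation-dependent velocity limits described above can be illustrated with a short feasibility check. This is only a sketch under stated assumptions: the source does not give the bound functions, so the example assumes a rectangular body-frame speed envelope and takes its axis-aligned bounding box in the world frame (an outer approximation). The names `V_LAT_MAX`, `world_velocity_bounds`, and `satisfies_oma` are hypothetical, and the sideways limit of 0.3 m/s is an invented value; only 0.6 m/s and 0.8 rad/s appear in the source.

```python
import math

# Hypothetical body-frame limits; only 0.6 m/s and 0.8 rad/s appear in the source.
V_FWD_MAX = 0.6      # m/s along the body's long axis
V_LAT_MAX = 0.3      # m/s sideways (assumed value, not from the source)
YAW_RATE_MAX = 0.8   # rad/s


def world_velocity_bounds(theta):
    """Return ((vx_min, vx_max), (vy_min, vy_max)) in the world frame for heading theta.

    Assumption: the bounds are the axis-aligned bounding box of the body-frame
    velocity rectangle rotated by theta.
    """
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    vx_max = V_FWD_MAX * c + V_LAT_MAX * s
    vy_max = V_FWD_MAX * s + V_LAT_MAX * c
    return (-vx_max, vx_max), (-vy_max, vy_max)


def satisfies_oma(vx, vy, yaw_rate, theta):
    """Check world-frame velocities and yaw rate against the heading-dependent bounds."""
    (vx_min, vx_max), (vy_min, vy_max) = world_velocity_bounds(theta)
    return (vx_min <= vx <= vx_max
            and vy_min <= vy <= vy_max
            and -YAW_RATE_MAX <= yaw_rate <= YAW_RATE_MAX)
```

With these assumed limits, a 0.5 m/s motion along world y is rejected when the robot faces along x but accepted once it has turned by 90 degrees, which is the kind of orientation dependence the constraint is meant to capture.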

[0058] The separation planning strategy specifically involves: first planning the planar position trajectory, and then planning the orientation trajectory based on the curvature characteristics of the position trajectory, so that the orientation angle θ matches the tangent direction of the position trajectory and satisfies the OMA constraint.
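A minimal sketch of the second stage of this strategy, assuming the position trajectory is already available as a list of (x, y) waypoints sampled at a fixed time step. The heading is taken as the tangent direction of each segment and then rate-limited so that the yaw-rate bound holds. The function names and the fixed-step discretization are assumptions made for illustration; the source does not specify how the orientation trajectory is computed.

```python
import math


def headings_from_path(points):
    """Heading at each waypoint = direction of the segment leaving it (last value repeats)."""
    headings = [math.atan2(y1 - y0, x1 - x0)
                for (x0, y0), (x1, y1) in zip(points, points[1:])]
    headings.append(headings[-1])
    return headings


def limit_yaw_rate(headings, dt, yaw_rate_max):
    """Clamp the heading change per step so that |dtheta/dt| <= yaw_rate_max."""
    out = [headings[0]]
    max_step = yaw_rate_max * dt
    for target in headings[1:]:
        # Shortest signed angular difference between target and current heading.
        diff = math.atan2(math.sin(target - out[-1]), math.cos(target - out[-1]))
        out.append(out[-1] + max(-max_step, min(max_step, diff)))
    return out
```

For a right-angle corner such as `[(0, 0), (1, 0), (1, 1)]` the tangent heading jumps by 90 degrees, so with `dt = 0.5` and a limit of 0.8 rad/s the rate-limited heading advances by at most 0.4 rad per step and lags the tangent. In a full planner that lag would have to be resolved, for example by slowing the position trajectory near the corner.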

[0059] The objective function for the nonlinear trajectory optimization is:

[0060]

[0061] The constraints are:

[0062]

[0063] where the objective comprises a trajectory tracking term, which keeps the optimized trajectory close to the coarse trajectory; a trajectory smoothing term, which minimizes trajectory acceleration to ensure smoothness; and an obstacle avoidance penalty, formed from a penalty coefficient and an indicator function that takes the value 1 when the trajectory collides with an obstacle and 0 otherwise. T denotes the trajectory duration. The remaining symbols denote the coarse trajectory, the optimized trajectory, the robot's dynamic equations, the control input, the joint torque, the height of the i-th foot tip, the normal vector of the contact surface, the distance from the trajectory to the obstacle, and the safe distance.
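The three objective terms described above can be evaluated on a discretized trajectory as in the sketch below. The exact formula is not reproduced in the source text, so the squared-error tracking term, the finite-difference acceleration term, equal weighting of the terms, and the use of "closer than the safe distance to a point obstacle" as the collision test are all assumptions. The function name `trajectory_cost` and its parameters are hypothetical.

```python
import math


def trajectory_cost(opt, coarse, obstacles, dt, safe_dist, penalty=100.0):
    """Discretized J = J_track + J_smooth + J_obs for lists of (x, y) waypoints."""
    # Tracking term: squared distance between optimized and coarse waypoints.
    j_track = sum((px - cx) ** 2 + (py - cy) ** 2
                  for (px, py), (cx, cy) in zip(opt, coarse)) * dt

    # Smoothing term: squared acceleration from second finite differences.
    j_smooth = 0.0
    for (x0, y0), (x1, y1), (x2, y2) in zip(opt, opt[1:], opt[2:]):
        ax = (x2 - 2 * x1 + x0) / dt ** 2
        ay = (y2 - 2 * y1 + y0) / dt ** 2
        j_smooth += (ax ** 2 + ay ** 2) * dt

    # Obstacle term: indicator (1 on collision, 0 otherwise) times a penalty coefficient.
    # Assumption: "collision" means being closer than safe_dist to a point obstacle.
    hits = sum(1 for (px, py) in opt
               if any(math.hypot(px - ox, py - oy) < safe_dist for ox, oy in obstacles))
    j_obs = penalty * hits * dt

    return j_track + j_smooth + j_obs
```

A straight, evenly spaced trajectory that coincides with the coarse one and touches no obstacle costs zero; placing one obstacle within the safe distance of a single waypoint adds exactly `penalty * dt`.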

[0064] The method for fusion of heterogeneous sensor data is as follows:

[0065] S31. Depth camera data processing: Extract the region of interest (ROI) from the depth image, covering 1.5m in front of the robot and 0.8m to its left and right. A convolutional neural network reduces the dimensionality and extracts the geometric features of the passage.

[0066] S32. Ultrasonic sensor data processing: Perform median filtering and Kalman filtering on the 8-channel sensor data of the ultrasonic array to eliminate noise and obtain distance information in 8 directions around the robot.

[0067] S33. Data fusion: A weighted fusion strategy is used to obtain fused perception features.
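Steps S32 and S33 can be sketched as follows. The source names median filtering, Kalman filtering, and weighted fusion but gives no parameters, so the window size, the noise variances `q` and `r`, the fusion weights, and the choice to fuse by scaling and concatenating the two feature vectors are all assumptions. The names `median_filter`, `ScalarKalman`, and `fuse` are hypothetical, and the camera features are assumed to come from the CNN of step S31.

```python
import statistics


def median_filter(samples, window=3):
    """Sliding median over one ultrasonic channel (edges use a shorter window)."""
    half = window // 2
    return [statistics.median(samples[max(0, i - half): i + half + 1])
            for i in range(len(samples))]


class ScalarKalman:
    """1-D Kalman filter with a constant-value model; q and r are assumed variances."""

    def __init__(self, x0, p0=1.0, q=1e-3, r=1e-2):
        self.x, self.p, self.q, self.r = x0, p0, q, r

    def update(self, z):
        self.p += self.q                   # predict: uncertainty grows by process noise
        k = self.p / (self.p + self.r)     # Kalman gain
        self.x += k * (z - self.x)         # correct with measurement z
        self.p *= (1.0 - k)
        return self.x


def fuse(camera_features, ultrasonic_ranges, w_cam=0.6, w_us=0.4):
    """Weighted fusion (assumed form): scale each modality, then concatenate."""
    return [w_cam * f for f in camera_features] + [w_us * d for d in ultrasonic_ranges]
```

In use, each of the 8 ultrasonic channels would get its own `ScalarKalman` instance fed with median-filtered readings, and the 8 filtered distances would be passed to `fuse` together with the camera feature vector.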

[0068] The waypoint-guided reinforcement learning strategy defines the agent's state space S, action space A, and reward function R(s,a) as follows:

[0069] State space: $S = \{F, p, v, \theta, p_g\}$, where F represents the fused perception features, p the robot's current position, v the current velocity, θ the current orientation, and $p_g$ the location of the target point;

[0070] Action space: $A = \{v, \omega\}$, where v represents the translational velocity and ω represents the yaw rate.

[0071] Reward function:

[0072] where:

[0073]

[0074] the coefficients are the reward coefficients, the waypoint term refers to the local path points, and the distance term is the distance information from the eight directions around the robot to obstacles.

[0075] The nonlinear model predictive controller (NMPC) has a control cycle of 50ms, a prediction horizon of 1.0s, and a control horizon of 0.2s. It uses the robot's position, velocity, attitude angle, and angular velocity as state variables and the joint torques as control variables, and achieves trajectory tracking through receding-horizon optimization.
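The timing figures above translate into discrete horizon lengths as in this short worked example. It only does the arithmetic on the stated 50ms, 1.0s, and 0.2s values; the helper name `horizon_steps` is hypothetical and nothing here reflects the actual controller implementation.

```python
CONTROL_PERIOD_S = 0.050       # control cycle from the description
PREDICTION_HORIZON_S = 1.0     # prediction horizon
CONTROL_HORIZON_S = 0.2        # control horizon


def horizon_steps(horizon_s, period_s):
    """Number of control periods that fit in a horizon (rounded to absorb float error)."""
    return round(horizon_s / period_s)


prediction_steps = horizon_steps(PREDICTION_HORIZON_S, CONTROL_PERIOD_S)
control_steps = horizon_steps(CONTROL_HORIZON_S, CONTROL_PERIOD_S)
control_rate_hz = 1.0 / CONTROL_PERIOD_S

print(prediction_steps, control_steps, control_rate_hz)
```

The controller therefore predicts 20 steps ahead and optimizes 4 control moves per solve, and the 50ms cycle corresponds to the 20Hz NMPC rate quoted elsewhere in the description.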

[0076] The improved sampling planning algorithm is an improved RRT algorithm, which incorporates a sampling probability distribution that reflects the OMA constraints in the sampling phase and preferentially selects nodes that satisfy the OMA constraints in the rewiring (reconnection) phase.
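One way to bias sampling toward OMA-friendly states is rejection sampling on the heading, sketched below. The source does not specify the sampling distribution, so everything here is an assumption: the elliptical speed envelope, the 0.3 m/s sideways limit, the acceptance rule, and the names `oma_speed_limit` and `sample_heading` are hypothetical and illustrate the idea only.

```python
import math
import random


def oma_speed_limit(move_dir, heading, v_fwd=0.6, v_lat=0.3):
    """Assumed elliptical speed limit for moving along move_dir while facing heading."""
    rel = move_dir - heading
    return (v_fwd * v_lat) / math.hypot(v_lat * math.cos(rel), v_fwd * math.sin(rel))


def sample_heading(move_dir, rng, v_fwd=0.6, v_lat=0.3, tries=20):
    """Rejection-sample a heading, favouring headings that allow fast motion along move_dir."""
    for _ in range(tries):
        heading = rng.uniform(-math.pi, math.pi)
        # Acceptance probability lies between v_lat / v_fwd and 1.
        accept_prob = oma_speed_limit(move_dir, heading, v_fwd, v_lat) / v_fwd
        if rng.random() < accept_prob:
            return heading
    return move_dir  # fall back to facing the direction of travel
```

Headings aligned with the direction of travel are always accepted, while sideways-facing headings are accepted only half the time under these assumed limits, so the tree grows preferentially through states the robot can traverse quickly.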

[0077] Example 1

[0078] Dimensions: Length 0.6m, Width 0.4m, Height 0.5m;

[0079] Total weight: 12kg;

[0080] Motion performance: maximum translational speed 0.6 m / s, maximum yaw rate 0.8 rad / s, joint torque range 0 to 50 N·m;

[0081] Sensor configuration: Depth camera (Intel RealSense D435i), 8-channel ultrasonic sensor array, IMU (Inertial Measurement Unit), GPS (Indoor Positioning Module);

[0082] Onboard computing platform: NVIDIA Jetson Xavier NX (8-core CPU, 16GB memory);

[0083] Software environment:

[0084] Operating system: Ubuntu 20.04 LTS;

[0085] Programming framework: Python 3.8, C++17, ROS Noetic;

[0086] Algorithm libraries: CasADi (nonlinear optimization), PyTorch (reinforcement learning), OpenCV (image processing), IPOPT (numerical solver);

[0087] Example 2

[0088] Experimental scenarios: Three types of real-world confined space experimental scenarios were constructed, consistent with the simulation scenarios:

[0089] Scenario 1 (Straight Narrow Passage): 0.8m wide, 5m long, with fixed obstacles on both sides;

[0090] Scenario 2 (Right Angle Bend): Turning radius 0.5m, passage width 0.8m;

[0091] Scenario 3 (Dynamic Obstacle Narrow Passage): The passage is 0.8m wide and 5m long, and contains 1 moving obstacle (speed 0.2m / s).

[0092] Test metrics: Four core metrics were selected to evaluate the performance of this invention, and it was compared with the traditional RRT*+dynamic window method (comparison method 1) and the PPO reinforcement learning obstacle avoidance method (comparison method 2):

[0093] Metric 1: Success Rate (%): The percentage of experiments that successfully reached the target point from the starting point out of the total number of experiments;

[0094] Metric 2: Average Passage Time (s): The average time taken to successfully pass through a scenario;

[0095] Metric 3: Position tracking error (cm): The average error between the robot's actual position and the reference trajectory;

[0096] Metric 4: Obstacle Avoidance Response Latency (ms): The average delay from detecting an obstacle to taking an obstacle avoidance action.
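The first two metrics can be computed from per-trial records as in the sketch below. The record layout (`reached_goal`, `time_s`) and the function name `summarize` are hypothetical, and the sample data in the usage note are made up purely to show the arithmetic; they are not experimental results.

```python
def summarize(trials):
    """Return (success rate in %, mean passage time in s over successful trials)."""
    successes = [t for t in trials if t["reached_goal"]]
    success_rate = 100.0 * len(successes) / len(trials)
    # Passage time is averaged over successful runs only, matching the metric definition.
    mean_time = (sum(t["time_s"] for t in successes) / len(successes)
                 if successes else None)
    return success_rate, mean_time
```

For example, four illustrative trials in which three reach the goal in 10s, 12s, and 14s give a success rate of 75.0% and a mean passage time of 12.0s.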

[0097] Experimental Results and Analysis

[0098] Each scenario was tested 100 times, and the results are shown in Table 1:

[0099] Table 1 Comparison of experimental performance of different methods

[0100]

[0101] The experimental results show that:

[0102] 1. Success rate: The success rate of this invention in all three scenarios is over 95%, which is much higher than that of comparative method 1 and comparative method 2, indicating that this invention can effectively solve the problems of infeasibility of trajectory dynamics and poor obstacle avoidance robustness in confined spaces;

[0103] 2. Average transit time: The average transit time of the present invention is 20% to 30% shorter than that of comparative method 1 and 10% to 20% shorter than that of comparative method 2, indicating that the trajectory planning of the present invention is more efficient and the robot can pass through narrow spaces more nimbly;

[0104] 3. Position tracking error: The position tracking error of this invention is controlled within 5cm, which is much lower than that of the comparison method, indicating that the underlying NMPC tracking control has higher accuracy;

[0105] 4. Obstacle avoidance response delay: The obstacle avoidance response delay of the present invention is ≤100ms, which meets the requirements of real-time obstacle avoidance, while the delays of the comparison methods all exceed 120ms, making it difficult for them to cope with dynamic obstacles.

[0106] Furthermore, in actual experiments, the quadruped robot of the present invention moved smoothly in a confined space without instability or collision, verifying the safety and reliability of the present invention; the entire framework operated at a frequency of 12Hz on the onboard platform, verifying the computational efficiency and embedded deployment feasibility of the present invention.

[0107] The above description is merely a specific embodiment of this application, enabling those skilled in the art to understand or implement this application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of this application. Therefore, this application is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features claimed in this application.

Claims

1. A nonlinear trajectory planning and obstacle avoidance method for quadruped robots operating in confined spaces, characterized in that it comprises the following steps: S1. Coarse trajectory generation based on motion anisotropy perception: A kinematic model of the quadruped robot that includes the motion anisotropy (OMA) constraint is established. The robot's planar position and orientation are planned separately using a separation planning strategy. An improved sampling-based planning algorithm searches the state space under the OMA constraint to generate a collision-free coarse trajectory from the starting point to the target point. S2. Nonlinear trajectory optimization: The collision-free coarse trajectory generated in step S1 is used as the initial guess. The trajectory is parameterized with a high-order polynomial, and an optimization model is constructed that includes the robot dynamics equations, joint torque constraints, foot contact constraints, and real-time obstacle avoidance constraints. Solving this model with a numerical optimization solver yields an optimized trajectory that is dynamically feasible, smooth, and satisfies the anisotropic constraints. S3. Real-time obstacle avoidance decision based on multimodal perception and reinforcement learning: This step runs in parallel or alternately with steps S1 and S2. Through modeling of complex, narrow terrain, heterogeneous sensor data fusion, and a waypoint-guided reinforcement learning strategy, it outputs local obstacle avoidance waypoints or speed commands, which are fed as dynamic constraints into the optimization model of step S2. S4. Hierarchical tracking control: The optimized trajectory from step S2 is used as the reference trajectory and fed into the low-level nonlinear model predictive controller (NMPC) for high-frequency tracking control, thereby achieving accurate trajectory tracking and a stable body posture.

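2. The nonlinear trajectory planning and obstacle avoidance method for quadruped robots operating in confined spaces according to claim 1, characterized in that: The mathematical expression for the kinematic model of the quadruped robot with anisotropic motion constraints is as follows: $$v_{x}^{\min}(\theta) \le \dot{x} \le v_{x}^{\max}(\theta), \qquad v_{y}^{\min}(\theta) \le \dot{y} \le v_{y}^{\max}(\theta), \qquad \omega^{\min} \le \dot{\theta} \le \omega^{\max}$$; wherein: $\dot{x}$ and $\dot{y}$ are the x and y direction translational velocities of the robot in the world coordinate frame, respectively; $\dot{\theta}$ is the yaw rate; $\theta$ is the heading angle of the robot; $v_{x}^{\min}(\theta)$ and $v_{x}^{\max}(\theta)$ are the minimum and maximum translational velocities in the x direction for different headings; $v_{y}^{\min}(\theta)$ and $v_{y}^{\max}(\theta)$ are the minimum and maximum translational velocities in the y direction for different headings; and $\omega^{\min}$ and $\omega^{\max}$ are the lower and upper bounds of the yaw rate.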

3. The four-legged robot nonlinear trajectory planning and obstacle avoidance method for narrow space operation according to claim 1, characterized in that the separated planning strategy specifically comprises: first planning the planar position trajectory, and then planning the orientation trajectory based on the curvature characteristics of the position trajectory, so that the orientation angle matches the tangent direction of the position trajectory and satisfies the OMA constraint.
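The separated strategy can be sketched as two passes over a planar path: derive a heading from the path tangent, then limit how fast that heading may change. This is a minimal Python sketch under stated assumptions: the path is a polyline, only a symmetric yaw-rate bound is enforced, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def wrap_angle(angle: float) -> float:
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def headings_from_path(points: Sequence[Point]) -> List[float]:
    """Heading at each waypoint is the direction of the segment leaving it.

    The final waypoint has no outgoing segment and reuses the previous heading.
    """
    if len(points) < 2:
        raise ValueError("need at least two waypoints")
    headings = [
        math.atan2(y1 - y0, x1 - x0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]
    headings.append(headings[-1])
    return headings


def limit_yaw_rate(headings: Sequence[float], dt: float, max_yaw_rate: float) -> List[float]:
    """Clamp the heading change per time step so that |yaw rate| <= max_yaw_rate."""
    max_step = max_yaw_rate * dt
    limited = [headings[0]]
    for target in headings[1:]:
        delta = wrap_angle(target - limited[-1])
        delta = max(-max_step, min(max_step, delta))
        limited.append(wrap_angle(limited[-1] + delta))
    return limited
```

Where the clamped heading lags the tangent, as at a sharp corner, the orientation only approximately matches the path direction; a planner could slow the position trajectory at such points so the heading can catch up.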

4. The four-legged robot nonlinear trajectory planning and obstacle avoidance method for narrow space operation according to claim 1, characterized in that the objective function for the nonlinear trajectory optimization is: [expressions missing in source]; and the constraints are: [expressions missing in source]; wherein the symbols denote, in order:
the trajectory tracking term, which keeps the optimized trajectory close to the coarse trajectory;
the trajectory smoothing term, which minimizes trajectory acceleration to ensure smoothness;
the obstacle avoidance penalty term;
the penalty coefficient;
the indicator function, which takes the value 1 when the trajectory collides with an obstacle and 0 otherwise;
T, the trajectory duration;
the coarse trajectory;
the optimized trajectory;
the robot's dynamic equations;
the control input;
the joint torque;
the height of the i-th foot tip;
the normal vector of the contact surface;
the distance from the trajectory to the obstacle;
the safe distance.
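The three objective terms named in claim 4 can be illustrated on a discretized planar trajectory. The sketch below is one plausible reading, not the claimed formulas, which are missing from the text: it assumes squared-error tracking, squared finite-difference acceleration, circular obstacles, and an unweighted sum of the terms. All names are hypothetical, and the dynamics, torque, and foot contact constraints are not modeled.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Circle = Tuple[float, float, float]  # assumed obstacle model: (centre x, centre y, radius)


def tracking_cost(optimized: Sequence[Point], coarse: Sequence[Point], dt: float) -> float:
    """Integrated squared deviation of the optimized trajectory from the coarse one."""
    return sum(
        ((px - cx) ** 2 + (py - cy) ** 2) * dt
        for (px, py), (cx, cy) in zip(optimized, coarse)
    )


def smoothness_cost(optimized: Sequence[Point], dt: float) -> float:
    """Integrated squared acceleration, estimated by second-order finite differences."""
    total = 0.0
    for (x0, y0), (x1, y1), (x2, y2) in zip(optimized, optimized[1:], optimized[2:]):
        ax = (x2 - 2.0 * x1 + x0) / dt ** 2
        ay = (y2 - 2.0 * y1 + y0) / dt ** 2
        total += (ax * ax + ay * ay) * dt
    return total


def collides(point: Point, obstacles: Sequence[Circle]) -> bool:
    """Indicator function: True when the point lies inside any obstacle."""
    return any(math.hypot(point[0] - cx, point[1] - cy) < r for cx, cy, r in obstacles)


def obstacle_penalty(
    optimized: Sequence[Point], obstacles: Sequence[Circle], penalty: float, dt: float
) -> float:
    """Penalty coefficient times the time spent in collision."""
    return penalty * sum(dt for p in optimized if collides(p, obstacles))


def total_cost(
    optimized: Sequence[Point],
    coarse: Sequence[Point],
    obstacles: Sequence[Circle],
    penalty: float,
    dt: float,
) -> float:
    """Unweighted sum of the three terms (the weighting is an assumption)."""
    return (
        tracking_cost(optimized, coarse, dt)
        + smoothness_cost(optimized, dt)
        + obstacle_penalty(optimized, obstacles, penalty, dt)
    )
```

For a straight coarse path through an obstacle, the detour pays a small tracking and smoothness cost but avoids the collision penalty, so a sufficiently large penalty coefficient makes the detour the cheaper trajectory.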

5. The four-legged robot nonlinear trajectory planning and obstacle avoidance method for narrow space operation according to claim 1, characterized in that the heterogeneous sensor data fusion is performed as follows:
S31. Depth camera data processing: a region of interest (ROI) is extracted from the depth image, covering 1.5 m in front of the robot and 0.8 m to the left and to the right; a convolutional neural network reduces its dimensionality and extracts the geometric features of the passage.
S32. Ultrasonic sensor data processing: median filtering and Kalman filtering are applied to the 8-channel data of the ultrasonic array to remove noise and obtain distance information in 8 directions around the robot.
S33. Data fusion: a weighted fusion strategy is used to obtain the fused perception features.
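Steps S32 and S33 can be sketched in plain Python: a sliding median to reject spikes, a scalar Kalman filter to smooth each ultrasonic channel, and a weighted concatenation as the fusion step. The noise parameters, the fusion weights, and the choice of concatenation are assumptions; the claim states only that a weighted fusion strategy is used. The CNN of step S31 is not shown, and `depth_features` stands in for its output.

```python
from statistics import median
from typing import List, Sequence


def median_filter(samples: Sequence[float], window: int = 3) -> List[float]:
    """Sliding median over one channel; the window shrinks at the edges."""
    half = window // 2
    return [
        median(samples[max(0, i - half): i + half + 1])
        for i in range(len(samples))
    ]


class ScalarKalman:
    """Kalman filter for one slowly varying distance (constant-value model)."""

    def __init__(self, q: float = 1e-3, r: float = 1e-2, x0: float = 0.0, p0: float = 1.0):
        self.q = q    # assumed process noise variance
        self.r = r    # assumed measurement noise variance
        self.x = x0   # state estimate (distance in metres)
        self.p = p0   # estimate variance

    def update(self, z: float) -> float:
        self.p += self.q                  # predict: state unchanged, uncertainty grows
        gain = self.p / (self.p + self.r)
        self.x += gain * (z - self.x)     # correct with the new measurement
        self.p *= 1.0 - gain
        return self.x


def fuse(
    depth_features: Sequence[float],
    ultrasonic_distances: Sequence[float],
    w_depth: float = 0.6,   # assumed weight
    w_ultra: float = 0.4,   # assumed weight
) -> List[float]:
    """Weighted fusion, taken here to mean scaling each modality and concatenating."""
    return [w_depth * f for f in depth_features] + [w_ultra * d for d in ultrasonic_distances]
```

In use, each of the eight ultrasonic channels would have its own `ScalarKalman` instance fed with median-filtered readings, and the eight filtered distances would be passed to `fuse` together with the depth features.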

6. The four-legged robot nonlinear trajectory planning and obstacle avoidance method for narrow space operation according to claim 1, characterized in that the waypoint-guided reinforcement learning strategy defines the agent's state space S, action space A, and reward function R(s, a) as follows:
State space: [expression missing in source], whose components denote the fused perception features, the robot's current position, the current velocity, the current orientation, and the position of the target point.
Action space: [expression missing in source], wherein v denotes the translational velocity and w denotes the yaw rate.
Reward function: [expression missing in source], wherein the remaining symbols denote the reward coefficients, the local waypoints, and the distances from the robot to obstacles in the eight directions around it.
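Because the reward expression is missing from the text, the following Python sketch shows only one plausible shape for a waypoint-guided reward built from the quantities the claim names: a local waypoint and the eight directional obstacle distances. The progress and clearance terms, the coefficient values, and the safe distance are all assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def waypoint_guided_reward(
    position: Point,
    prev_position: Point,
    waypoint: Point,
    ultrasonic_distances: Sequence[float],
    k_progress: float = 1.0,      # assumed reward coefficient
    k_clearance: float = 0.5,     # assumed reward coefficient
    safe_distance: float = 0.3,   # assumed threshold in metres
) -> float:
    """Reward progress toward the local waypoint; penalize low obstacle clearance."""
    if len(ultrasonic_distances) != 8:
        raise ValueError("expected distances in eight directions")
    # Positive when the robot moved closer to the waypoint during this step.
    progress = math.dist(prev_position, waypoint) - math.dist(position, waypoint)
    # Positive only when the nearest obstacle is inside the safe distance.
    shortfall = max(0.0, safe_distance - min(ultrasonic_distances))
    return k_progress * progress - k_clearance * shortfall
```

A shaped reward of this kind gives the agent a dense signal at every step instead of only at the goal, which is the usual motivation for guiding a policy with intermediate waypoints.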

7. The four-legged robot nonlinear trajectory planning and obstacle avoidance method for narrow space operation according to claim 1, characterized in that the nonlinear model predictive controller (NMPC) has a control cycle of 50 ms, a prediction horizon of 1.0 s, and a control horizon of 0.2 s; it uses the robot's position, velocity, attitude angle, and angular velocity as state variables and the joint torques as control variables, and achieves trajectory tracking through receding-horizon optimization.
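The timing parameters in claim 7 translate directly into horizon lengths. The short Python calculation below assumes that the controller discretizes both horizons with a step equal to the 50 ms control cycle, which the claim does not state explicitly.

```python
CONTROL_PERIOD_S = 0.050       # control cycle from claim 7
PREDICTION_HORIZON_S = 1.0     # prediction horizon from claim 7
CONTROL_HORIZON_S = 0.2        # control horizon from claim 7


def horizon_steps(horizon_s: float, period_s: float) -> int:
    """Number of discretization steps that cover a horizon."""
    return round(horizon_s / period_s)


prediction_steps = horizon_steps(PREDICTION_HORIZON_S, CONTROL_PERIOD_S)
control_steps = horizon_steps(CONTROL_HORIZON_S, CONTROL_PERIOD_S)
control_rate_hz = 1.0 / CONTROL_PERIOD_S

print(prediction_steps, control_steps, control_rate_hz)  # 20 4 20.0
```

Under that assumption, each 20 Hz cycle optimizes 4 free control moves over a 20-step prediction window, applies only the first move, and then repeats from the newly measured state.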

8. The four-legged robot nonlinear trajectory planning and obstacle avoidance method for narrow space operation according to claim 1, characterized in that the improved sampling-based planning algorithm is an improved RRT algorithm that incorporates a sampling probability distribution reflecting the OMA constraint in the sampling phase and preferentially selects nodes that satisfy the OMA constraint in the rewiring (reconnection) phase.
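The two modifications in claim 8 can be sketched as a biased sampler and a parent-selection rule. This Python sketch makes several assumptions that the claim does not specify: the OMA constraint is approximated by body-frame forward and lateral speed limits, the bias is implemented by rejection sampling, and all names and numeric values are hypothetical.

```python
import math
import random
from typing import List, Tuple

State = Tuple[float, float, float]            # (x, y, heading) of a tree node
Bounds = Tuple[Tuple[float, float], Tuple[float, float]]


def oma_feasible(dx: float, dy: float, theta: float, dt: float,
                 max_forward: float = 0.8, max_lateral: float = 0.2) -> bool:
    """Check a world-frame step against assumed body-frame speed limits."""
    forward = (dx * math.cos(theta) + dy * math.sin(theta)) / dt
    lateral = (-dx * math.sin(theta) + dy * math.cos(theta)) / dt
    return abs(forward) <= max_forward and abs(lateral) <= max_lateral


def sample_point(near: State, bounds: Bounds, rng: random.Random,
                 step: float = 0.05, dt: float = 0.1,
                 bias: float = 0.8, max_tries: int = 100) -> Tuple[float, float]:
    """Sample a target; with probability `bias`, require an OMA-feasible first step."""
    (x_min, x_max), (y_min, y_max) = bounds
    x, y, theta = near
    enforce = rng.random() < bias
    candidate = (rng.uniform(x_min, x_max), rng.uniform(y_min, y_max))
    for _ in range(max_tries):
        dist = math.hypot(candidate[0] - x, candidate[1] - y)
        if dist > 0.0:
            dx = step * (candidate[0] - x) / dist
            dy = step * (candidate[1] - y) / dist
            if not enforce or oma_feasible(dx, dy, theta, dt):
                return candidate
        candidate = (rng.uniform(x_min, x_max), rng.uniform(y_min, y_max))
    return candidate  # fall back to a plain uniform sample


def pick_rewire_parent(candidates: List[Tuple[float, bool]]) -> Tuple[float, bool]:
    """Among (cost, oma_ok) candidates, prefer OMA-feasible ones, then lowest cost."""
    return min(candidates, key=lambda c: (not c[1], c[0]))
```

Keeping the bias below 1 leaves some purely uniform samples in the mix, so the tree can still explore regions that are reachable only after the robot changes its heading.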