Method and system for controlling automatic climbing of steel-transmission-tower climbing robot

By acquiring environmental data and status information of power transmission towers, and combining point cloud processing and deep learning algorithms, an optimized obstacle-crossing strategy is generated, which solves the problem of insufficient obstacle recognition and landing point accuracy of climbing robots, and realizes efficient and stable automatic climbing.

WO2026137615A1 · PCT designated stage · Publication Date: 2026-07-02 · SHANGHAI PLATFORM FOR SMART MFG CO LTD

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
SHANGHAI PLATFORM FOR SMART MFG CO LTD
Filing Date
2025-03-18
Publication Date
2026-07-02

AI Technical Summary

Technical Problem

Existing power transmission tower climbing robots have shortcomings in obstacle recognition and landing point accuracy, which affect their stability and safety.

Method used

By acquiring robot environmental data and state information, performing point cloud processing and straight-line tracking, and combining Markov decision models and deep deterministic policy gradient methods, an optimized obstacle-crossing strategy is generated and mapped onto the robot's tree framework to achieve automatic climbing.

Benefits of technology

It improves the climbing efficiency and stability of robots in complex environments, reduces safety risks, and ensures the safety and accuracy of the climbing process.

✦ Generated by Eureka AI based on patent content.


Abstract

Provided in the present invention are a method and system for controlling the automatic climbing of a steel-transmission-tower climbing robot. The method comprises: acquiring environmental data and state information of a robot; performing point cloud processing on the environmental data, so as to acquire information regarding the position and size of an obstacle; performing line tracking on the environmental data, and determining and adjusting a foothold position; on the basis of the obstacle information, the foothold position and the state information of the robot, establishing a Markov decision process model; optimizing an obstacle-crossing policy by means of a deep deterministic policy gradient (DDPG) method; and mapping the optimized obstacle-crossing policy to a behavior tree framework inside the robot, and controlling the robot to act so as to complete an automatic climbing task. The present invention can generate an adaptive climbing policy by means of environmental data in combination with reinforcement learning, thereby realizing the efficient climbing of a robot in a complex environment.

Description

A control method and system for automatic climbing of a power transmission tower climbing robot

Technical Field

[0001] This invention relates to the field of robot automatic control technology, specifically to a control method and system for automatic climbing of a power transmission tower climbing robot.

Background Technology

[0002] Transmission towers, as an indispensable part of the power system, bear the vital responsibility of transmitting electricity. They are distributed across a vast geographical area, ensuring a stable power supply. However, the daily inspection and maintenance of these towers is an extremely labor-intensive and dangerous task. Workers need to conduct inspections at high altitudes, facing harsh weather conditions and complex terrain, which not only tests their physical fitness but also poses a challenge to their psychological endurance. Therefore, to ensure the safety of workers and improve work efficiency, the development and application of advanced inspection technologies and equipment are particularly important.

[0003] Transmission tower climbing robots can replace traditional manual labor in the inspection and maintenance of power towers, avoiding the need for technicians to work in inclement weather or complex tower truss environments. Furthermore, detachable fall arrestors can be installed on the towers, significantly reducing the risk of technicians falling compared to traditional safety rope climbing methods, thus improving the safety of tower maintenance operations. However, during the climbing process, the robot's stability and safety are significantly affected by the accuracy of obstacle recognition and footholds. Therefore, further improvements in these areas are necessary.

Summary of the Invention

[0004] To address the shortcomings of existing technologies, the purpose of this invention is to provide a control method and system for the automatic climbing of power transmission tower climbing robots.

[0005] According to one aspect of the present invention, a control method for automatic climbing of a power transmission tower climbing robot is provided, comprising:

[0006] Acquiring environmental data and state information of the robot;

[0007] Performing point cloud processing on the environmental data to obtain the position and size of obstacles;

[0008] Performing straight-line tracking on the environmental data and, in conjunction with the state information, determining and adjusting the landing position;

[0009] Establishing a Markov decision model based on the obstacle information, the landing position, and the robot's state information;

[0010] Using the Markov decision model as a framework, obtaining an optimized obstacle-crossing strategy through the deep deterministic policy gradient method;

[0011] Mapping the optimized obstacle-crossing strategy onto the behavior tree framework inside the robot, and controlling the robot to complete the automatic climbing task.

[0012] Preferably, the acquisition of robot environmental data and status information includes:

[0013] Pre-collect 3D point cloud data of the tower;

[0014] Real-time acquisition of 3D point cloud data of towers, obstacles, and background;

[0015] Pre-acquisition of images of the main material of the iron tower;

[0016] Real-time acquisition of images of the main materials of the iron tower;

[0017] Real-time measurement of the distance between obstacles and the landing position;

[0018] The acquisition of robot status information includes:

[0019] The status information is obtained by communicating with the robot motion controller via a network; the status information refers to pose information, including the position and posture of each foot relative to the robot body.

[0020] Preferably, the environmental data is processed into point cloud data to obtain information on the location and size of obstacles, including:

[0021] The pre-collected point cloud data and the real-time collected point cloud data are segmented.

[0022] For the segmented point cloud data, the bounding box method is used to calculate the geometric information of obstacles;

[0023] By combining real-time collected distance data, the location of obstacles can be determined.

[0024] Preferably, the step of performing straight-line tracking on the environmental data and combining it with the state information to determine and adjust the landing position includes:

[0025] Based on the pre-collected images of the main tower components, determine the straight lines of the main tower components;

[0026] Based on real-time acquired images of the main tower structure, the LSD straight line detection algorithm is used to extract the straight line features of the main tower structure edges.

[0027] Among the extracted straight line features of the main tower material edge, select the straight line with the smallest distance from the pre-collected straight line or the straight line selected in the previous frame to complete the tracking;

[0028] Based on the robot's state information and the selected straight line, the relative position between the moving foot and the main material of the tower, as well as the angle between the robot body and the main material of the tower, are calculated.

[0029] Based on the relative position and the included angle, with the center of the main tower material and the included angle being zero as a reference, the landing position of the moving foot is adjusted.

[0030] Preferably, the Markov decision model is only a qualitative description of the problem, including the definitions of states, actions, and transition probabilities, specifically including:

[0031] Set of states: $S = \{s_1, s_2, \dots, s_n\}$ represents the set of all possible states of the robot, where $s_i$ denotes the i-th state;

[0032] Action set: $A = \{a_1, a_2, \dots, a_m\}$ represents the set of all possible actions of the robot, where $a_j$ denotes the j-th action;

[0033] Transition probability: P(s′|s,a) represents the probability of transitioning to state s′ after performing action a in state s;

[0034] Reward function: R(s,a,s′) represents the immediate reward obtained when performing action a in state s and transitioning to state s′.

[0035] Preferably, the step of obtaining the optimized obstacle-crossing strategy using the Markov decision model as a framework and the deep deterministic policy gradient method includes:

[0036] An environment network, an actor network, and a critic network are constructed. The environment network is a tower model, including the tower and obstacles on its main structure. The positions and sizes of the obstacles are randomly generated during the learning process. Both the actor network and the critic network use a dual network of real-time network and target network, and employ policy gradient optimization.

[0037] The environment network provides the robot state $s_t$; the Critic network evaluates whether the current action $a_t$ can overcome the obstacle; when it is determined that the action can be executed, the Actor network executes the current action $a_t$, and the reward $r_t$ and the resulting state $s_{t+1}$ are obtained from the environment network;

[0038] The experience data $(s_t, a_t, r_t, s_{t+1})$ generated by the interaction between the Actor network and the environment network is stored in the experience pool R, and a batch of data samples is then drawn from the experience pool R to train and optimize the Critic network.

[0039] Preferably, the behavior tree framework includes:

[0040] Execute the environmental perception node to acquire environmental obstacle information and convert it into obstacle queue data;

[0041] The autonomous decision-making node is executed to obtain the optimized obstacle-crossing strategy and generate a robot action sequence.

[0042] Execute the action sequence node, and control each joint of the robot to complete the action process according to the action sequence.

[0043] According to a second aspect of the present invention, a control system for automatic climbing of a power transmission tower climbing robot is provided, comprising:

[0044] Data acquisition module: Acquires robot environmental data and status information;

[0045] The perception module performs point cloud processing on the environmental data to obtain information on the location and size of obstacles; it performs straight-line tracking on the environmental data and, in conjunction with the state information, determines and adjusts the landing position.

[0046] Autonomous decision-making module: Based on the obstacle information, the landing position and the robot's state information, a Markov decision model is established; using the Markov decision model as a framework, an optimized obstacle-crossing strategy is obtained through the deep deterministic policy gradient method;

[0047] Execution module: Maps the optimized obstacle-crossing strategy onto the behavior tree framework inside the robot, controlling the robot to complete the automatic climbing task.

[0048] Preferably, the data acquisition module includes:

[0049] A binocular camera is used to acquire 3D point cloud data of the tower obstacle and calculate the size and position of the obstacle;

[0050] Industrial cameras are used to acquire images of the main structure of the iron tower, extract the straight line features of the main structure's edges, and display environmental images in real time.

[0051] Laser rangefinders are used to measure the distance between an obstacle and a landing point.

[0052] Preferably, the industrial camera is fixed next to the moving foot and is parallel to the frame.

[0053] Compared with the prior art, the embodiments of the present invention have at least one of the following beneficial effects:

[0054] The control method and system for automatic climbing of the power transmission tower climbing robot in this invention can generate an adaptive climbing strategy by combining environmental data with reinforcement learning, thereby enabling the robot to climb efficiently in complex environments.

[0055] The control method and system for automatic climbing of the power transmission tower climbing robot in this invention have precise perception capabilities. Its point cloud processing and straight line tracking ensure the accuracy of environmental information acquisition and provide high-quality data support for decision-making.

[0056] The control method and system for automatic climbing of the power transmission tower climbing robot in this invention have intelligent decision-making capabilities. The obstacle-crossing strategy generated by its reinforcement learning algorithm is adaptive and can achieve optimal motion planning in different environments.

[0057] The control method and system for automatic climbing of the power transmission tower climbing robot according to the present invention have high reliability and safety performance. By accurately adjusting the landing position, it significantly improves the stability of the robot's climbing action and reduces safety risks.

Attached Figure Description

[0058] Other features, objects, and advantages of the present invention will become more apparent from the following detailed description of non-limiting embodiments with reference to the accompanying drawings:

[0059] Figure 1 is a flowchart of a control method for automatic climbing of a power transmission tower climbing robot according to an embodiment of the present invention;

[0060] Figure 2 is a structural diagram of a tower climbing robot according to an embodiment of the present invention;

[0061] Figure 3 is a point cloud processing framework diagram in one embodiment of the present invention;

[0062] Figure 4 is a framework diagram of the deep deterministic policy gradient (DDPG) method in one embodiment of the present invention;

[0063] Figure 5 is a schematic diagram of the behavior tree of the automatic climbing process in one embodiment of the present invention;

[0064] Figure 6 is a framework diagram of the data acquisition module of the tower climbing robot in one embodiment of the present invention;

[0065] Figure 7 shows the installation position of the tower climbing monitoring robot sensor in one embodiment of the present invention;

[0066] In Figure 2: 100 represents the robot's first leg, 200 represents the robot's main frame, and 300 represents the robot's second leg.

Detailed Implementation

[0067] The present invention will now be described in detail with reference to specific embodiments. These embodiments will help those skilled in the art to further understand the present invention, but do not limit the invention in any way. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of the present invention. These all fall within the scope of protection of the present invention.

[0068] In one embodiment of the present invention, a control method for automatic climbing of a power transmission tower climbing robot is provided, as shown in Figure 1. The main process is as follows:

[0069] S100: acquire environmental data and state information of the robot;

[0070] S200: perform point cloud processing on the environmental data obtained in S100 to obtain the position and size of obstacles;

[0071] S300: perform straight-line tracking on the environmental data obtained in S100, and determine and adjust the landing position in conjunction with the state information obtained in S100;

[0072] S400: establish a Markov decision model based on the obstacle information obtained in S200, the landing position obtained in S300, and the robot's state information obtained in S100;

[0073] S500: based on the Markov decision model established in S400, obtain the optimized obstacle-crossing strategy through the deep deterministic policy gradient method;

[0074] S600: map the optimized obstacle-crossing strategy obtained in S500 onto the behavior tree framework inside the robot, and control the robot to complete the automatic climbing task.

[0075] The control method described in the above embodiments enables the robot to automatically climb along the main structure of the iron tower.

[0076] The robot used in this embodiment of the invention is shown in Figure 2, including a robot body 200 (also called a frame), and a first moving leg 100 and a second moving leg 300 located on the robot body.

[0077] In one embodiment of the present invention, step S100 is performed to acquire environmental data and robot status information.

[0078] The environmental data includes 3D point cloud data of the tower obstacle, which is used to obtain the size of the obstacle; it also includes the precise distance between the obstacle on the tower and the upper or lower foot, which is used to combine with the data from the binocular camera to obtain complete obstacle information, namely the obstacle's location and size.

[0079] The environmental data also includes images of the main tower structure, used to provide data for subsequent automatic adjustment of the foot's landing position. Alternatively, the tower's environmental images can be displayed to operators in real time to assist manual decision-making in some embodiments where manual mode is used.

[0080] In some specific embodiments, a binocular camera can be used to acquire three-dimensional point cloud data of the tower obstacle and calculate the size and position of the obstacle; an industrial camera can be used to acquire images of the main tower material, extract the straight line features of the main material edge, and display the environmental image in real time; a laser ranging device can be used to measure the distance between the obstacle and the footing position.

[0081] Based on the environmental data of the robot collected in the above embodiments, in a preferred embodiment, step S200 is implemented as shown in Figure 3, performing point cloud processing on the environmental data to obtain information on the location and size of obstacles. Specifically, the following steps can be adopted:

[0082] S201, performs segmentation processing on pre-collected point cloud data and real-time point cloud data;

[0083] The collected point cloud data includes the main structure of the transmission tower (the main structure of the power transmission line, shaped like an L-shaped angle iron column, along which the robot climbs), obstacles, and other background interference. To calculate the size of the obstacles on the tower, the corresponding parts of the obstacles need to be extracted from the point cloud, excluding the point cloud of the main tower structure and the background point cloud. Analysis of the robot's climbing posture on the tower shows that the point cloud data to be retained is the data on the side of the plane containing the main tower structure closest to the robot. Therefore, it is necessary to obtain the equation of the plane containing the main tower structure in the binocular camera coordinate system, using this equation as a boundary to discard the point cloud data on the side furthest from the robot. Since the relative position of the binocular camera and the robot is fixed, it can be assumed that the relative position of the plane containing the main tower structure and the binocular camera remains essentially unchanged at least during a single climbing task. Therefore, the plane equation of the main tower structure surface can be obtained through data pre-acquisition before the robot begins automatic climbing. After obtaining the plane coordinates, the point cloud can be segmented.
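The plane-based segmentation described above can be sketched as follows. This is a minimal illustration, assuming the pre-acquired plane is stored as coefficients (a, b, c, d) of a*x + b*y + c*z + d = 0 in the binocular camera frame, with the normal oriented toward the robot; the function names and the `margin` parameter are hypothetical and do not come from the source.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]
Plane = Tuple[float, float, float, float]  # (a, b, c, d): a*x + b*y + c*z + d = 0


def signed_distance(p: Point, plane: Plane) -> float:
    """Signed distance from p to the plane; the sign tells which side p lies on."""
    a, b, c, d = plane
    norm = (a * a + b * b + c * c) ** 0.5
    return (a * p[0] + b * p[1] + c * p[2] + d) / norm


def keep_robot_side(points: List[Point], plane: Plane, margin: float = 0.0) -> List[Point]:
    """Keep only points on the robot side of the main-material plane.

    Assumption: the plane normal (a, b, c) points toward the robot, so
    robot-side points have a positive signed distance. A positive `margin`
    also drops points lying on the main-material surface itself.
    """
    return [p for p in points if signed_distance(p, plane) > margin]
```

For example, with the camera at the origin and the main-material surface at z = 1 m, the plane (0, 0, -1, 1) keeps a point at z = 0.8 m and discards points at z = 1.0 m and z = 1.5 m when `margin=0.01`.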

[0084] S202, using the bounding box algorithm to calculate the geometric information of obstacles;

[0085] After preprocessing the segmented point cloud, such as denoising, the size of the obstacle in the current depth image can be obtained by calculating the minimum bounding box of the point cloud.
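As a rough illustration of the bounding box step, the sketch below computes an axis-aligned bounding box of the segmented obstacle points. This is a simplification chosen for brevity: the text refers to a minimum bounding box, which may be oriented, and the function name is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def axis_aligned_bbox(points: List[Point]) -> Tuple[Tuple[float, ...], Tuple[float, ...]]:
    """Return (size, center) of the axis-aligned box enclosing the points.

    Assumption: points are already segmented and denoised, and are expressed
    in the camera frame, so the box axes follow the camera axes.
    """
    if not points:
        raise ValueError("cannot compute a bounding box of an empty point cloud")
    xs, ys, zs = zip(*points)
    lo = (min(xs), min(ys), min(zs))
    hi = (max(xs), max(ys), max(zs))
    size = tuple(h - l for h, l in zip(hi, lo))
    center = tuple((h + l) / 2 for h, l in zip(hi, lo))
    return size, center
```

The returned size gives the obstacle extent along each camera axis for the current frame only.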

[0086] It should be noted that during the lifting process of the robot frame (robot body), the depth image obtained by the binocular camera will change as the frame position changes, and the calculated obstacle size will also change. The actual size of the obstacle can be obtained by analyzing the changes in these two factors.

[0087] S203, combined with laser ranging data, determines the location of the obstacle.

[0088] In the above embodiments, steps S201-S203 perform point cloud processing on the collected environmental data, which can ensure the accuracy of environmental information acquisition and provide high-quality data support for subsequent control decisions.

[0089] Furthermore, based on the robot's environmental data collected in the above embodiments, in a preferred embodiment, step S300 is implemented as shown in Figure 3, performing straight-line tracking on the environmental data and determining and adjusting the landing position in conjunction with the robot's state information. Specifically, the following steps can be adopted:

[0090] S301 tracks the straight line of the main material edge in real time based on the pre-collected main material straight line parameters.

[0091] The main material identification of power transmission towers adopts the LSD straight line detection algorithm, which is an algorithm for detecting straight line segments in an image. It determines edge points by calculating the image gradient, attempts to construct straight line segments based on the edge points and verifies them, and finally returns a list of all straight line segments.

[0092] Images obtained by industrial cameras are easily affected by background interference and cannot identify the straight lines of the main material's edges. Therefore, data pre-acquisition is also required. Specifically, during pre-acquisition, a whiteboard or white paper is placed behind the main material to eliminate interference. The two longest straight lines in the image at this time, with an angle of less than 15° between them and a closest distance of more than 10cm, are selected and their parameters are recorded as the straight line parameters for pre-acquisition.

[0093] When the robot starts moving, it needs to track the straight lines along the edge of the main tower structure based on the pre-acquired straight line parameters. In the (N+1)th frame image, it needs to find the straight line with the smallest difference from the straight line found in the Nth frame among the straight lines detected by the LSD straight line detection method. Of course, the first frame uses the pre-acquired straight line parameters as the tracking standard.
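The frame-to-frame selection described above can be sketched as a nearest-line search. The source does not define how the "difference" between two lines is measured, so the metric here (a weighted sum of the angle difference and the offset difference) and the `angle_weight` value are assumptions for illustration; segments are assumed to arrive as pixel endpoints from a line detector such as LSD.

```python
import math
from typing import List, Tuple

Segment = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def line_params(seg: Segment) -> Tuple[float, float]:
    """Convert a segment to (theta, rho): direction in [0, pi) and signed offset."""
    x1, y1, x2, y2 = seg
    theta = math.atan2(y2 - y1, x2 - x1) % math.pi
    nx, ny = -math.sin(theta), math.cos(theta)  # unit normal to the line
    rho = x1 * nx + y1 * ny                     # offset of the line from the origin
    return theta, rho


def line_difference(a: Segment, b: Segment, angle_weight: float = 100.0) -> float:
    """Hypothetical difference metric between two lines (smaller is closer)."""
    ta, ra = line_params(a)
    tb, rb = line_params(b)
    dtheta = abs(ta - tb)
    if dtheta > math.pi / 2:
        # Directions wrapped past pi: the normals point in opposite directions.
        dtheta = math.pi - dtheta
        rb = -rb
    return angle_weight * dtheta + abs(ra - rb)


def track_line(detected: List[Segment], reference: Segment) -> Segment:
    """Pick the detected segment closest to the reference (previous frame or pre-acquired)."""
    if not detected:
        raise ValueError("no line segments detected in this frame")
    return min(detected, key=lambda seg: line_difference(seg, reference))
```

In use, the reference for the first frame would be the pre-acquired line, and the result of each call becomes the reference for the next frame.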

[0094] S302, based on the robot's state information and the straight line selected in S301, calculates the relative position between the moving foot and the main structure of the tower, as well as the angle between the robot body and the main structure of the tower. Regarding the relative position: S301 detects the position of the straight edge of the main structure, and S100 has already obtained the robot's state information, which includes the position and orientation of each foot relative to the robot body, so the relative position between the moving foot and the main structure can be obtained. Regarding the angle: as shown in Figure 7, the robot has two industrial cameras, and in S301 each camera detects the edge position of the main structure. The angle θ = acos((d1 - d2) / L) can then be calculated from the positional deviations (d1, d2) detected by the two cameras and the installation distance L between the cameras.

[0095] S303, based on the included angle and relative position obtained in S302, adjusts the moving foot, taking the center of the main material and an included angle of zero as the reference, so that the deviation from both is as small as possible.

[0096] In a preferred embodiment of the present invention, step S400 involves establishing a Markov decision model (MDP) for the robot's obstacle-crossing process based on obstacle information acquired by the environmental perception module and robot state information provided by the robot control system. This step abstracts the relationship between actions and state transitions during the robot's climbing process into a model, providing a foundation for strategy generation. This decision model is merely a description of the problem, defining the current state and actions of the system, specifically:

[0097] Set of states: $S = \{s_1, s_2, \dots, s_n\}$ represents the set of all possible states of the robot, where $s_i$ denotes the i-th state;

[0098] Action set: $A = \{a_1, a_2, \dots, a_m\}$ represents the set of all possible actions of the robot, where $a_j$ denotes the j-th action;

[0099] Transition probability: P(s′|s,a) represents the probability of transitioning to state s′ after performing action a in state s;

[0100] Reward function: R(s,a,s′) represents the immediate reward obtained when performing action a in state s and transitioning to state s′.
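The four elements above can be held in a small container, sketched below for a finite toy problem. The class name, the state and action labels, and the numeric probabilities and rewards in the usage note are all hypothetical; the source does not enumerate concrete states, actions, or reward values.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MDP:
    states: List[str]
    actions: List[str]
    # transitions[(s, a)] maps each next state s' to P(s' | s, a)
    transitions: Dict[Tuple[str, str], Dict[str, float]]
    # rewards[(s, a, s')] is the immediate reward R(s, a, s')
    rewards: Dict[Tuple[str, str, str], float]

    def validate(self) -> None:
        """Check that every transition distribution sums to one."""
        for (s, a), dist in self.transitions.items():
            total = sum(dist.values())
            if abs(total - 1.0) > 1e-9:
                raise ValueError(f"P(.|{s},{a}) sums to {total}, expected 1")

    def expected_reward(self, s: str, a: str) -> float:
        """Expected immediate reward of taking action a in state s."""
        dist = self.transitions[(s, a)]
        return sum(p * self.rewards.get((s, a, s2), 0.0) for s2, p in dist.items())
```

As a toy usage, a two-state model with states "below_obstacle" and "above_obstacle" and a single action "step_over" that succeeds with probability 0.9 (reward 1) and otherwise stays in place (reward -1) has an expected immediate reward of 0.8 for that action.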

[0101] Using the Markov Decision Model (MDP) in the above embodiments as a framework, in a preferred embodiment of the present invention, step S500 is implemented, in which reinforcement learning optimization utilizes the Deep Deterministic Policy Gradient (DDPG) algorithm to iteratively optimize the obstacle-crossing strategy. Through policy evaluation, the optimal obstacle-crossing strategy for the robot in a complex obstacle environment is obtained. Specifically, as shown in Figure 4, the Deep Deterministic Policy Gradient (DDPG) method in this embodiment uses the Actor (policy network)-Critic (value network) method as its basic framework, approximates the policy and action value functions through a deep learning network, and uses stochastic gradient descent to train the parameters in the policy network and value network models.

[0102] Both the policy network and the value network utilize a dual neural network model consisting of a real-time network and a target network. Additionally, an experience replay mechanism is employed: the experiential data generated by the interaction between the Actor (policy network) and the environment is stored in an experience pool, and a batch of data samples is then extracted for training, making the algorithm more likely to converge.
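The experience replay mechanism can be sketched as a bounded buffer with uniform random sampling. The fixed capacity and the optional seed are assumptions added for illustration; the source only states that transitions are stored in a pool and sampled in batches.

```python
import random
from collections import deque
from typing import Deque, List, Optional, Tuple

Transition = Tuple[object, object, float, object]  # (s_t, a_t, r_t, s_{t+1})


class ReplayBuffer:
    """Experience pool R: stores transitions and returns random mini-batches."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        # Assumption: once full, the oldest transitions are discarded first.
        self._buffer: Deque[Transition] = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def store(self, s: object, a: object, r: float, s_next: object) -> None:
        self._buffer.append((s, a, r, s_next))

    def sample(self, batch_size: int) -> List[Transition]:
        """Draw batch_size distinct transitions uniformly at random."""
        return self._rng.sample(list(self._buffer), batch_size)

    def __len__(self) -> int:
        return len(self._buffer)
```

Sampling at random rather than in arrival order breaks the correlation between consecutive transitions, which is the usual reason replay helps training converge.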

[0103] In some specific embodiments, S500 may employ the following specific process:

[0104] S501, Constructing the DDPG framework: Constructing sequentially connected environment networks, actor networks, and critic networks; as well as experience pools between actor networks and critic networks;

[0105] S502, Initialization: initialize the Critic network $Q(s, a \mid \theta^{Q})$ and the Actor network $\mu(s \mid \theta^{\mu})$ with weights $\theta^{Q}$ and $\theta^{\mu}$; initialize the weight parameters $\theta^{Q'}$ and $\theta^{\mu'}$ of the target network as $\theta^{Q'} = \theta^{Q}$, $\theta^{\mu'} = \theta^{\mu}$ (here, the target network refers to the entire DDPG network); initialize the experience pool R (random); initialize the random noise and obtain the initial state $s_1$ of the environment network;

[0106] S503, Generate action: for the current state $s_t$, generate the action $a_t$ using the Actor network, $a_t = \mu(s_t \mid \theta^{\mu}) + N_t$, where $N_t$ denotes noise;

[0107] S504, Execute action: execute action $a_t$ using the Actor network;

[0108] S505, Action feedback: obtain the reward $r_t$ and the state $s_{t+1}$ from the environment network;

[0109] S506, Update experience pool: store $(s_t, a_t, r_t, s_{t+1})$ in R;

[0110] S507, randomly select N samples $(s_i, a_i, r_i, s_{i+1})$ from R;

[0111] S508, update the Critic network by minimizing the loss function L, with the target value $y_i = r_i + \gamma Q'(s_{i+1}, \mu'(s_{i+1} \mid \theta^{\mu'}) \mid \theta^{Q'})$;

[0112] S509, updating the Actor policy network using policy gradients:

[0113] S510, update the target network: $\theta^{Q'} = \tau \theta^{Q} + (1 - \tau) \theta^{Q'}$, $\theta^{\mu'} = \tau \theta^{\mu} + (1 - \tau) \theta^{\mu'}$.

[0114] Cycle S503-S510.
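The text as reproduced here does not spell out the loss L of S508 or the gradient expression of S509. For orientation, the standard DDPG forms from the reinforcement-learning literature are given below in the notation of S502 to S510. They are offered as a reference under that assumption and may differ from the exact expressions of this embodiment.

```latex
% Target value for each sampled transition (S508)
y_i = r_i + \gamma \, Q'\!\left(s_{i+1}, \mu'(s_{i+1} \mid \theta^{\mu'}) \mid \theta^{Q'}\right)

% Critic loss over the N sampled transitions (S508)
L = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - Q(s_i, a_i \mid \theta^{Q}) \right)^2

% Sampled deterministic policy gradient for the Actor (S509)
\nabla_{\theta^{\mu}} J \approx \frac{1}{N} \sum_{i=1}^{N}
  \nabla_{a} Q(s, a \mid \theta^{Q}) \Big|_{s = s_i,\; a = \mu(s_i)}
  \, \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu}) \Big|_{s = s_i}
```

Here $\gamma$ is the discount factor and $\tau$ in S510 is the soft-update rate, usually chosen much smaller than 1 so that the target networks change slowly.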

[0115] In the above embodiments, the obstacle-crossing strategy generated by the reinforcement learning algorithm is adaptive and can achieve optimal action planning in different environments.

[0116] In a preferred embodiment of the present invention, in step S600, the optimized strategy is transmitted to the control software in list form and mapped to the robot behavior tree structure. The robot is guided to perform specific actions based on the node structure of the behavior tree to complete the obstacle-crossing climbing task.

[0117] The climbing process is executed based on a behavior tree framework, as shown in Figure 5.

[0118] First, the environmental perception node is executed to obtain environmental obstacle information and convert it into obstacle queue data.

[0119] Next, the autonomous decision-making node is executed to obtain the robot action sequence generated using the DDPG decision model.

[0120] Finally, the action sequence node is executed, and the robot's joints are controlled to complete the action process according to the action sequence.

[0121] The robot action sequence generated using the DDPG decision model includes raising the upper or lower leg to a designated position and raising the entire frame to the bottom. The single-leg movement strategy corresponds to a command sequence: single-leg demagnetization, single-leg retraction, single-leg adjustment (obtaining adjustment parameters visually, rotating the fixed leg, and rotating the moving leg), single-leg raising to a designated position, single-leg extension, single-leg readjustment (obtaining adjustment parameters visually, rotating the fixed leg, and rotating the moving leg), and single-leg de-energization. The command sequence corresponding to the overall lifting is to raise the entire robot to a designated position.
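The three-node flow of the behavior tree can be sketched with a minimal sequence node. This is an illustrative skeleton only: the node classes, the blackboard keys, the obstacle values, and the action names are hypothetical, and the `decide` function is a stub standing in for the trained DDPG policy.

```python
from typing import Callable, Dict, List

SUCCESS = "SUCCESS"
FAILURE = "FAILURE"
Blackboard = Dict[str, object]


class Action:
    """Leaf node wrapping a function that returns True on success."""

    def __init__(self, name: str, fn: Callable[[Blackboard], bool]) -> None:
        self.name = name
        self.fn = fn

    def tick(self, bb: Blackboard) -> str:
        return SUCCESS if self.fn(bb) else FAILURE


class Sequence:
    """Ticks children in order and stops at the first failure."""

    def __init__(self, children: List[Action]) -> None:
        self.children = children

    def tick(self, bb: Blackboard) -> str:
        for child in self.children:
            if child.tick(bb) == FAILURE:
                return FAILURE
        return SUCCESS


def perceive(bb: Blackboard) -> bool:
    # Placeholder perception result: one obstacle with made-up values.
    bb["obstacle_queue"] = [{"position_mm": 350.0, "size_mm": 40.0}]
    return True


def decide(bb: Blackboard) -> bool:
    # Stub for the policy: fails if perception has not run yet.
    if bb.get("obstacle_queue") is None:
        return False
    bb["action_sequence"] = ["move_upper_leg", "lift_frame", "move_lower_leg"]
    return True


def execute(bb: Blackboard) -> bool:
    # A real system would send joint commands; here the sequence is just recorded.
    bb["executed"] = list(bb["action_sequence"])  # type: ignore[arg-type]
    return True


def build_climb_tree() -> Sequence:
    return Sequence([
        Action("environment_perception", perceive),
        Action("autonomous_decision", decide),
        Action("action_sequence", execute),
    ])
```

Calling `build_climb_tree().tick({})` runs perception, decision, and execution in order; if the decision node is ticked without perception data, the sequence reports failure and no action is executed.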

[0122] Based on the same inventive concept, other embodiments of the present invention provide a control system for automatic climbing of a power transmission tower climbing robot, comprising:

[0123] Data acquisition module: Acquires robot environmental data and status information;

[0124] Perception module: Performs point cloud processing on environmental data to obtain information on the location and size of obstacles; performs straight-line tracking on environmental data, and combines state information to determine and adjust the landing position;

[0125] Autonomous decision-making module: Based on obstacle information, landing position and robot state information, a Markov decision model is established; using the Markov decision model as a framework, the optimized obstacle-crossing strategy is obtained through the deep deterministic policy gradient method;

[0126] Execution module: Maps the optimized obstacle-crossing strategy onto the behavior tree framework inside the robot, controlling the robot to complete the automatic climbing task.

[0127] The specific implementation techniques of each module / unit in the above examples of the present invention can be referred to the steps of the control method for automatic climbing of the transmission tower climbing robot in the above embodiments, and will not be repeated here.

[0128] In a preferred embodiment, as shown in Figure 6, the data acquisition module uses hardware including a binocular camera, an industrial camera, and a laser rangefinder. The binocular camera is used to acquire 3D point cloud data of the tower obstacle and calculate its size and position; the industrial camera is used to acquire images of the tower's main structure, extract the straight-line features of its edges, and display the environmental image in real time; the laser rangefinder is used to measure the distance between the obstacle and the footing position.

[0129] In some specific embodiments, the laser rangefinder has a measuring range of 1200 mm with an error of approximately ±2 mm. In other specific embodiments, because the industrial camera that captures images of the main tower structure must be parallel to the main tower structure and able to capture the landing of the foot, the industrial camera can be fixed next to the moving foot and parallel to the frame, as shown in Figure 7.
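With the camera mounted parallel to the frame, the tracked edge line in the image can be used to estimate how far the robot body deviates from the main structure. The sketch below selects, from a set of detected segments, the one closest to a reference line (the pre-collected line or the one chosen in the previous frame) and computes its angle from the image vertical. The distance measure (midpoint to reference line), the pixel coordinate convention, and all function names are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Segment = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def point_to_line_distance(px: float, py: float, seg: Segment) -> float:
    """Perpendicular distance from a point to the infinite line through seg."""
    x1, y1, x2, y2 = seg
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("degenerate segment")
    return abs(dy * (px - x1) - dx * (py - y1)) / length


def track_line(candidates: List[Segment], reference: Segment) -> Segment:
    """Pick the candidate whose midpoint lies closest to the reference line."""
    def cost(seg: Segment) -> float:
        mx, my = (seg[0] + seg[2]) / 2.0, (seg[1] + seg[3]) / 2.0
        return point_to_line_distance(mx, my, reference)
    return min(candidates, key=cost)


def angle_from_vertical_deg(seg: Segment) -> float:
    """Angle between the segment and the image vertical, in (-90, 90] degrees."""
    x1, y1, x2, y2 = seg
    angle = math.degrees(math.atan2(x2 - x1, y2 - y1))
    # A line has no direction, so fold opposite orientations together.
    if angle > 90.0:
        angle -= 180.0
    elif angle <= -90.0:
        angle += 180.0
    return angle
```

An angle of zero then corresponds to the zero-degree reference, and `point_to_line_distance` applied to the image position of the moving foot gives its offset from the tracked edge.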

[0130] In other specific embodiments, the autonomous decision-making module and the execution module are installed in the control software. That is, the control software receives the information obtained by the perception module and generates control commands to guide the robot to complete the climbing task.
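The flow inside the control software can be sketched as a small behavior tree in which a sequence node runs a perception node, a decision node, and an action node in order. This is a minimal sketch under stated assumptions: the blackboard keys, the node functions, and in particular the placeholder policy inside `decide` are hypothetical and stand in for the trained DDPG policy and the real motion interface.

```python
from enum import Enum
from typing import Callable, Dict, List

Blackboard = Dict[str, object]


class Status(Enum):
    SUCCESS = 1
    FAILURE = 2


Node = Callable[[Blackboard], Status]


def sequence(children: List[Node]) -> Node:
    """Sequence node: run children in order, stop at the first failure."""
    def tick(bb: Blackboard) -> Status:
        for child in children:
            if child(bb) is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS
    return tick


def perceive(bb: Blackboard) -> Status:
    # Convert detected obstacles into obstacle queue data.
    bb["obstacle_queue"] = list(bb.get("detected_obstacles", []))
    return Status.SUCCESS


def decide(bb: Blackboard) -> Status:
    queue = bb.get("obstacle_queue")
    if queue is None:
        return Status.FAILURE
    # Placeholder policy (assumption): one single-leg move per obstacle,
    # then one overall lift. A trained policy would replace this line.
    bb["action_sequence"] = ["single_leg_move"] * len(queue) + ["overall_lift"]
    return Status.SUCCESS


def act(bb: Blackboard) -> Status:
    actions = bb.get("action_sequence")
    if not actions:
        return Status.FAILURE
    bb["executed"] = list(actions)  # stand-in for sending joint commands
    return Status.SUCCESS


climb_step = sequence([perceive, decide, act])
```

Ticking `climb_step` with a blackboard that contains `detected_obstacles` fills in the obstacle queue, the action sequence, and the executed actions; if any node fails, the sequence stops and reports failure.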

[0131] Specific embodiments of the present invention have been described above. It should be understood that the present invention is not limited to the specific embodiments described above, and those skilled in the art can make various modifications or variations within the scope of the claims, which do not affect the essence of the present invention. The above preferred features can be used in any combination without conflict.

Claims

1. A control method for automatic climbing of a power transmission tower climbing robot, characterized in that it comprises: acquiring robot environmental data and state information; performing point cloud processing on the environmental data to obtain information on the location and size of obstacles; performing line tracking on the environmental data and, in combination with the state information, determining and adjusting the landing position; establishing a Markov decision model based on the obstacle information, the landing position, and the robot's state information; using the Markov decision model as a framework, obtaining an optimized obstacle-crossing strategy through the deep deterministic policy gradient method; and mapping the optimized obstacle-crossing strategy onto the behavior tree framework inside the robot, controlling the robot to complete the automatic climbing task.

2. The control method for automatic climbing of a power transmission tower climbing robot according to claim 1, characterized in that acquiring the robot environmental data comprises: pre-collecting 3D point cloud data of the tower; acquiring in real time 3D point cloud data of the tower, obstacles, and background; pre-collecting images of the main structure of the tower; acquiring in real time images of the main structure of the tower; and measuring in real time the distance between obstacles and the landing position; and acquiring the robot state information comprises: obtaining the state information by communicating with the robot motion controller via a network, the state information including the position and posture of each foot relative to the robot body.

3. The control method for automatic climbing of a power transmission tower climbing robot according to claim 2, characterized in that performing point cloud processing on the environmental data to obtain information on the location and size of obstacles comprises: segmenting the pre-collected 3D point cloud data of the tower and the 3D point cloud data of the tower, obstacles, and background acquired in real time; for the segmented point cloud data, calculating the geometric information of the obstacles, i.e., the pose, using the bounding box method; and determining the location of the obstacle in combination with the distance between the obstacle and the landing position measured in real time.

4. The control method for automatic climbing of a power transmission tower climbing robot according to claim 2, characterized in that performing line tracking on the environmental data and, in combination with the state information, determining and adjusting the landing position comprises: determining the straight line of the main structure of the tower based on the pre-collected images of the main structure; extracting the straight-line features of the edges of the main structure from the images of the main structure acquired in real time, using the LSD (Line Segment Detector) line detection algorithm; among the extracted straight-line features, selecting the line with the smallest distance from the pre-collected line or from the line selected in the previous frame, thereby completing the tracking; calculating, based on the robot's state information and the selected line, the relative position between the moving foot and the main structure of the tower and the angle between the robot body and the main structure of the tower; and adjusting the landing position of the moving foot based on the relative position and the angle, with the center of the main structure and a zero-degree angle as the reference.

5. The control method for automatic climbing of a power transmission tower climbing robot according to claim 1, characterized in that the Markov decision model provides a qualitative description of the robot climbing strategy problem, including the definitions of state, action, and transition probability, specifically including: state set: $S = \{s_1, s_2, \ldots, s_n\}$ represents the set of all possible states of the robot, where $s_i$ denotes the $i$-th state; action set: $A = \{a_1, a_2, \ldots, a_m\}$ represents the set of all possible actions of the robot, where $a_j$ denotes the $j$-th action; transition probability: $P(s' \mid s, a)$ represents the probability of transitioning to state $s'$ after performing action $a$ in state $s$; reward function: $R(s, a, s')$ represents the immediate reward obtained when performing action $a$ in state $s$ and transitioning to state $s'$.

6. The control method for automatic climbing of a power transmission tower climbing robot according to claim 1, characterized in that obtaining an optimized obstacle-crossing strategy through the deep deterministic policy gradient method, using the Markov decision model as a framework, comprises: constructing an environment network, an Actor network, and a Critic network, wherein the environment network is a tower model that includes the tower and the obstacles on its main structure, and the positions and sizes of the obstacles are randomly generated during the learning process; the Actor network and the Critic network each use a dual-network structure consisting of a real-time network and a target network, and employ policy gradient optimization; the environment network provides the robot state $s_t$; the Critic network generates the current action $a_t$ and evaluates whether it can cross the obstacle; when it is determined that the action can be executed, the Actor network executes the current action $a_t$, and the environment network returns the reward $r_t$ and the resulting state $s_{t+1}$; the experience data $(s_t, a_t, r_t, s_{t+1})$ generated by the interaction between the Actor network and the environment network is stored in the experience pool $R$, and a batch of data samples is then drawn from the experience pool $R$ to train and optimize the Critic network.

7. The control method for automatic climbing of a power transmission tower climbing robot according to claim 1, characterized in that the behavior tree framework comprises: executing the environmental perception node to acquire environmental obstacle information and convert it into obstacle queue data; executing the autonomous decision-making node to obtain the optimized obstacle-crossing strategy and generate a robot action sequence; and executing the action sequence node to control each joint of the robot to complete the action process according to the action sequence.

8. A control system for automatic climbing of a power transmission tower climbing robot, characterized in that it comprises: a data acquisition module, which acquires robot environmental data and state information; a perception module, which performs point cloud processing on the environmental data to obtain information on the location and size of obstacles, and performs line tracking on the environmental data and, in combination with the state information, determines and adjusts the landing position; an autonomous decision-making module, which establishes a Markov decision model based on the obstacle information, the landing position, and the robot's state information and, using the Markov decision model as a framework, obtains an optimized obstacle-crossing strategy through the deep deterministic policy gradient method; and an execution module, which maps the optimized obstacle-crossing strategy onto the behavior tree framework inside the robot, controlling the robot to complete the automatic climbing task.

9. The control system for automatic climbing of a power transmission tower climbing robot according to claim 8, characterized in that the data acquisition module comprises: a binocular camera, used to acquire 3D point cloud data of the tower obstacle and calculate the size and position of the obstacle; an industrial camera, used to acquire images of the main structure of the tower, extract the straight-line features of the main structure's edges, and display environmental images in real time; and a laser rangefinder, used to measure the distance between an obstacle and the landing position.

10. The control system for automatic climbing of a power transmission tower climbing robot according to claim 9, characterized in that the industrial camera is fixed next to the moving foot and is parallel to the robot body.