A method and apparatus for determining the tilt of a power transmission tower
By combining UAV image data with attitude and position data, and by using key point detection together with camera calibration parameters, the method determines the tilt of power transmission towers accurately. It addresses the inaccurate tilt calculation of existing technologies and supports high-precision assessment in complex environments such as wildfire smoke.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- ELECTRIC POWER RES INST OF GUANGDONG POWER GRID CO LTD
- Filing Date
- 2026-03-30
- Publication Date
- 2026-06-30
AI Technical Summary
In existing technologies, the methods for determining the tilt of transmission towers are not accurate enough. In particular, in complex environments, it is difficult to overcome visual interference such as wildfire smoke, which makes it impossible to accurately analyze the horizontal projection offset and vertical height difference, thus failing to meet the requirements for high-precision evaluation.
Image data of the power transmission tower and the drone's attitude and position data are acquired. A key point detection model extracts the pixel coordinates of the tower feet and the tower top center. Using camera calibration parameters, stereo matching is performed and the results are converted to absolute three-dimensional coordinates in the world coordinate system. The horizontal projection offset and vertical height difference are then calculated to determine the tilt.
It enables accurate determination of the tilt of transmission towers in complex environments, solves the problems of missing depth information and lack of absolute spatial coordinate reference in single-viewpoint imaging, and ensures high-precision tilt calculation.
Figure CN122306027A_ABST
Description
Technical Field
[0001] This invention relates to the field of computer vision technology, specifically to a method and apparatus for determining the tilt of a power transmission tower.
Background Technology
[0002] The verticality of transmission towers directly affects the safe and stable operation of the power grid. In geographically complex mountainous areas, transmission lines are susceptible to various natural disasters such as landslides, mudslides, strong winds, earthquakes, and wildfires, leading to tower deformation or even tilting. Especially after extreme disasters like wildfires, in addition to the physical effects of altered soil properties, there are often complex visual disturbances such as strong residual smoke obscuring the tower and blackening of the tower surface due to high temperatures. This poses extremely stringent challenges to traditional visual monitoring algorithms. Therefore, a high-precision tilt determination method is urgently needed that can adapt to various disaster scenarios, especially overcoming complex visual disturbances such as wildfire smoke.
[0003] Existing methods for determining the tilt of power transmission towers are not accurate enough. Most existing methods rely on single-viewpoint imagery (i.e., a single image) acquired by UAVs for geometric analysis, but this has two inherent technical limitations: First, the lack of spatial depth information. Single-viewpoint imaging only projects three-dimensional space onto a two-dimensional plane from a single image, missing the depth dimension. This leads to a coupling between the physical deformation of the power transmission tower and perspective distortion from the shooting perspective. Pixel coordinates alone cannot eliminate visual errors introduced by the shooting posture, making it difficult to accurately resolve horizontal projection offsets. Second, the lack of an absolute spatial coordinate reference. Existing calculations are mostly limited to pixel or camera coordinate systems, failing to establish a mapping link between observation data and the world coordinate system. This makes it impossible to effectively integrate UAV attitude and position data for spatial correction. Consequently, the calculation results cannot be restored to absolute three-dimensional coordinates, making it difficult to eliminate flight attitude interference to accurately quantify vertical height differences, and failing to meet the needs of high-precision assessment in complex environments.
Summary of the Invention
[0004] This invention provides a method, apparatus, electronic device, and storage medium for determining the tilt of transmission towers, which can solve the problem of inaccurate determination of the tilt of transmission towers in the prior art.
[0005] An embodiment of the present invention provides a method for determining the tilt of a power transmission tower, comprising: acquiring first image data of the power transmission tower collected by a drone, second image data of the power transmission tower collected by the drone, attitude data of the drone, and position data of the drone; inputting the first image data into a preset key point detection model, so that the model outputs the pixel coordinates of the key points of the power transmission tower, wherein the key points include the tower feet and the tower top center; based on preset camera calibration parameters, performing stereo matching on the second image data according to the pixel coordinates of the tower feet and of the tower top center, to determine their relative three-dimensional coordinates in the camera coordinate system; based on the attitude data and position data of the drone, performing coordinate transformation on those relative three-dimensional coordinates to generate the absolute three-dimensional coordinates of the tower feet and of the tower top center in the world coordinate system; determining the bottom reference center coordinates of the transmission tower from the absolute three-dimensional coordinates of the tower feet, and calculating the horizontal projection offset and vertical height difference of the tower top center relative to the bottom reference center coordinates; and determining the tilt of the power transmission tower after the wildfire based on the horizontal projection offset and the vertical height difference.
[0006] Furthermore, the pre-defined keypoint detection model is trained in the following manner: Obtain a 3D model of the power transmission tower; Based on the 3D model of the power transmission tower, preset color space offset parameters, preset semi-transparent smoke layer parameters, and preset noise parameters, a virtual scene is rendered to generate several virtual image samples. Based on the 3D model of the power transmission tower, key points are annotated for each virtual image sample, generating key point annotation data corresponding to each virtual image sample. Based on each virtual image sample and the corresponding key point annotation data, several training samples are constructed; wherein, each training sample includes a virtual image sample and a corresponding annotation label; the annotation label is the key point annotation data corresponding to the virtual image sample; Each training sample is sequentially input into the keypoint detection model to train the model until a preset convergence condition is met. The keypoint detection model outputs a predicted keypoint corresponding to each training sample received. A loss function value is calculated based on the predicted keypoint and its corresponding label. The keypoint detection model is then updated based on the loss function value. The loss function value includes a keypoint coordinate regression loss value and a bounding box regression loss value, with the weight of the keypoint coordinate regression loss value being greater than the weight of the bounding box regression loss value.
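By way of illustration only, the weighted loss described above can be sketched as follows. The source states only that the key point regression weight is greater than the bounding box regression weight; the function names, the use of mean squared error for key points, mean absolute error for boxes, and the example weights 2.0 and 1.0 are assumptions made for this sketch.

```python
def keypoint_loss(pred, target):
    """Mean squared error over (u, v) key point coordinates (assumed loss form)."""
    total = sum((pu - tu) ** 2 + (pv - tv) ** 2
                for (pu, pv), (tu, tv) in zip(pred, target))
    return total / len(target)


def box_loss(pred, target):
    """Mean absolute error over (x_min, y_min, x_max, y_max) (assumed loss form)."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / 4.0


def total_loss(kp_pred, kp_true, box_pred, box_true, w_kp=2.0, w_box=1.0):
    """Weighted sum in which the key point term must dominate the box term."""
    if w_kp <= w_box:
        raise ValueError("key point weight must exceed bounding box weight")
    return (w_kp * keypoint_loss(kp_pred, kp_true)
            + w_box * box_loss(box_pred, box_true))
```

Weighting the key point term more heavily reflects that the tilt calculation depends on key point positions rather than on the enclosing box.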
[0007] Furthermore, obtaining the 3D model of the transmission tower includes: obtaining the standard design parameters of the transmission tower, wherein the standard design parameters include the nominal height of the tower, the total height of the tower, the distance between the bases of the tower legs, the number of crossarm layers, the crossarm length, and the insulator string specifications; constructing a main skeleton model of the transmission tower based on the nominal height, the total height, and the distance between the bases of the tower legs; constructing a crossarm model based on the number of crossarm layers and the crossarm length; constructing an insulator string model based on the insulator string specifications; and generating the 3D model of the transmission tower from the main skeleton model, the crossarm model, and the insulator string model.
[0008] Furthermore, rendering the virtual scene based on the 3D model of the transmission tower, the preset color space offset parameters, the preset semi-transparent smoke layer parameters, and the preset noise parameters to generate several virtual image samples includes: configuring metal material properties on the 3D model to generate a basic tower model; adjusting the diffuse color channel of the metal material of the basic tower model according to the preset color space offset parameters, to generate a tower model with high-temperature burn characteristics; constructing a virtual smoke model based on the preset semi-transparent smoke layer parameters; placing the virtual smoke model between the burned tower model and the virtual camera to build a virtual rendering scene with a smoke occlusion effect; capturing the virtual rendering scene from multiple viewpoints with the virtual camera to generate initial rendered images; and applying noise to the initial rendered images according to the preset noise parameters to generate the virtual image samples.
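The three kinds of degradation named above (color offset, semi-transparent smoke, noise) can be illustrated in image space. This is not the 3D rendering pipeline of the source: it is a simplified 2D stand-in on a grayscale image, and the function names, the additive offset, the alpha-blend formula, and the Gaussian noise model are all assumptions of this sketch.

```python
import random


def clamp(value):
    """Keep a gray level inside the 8-bit range."""
    return max(0.0, min(255.0, value))


def darken(pixel, offset):
    """Color offset: shift the gray level (negative offset imitates scorching)."""
    return clamp(pixel + offset)


def blend_smoke(pixel, smoke, alpha):
    """Semi-transparent smoke layer: standard alpha compositing over the pixel."""
    return (1.0 - alpha) * pixel + alpha * smoke


def add_noise(pixel, sigma, rng):
    """Additive zero-mean Gaussian noise with standard deviation sigma."""
    return clamp(pixel + rng.gauss(0.0, sigma))


def augment(image, offset=-40.0, smoke=200.0, alpha=0.3, sigma=5.0, seed=0):
    """Apply offset, smoke and noise, in that order, to a 2D list of gray levels."""
    rng = random.Random(seed)
    return [[add_noise(blend_smoke(darken(p, offset), smoke, alpha), sigma, rng)
             for p in row] for row in image]
```

Seeding the generator keeps the synthetic samples reproducible between runs.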
[0009] Furthermore, based on the 3D model of the transmission tower, key points are annotated for each virtual image sample, generating key point annotation data corresponding to each virtual image sample, including: Based on the three-dimensional model of the transmission tower, determine the model space coordinates of the tower feet and the model space coordinates of the tower top center in the three-dimensional model of the transmission tower; For each virtual image sample, obtain the pose transformation matrix and intrinsic parameter projection matrix of the virtual camera when acquiring the current virtual image sample, and use them as the target pose transformation matrix and target intrinsic parameter projection matrix. Based on the target pose transformation matrix and the target intrinsic parameter projection matrix, perspective projection transformation is performed on the model space coordinates of the tower foot and the model space coordinates of the tower top center, respectively, to generate the projected pixel coordinates of the tower foot on the image plane where the current virtual image sample is located, and the projected pixel coordinates of the tower top center on the image plane where the current virtual image sample is located. The projected pixel coordinates of the tower base on the image plane where the current virtual image sample is located, and the projected pixel coordinates of the tower top center on the image plane where the current virtual image sample is located, are used as the key point annotation data corresponding to the current virtual image sample.
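A minimal sketch of the projection-based annotation described above is given below. It assumes an ideal pinhole camera without lens distortion, a pose expressed as a rotation matrix and a translation vector that map model space into the camera frame, and intrinsics given as `fx`, `fy`, `cx`, `cy`. These names and the dictionary layout of the annotation are illustrative and do not come from the source.

```python
def project_point(point_model, rotation, translation, fx, fy, cx, cy):
    """Project a 3D model-space point to pixel coordinates with a pinhole model."""
    # Pose transformation: model space -> camera frame.
    x, y, z = (sum(rotation[i][j] * point_model[j] for j in range(3))
               + translation[i] for i in range(3))
    if z <= 0:
        raise ValueError("point lies behind the virtual camera")
    # Intrinsic projection: camera frame -> image plane.
    return (fx * x / z + cx, fy * y / z + cy)


def annotate_sample(feet_model, top_model, rotation, translation, fx, fy, cx, cy):
    """Build key point labels for one rendered sample."""
    args = (rotation, translation, fx, fy, cx, cy)
    return {
        "tower_feet": [project_point(p, *args) for p in feet_model],
        "tower_top_center": project_point(top_model, *args),
    }
```

Because the labels are computed from the same camera parameters used for rendering, no manual annotation is needed for the virtual samples.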
[0010] Furthermore, based on preset camera calibration parameters, stereo matching is performed on the second image data of the transmission tower according to the pixel coordinates of the tower feet and the pixel coordinates of the tower top center to determine the relative three-dimensional coordinates of the tower feet and the relative three-dimensional coordinates of the tower top center in the camera coordinate system, including: Based on the pixel coordinates of the tower feet and the pixel coordinates of the tower top center, feature blocks of a preset size are extracted from the first image data of the power transmission tower to generate tower foot matching templates and tower top center matching templates. Based on the pre-set epipolar correction parameters in the camera calibration parameters, the first epipolar search area corresponding to the pixel coordinates of the tower foot and the second epipolar search area corresponding to the pixel coordinates of the tower top center are determined in the second image data of the power transmission tower. The sliding similarity calculation of the tower foot matching template is performed within the first epipolar search area to determine the best matching pixel coordinates of the tower foot in the second image data. The matching template of the tower top center is used to perform sliding similarity calculation within the second epipolar line search area to determine the best matching pixel coordinates of the tower top center in the second image data. 
Based on the pixel coordinates of the tower base, the pixel coordinates of the tower top center, the best matching pixel coordinates of the tower base in the second image data, the best matching pixel coordinates of the tower top center in the second image data, and the baseline length and focal length parameters in the preset camera calibration parameters, triangulation is performed to generate the relative three-dimensional coordinates of the tower base and the relative three-dimensional coordinates of the tower top center in the camera coordinate system.
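The matching and triangulation steps above can be sketched as follows, under stated assumptions: the image pair is already rectified so that the epipolar search area is the same pixel row, the first image comes from the left camera, similarity is measured by the sum of absolute differences (the source does not name a measure), and the principal point `cx`, `cy` is part of the calibration parameters. All function and parameter names are hypothetical.

```python
def match_along_row(left, right, u, v, half=1, max_disparity=16):
    """Slide the template centered at (u, v) in `left` along row v of `right`."""
    def patch(img, cu):
        return [img[r][c]
                for r in range(v - half, v + half + 1)
                for c in range(cu - half, cu + half + 1)]

    template = patch(left, u)
    best_u, best_cost = None, float("inf")
    for d in range(max_disparity + 1):
        cu = u - d                      # a left-image point shifts left in the right image
        if cu - half < 0:
            break
        cost = sum(abs(a - b) for a, b in zip(template, patch(right, cu)))
        if cost < best_cost:
            best_u, best_cost = cu, cost
    return best_u


def triangulate(u_left, v_left, u_right, focal_px, baseline_m, cx, cy):
    """Recover camera-frame (X, Y, Z) from disparity, focal length and baseline."""
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    z = focal_px * baseline_m / disparity
    return ((u_left - cx) * z / focal_px, (v_left - cy) * z / focal_px, z)
```

Restricting the search to the epipolar row is what keeps the sliding comparison one-dimensional.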
[0011] Furthermore, performing coordinate transformation on the relative three-dimensional coordinates of the tower feet and of the tower top center, based on the attitude data and position data of the drone, to generate their absolute three-dimensional coordinates in the world coordinate system includes: constructing, from the attitude data of the drone, a three-dimensional rotation matrix from the camera coordinate system to the world coordinate system; applying the rotation matrix to the relative three-dimensional coordinates of the tower feet and of the tower top center to generate attitude-corrected intermediate coordinates for each; translating the attitude-corrected intermediate coordinates of the tower feet by the drone's position data to generate the absolute three-dimensional coordinates of the tower feet in the world coordinate system; and translating the attitude-corrected intermediate coordinates of the tower top center by the drone's position data to generate the absolute three-dimensional coordinates of the tower top center in the world coordinate system.
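The rotation and translation described above can be sketched as below. The source does not fix an Euler angle convention, so this sketch assumes the common Z-Y-X order (yaw about Z, then pitch about Y, then roll about X) and further assumes that the camera frame coincides with the drone body frame, with no separate mounting offset. The function names are illustrative.

```python
import math


def rotation_from_attitude(yaw, pitch, roll):
    """Rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def camera_to_world(point_cam, yaw, pitch, roll, position):
    """Rotate a camera-frame point by the attitude, then translate by position."""
    rot = rotation_from_attitude(yaw, pitch, roll)
    rotated = [sum(rot[i][j] * point_cam[j] for j in range(3)) for i in range(3)]
    return tuple(rotated[i] + position[i] for i in range(3))
```

The same call is applied to each tower foot and to the tower top center, so all key points end up in one world frame.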
[0012] Furthermore, calculating the horizontal projection offset and vertical height difference of the tower top center relative to the bottom reference center coordinates includes: extracting the abscissa component, the ordinate component, and the height component of the tower top center from its absolute three-dimensional coordinates; extracting the abscissa component, the ordinate component, and the height component of the bottom reference center from the bottom reference center coordinates; calculating the horizontal projection offset from the abscissa and ordinate components of the tower top center and of the bottom reference center; and calculating the vertical height difference from the height components of the tower top center and of the bottom reference center.
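As a sketch of this calculation, the example below takes the bottom reference center to be the arithmetic mean of the tower feet, which is an assumption since the source does not state how the center is derived, and treats the third coordinate as height. Function names are hypothetical.

```python
import math


def bottom_reference_center(feet_world):
    """Assumed definition: arithmetic mean of the tower feet coordinates."""
    n = len(feet_world)
    return tuple(sum(p[i] for p in feet_world) / n for i in range(3))


def offset_and_height(top_world, center_world):
    """Horizontal projection offset (in the x-y plane) and vertical height difference."""
    dx = top_world[0] - center_world[0]
    dy = top_world[1] - center_world[1]
    horizontal_offset = math.hypot(dx, dy)
    height_difference = top_world[2] - center_world[2]
    return horizontal_offset, height_difference
```

For example, a tower top 0.3 m and 0.4 m away from the center along the two horizontal axes has a horizontal projection offset of 0.5 m.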
[0013] Furthermore, determining the tilt of the transmission tower after the wildfire based on the horizontal projection offset and the vertical height difference includes: calculating the tilt tangent value from the horizontal projection offset and the vertical height difference; and applying the arctangent function to the tilt tangent value to generate the tilt of the transmission tower after the wildfire.
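A minimal sketch of the final step follows. It assumes the tilt tangent is the ratio of the horizontal projection offset to the vertical height difference, so that the tilt is the angle between the tower axis and the vertical; the function name and the choice to return degrees are illustrative.

```python
import math


def tilt_degrees(horizontal_offset, height_difference):
    """Tilt from vertical: arctangent of offset over height, in degrees."""
    if height_difference <= 0:
        raise ValueError("tower top must lie above the bottom reference center")
    # atan2 equals atan(offset / height) for a positive height difference.
    return math.degrees(math.atan2(horizontal_offset, height_difference))
```

With an offset equal to the height difference the function returns 45 degrees; a tower with zero offset returns 0.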
[0014] Based on the above method embodiments, the present invention provides corresponding apparatus embodiments.
[0015] One embodiment of the present invention provides a device for determining the tilt of a power transmission tower, comprising: a data acquisition module, a stereo matching module, a coordinate system transformation module, and a tilt determination module; The data acquisition module is used to acquire first image data of the power transmission tower collected by the drone, second image data of the power transmission tower collected by the drone, attitude data of the drone, and position data of the drone. The stereo matching module is used to input the first image data of the power transmission tower into a preset key point detection model, so that the key point detection model outputs the pixel coordinates of the key points of the power transmission tower according to the first image data; wherein, the key points include the tower feet and the tower top center; based on preset camera calibration parameters, stereo matching is performed on the second image data of the power transmission tower according to the pixel coordinates of the tower feet and the pixel coordinates of the tower top center to determine the relative three-dimensional coordinates of the tower feet and the relative three-dimensional coordinates of the tower top center in the camera coordinate system; The coordinate system transformation module is used to perform coordinate transformation on the relative three-dimensional coordinates of the tower feet and the relative three-dimensional coordinates of the tower top center based on the attitude data and position data of the drone, so as to generate the absolute three-dimensional coordinates of the tower feet and the absolute three-dimensional coordinates of the tower top center in the world coordinate system.
The tilt determination module is used to determine the bottom reference center coordinates of the transmission tower based on the absolute three-dimensional coordinates of the tower feet; calculate and generate the horizontal projection offset and vertical height difference of the absolute three-dimensional coordinates of the tower top center relative to the bottom reference center coordinates based on the absolute three-dimensional coordinates of the tower top center; and determine the tilt of the transmission tower after the wildfire based on the horizontal projection offset and the vertical height difference.
[0016] Compared with the prior art, the present invention has the following beneficial effects: This invention provides a method, apparatus, electronic device, and storage medium for determining the tilt of a power transmission tower. The method acquires first and second image data of the power transmission tower collected by a drone, along with corresponding drone attitude and position data. The first image data is input into a preset key point detection model, which outputs the pixel coordinates of key points on the power transmission tower, including the tower base and the tower top center. Based on camera calibration parameters, the pixel coordinates of the tower base and the tower top center are used for stereo matching in the second image data to determine the relative three-dimensional coordinates of the tower base and the tower top center in the camera coordinate system. Coordinate transformation is performed based on the drone attitude and position data to obtain the absolute three-dimensional coordinates of the tower base and the tower top center in the world coordinate system. The absolute three-dimensional coordinates of the tower base are used to determine the coordinates of the bottom reference center, and the horizontal projection offset and vertical height difference of the tower top center relative to the bottom reference center are calculated. The tilt of the power transmission tower after a wildfire is determined based on the horizontal projection offset and the vertical height difference.
[0017] To address the problem that existing technologies cannot decouple perspective distortion and physical tilt due to the lack of depth information in single-viewpoint imaging, this invention first extracts the two-dimensional pixel coordinates of the tower base and the center of the tower top from the first image data using a preset key point detection model. Then, based on preset camera calibration parameters, it performs stereo matching in the second image data of the power transmission tower using the two-dimensional pixel coordinates of the tower base and the center of the tower top, thereby calculating the relative three-dimensional coordinates containing depth information. To address the problem that existing technologies lack an absolute spatial reference, which makes it impossible to eliminate flight attitude interference, this invention further integrates the attitude data and position data of the UAV, performs coordinate transformation on the relative three-dimensional coordinates, and maps the relative three-dimensional coordinates to absolute three-dimensional coordinates in the world coordinate system, establishing a real physical measurement reference. Finally, based on the generated absolute three-dimensional coordinates, this invention accurately calculates the horizontal projection offset and vertical height difference of the tower top center relative to the bottom reference center, achieving accurate determination of the tilt of the power transmission tower after a wildfire.
Attached Figure Description
[0018] Figure 1 is a flowchart illustrating a method for determining the tilt of a power transmission tower according to an embodiment of the present invention.
[0019] Figure 2 is a schematic diagram of a device for determining the tilt of a power transmission tower according to an embodiment of the present invention.
Detailed Implementation
[0020] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0021] As shown in Figure 1, to address the problem of inaccurate determination of the tilt of transmission towers in existing technologies, an embodiment of the present invention provides a method for determining the tilt of a transmission tower, comprising at least the following steps:
Step S1: Acquire the first image data of the power transmission tower collected by the drone, the second image data of the power transmission tower collected by the drone, the attitude data of the drone, and the position data of the drone.
Specifically, in this embodiment of the invention, because complex airflow or residual smoke may be present on site, the drone keeps a safe distance and acquires data on the power transmission tower without contact. The drone carries binocular vision acquisition equipment or high-precision monocular continuous shooting equipment to perform the image acquisition task.
[0022] Specifically, the first and second image data of the power transmission tower refer to the image pairs acquired from different shooting angles, satisfying the parallax requirements of binocular stereo vision, for the same power transmission tower. If the drone is equipped with a binocular camera, the first image data of the power transmission tower is acquired through the left lens of the binocular camera, and the second image data is acquired through the right lens of the binocular camera, with strict time synchronization between the first and second image data. If the drone is equipped with a monocular camera, the drone is controlled to move along a preset flight path, acquiring the first image data of the power transmission tower at a first shooting position and the second image data at a second shooting position, with the baseline distance between the first and second shooting positions satisfying the solution requirements for stereo matching.
[0023] It should be noted that the first image data and the second image data described in this embodiment are only used to distinguish between the reference image and the matching image in the logical processing steps, and do not constitute an absolute limitation on the physical acquisition location. Specifically, the first image data can be the left-side image captured by the left lens of the binocular camera, and the second image data can be the right-side image captured by the right lens; or, the first image data can also be the right-side image captured by the right lens, and the second image data can be the left-side image captured by the left lens. Both have equivalent epipolar geometric constraints after epipolar correction, and both are within the protection scope of this invention.
[0024] At the same time, the drone uses its onboard inertial measurement unit (IMU) and satellite positioning module (such as GPS, BeiDou or RTK module) to record flight status data in real time at the moment of shooting.
[0025] The attitude data of the UAV represents its flight attitude relative to the body coordinate system at the time of shooting, specifically the UAV's pitch angle, roll angle, and yaw angle. In this embodiment, the pitch angle is defined as …, the roll angle as …, and the yaw angle as …. The attitude data reflects the pointing deviation of the camera's optical axis during shooting and is a key parameter for the subsequent transformation from the camera coordinate system to the world coordinate system.
[0026] The drone's position data represents its spatial location in the world coordinate system at the time of image capture. Typically, the position data includes the drone's longitude, latitude, and altitude relative to the ground. For ease of subsequent calculation, the longitude, latitude, and altitude are mapped to a geocentric coordinate system or a local tangent plane coordinate system and expressed as spatial coordinates.
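One possible form of this mapping is sketched below: geodetic coordinates are converted to geocentric (ECEF) coordinates and then to a local east-north-up tangent plane around a reference point. The sketch assumes the WGS-84 ellipsoid and treats the altitude as height above that ellipsoid, which differs from altitude relative to the ground; the source does not specify the datum, and the function names are illustrative.

```python
import math

WGS84_A = 6378137.0                    # semi-major axis in meters
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, h):
    """Latitude and longitude in degrees, ellipsoidal height in meters, to ECEF."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + h) * math.sin(lat)
    return x, y, z


def geodetic_to_enu(lat_deg, lon_deg, h, ref_lat_deg, ref_lon_deg, ref_h):
    """East, north, up coordinates of a point relative to a reference point."""
    x, y, z = geodetic_to_ecef(lat_deg, lon_deg, h)
    x0, y0, z0 = geodetic_to_ecef(ref_lat_deg, ref_lon_deg, ref_h)
    dx, dy, dz = x - x0, y - y0, z - z0
    lat0, lon0 = math.radians(ref_lat_deg), math.radians(ref_lon_deg)
    east = -math.sin(lon0) * dx + math.cos(lon0) * dy
    north = (-math.sin(lat0) * math.cos(lon0) * dx
             - math.sin(lat0) * math.sin(lon0) * dy
             + math.cos(lat0) * dz)
    up = (math.cos(lat0) * math.cos(lon0) * dx
          + math.cos(lat0) * math.sin(lon0) * dy
          + math.sin(lat0) * dz)
    return east, north, up
```

A local tangent plane keeps coordinates small and makes the third axis coincide with the local vertical, which is convenient for a height difference.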
[0027] The drone transmits the first image data of the power transmission tower, the second image data of the power transmission tower, the drone's attitude data, and the drone's position data to the ground station via a wireless image transmission link, or stores them in onboard memory, so that subsequent processing terminals can access them.
[0028] Acquiring an image pair that contains parallax information, together with precise position and attitude data, provides a complete data basis for recovering the depth information of the power transmission tower and for constructing an absolute spatial reference. This addresses the lack of a depth dimension and the lack of an absolute coordinate reference in a single image.
[0029] Step S2: Input the first image data of the transmission tower into a preset key point detection model, so that the key point detection model outputs the pixel coordinates of the key points of the transmission tower based on the first image data; wherein, the key points include the tower feet and the center of the tower top.
In a preferred embodiment, a preset keypoint detection model is trained in the following manner: Obtain a 3D model of the power transmission tower; Based on the 3D model of the power transmission tower, preset color space offset parameters, preset semi-transparent smoke layer parameters, and preset noise parameters, a virtual scene is rendered to generate several virtual image samples. Based on the 3D model of the power transmission tower, key points are annotated for each virtual image sample, generating key point annotation data corresponding to each virtual image sample. Based on each virtual image sample and the corresponding key point annotation data, several training samples are constructed; wherein, each training sample includes a virtual image sample and a corresponding annotation label; the annotation label is the key point annotation data corresponding to the virtual image sample; Each training sample is sequentially input into the keypoint detection model to train the model until a preset convergence condition is met. The keypoint detection model outputs a predicted keypoint corresponding to each training sample received. A loss function value is calculated based on the predicted keypoint and its corresponding label. The keypoint detection model is then updated based on the loss function value. The loss function value includes a keypoint coordinate regression loss value and a bounding box regression loss value, with the weight of the keypoint coordinate regression loss value being greater than the weight of the bounding box regression loss value.
[0030] In a preferred embodiment, obtaining the 3D model of the transmission tower includes: obtaining the standard design parameters of the transmission tower, wherein the standard design parameters include the nominal height of the tower, the total height of the tower, the distance between the bases of the tower legs, the number of crossarm layers, the crossarm length, and the insulator string specifications; constructing a main skeleton model of the transmission tower based on the nominal height, the total height, and the distance between the bases of the tower legs; constructing a crossarm model based on the number of crossarm layers and the crossarm length; constructing an insulator string model based on the insulator string specifications; and generating the 3D model of the transmission tower from the main skeleton model, the crossarm model, and the insulator string model.
[0031] In a preferred embodiment, rendering the virtual scene based on the 3D model of the transmission tower, the preset color space offset parameters, the preset semi-transparent smoke layer parameters, and the preset noise parameters to generate several virtual image samples includes: configuring metal material properties on the 3D model to generate a basic tower model; adjusting the diffuse color channel of the metal material of the basic tower model according to the preset color space offset parameters, to generate a tower model with high-temperature burn characteristics; constructing a virtual smoke model based on the preset semi-transparent smoke layer parameters; placing the virtual smoke model between the burned tower model and the virtual camera to build a virtual rendering scene with a smoke occlusion effect; capturing the virtual rendering scene from multiple viewpoints with the virtual camera to generate initial rendered images; and applying noise to the initial rendered images according to the preset noise parameters to generate the virtual image samples.
[0032] In a preferred embodiment, based on the 3D model of the transmission tower, key point annotations are performed on each virtual image sample to generate key point annotation data corresponding to each virtual image sample, including: Based on the three-dimensional model of the transmission tower, determine the model space coordinates of the tower feet and the model space coordinates of the tower top center in the three-dimensional model of the transmission tower; For each virtual image sample, obtain the pose transformation matrix and intrinsic parameter projection matrix of the virtual camera when acquiring the current virtual image sample, and use them as the target pose transformation matrix and target intrinsic parameter projection matrix. Based on the target pose transformation matrix and the target intrinsic parameter projection matrix, perspective projection transformation is performed on the model space coordinates of the tower foot and the model space coordinates of the tower top center, respectively, to generate the projected pixel coordinates of the tower foot on the image plane where the current virtual image sample is located, and the projected pixel coordinates of the tower top center on the image plane where the current virtual image sample is located. The projected pixel coordinates of the tower base on the image plane where the current virtual image sample is located, and the projected pixel coordinates of the tower top center on the image plane where the current virtual image sample is located, are used as the key point annotation data corresponding to the current virtual image sample.
[0033] Specifically, in this embodiment, the keypoint detection model employs a deep learning-based convolutional neural network architecture (e.g., ResNet, HRNet, or a dedicated keypoint regression network). The first image data of the power transmission tower collected by the UAV, after preprocessing (e.g., size normalization, denoising), is input as tensor data into the preset keypoint detection model. The keypoint detection model extracts high-dimensional features from the first image data of the power transmission tower through internal convolutional layers and directly maps the position information of the key structural points of the power transmission tower in the image pixel coordinate system through fully connected layers or heatmap regression layers. In this invention, to subsequently construct a geometric vector for calculating the tilt, keypoints are explicitly defined to include the four tower legs of the power transmission tower (usually defined as the contact points between the tower legs and the foundation) and the center of the tower top (usually defined as the highest point of the geometric central axis of the tower body). The pixel coordinates output by the keypoint detection model are two-dimensional coordinates. The pixel coordinates correspond precisely to the pixel matrix index of the input image.
[0034] To ensure the keypoint detection model maintains high accuracy in both normal environments and extreme conditions such as after wildfires or in dense fog, this embodiment employs a synthetic data-driven strategy to train the keypoint detection model. The specific training process is as follows: First, a 3D model of the power transmission tower is obtained. The 3D model of the power transmission tower serves as the geometric benchmark for generating training samples.
[0035] Secondly, based on the 3D model of the power transmission tower, preset color space offset parameters, preset semi-transparent smoke layer parameters, and preset noise parameters, virtual scene rendering is performed in a virtual engine (such as Unity, Unreal Engine, or professional industrial simulation software) to generate several virtual image samples. This step aims to simulate the visual characteristics of a wildfire scene. Since real post-wildfire image data of power transmission towers is difficult to obtain in large quantities and annotation is costly, virtual rendering technology can generate virtual image samples in batches, including different lighting conditions, different levels of smoke obstruction, and different shooting angles, thus solving the problem of scarce training data.
[0036] Furthermore, based on the 3D model of the power transmission tower, key point annotation is automatically performed on each virtual image sample, generating key point annotation data corresponding to each virtual image sample. Since the 3D model of the power transmission tower and its projection relationship in the virtual camera are known mathematical truths in the virtual scene, the precise pixel coordinates of the key points can be directly calculated, avoiding the errors introduced by manual annotation.
[0037] Next, several training samples are constructed based on each virtual image sample and its corresponding keypoint annotation data. Each training sample includes a virtual image sample as input and annotation labels as supervision signals; the annotation labels are the keypoint annotation data corresponding to the virtual image sample.
[0038] Finally, each training sample is sequentially input into the keypoint detection model for iterative training until a preset convergence condition is met (e.g., the loss function value no longer decreases or the preset number of iterations is reached). During training, the keypoint detection model performs forward propagation upon receiving each training sample and outputs the predicted keypoints corresponding to the training sample. The loss function value is calculated from the predicted keypoints and their corresponding annotation labels. The gradients of the loss function value are obtained by backpropagation, and the weight parameters of the keypoint detection model are updated by an optimizer (e.g., SGD or Adam). In this embodiment, the loss function value consists of two weighted components: a keypoint coordinate regression loss value (e.g., mean squared error loss, MSE) and a bounding box regression loss value (e.g., IoU loss or L1 loss). Because the tilt calculation of this invention requires accurate point coordinates rather than just the position of the target box, the weight of the keypoint coordinate regression loss value is set greater than the weight of the bounding box regression loss value, thereby forcing the model to focus more on pixel-level accuracy of keypoint localization.
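By way of non-limiting illustration, the two-component weighted loss described above can be sketched as follows. The function names, the use of L1 for the bounding box term, and the weights 0.8 and 0.2 are assumptions made for this example only; the embodiment only requires that the keypoint weight exceed the bounding box weight.

```python
def mse_keypoint_loss(pred, target):
    """Mean squared error over a list of (u, v) keypoints."""
    total = 0.0
    for (pu, pv), (tu, tv) in zip(pred, target):
        total += (pu - tu) ** 2 + (pv - tv) ** 2
    # Average over all coordinates (two per keypoint).
    return total / (2 * len(pred))


def l1_box_loss(pred_box, target_box):
    """Mean absolute error over (x_min, y_min, x_max, y_max)."""
    return sum(abs(p - t) for p, t in zip(pred_box, target_box)) / 4.0


def total_loss(pred_kpts, gt_kpts, pred_box, gt_box, w_kpt=0.8, w_box=0.2):
    """Weighted sum of keypoint and bounding box regression losses."""
    # Placeholder weights: the text only states that w_kpt > w_box.
    if w_kpt <= w_box:
        raise ValueError("keypoint weight must exceed bounding box weight")
    return (w_kpt * mse_keypoint_loss(pred_kpts, gt_kpts)
            + w_box * l1_box_loss(pred_box, gt_box))
```

In a real training loop this scalar would be computed on framework tensors so that gradients can be backpropagated; the plain-Python form here only shows how the two terms are combined.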
[0039] Regarding the method for obtaining the 3D model of the transmission tower, this embodiment adopts a parametric modeling method to ensure the standardization of the model structure: Specifically, the first step is to obtain the standard design parameters of the transmission tower. These standard design parameters are derived from the engineering design drawings or ledger data of the transmission tower, including the tower's nominal height (height from the ground to the lowest crossarm), the tower's total height, the distance between the tower legs (horizontal distance between adjacent tower legs), the number of crossarm layers, the crossarm length, and the specifications of the insulator strings.
[0040] Subsequently, based on the tower's nominal height, total height, and the distance between the tower legs, a main skeleton model of the transmission tower is constructed, which determines the overall taper and height ratio of the tower. Based on the number of crossarm layers and the crossarm length, crossarm models of the transmission tower are constructed at the corresponding heights of the main skeleton model, reflecting the tower's lateral width characteristics. Then, based on the insulator string specifications, insulator string models of the transmission tower are constructed at the hanging points of the crossarm models.
[0041] Finally, the main skeleton model, crossarm model, and insulator string model of the transmission tower are rigidly combined to generate a complete 3D model of the transmission tower. The model constructed in this way maintains a high degree of consistency with the geometric topology of the actual transmission tower, laying the foundation for subsequent accurate projection annotation.
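As a simplified, non-limiting sketch, the model space key points used later for annotation can be derived directly from two of the standard design parameters. The function name, the square base, the origin at the base center, and the Z-up axis are assumptions made for this example; crossarms and insulator strings are omitted.

```python
def tower_model_keypoints(total_height, leg_spacing):
    """Return model-space tower-foot and tower-top-center coordinates.

    Assumes an untilted tower on a square base centred on the origin,
    with the Z axis pointing up (illustrative convention only).
    """
    h = leg_spacing / 2.0
    feet = [(-h, -h, 0.0), (h, -h, 0.0), (h, h, 0.0), (-h, h, 0.0)]
    top_center = (0.0, 0.0, float(total_height))
    return feet, top_center
```

For example, `tower_model_keypoints(30.0, 6.0)` places the four feet at the corners of a 6 m square and the tower top center 30 m above the base center.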
[0042] To ensure that the generated virtual image samples closely approximate the real visual effects after a wildfire, this embodiment performs multiple environmental simulations during the virtual scene rendering stage: First, configure the metal material properties (such as reflectivity and roughness mapping of galvanized steel) on the 3D model of the power transmission tower to generate the basic tower model.
[0043] Next, the effects of high temperatures on the materials are simulated. Based on preset color space offset parameters, the diffuse color channel of the metal material properties of the basic iron tower model is adjusted, for example, by increasing the black or brown hue component and reducing the surface gloss, thereby generating an iron tower model with high-temperature burning characteristics, simulating the visual state of the tower material surface oxidizing and turning black after a fire.
[0044] Simultaneously, a smoke-occluded environment is simulated. Based on preset semi-transparent smoke layer parameters (including smoke density, color gradient, and transparency alpha channel), a virtual smoke model (such as a particle system or volumetric fog) is constructed. The virtual smoke model is then positioned between the tower model with high-temperature burning characteristics and the virtual camera, causing the rendered image to exhibit varying degrees of blurring and occlusion, thus constructing a virtual rendering scene that includes smoke occlusion effects.
[0045] Based on this, using a virtual camera, the virtual rendered scene is captured from multiple perspectives according to a preset drone inspection route, generating initial rendered images. The intrinsic parameter matrix of the virtual camera is consistent with that of the real drone camera.
[0046] Finally, sensor noise is simulated. Based on preset noise parameters (such as the distribution parameters of Gaussian noise or salt-and-pepper noise), noise injection is performed on the initial rendered image to generate the final virtual image sample. This step simulates the noise characteristics of the imaging sensor of a real drone under low light or high temperature interference.
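The noise injection step can be sketched as follows, as a minimal illustration only. It assumes an 8-bit grayscale image stored as a list of rows; the function names and the parameters `sigma` and `prob` are hypothetical stand-ins for the preset noise parameters.

```python
import random


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to a grayscale image (rows of 0..255 ints)."""
    noisy = []
    for row in image:
        # Clamp to the valid 8-bit range after adding noise.
        noisy.append([min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row])
    return noisy


def add_salt_and_pepper(image, prob, rng):
    """Replace each pixel by 0 (pepper) or 255 (salt) with total probability prob."""
    noisy = []
    for row in image:
        new_row = []
        for p in row:
            r = rng.random()
            if r < prob / 2:
                new_row.append(0)
            elif r < prob:
                new_row.append(255)
            else:
                new_row.append(p)
        noisy.append(new_row)
    return noisy
```

Passing an explicit `random.Random(seed)` instance as `rng` keeps the generated virtual image samples reproducible.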
[0047] For the automatic annotation of virtual samples, this embodiment uses the projection principle of computer graphics, so that the annotation data are free of manual labeling error: specifically, the model space coordinates of the tower feet and the model space coordinates of the tower top center are extracted directly from the 3D model of the transmission tower. These coordinates are known constants in the modeling coordinate system.
[0048] For each virtual image sample, the pose transformation matrix of the virtual camera at the time the current virtual image sample was acquired (comprising a rotation matrix $R$ and a translation vector $t$) and the intrinsic parameter projection matrix ($K$) are obtained and used as the target pose transformation matrix and the target intrinsic parameter projection matrix.
[0049] Based on the target pose transformation matrix and the target intrinsic parameter projection matrix, perspective projection transformations are performed on the model space coordinates of the tower feet and the model space coordinates of the tower top center, respectively. The calculation follows the pinhole camera model: the model space coordinates are converted into coordinates in the camera coordinate system and then projected onto the image plane, thereby generating the projected pixel coordinates of the tower feet and the projected pixel coordinates of the tower top center on the image plane where the current virtual image sample is located.
[0050] Finally, the calculated projection pixel coordinates of the tower feet on the image plane where the current virtual image sample is located, and the projection pixel coordinates of the tower top center on the image plane where the current virtual image sample is located, are directly used as the key point annotation data corresponding to the current virtual image sample, and packaged and stored with the virtual image sample.
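The projection-based annotation described above can be illustrated with the following minimal sketch of the pinhole model. The function name, the zero-skew intrinsic matrix, and the row-major list representation of the matrices are assumptions for this example.

```python
def project_point(point_model, R, t, K):
    """Project a model-space 3D point to pixel coordinates (pinhole model).

    R (3x3) and t (3,) map model space to camera space; K (3x3) is the
    intrinsic matrix, assumed here to have zero skew.
    """
    # Model space -> camera space: p_cam = R * p_model + t
    xc, yc, zc = (
        sum(R[i][j] * point_model[j] for j in range(3)) + t[i] for i in range(3)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    fx, fy = K[0][0], K[1][1]
    cx, cy = K[0][2], K[1][2]
    # Camera space -> image plane (perspective division, then intrinsics).
    return (fx * xc / zc + cx, fy * yc / zc + cy)
```

Applying this function to each tower foot and to the tower top center, with the pose and intrinsics recorded for a given rendered view, yields the key point annotation for that view without any manual labeling.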
[0051] Through the above methods, the present invention can construct a massive, high-precision training dataset containing extreme environmental features at low cost, effectively improving the robustness and positioning accuracy of the key point detection model under complex conditions such as wildfire smoke obscuring and tower material discoloration, thereby ensuring the accuracy of subsequent tilt calculation.
[0052] Step S3: Based on the preset camera calibration parameters, perform stereo matching in the second image data of the power transmission tower according to the pixel coordinates of the tower foot and the pixel coordinates of the tower top center to determine the relative three-dimensional coordinates of the tower foot and the relative three-dimensional coordinates of the tower top center in the camera coordinate system. In a preferred embodiment, based on preset camera calibration parameters, stereo matching is performed on the second image data of the transmission tower according to the pixel coordinates of the tower feet and the pixel coordinates of the tower top center to determine the relative three-dimensional coordinates of the tower feet and the relative three-dimensional coordinates of the tower top center in the camera coordinate system, including: Based on the pixel coordinates of the tower feet and the pixel coordinates of the tower top center, feature blocks of a preset size are extracted from the first image data of the power transmission tower to generate tower foot matching templates and tower top center matching templates. Based on the pre-set epipolar correction parameters in the camera calibration parameters, the first epipolar search area corresponding to the pixel coordinates of the tower foot and the second epipolar search area corresponding to the pixel coordinates of the tower top center are determined in the second image data of the power transmission tower. The sliding similarity calculation of the tower foot matching template is performed within the first epipolar search area to determine the best matching pixel coordinates of the tower foot in the second image data. The matching template of the tower top center is used to perform sliding similarity calculation within the second epipolar line search area to determine the best matching pixel coordinates of the tower top center in the second image data. 
Based on the pixel coordinates of the tower base, the pixel coordinates of the tower top center, the best matching pixel coordinates of the tower base in the second image data, the best matching pixel coordinates of the tower top center in the second image data, and the baseline length and focal length parameters in the preset camera calibration parameters, triangulation is performed to generate the relative three-dimensional coordinates of the tower base and the relative three-dimensional coordinates of the tower top center in the camera coordinate system.
[0053] Specifically, in this embodiment, the first image data and the second image data of the transmission tower constitute a stereo image pair with parallax information. To recover three-dimensional spatial depth from a two-dimensional pixel plane, this invention utilizes the principle of stereo vision for calculation. First, preset camera calibration parameters are invoked. These preset camera calibration parameters are internal camera parameters (including focal length, principal point coordinates, and distortion coefficients) and structural parameters of the binocular camera system (including baseline length and relative rotation / translation matrix) obtained in advance through the Zhang Zhengyou calibration method or other high-precision calibration methods. These parameters accurately describe the mapping relationship between three-dimensional spatial points and two-dimensional image pixels, forming the geometric basis for subsequent stereo matching and triangulation.
[0054] To improve matching efficiency and accuracy, this embodiment employs a sparse matching strategy, performing depth recovery only on key feature points such as the tower feet and the tower top center, rather than performing pixel-by-pixel dense matching on the entire image. The specific process is as follows: First, based on the pixel coordinates of the tower feet and the pixel coordinates of the tower top center, feature blocks of a preset size are extracted from the first image data of the transmission tower to generate the tower foot matching template and the tower top center matching template. Specifically, an image matrix of $w \times w$ pixels, centered on the tower foot pixel coordinates $(u_f, v_f)$ in the first image data of the transmission tower, is extracted as the tower foot matching template; similarly, an image matrix of the same size, centered on the tower top center pixel coordinates $(u_t, v_t)$, is cropped and used as the tower top center matching template. The feature block size $w$ depends on the image resolution and must contain enough texture information to distinguish the key point from the background, for example, … or … pixels.
[0055] Secondly, based on the epipolar correction parameters in the preset camera calibration parameters, a first epipolar search region corresponding to the pixel coordinates of the tower foot and a second epipolar search region corresponding to the pixel coordinates of the tower top center are determined in the second image data of the transmission tower. According to the epipolar geometry constraint, the projection of a spatial point onto the left image plane corresponds to an epipolar line on the right image plane. To narrow the search range and reduce the false matching rate, this invention does not perform a full-image search, but instead uses the fundamental matrix $F$ to calculate, in the second image data of the transmission tower, the epipolar line $l_f = F\,(u_f, v_f, 1)^T$ corresponding to the tower foot pixel coordinates $(u_f, v_f)$. The first epipolar search region is defined as the epipolar line $l_f$ together with a buffer band of a certain pixel width above and below it. Similarly, the epipolar line $l_t = F\,(u_t, v_t, 1)^T$ corresponding to the tower top center pixel coordinates $(u_t, v_t)$ is calculated, and the epipolar line $l_t$ together with its neighborhood is defined as the second epipolar search region.
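As a non-limiting sketch of the epipolar search region, the following computes the epipolar line from a fundamental matrix and tests whether a candidate pixel lies inside the buffer band. The function names and the `half_width` parameter are hypothetical; the convention assumed is that a point $x$ in the first image maps to the line $F x$ in the second image.

```python
import math


def epipolar_line(F, u, v):
    """Return (a, b, c) such that a*u2 + b*v2 + c = 0 in the second image."""
    x = (u, v, 1.0)
    return tuple(sum(F[i][j] * x[j] for j in range(3)) for i in range(3))


def in_search_band(line, u2, v2, half_width):
    """True if (u2, v2) lies within half_width pixels of the epipolar line."""
    a, b, c = line
    norm = math.hypot(a, b)
    if norm == 0:
        raise ValueError("degenerate epipolar line")
    # Perpendicular point-to-line distance in pixels.
    return abs(a * u2 + b * v2 + c) / norm <= half_width
```

For an ideally rectified pair the epipolar line reduces to the image row of the key point, so the band becomes a horizontal strip of rows around that row.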
[0056] Next, template matching is performed. The tower foot matching template is slid pixel by pixel within the first epipolar search region, and the similarity between the template and the image area it covers is calculated at each position, using the normalized cross-correlation coefficient (NCC) or the sum of squared differences (SSD) as the similarity metric. The pixel position with the extreme similarity value (the maximum for NCC, the minimum for SSD) is selected as the best matching pixel coordinates $(u_f', v_f')$ of the tower foot in the second image data. Similarly, the tower top center matching template is slid within the second epipolar search region to determine the best matching pixel coordinates $(u_t', v_t')$ of the tower top center in the second image data.
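The sliding similarity calculation can be sketched as follows, using NCC and, for simplicity, assuming a rectified image pair so that the search region is a horizontal band of rows. The function names and the parameters `band` and `half` are illustrative assumptions, and images are represented as lists of rows.

```python
import math


def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized 2D patches."""
    a = [p for row in patch_a for p in row]
    b = [p for row in patch_b for p in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    # Textureless patches have zero variance; treat them as uncorrelated.
    return num / den if den > 0 else 0.0


def crop(image, u, v, half):
    """Square patch of side 2*half+1 centred on column u, row v."""
    return [row[u - half:u + half + 1] for row in image[v - half:v + half + 1]]


def match_in_band(template, image, v_center, band, half):
    """Slide the template over rows v_center +/- band; return (u, v, score)."""
    height, width = len(image), len(image[0])
    best = (None, None, -2.0)
    v_lo = max(half, v_center - band)
    v_hi = min(height - half, v_center + band + 1)
    for v in range(v_lo, v_hi):
        for u in range(half, width - half):
            score = ncc(template, crop(image, u, v, half))
            if score > best[2]:  # NCC: larger is better
                best = (u, v, score)
    return best
```

With SSD as the metric, the comparison would be inverted so that the smallest value wins.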
[0057] Finally, based on the pixel coordinates of the tower feet, the pixel coordinates of the tower top center, the best matching pixel coordinates of the tower feet in the second image data, the best matching pixel coordinates of the tower top center in the second image data, and the baseline length and focal length parameters in the preset camera calibration parameters, triangulation is performed to generate the relative three-dimensional coordinates of the tower feet and the relative three-dimensional coordinates of the tower top center in the camera coordinate system.
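[0058] The triangulation calculation is based on the parallax principle, and its mathematical model is as follows: let the focal length parameter in the preset camera calibration parameters be $f$, and let the baseline length parameter (i.e., the horizontal distance between the optical centers of the left and right cameras) be $B$.
[0059] For a tower foot with pixel coordinates $(u_f, v_f)$ in the first image data and best matching pixel coordinates $(u_f', v_f')$ in the second image data, the horizontal parallax between the first image data and the second image data is calculated as $d_f = u_f - u_f'$. Based on the principle of similar triangles, the depth value $Z_f^c$ of the tower foot in the camera coordinate system is calculated: $$Z_f^c = \frac{f\,B}{d_f}$$ Then, combining the pixel coordinates of the tower foot $(u_f, v_f)$ with the coordinates of the camera principal point $(c_x, c_y)$, the horizontal coordinate $X_f^c$ and the vertical coordinate $Y_f^c$ of the tower foot are calculated: $$X_f^c = \frac{(u_f - c_x)\,Z_f^c}{f},\qquad Y_f^c = \frac{(v_f - c_y)\,Z_f^c}{f}$$ This yields the relative three-dimensional coordinates of the tower foot in the camera coordinate system, $P_f^c = (X_f^c, Y_f^c, Z_f^c)$.
[0060] For the tower top center with pixel coordinates $(u_t, v_t)$ in the first image data and best matching pixel coordinates $(u_t', v_t')$ in the second image data, the horizontal parallax is calculated as $d_t = u_t - u_t'$. Similarly, the depth value $Z_t^c$ of the tower top center in the camera coordinate system is calculated: $$Z_t^c = \frac{f\,B}{d_t}$$ The horizontal coordinate $X_t^c$ and the vertical coordinate $Y_t^c$ of the tower top center are calculated: $$X_t^c = \frac{(u_t - c_x)\,Z_t^c}{f},\qquad Y_t^c = \frac{(v_t - c_y)\,Z_t^c}{f}$$ This yields the relative three-dimensional coordinates of the tower top center in the camera coordinate system, $P_t^c = (X_t^c, Y_t^c, Z_t^c)$.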
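The parallax-based triangulation can be illustrated with the short sketch below. It assumes a rectified stereo pair, a focal length expressed in pixels, and a baseline in meters; the function name and the numeric values in the usage note are made up for illustration.

```python
def triangulate(u1, v1, u2, f, baseline, cx, cy):
    """Recover camera-frame (X, Y, Z) from a rectified stereo match.

    (u1, v1) is the key point in the first image, u2 the matched column
    in the second image, f the focal length in pixels, baseline in metres.
    """
    disparity = u1 - u2
    if disparity <= 0:
        raise ValueError("non-positive disparity: invalid match or point at infinity")
    z = f * baseline / disparity   # similar triangles
    x = (u1 - cx) * z / f          # back-project the pixel column
    y = (v1 - cy) * z / f          # back-project the pixel row
    return (x, y, z)
```

For instance, with `f = 1000`, `baseline = 0.5`, principal point `(640, 360)`, and a key point at `(740, 310)` matched at column `690`, the disparity is 50 pixels and the recovered point is `(1.0, -0.5, 10.0)` in the camera frame.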
[0061] Through the above-mentioned stereo matching and triangulation steps, the present invention can accurately extract the depth information of key parts of the power transmission tower from two-dimensional image data, effectively overcoming the technical defect that monocular vision cannot perceive the real spatial distance of objects, and providing accurate data support for subsequent coordinate transformation to the world coordinate system and calculation of the real physical tilt.
[0062] Step S4: Based on the attitude data and position data of the UAV, perform coordinate transformation on the relative three-dimensional coordinates of the tower foot and the relative three-dimensional coordinates of the tower top center to generate the absolute three-dimensional coordinates of the tower foot and the absolute three-dimensional coordinates of the tower top center in the world coordinate system. In a preferred embodiment, based on the attitude data and position data of the UAV, a coordinate transformation is performed on the relative three-dimensional coordinates of the tower base and the relative three-dimensional coordinates of the tower top center to generate the absolute three-dimensional coordinates of the tower base and the absolute three-dimensional coordinates of the tower top center in the world coordinate system, including: Based on the attitude data of the UAV, construct a three-dimensional rotation matrix to transform from the camera coordinate system to the world coordinate system; Based on the three-dimensional rotation matrix, a spatial rotation transformation is performed on the relative three-dimensional coordinates of the tower feet and the relative three-dimensional coordinates of the tower top center to generate the intermediate transition coordinates of the tower feet after attitude correction and the intermediate transition coordinates of the tower top center after attitude correction. The drone's position data is spatially translated and transformed with the intermediate transition coordinates of the tower foot after attitude correction to generate the absolute three-dimensional coordinates of the tower foot in the world coordinate system. The drone's position data is spatially translated and transformed with the intermediate transition coordinates after attitude correction at the center of the tower top to generate the absolute three-dimensional coordinates of the center of the tower top in the world coordinate system.
[0063] Specifically, in this embodiment, the relative three-dimensional coordinates of the tower base and the tower top center obtained in step S3 are local coordinates relative to the camera center. This local coordinate system will rotate and tilt as the UAV's flight attitude changes. To obtain the true shape of the power transmission tower in geographic space, the data in the camera coordinate system must be uniformly mapped to a static world coordinate system (such as the ENU or NED coordinate system). The coordinate transformation process follows the principle of rigid body transformation, that is, rotation correction is performed first, followed by translation and superposition.
[0064] To achieve accurate spatial mapping, this embodiment employs a solution strategy that combines Euler angle rotation matrices with spatial vector translation. The specific process is as follows: First, based on the UAV's attitude data, a 3D rotation matrix is constructed to transform from the camera coordinate system to the world coordinate system. The UAV's attitude data includes the pitch angle $\theta$ (Pitch), the roll angle $\phi$ (Roll), and the yaw angle $\psi$ (Yaw). These three angular parameters define the rotation relationship between the camera coordinate system and the world coordinate system. Let the rotation matrix around the X-axis be $R_x$, the rotation matrix around the Y-axis be $R_y$, and the rotation matrix around the Z-axis be $R_z$. The 3D rotation matrix $R$ is calculated according to a commonly used rotation rule (e.g., the ZYX rule).
[0065] Under the ZYX rule, the mathematical expression of the 3D rotation matrix $R$ is as follows: $$R = R_z\,R_y\,R_x,\qquad R_x(\alpha)=\begin{bmatrix}1&0&0\\0&c_\alpha&-s_\alpha\\0&s_\alpha&c_\alpha\end{bmatrix},\quad R_y(\alpha)=\begin{bmatrix}c_\alpha&0&s_\alpha\\0&1&0\\-s_\alpha&0&c_\alpha\end{bmatrix},\quad R_z(\alpha)=\begin{bmatrix}c_\alpha&-s_\alpha&0\\s_\alpha&c_\alpha&0\\0&0&1\end{bmatrix}$$ where $c_\alpha$ represents the cosine function $\cos\alpha$ and $s_\alpha$ represents the sine function $\sin\alpha$ of the attitude angle $\alpha$ associated with the respective axis. Each element of the 3D rotation matrix $R$ is calculated from the UAV's attitude data; the 3D rotation matrix $R$ is used to align vector directions in the camera coordinate system with the directions in the world coordinate system.
[0066] Secondly, based on the three-dimensional rotation matrix, the relative three-dimensional coordinates of the tower feet and the relative three-dimensional coordinates of the tower top center are spatially rotated and transformed to generate the intermediate transition coordinates of the tower feet after attitude correction and the intermediate transition coordinates of the tower top center after attitude correction.
[0067] Let the relative three-dimensional coordinates of the tower foot in the camera coordinate system be $P_f^c = (X_f^c, Y_f^c, Z_f^c)^T$. Then the intermediate transition coordinates $P_f'$ of the tower foot after attitude correction are calculated as: $$P_f' = R\,P_f^c$$ Similarly, let the relative three-dimensional coordinates of the tower top center in the camera coordinate system be $P_t^c = (X_t^c, Y_t^c, Z_t^c)^T$. The intermediate transition coordinates $P_t'$ of the tower top center after attitude correction are calculated as: $$P_t' = R\,P_t^c$$ where $R$ is the three-dimensional rotation matrix. In a physical sense, the intermediate transition coordinates of the tower foot and of the tower top center after attitude correction represent spatial vectors in the world coordinate system with the current position of the UAV as the origin.
[0068] Next, the UAV's position data is applied as a spatial translation to the attitude-corrected intermediate transition coordinates of the tower foot to generate the absolute three-dimensional coordinates of the tower foot in the world coordinate system. The UAV position data $T = (T_x, T_y, T_z)^T$ represents the absolute position of the camera's optical center in the world coordinate system. The absolute three-dimensional coordinates $P_f^w$ of the tower foot in the world coordinate system are obtained through vector addition: $$P_f^w = P_f' + T$$ That is: $$X_f^w = X_f' + T_x,\qquad Y_f^w = Y_f' + T_y,\qquad Z_f^w = Z_f' + T_z$$ where $(X_f', Y_f', Z_f')$ are the components of $P_f'$, and $X_f^w$, $Y_f^w$, $Z_f^w$ represent the horizontal, longitudinal, and vertical coordinates of the tower foot in the world coordinate system.
[0069] Finally, the UAV's position data is applied as a spatial translation to the attitude-corrected intermediate transition coordinates of the tower top center to generate the absolute three-dimensional coordinates $P_t^w$ of the tower top center in the world coordinate system. The calculation is analogous to that for the tower foot: $$P_t^w = P_t' + T$$ Through the above coordinate transformation steps, the present invention can effectively eliminate the influence of attitude jitter and position changes of the UAV during hovering or flight on the measurement results, and restore the image-based relative measurement values to absolute physical coordinates in geographic space, providing an objective and unified spatial benchmark for the subsequent calculation of the true verticality and tilt direction of power transmission towers.
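The rotate-then-translate transformation of step S4 can be sketched as follows. This is an illustrative sketch only: it assumes the ZYX rule with yaw about Z, pitch about Y, and roll about X, angles in radians, and camera axes aligned with the UAV body axes (no gimbal or mounting offset). The function names are hypothetical.

```python
import math


def rotation_zyx(yaw, pitch, roll):
    """R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians (assumed convention)."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def camera_to_world(p_cam, R, uav_position):
    """Rotate a camera-frame point into world axes, then add the UAV position."""
    rotated = [sum(R[i][j] * p_cam[j] for j in range(3)) for i in range(3)]
    return tuple(rotated[i] + uav_position[i] for i in range(3))
```

A real system would also need the fixed camera-to-body mounting rotation and any gimbal angles, which the embodiment folds into the attitude data and which this sketch leaves out.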
[0070] Step S5: Determine the bottom reference center coordinates of the transmission tower based on the absolute three-dimensional coordinates of the tower feet; calculate the horizontal projection offset and vertical height difference of the absolute three-dimensional coordinates of the tower top center relative to the bottom reference center coordinates based on the absolute three-dimensional coordinates of the tower top center. In a preferred embodiment, calculating the horizontal projection offset and vertical height difference of the absolute three-dimensional coordinates of the tower top center relative to the bottom reference center coordinates includes: extracting, from the absolute three-dimensional coordinates of the tower top center, the horizontal coordinate component, the longitudinal coordinate component, and the vertical coordinate component of the tower top center; extracting, from the bottom reference center coordinates, the horizontal coordinate component, the longitudinal coordinate component, and the vertical coordinate component of the bottom reference center coordinates; calculating, based on the horizontal and longitudinal coordinate components of the tower top center and the horizontal and longitudinal coordinate components of the bottom reference center coordinates, the horizontal projection offset of the absolute three-dimensional coordinates of the tower top center relative to the bottom reference center coordinates; and calculating, based on the vertical coordinate component of the tower top center and the vertical coordinate component of the bottom reference center coordinates, the vertical height difference between the absolute three-dimensional coordinates of the tower top center and the bottom reference center coordinates.
[0071] Specifically, in this embodiment, the transmission tower is typically supported by four legs, which form the base plane of the tower. To accurately assess the overall tilt of the transmission tower, it is necessary to determine a geometric center point that represents the position of the tower base as a reference. Therefore, the process of determining the coordinates of the base reference center of the transmission tower is essentially calculating the geometric centroids of all the tower legs in the world coordinate system.
[0072] Specifically, assuming the transmission tower has One tower foot (usually) ), and the absolute three-dimensional coordinates of each tower foot are ,in Indicates the first Each tower foot. Base center coordinates of the bottom of the transmission tower. The coordinates of the bottom reference center are obtained by calculating the arithmetic mean of the absolute three-dimensional coordinates of all tower feet. The calculation formula is as follows: Calculated bottom reference center coordinates Recorded as This coordinate represents the theoretical position of the center of the tower top projected onto the ground under ideal conditions where the transmission tower is not tilted.
[0073] After determining the reference point, this embodiment further decomposes the spatial vector to decouple the horizontal displacement from the vertical height. The specific calculation process is as follows: First, the horizontal, longitudinal, and vertical coordinate components of the tower top center are extracted from its absolute three-dimensional coordinates. Let the absolute three-dimensional coordinates of the tower top center be $(X_{top}, Y_{top}, Z_{top})$. Then the horizontal coordinate component of the tower top center is $X_{top}$, the longitudinal coordinate component is $Y_{top}$, and the vertical coordinate component is $Z_{top}$.
[0074] Simultaneously, the horizontal, longitudinal, and vertical coordinate components of the bottom reference center are extracted from the bottom reference center coordinates, that is, the previously calculated $X_{base}$, $Y_{base}$, and $Z_{base}$.
[0075] Next, the horizontal projection offset of the tower top center relative to the bottom reference center is calculated from the horizontal and longitudinal coordinate components of the tower top center and of the bottom reference center. The horizontal projection offset $D$ represents the Euclidean distance between the projection point of the tower top center onto the horizontal plane (XY plane) and the bottom reference center.
[0076] The horizontal projection offset $D$ is calculated as:

$$D = \sqrt{(X_{top} - X_{base})^2 + (Y_{top} - Y_{base})^2}$$

In the formula, $(X_{top} - X_{base})^2$ is the square of the lateral deviation between the tower top center and the bottom reference center, $(Y_{top} - Y_{base})^2$ is the square of the longitudinal deviation, and $\sqrt{\cdot}$ denotes the square root operation. The horizontal projection offset $D$ directly reflects the physical extent to which the transmission tower deviates from the vertical axis.
[0077] Finally, the vertical height difference between the tower top center and the bottom reference center is calculated from the vertical coordinate components of the two points. The vertical height difference $H$ represents the effective vertical height of the transmission tower in its current state.
[0078] The vertical height difference $H$ is calculated as:

$$H = |Z_{top} - Z_{base}|$$

In the formula, $Z_{top}$ and $Z_{base}$ are the vertical coordinate components of the tower top center and the bottom reference center, and $|\cdot|$ denotes the absolute value operation. Although the height of the top of the tower is necessarily greater than the height of the bottom of the tower under normal circumstances, taking the absolute value ensures the non-negativity of the calculation result.
[0079] Through the above steps, this invention successfully decouples the complex three-dimensional spatial positional relationship into two key physical indicators: horizontal projection offset and vertical height difference. This decoupled calculation method eliminates the interference of the shooting perspective on tilt judgment, ensuring that the final tilt data is entirely based on the geometric shape of the transmission tower in the real physical world. This provides accurate right-angled triangle side length data for subsequent precise calculation of the tilt angle using inverse trigonometric functions.
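As a rough illustration of the decoupling described in Step S5, the following Python sketch computes the two indicators from a tower top center and a bottom reference center. The function names and the sample coordinates are hypothetical; the sketch assumes both points are given as (X, Y, Z) tuples in the same world coordinate system with Z as the vertical axis.

```python
import math
from typing import Tuple

# Hypothetical alias: an (X, Y, Z) point in the world coordinate system.
Point3D = Tuple[float, float, float]


def horizontal_projection_offset(top: Point3D, base: Point3D) -> float:
    """Euclidean distance between top and base after projection onto the XY plane."""
    return math.hypot(top[0] - base[0], top[1] - base[1])


def vertical_height_difference(top: Point3D, base: Point3D) -> float:
    """Absolute difference of the vertical (Z) components."""
    return abs(top[2] - base[2])


# Illustrative values only: the top center is shifted 0.3 m in X and 0.4 m in Y.
top_center = (4.3, 4.4, 40.0)
base_center = (4.0, 4.0, 10.0)

offset = horizontal_projection_offset(top_center, base_center)
height = vertical_height_difference(top_center, base_center)
print(f"offset = {offset:.3f} m, height = {height:.3f} m")
```

Because the offset uses only X and Y and the height uses only Z, neither value depends on the direction from which the tower was photographed once the coordinates are in the world frame.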
[0080] Step S6: Determine the tilt of the power transmission tower after the wildfire based on the horizontal projection offset and the vertical height difference.
[0081] In a preferred embodiment, determining the tilt of the transmission tower after a wildfire based on the horizontal projection offset and the vertical height difference includes: calculating the tilt tangent value from the horizontal projection offset and the vertical height difference; and applying the arctangent function to the tilt tangent value to generate the tilt of the power transmission tower after the wildfire.
[0082] Specifically, after the aforementioned calculations, the absolute geometric parameters of the transmission tower in the world coordinate system have been obtained, namely the horizontal projection offset and the vertical height difference. These two parameters form the two legs of a right triangle describing the tilt state of the transmission tower. In order to meet the power industry's evaluation standards for the operating status of transmission towers (usually using tilt rate or tilt angle as indicators), it is necessary to convert the length dimension to the angle dimension.
[0083] The specific calculation process follows the principle of inverse trigonometric functions, aiming to map spatial displacement into angular values. First, the tilt tangent value is calculated based on the horizontal projection offset and the vertical height difference. In the structural model of the transmission tower, the center axis of the tower, the projection line of the tower on the ground, and the ideal vertical reference line together form a virtual right triangle. The horizontal projection offset corresponds to the side opposite the tilt angle, and the vertical height difference corresponds to the adjacent side. The tilt tangent value $\tan\theta$ characterizes the ratio of the lateral offset of the transmission tower to its height.
[0084] The tilt tangent value $\tan\theta$ is calculated as:

$$\tan\theta = \frac{D}{H}$$

In the formula, $D$ is the horizontal projection offset calculated in the preceding steps, and $H$ is the vertical height difference calculated in the preceding steps. The larger the tilt tangent value, the more severe the deviation of the transmission tower from the vertical direction.
[0085] Subsequently, the arctangent function is applied to the tilt tangent value to generate the tilt of the transmission tower after the wildfire. Since the tangent value only reflects a proportional relationship, the arctangent function is used to convert it back into an angle value so that the tilt state of the transmission tower can be presented intuitively.
[0086] The tilt $\theta$ of the transmission tower is calculated as:

$$\theta = \arctan\left(\frac{D}{H}\right) \times \frac{180}{\pi}$$

In the formula, $D/H$ is the tilt tangent value (horizontal projection offset $D$ divided by vertical height difference $H$), $\arctan$ denotes the arctangent function, and $\pi$ is the circle constant (approximately 3.14159). Since computer inverse trigonometric functions typically output results in radians, multiplying by $180/\pi$ converts radians to degrees (°), a unit commonly used in electrical engineering. The calculated tilt $\theta$ is the actual angle between the central axis of the transmission tower and the vertical line of gravity.
[0087] Through the above calculation steps, the present invention can ultimately transform the visual data collected by the UAV into intuitive and quantifiable angular indicators. These indicators can be directly compared with the safety thresholds in the operation and maintenance specifications of power facilities, thereby quickly determining whether the transmission towers after a wildfire are in a critical state, and providing accurate data support for subsequent emergency repair and reinforcement or line shutdown decisions.
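The conversion in Step S6 can be sketched in a few lines of Python. The function name `tilt_angle_degrees` and the sample values are hypothetical; the sketch assumes the horizontal projection offset and the vertical height difference are already available in the same length unit, and it rejects a zero height because the tangent would be undefined.

```python
import math


def tilt_angle_degrees(offset: float, height: float) -> float:
    """Convert offset and height into a tilt angle in degrees."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if height <= 0:
        raise ValueError("height must be positive")
    tangent = offset / height          # tilt tangent value
    radians = math.atan(tangent)       # arctangent returns radians
    return radians * 180.0 / math.pi   # radians to degrees


# Illustrative values only: 0.3 m offset over a 30 m height difference.
angle = tilt_angle_degrees(0.3, 30.0)
print(f"tilt = {angle:.4f} degrees")
```

For small tilts the angle in degrees is roughly the tangent multiplied by 57.3, so a tangent of 0.01 corresponds to a little under 0.6°. An implementation could also call `math.atan2(offset, height)`, which yields the same angle for positive heights without an explicit division.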
[0088] Based on the above method embodiments, the present invention provides corresponding apparatus embodiments.
[0089] As shown in Figure 2, an embodiment of the present invention provides a device for determining the tilt of a power transmission tower, comprising: a data acquisition module, a stereo matching module, a coordinate system transformation module, and a tilt determination module. The data acquisition module is used to acquire first image data of the power transmission tower collected by the drone, second image data of the power transmission tower collected by the drone, attitude data of the drone, and position data of the drone. The stereo matching module is used to input the first image data of the power transmission tower into a preset key point detection model, so that the key point detection model outputs the pixel coordinates of the key points of the power transmission tower according to the first image data, wherein the key points include the tower foot and the tower top center; and, based on preset camera calibration parameters, to perform stereo matching on the second image data of the power transmission tower according to the pixel coordinates of the tower foot and the pixel coordinates of the tower top center, so as to determine the relative three-dimensional coordinates of the tower foot and the relative three-dimensional coordinates of the tower top center in the camera coordinate system. The coordinate system transformation module is used to perform coordinate transformation on the relative three-dimensional coordinates of the tower foot and the relative three-dimensional coordinates of the tower top center based on the attitude data and position data of the drone, so as to generate the absolute three-dimensional coordinates of the tower foot and the absolute three-dimensional coordinates of the tower top center in the world coordinate system. 
The tilt determination module is used to determine the bottom reference center coordinates of the transmission tower based on the absolute three-dimensional coordinates of the tower feet; calculate and generate the horizontal projection offset and vertical height difference of the absolute three-dimensional coordinates of the tower top center relative to the bottom reference center coordinates based on the absolute three-dimensional coordinates of the tower top center; and determine the tilt of the transmission tower after the wildfire based on the horizontal projection offset and the vertical height difference.
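As an illustration of what the coordinate system transformation module does, the following Python sketch rotates a camera-frame point by the drone attitude and then translates it by the drone position. Everything specific here is an assumption made for the example and is not stated in the source: the attitude is taken as yaw, pitch, and roll angles in radians composed in Z-Y-X order, the camera frame is assumed to coincide with the drone body frame (a real system would also need the camera mounting or gimbal extrinsics), and the names `rotation_matrix_zyx` and `camera_to_world` are hypothetical.

```python
import math
from typing import List, Tuple

# Hypothetical alias: an (X, Y, Z) vector.
Vec3 = Tuple[float, float, float]


def rotation_matrix_zyx(yaw: float, pitch: float, roll: float) -> List[List[float]]:
    """Build R = Rz(yaw) * Ry(pitch) * Rx(roll); angles in radians (assumed convention)."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def camera_to_world(p_cam: Vec3, rotation: List[List[float]], position: Vec3) -> Vec3:
    """Rotate a camera-frame point, then translate it by the drone position."""
    rotated = [sum(rotation[r][c] * p_cam[c] for c in range(3)) for r in range(3)]
    return (
        rotated[0] + position[0],
        rotated[1] + position[1],
        rotated[2] + position[2],
    )


# Illustrative values only: drone yawed by 90 degrees at position (100, 200, 50).
R = rotation_matrix_zyx(math.radians(90.0), 0.0, 0.0)
p_world = camera_to_world((1.0, 0.0, 0.0), R, (100.0, 200.0, 50.0))
print(p_world)
```

The same function would be applied to each tower foot and to the tower top center, so that all key points share one world frame before the tilt determination module computes the offset and height difference.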
[0090] It should be noted that the embodiments of the device described above correspond to the embodiments of the present invention described above, and can realize the method for determining the tilt of transmission towers as described in any one of the above embodiments of the present invention. Furthermore, the embodiments of the device described above are merely illustrative. The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. In addition, in the accompanying drawings of the device embodiments provided by the present invention, the connection relationship between modules indicates that they have a communication connection, which can be specifically implemented as one or more communication buses or signal lines. Those skilled in the art can understand and implement this without creative effort.
[0091] In the description of this specification, the references to terms such as "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of this application. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples. Moreover, without contradiction, those skilled in the art can combine and integrate the different embodiments or examples described in this specification, as well as the features of those different embodiments or examples.
[0092] The above description represents the preferred embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principles of the present invention, and these improvements and modifications are also considered to be within the scope of protection of the present invention.
Claims
1. A method for determining the tilt angle of a power transmission tower, characterized in that it comprises: Acquire first image data of the power transmission tower collected by the drone, second image data of the power transmission tower collected by the drone, attitude data of the drone, and position data of the drone; The first image data of the power transmission tower is input into a preset key point detection model, so that the key point detection model outputs the pixel coordinates of the key points of the power transmission tower based on the first image data; wherein, the key points include the tower feet and the center of the tower top; Based on preset camera calibration parameters, stereo matching is performed on the second image data of the power transmission tower according to the pixel coordinates of the tower foot and the pixel coordinates of the tower top center to determine the relative three-dimensional coordinates of the tower foot and the relative three-dimensional coordinates of the tower top center in the camera coordinate system. Based on the attitude data and position data of the UAV, coordinate transformation is performed on the relative three-dimensional coordinates of the tower foot and the relative three-dimensional coordinates of the tower top center to generate the absolute three-dimensional coordinates of the tower foot and the absolute three-dimensional coordinates of the tower top center in the world coordinate system. Based on the absolute three-dimensional coordinates of the tower feet, determine the bottom reference center coordinates of the transmission tower; based on the absolute three-dimensional coordinates of the tower top center, calculate and generate the horizontal projection offset and vertical height difference of the absolute three-dimensional coordinates of the tower top center relative to the bottom reference center coordinates. 
The tilt of the transmission tower is determined based on the horizontal projection offset and the vertical height difference.
2. The method for determining the tilt angle of a transmission tower as described in claim 1, characterized in that, The pre-defined keypoint detection model is trained using the following methods: Obtain a 3D model of the power transmission tower; Based on the 3D model of the power transmission tower, preset color space offset parameters, preset semi-transparent smoke layer parameters, and preset noise parameters, a virtual scene is rendered to generate several virtual image samples. Based on the 3D model of the power transmission tower, key points are annotated for each virtual image sample, generating key point annotation data corresponding to each virtual image sample. Based on each virtual image sample and the corresponding key point annotation data, several training samples are constructed; wherein, each training sample includes a virtual image sample and a corresponding annotation label; the annotation label is the key point annotation data corresponding to the virtual image sample; Each training sample is sequentially input into the keypoint detection model to train the model until a preset convergence condition is met. The keypoint detection model outputs a predicted keypoint corresponding to each training sample received. A loss function value is calculated based on the predicted keypoint and its corresponding label. The keypoint detection model is then updated based on the loss function value. The loss function value includes a keypoint coordinate regression loss value and a bounding box regression loss value, with the weight of the keypoint coordinate regression loss value being greater than the weight of the bounding box regression loss value.
3. The method for determining the tilt angle of a transmission tower as described in claim 2, characterized in that, Obtain a 3D model of the power transmission tower, including: Obtain the standard design parameters of the transmission tower; wherein, the standard design parameters include the nominal height of the tower, the total height of the tower, the distance between the bases of the tower legs, the number of crossarm layers, the length of the crossarm, and the specifications of the insulator string; Based on the nominal height of the tower, the total height of the tower, and the distance between the bases of the tower legs, a main skeleton model of the power transmission tower is constructed. Based on the number of crossarm layers and the length of the crossarm, a crossarm model of the transmission tower is constructed; Based on the insulator string specifications, construct an insulator string model for the transmission tower; A three-dimensional model of the transmission tower is generated based on the main frame model, the crossarm model, and the insulator string model.
4. The method for determining the tilt angle of a transmission tower as described in claim 3, characterized in that, Based on the 3D model of the power transmission tower, preset color space offset parameters, preset semi-transparent smoke layer parameters, and preset noise parameters, virtual scene rendering is performed to generate several virtual image samples, including: Configure the metal material properties on the 3D model of the power transmission tower to generate the basic tower model; Based on preset color space offset parameters, the diffuse color channel of the metal material properties of the basic iron tower model is adjusted to generate an iron tower model with high-temperature burning characteristics. A virtual smoke model is constructed based on preset semi-transparent smoke layer parameters; The virtual smoke model is configured between the iron tower model with high-temperature burning characteristics and the virtual camera to construct a virtual rendering scene that includes smoke occlusion effect; Based on a virtual camera, the virtual rendering scene is captured from multiple perspectives to generate an initial rendered image; Based on preset noise parameters, noise is processed on the initial rendered image to generate virtual image samples.
5. The method for determining the tilt angle of a transmission tower as described in claim 4, characterized in that, Based on the 3D model of the power transmission tower, key points are annotated for each virtual image sample, generating key point annotation data corresponding to each virtual image sample, including: Based on the three-dimensional model of the transmission tower, determine the model space coordinates of the tower feet and the model space coordinates of the tower top center in the three-dimensional model of the transmission tower; For each virtual image sample, obtain the pose transformation matrix and intrinsic parameter projection matrix of the virtual camera when acquiring the current virtual image sample, and use them as the target pose transformation matrix and target intrinsic parameter projection matrix. Based on the target pose transformation matrix and the target intrinsic parameter projection matrix, perspective projection transformation is performed on the model space coordinates of the tower foot and the model space coordinates of the tower top center, respectively, to generate the projected pixel coordinates of the tower foot on the image plane where the current virtual image sample is located, and the projected pixel coordinates of the tower top center on the image plane where the current virtual image sample is located. The projected pixel coordinates of the tower foot on the image plane where the current virtual image sample is located, and the projected pixel coordinates of the tower top center on the image plane where the current virtual image sample is located, are used as the key point annotation data corresponding to the current virtual image sample.
6. The method for determining the tilt angle of a transmission tower as described in claim 5, characterized in that, Based on preset camera calibration parameters, stereo matching is performed on the second image data of the transmission tower according to the pixel coordinates of the tower feet and the pixel coordinates of the tower top center to determine the relative three-dimensional coordinates of the tower feet and the relative three-dimensional coordinates of the tower top center in the camera coordinate system, including: Based on the pixel coordinates of the tower feet and the pixel coordinates of the tower top center, feature blocks of a preset size are extracted from the first image data of the power transmission tower to generate tower foot matching templates and tower top center matching templates. Based on the pre-set epipolar correction parameters in the camera calibration parameters, the first epipolar search area corresponding to the pixel coordinates of the tower foot and the second epipolar search area corresponding to the pixel coordinates of the tower top center are determined in the second image data of the power transmission tower. The sliding similarity calculation of the tower foot matching template is performed within the first epipolar search area to determine the best matching pixel coordinates of the tower foot in the second image data. The matching template of the tower top center is used to perform sliding similarity calculation within the second epipolar line search area to determine the best matching pixel coordinates of the tower top center in the second image data. 
Based on the pixel coordinates of the tower foot, the pixel coordinates of the tower top center, the best matching pixel coordinates of the tower foot in the second image data, the best matching pixel coordinates of the tower top center in the second image data, and the baseline length and focal length parameters in the preset camera calibration parameters, triangulation is performed to generate the relative three-dimensional coordinates of the tower foot and the relative three-dimensional coordinates of the tower top center in the camera coordinate system.
7. The method for determining the tilt angle of a transmission tower as described in claim 6, characterized in that, Based on the attitude data and position data of the UAV, coordinate transformation is performed on the relative three-dimensional coordinates of the tower foot and the relative three-dimensional coordinates of the tower top center to generate the absolute three-dimensional coordinates of the tower foot and the absolute three-dimensional coordinates of the tower top center in the world coordinate system, including: Based on the attitude data of the UAV, construct a three-dimensional rotation matrix for transforming from the camera coordinate system to the world coordinate system; Based on the three-dimensional rotation matrix, a spatial rotation transformation is performed on the relative three-dimensional coordinates of the tower foot and the relative three-dimensional coordinates of the tower top center to generate the attitude-corrected intermediate transition coordinates of the tower foot and the attitude-corrected intermediate transition coordinates of the tower top center. Based on the position data of the UAV, a spatial translation transformation is performed on the attitude-corrected intermediate transition coordinates of the tower foot to generate the absolute three-dimensional coordinates of the tower foot in the world coordinate system. Based on the position data of the UAV, a spatial translation transformation is performed on the attitude-corrected intermediate transition coordinates of the tower top center to generate the absolute three-dimensional coordinates of the tower top center in the world coordinate system.
8. The method for determining the tilt angle of a transmission tower as described in claim 7, characterized in that, Based on the absolute three-dimensional coordinates of the tower top center, calculate the horizontal projection offset and vertical height difference of the absolute three-dimensional coordinates of the tower top center relative to the bottom reference center coordinates, including: Extract the horizontal coordinate components, longitudinal coordinate components, and vertical coordinate components of the tower top center from the absolute three-dimensional coordinates of the tower top center. Extract the horizontal coordinate components, longitudinal coordinate components, and vertical coordinate components of the bottom reference center coordinates from the bottom reference center coordinates. Based on the horizontal coordinate components of the tower top center, the longitudinal coordinate components of the tower top center, the horizontal coordinate components of the bottom reference center coordinates, and the longitudinal coordinate components of the bottom reference center coordinates, calculate and generate the horizontal projection offset of the absolute three-dimensional coordinates of the tower top center relative to the bottom reference center coordinates. Based on the vertical coordinate components of the tower top center and the vertical coordinate components of the bottom reference center coordinates, calculate the vertical height difference between the absolute three-dimensional coordinates of the tower top center and the bottom reference center coordinates.
9. The method for determining the tilt angle of a transmission tower as described in claim 8, characterized in that, Based on the horizontal projection offset and the vertical height difference, the tilt of the power transmission tower after the wildfire is determined, including: The tilt tangent value is calculated based on the horizontal projection offset and the vertical height difference. An arctangent function operation is performed on the tilt tangent value to generate the tilt of the power transmission tower after the wildfire.
10. A device for determining the tilt angle of a power transmission tower, characterized in that it comprises: a data acquisition module, a stereo matching module, a coordinate system transformation module, and a tilt determination module; The data acquisition module is used to acquire first image data of the power transmission tower collected by the drone, second image data of the power transmission tower collected by the drone, attitude data of the drone, and position data of the drone. The stereo matching module is used to input the first image data of the power transmission tower into a preset key point detection model, so that the key point detection model outputs the pixel coordinates of the key points of the power transmission tower according to the first image data; wherein, the key points include the tower foot and the tower top center; based on preset camera calibration parameters, stereo matching is performed on the second image data of the power transmission tower according to the pixel coordinates of the tower foot and the pixel coordinates of the tower top center to determine the relative three-dimensional coordinates of the tower foot and the relative three-dimensional coordinates of the tower top center in the camera coordinate system; The coordinate system transformation module is used to perform coordinate transformation on the relative three-dimensional coordinates of the tower foot and the relative three-dimensional coordinates of the tower top center based on the attitude data and position data of the UAV, so as to generate the absolute three-dimensional coordinates of the tower foot and the absolute three-dimensional coordinates of the tower top center in the world coordinate system. 
The tilt determination module is used to determine the bottom reference center coordinates of the transmission tower based on the absolute three-dimensional coordinates of the tower feet; calculate and generate the horizontal projection offset and vertical height difference of the absolute three-dimensional coordinates of the tower top center relative to the bottom reference center coordinates based on the absolute three-dimensional coordinates of the tower top center; and determine the tilt of the transmission tower after the wildfire based on the horizontal projection offset and the vertical height difference.