A method for intelligent inspection of hydropower stations based on unmanned aerial vehicles (UAVs)

By employing edge-end collaborative data acquisition and multimodal data fusion, the problem of insufficient coordination between UAVs and fixed monitoring equipment in hydropower station inspections was solved, enabling efficient and accurate equipment defect identification and risk warning, and adapting to the weak signal environment in remote areas.

CN122308422A · Pending · Publication Date: 2026-06-30 · GUANGXI GUIGUAN ELECTRIC POWER CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: GUANGXI GUIGUAN ELECTRIC POWER CO LTD
Filing Date: 2026-04-10
Publication Date: 2026-06-30

Application Information

Patent Timeline

10 Apr 2026: Application
30 Jun 2026: Publication (CN122308422A)

IPC: G05D1/495; G05D1/46; G05D101/15; G05D109/20

AI Tagging

Technology Topics

Power station Simulation


AI Technical Summary

Technical Problem

In existing technologies, the lack of coordination and integration between drones and fixed monitoring equipment in hydropower station inspections leads to inconsistent data collection, high false alarm rates, low positioning accuracy, and delayed response when the signal is poor, posing safety hazards.

Method used

We employ edge-based region partitioning and optimization scheduling algorithms to coordinate UAVs and fixed monitoring equipment in collaborative data collection. We use the NeRF algorithm to construct dynamic environmental features, combine them with Transformer's multimodal fusion and cross-validation algorithms for analysis, and use a pre-trained temporal attention LSTM model for risk prediction to optimize the inspection strategy.

Benefits of technology

It improves the accuracy of equipment defect identification, reduces false alarm rate, shortens early warning response time, adapts to the weak signal problem in remote areas, and ensures stable transmission and processing of inspection data.

✦ Generated by Eureka AI based on patent content.

Smart Images

Figure CN122308422A_ABST

Patent Text Reader

Abstract

This invention belongs to the field of power facility inspection technology, specifically relating to an intelligent inspection method for hydropower stations based on unmanned aerial vehicles (UAVs). The method includes: edge-end collaborative scheduling of UAVs and fixed monitoring equipment to collect data; extraction of dynamic target generation masks based on inter-frame difference and optical flow methods; construction of a multimodal NeRF rendering loss function with temperature physical constraints using dynamic and static differentiated radiation field sampling; lightweight dynamic environment feature modeling achieved by combining network pruning and pre-trained feature reuse; generation of inspection strategies based on dynamic environment features and multimodal fusion analysis through Transformer; and prediction of risks and generation of optimization strategies using a cloud-based temporal attention LSTM model, which is then fed back to the edge end. This invention achieves deep fusion of multi-source data and real-time edge-end response, significantly improving inspection accuracy and efficiency, reducing operation and maintenance costs, and adapting to complex power station scenarios.


Description

Technical Field

[0001] This invention belongs to the field of power facility inspection technology, specifically relating to an intelligent inspection method for hydropower stations based on unmanned aerial vehicles (UAVs).

Background Technology

[0002] With the development of drones, high-definition cameras, and artificial intelligence, power facility inspection is becoming increasingly automated and intelligent. The industry already uses drones for aerial inspection of facilities such as dams and photovoltaic arrays, and deploys fixed high-point cameras for all-weather area monitoring. However, in applications such as hydro-solar hybrid power plants, which are characterized by complex scenarios, numerous devices, and extremely high safety requirements, existing technical solutions still have the following problems. First, drones and fixed monitoring equipment such as cameras are usually deployed and operated independently, without a unified intelligent scheduling and task coordination mechanism. This creates blind spots and overlaps in the monitored area, and the collected data is asynchronous in time and difficult to correlate in space. For example, when a drone detects a suspected defect, nearby fixed cameras cannot immediately be directed to observe it continuously from multiple angles. Second, the detection data from drones and the monitoring data from fixed monitoring equipment are isolated from each other. Existing systems often analyze a single data source or simply present data side by side, and lack deep fusion and cross-validation. For instance, they cannot accurately match the local high-definition images captured by drones with the global temperature field changes captured by fixed cameras in time and space, which results in a high false alarm rate and low positioning accuracy when identifying thermal defects and mechanical faults, and makes reliable early warning judgments difficult. Third, to meet these needs for data fusion and complex analysis, existing solutions typically transmit all raw data back to a cloud center for processing. This highly centralized model relies heavily on network bandwidth and introduces significant processing latency. Furthermore, at remote power plant sites where signals are weak, the long chain from data acquisition and uploading through central analysis to result feedback slows system response and poses safety risks.

Summary of the Invention

[0003] To address the aforementioned problems in the existing technology, this invention provides an intelligent inspection method for hydropower stations based on unmanned aerial vehicles (UAVs). This method solves the problems in existing solutions, such as the lack of coordination and integration between data collected by UAVs and fixed monitoring data, and the reliance on the cloud, which can lead to response delays when the signal is poor.

[0004] The objective of this invention is achieved through the following technical solution: a method for intelligent inspection of hydropower stations based on unmanned aerial vehicles (UAVs), comprising the following steps:
S1: In response to an inspection event, the edge device uses a region division and optimization scheduling algorithm to schedule UAVs and gimbal-equipped fixed monitoring equipment to collaboratively collect data on the target area.
S2: The NeRF algorithm is used to construct dynamic environmental features of the target area based on the collaboratively collected data and electronic maps. The dynamic environmental features include three-dimensional geometric information, dynamic target motion trajectory information, and temperature field distribution information.
S3: Based on the dynamic environmental features, the inspection route of the UAV and the monitoring strategy of the fixed monitoring equipment are generated and executed, multimodal data is obtained from the execution results, and the data is analyzed with the Transformer multimodal fusion and cross-validation algorithm to obtain the inspection results of the target area.
S4: The inspection results are uploaded to the cloud. The cloud uses a pre-trained temporal attention LSTM prediction model to predict risks from the inspection results and generate optimization strategies that guide subsequent inspections.

[0005] Preferably, in S1, the scheduling algorithm adopts a dynamic optimization scheduling model based on multi-objective optimization, and the specific process includes: defining decision variables, including normalized UAV task allocation variables, UAV path variables, fixed monitoring equipment task allocation variables, and fixed monitoring equipment gimbal adjustment variables; constructing a weighted cost function by weighted fusion, with time efficiency, system energy consumption, coverage quality, and network transmission quality as sub-functions; setting constraints based on UAV endurance, task time windows, the monitoring range of the fixed monitoring equipment, the requirement that every task must be monitored, and signal strength; and solving the dynamic optimization scheduling model with an improved multi-objective particle swarm optimization algorithm to obtain the collaborative acquisition scheme.

[0006] Preferably, the weighted cost function $J$ is calculated as the weighted sum of the four sub-functions.

[0007] The four weighting coefficients balance the importance of the four objectives and can be adjusted dynamically based on task urgency, signal strength, and other factors. The four sub-function terms, each normalized to a common range, represent the time efficiency target, system energy consumption target, coverage quality target, and network transmission target, respectively.
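The weighted fusion described above can be sketched as follows. This is a minimal illustration, not the patent's formula: the `[0, 1]` normalization range, the treatment of coverage as a cost (`1 - quality`), and the weight adjustment rule in `adjust_weights` are all assumptions introduced here, since the original formula and symbols are not available in this text.

```python
from dataclasses import dataclass


@dataclass
class Objectives:
    """Four sub-objectives, each assumed to be normalized to [0, 1]."""
    time: float           # time efficiency target (lower is better)
    energy: float         # system energy consumption target
    coverage_cost: float  # assumed to be 1 - coverage quality, so lower is better
    network: float        # network transmission target


def weighted_cost(obj, weights):
    """Return the weighted sum of the four sub-objectives."""
    if len(weights) != 4:
        raise ValueError("expected four weights")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]  # rescale so the weights sum to 1
    terms = (obj.time, obj.energy, obj.coverage_cost, obj.network)
    return sum(w * t for w, t in zip(norm, terms))


def adjust_weights(weights, urgent, weak_signal):
    """Hypothetical adjustment rule: urgency favors time, weak signal favors network."""
    w_time, w_energy, w_cov, w_net = weights
    if urgent:
        w_time *= 2.0
    if weak_signal:
        w_net *= 2.0
    return [w_time, w_energy, w_cov, w_net]
```

With equal weights the cost is simply the mean of the four terms; doubling the network weight under weak signal makes data-heavy schedules look more expensive, which is the qualitative behavior the text describes.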

[0008] Preferably, the coverage quality target is calculated from the coverage quality function for drones and the coverage quality function for fixed monitoring equipment. In the formula, 'o' is the steepness parameter of the sigmoid function and controls how sensitive quality is to height; the signal strength coefficient at task point i enters the calculation; m is the total number of drones and n is the total number of tasks to be executed; and the remaining variables are the task assignment variable of the fixed monitoring equipment and the task assignment variable of drone u.

[0009] Preferably, step S2 specifically includes the following sub-steps: preprocessing the collaboratively collected data to construct a spatiotemporally aligned dataset; initially screening dynamic candidate regions with the inter-frame difference method combined with the OTSU adaptive threshold segmentation algorithm, and correcting the noise points and blurred boundaries of the dynamic candidate regions with the optical flow algorithm to obtain the final dynamic region mask; and, using the final dynamic region mask as a spatial constraint, applying the NeRF algorithm with fine-level sampling on the dynamic target region and coarse-level sampling on the static background region.
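The initial screening step (three-frame difference followed by OTSU thresholding) can be sketched with NumPy alone. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the "logical AND" of two grayscale difference images is approximated by their element-wise minimum, and pixels strictly above the threshold are treated as dynamic.

```python
import numpy as np


def otsu_threshold(img):
    """Return the Otsu threshold of a uint8 image (maximizes between-class variance)."""
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                 # probability of the "background" class
    mu = np.cumsum(prob * np.arange(256))   # cumulative mean gray level
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu[-1] * omega - mu) ** 2 / denom
    sigma_b = np.nan_to_num(sigma_b, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(sigma_b))


def abs_diff(a, b):
    """Absolute gray-level difference, computed in int16 to avoid uint8 wrap-around."""
    return np.abs(a.astype(np.int16) - b.astype(np.int16)).astype(np.uint8)


def three_frame_mask(prev, cur, nxt):
    """Preliminary dynamic-candidate mask from three consecutive grayscale frames."""
    d1 = abs_diff(cur, prev)
    d2 = abs_diff(nxt, cur)
    combined = np.minimum(d1, d2)           # assumed grayscale analogue of a logical AND
    return combined > otsu_threshold(combined)
```

Only pixels that change in both frame pairs survive the combination, which is what suppresses one-off changes such as a lighting flicker in a single frame.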

[0010] Preferably, S2 further includes constructing a multimodal rendering loss function for the NeRF algorithm using temperature physics constraints.

[0011] Preferably, in S2, the redundant convolutional layers of the MIP network of the NeRF algorithm are pruned using L1 regularization and the pre-trained features of the background static region are reused.

[0012] Preferably, in S3, the UAV inspection route and the monitoring strategy of the fixed monitoring equipment are generated from the dynamic environmental features as follows: view coverage analysis data is extracted from the dynamic environmental features, and initial gimbal adjustment parameters, including rotation angle and rotation time, are assigned to the fixed monitoring equipment; static obstacle boundaries, dynamic target motion trajectories, and equipment distribution coordinates are extracted from the dynamic environmental features, and the NRBO algorithm is used to determine the dynamic flight path of the UAV; based on the time sequence of the nodes on the dynamic flight path and the fixed monitoring equipment located at the corresponding nodes, the initial gimbal adjustment parameters are corrected with a greedy algorithm to obtain the corrected gimbal adjustment parameters; and the inspection flight path of the UAV and the monitoring strategy of the fixed monitoring equipment are generated from the dynamic flight path and the corrected gimbal adjustment parameters, respectively.

[0013] Preferably, in step S3, the NRBO algorithm is solved with a population optimization algorithm to determine the dynamic flight path of the UAV, including the following process: the flight path population is initialized, and multiple initial paths that avoid static obstacle boundaries are generated based on the distribution of task points; iterative optimization is then performed, in which each iteration rearranges route nodes, smooths the paths, and calculates the multi-objective function value of each dynamic route; the non-dominated solution with the smallest multi-objective function value is retained, and the next-generation population is generated through crossover and mutation until the iteration converges and the dynamic route is output.

[0014] The beneficial effects of this invention are as follows. By integrating multimodal data obtained from fixed monitoring equipment and UAVs, and by combining multimodal data with physical constraints as a dual safeguard, this invention improves the accuracy of temperature identification and reduces the false alarm rate for the thermal and structural equipment defects that are common in hydropower stations. Through precise extraction and trajectory modeling of dynamic target areas, it effectively identifies dynamic safety hazards such as bird intrusions and personnel violations, shortens the early warning response time, and avoids equipment damage caused by interference from dynamic targets. Furthermore, through differentiated sampling and model lightweighting, the NeRF model can run efficiently at the edge without relying on cloud computing power. This addresses the high latency and high bandwidth dependence encountered in remote areas, avoids the weak-signal transmission bottleneck of remote power stations, and ensures stable transmission and processing of inspection data.

Attached Figure Description

[0015] To facilitate understanding by those skilled in the art, the present invention will be further described below with reference to the accompanying drawings.

[0016] Figure 1 is a schematic diagram of the method steps of the present invention.

Detailed Implementation

[0017] To further illustrate the technical means and effects of the present invention in achieving the intended purpose, the following detailed description of the specific implementation methods, structures, features and effects of the present invention, in conjunction with the accompanying drawings and preferred embodiments, is provided.

[0018] Referring to Figure 1, this embodiment provides a method for intelligent inspection of hydropower stations based on unmanned aerial vehicles (UAVs):
S1: In response to an inspection event, the edge device uses a region division and optimization scheduling algorithm to schedule UAVs and gimbal-equipped fixed monitoring equipment to collaboratively collect data on the target area.
S2: The NeRF algorithm is used to construct dynamic environmental features of the target area based on the collaboratively collected data and electronic maps. The dynamic environmental features include three-dimensional geometric information, dynamic target motion trajectory information, and temperature field distribution information.
S3: Based on the dynamic environmental features, the inspection route of the UAV and the monitoring strategy of the fixed monitoring equipment are generated and executed, multimodal data is obtained from the execution results, and the data is analyzed with the Transformer multimodal fusion and cross-validation algorithm to obtain the inspection results of the target area.
S4: The inspection results are uploaded to the cloud. The cloud uses a pre-trained temporal attention LSTM prediction model to predict risks from the inspection results and generate optimization strategies that guide subsequent inspections.
The inspection results include a defect information set, dynamic environmental features, and cloud processing logs. The defect information set includes the spatial location of each defect (3D geometric features from S2 mapped to GIS coordinates), its type (thermal defect, structural defect, foreign object intrusion, etc.), its severity (a confidence score from the Transformer multimodal cross-validation), and a timestamp. The environmental features include key parameters of the 3D geometric features generated by S2 (such as the point cloud curvature and volume of the defect area), temperature field distribution features (such as temperature peaks and gradients in abnormal temperature areas), and dynamic target trajectory features (such as the spatial coordinates and movement speed of frequently appearing birds). The cloud processing logs include metadata such as the equipment scheduling parameters for this inspection, the NeRF model sampling density, and the confidence threshold of the multimodal fusion algorithm.
An incremental coding compression algorithm is used to process the structured data, so that only incremental data that differs from historical inspection results is uploaded. For feature data of static background areas, historical snapshots stored in the cloud are reused to reduce the data transmission volume. Check codes and timestamp tags are also added to the data to ensure the integrity and time-sequence consistency of the data received by the cloud. The edge device uploads the encapsulated inspection result data packet to the cloud center through an NB-IoT or satellite communication module.
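The incremental upload idea (send only what changed, tagged with a check code and timestamp) can be sketched as below. This is a simplified illustration, not the patent's encoding: `zlib` compression, a CRC32 check code, JSON serialization, and the field names in the usage note are all assumptions chosen for the example.

```python
import json
import time
import zlib


def build_delta_packet(current, previous, timestamp=None):
    """Pack only the fields that differ from the last uploaded inspection result."""
    delta = {k: v for k, v in current.items() if previous.get(k) != v}
    removed = [k for k in previous if k not in current]
    body = {
        "ts": time.time() if timestamp is None else timestamp,
        "delta": delta,
        "removed": removed,
    }
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    compressed = zlib.compress(payload)
    return {"crc32": zlib.crc32(compressed), "data": compressed}


def apply_delta_packet(packet, previous):
    """Verify the check code, then rebuild the full result from the stored snapshot."""
    if zlib.crc32(packet["data"]) != packet["crc32"]:
        raise ValueError("checksum mismatch: packet corrupted in transit")
    body = json.loads(zlib.decompress(packet["data"]))
    merged = {k: v for k, v in previous.items() if k not in body["removed"]}
    merged.update(body["delta"])
    return merged, body["ts"]
```

For example, if only `defect_count` changed between two inspections while `background_hash` stayed the same, the packet carries the single changed field and the receiving side restores the rest from its stored snapshot.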

[0019] By using a pre-trained temporal attention LSTM prediction model to fuse real-time inspection results with historical time-series data, a multi-dimensional risk prediction model for the development of power plant equipment defects and the evolution of safety hazards is realized, and risk prediction analysis is performed on the uploaded data.

[0020] The cloud-based layered optimization strategy is encrypted and encapsulated, then transmitted to the edge via a two-way communication link. The edge updates its local algorithm parameters and device scheduling rules based on the strategy, forming a closed-loop inspection system from edge data collection and analysis to cloud-based prediction and decision-making, and finally to edge-based optimization execution. The cloud, based on the edge's device type and computing power configuration, breaks down the optimization strategy into different execution command packages: flight path adjustment parameters for the UAV controller, gimbal angle and monitoring frequency parameters for fixed monitoring equipment, and NeRF sampling density and model training parameters for edge computing nodes. Upon receiving the strategy, the edge automatically parses the commands and updates its local algorithm configuration library without manual intervention. The edge then executes the next inspection task according to the optimization strategy, uploading the new inspection results back to the cloud. The cloud integrates the new data into the LSTM model's training set, updating model parameters through incremental learning to improve the accuracy of subsequent risk predictions. Simultaneously, the cloud periodically evaluates the execution effectiveness of historical strategies.

[0021] The detailed process for each step is as follows. In S1, the scheduling algorithm adopts a dynamic optimization scheduling model based on multi-objective optimization. The specific process includes:
S101: Define decision variables, including normalized UAV task allocation variables, UAV path order variables, fixed monitoring equipment task allocation variables, and fixed monitoring equipment gimbal adjustment variables.
S102: Construct a weighted cost function by weighted fusion, with time efficiency, system energy consumption, coverage quality, and network transmission quality as sub-function terms. The cost function $J$ is the weighted sum of these four terms. The four weighting coefficients balance the importance of the four objectives and can be adjusted dynamically based on task urgency, signal strength, and other factors. The four sub-function terms, each normalized to a common interval, represent the time efficiency target, system energy consumption target, coverage quality target, and network transmission target. The targets are calculated as follows.
Time efficiency target: m represents the total number of drones, and n represents the total number of tasks to be performed. The calculation uses the Euclidean distance between task point i and task point j; the flight speed of the drone; the drone path order variable, which takes the value 1 if the drone completes task i and then proceeds to task j, and 0 otherwise; the fixed amount of time required to complete task i, including actions such as hovering and shooting; and the task assignment variable of drone u, which is 1 if drone u is assigned to execute task i, and 0 otherwise.
System energy consumption target: the terms of the calculation include the energy consumption per unit angle of gimbal rotation for the fixed monitoring device f and the change in gimbal rotation angle required to align with task point i; the task assignment variable of the fixed monitoring equipment, which is 1 if the fixed monitoring equipment f is assigned to monitoring task i, and 0 otherwise; and the gimbal adjustment variable of the fixed monitoring equipment.
Coverage quality target: this target combines the coverage quality function for drones and the coverage quality function for fixed monitoring equipment. The coverage quality function for drones uses the current flight altitude of the drone; the optimal observation angle; 'o', the steepness parameter of the sigmoid function, which controls how sensitive quality is to altitude; the signal strength coefficient at task point i, which is obtained by mapping the real-time received signal strength indicator at task point i (a negative value whose smaller absolute value means a stronger signal, with different signal strength values mapped to different coefficient values); the distance from the current location of drone u to task point i; and a very small constant that prevents the denominator from being zero. The coverage quality function for fixed monitoring equipment uses the maximum effective monitoring distance of the fixed monitoring device f and the cosine of the change in the gimbal rotation angle, which penalizes image instability caused by large rotations.
Network transmission target: the calculation uses the expected amount of data generated when task point i is executed. It penalizes tasks that generate a large amount of data in areas with weak signals (small signal strength coefficient), so that data-heavy acquisition tasks are assigned to equipment or areas with better signal strength.
The multi-objective cost function takes the real-time signal strength at each task point as a key variable and couples it dynamically into the coverage quality and network transmission sub-objectives through the defined signal strength coefficient, and the coverage quality function is designed in product form. This forces the planning scheme to seek a jointly optimal solution across safe distance, observation angle, and signal quality, rather than a simple compromise. It resolves the inherent contradiction of existing scheduling methods in remote, weak-signal, and complex power plant scenarios, which ignore communication constraints and result in either "data being collected but not transmitted back" or "inspection quality being sacrificed for data transmission." The system can adaptively generate scheduling strategies that reduce invalid UAV flight mileage and redundant equipment energy consumption while ensuring real-time transmission of key data. Furthermore, by applying signal-aware, focused collaborative observation to high-risk areas, the system improves overall inspection efficiency while maintaining responsiveness.
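The time efficiency term can be illustrated from the variables the text lists (inter-task Euclidean distance, flight speed, the path order variable, per-task fixed time, and the task assignment variable). The sketch below is one plausible reading, flight time plus on-task time per UAV, and is an assumption rather than the patent's formula; the function name and array layout are hypothetical.

```python
import numpy as np


def uav_task_time(points, speed, path_order, assignment, service_time):
    """Per-UAV mission time: flight time along chosen legs plus fixed time at each task.

    points:       (n, 2) task coordinates
    speed:        UAV flight speed (same distance unit per time unit)
    path_order:   (m, n, n) binary; path_order[u, i, j] = 1 if UAV u flies from task i to task j
    assignment:   (m, n) binary; assignment[u, i] = 1 if UAV u executes task i
    service_time: (n,) fixed time per task (hovering, shooting, ...)
    """
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)                       # (n, n) Euclidean distances
    flight = (dist[None, :, :] * path_order).sum(axis=(1, 2)) / speed
    on_task = (assignment * service_time[None, :]).sum(axis=1)
    return flight + on_task
```

A scheduler could normalize the sum or the maximum of these per-UAV times before feeding it into the weighted cost; which of the two the original uses is not recoverable from this text.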

[0022] S103: Set constraints based on UAV endurance, task time windows, the monitoring range of the fixed monitoring equipment, task monitoring, and signal strength. The UAV endurance constraint requires that the total task time allocated to each UAV not exceed the maximum endurance time supported by its current battery power. The time window constraint requires that each task be executed and completed within a preset time window. The monitoring range constraint defines the task points that any fixed monitoring device can cover, and includes distance, horizontal range, and pitch angle constraints. The task monitoring constraint requires that each task point be monitored by at least one device to ensure the integrity of the collected data. The signal strength constraint includes detecting the signal strength in the task area: for weak-signal areas where the signal strength is below a set threshold, the assignment of data collection tasks to fixed monitoring equipment that relies on network feedback is controlled, and the UAV is used for data collection to avoid data loss.
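Three of these constraints (endurance, time window, and "every task point is monitored") lend themselves to a direct feasibility check. The following is a minimal sketch; the function name, the dictionary-based inputs, and the violation labels are hypothetical, and the monitoring range and signal strength constraints are omitted for brevity.

```python
def find_violations(uav_time, uav_endurance, finish_time, windows, monitors):
    """Return a list of (kind, id) tuples for every violated constraint.

    uav_time:      total task time allocated to each UAV
    uav_endurance: maximum endurance supported by each UAV's current battery
    finish_time:   completion time of each task (missing means never executed)
    windows:       (start, end) time window of each task
    monitors:      devices assigned to each task point
    """
    violations = []
    for uav, total in uav_time.items():
        if total > uav_endurance[uav]:
            violations.append(("endurance", uav))
    for task, (start, end) in windows.items():
        done = finish_time.get(task)
        if done is None or not (start <= done <= end):
            violations.append(("time_window", task))
    for task, devices in monitors.items():
        if not devices:
            violations.append(("unmonitored", task))
    return violations
```

An empty list means the decoded schedule is feasible with respect to these three constraints; a non-empty list can drive the correction or repair step of the optimizer.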

[0023] S104: The collaborative acquisition scheme is obtained by solving the constrained dynamic optimization scheduling model with an improved multi-objective particle swarm optimization algorithm. The specific execution process includes:
S1041: Perform particle encoding. Each particle represents a complete scheduling scheme. Taking UAV u and fixed monitoring equipment f as examples, the position vector X encodes the decision variables defined in S101. For the gimbal adjustment variable of the fixed monitoring equipment, real-number encoding is used, and the value is converted into a binary value by thresholding. For the UAV path order variable, the fixed monitoring equipment task allocation variable, and the fixed monitoring equipment gimbal adjustment variable, indirect encoding is used to convert them into binary. Multiple particles are randomly generated to form the initial population. For each particle, the continuous variables are generated uniformly at random within their domains, while the discrete variables are randomly assigned using the indirect encoding described above. A velocity vector is also initialized for each particle, with each dimension generated randomly within a predefined small range.
S1042: Decode the position vector of each particle into specific scheduling decisions, including the mission sequence of the UAV, the monitoring tasks of the fixed monitoring equipment, and the rotation angle of the gimbal. The decoded solution is checked one by one against the constraints set in S103. For minor violations (such as a gimbal angle slightly exceeding its range), the boundary absorption method corrects the value directly to the nearest valid boundary value. For serious violations (such as the total UAV mission time exceeding its maximum endurance), a heuristic repair strategy is applied: for example, the most time-consuming mission is removed from the UAV mission sequence or reassigned to another idle UAV, and the path is adjusted accordingly. For the repaired feasible scheduling scheme, the total cost is calculated according to the weighted cost function $J$ constructed in S102, by substituting the decoded decision variable values, evaluating each sub-objective function, and forming the weighted sum. The total cost is the fitness value of the particle: the smaller the value, the better the overall performance of the scheme. A scheme that is still infeasible after repair is assigned a very large penalty value as its fitness.
S1043: Update the individual and population historical bests. Each particle records the position with the best fitness in its own search history. After each iteration, the fitness of the particle's current position (labeled A) is compared with the fitness of its historical best position (labeled B); if A is better than B, the historical best is updated to A. At the same time, an external archive set is established and maintained for the population. The Pareto optimal solution algorithm is used to calculate and store all non-dominated solutions found in the current iteration. The archive set is updated by comparing the dominance relationship between each particle's current position and the solutions in the archive set, and the solution with the larger crowding distance is selected from the archive set as the current global leader.
S1044: After each iteration, particles are sorted by non-domination rank and crowding distance, and the best particle set is retained for the next generation to maintain population diversity. The algorithm terminates when the maximum number of iterations is reached, or when the improvement of the optimal solution over multiple consecutive generations is less than a preset threshold. The best compromise solution in the current non-dominated solution set is output as the final collaborative acquisition scheme, which determines the flight path and task sequence of each UAV, the monitoring task of each fixed monitoring device, and the gimbal adjustment parameters.
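The archive maintenance in S1043 and S1044 rests on two standard building blocks: a Pareto dominance test and the crowding distance. The sketch below shows both for minimization problems; it is a generic illustration (the function names are hypothetical) and does not reproduce the patent's "improved" variant.

```python
import numpy as np


def dominates(a, b):
    """True if cost vector a is no worse than b everywhere and strictly better somewhere."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return bool(np.all(a <= b) and np.any(a < b))


def non_dominated(costs):
    """Indices of the cost vectors that no other vector dominates."""
    costs = np.asarray(costs, float)
    return [
        i for i, ci in enumerate(costs)
        if not any(dominates(cj, ci) for j, cj in enumerate(costs) if j != i)
    ]


def crowding_distance(costs):
    """Crowding distance of each solution; boundary solutions get infinity."""
    costs = np.asarray(costs, float)
    n, k = costs.shape
    dist = np.zeros(n)
    for m in range(k):
        order = np.argsort(costs[:, m])
        span = costs[order[-1], m] - costs[order[0], m]
        dist[order[0]] = dist[order[-1]] = np.inf
        if n < 3 or span == 0:
            continue
        gaps = costs[order[2:], m] - costs[order[:-2], m]  # neighbor-to-neighbor gap
        dist[order[1:-1]] += gaps / span
    return dist
```

Choosing the archive member with the largest crowding distance as leader steers the swarm toward sparsely populated parts of the front, which is how the procedure keeps the solution set diverse.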

[0024] S2: Using the NeRF algorithm, construct the dynamic environmental features of the target area based on collaboratively acquired data and electronic maps. These dynamic environmental features include three-dimensional geometric information, dynamic target trajectory information, and temperature field distribution information. Specifically, this includes the following sub-steps: First, the multi-source collected data and electronic map basic data obtained by edge-end collaborative scheduling are preprocessed. The collaboratively collected data includes multi-view local image sequences collected by RGB high-definition cameras on drones and infrared thermal imagers, point cloud data collected by lidar, time-series panoramic image sequences collected by fixed monitoring equipment, and GIS geographic information data of the target area based on electronic map data, etc. The network time protocol is used to calibrate the timestamps of collaboratively collected data. Based on the trigger time of the edge scheduling command, the offset of the collection time of each device is calculated. The asynchronous data is interpolated and completed to obtain a time-synchronized dataset. The local coordinates of the UAV and the local coordinates of the fixed monitoring equipment are converted to the global coordinates of the electronic map to construct a spatiotemporally aligned dataset.
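The spatiotemporal alignment described above has two parts: resampling asynchronous streams onto a common time base after removing each device's clock offset, and converting local coordinates into the map's global frame. The sketch below assumes linear interpolation and a planar (2D) rigid transform defined by a yaw angle and an origin; both are simplifying assumptions, and the function names are hypothetical.

```python
import numpy as np


def resample_to_reference(ref_t, src_t, src_values, offset=0.0):
    """Interpolate a sensor stream onto reference timestamps after removing a clock offset."""
    corrected = np.asarray(src_t, float) - offset   # offset relative to the scheduling trigger
    return np.interp(ref_t, corrected, src_values)


def local_to_global(points_local, yaw_rad, origin_global):
    """Rotate local 2D points by the device yaw and translate to the map origin."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    rot = np.array([[c, -s], [s, c]])
    return np.asarray(points_local, float) @ rot.T + np.asarray(origin_global, float)
```

A full implementation would use a 3D pose and a proper geodetic projection for the GIS layer; the 2D case is enough to show how both device types end up in one shared frame.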

[0025] S21: Initial screening of dynamic candidate regions by the inter-frame difference method. Three consecutive synchronized frames form the basic processing unit, and the gray-level difference between adjacent frames is calculated to locate regions where dynamic targets may exist. Specifically, gray-level difference images are computed between frame t and frame t-1 and between frame t+1 and frame t. A logical AND of the two difference images then filters out pseudo-dynamic regions, since a single-frame difference is easily affected by lighting changes or slight equipment vibration. Next, the OTSU adaptive threshold segmentation algorithm automatically determines the threshold that separates dynamic from static regions, and the AND-combined difference image is binarized to obtain a preliminary dynamic candidate region mask. The binarization rule is: $M(x,y) = 1$ if $D(x,y) > T_{\mathrm{OTSU}}$, and $M(x,y) = 0$ otherwise. In the formula, $M(x,y) = 1$ indicates a dynamic candidate region, $M(x,y) = 0$ indicates a static background area, $D(x,y)$ is the gray value of the difference image after the logical AND operation, and $T_{\mathrm{OTSU}}$ is the adaptive threshold determined by the OTSU algorithm. To address noise points and blurred boundaries in the initial candidate mask, the Lucas-Kanade sparse optical flow algorithm is introduced for correction. Noise points include minute gray-level changes such as those caused by leaf movement. Under the brightness constancy constraint, the sparse optical flow algorithm distinguishes real dynamic targets from noise by solving for the optical flow vector of each pixel (the displacement of the pixel between consecutive frames). Specifically, an overdetermined system of gradient equations is constructed for the pixels within the candidate region, and the optical flow vector (u,v) is solved by the least squares method.
The optical flow vector is calculated as: $[u, v]^{\mathsf{T}} = -(W^{\mathsf{T}} W)^{-1} W^{\mathsf{T}} I_t$. In the formula, $W$ is the spatial gradient matrix of the pixel neighborhood, $I_t$ is the vector of temporal gradients over the same neighborhood, and $u$ and $v$ are the horizontal and vertical components of the optical flow. A threshold $\tau$ on the optical flow magnitude is set, and candidate pixels whose flow magnitude is below the threshold are removed, giving the refined dynamic region mask: $M'(x,y) = M(x,y)$ if $\sqrt{u^2 + v^2} \ge \tau$, and $M'(x,y) = 0$ otherwise. To keep the dynamic mask stable across consecutive frames and avoid jump errors in the subsequent sampling stage, spatiotemporal consistency optimization is applied to the refined mask. First, a morphological closing operation (dilation followed by erosion) eliminates isolated noise points and fills small holes in the mask. Then the masks of multiple consecutive frames are smoothed using the inter-frame correlation in the time dimension. The output is a final dynamic mask with clear edges and stable inter-frame behavior, which defines the precise spatial range for the subsequent differentiated sampling.
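
The screening stage of S21 can be sketched in a few lines of plain Python. The sketch assumes 8-bit grayscale frames stored as nested lists, implements the logical AND of the two difference images as a pixel-wise minimum, and takes the optical flow field as a precomputed input; these choices and all function names are assumptions made for illustration, not details stated in the specification.

```python
import math


def abs_diff(a, b):
    """Pixel-wise absolute gray-level difference of two frames."""
    return [[abs(p - q) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def otsu_threshold(image):
    """Return the gray level that maximizes the between-class variance."""
    hist = [0] * 256
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * n for level, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, no valid split at this level
        mean0, mean1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (mean0 - mean1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def three_frame_mask(prev, curr, nxt):
    """Candidate mask: a pixel must differ in BOTH adjacent frame pairs."""
    d1, d2 = abs_diff(curr, prev), abs_diff(nxt, curr)
    # Assumption: the logical AND on gray images is taken as the minimum.
    combined = [[min(p, q) for p, q in zip(r1, r2)] for r1, r2 in zip(d1, d2)]
    t = otsu_threshold(combined)
    return [[1 if p > t else 0 for p in row] for row in combined]


def filter_by_flow(mask, flow, min_magnitude):
    """Clear mask pixels whose optical-flow magnitude is below the threshold.

    flow[y][x] is a (u, v) pair, assumed to come from a Lucas-Kanade solver.
    """
    return [[m if math.hypot(*flow[y][x]) >= min_magnitude else 0
             for x, m in enumerate(row)] for y, row in enumerate(mask)]
```

A pixel that changes in only one of the two frame pairs, as a brief lighting flicker would, receives a combined difference of zero and never enters the candidate mask.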

[0026] S22: Using the final dynamic mask output by S21 as a spatial constraint, a differentiated-density sampling strategy is designed for the ray sampling stage of the NeRF algorithm. It preserves the accuracy of dynamic target modeling while minimizing redundant computation on the static background, and so suits the limited computing resources at the edge. The standard NeRF algorithm samples points uniformly along each camera ray throughout the 3D space without distinguishing dynamic from static regions, so its computational efficiency is low and it slows the system response. The sampling density is therefore adjusted adaptively according to the dynamic mask. The scene is divided by the dynamic mask into a dynamic target area and a static background area, which are sampled differently. Because the shape and position of dynamic targets (such as birds and mobile devices) change over time, denser sampling is needed to capture their features accurately. Fine-layer sampling is therefore used: the effective length of the ray (its effective propagation length in the scene) is divided into 32-64 equidistant sampling layers, with 2-4 sampling points placed in each layer, so that the details of dynamic targets are not lost. Static background areas (such as fixed photovoltaic panels and dam walls) have stable features and do not require high-density sampling. Coarse-layer sampling is used for them: the effective length of the ray is divided into 8-16 equidistant sampling layers, and one point is sampled at random in each layer, which minimizes the number of sampling points and the computational load while preserving the integrity of the background structure.
For every sampling point, the corresponding three-dimensional coordinates are calculated from its depth value. Using the camera intrinsic and extrinsic parameters and the pose parameters obtained from the preceding spatiotemporal registration, the sampling points are mapped from the device's local coordinate system to the global geographic coordinate system. Finally, a set of sampling points is output, each carrying its three-dimensional coordinates, depth value, and mask identifier, as input data for the subsequent radiance field modeling.
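
The differentiated sampling can be sketched as follows. The layer and point counts are picked from inside the ranges given above (48 layers with 3 points each for dynamic rays, 12 layers with 1 point each for static rays); placing the points uniformly at random inside each layer, and treating the ray origin and direction as already expressed in the global frame, are assumptions made for illustration.

```python
import random


def sample_depths(near, far, is_dynamic, rng,
                  fine_layers=48, fine_points=3, coarse_layers=12):
    """Return sorted sample depths along one ray between near and far.

    Dynamic rays: many equidistant layers, several points per layer.
    Static rays: few equidistant layers, one random point per layer.
    """
    layers = fine_layers if is_dynamic else coarse_layers
    per_layer = fine_points if is_dynamic else 1
    width = (far - near) / layers
    depths = []
    for k in range(layers):
        start = near + k * width
        depths.extend(start + rng.random() * width for _ in range(per_layer))
    return sorted(depths)


def to_global_points(origin, direction, depths, is_dynamic):
    """Turn depths into 3D points: origin + depth * direction.

    The output records hold the coordinates, the depth, and the mask flag.
    """
    return [{"xyz": tuple(o + depth * c for o, c in zip(origin, direction)),
             "depth": depth,
             "dynamic": is_dynamic}
            for depth in depths]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
dynamic_depths = sample_depths(2.0, 10.0, True, rng)    # 48 * 3 samples
static_depths = sample_depths(2.0, 10.0, False, rng)    # 12 * 1 samples
```

With these settings a static ray costs one twelfth of the network evaluations of a dynamic ray, which is where the edge-side saving comes from.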

[0027] S23: Traditional NeRF relies solely on the single-modal constraint of RGB images. To overcome this limitation, the time-series temperature data collected by the fixed monitoring equipment are incorporated, together with a physical constraint, as a core constraint term in the rendering loss function of the NeRF algorithm. This couples the three-dimensional geometric features with the temperature field features and improves the accuracy of defect identification. The process is as follows. First, the time-series temperature frame sequence synchronously acquired by the fixed monitoring equipment is preprocessed. Using the camera intrinsic and extrinsic parameter matrix and the pose matrix, the pixel coordinates of each two-dimensional temperature frame are mapped to three-dimensional spatial coordinates, yielding spatialized temperature data [mapping formula not reproduced in the source text]. In the formula, r denotes the three-dimensional spatial coordinates mapped from the pixel coordinates of the two-dimensional temperature frame, K is the camera intrinsic and extrinsic parameter matrix, and a further matrix is the camera pose matrix. The mapped temperature data are then completed by interpolation, which removes the missing temperature values caused by blind spots in the device's field of view, and finally yields the real temperature supervision values in one-to-one correspondence with the NeRF sampling points.

[0028] The traditional NeRF algorithm models a density field $\sigma$ and a color field $\mathbf{c}$: the density field characterizes the opacity of a spatial point, and the color field characterizes the color of a point viewed from a specific observation direction. A temperature field $T$ is added, and the three fields are jointly modeled by an MLP network $F_\Theta$: $(\sigma, \mathbf{c}, T) = F_\Theta(\gamma(\mathbf{x}), \gamma(\mathbf{d}))$, where $\mathbf{x}$ is the position of a spatial point and $\mathbf{d}$ is the observation direction. In the formula, $\gamma(\cdot)$ is a positional encoding function that improves the network's ability to fit high-frequency features, and $\Theta$ denotes the trainable parameters of the MLP network. The cumulative temperature rendering value along a ray is calculated by volume rendering, using the discretized approximation: $\hat{T}(\mathbf{r}) = \sum_{i=1}^{N} A_i\,\alpha_i\,T(\mathbf{x}_i)$, with $A_i = \prod_{j=1}^{i-1}(1 - \alpha_j)$. In the formula, $\hat{T}(\mathbf{r})$ is the cumulative temperature rendering value along the camera ray $\mathbf{r}$, that is, the temperature rendered by the NeRF model from the temperatures of the sampling points; it corresponds to the real temperature frame and is used for the subsequent error comparison with the real temperature data. $N$ is the total number of sampling points on a single ray, determined by both the high-density sampling of dynamic regions and the sparse sampling of static regions. $i$ and $j$ are indices of sampling points on a single ray. $T(\mathbf{x}_i)$ is the temperature predicted by the model at the $i$-th sampling point. $\mathbf{x}_i$ is the three-dimensional coordinate of the $i$-th sampling point, mapped to the global geographic coordinate system and corresponding one-to-one with the actual spatial location in the inspection scene. $A_i$ is the cumulative transmittance of all sampling points preceding the $i$-th sampling point, representing the probability that light passes through the preceding sampling points and reaches the $i$-th. $\alpha_i$ is the opacity of the $i$-th sampling point, representing the degree to which that point blocks light.
The opacity is calculated as: $\alpha_i = 1 - \exp(-\sigma_i \delta_i)$. In the formula, $\sigma_i$ is the density value at the $i$-th sampling point, and $\delta_i$ is the depth interval between the $i$-th and $(i+1)$-th sampling points. A multimodal rendering loss function with temperature constraints is constructed as: $L = \lambda_1 L_{\mathrm{rgb}} + \lambda_2 L_{\mathrm{geo}} + \lambda_3 L_{\mathrm{temp}}$. In the formula, $\lambda_1$, $\lambda_2$ and $\lambda_3$ are the weighting coefficients of the losses, and their sum is 1; $L_{\mathrm{rgb}}$ is the mean squared error loss for RGB image rendering; $L_{\mathrm{geo}}$ is the Chamfer distance loss for the point cloud geometric constraint; and $L_{\mathrm{temp}}$ is the mean squared error loss for temperature rendering [formula not reproduced in the source text]. The temperature loss involves the temperature value rendered by the model, the actual temperature monitoring value collected by the fixed monitoring equipment (the temperature of the two-dimensional sampling point mapped to a three-dimensional space value), and a physical constraint term on the temperature data. The physical constraint term is based on the physical law of heat conduction. It quantifies the deviation between the temperature field predicted by the NeRF model and that law, so that the temperature distribution output by the model conforms to the basic logic of heat transfer. This avoids the physical paradoxes that purely data-driven modeling can produce, such as abrupt changes in the temperature gradient on the surface of photovoltaic panels or irregular fluctuations in the internal temperature of dams. The physical constraint term is computed by discretizing Fourier's law of heat conduction.
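
A minimal sketch of the discretized temperature rendering and of the weighted loss described in this paragraph is given below. It assumes the standard NeRF compositing rule (opacity 1 - exp(-density × interval), transmittance as the running product of 1 - opacity); the weight values are placeholders that merely sum to 1, and the physical constraint term is left out because its formula is not available in the text.

```python
import math


def render_temperature(sigmas, deltas, temps):
    """Composite per-sample temperatures along one ray.

    sigmas: density at each sample; deltas: depth interval to the next
    sample; temps: temperature predicted at each sample.
    """
    transmittance = 1.0  # probability that light reaches the current sample
    rendered = 0.0
    for sigma, delta, temp in zip(sigmas, deltas, temps):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this sample
        rendered += transmittance * alpha * temp
        transmittance *= 1.0 - alpha
    return rendered


def temperature_mse(rendered, measured):
    """Mean squared error between rendered and measured ray temperatures."""
    return sum((r - m) ** 2 for r, m in zip(rendered, measured)) / len(rendered)


def total_loss(l_rgb, l_geo, l_temp, weights=(0.5, 0.2, 0.3)):
    """Weighted sum of the three losses; the weights are placeholders."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("loss weights must sum to 1")
    w_rgb, w_geo, w_temp = weights
    return w_rgb * l_rgb + w_geo * l_geo + w_temp * l_temp


# Two samples: a half-transparent one at 40 degrees in front of an opaque
# one at 20 degrees. Each contributes half of the rendered value.
ray_temp = render_temperature([math.log(2.0), 1e9], [1.0, 1.0], [40.0, 20.0])
```

Because every sample contributes only in proportion to the light that actually reaches it, an opaque surface hides the temperatures of the points behind it, which matches how a thermal camera observes the scene.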

[0029] The addition of temperature physical constraints enables the NeRF model to adapt to temperature field changes under different environmental conditions (such as day-night temperature difference and rainy weather), avoids model failure caused by environmental interference, and improves the stability of the algorithm in complex inspection scenarios.

[0030] Traditional NeRF volume rendering is defined as a continuous integral, which the limited computing power at the edge can hardly support. The continuous integral is therefore discretized into a summation: by traversing all sampling points on the ray, a discrete weighted sum approximates the continuous integral, which reduces computational complexity and meets the real-time modeling requirements at the edge. Temperature field rendering can thus be completed in real time at the edge, avoiding the latency incurred when the computation must be sent to the cloud. The weights of opacity and cumulative transmittance are introduced so that the temperature rendering value reflects the combined contribution of the temperature at each spatial point along the ray. This is consistent with the physical logic that light penetration affects temperature observation in real inspection scenarios, and it reduces the temperature deviations that arise when light penetration is ignored. The result is a reliable temperature data foundation for subsequent temperature defect identification in photovoltaic fields.

[0031] S24: Redundant convolutional layers of the NeRF network are pruned, and pre-trained features of static background regions are reused, to reduce the parameter count and computational cost of the NeRF model. This enables efficient real-time modeling at the edge and addresses two problems of traditional NeRF models: excessively long retraining times and poor fit to edge computing power. The specific implementation is as follows. Redundant convolutional kernels in the MLP network of the NeRF model, namely kernels with small weight contributions and negligible impact on model accuracy, are removed using an L1 regularization pruning strategy. In addition, static background regions (such as the dam body and the photovoltaic array frame) in a hydro-solar hybrid power station remain stable over long periods, so NeRF model training for these regions is completed in the cloud in advance to obtain the trained feature parameters. During actual edge modeling, the network layer parameters corresponding to the static background regions are frozen and no longer retrained. Only the network layer parameters corresponding to dynamic target regions and suspected defect regions are updated, which reduces the proportion of parameters updated and shortens the training time of the edge model. The result is an efficient modeling mode of static feature reuse plus incremental dynamic feature updates, so that the edge can quickly output accurate dynamic environmental features.
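
The two ideas in S24 can be sketched on a toy parameter set. The sketch interprets the pruning step as L1-norm magnitude pruning, in which the units with the smallest sum of absolute weights are zeroed, and it models freezing as skipping the gradient update for named parameter groups. Both the interpretation and the names used are assumptions made for illustration.

```python
def l1_prune(weights, keep_ratio):
    """Zero out the rows (units) with the smallest L1 norm.

    weights: list of rows, one row of weights per unit.
    keep_ratio: fraction of units to keep (at least one unit is kept).
    """
    norms = [sum(abs(w) for w in row) for row in weights]
    n_keep = max(1, round(len(weights) * keep_ratio))
    ranked = sorted(range(len(weights)), key=lambda i: norms[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [row[:] if i in keep else [0.0] * len(row)
            for i, row in enumerate(weights)]


def sgd_step(params, grads, frozen, lr):
    """One gradient step that leaves frozen parameter groups untouched.

    params, grads: dicts mapping a group name to a list of values.
    frozen: set of group names reused from cloud pre-training.
    """
    return {name: (values[:] if name in frozen
                   else [v - lr * g for v, g in zip(values, grads[name])])
            for name, values in params.items()}


layer = [[0.1, -0.1], [2.0, 1.0], [0.0, 0.05], [1.0, -1.5]]
pruned = l1_prune(layer, keep_ratio=0.5)  # keeps the two strongest units

params = {"static_background": [1.0], "dynamic_region": [1.0]}
grads = {"static_background": [1.0], "dynamic_region": [1.0]}
updated = sgd_step(params, grads, frozen={"static_background"}, lr=0.5)
```

Only the dynamic group moves during the edge-side update, so the cost of each training step scales with the dynamic and suspected-defect parameters rather than with the whole model.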

[0032] In S3, the UAV inspection route and the monitoring strategy of the fixed monitoring equipment are generated from the dynamic environmental features as follows.
Step 1: Extract view coverage analysis data from the dynamic environmental features, including the three-dimensional coordinates of each task point, the instantaneous position and movement trend of dynamic targets (such as water flow and mobile devices), and the installation coordinates and current gimbal (pan-tilt) status of the fixed monitoring equipment. From the three-dimensional geometric information, calculate the "effective view score" of each fixed monitoring device for each task point: score = view coverage integrity (the proportion of the task point within the device's monitoring image) × temperature field observation sensitivity (how identifiable the temperature data are from the device's viewpoint) × dynamic target interference coefficient (the inverse of the probability that a dynamic target occludes the task point). Assign initial gimbal adjustment parameters, comprising rotation angle and rotation time, to the fixed monitoring equipment. Based on the effective view analysis, determine the initial rotation angle of each fixed monitoring device so that, in the initial state, the device covers high-priority task points first and its rotation avoids the movement trajectories of dynamic targets.
Step 2: Extract constraint parameters. From the dynamic environmental features, select the static obstacle boundaries (such as the 3D contours of dam walls and transmission towers), the dynamic target trajectories (such as water vortex areas and the movement paths of floating objects), and the equipment distribution coordinates (the 3D positions of fixed monitoring equipment and task points).
Define the multi-objective function of the NRBO algorithm, which minimizes the total flight path length, maximizes the success rate of dynamic target avoidance, and maximizes the collaborative efficiency of the fixed equipment [formula not reproduced in the source text]. In the formula, three weighting coefficients balance the three objectives, and L is the normalized total flight path length of the UAV, that is, the sum of the Euclidean distances between successive task points after max-min normalization. The dynamic target avoidance success rate quantifies the probability that the UAV's flight path avoids all dynamic targets and takes values from 0 to 1. It is calculated as: $P_{\mathrm{avoid}} = \frac{1}{m}\sum_{j=1}^{m} \mathbb{I}\left(d_j > d_{\mathrm{safe}}\right)$, with $d_j = \min_{\mathbf{p} \in \Gamma} \lVert \mathbf{p} - \mathbf{q}_j \rVert$. In the formula, $m$ is the total number of dynamic targets, that is, the number of targets that move or change within the inspection area; $\mathbf{q}_j$ is the real-time 3D coordinate of the $j$-th dynamic target; $\Gamma$ is the planned flight path of the UAV, treated as a continuous three-dimensional path curve; $\mathbb{I}(\cdot)$ is the indicator function; $d_j$ is the minimum distance between the UAV's flight path and the $j$-th dynamic target; and $d_{\mathrm{safe}}$ is the set safe avoidance distance, the minimum safe distance between the UAV and a dynamic target. For the $j$-th dynamic target, the minimum distance between the flight path $\Gamma$ and $\mathbf{q}_j$ is calculated; if it exceeds the safe avoidance distance, the target is counted as avoided, and otherwise avoidance fails. The collaborative efficiency of the fixed equipment is calculated by a further formula [formula not reproduced in the source text]. In that formula, n is the total number of fixed monitoring devices, one term is the time at which the UAV reaches the monitoring area of fixed device f, and another is the time at which fixed device f completes its gimbal adjustment; the formula penalizes coordination failures caused by timing mismatch between the two.
The NRBO algorithm is solved by population-based optimization. The flight path population is initialized by generating multiple initial paths from the distribution of task points while avoiding static obstacle boundaries. Iterative optimization follows: in each iteration, the route nodes are rearranged (the visiting order of task points is adjusted) and the path is smoothed (local segments are corrected to avoid dynamic targets). The multi-objective function value F of each candidate route is calculated, the nondominated solutions with the smallest F are retained, and the next generation is produced by crossover and mutation until the iteration converges (for example, when the change rate of the optimal solution stays below 0.5% for 5 consecutive generations). The optimal dynamic route is then output.
Step 3: From the optimal dynamic route output by the NRBO algorithm, extract the timing information of each node (task point) and the information of the fixed monitoring equipment corresponding to that node, which includes the equipment number, the initial gimbal parameters, and the monitoring range of the equipment. Ensure that the fixed monitoring equipment completes data acquisition from the optimal perspective while the UAV is hovering. The optimal perspective satisfies the following conditions: the target point lies in the central area of the device's image within the deviation threshold; the temperature field data acquisition accuracy is highest; and no dynamic target obstructs the view. The corrected gimbal adjustment parameters of the fixed monitoring equipment are obtained by correcting the initial gimbal adjustment parameters with a greedy algorithm.
The greedy correction proceeds as follows. Starting from the initial gimbal parameters, calculate the viewing angle adaptation degree under the current parameters, that is, the degree to which the optimal perspective conditions above are met, with values from 0 to 1. Make a small local adjustment to the gimbal rotation angle and calculate the adaptation degree under the new parameters. Keep whichever parameters give the higher adaptation degree, and repeat the fine-tuning until the adaptation degree reaches the required level, which yields the corrected gimbal adjustment parameters. If the adaptation degree is still below the required level after multiple rounds of fine-tuning, expand the adjustment range or reassign the fixed monitoring equipment for that task point.
Step 4: Based on the optimized UAV flight path, generate detailed inspection instructions for the UAV, including the take-off point, the visiting order of task points, the hovering altitude at each task point, the flight speed, the dwell time, and the acquisition parameters (such as camera focal length and infrared temperature measurement sensitivity). Based on the corrected gimbal adjustment parameters, generate the monitoring strategy for the fixed monitoring equipment, including the gimbal rotation angle sequence, camera exposure parameters, temperature data sampling frequency, and the conditions for collaborative triggering with the UAV. The UAV executes the inspection route by flying according to the generated instructions and synchronously collecting multimodal data at each task point. The multimodal data include RGB high-definition images, infrared thermal imaging data, and lidar point cloud data. While the UAV collects data, the fixed monitoring devices adjust their attitude according to the corrected gimbal parameters, acquire time-series panoramic images and continuous temperature field data, and record their own status data.
The acquired multimodal data are preprocessed in real time. Preprocessing includes denoising the acquired images with Gaussian filtering and median filtering, correcting point cloud distortion using UAV attitude data, calibrating temperature to eliminate environmental temperature interference, and synchronizing data acquisition timestamps. To every data frame acquired by the fixed monitoring equipment, a JSON-format flag data segment is added, consisting of the acquisition timestamp, the unique number of the UAV or fixed monitoring equipment, and the numbers of the flight path and task point. To every data frame acquired by the UAV, a flag data segment is added, consisting of the acquisition timestamp, the UAV number, and the unique code of the fixed monitoring equipment that cooperated with the UAV at the acquisition time; it is used for subsequent data fusion. The Transformer-based multimodal fusion and cross-validation algorithm then analyzes the acquired multimodal data to obtain the preliminary inspection results for the target area. Features are extracted separately from each modality: a ResNet-50 network extracts spatial features from RGB images, a CNN extracts temperature gradient features from infrared thermal imaging data, a PointNet++ network extracts three-dimensional geometric features from lidar point cloud data, and an LSTM network extracts temporal features from temperature time series, including the trend and duration of temperature anomalies. A multimodal fusion Transformer model is constructed. The RGB image features, point cloud geometric features, infrared temperature gradient features, and temperature time-series features are each mapped by a 1×1 convolutional layer to feature vectors of the same dimension, and modality type embeddings and location embeddings are added.
The modality type embedding distinguishes data of different modalities, and the location embedding is generated from the three-dimensional coordinates of the task point to represent the spatial association of features. This yields enhanced feature vectors containing modality and location information. The Transformer fusion model has a 6-layer encoder, with each encoder layer containing a multi-head self-attention mechanism (8 attention heads) and a feedforward neural network (hidden dimension 2048). The enhanced feature vectors of the four modalities are input to the Transformer encoder. The self-attention mechanism calculates the association weights between features of different modalities (for example, increasing the weights of infrared features and point cloud geometric features in temperature anomaly regions), achieving deep interaction and fusion of cross-modal features and outputting global fused features. The fused features are fed into the classifiers (a fully connected layer followed by Softmax) corresponding to the four modalities to obtain a defect judgment (defect type or no defect, with a confidence level) for each modality. The cross-validation rule is: a defect is judged to be real only if the judgments of at least three modalities agree and their confidence levels are ≥0.7. The defect types are divided into 5 major categories and 12 subcategories: thermal defects (hot spots on photovoltaic panels, overheating of transmission tower joints, etc.); structural defects (cracks in dams, deformation of photovoltaic panel supports, etc.); foreign object intrusion (birds building nests, accumulation of floating objects, etc.); equipment abnormalities (sensor malfunctions, gimbal shift, etc.); and environmental interference (water surface reflection, fog obstruction, etc.).
A standardized preliminary inspection result set is generated. For each defect it records the spatial location (3D GIS coordinates), type, severity (minor, moderate, or severe, classified by confidence and feature strength), multimodal feature matching degree, collection timestamp, and corresponding equipment number.
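
The cross-validation rule stated in this paragraph can be sketched as a simple vote. The sketch reads the rule as requiring at least three modalities to report the same defect type, each with confidence of at least 0.7; the modality names, label strings, and this reading of the rule are assumptions made for illustration.

```python
from collections import Counter


def cross_validate(results, min_agree=3, min_conf=0.7):
    """Return the confirmed defect type, or None if the rule is not met.

    results: dict mapping modality name -> (label, confidence), where the
    label "no_defect" means the modality saw nothing abnormal.
    """
    votes = Counter(label for label, conf in results.values()
                    if label != "no_defect" and conf >= min_conf)
    if not votes:
        return None
    label, count = votes.most_common(1)[0]
    return label if count >= min_agree else None


# Hypothetical per-modality outputs for one task point.
confirmed = cross_validate({
    "rgb": ("hot_spot", 0.90),
    "infrared": ("hot_spot", 0.80),
    "point_cloud": ("hot_spot", 0.75),
    "temp_series": ("no_defect", 0.60),
})
rejected = cross_validate({
    "rgb": ("hot_spot", 0.90),
    "infrared": ("hot_spot", 0.80),
    "point_cloud": ("hot_spot", 0.65),  # below the confidence threshold
    "temp_series": ("no_defect", 0.60),
})
```

Requiring agreement across modalities means that a single sensor artifact, such as a water surface reflection in the RGB image, cannot raise a defect alarm on its own.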

[0033] In S3, a three-term weighted objective function is constructed from the flight path length, the dynamic target avoidance success rate, and the coordination efficiency of the fixed equipment. It turns temporal synchronization into a quantifiable penalty term and suits the dense equipment, numerous dynamic interferences, and stringent collaborative data acquisition requirements of hydro-solar hybrid power plants. Safe avoidance distances are set by category according to the type and motion characteristics of dynamic targets, which raises the UAV's success rate in avoiding targets such as birds and floating objects. By combining indicator functions with a mean calculation, a quantitative evaluation model for the avoidance success rate is built. It converts the binary judgment of whether each avoidance succeeds into a probabilistic indicator, giving the multi-objective function a precise safety input and making flight path planning more rigorous and accurate. Based on the inspection requirements, the model coordinates the UAV flight path with the monitorable range of the fixed monitoring equipment, increasing both the inspection coverage of a single UAV and the synergy between data collected by the UAV and by the fixed monitoring equipment. Including the coordination efficiency of the fixed equipment in the optimization objective further reduces the gap between the UAV's arrival time at a task point and the time at which the fixed monitoring equipment completes its gimbal adjustment, improving the coordination success rate. The UAV and the fixed equipment can therefore collect data in the same time period and from the same perspective, providing a spatiotemporally consistent data foundation for subsequent multimodal data fusion and cross-validation, which reduces the difficulty and improves the accuracy of data fusion.
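
The three ingredients of this objective can be sketched as follows. The avoidance success rate follows the description directly (mean of indicator values, with a per-target safe distance to reflect the categorized distances). The timing penalty (mean absolute gap between UAV arrival and gimbal readiness), the way the three terms are combined, the weight values, and the use of discrete waypoints in place of a continuous path curve are all assumptions made for illustration, since the corresponding formulas are not available in the text.

```python
import math


def path_length(path):
    """Sum of Euclidean distances between successive 3D waypoints."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def avoidance_success_rate(path, targets):
    """Fraction of dynamic targets that the path keeps a safe distance from.

    targets: list of (position, safe_distance) pairs, so that each target
    category can carry its own safe avoidance distance.
    """
    if not targets:
        return 1.0
    avoided = sum(1 for position, safe in targets
                  if min(math.dist(p, position) for p in path) > safe)
    return avoided / len(targets)


def timing_penalty(schedule):
    """Mean absolute gap (seconds) between UAV arrival and gimbal readiness.

    schedule: list of (uav_arrival_time, gimbal_ready_time), one pair per
    fixed monitoring device. The penalty form is an assumption.
    """
    return sum(abs(arrive - ready) for arrive, ready in schedule) / len(schedule)


def objective(norm_length, success_rate, penalty, weights=(0.4, 0.4, 0.2)):
    """Assumed scalar cost to minimize; smaller is better."""
    w_len, w_avoid, w_sync = weights
    return w_len * norm_length + w_avoid * (1.0 - success_rate) + w_sync * penalty


route = [(0.0, 0.0, 10.0), (10.0, 0.0, 10.0), (20.0, 0.0, 10.0)]
birds = [((10.0, 0.0, 13.0), 5.0), ((10.0, 20.0, 10.0), 5.0)]
rate = avoidance_success_rate(route, birds)   # one of two targets avoided
gap = timing_penalty([(12.0, 10.0), (30.0, 33.0)])
```

In the example, the first target lies 3 m from the middle waypoint, inside its 5 m safe distance, so only one of the two targets counts as avoided.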

[0034] This solution employs a NeRF modeling approach that combines dynamic masking, differentiated sampling, and multimodal physical constraints. Its dynamic target region extraction strategy, which uses inter-frame difference for initial screening and Lucas-Kanade sparse optical flow for correction, accurately separates dynamic targets from the static background in inspection scenarios and resolves the misjudgment of false dynamic regions caused by lighting interference and equipment jitter. Its differentiated radiance field sampling mechanism (high-density sampling of dynamic regions and sparse sampling of static regions) reduces the sampling computation at the edge while maintaining modeling accuracy for both dynamic targets and defect regions. It also incorporates physical constraints on the temperature field into NeRF model training: a physical constraint loss derived from Fourier's law of heat conduction avoids the physical paradoxes that purely data-driven modeling can produce, and achieves deep coupling of geometric, temperature, and dynamic features. To address edge computing power constraints, a layered collaborative architecture combines lightweight edge processing with cloud optimization, and a NeRF lightweighting strategy prunes redundant convolutional layers and reuses pre-trained features of static backgrounds. Static-region network parameters are frozen and only dynamic and defective regions are updated incrementally, which shortens edge model training time and improves system response speed. In addition, an incremental encoding and compressed transmission mechanism uploads only the defect features and dynamic target information that differ from historical data, which reduces the data transmission volume and addresses the weak signals and limited bandwidth of remote power plants.

[0035] By integrating multimodal data from fixed monitoring equipment and UAVs, and by combining multimodal data with physical constraints as a dual safeguard, the accuracy of temperature identification is improved and the false alarm rate for the thermal and structural equipment defects common in hydropower stations is reduced. Precise extraction and trajectory modeling of dynamic target areas allow dynamic safety hazards such as bird intrusions and unauthorized personnel operations to be identified effectively, which shortens the early warning response time and avoids equipment damage caused by dynamic targets. Furthermore, with differentiated sampling and the lightweight model strategy, the NeRF model runs efficiently at the edge without relying on cloud computing power. This avoids the high latency and heavy bandwidth dependence of cloud processing, sidesteps the weak-signal transmission bottleneck of remote power stations, and ensures stable transmission and processing of inspection data.

[0036] The above description is merely a preferred embodiment of the present invention and is not intended to limit the present invention in any way. Although the present invention has been disclosed above with reference to preferred embodiments, it is not intended to limit the present invention. Any person skilled in the art can make some modifications or alterations to the above-disclosed technical content to create equivalent embodiments without departing from the scope of the present invention. Any simple modifications, equivalent changes and alterations made to the above embodiments based on the technical essence of the present invention without departing from the scope of the present invention shall still fall within the scope of the present invention.

Claims

1. A method for intelligent inspection of a hydropower station based on a UAV, characterized in that it includes the following steps: S1: In response to inspection events, the edge device uses a region division and optimization scheduling algorithm to schedule UAVs and gimbal-equipped fixed monitoring equipment to collaboratively collect data on the target area. S2: The NeRF algorithm is used to construct dynamic environmental features of the target area based on the collaboratively collected data and electronic maps. The dynamic environmental features include three-dimensional geometric information, dynamic target motion trajectory information, and temperature field distribution information. S3: Based on the dynamic environmental features, generate and execute the inspection route of the UAV and the monitoring strategy of the fixed monitoring equipment, obtain multimodal data from the execution results, and analyze the data with the Transformer-based multimodal fusion and cross-validation algorithm to obtain the inspection results of the target area. S4: Upload the inspection results to the cloud. The cloud uses a pre-trained temporal attention LSTM prediction model to predict risks from the inspection results and generate optimization strategies to guide subsequent inspections.

2. The intelligent inspection method for hydropower stations based on unmanned aerial vehicles (UAVs) according to claim 1, characterized in that: In S1, the scheduling algorithm adopts a dynamic optimization scheduling model based on multi-objective optimization. The specific process includes: Define decision variables, including normalized UAV task allocation variables, UAV path variables, fixed monitoring equipment task allocation variables, and fixed monitoring equipment gimbal adjustment variables; A weighted cost function is constructed by weighted fusion, with time efficiency, system energy consumption, coverage quality, and network transmission quality as sub-functions. Constraints are set, comprising UAV endurance constraints, mission time window constraints, fixed monitoring equipment monitoring range constraints, the constraint that every task must be monitored, and signal strength constraints. The collaborative acquisition scheme is obtained by solving the dynamic optimization scheduling model with an improved multi-objective particle swarm optimization algorithm.

3. The intelligent inspection method for hydropower stations based on unmanned aerial vehicles (UAVs) according to claim 2, characterized in that: The weighted cost function is calculated by a formula [formula not reproduced in the source text], wherein four weight coefficients balance the importance of the four targets and can be dynamically adjusted according to task urgency, signal strength, etc., and four normalized values, all within a common interval, are respectively the time efficiency target, the system energy consumption target, the coverage quality target, and the network transmission target.

4. The intelligent inspection method for hydropower stations based on unmanned aerial vehicles (UAVs) according to claim 3, characterized in that: the coverage quality target is calculated by a formula [formula not reproduced in the source], in which the terms are: the UAV coverage quality function; the coverage quality function of the fixed monitoring equipment; the steepness parameter 'o' of the sigmoid function, which controls how sensitively quality responds to height; the signal strength coefficient at task point i; the total number of UAVs, m; the total number of tasks to be executed, n; the task allocation variable of the fixed monitoring equipment; and the task allocation variable of UAV u.

5. The intelligent inspection method for hydropower stations based on unmanned aerial vehicles (UAVs) according to claim 1, characterized in that: S2 specifically includes the following sub-steps: preprocessing the collaboratively collected data to construct a spatiotemporally aligned dataset; initially screening dynamic candidate regions using the inter-frame difference method combined with the OTSU adaptive threshold segmentation algorithm, and correcting the noise points and boundary blur of the dynamic candidate regions using the optical flow algorithm to obtain the final dynamic region mask; and, using the final dynamic region mask as a spatial constraint, applying the NeRF algorithm with fine-level sampling on the dynamic target region and coarse-level sampling on the static background region.
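
The initial screening step in claim 5 (inter-frame difference followed by OTSU thresholding) can be sketched in plain Python as below. This is an illustrative sketch, not the patent's implementation: frames are assumed to be equally sized 8-bit grayscale images stored as nested lists, the function names are hypothetical, and the optical flow correction that produces the final mask is not included.

```python
def frame_difference(prev, curr):
    """Absolute per-pixel difference of two equally sized grayscale frames (0-255)."""
    return [[abs(c - p) for p, c in zip(row_p, row_c)]
            for row_p, row_c in zip(prev, curr)]


def otsu_threshold(image, levels=256):
    """Return the threshold t that maximizes between-class variance.
    Pixels with value > t are treated as foreground."""
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already in the background class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def dynamic_candidate_mask(prev, curr):
    """Binary mask (1 = candidate dynamic pixel) from frame difference + Otsu."""
    diff = frame_difference(prev, curr)
    t = otsu_threshold(diff)
    return [[1 if v > t else 0 for v in row] for row in diff]


if __name__ == "__main__":
    prev = [[10] * 4 for _ in range(4)]
    curr = [[12] * 4 for _ in range(4)]
    curr[1][1] = curr[1][2] = curr[2][1] = curr[2][2] = 200  # a moving object
    for row in dynamic_candidate_mask(prev, curr):
        print(row)
```

In the toy frames, the small uniform change of the background falls below the Otsu threshold, so only the 2x2 block that changed strongly is marked as a dynamic candidate.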

6. The intelligent inspection method for hydropower stations based on unmanned aerial vehicles (UAVs) according to claim 5, characterized in that: S2 further includes constructing a multimodal rendering loss function for the NeRF algorithm using temperature physics constraints.

7. The intelligent inspection method for hydropower stations based on unmanned aerial vehicles (UAVs) according to claim 5, characterized in that: S2 further includes pruning redundant convolutional layers of the NeRF algorithm's MLP network using L1 regularization and reusing pre-trained features from the static background region.

8. The intelligent inspection method for hydropower stations based on unmanned aerial vehicles (UAVs) according to claim 1, characterized in that: in S3, generating the UAV inspection route and the monitoring strategy of the fixed monitoring equipment based on the dynamic environmental features specifically comprises: extracting view coverage analysis data from the dynamic environmental features and assigning initial gimbal adjustment parameters, which include rotation angle and rotation time, to the fixed monitoring equipment; extracting static obstacle boundaries, dynamic target motion trajectories, and equipment distribution coordinate information from the dynamic environmental features and determining the dynamic flight path of the UAV using the NRBO algorithm; correcting the initial gimbal adjustment parameters with a greedy algorithm, based on the node time sequence information obtained from the dynamic flight path and the information of the fixed monitoring equipment at the corresponding nodes, to obtain corrected gimbal adjustment parameters; and generating the inspection route of the UAV from the dynamic flight path and the monitoring strategy of the fixed monitoring equipment from the corrected gimbal adjustment parameters.

9. The intelligent inspection method for hydropower stations based on unmanned aerial vehicles (UAVs) according to claim 8, characterized in that: in S3, the NRBO algorithm is solved using a population optimization algorithm to determine the dynamic flight path of the UAV, including the following processes: initializing the flight path population by generating multiple initial paths that follow the distribution of task points and avoid static obstacle boundaries; performing iterative optimization, in each iteration rearranging route nodes, smoothing the paths, and calculating the multi-objective function value of each dynamic route; and retaining the non-dominated solution with the smallest multi-objective function value and generating the next-generation population through crossover and mutation operations until the iteration converges, then outputting the dynamic route.
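
The population loop outlined in claim 9 (initialize a population of routes, evaluate, retain the best, then apply crossover and mutation) can be sketched in Python as follows. This is a generic evolutionary sketch under stated assumptions, not the NRBO algorithm itself: the multi-objective function is replaced by a single hypothetical objective (total route length), obstacle avoidance and path smoothing are omitted, a fixed generation count stands in for the convergence test, and all function names and parameters are illustrative.

```python
import math
import random


def route_length(order, points):
    """Total Euclidean length of visiting the points in the given order."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(order, order[1:]))


def order_crossover(p1, p2, rng):
    """Order crossover: keep a slice of p1 in place, fill the rest in p2's order."""
    i, j = sorted(rng.sample(range(len(p1)), 2))
    middle = p1[i:j + 1]
    rest = [node for node in p2 if node not in middle]
    return rest[:i] + middle + rest[i:]


def swap_mutation(order, rng, rate=0.2):
    """With probability `rate`, swap two nodes (a simple node rearrangement)."""
    order = order[:]
    if rng.random() < rate:
        a, b = rng.sample(range(len(order)), 2)
        order[a], order[b] = order[b], order[a]
    return order


def optimize_route(points, pop_size=30, generations=100, seed=0):
    """Evolve visiting orders of the task points; returns the best order found."""
    rng = random.Random(seed)
    n = len(points)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda o: route_length(o, points))
        elite = population[: pop_size // 2]  # retain the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            children.append(swap_mutation(order_crossover(p1, p2, rng), rng))
        population = elite + children
    return min(population, key=lambda o: route_length(o, points))


if __name__ == "__main__":
    task_points = [(0, 0), (4, 0), (1, 0), (3, 0), (2, 0)]
    best = optimize_route(task_points)
    print(best, route_length(best, task_points))
```

Because the better half of each generation is carried over unchanged, the best route length never gets worse from one generation to the next. A closer match to the claim would replace `route_length` with a vector of objectives and a non-dominated (Pareto) selection step.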