A hydrogen-electric hybrid unmanned aerial vehicle power supply configuration optimization method and related equipment

By employing a two-layer optimization framework and transfer learning technology, the problem of the disconnect between hardware configuration and energy management in hydrogen-electric hybrid UAV systems has been solved. This has enabled accurate assessment of the entire lifecycle cost and efficient power system design, thereby improving the economic efficiency and mission adaptability of the UAV.

CN122287291A · Pending · Publication Date: 2026-06-26 · SOUTH CHINA UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SOUTH CHINA UNIV OF TECH
Filing Date
2026-02-10
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

In existing hydrogen-electric hybrid drone system designs, hardware configuration and energy management strategies are disconnected, making it impossible to achieve global optimization and resulting in inaccurate cost predictions, which makes it difficult to meet the long-term operational economic requirements.

Method used

A two-layer optimization framework is adopted: the outer layer optimizes hardware parameters, and the inner layer uses transfer learning technology to generate the optimal energy management strategy. Combined with full life cycle cost assessment, the synergistic optimization of hardware and energy management is achieved.

Benefits of technology

This achieves deep synergy between hardware configuration and energy management strategies, improving the economics and mission capabilities of drones, reducing long-term operating costs, and enhancing design efficiency and solution adaptability.

✦ Generated by Eureka AI based on patent content.


Abstract

This application provides a method and related equipment for optimizing the power configuration of a hydrogen-electric hybrid unmanned aerial vehicle (UAV), belonging to the field of UAV overall design and energy management. The method employs a two-layer nested optimization architecture: the outer model aims to minimize the total lifecycle cost and maximize the payload, using algorithms to optimize hardware parameters such as fuel cell rated power, lithium battery capacity and number of cells in series, and hydrogen storage tank volume; the inner model, given the hardware configuration of the outer layer, uses a deep reinforcement learning algorithm based on transfer learning for energy management, rapidly generating the optimal power allocation strategy through similarity scaling and fine-tuning techniques. This application achieves coordinated optimization of hardware selection and control strategies by using a closed-loop full lifecycle simulation, combining a power cycle loss model and an ampere-hour throughput method to evaluate the lifetime losses of hydrogen fuel cells and lithium batteries, respectively, making it suitable for customized designs of hydrogen-electric hybrid UAVs with different mission requirements.

Description

Technical Field

[0001] This application relates to the field of unmanned aerial vehicle (UAV) overall design and energy management technology, and in particular to a method for optimizing the power configuration of a hydrogen-electric hybrid UAV and related equipment.

Background Technology

[0002] With the increasing demand for long-endurance and high-payload capabilities in logistics, inspection, surveying, and other fields, hybrid power systems composed of hydrogen fuel cells and lithium batteries have emerged as a highly promising solution. Hydrogen fuel cells offer advantages such as high energy density, long flight time, and rapid refueling, while lithium batteries provide high power density to meet the high power requirements of drones during takeoff and climb. However, designing a cost-effective and high-performance hydrogen-electric hybrid power system for specific missions still faces the following challenges: Existing power system design methods mostly adopt a two-stage approach: "select first, control later." This means that first, based on static energy balance or empirical formulas, the capacity parameters of the fuel cell, lithium battery, and hydrogen storage tank are initially selected; then, an energy management strategy is designed for the selected hardware system. This method severs the dynamic coupling between hardware configuration and operational control, resulting in the final design scheme failing to fully realize the system's potential and making it difficult to achieve global optimization.

[0003] Furthermore, existing designs often focus on optimizing initial purchase costs, neglecting the system's operating costs throughout its entire lifecycle, particularly the maintenance and replacement costs caused by fuel cell stack degradation and lithium battery lifespan decline. In actual flight, drastic fluctuations in load power can cause nonlinear damage to fuel cells, and the lifespan of lithium batteries is closely related to their charge / discharge depth and rate. Traditional simplified lifespan models cannot accurately reflect component degradation under these complex operating conditions, leading to distorted cost predictions and poor long-term operational economics.

[0004] At the optimization method level, achieving coordinated optimization of hardware configuration and energy management strategy during the design phase essentially constitutes a complex two-layer optimization problem: the outer layer optimizes hardware parameter combinations, while the inner layer solves for the optimal energy management strategy for each set of parameters. Directly using intelligent algorithms such as reinforcement learning to train control strategies from scratch for a massive number of hardware combinations is computationally very expensive and cannot meet the timeliness requirements of engineering design.

Summary of the Invention

[0005] The main objective of this application is to propose a method for optimizing the power configuration of a hydrogen-electric hybrid unmanned aerial vehicle (UAV), as well as electronic devices, storage media, and program products. The aim is to achieve deep synergistic optimization of power system hardware parameters and energy management strategies, and to obtain an economically optimal and mission-adaptable power system design scheme through accurate assessment of the entire life cycle cost. At the same time, the computational efficiency of the optimization process is significantly improved by utilizing techniques such as transfer learning.

[0006] To achieve the above objectives, one aspect of this application proposes a method for optimizing the power configuration of a hydrogen-electric hybrid unmanned aerial vehicle (UAV), the method comprising: an outer optimization step, which, based on the preset flight mission and constraints and with the objectives of minimizing the total life cycle cost and maximizing the payload, optimizes the power system hardware configuration scheme, including the rated power of the fuel cell, the capacity and configuration of the lithium battery pack, and the volume of the hydrogen storage tank; and an inner optimization step, which uses a deep reinforcement learning-based energy management strategy to optimize power allocation for each hardware configuration scheme generated in the outer optimization step, in order to minimize energy consumption and component lifespan loss in a single operating cycle. The inner optimization step uses transfer learning to initialize the energy management strategy agent: the pre-trained strategy model is scaled and fine-tuned based on the similarity between the current hardware configuration and the pre-trained benchmark configuration, so as to quickly generate the optimal energy management strategy adapted to the current hardware configuration. The method achieves coordinated optimization of the power system hardware configuration and the energy management strategy by iteratively executing the outer optimization step and the inner optimization step.

[0007] In some embodiments, the total lifecycle cost includes at least two of the following: initial investment cost, energy consumption cost, operation and maintenance cost, and component replacement cost due to the lifespan degradation of fuel cells and lithium batteries; the lifespan degradation is quantitatively assessed by establishing a power cycle loss model and / or an ampere-hour throughput model for the components.

[0008] In some embodiments, the optimization variables in the outer optimization step include at least the fuel cell rated power, the total capacity of the lithium battery pack, the number of lithium battery cells in series, and the hydrogen storage tank volume; each variable takes a value within a preset feasible domain.

[0009] In some embodiments, during the outer optimization step, the generated hardware configuration scheme is subjected to physical constraint verification; the physical constraints include at least the requirement that the effective payload calculated from the maximum takeoff weight of the UAV, the frame mass, and the mass of each component must be greater than zero.

[0010] In some embodiments, the transfer learning technique specifically includes: calculating a similarity scaling factor that characterizes the proportional relationship between the current hardware configuration and the baseline hardware configuration; mapping, based on the scaling factor, the system state space under the current configuration to the state space corresponding to the baseline configuration; loading the weights of the policy neural network pre-trained on the baseline configuration as initial weights; and performing fine-tuning training in the current configuration environment to obtain the final energy management strategy.

[0011] In some embodiments, the energy management strategy model of the pre-trained benchmark configuration is trained on a benchmark UAV system with a typical mission profile using the proximal policy optimization (PPO) algorithm.

[0012] In some embodiments, the inner optimization step evaluates the performance indicators under the energy management strategy by simulating a complete charge-discharge cycle of the drone, from full energy load until it can no longer complete a single mission; the performance indicators include at least the single-cycle fuel cell lifetime loss, the single-cycle equivalent full cycle count of the lithium battery, the single-cycle hydrogen and electricity cost, and the number of mission flights that can be completed within a single cycle.

[0013] In some embodiments, the single-cycle fuel cell lifetime loss is calculated by accumulating the average power operation damage, the power fluctuation dynamic damage, and the high-power condition penalty damage of each flight mission. The single-cycle equivalent full cycle count of the lithium battery is calculated from the relationship between the time integral of the absolute value of the lithium battery output current and the total battery capacity.

[0014] In some embodiments, the outer optimization step is solved using a multi-objective optimization algorithm, which outputs a set of Pareto-optimal solutions that are mutually non-dominated with respect to the two objectives of total lifecycle cost and payload.
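The idea of a mutually non-dominated solution set can be illustrated with a small sketch. It is not the NSGA procedure itself, only the dominance test that such algorithms build on; the function names and the sample numbers are assumptions made for illustration.

```python
from typing import List, Tuple

# Each design is (total_lifecycle_cost, payload_kg):
# the cost is minimized and the payload is maximized.
Design = Tuple[float, float]


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep only the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Illustrative (made-up) candidates: cost in yuan, payload in kg.
candidates = [(1000.0, 2.0), (1200.0, 3.0), (1100.0, 1.5), (1500.0, 3.0)]
front = pareto_front(candidates)
```

In this example the third candidate costs more and carries less than the first, and the fourth costs more than the second for the same payload, so only the first two remain on the front.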

[0015] In some embodiments, the multi-objective optimization algorithm is an algorithm from the non-dominated sorting genetic algorithm (NSGA) family or a decomposition-based multi-objective evolutionary algorithm.

[0016] To achieve the above objectives, another aspect of this application provides an electronic device, which includes a memory and a processor. The memory stores a computer program, and the processor executes the computer program to implement the method described above.

[0017] To achieve the above objectives, another aspect of the embodiments of this application proposes a computer-readable storage medium storing a computer program that, when executed by a processor, implements the method described above.

[0018] To achieve the above objectives, another aspect of the embodiments of this application proposes a computer program product, including a computer program that, when executed by a processor, implements the method described above.

[0019] Compared with the prior art, this application has the following significant advantages: 1) Achieving Deep Collaborative Design: This application breaks away from the traditional fragmented design model of "selecting first, then controlling," and through a two-layer optimization framework, enables hardware selection and energy management strategies to be synchronized and optimized in a closed loop. This ensures that the final selected hardware configuration can maximize the energy-saving and life-extending potential of the intelligent energy management strategy, thereby improving the overall economy and mission capabilities of the UAV system.

[0020] 2) Improved accuracy of cost assessment: By integrating a full lifecycle cost analysis that includes a nonlinear lifespan decay model, the optimization results of this application are closer to the long-term actual operating costs of the system. This avoids the risk of a surge in costs later due to underestimating the frequency of component replacement, providing a more reliable basis for investment decisions.

[0021] 3) Significantly improve optimization efficiency: To address the problem of massive and time-consuming strategy training in inner-layer optimization, an innovative transfer learning mechanism based on similarity scaling is introduced. This mechanism can quickly transfer learned energy management experience to new hardware configurations, avoiding repeated training from scratch. This enables large-scale two-layer collaborative optimization to be completed within the engineering timeframe, demonstrating practical application value.

[0022] 4) Enhanced adaptability and flexibility: This application provides a general optimization framework. By inputting different mission parameters (flight time, range, payload requirements, etc.), it can automatically generate optimal or suboptimal power system configuration schemes that match them, enabling rapid response to diverse UAV application scenarios and demonstrating strong customization capabilities.

Attached Figure Description

[0023] Figure 1 is a supplementary flowchart of the power configuration optimization method for hydrogen-electric hybrid drones provided in the embodiments of this application.

[0024] Figure 2 is a general flowchart of the power configuration optimization method for hydrogen-electric hybrid drones provided in the embodiments of this application.

[0025] Figure 3 is a schematic diagram of the hardware structure of the electronic device provided in the embodiments of this application.

Detailed Implementation

[0026] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of this application and are not intended to limit it. In the following description, when referring to the accompanying drawings, unless otherwise indicated, the same numbers in different drawings represent the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with those of this application; they are merely examples of apparatuses and methods consistent with some aspects of the embodiments of this application as detailed in the appended claims.

[0027] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of this application only and is not intended to limit this application.

[0028] With the increasing size and longer endurance requirements of logistics drones, hybrid power systems composed of hydrogen fuel cells and lithium batteries have become a promising application solution. However, existing power system design methods suffer from drawbacks such as the separation of design and control, simplification of lifespan models, and time-consuming optimization calculations.

[0029] Traditional capacity configuration methods for hydrogen-electric hybrid drones are mostly based on static energy balance, assuming that the fuel cell provides all the average power, without considering the dynamic coupling effect of actual energy management strategies on energy utilization efficiency, component lifespan, and total lifecycle cost. Therefore, it is difficult to obtain a globally optimal and engineering-feasible design. Furthermore, existing capacity configurations often only consider the initial purchase cost, ignoring the component replacement costs throughout the lifecycle. Simple lifespan calculation formulas cannot reflect the nonlinear damage to the battery and fuel cell stack caused by complex flight missions. In two-layer optimization, the inner layer needs to optimize energy management strategies for thousands of hardware combinations. If reinforcement learning is used, training each combination from scratch is extremely time-consuming, which makes it difficult for the two-layer optimization to converge within a practical computation budget.

[0030] In view of this, this application provides a two-layer optimization method that integrates full lifecycle cost assessment, hardware parameter optimization, and real-time energy management strategy collaborative design. By introducing full lifecycle simulation loops and transfer learning techniques, this application effectively solves the aforementioned problems while achieving more accurate cost prediction and introducing a more intelligent reinforcement learning agent. The method is applicable to hydrogen-electric hybrid UAVs of different sizes and with different mission requirements, and can provide customized optimal energy system capacity configuration schemes for any given mission scenario.

[0031] As shown in Figure 1, this embodiment provides a method for optimizing the power configuration of a hydrogen-electric hybrid drone, including the following steps: Step S101, Outer Optimization Step: Based on the user-defined UAV flight mission profile (such as range, cruise time, maximum takeoff weight, etc.) and physical constraints, construct an optimization problem with the objectives of minimizing the total lifecycle cost and maximizing the payload. The decision variables in this step include at least the rated power of the fuel cell, the total capacity and series configuration of the lithium battery pack, and the volume of the hydrogen storage tank. A multi-objective optimization algorithm (such as NSGA-III) is used to search for hardware configuration schemes within a predefined feasible region.

[0032] In one embodiment, the lifecycle cost assessment model integrates initial investment costs, energy (hydrogen and electricity) consumption costs, periodic operation and maintenance costs, and component replacement costs due to the end of the lifespan of fuel cells and lithium batteries. Specifically, the lifespan loss of fuel cells is quantified by considering average power, power fluctuations, and damage models under high-power conditions; the lifespan loss of lithium batteries is calculated using the ampere-hour throughput method, combined with a charge-discharge depth correction factor, to determine the equivalent number of cycles.

[0033] Step S102, Inner Optimization Step: For each candidate hardware configuration scheme generated in the outer step, its performance under the optimal energy management strategy needs to be evaluated. This step uses a deep reinforcement learning-based agent as the core of the energy management strategy. To accelerate the policy learning process, transfer learning technology is introduced: First, a general policy model is pre-trained on a system with typical tasks and benchmark hardware configurations; then, for each new candidate configuration, the policy network weights of the pre-trained model are scaled and transferred by calculating its similarity scaling factor with the benchmark configuration, and then rapidly fine-tuned to obtain the optimal energy management strategy adapted to the current hardware.

[0034] In one embodiment, the specific process of transfer learning includes: defining the ratio of the rated power of the current configuration to that of the baseline configuration as a similarity scaling factor K; normalizing the state variables (such as demand power, battery SOC, etc.) under the current configuration to the state space of the baseline configuration through factor K; loading the neural network parameters of the pre-trained policy model as initial values; and performing a small number of iterations of fine-tuning training in the new simulation environment to quickly adapt to the differences in hardware parameters.
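The first three transfer-learning operations described above (computing K, mapping the state, and loading pre-trained weights) can be sketched as follows. This is a minimal illustration under stated assumptions: the names are hypothetical, power-like state variables are assumed to be divided by K, SOC is assumed to need no rescaling because it is already dimensionless, and the policy weights are represented as a plain dictionary rather than a real neural network. The fine-tuning step is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HardwareConfig:
    fc_rated_power_w: float  # fuel cell rated power (W)


def similarity_factor(current: HardwareConfig, baseline: HardwareConfig) -> float:
    """K = rated power of the current configuration / rated power of the baseline."""
    return current.fc_rated_power_w / baseline.fc_rated_power_w


def map_state_to_baseline(demand_power_w: float, soc: float, k: float) -> List[float]:
    """Map a state observed on the current hardware into the baseline state space.

    Assumption: power-like quantities are divided by K; SOC (0..1) is left unchanged.
    """
    return [demand_power_w / k, soc]


def init_from_pretrained(pretrained: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Copy the pre-trained policy weights as the starting point for fine-tuning."""
    return {name: list(values) for name, values in pretrained.items()}


# Illustrative values only.
baseline = HardwareConfig(fc_rated_power_w=1000.0)
current = HardwareConfig(fc_rated_power_w=1500.0)
k = similarity_factor(current, baseline)
state = map_state_to_baseline(demand_power_w=1200.0, soc=0.8, k=k)
weights = init_from_pretrained({"layer0": [0.1, -0.2]})
```

With these numbers K is 1.5, so a 1200 W demand on the current hardware is presented to the pre-trained policy as an 800 W demand on the baseline hardware.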

[0035] Step S103, Cooperative Optimization Cycle: Under the control of the energy management strategy, a complete charge-discharge cycle flight mission simulation is performed on the current hardware configuration, collecting simulation data (such as hydrogen consumption, battery current, component damage, etc.) to calculate the total lifecycle cost and payload under this configuration. The outer optimization algorithm updates and generates a new generation of candidate hardware configurations based on this feedback information (i.e., the objective function value). Through iterative execution of outer optimization and inner simulation evaluation, the cooperative optimization and synchronous convergence of the hardware configuration and energy management strategy are ultimately achieved.

[0036] The following is a detailed description and explanation of the solutions in the embodiments of this application, with reference to specific application examples.

[0037] This embodiment presents a two-layer optimization method for the power configuration of hydrogen-electric hybrid drones, considering the entire lifecycle cost. The "entire lifecycle" refers to the entire usage period from system deployment to the fuel cell reaching the end of its design life. As shown in Figure 2, the method includes the following steps: Step S1 (Task Setting and Initialization): Set the UAV's flight mission trajectory, mission parameters, and operational constraints, and initialize the outer optimization algorithm's operating parameters.

[0038] Furthermore, the mission parameters in step S1 include the cruise time corresponding to the distance the UAV needs to fly, the maximum takeoff weight of the UAV (unit: kg), the corresponding UAV frame mass (unit: kg), the propeller diameter, and other parameters.

[0039] The outer optimization algorithm uses the Non-Dominated Sorting Genetic Algorithm III (NSGA-III), and the running parameters include population size, maximum number of iterations, reference point, crossover probability, mutation probability, etc.

[0040] Step S2 (Capacity Configuration Generation): Using the outer optimization algorithm, a set of candidate capacity configuration solutions is generated, including the fuel cell rated power, lithium battery pack capacity, number of battery cells connected in series, and hydrogen storage tank volume. These are the outer decision variables, expressed as:
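As a sketch, the four outer decision variables can be collected into a single vector bounded by the preset feasible domain. The symbol names below are assumptions chosen for illustration: rated fuel cell power, lithium battery capacity, number of cells in series, and hydrogen tank volume.

```latex
\mathbf{x} = \left[\, P_{\mathrm{fc}}^{\mathrm{rated}},\; Q_{\mathrm{bat}},\; N_{\mathrm{s}},\; V_{\mathrm{H_2}} \,\right],
\qquad
\mathbf{x}_{\min} \le \mathbf{x} \le \mathbf{x}_{\max}
```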

[0041] Here, the decision variables are the rated power of the fuel cell (unit: W), the capacity of a single lithium battery cell (unit: Ah), the number of lithium battery cells connected in series, and the volume of the hydrogen storage tank at a specific pressure (unit: L).

Step S3 (Constraint Verification and Branch Processing): Model each component of the UAV and verify the physical constraints for the candidate capacity configuration solution. If the candidate solution does not meet the constraints, directly assign the preset penalty value to the solution and skip the inner simulation step; if the candidate solution meets the constraints, proceed to step S4 to perform inner energy management optimization.

[0042] In step S3, the fuel cell model is used to calculate hydrogen consumption based on the output power. It is constructed based on the polarization curve characteristics, and the calculation method is as follows: Given the P-I curve and I-Flow curve of a fuel cell with a specific rated power, obtained by fitting measured performance data of a typical proton exchange membrane fuel cell, the Flow-P curve of the fuel cell with the target rated power is obtained by using the current as the intermediate variable.
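One way to picture this construction is to treat the current as the link between the two reference curves: invert the P-I curve to find the current for a given power, then read the hydrogen flow at that current. The sketch below is an assumption-laden illustration, not the fitted model of this application: the tabulated numbers are invented, piecewise-linear interpolation replaces the fitted curves, and power and flow are both assumed to scale linearly with the ratio of target to reference rated power.

```python
from bisect import bisect_left
from typing import List


def interp(x: float, xs: List[float], ys: List[float]) -> float:
    """Piecewise-linear interpolation; xs must be strictly increasing. Clamps at the ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_left(xs, x)
    t = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return ys[j - 1] + t * (ys[j] - ys[j - 1])


# Hypothetical reference curves for a 1000 W stack (illustrative, not measured data).
REF_RATED_W = 1000.0
CURRENT_A = [0.0, 10.0, 20.0, 30.0, 40.0]
POWER_W = [0.0, 300.0, 560.0, 800.0, 1000.0]   # P-I curve
FLOW_G_S = [0.0, 0.004, 0.008, 0.012, 0.016]   # I-Flow curve


def hydrogen_flow(power_w: float, target_rated_w: float) -> float:
    """Hydrogen flow (g/s) of a stack with the target rated power at a given output power."""
    scale = target_rated_w / REF_RATED_W             # assumed linear scaling
    ref_power = power_w / scale                      # operating point on the reference stack
    current = interp(ref_power, POWER_W, CURRENT_A)  # invert P(I): power -> current
    return scale * interp(current, CURRENT_A, FLOW_G_S)  # current -> flow, scaled back


flow = hydrogen_flow(power_w=1120.0, target_rated_w=2000.0)
```

Here a 2000 W stack delivering 1120 W is mapped to the 560 W point of the reference stack, which sits at 20 A and 0.008 g/s, giving 0.016 g/s after scaling back.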

[0043]

[0044] Here, the coefficients are all constants used to fit the measured experimental curves; the variables are the hydrogen consumption per unit time (unit: g/s) and the current output power of the hydrogen fuel cell (unit: W). The fuel cell mass is determined based on the selected fuel cell rated voltage.

[0045] Here, the quantities are the fixed system mass (controllers, valves, cooling, piping; unit: kg) and the stack-level power density (unit: W/kg).
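Given a fixed system mass and a stack-level power density in W/kg, one plausible form of the mass relation adds the fixed mass to the stack mass implied by the rated power. The symbols, and the use of rated power as the sizing quantity, are assumptions for illustration.

```latex
m_{\mathrm{fc}} = m_{\mathrm{fix}} + \frac{P_{\mathrm{fc}}^{\mathrm{rated}}}{\rho_{P}}
```

For instance, under this assumed form a 1000 W stack with a power density of 500 W/kg and 1.5 kg of fixed hardware would weigh 3.5 kg.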

[0046] In step S3, the lithium battery model is either an electrochemical model or an equivalent circuit model. A lithium battery pack consists of several lithium battery cells, and the parameters of the lithium battery model are calculated as follows: Total voltage (unit: V):

[0047] Here, the cell voltage is that of the smallest battery cell making up the lithium battery pack (unit: V). Total battery pack energy (unit: Wh):

[0048] Exponential voltage (unit: V):

[0049] Here, the cell-level quantity is the exponential voltage of the lithium battery cell. Exponential capacity (unit: Ah⁻¹):

[0050] Here, the cell-level quantity is the exponential capacity of a lithium battery cell.

[0051] Polarization constant (unit: V / Ah):

[0052] Here, the quantities are the scaling factor for the polarization constant, the polarization constant of the lithium battery cell, and the maximum capacity of a lithium battery cell. Internal resistance (unit: Ω):

[0053] Here, the coefficient is the non-ideal loss coefficient; the smaller its value, the closer the system is to an ideal parallel connection.

[0054] Lithium battery pack mass (unit: kg):

[0055] Here, the cell-level quantity is the mass of a lithium battery cell (unit: kg). After determining the fuel cell and lithium battery models, the UAV's payload is:

[0056] The physical constraints in step S3 are:

[0057] If this constraint is not met, TLC is directly set to a larger negative value.
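The constraint check and the branch that skips the inner simulation can be sketched as below. The names are hypothetical, the hydrogen tank mass is assumed to be part of the mass budget, and the penalty value is passed in by the caller: the text describes it as a large negative value, and the sign that actually penalizes a candidate depends on how the optimizer encodes the cost objective.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class MassBudget:
    max_takeoff_kg: float  # maximum takeoff weight
    frame_kg: float        # airframe mass
    fuel_cell_kg: float    # fuel cell system mass
    battery_kg: float      # lithium battery pack mass
    tank_kg: float         # hydrogen tank mass (assumed to be part of the budget)


def payload_kg(m: MassBudget) -> float:
    """Payload is what remains of the takeoff weight after frame and power components."""
    return m.max_takeoff_kg - m.frame_kg - m.fuel_cell_kg - m.battery_kg - m.tank_kg


def evaluate_candidate(
    m: MassBudget,
    penalty_tlc: float,
    inner_evaluation: Callable[[MassBudget], float],
) -> Tuple[float, float]:
    """Return (TLC, payload); infeasible candidates get the penalty and skip the inner step."""
    payload = payload_kg(m)
    if payload <= 0.0:
        return penalty_tlc, payload
    return inner_evaluation(m), payload


# Illustrative use with a stand-in for the inner simulation.
calls = []

def fake_inner(m: MassBudget) -> float:
    calls.append(m)
    return 1234.0

feasible = evaluate_candidate(MassBudget(25.0, 8.0, 4.0, 3.0, 2.0), -1.0e9, fake_inner)
infeasible = evaluate_candidate(MassBudget(25.0, 12.0, 6.0, 5.0, 3.0), -1.0e9, fake_inner)
```

Only the feasible candidate triggers the (stand-in) inner evaluation, which is where the expensive reinforcement-learning step would otherwise run.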

[0058] Step S4 (Policy Initialization Based on Transfer Learning): Initialize the energy management strategy (EMS) agent using transfer learning: load the model parameters of the pre-trained source domain model, and fine-tune the agent according to the hardware parameters of the current candidate solution. In step S4, the "pre-trained source domain model" is an agent trained with the deep reinforcement learning algorithm PPO (Proximal Policy Optimization) on a benchmark UAV system with similar task profiles and typical hardware configurations.

[0059] Further, a similarity scaling factor K is defined, where K is the ratio of the rated power of the current candidate solution to the rated power of the benchmark hardware; the target domain state space corresponding to the current candidate solution is normalized and mapped to the source domain state space through the scaling factor K; the neural network weight parameters of the source domain agent are used as the initial weights of the target domain agent, and few-sample fine-tuning training is performed in the hardware environment of the current candidate solution to generate an EMS control strategy adapted to the current capacity configuration.

[0060] Step S5 (Full Charge-Discharge Operation Simulation): Under the control of the EMS agent, perform a continuous flight simulation of a full charge-discharge cycle on the UAV to dynamically evaluate the actual energy consumption and component lifespan loss of a single full charge-discharge operation cycle under the hardware configuration.

[0061] In step S5, the "full charge-discharge cycle" is defined as the period from when the hydrogen tank is fully loaded and the lithium battery's state of charge is initialized to its upper limit, during which the drone continuously performs flight missions until the remaining energy is insufficient to complete a full mission trajectory, including the hybrid flight phase and the pure electric flight phase.
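The stopping rule of the full charge-discharge cycle can be illustrated with a deliberately simplified energy ledger. This sketch is not the simulation of this application: a fixed fuel cell share stands in for the EMS agent, each mission is assumed to need a constant amount of electrical energy, battery recharging is ignored, and the usable energy per gram of hydrogen is an illustrative figure. All names are hypothetical.

```python
def count_missions(
    h2_g: float,
    batt_wh: float,
    mission_wh: float,
    h2_wh_per_g: float,
    fc_share: float,
) -> int:
    """Count missions flown from a full energy load until one can no longer be completed."""
    if mission_wh <= 0.0:
        raise ValueError("mission_wh must be positive")
    flights = 0
    while True:
        # Hybrid phase while hydrogen remains; pure electric once the tank is empty.
        fc_energy = min(fc_share * mission_wh, h2_g * h2_wh_per_g)
        batt_energy = mission_wh - fc_energy
        if batt_energy > batt_wh:
            break  # remaining energy cannot cover a full mission
        h2_g = max(0.0, h2_g - fc_energy / h2_wh_per_g)
        batt_wh -= batt_energy
        flights += 1
    return flights


# Case 1: the battery runs out first.
flights_a = count_missions(h2_g=100.0, batt_wh=250.0, mission_wh=500.0,
                           h2_wh_per_g=15.0, fc_share=0.75)
# Case 2: the hydrogen runs out during the second mission, which the battery finishes.
flights_b = count_missions(h2_g=30.0, batt_wh=1000.0, mission_wh=500.0,
                           h2_wh_per_g=15.0, fc_share=0.75)
```

Both cases end after two missions, but for different reasons, which is why the cycle definition covers the hybrid phase and the pure electric phase together.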

[0062] The specific performance indicators for a single full charge-discharge cycle in step S5 include: the single-cycle fuel cell lifetime loss, the single-cycle equivalent full cycle count of the lithium battery, the single-cycle hydrogen and electricity cost, and the number of mission flights that can be completed within a single cycle.

[0063] The "single-cycle fuel cell life loss" in step S5 is the sum of life loss from multiple tasks within a single cycle, calculated using the following formula:

[0064] Here, the index is the flight mission number. The fuel cell lifespan loss of each individual flight mission is calculated as follows. The actual output power of the fuel cell is first normalized to its rated power:

[0065] Calculate the average power damage:

[0066] Here, the quantities are the average power damage coefficient and the sampling time interval (unit: hours). Calculate the dynamic power damage:

[0067] Here, the coefficient is the dynamic power damage coefficient, and the remaining term is the absolute value of the difference between the normalized power at adjacent time points, reflecting the amplitude of power fluctuation.

[0068] Calculate the high-power penalty damage:

[0069] Here, the quantities are the high-power penalty coefficient (unit: h⁻¹), the high-power threshold, and an indicator function that takes the value 1 when the power exceeds the threshold and 0 otherwise.

[0070] The total lifespan loss of this mission is the sum of the above three items:

[0071] This value represents the fraction of the fuel cell lifespan consumed, ranging from 0 to 1.
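The three damage terms can be sketched in a few lines. The exact expressions are not reproduced here, so the forms below are assumptions consistent with the definitions above: the average term accumulates normalized power over time, the dynamic term accumulates absolute step changes in normalized power, and the penalty term accumulates the time spent above the threshold. The coefficient values are invented for the example.

```python
from typing import Sequence


def fuel_cell_mission_damage(
    power_w: Sequence[float],
    rated_w: float,
    dt_h: float,
    k_avg: float,
    k_dyn: float,
    k_high: float,
    p_high: float,
) -> float:
    """Fraction of fuel cell life consumed by one mission (assumed three-term form)."""
    p = [x / rated_w for x in power_w]                          # normalized power
    d_avg = k_avg * sum(p) * dt_h                               # average power damage
    d_dyn = k_dyn * sum(abs(b - a) for a, b in zip(p, p[1:]))   # fluctuation damage
    d_high = k_high * sum(1 for x in p if x > p_high) * dt_h    # high-power penalty
    return d_avg + d_dyn + d_high


# Illustrative power trace sampled every 0.5 h on a 1000 W stack.
damage = fuel_cell_mission_damage(
    power_w=[500.0, 1000.0, 500.0], rated_w=1000.0, dt_h=0.5,
    k_avg=0.002, k_dyn=0.01, k_high=0.004, p_high=0.9,
)
```

For this trace the three terms are 0.002, 0.01, and 0.002, so the fluctuation term dominates; a smoother power split from the EMS would reduce it.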

[0072] The calculation method for "equivalent full cycle count of a single-cycle lithium battery" in step S5 is as follows:

[0073] Here, the current is the output current of the lithium battery at a given time (unit: A), and the depth coefficient takes the value 1 when the lithium battery is in deep cycling and 0.1 when it is in shallow cycling.
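An ampere-hour throughput count along these lines might look as follows. Two details are assumptions rather than statements from the text: one full cycle is taken as a full discharge plus a full charge (hence the factor of 2 in the denominator), and the depth coefficient is applied as a plain multiplier. The function name is hypothetical.

```python
from typing import Sequence


def equivalent_full_cycles(
    current_a: Sequence[float],
    dt_h: float,
    capacity_ah: float,
    depth_coeff: float,
) -> float:
    """Equivalent full cycles from the absolute ampere-hour throughput."""
    throughput_ah = sum(abs(i) for i in current_a) * dt_h  # discrete integral of |I| dt
    return depth_coeff * throughput_ah / (2.0 * capacity_ah)


# Illustrative current trace (A), sampled every 0.5 h, on a 10 Ah pack.
trace = [10.0, -10.0, 20.0, -20.0]
deep = equivalent_full_cycles(trace, dt_h=0.5, capacity_ah=10.0, depth_coeff=1.0)
shallow = equivalent_full_cycles(trace, dt_h=0.5, capacity_ah=10.0, depth_coeff=0.1)
```

The same 30 Ah of throughput counts as 1.5 equivalent cycles under deep cycling but only about 0.15 under shallow cycling, reflecting the lower wear of shallow cycles.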

[0074] Step S6 (Lifecycle Indicator Calculation): Based on the performance indicators of a single full charge-discharge cycle calculated in Step S5, calculate the total lifecycle cost of the UAV power system.

[0075] The total lifecycle cost in step S6 consists of the initial investment cost, the hydrogen and electricity consumption cost, the operation and maintenance cost, and the lithium battery pack replacement cost. The cost calculation methods are as follows. The purchase cost of the energy storage system, also known as the initial investment cost, includes the purchase cost of the fuel cell, the lithium battery, and the hydrogen storage tank:

[0076] Here, the coefficients are the cost per unit power of the fuel cell (unit: yuan/W), the cost per unit capacity of the lithium battery (unit: yuan/Ah), and the cost per unit volume of the hydrogen tank (unit: yuan/L).

[0077] Given the upper limit of the design life of the fuel cell, the total number of full charge-discharge cycles that can be executed throughout the entire lifecycle is:

[0078] The formula for calculating the cost of hydrogen and electricity consumption is as follows:

[0079] Here, ⌊·⌋ denotes the floor function.

[0080] Operation and maintenance costs are directly proportional to the total number of flights, calculated using the following formula:

[0081] Here, the coefficient is the operation and maintenance cost coefficient (unit: yuan/hour), and the time term is the duration of a single flight mission.

[0082] Battery replacement cost is calculated based on a comparison of the cumulative ampere-hour throughput of the lithium battery with its rated lifespan, using the following formula:

[0083] Here, the reference quantity is the maximum number of cycles within the lifespan of the lithium battery.

[0084] Therefore, the formula for total lifecycle cost is:
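The cost terms of step S6 can be sketched as follows. This is an illustrative reading of paragraphs [0075] to [0084], not the formulas of the application, which are not reproduced in this text. All class, field, and function names are hypothetical. Labeled assumptions: the fuel cell life limit defaults to 1.0 (full life), the operation and maintenance cost is the coefficient times flight duration times total flights, and the battery is replaced once each time the accumulated equivalent full cycles pass a multiple of the rated maximum cycle count.

```python
import math
from dataclasses import dataclass


@dataclass
class Hardware:
    fc_power_w: float           # fuel cell rated power (W)
    battery_capacity_ah: float  # lithium battery capacity (Ah)
    tank_volume_l: float        # hydrogen tank volume (L)


@dataclass
class UnitCosts:
    fc_per_w: float        # yuan / W
    battery_per_ah: float  # yuan / Ah
    tank_per_l: float      # yuan / L
    om_per_hour: float     # operation and maintenance, yuan / hour


@dataclass
class CycleMetrics:
    fc_life_loss: float       # fraction of fuel cell life used per cycle (0..1)
    battery_eq_cycles: float  # equivalent full battery cycles per cycle
    fuel_cost: float          # hydrogen and electricity cost per cycle (yuan)
    flights: int              # missions completed per cycle


def total_lifecycle_cost(hw, costs, cyc, flight_hours,
                         fc_life_limit=1.0, battery_max_cycles=1000.0):
    if cyc.fc_life_loss <= 0:
        raise ValueError("fc_life_loss must be positive")
    # Initial investment: fuel cell + lithium battery + hydrogen tank.
    battery_cost = costs.battery_per_ah * hw.battery_capacity_ah
    initial = (costs.fc_per_w * hw.fc_power_w + battery_cost
               + costs.tank_per_l * hw.tank_volume_l)
    # Full charge-discharge cycles until the fuel cell life limit is reached.
    n_cycles = math.floor(fc_life_limit / cyc.fc_life_loss)
    fuel = n_cycles * cyc.fuel_cost
    # ASSUMPTION: O&M cost scales with total flight hours.
    om = costs.om_per_hour * flight_hours * n_cycles * cyc.flights
    # ASSUMPTION: one battery replacement per exhausted rated cycle life.
    replacements = math.floor(n_cycles * cyc.battery_eq_cycles / battery_max_cycles)
    return initial + fuel + om + replacements * battery_cost


if __name__ == "__main__":
    hw = Hardware(1000.0, 20.0, 10.0)
    costs = UnitCosts(2.0, 50.0, 30.0, 10.0)
    cyc = CycleMetrics(0.003, 0.5, 40.0, 4)
    print(total_lifecycle_cost(hw, costs, cyc, flight_hours=0.5,
                               battery_max_cycles=100.0))
```

Because the cycle count comes from the fuel cell life loss per cycle, an energy management strategy that stresses the fuel cell less directly increases the number of flights over which the initial investment is spread.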

[0085] Step S7 (Outer Target Calculation and Evolution): Calculate the outer optimization objective function value by combining the total lifecycle cost and the UAV payload. If neither the algorithm convergence condition nor the maximum number of iterations has been reached, generate a new generation of candidate solutions according to the update strategy of the outer optimization algorithm and return to step S3. The outer optimization objective function in step S7 is defined as minimizing the total lifecycle cost (TLC) and maximizing the payload, as follows:

[0086] In step S7, when the algorithm fails to reach the convergence condition, the next generation population is generated through selection, crossover, and mutation, and then step S3 is repeated.

[0087] Step S8 (Optimal Solution Output): If the convergence condition is met, output the optimal capacity configuration scheme or the Pareto front solution set.
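To illustrate what a Pareto front solution set means for the two outer objectives, the sketch below filters a list of candidate configurations down to the non-dominated ones, treating total lifecycle cost as minimized and payload as maximized. The function name and the `(cost, payload)` tuple layout are hypothetical; the application does not specify how the outer algorithm extracts the front, and a simple pairwise comparison is used here only because it is easy to follow.

```python
def pareto_front(candidates):
    """Return the non-dominated (cost, payload) pairs.

    A candidate a dominates b when a is no worse on both objectives
    (cost not higher, payload not lower) and strictly better on one.
    """
    def dominates(a, b):
        no_worse = a[0] <= b[0] and a[1] >= b[1]
        strictly_better = a[0] < b[0] or a[1] > b[1]
        return no_worse and strictly_better

    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


if __name__ == "__main__":
    # (total lifecycle cost in yuan, payload in kg), illustrative numbers only
    population = [(100, 2.0), (120, 3.0), (130, 2.5), (90, 1.0), (100, 1.5)]
    print(pareto_front(population))
```

The pairwise check costs quadratic time in the population size, which is acceptable for the population sizes typical of evolutionary algorithms.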

[0088] In summary, the method of this embodiment has the following advantages and beneficial effects compared with the prior art: 1) This application achieves deep collaborative optimization of power system hardware configuration and energy management strategy, breaking through the limitations of the traditional "selection first, control later" approach. This application adopts a two-layer nested optimization architecture. The outer layer optimizes the capacity configuration of fuel cells and lithium batteries using multi-objective or heuristic algorithms, while the inner layer uses deep reinforcement learning to generate the optimal energy management strategy for each hardware configuration. This closed-loop design ensures that the capacity configuration scheme can fully leverage the potential of the energy management strategy, while the energy management strategy can perfectly adapt to specific hardware characteristics, thereby obtaining a globally optimal power system design scheme and effectively improving the payload and endurance of the UAV.

[0089] 2) This application significantly improves the accuracy of total lifecycle cost (TLC) assessment, effectively reducing the long-term operating costs of UAVs. Existing technologies often rely solely on static energy balance or consider only initial purchase costs, while this application introduces a closed-loop full lifecycle simulation, incorporating initial investment, energy consumption, maintenance costs, and component replacement costs due to lifespan degradation into a unified consideration. In particular, by introducing the lithium battery ampere-hour throughput method and the fuel cell power cycle loss model, it is possible to quantify the nonlinear damage to component lifespan caused by complex flight missions. This makes the optimization results more closely aligned with practical engineering applications, avoiding the high maintenance and replacement costs that can result from neglecting battery lifespan degradation.

[0090] 3) This application utilizes transfer learning technology to address the computational bottleneck of time-consuming training and difficult convergence of the inner-layer reinforcement learning in two-layer optimization. To address the computational challenge of optimizing policies for a massive number of hardware candidate solutions in the inner layer, this application proposes a transfer learning mechanism based on similarity scaling and fine-tuning. By transferring the weight parameters of the source domain pre-trained model to the target domain and combining this with the normalized mapping of hardware parameters, the sample size and time required to train the EMS agent for new hardware configurations are significantly reduced. This makes large-scale, high-precision two-layer collaborative optimization feasible in engineering practice and significantly improves design efficiency.
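The similarity scaling idea can be sketched as follows, under stated assumptions. The snippet computes per-parameter ratios between the current and baseline hardware, rescales a state so that it falls in the range the baseline policy was trained on, and copies the pre-trained weights as the starting point for fine-tuning. The dictionary keys, the choice of which state entry is scaled by which ratio, and the plain-dictionary weight format are all hypothetical; the fine-tuning loop itself is omitted.

```python
import copy


def scaling_factors(current_hw, baseline_hw):
    """Ratio of each current hardware parameter to its baseline value."""
    return {key: current_hw[key] / baseline_hw[key] for key in baseline_hw}


def map_state_to_baseline(state, factors):
    """Rescale a state into the baseline configuration's state space.

    ASSUMPTION: demand power is scaled by the fuel cell power ratio, while
    state of charge and remaining hydrogen fraction are already dimensionless.
    """
    return {
        "demand_power_w": state["demand_power_w"] / factors["fc_power_w"],
        "battery_soc": state["battery_soc"],
        "hydrogen_fraction": state["hydrogen_fraction"],
    }


def init_agent_from_baseline(pretrained_weights):
    """Copy the pre-trained weights so fine-tuning leaves the source intact."""
    return copy.deepcopy(pretrained_weights)


if __name__ == "__main__":
    baseline = {"fc_power_w": 1000.0, "battery_capacity_ah": 20.0}
    current = {"fc_power_w": 1500.0, "battery_capacity_ah": 30.0}
    factors = scaling_factors(current, baseline)
    state = {"demand_power_w": 1200.0, "battery_soc": 0.8,
             "hydrogen_fraction": 0.6}
    print(map_state_to_baseline(state, factors))
```

Starting from copied weights rather than random initialization is what allows the fine-tuning stage to use fewer samples for each new hardware candidate.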

[0091] 4) This application possesses excellent mission adaptability and customization capabilities. The optimization framework constructed in this application is not only applicable to specific flight missions, but also allows for the rapid generation of customized hydrogen-electric hybrid power system configuration schemes adapted to different sizes and application scenarios (such as logistics transportation and inspection) by adjusting the input mission profile parameters (such as flight trajectory, distance, maximum takeoff weight, etc.), which has broad engineering application value.

[0092] This application also provides an electronic device, which includes a memory and a processor. The memory stores a computer program, and the processor executes the computer program to implement the above-described method. This electronic device can be any smart terminal, including tablet computers, in-vehicle computers, etc.

[0093] It is understood that the content of the above method embodiments is applicable to this device embodiment. The specific functions implemented by this device embodiment are the same as those of the above method embodiments, and the beneficial effects achieved are also the same as those achieved by the above method embodiments.

[0094] Referring to Figure 3, which illustrates the hardware structure of an electronic device according to another embodiment, the electronic device includes the following components. The processor 301 can be implemented using a general-purpose CPU (Central Processing Unit), microprocessor, application-specific integrated circuit (ASIC), or one or more integrated circuits, and is used to execute relevant programs to implement the technical solutions provided in the embodiments of this application. The memory 302 can be implemented as a read-only memory (ROM), static storage device, dynamic storage device, or random access memory (RAM). The memory 302 can store the operating system and other applications. When the technical solutions provided in the embodiments of this specification are implemented through software or firmware, the relevant program code is stored in the memory 302 and is called by the processor 301 to execute the method of the above embodiments of this application. The input / output interface 303 is used to implement information input and output. The communication interface 304 is used to enable communication and interaction between this device and other devices; communication can be achieved through wired means (such as USB, network cable, etc.) or wireless means (such as mobile network, Wi-Fi, Bluetooth, etc.). The bus 305 transmits information between the various components of the device (e.g., processor 301, memory 302, input / output interface 303, and communication interface 304). The processor 301, memory 302, input / output interface 303, and communication interface 304 are connected to each other within the device via the bus 305.

[0095] This application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described method.

[0096] It is understood that the content of the above method embodiments is applicable to this storage medium embodiment. The specific functions implemented in this storage medium embodiment are the same as those in the above method embodiments, and the beneficial effects achieved are also the same as those achieved in the above method embodiments.

[0097] Memory, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs and non-transitory computer-executable programs. Furthermore, memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory may optionally include memory remotely located relative to the processor, and these remote memories can be connected to the processor via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

[0098] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the above-described method.

[0099] It is understood that the content of the above method embodiments is applicable to the embodiments of this program product. The specific functions implemented in the embodiments of this program product are the same as those in the above method embodiments, and the beneficial effects achieved are also the same as those achieved in the above method embodiments. The executable computer program code or "code" used to perform the various embodiments can be written in high-level programming languages such as C, C++, Python, Smalltalk, Java, JavaScript, Visual Basic, Structured Query Language (e.g., Transact-SQL), Perl, or in various other programming languages.

[0100] The embodiments described in this application are for the purpose of more clearly illustrating the technical solutions of the embodiments of this application, and do not constitute a limitation on the technical solutions provided by the embodiments of this application. As those skilled in the art will know, with the evolution of technology and the emergence of new application scenarios, the technical solutions provided by the embodiments of this application are also applicable to similar technical problems.

[0101] Those skilled in the art will understand that the technical solutions shown in the figures do not constitute a limitation on the embodiments of this application, and may include more or fewer steps than shown, or combine certain steps, or different steps.

[0102] The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs.

[0103] Those skilled in the art will understand that all or some of the steps in the methods disclosed above, as well as the functional modules / units in the systems and devices, can be implemented as software, firmware, hardware, or suitable combinations thereof.

[0104] The terms “first,” “second,” “third,” “fourth,” etc. (if present) in the specification and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of this application described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms “comprising” and “having,” and any variations thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.

[0105] It should be understood that in this application, "at least one (item)" means one or more, and "a plurality" means two or more. "And / or" is used to describe the relationship between related objects, indicating that three relationships can exist. For example, "A and / or B" can represent three cases: only A exists, only B exists, and both A and B exist simultaneously, where A and B can be singular or plural. The character " / " generally indicates that the preceding and following related objects are in an "or" relationship. "At least one (item) of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one (item) of a, b, or c can represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can be single or multiple.

[0106] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of the units described above is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.

[0107] The units described above as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0108] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0109] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes multiple instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of this application. The aforementioned storage medium includes various media capable of storing programs, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0110] The preferred embodiments of the present application have been described above with reference to the accompanying drawings, but this does not limit the scope of the claims of the present application. Any modifications, equivalent substitutions, and improvements made by those skilled in the art without departing from the scope and substance of the embodiments of the present application shall be within the scope of the claims of the present application.

Claims

1. A method for optimizing the power supply configuration of a hydrogen-electric hybrid unmanned aerial vehicle, characterized in that the method includes the following steps: an outer optimization step, which, based on the preset flight mission and constraints, aims to minimize the total life cycle cost and maximize the payload, and optimizes the power system hardware configuration scheme, including the rated power of the fuel cell, the capacity and configuration of the lithium battery pack, and the volume of the hydrogen storage tank; an inner optimization step, which uses a deep reinforcement learning-based energy management strategy to optimize power allocation for each hardware configuration scheme generated in the outer optimization step, in order to minimize energy consumption and component lifespan loss in a single operating cycle; wherein the inner optimization step uses transfer learning technology to initialize the energy management strategy agent, and the transfer learning scales and fine-tunes the pre-trained strategy model based on the similarity between the current hardware configuration and the pre-trained benchmark configuration, so as to quickly generate the optimal energy management strategy adapted to the current hardware configuration; and the method achieves coordinated optimization of power system hardware configuration and energy management strategy by iteratively executing the outer optimization step and the inner optimization step.

2. The method according to claim 1, characterized in that the total lifecycle cost includes at least two of the following: initial investment cost, energy consumption cost, operation and maintenance cost, and component replacement cost due to the lifespan degradation of fuel cells and lithium batteries; the lifespan degradation is quantitatively assessed by establishing a power cycle loss model and / or ampere-hour throughput model for the components.

3. The method according to claim 1, characterized in that the optimization variables in the outer optimization step include at least the fuel cell rated power, the total capacity of the lithium battery pack, the number of lithium battery cells in series, and the hydrogen storage tank volume; each variable takes a value within a preset feasible domain.

4. The method according to claim 3, characterized in that, in the outer optimization step, the generated hardware configuration scheme is verified against physical constraints; the physical constraints include at least the requirement that the effective payload calculated from the maximum takeoff weight of the UAV, the frame mass, and the mass of each component must be greater than zero.

5. The method according to claim 1, characterized in that the transfer learning technology specifically includes: calculating the similarity scaling factor that characterizes the proportional relationship between the current hardware configuration and the baseline hardware configuration; based on the scaling factor, mapping the system state space under the current configuration to the state space corresponding to the baseline configuration; loading the weights of the policy neural network pre-trained on the baseline configuration as initial weights; and performing fine-tuning training under the current configuration environment to obtain the final energy management strategy.

6. The method according to claim 1 or 5, characterized in that the energy management strategy model of the pre-trained benchmark configuration is trained on a benchmark UAV system with a typical mission profile using the proximal policy optimization (PPO) algorithm.

7. The method according to claim 1, characterized in that, in the inner optimization step, the performance indicators under the energy management strategy are evaluated by simulating a complete charge-discharge cycle of the UAV from full energy load to inability to complete a single mission; the performance indicators include at least the single-cycle fuel cell lifespan loss, the single-cycle equivalent full cycle count of the lithium battery, the single-cycle hydrogen and electricity cost, and the number of mission flights that can be completed within a single cycle.

8. The method according to claim 7, characterized in that the single-cycle fuel cell lifespan loss is calculated by accumulating the average power operation damage, the power fluctuation dynamic damage, and the high power condition penalty damage in a single flight mission; the single-cycle equivalent full cycle count of the lithium battery is calculated based on the relationship between the time integral of the absolute value of the lithium battery output current and the total battery capacity.

9. The method according to claim 1, characterized in that the outer optimization step is solved using a multi-objective optimization algorithm, and outputs a set of Pareto optimal solutions that are mutually non-dominated with respect to the two objectives of total life cycle cost and payload.

10. An electronic device, characterized in that the electronic device includes a memory and a processor, the memory storing a computer program, and the processor executing the computer program to implement the method according to any one of claims 1 to 9.