A Smart Obstacle Avoidance Decision-Making Method for Aircraft Based on Dynamic Threat Assessment
By using an intelligent obstacle avoidance decision-making method based on the PPO algorithm, the aircraft can perceive and assess threats in real time, build a dynamic obstacle avoidance model, solve the problem that traditional algorithms cannot cope with dynamic threats in complex environments, and achieve efficient and stable obstacle avoidance decision-making and safe flight.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- XIAN AIRCRAFT DESIGN INST OF AVIATION IND OF CHINA
- Filing Date
- 2026-05-14
- Publication Date
- 2026-06-30
Figure CN122308435A_ABST
Abstract
Description
Technical Field
[0001] This application belongs to the field of flight control technology, and specifically relates to an intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment.
Background Technology
[0002] The research on intelligent obstacle avoidance decision-making for aircraft aims to develop a flight control technology that enables aircraft to make autonomous decisions, reduce threats from dynamic targets such as radar, and improve flight safety. The aircraft uses a preset waypoint as a target and performs maneuvers within a certain flight envelope to avoid potential threats along its route. Simultaneously, the influence of multiple factors related to the situation, such as exposure time, needs to be considered.
[0003] The technology of intelligent obstacle avoidance decision-making for aircraft mainly involves avoiding multiple targets and optimizing flight paths. Many scholars have applied traditional algorithms to obstacle avoidance decision-making for aircraft, but with the iterative development of aircraft and radar equipment, traditional algorithms can no longer meet current needs.
[0004] Therefore, there is an urgent need for a technical solution to overcome or mitigate at least one of the aforementioned defects in the existing technology.
Summary of the Invention
[0005] The purpose of this application is to provide an intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment, in order to solve at least one problem existing in the prior art.
[0006] The technical solution of this application is:
[0007] An intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment includes:
[0008] Step 1: Determine if there are obstacles within the current detection range. If so, proceed to the threat level assessment stage to determine the threat level of the obstacles.
[0009] Step 2: Construct an intelligent obstacle avoidance decision-making model and train the model;
[0010] Step 3: When the distance between the aircraft and the obstacle is less than the preset safety threshold, the obstacle avoidance control mode is entered. The status of the aircraft and the obstacle is input into the intelligent obstacle avoidance decision model to obtain the action. The aircraft executes the action to achieve obstacle avoidance flight.
[0011] In at least one embodiment of this application, in step 1, the aircraft determines whether there are obstacles within the current detection range through a real-time environmental perception module.
[0012] In at least one embodiment of this application, step 2, constructing an intelligent obstacle avoidance decision model, includes:
[0013] An intelligent obstacle avoidance decision model is constructed based on the PPO algorithm, with the state variable s as the input of the intelligent obstacle avoidance decision model and the action variable α as the output of the intelligent obstacle avoidance decision model.
[0014] Define the state variable s:
[0015] $s = [p, v, \varphi, O_1, O_2, \ldots, O_n]$;
[0016] where $p$ is the position of the aircraft in the current two-dimensional plane, $v$ is the velocity of the aircraft in the current two-dimensional plane, $\varphi$ is the aircraft attitude angle, and $O_1, O_2, \ldots, O_n$ are the position, type, and threat level of each obstacle;
[0017] Define the action variable α:
[0018] $\alpha = [\Delta\psi]$;
[0019] where $\Delta\psi$ is the change in the aircraft's yaw angle.
[0020] Design a reward function R based on the aircraft's safety, mission completion rate, and path efficiency:
[0021] $R = w_1 R_{\text{safe}} + w_2 R_{\text{task}} + w_3 R_{\text{path}}$;
[0022] where $R_{\text{safe}}$ is the safety reward value for the aircraft, $R_{\text{task}}$ is the mission completion reward value for the aircraft, $R_{\text{path}}$ is the path efficiency reward value for the aircraft, and $w_1$, $w_2$, and $w_3$ are the weighting coefficients.
[0023] In at least one embodiment of this application, the training process of the intelligent obstacle avoidance decision model includes:
[0024] Step 21: Initialize the agent's policy $\pi_\theta$, where $\theta$ denotes the policy parameters;
[0025] Step 22: Execute the current policy $\pi_\theta$ in the environment and collect the experience dataset $D = \{(s_t, \alpha_t, r_t)\}$, where $s_t$ is the state at time step t, $\alpha_t$ is the action at time step t, and $r_t$ is the reward at time step t;
[0026] Step 23: Calculate the advantage function $\hat{A}_t$ at each time step t;
[0027] Step 24: Optimize the policy parameters $\theta$ to maximize the expected cumulative reward;
[0028] Step 25: Repeat steps 22 to 24 until the convergence condition is met.
[0029] In at least one embodiment of this application, in step 22, the experience dataset D is obtained by resampling.
[0030] In at least one embodiment of this application, in step 23:
[0031] The advantage function is calculated using generalized advantage estimation;
[0032] The discount factor $\gamma$ and the adjustment parameter $\lambda$ are used to tune the smoothness and time scale of the advantage function.
[0033] In at least one embodiment of this application, in step 24, the policy parameters $\theta$ are optimized as follows:
[0034] The magnitude of each policy change is limited to no more than a preset threshold;
[0035] The policy parameters $\theta$ are updated using the Adam optimizer.
[0036] In at least one embodiment of this application, in step 25, the convergence condition is:
[0037] The change in average reward is lower than the preset threshold;
[0038] or the predetermined maximum number of iterations is reached.
[0039] In at least one embodiment of this application, in step 3, the obstacle avoidance control mode is entered when the distance between the aircraft and the obstacle is less than a preset safety threshold:
[0040] $d_{\min} = \min_i \lVert p - p_i \rVert < d_{\text{safe}}$;
[0041] where $d_{\min}$ is the distance between the aircraft and the nearest obstacle, $p$ is the position of the aircraft, $p_i$ is the position of the i-th obstacle, and $d_{\text{safe}}$ is the preset safety threshold.
[0042] The invention has at least the following beneficial technical effects:
[0043] The intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment proposed in this application can optimize the flight trajectory of aircraft and adapt to different types of obstacles, thereby improving the flight safety of aircraft.
Description of the Drawings
[0044] Figure 1 is a flowchart of an intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment, according to one embodiment of this application.
[0045] Figure 2 is a schematic diagram of an aircraft intelligent decision-making scenario according to one embodiment of this application.
Detailed Implementation
[0046] To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions in the embodiments of this application will be described in more detail below with reference to the accompanying drawings. In the drawings, the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The described embodiments are some, but not all, embodiments of this application. The embodiments described below with reference to the accompanying drawings are exemplary and intended to explain this application, and should not be construed as limiting this application. All other embodiments obtained by those skilled in the art based on the embodiments of this application without creative effort are within the scope of protection of this application. The embodiments of this application will be described in detail below with reference to the accompanying drawings.
[0047] This application is described in further detail below with reference to Figures 1 and 2.
[0048] This application provides an intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment, including the following steps:
[0049] Step 1: Determine if there are obstacles within the current detection range. If so, proceed to the threat level assessment stage to determine the threat level of the obstacles.
[0050] Step 2: Construct an intelligent obstacle avoidance decision-making model and train the model;
[0051] Step 3: When the distance between the aircraft and the obstacle is less than the preset safety threshold, the obstacle avoidance control mode is entered. The status of the aircraft and the obstacle is input into the intelligent obstacle avoidance decision model to obtain the action. The aircraft executes the action to achieve obstacle avoidance flight.
[0052] As shown in Figure 1, the aircraft first cruises in a sensitive area. The aircraft uses a real-time environmental perception module to determine whether there are any obstacles, such as radar, within its current detection range. If obstacles are present, their threat level is assessed. Obstacle detection and threat assessment can be achieved using a weighted rule based on the distance between the aircraft and the obstacle, their relative speed, and the obstacle type.
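The weighted rule mentioned above can be illustrated with a minimal Python sketch. The source names only the three inputs (distance, relative speed, obstacle type); the weights, the normalisation, the type factors, and every identifier below (`Obstacle`, `threat_level`, `TYPE_FACTOR`) are hypothetical choices made for illustration, not values taken from the application.

```python
import math
from dataclasses import dataclass

# Hypothetical per-type factors; the source gives no numeric values.
TYPE_FACTOR = {"radar": 1.0, "terrain": 0.4}


@dataclass
class Obstacle:
    x: float
    y: float
    vx: float
    vy: float
    kind: str


def threat_level(ac_pos, ac_vel, obs, detect_range=50.0, v_max=1.0,
                 weights=(0.5, 0.3, 0.2)):
    """Weighted threat score in [0, 1]; 0 means outside the detection range."""
    dx, dy = obs.x - ac_pos[0], obs.y - ac_pos[1]
    dist = math.hypot(dx, dy)
    if dist > detect_range:
        return 0.0
    # Distance term: 1 at zero range, 0 at the edge of the detection range.
    closeness = 1.0 - dist / detect_range
    # Closing speed: positive when the range is shrinking, clamped to [0, 1].
    rvx, rvy = obs.vx - ac_vel[0], obs.vy - ac_vel[1]
    closing = -(dx * rvx + dy * rvy) / dist if dist > 0 else 0.0
    closing_term = min(max(closing / v_max, 0.0), 1.0)
    # Type term: unknown types get an assumed middle value.
    type_term = TYPE_FACTOR.get(obs.kind, 0.5)
    w_d, w_v, w_t = weights
    return w_d * closeness + w_v * closing_term + w_t * type_term
```

With these assumed values, a static radar 10 units ahead of an aircraft flying straight at it at unit speed scores 0.5·0.8 + 0.3·1 + 0.2·1 = 0.9, while the same radar at 100 units scores 0.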
[0053] Secondly, an intelligent obstacle avoidance decision-making model is constructed and trained. The model construction process is as follows:
[0054] An intelligent obstacle avoidance decision model is constructed based on the PPO algorithm, with the state variable s as the input of the intelligent obstacle avoidance decision model and the action variable α as the output of the intelligent obstacle avoidance decision model.
[0055] Define the state variable s:
[0056] $s = [p, v, \varphi, O_1, O_2, \ldots, O_n]$;
[0057] where $p$ is the position of the aircraft in the current two-dimensional plane, $v$ is the velocity of the aircraft in the current two-dimensional plane, $\varphi$ is the aircraft attitude angle, and $O_1, O_2, \ldots, O_n$ are the position, type, and threat level of each obstacle;
[0058] Define the action variable α:
[0059] $\alpha = [\Delta\psi]$;
[0060] where $\Delta\psi$ is the change in the aircraft's yaw angle.
[0061] Design a reward function R based on the aircraft's safety, mission completion rate, and path efficiency:
[0062] $R = w_1 R_{\text{safe}} + w_2 R_{\text{task}} + w_3 R_{\text{path}}$;
[0063] where $R_{\text{safe}}$ is the safety reward value for the aircraft, $R_{\text{task}}$ is the mission completion reward value for the aircraft, $R_{\text{path}}$ is the path efficiency reward value for the aircraft, and $w_1$, $w_2$, and $w_3$ are the weighting coefficients.
[0064] In this embodiment, the aircraft's current two-dimensional planar position, velocity, attitude angle, and the position, type, and threat level of obstacles are used as state variables for reinforcement learning. The change in the aircraft's yaw angle is used as the action space variable for reinforcement learning. A reward function is designed based on the aircraft's safety, mission completion, and path efficiency. Positive rewards are given when the aircraft moves away from obstacles; negative rewards are given when the aircraft approaches obstacles or enters a threat area; and the maximum reward is given when the aircraft successfully avoids all obstacles and reaches the target point. The multi-objective weight coefficients are dynamically adjusted to adapt to real-time environmental changes. The weight of safety is increased in high-threat areas, while mission completion and path efficiency are prioritized in low-threat areas.
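A minimal sketch of how the weighted reward and its dynamic weights might be computed. The source states only that the safety weight rises in high-threat areas and that mission completion and path efficiency take priority in low-threat areas; the threshold `high_threat`, the weight values, and the function names are assumptions made for illustration.

```python
def dynamic_weights(threat, high_threat=0.6):
    """Return (w1, w2, w3) for safety, mission completion, path efficiency.

    The threshold and the two weight sets are assumed values.
    """
    if threat >= high_threat:
        return (0.6, 0.2, 0.2)  # high-threat area: safety dominates
    return (0.2, 0.4, 0.4)      # low-threat area: mission and path dominate


def safety_term(prev_dist, dist, in_threat_zone):
    """+1 when moving away from the obstacle, -1 when approaching or inside a threat area."""
    if in_threat_zone or dist < prev_dist:
        return -1.0
    return 1.0 if dist > prev_dist else 0.0


def reward(r_safe, r_task, r_path, threat):
    """Weighted sum of the three reward components."""
    w1, w2, w3 = dynamic_weights(threat)
    return w1 * r_safe + w2 * r_task + w3 * r_path
```

Keeping the weight schedule in its own function makes it easy to swap the two-level rule for a smooth function of the threat level if the step change destabilises training.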
[0065] Specifically, the training process of the intelligent obstacle avoidance decision-making model includes:
[0066] Step 21: Initialize the agent's policy $\pi_\theta$, where $\theta$ denotes the policy parameters;
[0067] Step 22: Execute the current policy $\pi_\theta$ in the environment and collect the experience dataset $D = \{(s_t, \alpha_t, r_t)\}$, where $s_t$ is the state at time step t, $\alpha_t$ is the action at time step t, and $r_t$ is the reward at time step t;
[0068] Step 23: Calculate the advantage function $\hat{A}_t$ at each time step t;
[0069] Step 24: Optimize the policy parameters $\theta$ to maximize the expected cumulative reward;
[0070] Step 25: Repeat steps 22 to 24 until the convergence condition is met.
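Steps 21 to 25 can be sketched as a generic training loop. This is an outline under stated assumptions, not the applicant's implementation: `collect_rollout` and `update_policy` are hypothetical callables standing in for steps 22 and 23 to 24, and the stopping rule assumes convergence is measured on the change in average reward between consecutive iterations.

```python
def train(collect_rollout, update_policy, params, max_iters=1000, tol=1e-3):
    """Run the collect/update loop until the average reward stops changing.

    collect_rollout(params) -> list of (state, action, reward) tuples (step 22)
    update_policy(params, batch) -> new params (steps 23 and 24)
    Returns the final parameters and the number of iterations executed.
    """
    prev_avg = None
    for iteration in range(1, max_iters + 1):
        batch = collect_rollout(params)
        avg = sum(r for (_, _, r) in batch) / len(batch)
        params = update_policy(params, batch)
        # Step 25: stop when the change in average reward falls below tol.
        if prev_avg is not None and abs(avg - prev_avg) < tol:
            return params, iteration
        prev_avg = avg
    # Otherwise stop at the maximum number of iterations.
    return params, max_iters
```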
[0071] In a preferred embodiment of this application, in step 22, the experience dataset D is obtained through resampling. In step 23, the advantage function is calculated using generalized advantage estimation, and the discount factor $\gamma$ and the adjustment parameter $\lambda$ are used to tune the smoothness and time scale of the advantage function; an adaptive adjustment of the advantage function is introduced to improve training stability. In step 24, the policy parameters $\theta$ are optimized as follows: the magnitude of each policy change is limited to no more than a preset threshold, and the Adam optimizer is used to update the policy parameters. In step 25, the convergence condition is that the change in average reward is lower than a preset threshold, or that the predetermined maximum number of iterations is reached.
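For orientation, the two standard PPO building blocks referred to here can be written in a few lines of plain Python: generalized advantage estimation and the clipped surrogate term that bounds each policy change. This is the textbook form, shown as an assumption about what the embodiment builds on; the adaptive adjustment of the advantage function and the Adam update are not specified in enough detail in the source to be sketched, and the default values of `gamma`, `lam`, and `eps` are common choices rather than the applicant's.

```python
def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation.

    values must hold len(rewards) + 1 entries; the last one is the
    bootstrap value of the state reached after the final step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def clipped_surrogate(ratio, advantage, eps=0.2):
    """PPO clipped objective for one sample.

    ratio is pi_new(a|s) / pi_old(a|s); eps plays the role of the
    preset threshold that limits the policy change.
    """
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped * advantage)
```

A smaller `lam` shortens the time scale of the estimate and lowers its variance at the cost of more bias, which is the trade-off the discount factor and adjustment parameter control.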
[0072] Furthermore, in step 3, the obstacle avoidance control mode is entered when the distance between the aircraft and the obstacle is less than a preset safety threshold:
[0073] $d_{\min} = \min_i \lVert p - p_i \rVert < d_{\text{safe}}$;
[0074] where $d_{\min}$ is the distance between the aircraft and the nearest obstacle, $p$ is the position of the aircraft, $p_i$ is the position of the i-th obstacle, and $d_{\text{safe}}$ is the preset safety threshold.
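The trigger condition amounts to a nearest-obstacle distance check. A minimal sketch, assuming Euclidean distance in the two-dimensional plane; the function names and the `d_safe` parameter name are illustrative.

```python
import math


def nearest_obstacle_distance(p, obstacles):
    """Distance from aircraft position p to the closest obstacle position."""
    return min(math.dist(p, q) for q in obstacles)


def avoidance_mode_active(p, obstacles, d_safe):
    """True when the nearest obstacle is strictly closer than d_safe."""
    if not obstacles:
        return False
    return nearest_obstacle_distance(p, obstacles) < d_safe
```

The comparison is strict, matching the "less than" wording of step 3, so an obstacle sitting exactly at the threshold distance does not trigger the mode.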
[0075] The aircraft achieves efficient obstacle avoidance using an improved algorithm based on the PPO algorithm, combined with the aircraft's dynamic model and real-time perceived obstacle distribution information. Specifically, sample utilization is improved by storing historical state-action pairs and using experience replay to randomly sample historical data for training.
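The storage and random sampling of historical data described above can be sketched as a fixed-capacity buffer. The class name, the capacity handling, and the uniform sampling rule are assumptions; the source does not say how old samples are weighted or evicted.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward) records."""

    def __init__(self, capacity, seed=None):
        # The oldest records are dropped automatically once capacity is reached.
        self._data = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward=None):
        self._data.append((state, action, reward))

    def sample(self, batch_size):
        """Draw up to batch_size distinct records uniformly at random."""
        k = min(batch_size, len(self._data))
        return self._rng.sample(list(self._data), k)

    def __len__(self):
        return len(self._data)
```

Standard PPO is an on-policy method, so reusing older samples in this way is part of the improvement the application describes rather than of the base algorithm.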
[0076] At each time step, the aircraft updates its own state based on the current state $s_t$ and the action $\alpha_t$ output by the improved reinforcement learning algorithm. After executing an action, it senses new obstacle distribution information and recalculates its state-space parameters. This process is repeated until the aircraft successfully avoids all obstacles and resumes its original trajectory. Through trial-and-error learning in a large number of simulated environments, the model gradually optimizes its strategy, achieving adaptive obstacle avoidance in complex dynamic environments.
[0077] If the distance between the aircraft and the obstacle is not less than the preset safety threshold, the aircraft will continue to traverse the danger zone without entering obstacle avoidance control mode. Once the aircraft has successfully avoided all threat areas, it will exit obstacle avoidance control mode and enter the subsequent mission mode.
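One time step of the closed loop described above might look like the following sketch. The aircraft's dynamic model is not given in the source, so a constant-speed point-mass model in the plane is assumed here purely for illustration; `policy` is a hypothetical callable standing in for the trained decision model, returning a yaw-angle change in radians.

```python
import math


def control_step(p, heading, speed, obstacles, d_safe, policy, dt=1.0):
    """Advance the aircraft by one time step.

    Returns the new position, the new heading, and whether obstacle
    avoidance control mode was active during this step.
    """
    d_min = min((math.dist(p, q) for q in obstacles), default=math.inf)
    avoiding = d_min < d_safe
    if avoiding:
        # Avoidance mode: apply the yaw change proposed by the decision model.
        heading += policy(p, heading, obstacles)
    # Assumed kinematics: constant speed along the current heading.
    p = (p[0] + speed * math.cos(heading) * dt,
         p[1] + speed * math.sin(heading) * dt)
    return p, heading, avoiding
```

Calling `control_step` repeatedly, with the obstacle list refreshed from perception before each call, reproduces the sense, decide, act cycle; when no obstacle is inside the threshold the aircraft simply continues on its current heading.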
[0078] This application presents an intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment. By utilizing sensor information, surrounding scene information, and historical decision information from training sample data, and employing a dynamic threat assessment module and an improved reinforcement learning algorithm, it generates an intelligent obstacle avoidance decision-making model for aircraft. This model effectively addresses path planning tasks in complex environments, improving the success rate of UAV missions. It enhances the aircraft's obstacle avoidance performance and significantly improves its survivability.
[0079] The intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment proposed in this application has the following advantages:
[0080] (1) By sensing the environment in real time and dynamically adjusting the weight coefficients of safety, mission completion and path efficiency, multiple objectives can be flexibly balanced in different environments to ensure that the aircraft makes optimal decisions in complex environments.
[0081] (2) By improving the intelligent decision-making algorithm and combining it with experience replay, the efficiency and stability of obstacle avoidance decision-making are significantly improved;
[0082] (3) By introducing an adaptive adjustment advantage function, the aircraft can reliably avoid obstacles under different threat levels, ensuring mission execution in complex environments.
[0083] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. A method for intelligent obstacle avoidance decision-making for aircraft based on dynamic threat assessment, characterized in that it comprises: Step 1: Determine whether there are obstacles within the current detection range and, if so, proceed to the threat level assessment stage to determine the threat level of the obstacles; Step 2: Construct an intelligent obstacle avoidance decision-making model and train the model; Step 3: When the distance between the aircraft and the obstacle is less than the preset safety threshold, the obstacle avoidance control mode is entered, the status of the aircraft and the obstacle is input into the intelligent obstacle avoidance decision model to obtain the action, and the aircraft executes the action to achieve obstacle avoidance flight.
2. The intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment according to claim 1, characterized in that, in step 1, the aircraft uses a real-time environmental perception module to determine whether there are obstacles within the current detection range.
3. The intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment according to claim 2, characterized in that, in step 2, constructing the intelligent obstacle avoidance decision-making model includes: constructing an intelligent obstacle avoidance decision model based on the PPO algorithm, with the state variable s as the input of the model and the action variable α as its output; defining the state variable s: $s = [p, v, \varphi, O_1, O_2, \ldots, O_n]$, where $p$ is the position of the aircraft in the current two-dimensional plane, $v$ is the velocity of the aircraft in the current two-dimensional plane, $\varphi$ is the aircraft attitude angle, and $O_1, O_2, \ldots, O_n$ are the position, type, and threat level of each obstacle; defining the action variable α: $\alpha = [\Delta\psi]$, where $\Delta\psi$ is the change in the aircraft's yaw angle; and designing a reward function R based on the aircraft's safety, mission completion rate, and path efficiency: $R = w_1 R_{\text{safe}} + w_2 R_{\text{task}} + w_3 R_{\text{path}}$, where $R_{\text{safe}}$ is the safety reward value for the aircraft, $R_{\text{task}}$ is the mission completion reward value for the aircraft, $R_{\text{path}}$ is the path efficiency reward value for the aircraft, and $w_1$, $w_2$, and $w_3$ are the weighting coefficients.
4. The intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment according to claim 3, characterized in that the training process of the intelligent obstacle avoidance decision model includes: Step 21: Initialize the agent's policy $\pi_\theta$, where $\theta$ denotes the policy parameters; Step 22: Execute the current policy $\pi_\theta$ in the environment and collect the experience dataset $D = \{(s_t, \alpha_t, r_t)\}$, where $s_t$ is the state at time step t, $\alpha_t$ is the action at time step t, and $r_t$ is the reward at time step t; Step 23: Calculate the advantage function $\hat{A}_t$ at each time step t; Step 24: Optimize the policy parameters $\theta$ to maximize the expected cumulative reward; Step 25: Repeat steps 22 to 24 until the convergence condition is met.
5. The intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment according to claim 4, characterized in that, in step 22, the experience dataset D is obtained through resampling.
6. The intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment according to claim 5, characterized in that, in step 23: the advantage function is calculated using generalized advantage estimation; and the discount factor $\gamma$ and the adjustment parameter $\lambda$ are used to tune the smoothness and time scale of the advantage function.
7. The intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment according to claim 6, characterized in that, in step 24, the policy parameters $\theta$ are optimized as follows: the magnitude of each policy change is limited to no more than a preset threshold; and the policy parameters $\theta$ are updated using the Adam optimizer.
8. The intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment according to claim 7, characterized in that, in step 25, the convergence condition is: the change in average reward is lower than a preset threshold; or the predetermined maximum number of iterations is reached.
9. The intelligent obstacle avoidance decision-making method for aircraft based on dynamic threat assessment according to claim 1, characterized in that, in step 3, the obstacle avoidance control mode is entered when the distance between the aircraft and the obstacle is less than a preset safety threshold: $d_{\min} = \min_i \lVert p - p_i \rVert < d_{\text{safe}}$, where $d_{\min}$ is the distance between the aircraft and the nearest obstacle, $p$ is the position of the aircraft, $p_i$ is the position of the i-th obstacle, and $d_{\text{safe}}$ is the preset safety threshold.