Heuristic spacecraft autonomous avoidance task planning method under orbital threat environment
By employing a heuristic spacecraft autonomous avoidance mission planning method under orbital threat environments, and utilizing sensor identification and a two-stage planning strategy, the autonomous avoidance capability of spacecraft under orbital threat environments has been improved, thus solving the safety and business continuity issues of spacecraft operation under orbital threats.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HARBIN INST OF TECH
- Filing Date
- 2023-08-18
- Publication Date
- 2026-06-26
AI Technical Summary
Existing technologies have failed to effectively enhance spacecraft's ability to autonomously and promptly handle space threats in orbital threat environments, affecting spacecraft operational safety and business continuity.
A heuristic spacecraft autonomous avoidance mission planning method based on orbital threat environments is adopted. This method involves initial configuration, establishing a mission planning model, designing an autonomous avoidance architecture, setting a two-stage planning strategy, and performing time and resource constraint reasoning. It also combines visible light, infrared, laser, and microwave sensors for threat identification and avoidance decision-making, thereby enabling spacecraft to autonomously avoid orbital threats.
It improves the accuracy and efficiency of spacecraft in autonomously and promptly handling space threats, reduces the impact on observation missions, and enhances the safety and operational continuity of spacecraft in orbit.
Figure CN117170392B_ABST
Description
Technical Field
[0001] This invention relates to the field of spacecraft mission planning, and more specifically to a heuristic spacecraft autonomous avoidance mission planning method under orbital threat environments.
Background Technology
[0002] Increasingly crowded orbital space, surging collision risks, intensified space competition, and increased interference have introduced more uncertainties into spacecraft missions. An inability to manage space threats effectively will severely impact the safety and operational continuity of spacecraft in orbit. Furthermore, to avoid the delays caused by the "space-to-ground big loop," it is essential to enhance the spacecraft's ability to autonomously and promptly address uncertainties such as space threats.
[0003] Chinese Patent Publication No. CN114638082A discloses a general heuristic temporal planning modeling and solution method for spacecraft. Its key features include: fully integrating the characteristics and practical needs of aerospace engineering missions; proposing a general model and its mathematical expression for the spacecraft domain; using a temporal network structure graph to represent the time and energy constraints of spacecraft roving exploration state transitions; employing a forward pruning strategy for constraint propagation and problem solving on the network structure graph; and designing and implementing a heuristic control function based on the maximum time span for problem relaxation to improve solution efficiency. This method constructs a more complete domain model, and the temporal network structure graph reduces the computational complexity of the planning solution and greatly simplifies the algorithm design. However, it primarily constructs a general model for the spacecraft domain and performs solutions and calculations around this model; it does not describe how a spacecraft handles space threats autonomously and in a timely manner.
Summary of the Invention
[0004] The technical problem to be solved by this invention is how to improve the ability of spacecraft to autonomously and promptly handle uncertainties such as space threats.
[0005] This invention solves the above-mentioned technical problem through the following technical means: a heuristic spacecraft autonomous avoidance mission planning method under orbital threat environments, comprising:
[0006] Step 1: Initialize the spacecraft configuration;
[0007] Step 2: Establish a mission planning model for autonomous avoidance of spacecraft orbital threats;
[0008] Step 3: Analyze the scenarios of spacecraft orbital threats and design the spacecraft autonomous avoidance architecture based on the mission requirements of autonomous avoidance;
[0009] Step 4: Set up a two-stage planning strategy. In the first stage of planning, continue the observation mission. In the avoidance behavior decision, if the current orbital threat will not cause damage to the spacecraft, then there is no need to avoid it, and the entire plan is terminated. If the avoidance behavior decision requires the spacecraft to take action to avoid the orbital threat, the observation mission is immediately interrupted, and the second stage of planning is carried out. The second stage of planning is used to avoid the orbital threat.
[0010] Step 5: Perform time-constrained reasoning and numerical effect reasoning involving resource variables on the planning problem, so that the resource variables satisfy the execution of duration actions during the dynamic process of change;
[0011] Step 6: Check the consistency of the time constraints and resource variable constraints of the interactions in each state, and delete the states that do not meet the time constraints;
[0012] Step 7: Use temporal relaxed planning graphs to guide the planning process through the search space to reach the goal.
[0013] Furthermore, step two includes:
[0014] The mission planning model for autonomous avoidance of spacecraft orbital threats is represented by an 8-tuple:
[0015] Π=<F,I,G,V,A,Q,P,C>
[0016] Wherein, F is the set of facts describing the spacecraft's state; I is the initial spacecraft state; G is the target state that the spacecraft needs to maintain to achieve threat avoidance; V is the set of spacecraft resources; A is the set of actions that can change the spacecraft's state, together with their effects. Each action act is described by its name N, its duration dur (with dur_min and dur_max the minimum and maximum durations of act), its preconditions pre (comprising the start condition, the end condition pre_⊥, and the invariant condition), and its effects eff (comprising the start effect and the end effect eff_⊥). Q is the event queue of actions that have started but not yet finished; P is the sequence of actions from the initial state to the current state; C is the set of time constraints on the actions in the plan.
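As an illustration only, the eight components of the model could be held in a pair of Python data classes. This is a minimal sketch, not the patent's implementation: every field name, the split of effects into added and deleted facts, and the `take_picture` example values are assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DurativeAction:
    """One element of A. All field names are illustrative assumptions."""
    name: str                                # N
    dur_min: float                           # minimum duration of the action
    dur_max: float                           # maximum duration of the action
    pre_start: frozenset = frozenset()       # start condition
    pre_end: frozenset = frozenset()         # end condition
    invariant: frozenset = frozenset()       # must hold between start and end
    add_start: frozenset = frozenset()       # start effect: facts added
    del_start: frozenset = frozenset()       # start effect: facts deleted
    add_end: frozenset = frozenset()         # end effect: facts added
    del_end: frozenset = frozenset()         # end effect: facts deleted


@dataclass
class PlanningModel:
    """The 8-tuple <F, I, G, V, A, Q, P, C>."""
    facts: frozenset                                   # F
    initial: frozenset                                 # I
    goal: frozenset                                    # G
    resources: dict                                    # V: name -> value
    actions: tuple                                     # A
    queue: list = field(default_factory=list)          # Q: started, unfinished
    plan: list = field(default_factory=list)           # P: steps so far
    constraints: list = field(default_factory=list)    # C: time constraints


# Hypothetical example action and model (values are made up).
take_picture = DurativeAction(
    name="take_picture", dur_min=2.0, dur_max=5.0,
    pre_start=frozenset({"camera_calibrated"}),
    invariant=frozenset({"attitude_held"}),
    add_end=frozenset({"picture_taken"}),
    del_end=frozenset({"camera_calibrated"}),
)

model = PlanningModel(
    facts=frozenset({"camera_calibrated", "attitude_held", "picture_taken"}),
    initial=frozenset({"camera_calibrated", "attitude_held"}),
    goal=frozenset({"picture_taken"}),
    resources={"fuel": 100.0},
    actions=(take_picture,),
)
```

Q, P, and C start empty because no action has been applied yet; a planner would extend them as steps are added.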
[0017] Furthermore, the spacecraft autonomous avoidance architecture designed in step three includes a visible light camera, a global camera, an infrared camera, a microwave radar, a lidar, a multi-sensor information fusion unit, a threat target behavior information calculation unit, a threat level inference unit, an avoidance behavior decision-making unit, and an action sequence planning unit. Four detection methods are employed: global, infrared, laser, and microwave. The sensors are combined according to the space environment conditions to search for and capture threat targets and to perform initial ranging and angle measurement; threat identification is achieved through the complementary information provided by the different sensors. The sensor information is then fused to obtain the velocity, distance, and azimuth of the threat target, from which the target's abnormal behavior characteristics, orbital parameters, and collision probability are derived. The visible light camera performs close-range imaging to acquire morphological features. The processed target information is combined with the spacecraft's own attitude and orbital parameters, and fusion inference yields a quantitative evaluation of the target's threat category and threat level. The specific avoidance behavior that the spacecraft should take is then inferred and decided, the future actions of the threat target are predicted, and the optimal trajectory for threat avoidance is solved. The spacecraft's own parameters are fed back to the threat level inference unit and the action sequence planning unit in real time.
[0018] Furthermore, step five includes:
[0019] Step 5.1: Decompose each durative action act in the mission planning model into two non-temporal instantaneous actions: a start action of the form <pre, eff>, built from the start condition and the start effect, and an end action act_⊥ = <pre_⊥, eff_⊥>. Each state in the plan is represented as S = <F,V,Q,P,C>. An action can be applied only if its effects do not conflict with any invariant of an action in Q; when it is applied, F and V are updated according to its effects. C is updated each time an action is added to the plan.
[0020] Step 5.2: For each resource variable v∈V, the state records its upper and lower bounds, V_max and V_min. When there are continuous numerical changes, the value of a resource variable depends on time.
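To make the time dependence in step 5.2 concrete, the following sketch computes how the bounds on one resource variable widen under a continuous linear change. It assumes a constant rate of change and an elapsed time known only to lie in an interval; the function name, parameter names, and numbers are hypothetical and not taken from the patent.

```python
def bounds_after_linear_change(v_min, v_max, rate, t_min, t_max):
    """Bounds on v after v' = v + rate * t, for any t in [t_min, t_max].

    v_min, v_max: bounds on the variable before the change.
    rate: assumed constant gradient of the change (negative = consumption).
    """
    if not (v_min <= v_max and 0 <= t_min <= t_max):
        raise ValueError("expected v_min <= v_max and 0 <= t_min <= t_max")
    # The change is linear in t, so its extremes occur at the interval ends.
    deltas = (rate * t_min, rate * t_max)
    return v_min + min(deltas), v_max + max(deltas)


# Fuel between 80 and 100 units, burned at 2 units/s for 5 to 10 seconds.
fuel_bounds = bounds_after_linear_change(80.0, 100.0, rate=-2.0,
                                         t_min=5.0, t_max=10.0)
print(fuel_bounds)  # (60.0, 90.0)
```

The lower bound pairs the lowest starting value with the longest burn, and the upper bound pairs the highest starting value with the shortest burn.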
[0021] Furthermore, step 5.1 also includes:
[0022] Each planning step has a unique index, and each fact p in each state is associated with the following information:
[0023] F⁺(p) (F⁻(p)) gives the index i of the step that most recently added (deleted) fact p;
[0024] FP(p) is a set of pairs <i,d>, each recording a step with precondition p, where i is the step index, d∈{0, ε}, and ε represents a time interval. If d=0, step i is recorded as the end of an interval during which p needs to be maintained; in this case, i is the ending step of an action for which p is an invariant condition. If d=ε, step i is recorded as the beginning of an interval during which p needs to be maintained, corresponding to the start or end condition associated with step i.
[0025] When a start action is applied at step i of the plan, the following constraints are added to the plan:
[0026] For each start condition p (a state fact that the spacecraft must satisfy for the start action to be added), add the time constraint t(sstep) ≥ t(S.F⁺(p)) + ε, where S.F⁺(p) is the index of the step that most recently added fact p in spacecraft state S, t(S.F⁺(p)) is the timestamp of that step, and t(sstep) is the timestamp of the step with index sstep; this places the step that achieves p before step i.
For each negative effect p (a spacecraft state that is deleted once the action is added), remove p from the state and add the constraint t(sstep) ≥ t(i) + ε, such that the deleting step i occurs after any action that requires p, where t(i) is the timestamp of step i.
For each positive effect p (a spacecraft state that is added once the action is added), add p to the state, add the constraint t(sstep) ≥ t(S.F⁻(p)) + ε, and record step i as the step that achieves p, where S.F⁻(p) is the index of the step that most recently deleted fact p in state S and t(S.F⁻(p)) is the timestamp of that step.
For each invariant p (a state that the spacecraft must maintain for the duration of the action), if p is not achieved by step i, add the constraint t(sstep) ≥ t(i), which places the recorded step that achieves p before step i.
[0027] Furthermore, step 5.2 includes:
[0028] When adding an action act at step i, the constraints are set as follows:
[0029] 1) If the effect of act depends on the value of v: add the constraint t(i) ≥ t(V_eff(v)) + ε to S′.C so that act is executed after the most recent action affecting variable v, where v∈V and S′.C denotes the set of time constraints C in state S′; add t(s)+ε≤t(i) and t(i)+ε≤t(e) to S′.C. V_eff(v) records the index of the most recent step that has an instantaneous effect on v, and t(V_eff(v)) is the timestamp of that step;
[0030] 2) If act has an instantaneous numerical effect on v: add t(i) ≥ t(V_eff(v)) + ε to S′.C and update v in order; add t(j)+ε≤t(i) to S′.C; add t(s)+ε≤t(i) and t(i)+ε≤t(e) to S′.C, where ε is a constant, t(j) is the timestamp of step j, and t(s) is the timestamp of step s;
[0031] 3) If act starts an action and has an invariant condition on v: add t(s)+ε≤t(i) and t(i)+ε≤t(e) to S′.C; if act has no updating effect on v, add t(i) ≥ t(S.V_eff(v)) + ε to S′.C, where t(e) is the timestamp of step e and t(S.V_eff(v)) is the timestamp of the step with index V_eff(v) in state S;
[0032] 4) If act starts an action and has a continuous effect on v: if act has no instantaneous updating effect on v, add t(i) ≥ t(V_eff(v)) + ε to S′.C and update v in order; add t(j)+ε≤t(i) to S′.C; add t(s)+ε≤t(i) and t(i)+ε≤t(e) to S′.C;
[0033] 5) If act ends the action that started at step k and has a continuous effect on v: add t(s)+ε≤t(i) and t(i)+ε≤t(e) to S′.C;
[0034] 6) If act ends an action that has an invariant condition on v: add i to S′.VP(v) and remove (k, i) from S′.VP(v), where S′.VP(v) denotes the set of step indices VP(v) in state S′. Both S and S′ denote spacecraft states: when an action is added to the plan, the spacecraft's state is updated from S to S′, and the two alternate as actions are applied.
[0035] Furthermore, step six includes:
[0036] State S is temporally consistent only if the steps [0, ..., n-1] of the action plan P that lead to S can be assigned values [t(0), ..., t(n-1)], representing the execution time of each corresponding step, that satisfy the time constraints C and the resource constraints. After the ordering constraints S′.C are constructed, the consistency of time and resources must be checked, and any state that cannot satisfy the time constraints is immediately removed from the search.
[0037] Furthermore, step six also includes:
[0038] The time constraints C established in state S are expressed as follows:
[0039] lb≤t(b)-t(a)≤ub
[0040] Where lb, ub∈R are the lower and upper bounds of the interval, with 0≤lb≤ub, and t(b)-t(a) is the time interval between steps a and b;
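When only constraints of the form lb ≤ t(b) − t(a) ≤ ub are present, consistency can be checked with a standard simple-temporal-network test: each constraint becomes two weighted edges, and the constraint set is consistent exactly when the resulting graph has no negative cycle. The sketch below uses Bellman-Ford for that test. It is one common technique, offered as an assumption about how such a check might be done rather than as the patent's procedure, and the function name and tuple layout are hypothetical.

```python
def is_temporally_consistent(num_steps, constraints):
    """Check constraints (a, b, lb, ub), each meaning lb <= t(b) - t(a) <= ub.

    Returns True if timestamps satisfying all constraints exist.
    """
    edges = []
    for a, b, lb, ub in constraints:
        edges.append((a, b, ub))    # t(b) - t(a) <= ub
        edges.append((b, a, -lb))   # t(a) - t(b) <= -lb
    # A virtual source reaches every step at distance 0.
    dist = [0.0] * num_steps
    for _ in range(num_steps + 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return True             # relaxation converged: no negative cycle
    return False                    # still relaxing: negative cycle exists


ok = is_temporally_consistent(3, [(0, 1, 2, 5), (1, 2, 1, 3), (0, 2, 0, 10)])
bad = is_temporally_consistent(3, [(0, 1, 2, 5), (1, 2, 1, 3), (0, 2, 0, 2)])
print(ok, bad)  # True False
```

In the second call, steps 0 to 1 and 1 to 2 together need at least 3 time units, but step 2 is required to follow step 0 by at most 2, so the state would be pruned.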
[0041] When reasoning about continuous changes in numeric resources under time constraints, linear programming (LP) is used to capture both the time constraints and the numeric constraints, including the interaction between them.
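One possible shape for such a linear program is sketched below. It is an assumption offered for illustration, not the formulation given in the patent: the per-step resource values v_i and the constant gradient r are symbols introduced here, while t(·), lb, ub, V_min, and V_max follow the definitions above.

```latex
\begin{aligned}
\text{find}\quad & t(0),\dots,t(n-1),\; v_0,\dots,v_{n-1} \\
\text{s.t.}\quad
 & lb \le t(b) - t(a) \le ub
   && \text{for each time constraint in } C, \\
 & v_e = v_s + r\,\bigl(t(e) - t(s)\bigr)
   && \text{for an action that starts at step } s \text{ and ends at step } e, \\
 & V_{\min} \le v_i \le V_{\max}
   && \text{for each step } i .
\end{aligned}
```

The second family of constraints is where time and resources interact: a longer interval t(e) − t(s) changes the resource more, so a tight resource bound can shorten the admissible duration and vice versa. If the program has no feasible solution, the state is inconsistent.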
[0042] Furthermore, step seven includes:
[0043] The temporal relaxed planning heuristic consists of two phases: graph expansion and solution extraction. In the graph expansion phase, the goal is to construct a temporal relaxed planning graph to determine which facts and actions are reachable. The graph comprises alternating fact layers and action layers. Each fact layer consists of propositions together with upper and lower bounds maintained on each variable v. Each action layer contains the actions whose preconditions are satisfied in the preceding fact layer. Preconditions are either propositional or numerical. A propositional precondition is true if the relevant facts are contained in the preceding fact layer. A numerical precondition is satisfied if some assignment of its variables consistent with the upper and lower bounds satisfies it. A numerical precondition means that executing the action requires consideration of the spacecraft's resource values. For example, a spacecraft orbit change requires a certain amount of fuel, so the spacecraft's available fuel is a numerical precondition of the orbit change action. A propositional precondition means that executing the action does not depend on the system's resources. For example, taking a picture requires the camera to be in a calibrated state, regardless of the available fuel; the camera's calibration state is a propositional precondition of the picture-taking action, and no numerical precondition on fuel is involved. Changes in system resources are a difficult problem in planning: fuel decreases dynamically during orbit changes, other resources (such as computing power) also change dynamically, and such resource variables are difficult to represent accurately. However, for actions that require resources, the upper and lower limits of the change in each resource variable caused by the actions at each planning step determine the range of resource change, which characterizes the dynamic change of resources throughout the entire planning process.
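The two kinds of precondition can be checked against a fact layer as follows. This is a minimal sketch under stated assumptions: actions are plain dictionaries, numerical preconditions are limited to single-variable comparisons with `>=` or `<=`, and all names and thresholds are hypothetical.

```python
def satisfiable(bounds, op, threshold):
    """True if some value within (lower, upper) satisfies `value op threshold`."""
    lower, upper = bounds
    if op == ">=":
        return upper >= threshold   # the most optimistic value is the upper bound
    if op == "<=":
        return lower <= threshold   # the most optimistic value is the lower bound
    raise ValueError(f"unsupported comparison: {op}")


def applicable(action, fact_layer, var_bounds):
    """An action enters the action layer if both precondition kinds hold."""
    props_ok = action["props"] <= fact_layer
    nums_ok = all(satisfiable(var_bounds[var], op, thr)
                  for var, op, thr in action["nums"])
    return props_ok and nums_ok


# Orbit change needs fuel; taking a picture needs only a calibrated camera.
orbit_change = {"props": set(), "nums": [("fuel", ">=", 20.0)]}
take_picture = {"props": {"camera_calibrated"}, "nums": []}
fact_layer = {"camera_calibrated"}

print(applicable(orbit_change, fact_layer, {"fuel": (10.0, 30.0)}))  # True
print(applicable(orbit_change, fact_layer, {"fuel": (0.0, 15.0)}))   # False
print(applicable(take_picture, fact_layer, {"fuel": (0.0, 0.0)}))    # True
```

The first call succeeds even though the lower bound is below the threshold, because the relaxation only asks whether some value within the bounds could satisfy the condition.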
[0044] Furthermore, step seven also includes:
[0045] During graph expansion, after the bounds of all variables in layer i+1 have been calculated, the expansion continues iterating to find the actions applicable in layer i+1, and thereby the facts in layer i+2, and so on. Graph expansion terminates in one of two cases: either a fact layer satisfies all propositional and numerical goals; or adding more layers does not lead to any more preconditions being satisfied, that is, no new propositions appear and the accumulated bounds on the variables do not cause any further numerical preconditions to be satisfied. In the second case, the planning problem cannot be solved.
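The expansion loop and its two termination cases can be sketched as below. This is a simplified illustration, not the patent's algorithm: time is ignored, actions are dictionaries with hypothetical keys, and variable bounds are clamped to an assumed resource capacity so that the bounds reach a fixpoint and the second termination case can be detected by comparing consecutive layers.

```python
def satisfiable(bounds, op, threshold):
    """Optimistic test on (lower, upper); op is '>=' or, otherwise, '<='."""
    lower, upper = bounds
    return upper >= threshold if op == ">=" else lower <= threshold


def expand(init_facts, init_bounds, capacity, actions, goal_facts, goal_nums):
    """Return the index of the first layer satisfying the goals, or None."""
    facts, bounds, layer = set(init_facts), dict(init_bounds), 0
    while True:
        # Case 1: the fact layer satisfies all propositional and numeric goals.
        if goal_facts <= facts and all(
                satisfiable(bounds[v], op, thr) for v, op, thr in goal_nums):
            return layer
        new_facts, new_bounds = set(facts), dict(bounds)
        for act in actions:
            if act["props"] <= facts and all(
                    satisfiable(bounds[v], op, thr) for v, op, thr in act["nums"]):
                new_facts |= act["adds"]          # relaxed: facts never removed
                for var, delta in act["deltas"]:
                    lo, hi = new_bounds[var]
                    cap_lo, cap_hi = capacity[var]
                    if delta < 0:
                        lo = max(cap_lo, lo + delta)   # bounds only widen
                    else:
                        hi = min(cap_hi, hi + delta)
                    new_bounds[var] = (lo, hi)
        # Case 2: nothing new is reachable, so the problem is unsolvable.
        if new_facts == facts and new_bounds == bounds:
            return None
        facts, bounds, layer = new_facts, new_bounds, layer + 1


actions = [
    {"props": set(), "nums": [], "adds": {"camera_calibrated"}, "deltas": []},
    {"props": set(), "nums": [("fuel", ">=", 20.0)],
     "adds": {"threat_avoided"}, "deltas": [("fuel", -20.0)]},
]
capacity = {"fuel": (0.0, 100.0)}
solved = expand(set(), {"fuel": (50.0, 50.0)}, capacity, actions,
                {"threat_avoided"}, [])
stuck = expand(set(), {"fuel": (10.0, 10.0)}, capacity, actions,
               {"threat_avoided"}, [])
print(solved, stuck)  # 1 None
```

With 10 units of fuel the orbit change never becomes applicable, the second layer adds nothing new, and expansion stops with the unsolvable result.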
[0046] The advantages of this invention are as follows: This invention employs a mission planning model and autonomous avoidance architecture based on spacecraft orbital threats, setting up a two-stage planning strategy. In the first stage of planning, the observation mission continues. If the avoidance behavior decision requires spacecraft actions to avoid orbital threats, the observation mission is immediately interrupted, and the second stage of planning begins. This second stage of planning is used to avoid orbital threats, thereby enabling the spacecraft to autonomously and promptly handle space threats. Furthermore, the planning problem undergoes time-constrained reasoning and numerical effect reasoning including resource variables, ensuring that resource variables meet the execution of duration actions during dynamic changes. The consistency of time constraints and resource variable constraints in each state is checked, and states that do not meet time constraints are deleted, further improving the accuracy and efficiency of the spacecraft's autonomous and timely handling of space threats.
Attached Figure Description
[0047] Figure 1 is a flowchart illustrating the heuristic spacecraft autonomous avoidance mission planning method under orbital threat environments disclosed in the embodiments of the present invention.
[0048] Figure 2 is a schematic diagram of the planning results of the heuristic spacecraft autonomous avoidance mission planning method under orbital threat environments disclosed in the embodiments of the present invention.
Detailed Implementation
[0049] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below in conjunction with the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0050] As shown in Figure 1, this invention provides a heuristic spacecraft autonomous avoidance mission planning method under orbital threat environments, comprising the following steps:
[0051] S1: Initialize and configure the spacecraft. The specific process is as follows: The spacecraft has complex operational constraints and multiple concurrent subsystems. Its on-orbit operation requires consideration of factors such as structure, capabilities, equipment status, and mission requirements. Mission planning requires describing the spacecraft's resources, subsystem functions, composition, and the constraints that need to be met. These constraints include resource constraints, causal constraints, and time constraints. This embodiment comprehensively considers the spacecraft's need to avoid orbital threats, and the specific subsystems selected are shown in Table 1 below.
[0052] Table 1. Subsystem Names and Number of Involved States
[0053]
| Subsystem name | Number of state variables | Number of states |
| --- | --- | --- |
| Camera | 3 | 18 |
| Radar | 2 | 6 |
| 2D servo turntable | 2 | 10 |
| Attitude and orbit determination | 1 | 5 |
| Attitude control | 1 | 3 |
| Orbit control | 1 | 5 |
| Propulsion system | 1 | 5 |
| Intelligent computing unit | 1 | 26 |
[0054] S2: Establish a mission planning model for autonomous avoidance of spacecraft orbital threats.
[0055] The mission planning model for autonomous avoidance of spacecraft orbital threats can be represented as an 8-tuple:
[0056] Π=<F,I,G,V,A,Q,P,C>
[0057] Here, F is the set of facts describing the spacecraft's state, each a Boolean proposition that is true or false. I is the initial spacecraft state, describing the facts that are true at the start of the planning process. G is the target state that the spacecraft needs to maintain to achieve threat avoidance. V is the set of spacecraft resources, which undergo two types of change: instantaneous numerical changes alter a resource variable instantaneously, while continuous linear changes depend on the duration and gradient of the action. A is the set of actions that can change the spacecraft state, together with their effects; each action may consume resources. An action act is described by its name N, its duration dur (with dur_min and dur_max the minimum and maximum durations of act), its preconditions pre, and its effects eff. The preconditions comprise the start condition, the end condition pre_⊥, and the invariant condition: the start (end) condition must hold at the beginning (end) of the action, and the invariant condition must hold throughout the interval between the beginning and the end of the action. The effects comprise the start effect and the end effect eff_⊥, according to which the spacecraft state is updated at the start (end) of the action. Q records, in the event queue, actions that have started but not yet finished. P is the sequence of actions from the initial state to the current state. C is the set of time constraints on the planned actions.
[0058] S3: Analyze scenarios of spacecraft orbital threats and design an autonomous avoidance architecture for spacecraft based on the mission requirements of autonomous avoidance.
[0059] With the increase in human space activities, orbital space is becoming increasingly crowded, the amount of space debris is continuously increasing, and space competition is intensifying. Earth observation satellites performing Earth observation missions in orbit inevitably suffer from orbital threats such as collision hazards from space debris and harassment and reconnaissance attacks from hostile satellites. Failure to avoid these threats will cause irreparable damage to the satellites. Currently, the means of dealing with orbital threats heavily rely on ground-based methods, which suffer from problems such as excessive human factors in operation and control, and poor timeliness of threat response, seriously affecting the safety of spacecraft in orbit. To solve this problem, this invention adopts a heuristic mission planning method for autonomous spacecraft avoidance. Without human intervention, the spacecraft autonomously plans the sequence of actions to avoid threats in orbit, thereby achieving orbital threat avoidance and enabling the spacecraft to adapt to the space situation of continuously increasing threat numbers and increasingly deteriorating environment.
[0060] Based on the mission requirement of autonomously avoiding orbital threats, the threat avoidance process shown in Figure 1 is adopted. Mission planning is performed autonomously on board, so threat avoidance is achieved without ground personnel. When an Earth observation satellite encounters a potential collision or harassment threat during on-orbit operation, it first employs four detection methods: global, infrared, laser, and microwave. Sensors are combined according to space environment conditions such as direct sunlight, backlighting, and Earth shadow to search for and capture the threat target and to perform initial ranging and angle measurement. Threat identification in complex space environments is achieved through the complementary information provided by different sensors. Second, sensor information is fused to obtain precise velocity, distance, and azimuth information for the threat target, from which its abnormal behavior characteristics, orbital parameters, and collision probability are derived. Abnormal behavior characteristics include the target's velocity and behavioral semantics. The visible light camera can perform precise close-range imaging; once the target's distance is known, it is determined whether such imaging is necessary. Morphological features of the threat target are extracted from the multi-layered image information acquired by the visible light camera, including whether the threat target is debris or an enemy spacecraft and what payloads it carries. Then, by combining the processed target information with the spacecraft's own attitude and orbital parameters, a quantitative assessment of the target's threat category and threat level is obtained.
[0061] Next, based on the aforementioned threat target information and the information derived from threat level inference, a reasoning decision is made regarding the specific avoidance behaviors the spacecraft should adopt. These avoidance behaviors include three types: orbital maneuvers, attitude maneuvers, and normal operation. Then, combining threat target behavior information, the future actions of the threat target are predicted. Finally, by integrating the predicted threat behavior and the avoidance behaviors, the optimal trajectory for achieving threat avoidance is determined.
[0062] Finally, the onboard actuators execute a predetermined optimal avoidance trajectory, enabling the spacecraft to perform attitude and orbital maneuvers to evade threats. Furthermore, the spacecraft's own parameters are fed back in real-time to threat level inference and action sequence planning, thus constructing a closed-loop system for autonomous orbital threat avoidance.
[0063] S4: A two-stage planning strategy is set up. In the first stage of planning, the observation mission continues. During the avoidance behavior decision-making, if the current orbital threat will not cause harm to the spacecraft, avoidance is not necessary, and the entire plan is terminated. If the avoidance behavior decision-making requires the spacecraft to take actions to avoid the orbital threat, the observation mission is immediately interrupted, and the second stage of planning begins. The second stage of planning is used to avoid the orbital threat. The specific process is as follows:
[0064] Avoiding orbital threats is a crucial but occasional task; when not under orbital threat, the spacecraft primarily performs Earth observation missions. To balance the conflict between threat avoidance and the observation mission, and to minimize the impact of threat avoidance on the execution of the observation mission, this invention proposes a two-stage strategy.
[0065] As shown in Figure 1, the entire threat avoidance mission is divided into two stages, with the avoidance behavior decision as the dividing point: everything before the decision belongs to the first stage, and everything after it belongs to the second. In the first stage of planning, the spacecraft's attitude and orbital state remain unchanged, which does not affect the execution of the observation mission; the observation mission therefore continues during the first stage. The avoidance behavior decision determines the subsequent actions needed to achieve threat avoidance. If the current orbital threat will not harm the spacecraft, avoidance is not necessary, the entire planning is terminated, and the second stage is not carried out. If the decision requires the spacecraft to take actions to avoid the orbital threat, the avoidance actions in the second stage will change the spacecraft's attitude and orbit and thus affect the observation mission; the observation mission is therefore interrupted immediately and the second stage of planning is initiated. In this two-stage strategy, the first stage of planning and the observation mission are executed in parallel, and the observation mission is interrupted only if the second stage of planning is needed. The strategy thus reduces the impact of threat avoidance on the observation mission while improving the efficiency of mission execution.
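The control flow of the two-stage strategy can be sketched as follows. This is an illustration under stated assumptions: the three decision outcomes mirror the avoidance behaviors named in the description (orbital maneuver, attitude maneuver, normal operation), while the function names, the callback interface, and the action labels are hypothetical. The parallel execution of the observation mission is represented only by a flag.

```python
from enum import Enum


class Avoidance(Enum):
    NORMAL_OPERATION = "normal operation"
    ATTITUDE_MANEUVER = "attitude maneuver"
    ORBITAL_MANEUVER = "orbital maneuver"


def run_two_stage(plan_stage_one, decide, plan_stage_two):
    """Run stage one, take the avoidance decision, and run stage two if needed."""
    # Stage one leaves attitude and orbit unchanged, so observation continues.
    result = {"actions": list(plan_stage_one()), "observation_interrupted": False}
    decision = decide()
    result["decision"] = decision
    if decision is Avoidance.NORMAL_OPERATION:
        return result                      # threat is harmless: planning ends
    # Stage two changes attitude or orbit, so observation is interrupted first.
    result["observation_interrupted"] = True
    result["actions"] += plan_stage_two(decision)
    return result


def stage_one():
    return ["detect_target", "fuse_sensor_data", "infer_threat_level"]


def stage_two(decision):
    return ["predict_threat_actions", "solve_trajectory", decision.value]


harmless = run_two_stage(stage_one, lambda: Avoidance.NORMAL_OPERATION, stage_two)
evasive = run_two_stage(stage_one, lambda: Avoidance.ORBITAL_MANEUVER, stage_two)
print(harmless["observation_interrupted"], evasive["observation_interrupted"])
# False True
```

Keeping the interruption inside the branch that runs stage two is what limits the cost to the observation mission: a harmless threat never touches it.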
[0066] It should be noted that this invention studies mission planning for autonomous avoidance of orbital threats. It considers coordinating spacecraft hardware and software actions to achieve threat avoidance without ground intervention. Existing technologies do not address the specific implementation methods for each action. In actual spacecraft operation in orbit, only the planning algorithm and specific methods for each action proposed in this invention need to be loaded into the embedded microprocessor. Guided by the autonomous avoidance planning results, the execution sequence of actions follows the time constraints between them. Then, the proposed two-stage planning strategy can be implemented, enabling autonomous avoidance of orbital threats to the spacecraft.
[0067] S5: Perform time-constrained reasoning and numerical effect reasoning involving resource variables on the planning problem, ensuring that the resource variables satisfy the execution of duration-based actions during dynamic changes. The specific process is as follows:
[0068] Step 5.1: Each durative action act in the planning model is decomposed into two non-temporal instantaneous actions: a start action of the form <pre, eff>, built from the start condition and the start effect, and an end action act_⊥ = <pre_⊥, eff_⊥>. Each state in the plan is represented as S = <F,V,Q,P,C>. An action can be applied only if its effects do not conflict with any invariant of an action in Q; when it is applied, F and V are updated according to its effects. To account for the temporal structure of the problem, C is updated each time an action is added to the plan.
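The decomposition and the invariant check can be sketched as below. This is a minimal illustration that tracks only the propositional facts F; resource variables, the plan P, and the constraints C are left out. The class, function, and fact names are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapAction:
    """An instantaneous start or end action: conditions plus effects."""
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset


def decompose(name, pre_start, add_start, del_start, pre_end, add_end, del_end):
    """Split one durative action into its start and end snap actions."""
    start = SnapAction(name + "_start", frozenset(pre_start),
                       frozenset(add_start), frozenset(del_start))
    end = SnapAction(name + "_end", frozenset(pre_end),
                     frozenset(add_end), frozenset(del_end))
    return start, end


def can_apply(snap, facts, queue_invariants):
    """Conditions must hold, and effects must not break an invariant in Q."""
    if not snap.pre <= facts:
        return False
    return all(not (snap.delete & inv) for inv in queue_invariants)


def apply(snap, facts):
    """Update F according to the snap action's effects."""
    return (facts - snap.delete) | snap.add


pic_start, pic_end = decompose(
    "take_picture",
    pre_start={"camera_calibrated"}, add_start={"taking_picture"}, del_start=set(),
    pre_end={"taking_picture"}, add_end={"picture_taken"},
    del_end={"camera_calibrated", "taking_picture"},
)
slew_start, _ = decompose(
    "attitude_maneuver",
    pre_start={"attitude_held"}, add_start={"slewing"}, del_start={"attitude_held"},
    pre_end={"slewing"}, add_end={"attitude_held"}, del_end={"slewing"},
)

facts = frozenset({"camera_calibrated", "attitude_held"})
facts = apply(pic_start, facts)
queue_invariants = [frozenset({"attitude_held"})]   # take_picture is now in Q
blocked = not can_apply(slew_start, facts, queue_invariants)
facts_after = apply(pic_end, facts)
print(blocked, sorted(facts_after))  # True ['attitude_held', 'picture_taken']
```

The attitude maneuver is rejected while the picture is being taken because its start effect would delete a fact that an action in Q needs to hold throughout.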
[0069] During planning, states are extended to store step information. Each planning step has a unique index, and each fact p in each state is associated with the following information:
[0070] F⁺(p) (F⁻(p)) gives the index i of the step that most recently added (deleted) fact p.
[0071] FP(p) is a set of pairs <i,d>, each recording a step with precondition p, where i is the step index, d ∈ {0, ε}, and ε represents a small time interval. If d = 0, step i is recorded as the end of an interval during which p needs to be maintained; in this case, i is the ending step of an action for which p is an invariant condition. If d = ε, step i is recorded as the beginning of an interval during which p needs to be maintained, corresponding to the start or end condition associated with step i.
[0072] Next, describe the actions at the start of the application. This refers to the process of updating the state when an action (act) ends. The startup operation is applied in the planned step i. At that time, the following constraints will be added to the planning:
[0073] For each Add time constraint t(sstep) ≥ t(SF) + (p))+ε, where t(sstep) represents the addition The timestamp of step index sstep. The step implementing p is moved forward to before step i. For For each negative effect p, remove p from the state and add the constraint t(sstep) ≥ t(i) + ε, such that the removal step i occurs after any action that requires p. For For each positive effect p, add p to the state. Add the constraint t(sstep) ≥ t(SF). - (p)+ε, and step i is recorded as the implementation step of p. For each invariant if If p is not implemented, adding the constraint t(sstep)≥t(i) will move the recorded steps for implementing p before step i. Applying the termination action is similar, but invariant conditions do not need to be considered.
[0074] It should be noted that p ∈ F denotes a fact about the spacecraft's state, such as the spacecraft holding its attitude or taking a picture. p is a symbol whose meaning changes as the spacecraft's state changes, and the spacecraft's state is changed by the planned actions. Each action has a duration; to handle it, the action is decomposed into two non-temporal instantaneous actions, where act⊢ = <pre⊢, eff⊢> denotes the instantaneous start action and act⊣ = <pre⊣, eff⊣> denotes the instantaneous end action. An action added to the plan must satisfy the corresponding spacecraft state conditions and will alter the spacecraft's state. A start condition p is a state fact that the spacecraft must satisfy when the start action is added. A negative effect p is a spacecraft state removed after the action is added, and a positive effect p is a spacecraft state added after the action is added. An invariant is a state that the spacecraft must maintain for the duration of the action. For example, a spacecraft must be calibrated before taking a picture, so the calibrated state is a precondition of the picture-taking action. After the picture-taking action is performed, the calibrated state is deleted, which is a negative effect, and the picture-taken state is added, which is a positive effect. While the picture is being taken the spacecraft's attitude must not change, so holding the attitude is the invariant condition of the picture-taking action.
[0075] Step 5.2: For the resource variables V, the state contains vectors V_min and V_max that record their lower and upper bounds. When there is continuous numeric change, the value of a resource variable depends on time. For each v ∈ V:
[0076] V_eff(v) records the index of the most recent step that has an instantaneous effect on v.
[0077] V_cts(v) records a set of start and end step index pairs, where (i, j) ∈ V_cts(v) indicates that the action that starts at i and ends at j (step j is still in the event queue) has a continuous numeric effect on v.
[0078] VP(v) records a set of step indices, where i ∈ VP(v) if step i depends on the value of v in one of three ways: step i has a precondition involving v; the effect of step i depends on the previous value of v; or step i is the start of an action whose duration depends on v.
[0079] VI(v) records index pairs such that (i, j) ∈ VI(v) when the action starting at step i and ending at step j has an invariant condition that depends on v.
[0080] Step 5.1 is then extended to handle the effects of changes in resource variables. When adding the action act at step i:
[0081] 1. If the effect of act depends on the value of v: add t(i) ≥ t(V_eff(v)) + ε to S′.C, so that act executes after the action that most recently affected variable v, where S′.C denotes the set of time constraints C in state S′; add t(s) + ε ≤ t(i) and t(i) + ε ≤ t(e) to S′.C, to place the dependent effect inside the continuous effect of the currently active action, where s and e are the start and end steps of that action; add i to the set S′.VP(v). Here t(·) denotes the timestamp of the step whose index appears in parentheses.
[0082] 2. If act has an instantaneous numeric effect on v: add t(i) ≥ t(V_eff(v)) + ε to S′.C, so that v is updated sequentially; add t(j) + ε ≤ t(i) to S′.C, to avoid conflicts between the effect of act and the steps j that depend on v; add t(s) + ε ≤ t(i) and t(i) + ε ≤ t(e) to S′.C, to place the step within the scope of the active continuous effects; set S′.V_eff(v) ← i; update S′.V_min(v) and S′.V_max(v) according to the effect.
[0083] 3. If act starts an action (ending at j) that has an invariant condition on v: add t(s) + ε ≤ t(i) and t(i) + ε ≤ t(e) to S′.C, to place the step within the scope of the active continuous effects; if act has no update effect on v, add t(i) ≥ t(S.V_eff(v)) + ε to S′.C, deferring the invariant until after the step that most recently affected v; add (i, j) to S′.VI(v).
[0084] 4. If act starts an action (ending at j) that has a continuous effect on v: if act has no instantaneous update effect on v, add t(i) ≥ t(V_eff(v)) + ε to S′.C, so that v is updated sequentially; add t(j) + ε ≤ t(i) to S′.C; add t(s) + ε ≤ t(i) and t(i) + ε ≤ t(e) to S′.C, to place the step within the scope of the active invariant conditions; add (i, j) to S′.V_cts(v); set S′.V_eff(v) ← i.
[0085] 5. If act ends the action that started at k and has a continuous effect on v: add t(s) + ε ≤ t(i) and t(i) + ε ≤ t(e) to S′.C, to place the step within the scope of the other active continuous effects; remove (k, i) from S′.V_cts(v); set S′.V_eff(v) ← k.
[0086] 6. If act ends an action with an invariant condition on v: add i to S′.VP(v); remove (k, i) from S′.VP(v).
[0087] The ordering constraints on the steps that change the value of v follow the order in which those steps were added to the plan. The value of v at any time can therefore be determined by ordering the steps that affect v.
[0088] S6: Check the consistency of time constraints and resource variable constraints for interactions in each state, and delete states that do not meet the time constraints. The specific process is as follows:
[0089] Action consistency check. A state S is temporally consistent only if the steps [0, ..., n-1] of the plan P leading to S can be assigned values [t(0), ..., t(n-1)], the execution times of the corresponding steps. Given the time constraints C and the resource constraints, once the ordering constraints S′.C have been constructed, the consistency of time and resources must be checked. Any state that cannot satisfy the time constraints is removed from the search immediately, because no extension of its action sequence can lead to a valid solution.
[0090] The time constraints C established in state S are expressed as follows:
[0091] lb≤t(b)-t(a)≤ub
[0092] where lb, ub ∈ R are the lower and upper bounds of the interval, with 0 ≤ lb ≤ ub, and t(b) - t(a) is the time elapsed between steps a and b.
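Constraints of the form lb ≤ t(b) - t(a) ≤ ub form a simple temporal network, and when only such constraints are present their consistency can be checked without an LP. The sketch below uses the standard Bellman-Ford negative-cycle test as a stand-in technique; it is an assumption made for illustration, not the check described in this document, which uses an LP to handle the numeric constraints as well. The function name and input format are hypothetical.

```python
import math


def is_temporally_consistent(num_steps, constraints):
    """constraints: list of (a, b, lb, ub) meaning lb <= t(b) - t(a) <= ub.

    Each constraint becomes two weighted edges of a distance graph; the
    constraint set is consistent iff the graph has no negative cycle.
    """
    edges = []
    for a, b, lb, ub in constraints:
        if ub != math.inf:
            edges.append((a, b, ub))    # t(b) - t(a) <= ub
        edges.append((b, a, -lb))       # t(a) - t(b) <= -lb

    # Starting every node at 0 is equivalent to a virtual source with
    # zero-weight edges to all steps.
    dist = [0.0] * num_steps
    for _ in range(num_steps):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return True                 # converged: no negative cycle
    return False                        # still relaxing: negative cycle


# Steps 0 -> 1 take at least 5, steps 1 -> 2 at least 2, so step 2 cannot
# follow step 0 by at most 6; relaxing that bound to 8 restores consistency.
bad = [(0, 1, 5, 10), (1, 2, 2, 3), (0, 2, 0, 6)]
good = [(0, 1, 5, 10), (1, 2, 2, 3), (0, 2, 0, 8)]
```

A state whose constraint set fails this test would be pruned, in the same way as a state whose LP is infeasible.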
[0093] When reasoning about continuous change in numeric resources under time constraints, linear programming (LP) is used to capture both the time constraints and the numeric constraints, including the interaction between them. The construction of the LP is described below.
[0094] Consider a plan P = [act_0, ..., act_{n-1}] reaching state S, where act_{n-1} is the action most recently added to the plan. The timestamp t_i of each instantaneous action act_i has a corresponding LP variable step_i, and the timestamp e_i of the future end of each action that started at step i but has not yet finished has a corresponding LP variable estep_i.
[0095] Because the numeric effects on resource variables can be either discrete or continuous, two additional vectors of variables are created at each step of the plan. The first, V_i, holds the values of the state variables V immediately before act_i is executed (for step 0, V_0 equals the values of V in the initial state I). The second, V′_i, holds the values of V immediately after act_i is executed. The variables in V_0 are enumerated as v_0, ..., v_{m-1}; similarly, the variables in V′_0 are written v′_0, ..., v′_{m-1}, and v_i is the i-th value in V. Two vectors are needed at each layer to represent the discrete changes caused by an action: an instantaneous action may cause the value of a variable to differ immediately after execution. To represent this in the LP, if the action at step i has no effect on variable v, then v′_i = v_i. Otherwise, for a discrete effect, a constraint is introduced to define the value of v′_i:
[0096] v′ i =v i +W·V+k·(ce(i)-cs(i))+c
[0097] where W is a constant vector, c is an arbitrary constant, and W·V + c is evaluated over the variable values that hold before act_i is executed. The functions cs(i) and ce(i) denote the start and end timestamp variables of the action at step i. If step i is the end of an action, then ce(i) = step_i and cs(i) is the step variable of the start of the action that ends at step i. Similarly, if step i starts an action, then cs(i) = step_i and ce(i) is either estep_i, if the action has not yet finished, or the step variable of the end of the action that started at step i. Thus ce(i) - cs(i) expresses the dependence of the action's effect on its duration.
[0098] Continuous linear change occurs between planning steps, not instantaneously at the steps themselves. To record continuous effects while constructing the LP, the total gradient of continuous change acting on each variable v ∈ V is tracked from the start of the plan, where δv_i denotes the gradient in effect after act_{i-1} and before act_i is executed. The gradient on a variable v can change only when an action starts (initiating a continuous effect on v with some rate k ∈ R) or when an action ends (terminating the effect initiated at its start). The constant values of δv are calculated as follows:
[0099] For all variables, δv_0 = 0, since no variable undergoes continuous numeric change before the plan begins. If act_i has no continuous numeric effect on v, then δv_{i+1} = δv_i. If act_i starts a continuous numeric effect with rate k, then δv_{i+1} = δv_i + k. If act_i ends a continuous numeric effect with rate k, then δv_{i+1} = δv_i - k.
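The gradient recurrence can be sketched in a few lines. The function name and the step encoding ("start", "end", "none" with a rate k) are hypothetical conventions chosen for this illustration.

```python
def gradients(steps):
    """Return [dv_0, ..., dv_n] for one variable v.

    steps: list of (kind, k) pairs, one per plan step, where kind is
    "start" (the step starts a continuous effect with rate k),
    "end" (the step ends a continuous effect with rate k) or
    "none" (the step has no continuous effect on v).
    """
    delta = [0.0]                        # dv_0 = 0 before the plan begins
    for kind, k in steps:
        if kind == "start":
            delta.append(delta[-1] + k)
        elif kind == "end":
            delta.append(delta[-1] - k)
        else:
            delta.append(delta[-1])
    return delta


# Assumed example: a manoeuvre drains a resource at 2 units per time unit
# while a charger, started later, adds 0.5 units per time unit.
profile = gradients([
    ("start", -2.0),
    ("none", 0.0),
    ("start", 0.5),
    ("end", -2.0),
    ("end", 0.5),
])
```

Because every rate is a constant, each δv_i is a constant too, which is what keeps the resulting program linear.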
[0100] Based on these gradient values, the following are recorded for each v ∈ V while the LP is constructed:
[0101] v_val: the LP variable holding the value of v immediately after the last step m that affected v, that is, v′_m;
[0102] v_t: the timestamp variable of the last step m that affected v.
[0103] When step i is reached, the value v_i of each variable v ∈ V before step i is determined as follows. For a continuous numeric effect: v_i = v_val + δv·(t(i) - v_t). For an instantaneous numeric effect: v_i = v_val + w_i, where w_i is the amount by which v is increased at step i. In other words, v is calculated from its value after the last step that modified it and the time elapsed since that step.
[0104] At each step i, the values associated with each variable v ∈ V are updated as follows:
[0105] If step i has an instantaneous effect on v, create a constraint relating v′(i) to v(i), and set v_val ← v′(i), v_t ← t(i).
[0106] If step i is the start of an action with a continuous effect that changes v at a rate of c per unit time, add c to δv and set v_val ← v′(i), v_t ← t(i).
[0107] If step i is the end of an action that continuously affects v at a rate c, subtract c from δv and set v_val ← v′(i), v_t ← t(i).
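The bookkeeping of v_val, v_t and δv can be sketched as follows. In the description these are LP variables; the sketch assumes instead that the step timestamps are already fixed numbers, and the class name, method names and battery figures are hypothetical.

```python
class VariableTracker:
    """Tracks one numeric variable v along a plan with fixed timestamps."""

    def __init__(self, initial_value):
        self.v_val = initial_value   # value after the last step that affected v
        self.v_t = 0.0               # timestamp of that step
        self.delta = 0.0             # gradient currently acting on v

    def value_before(self, t):
        """v_i = v_val + dv * (t(i) - v_t)."""
        return self.v_val + self.delta * (t - self.v_t)

    def instantaneous(self, t, change):
        """Step at time t changes v by a fixed amount."""
        self.v_val = self.value_before(t) + change
        self.v_t = t

    def start_continuous(self, t, rate):
        """Step at time t starts an effect changing v at `rate` per time unit."""
        self.v_val = self.value_before(t)
        self.v_t = t
        self.delta += rate

    def end_continuous(self, t, rate):
        """Step at time t ends the effect that was started with `rate`."""
        self.v_val = self.value_before(t)
        self.v_t = t
        self.delta -= rate


# Assumed example: a battery at 100 units drained at 2 units per time unit
# from t = 10 to t = 30, with a one-off cost of 5 units at t = 25.
battery = VariableTracker(100.0)
battery.start_continuous(10.0, -2.0)
level_at_25 = battery.value_before(25.0)
battery.instantaneous(25.0, -5.0)
battery.end_continuous(30.0, -2.0)
level_at_40 = battery.value_before(40.0)
```

Each update first evaluates v at the new step using the old gradient, then resets the reference point, which is the same order as in the update rules above.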
[0108] Variables have now been created to represent resource values, and constraints have been introduced to capture the effects of actions on resources. Next come the constraints arising from the preconditions of each instantaneous action, the invariants that must hold between the start and end of an action, and any constraints on the duration of each action in the plan. For each numeric precondition of the form <v, {≤, =, ≥}, W·V + c> that must hold for step i to be applied, a constraint is added to the LP:
[0109] v {≤, =, ≥} W·V + c
[0110] For an action act that starts at step_i and ends at step_j, the invariants of act are added to the LP in the same form, once for each vector of variables in [V′_i, ..., V′_{j-1}] and [V_{i+1}, ..., V_j]. If the end of act (which started at i) has not yet appeared in the plan, the invariants of act are applied to all vectors of variables from V′_i onward: since act must end in the future, its invariants must not be violated at any step of the current plan after its start. Finally, the duration constraints are added. For an action act starting at step i, the variable corresponding to the time at which act ends is denoted ce(i), where ce(i) = step_j if the end of the action has been inserted into the plan at step j, and ce(i) = estep_i otherwise. For each duration constraint of act, the following constraint is added:
[0111] ce(i) - step_i {≥, =, ≤} W·V_i + c
[0112] This process constructs a linear program (LP) that captures all numeric and time constraints in the plan, together with the interactions between them. A solution to the LP contains values for the variables [step_0, ..., step_n], which are the timestamps assigned to the actions in the plan. To prevent the LP from assigning arbitrarily large (but valid) values to these variables, the objective of the LP is to minimize step_n, where act_n is the last step in the plan so far. If the LP built for the plan P reaching state S cannot be solved, state S can be pruned from the search space without further consideration, since no path leads from S to a legal goal state. This is how the validity of a plan is determined.
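A small worked instance may make the construction concrete. The numbers are hypothetical: a single action starts at step 0 and ends at step 1, its duration must lie between 10 and 20, it drains a resource v from an initial value of 100 at a rate of 2 per unit time, and its invariant requires v ≥ 70. Under these assumptions the LP reads:

```latex
\begin{aligned}
\text{minimise}\quad & \mathit{step}_1 \\
\text{subject to}\quad
& \mathit{step}_0 \ge 0, \\
& 10 \le \mathit{step}_1 - \mathit{step}_0 \le 20, \\
& v_0 = 100, \qquad v'_0 = v_0, \\
& v_1 = v'_0 - 2\,(\mathit{step}_1 - \mathit{step}_0), \qquad v'_1 = v_1, \\
& v'_0 \ge 70, \qquad v_1 \ge 70 .
\end{aligned}
```

The invariant limits the duration to at most 15, so the program is feasible and its optimum is step_0 = 0, step_1 = 10, v_1 = 80. If the minimum duration were 16 instead of 10, v_1 could be at most 68, the program would have no solution, and the state would be pruned.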
[0113] During state-space search, a state S lies on the trajectory of the plan, after one action step and before the next. If a variable v is undergoing continuous numeric change (or an active change related to duration), the values of the variables in the state depend on which instantaneous actions have been applied so far, the times at which they were applied, and how much time has elapsed since the last action was applied.
[0114] Because of the flexibility in time and the continuously changing variable values, two vectors V_max and V_min are used to represent the maximum and minimum values of each numeric variable in S. These bounds can be calculated using the LP. For a state S reached by plan P (where act_n is the last step in P), another vector of variables, denoted V_now, and another timestamp variable, step_now, are added to the LP. The variables in V_now represent the values of the state variables at some point (at time step_now) along the state trajectory after act_n. The numeric variables and timestamp of now are constrained as if now were an additional action appended to the plan:
[0115] now must follow the previous step, that is, step_now - step_n ≥ ε.
[0116] now must precede or coincide with the end of any action that has started but not yet finished, that is, for each estep(i), estep(i) ≥ step_now.
[0117] For each variable v_now ∈ V_now, its value is calculated from the continuous numeric change:
[0118] v_now = v_val + δv_now·(step_now - v_t)
[0119] Finally, for each invariant condition <v, {≤, =, ≥}, W·V + c> of every action that has started but not yet finished:
[0120] v_now {≤, =, ≥} W·V_now + c
[0121] The LP can then be used to find the upper and lower bounds of the variables. For each variable v_now ∈ V_now, the LP solver is called twice: once with the objective of maximizing v_now and once with the objective of minimizing v_now. The results are taken as the values of v_max and v_min in S. In the simplest case, when the variable v is not affected by continuous or duration-dependent change, the value of v is independent of time, so v_max = v_min and its value can be determined by successively applying the effects of the actions in P.
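In the special case where the timestamps of all earlier steps are already fixed and no invariant restricts v, the two LP calls reduce to evaluating a linear function at the ends of the admissible interval for step_now. The sketch below covers only that assumed special case; the function name, parameters and the default value of ε are hypothetical, and the general case still needs an LP solver.

```python
import math


def bounds_now(v_val, v_t, delta, step_n, open_ends, eps=0.001):
    """Return (v_min, v_max) for v_now = v_val + delta * (step_now - v_t).

    step_now ranges over [step_n + eps, min(open_ends)], where open_ends
    holds the fixed end times of actions that have started but not yet
    finished (unbounded above if the list is empty). Returns None if the
    interval is empty.
    """
    lo = step_n + eps
    hi = min(open_ends) if open_ends else math.inf
    if hi < lo:
        return None

    def value(t):
        if delta == 0.0:
            return v_val                 # v does not depend on time
        return v_val + delta * (t - v_t)

    candidates = (value(lo), value(hi))  # a linear function peaks at an endpoint
    return min(candidates), max(candidates)


# Assumed example: v was 100 at time 10 and is drained at 2 per time unit;
# the draining action must end by time 30.
drained = bounds_now(100.0, 10.0, -2.0, 10.0, [30.0], eps=0.5)
static = bounds_now(5.0, 0.0, 0.0, 3.0, [])
```

When the gradient is zero the two bounds coincide, matching the observation that v_max = v_min for variables unaffected by continuous change.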
[0122] Since each variable has an upper and a lower bound rather than a fixed value, a numeric precondition of the form W·X ≥ c is tested against the optimistic value of W·X, which is calculated by using the upper bound of each v ∈ X whose weight in W is positive and the lower bound otherwise. The precondition is considered satisfied if the resulting value is greater than or equal to c. (A condition of the form W·X ≤ c can be brought into this form by multiplying both sides of the inequality by -1, and a constraint of the form W·X = c can be replaced by the equivalent pair of conditions W·X ≥ c and -W·X ≥ -c.)
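The optimistic evaluation just described can be sketched directly. The function names are hypothetical; weights, bounds and the threshold are passed as plain Python lists and numbers.

```python
def optimistic_value(w, v_min, v_max):
    """Largest value W.X can take when each x_j lies in [v_min[j], v_max[j]]:
    use the upper bound for positive weights and the lower bound otherwise."""
    return sum(wj * (hi if wj > 0 else lo) for wj, lo, hi in zip(w, v_min, v_max))


def may_hold(w, v_min, v_max, c):
    """True if the precondition W.X >= c is satisfiable within the bounds."""
    return optimistic_value(w, v_min, v_max) >= c


# Assumed example: 2*x0 - x1 with x0 in [0, 5] and x1 in [3, 10].
best = optimistic_value([2, -1], [0, 3], [5, 10])
```

The test is deliberately optimistic: it can accept a precondition that no single consistent assignment satisfies, which is acceptable in a relaxation used only for heuristic guidance.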
[0123] S7: Use a temporal relaxed planning graph to guide the planner through the search space to reach the goal.
[0124] The temporal relaxed planning graph (TRPG) heuristic is used in the search described above to guide the planner efficiently through the search space to the goal. The construction of the heuristic in the presence of time-dependent change is described next. The TRPG supports heuristic computation in two phases: graph expansion and solution extraction. In the graph expansion phase, the aim is to construct a relaxed planning graph that determines which facts and actions are achievable. The TRPG consists of alternating fact layers and action layers. A fact layer consists of propositions together with bounds on the numeric variables v, and an action layer contains the actions whose preconditions are satisfied in the preceding fact layer. A propositional precondition is satisfied if the relevant fact is contained in the preceding layer. A numeric precondition is satisfied if some assignment to the variables in the precondition that is consistent with their upper and lower bounds satisfies it.
[0125] To handle both the continuous and the instantaneous numeric effects of actions, the continuous linear effects acting on each relevant variable are attached to the effects of the instantaneous action act, and g(act) denotes the set of all these continuous effects. The continuous effects cont(act) of an action are initiated at its start, that is, the gradient effects at the start of act include all the continuous effects of act. Once the set of linear continuous effects g(act) associated with each instantaneous action act is available, the structure of the TRPG can be adjusted. First, for each variable v, a maximum rate of change δv_max(t) is determined immediately after action layer al(t). It is set to the sum of all positive rates of change affecting v of any instantaneous action in al(t):
[0126] δv_max(t) = Σ_{act ∈ al(t)} Σ_{<v, k> ∈ g(act), k > 0} k
[0127] This definition relies on the restriction that any action can be executing only once at any given time. If the number of instances of an action a that can execute concurrently has a finite bound p(a), that bound is incorporated into the calculation of δv_max(t) as follows:
[0128] δv_max(t) = Σ_{a ∈ al(t)} p(a) · Σ_{<v, k> ∈ g(a), k > 0} k
[0129] Following the al(t) layer, an upper limit is established for the rate of change of each variable. By applying this upper limit to the maximum value of the variable at time t, the maximum value of each variable at any time t′ > t is derived. Then, it is decided how far ahead t′ should be advanced in the construction of the TRPG. Several possibilities exist: the time is limited to advancing ε or until the end of the next action, depending on whether any new facts are available after the most recent action layer. The time can be advanced to the earliest value, at which the cumulative effect of the active, continuous change on the variable satisfies previously unmet preconditions.
[0130] For a constant vector W and a constant c, each numeric precondition can be written as a constraint on the vector of numeric variables V in the form W·V ≥ c. The function ub is defined as follows:
[0131] ub(W, x, X) = Σ_j u_j, where u_j = W[j]·X[j] if W[j] ≥ 0 and u_j = W[j]·x[j] otherwise
[0132] The upper bound of W·V at t′ is ub(W, V_min(t′), V_max(t′)). The earliest point at which the numeric precondition W·V ≥ c of an action becomes satisfied in fact layer i is the minimum value of t′ for which ub(W, V_min(t′), V_max(t′)) ≥ c.
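If the bounds are assumed to grow linearly after time t, the earliest t′ has a closed form. The sketch below makes that assumption explicit: V_max(t′) = V_max(t) + δv_max·(t′ - t), and, as a symmetric counterpart that is not spelled out in the text above, V_min(t′) = V_min(t) + δv_min·(t′ - t) with δv_min ≤ 0. The function name and parameters are hypothetical.

```python
def earliest_time(w, v_min, v_max, d_min, d_max, c, t=0.0):
    """Smallest t' >= t at which the optimistic value of W.V reaches c,
    or None if it never does.

    v_min, v_max: bounds on each variable at time t.
    d_min, d_max: assumed most negative / most positive rates of change.
    """
    # Optimistic value of W.V at time t.
    value = sum(wj * (hi if wj > 0 else lo) for wj, lo, hi in zip(w, v_min, v_max))
    # Rate at which that optimistic value grows per unit time.
    slope = sum(wj * (dh if wj > 0 else dl) for wj, dl, dh in zip(w, d_min, d_max))
    if value >= c:
        return t
    if slope <= 0:
        return None
    return t + (c - value) / slope


# Assumed example: v is at most 10 now and can grow by at most 2 per time unit,
# so the precondition v >= 20 can first hold 5 time units later.
t_first = earliest_time([1.0], [0.0], [10.0], [0.0], [2.0], 20.0, t=3.0)
```

Advancing the graph directly to this time avoids building many intermediate layers in which nothing new becomes applicable.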
[0133] In the TRPG, each layer is associated with the earliest time it can represent. The earliest time at which fact p is available is ft(p) = max{t_min(F⁺(p)), t_min(F⁻(p)) + ε}, because p becomes true either when its most recent achiever is applied or, if it has been deleted, only once it is achieved again after its most recent deleter. Therefore, p is not added to the TRPG until a fact layer at time ft(p) appears. Similarly, for each numeric precondition specified over a set of variables vars, the layer at which it is considered satisfied is deferred to:
[0134]
[0135] Furthermore, any action that adds p is scheduled after the existing actions that affect p, after ft(p). And any action that deletes p must also follow the actions that require p. Therefore, the fact layer fd(p) for the action that deletes p is deferred until:
[0136]
[0137] where t_min(i) is the earliest timestamp at which step i can execute, calculated using the LP when step i was added to the plan.
[0138] By similar reasoning, a numeric effect ne that updates the variable v must be placed after the last action affecting any of the variables vars appearing in ne, and also after the last point at which v is required:
[0139]
[0140] According to the structure of the TRPG, fact layer 0 contains all facts that are true in S. Action layer 0 therefore consists of all actions whose preconditions are satisfied in fact layer 0. Fact layer 1 is then obtained by taking fact layer 0 and optimistically applying each action in action layer 0. More formally, applying the actions in action layer i, that is, al(i), yields fact layer i+1, where:
[0141] fl(i+1) = fl(i) ∪ {eff⁺(act) | act ∈ al(i)}
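The propositional part of graph expansion can be sketched as below. Only add effects are applied, so fact layers never shrink; numeric bounds and time stamps are left out. The function name, the action encoding and the three threat-handling actions are hypothetical.

```python
def expand(initial_facts, actions, goals, max_layers=100):
    """Build fact layers fl(0), fl(1), ... until the goals appear.

    actions: dict mapping an action name to (preconditions, add_effects),
    both sets of facts. Returns the list of fact layers, or None if the
    goals are unreachable even in the relaxed problem.
    """
    layers = [frozenset(initial_facts)]
    for _ in range(max_layers):
        current = layers[-1]
        if goals <= current:
            return layers
        # Action layer: every action whose preconditions hold in fl(i).
        applicable = [name for name, (pre, _) in actions.items() if pre <= current]
        # fl(i+1) = fl(i) united with the add effects of all applicable actions.
        nxt = current.union(*(actions[name][1] for name in applicable))
        if nxt == current:
            return None                  # fixed point reached without the goals
        layers.append(nxt)
    return None


# Assumed example with three chained actions.
actions = {
    "identify_threat": ({"sensors_on"}, {"threat_identified"}),
    "decide_avoidance": ({"threat_identified"}, {"avoidance_decided"}),
    "manoeuvre": ({"avoidance_decided"}, {"threat_avoided"}),
}
layers = expand({"sensors_on"}, actions, {"threat_avoided"})
```

Ignoring delete effects is what makes the graph a relaxation: a fact, once reachable, stays reachable in every later layer.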
[0142] Considering the numerical effect, in action layer i, the sets of optimistic increase and decrease effects of all actions on variable v are as follows:
[0143]
[0144]
[0145] In both expressions, the minimum and maximum bounds of v make each expression as extreme as possible in the appropriate direction. Similarly, after all available assignment effects, the optimistic upper and lower bounds of v are:
[0146]
[0147]
[0148] Then, the new boundary becomes:
[0149] V_max(i+1)[j] = max{act↑(i, V[j]), V_max(i)[j] + ∑inc(i, V[j])}
[0150] V_min(i+1)[j] = min{act↓(i, V[j]), V_min(i)[j] + ∑dec(i, V[j])}
[0151] In other words, to find the upper (lower) bound of V[j] in the next layer, one may either apply the most optimistic assignment effect or add the sum of all increasing (decreasing) effects. After the bounds of all variables in layer i+1 have been calculated, graph expansion continues iteratively: the actions applicable in action layer i+1 are found, which gives the facts in layer i+2, and so on. Graph expansion terminates in one of two cases: a fact layer satisfies all propositional and numeric goals; or adding more layers would never lead to more preconditions being satisfied, because no new propositions appear and accumulating larger or smaller bounds on the variables does not satisfy any further numeric precondition. In the second case the relaxed planning problem cannot be solved, so in the original problem no plan starting from S can reach G.
[0152] Assuming graph expansion terminates with all goals achieved, the second phase is solution extraction from the planning graph. This is a recursive process that works backward from the goals to the initial fact layer. Each fact layer is given a set of goals (facts or numeric preconditions) to be achieved at that layer. Solution extraction repeatedly selects the latest outstanding goal in the planning graph and chooses a way to achieve it. For a propositional goal, a single action that adds the goal is selected, and its preconditions are inserted as goals to be achieved. To satisfy a numeric goal W·V ≥ c at layer i, actions that affect the variables in V (with non-zero coefficients) are selected until the net increase k in W·V is sufficient for the residual precondition W·V ≥ c - k to be satisfied at layer i-1. This residual precondition is then added as a goal to be achieved at layer i-1, and the preconditions of all actions selected to support it are added as goals to be achieved in the preceding layers.
[0153] Solution extraction terminates when all outstanding goals are to be achieved at fact layer 0, because these goals are true in the state being evaluated and need no supporting actions. The actions selected during solution extraction form a relaxed plan from S to the goals. The length of this plan (its number of actions) is the heuristic estimate h(S).
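The two phases can be combined into a small propositional sketch that returns h(S). It records the first layer at which each fact appears together with one achieving action, then extracts a relaxed plan backward from the goals. Numeric goals, time stamps and delete effects are omitted; the function name, action encoding and example actions are hypothetical.

```python
def relaxed_plan_length(initial_facts, actions, goals):
    """Return h(S), the number of actions in a relaxed plan, or None.

    actions: dict mapping an action name to (preconditions, add_effects).
    """
    # Phase 1: graph expansion.
    first_layer = {f: 0 for f in initial_facts}
    achiever = {}
    layer = 0
    while not all(g in first_layer for g in goals):
        new_facts = {}
        for name, (pre, add) in actions.items():
            if all(p in first_layer for p in pre):
                for f in add:
                    if f not in first_layer and f not in new_facts:
                        new_facts[f] = name
        if not new_facts:
            return None                  # goals unreachable in the relaxation
        layer += 1
        for f, name in new_facts.items():
            first_layer[f] = layer
            achiever[f] = name

    # Phase 2: solution extraction, always taking the latest outstanding goal.
    open_goals = set(goals)
    selected = set()
    while open_goals:
        g = max(open_goals, key=lambda f: first_layer[f])
        open_goals.remove(g)
        if first_layer[g] == 0:
            continue                     # already true in the evaluated state
        name = achiever[g]
        if name not in selected:
            selected.add(name)
            open_goals.update(actions[name][0])
    return len(selected)


actions = {
    "identify_threat": ({"sensors_on"}, {"threat_identified"}),
    "decide_avoidance": ({"threat_identified"}, {"avoidance_decided"}),
    "manoeuvre": ({"avoidance_decided"}, {"threat_avoided"}),
}
h = relaxed_plan_length({"sensors_on"}, actions, {"threat_avoided"})
```

The preconditions of a selected action always first appear in an earlier layer than the fact it achieves, so extraction moves strictly toward layer 0 and terminates.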
[0154] Through the above steps, the final planning results are output, completing the heuristic autonomous avoidance mission planning for spacecraft in orbital threat environments. Figure 2 shows the planning results of a spacecraft responding to an orbital threat. The planning result gives the action sequence across the spacecraft subsystems and software modules, thereby coordinating the series of actions the spacecraft takes in response to the threat.
[0155] The above embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims
1. A heuristic spacecraft autonomous avoidance mission planning method under orbital threat environments, characterized in that it comprises:
Step 1: Initialize the spacecraft configuration;
Step 2: Establish a mission planning model for autonomous avoidance of spacecraft orbital threats; the mission planning model is represented by an octet, in which: F is the set of facts describing the spacecraft state; I is the initial spacecraft state; G is the target state that the spacecraft must maintain to achieve threat avoidance; V is the set of spacecraft resources; the action set contains the actions that can change the spacecraft state, together with their effects, each action being represented by its name, its duration (bounded by a minimum and a maximum duration), its preconditions (start conditions, end conditions and invariant conditions) and its effects (start effects and end effects); Q is the event queue recording actions that have started but not yet finished; P is the sequence of actions from the initial state to the current state; C is the set of time constraints on the actions in the plan;
Step 3: Analyze the scenarios of spacecraft orbital threats and design the spacecraft autonomous avoidance architecture based on the mission requirements of autonomous avoidance;
Step 4: Set up a two-stage planning strategy. In the first stage of planning, the observation mission continues. In the avoidance behavior decision, if the current orbital threat will not damage the spacecraft, no avoidance is needed and the entire plan is terminated. If the avoidance behavior decision requires the spacecraft to act to avoid the orbital threat, the observation mission is interrupted immediately and the second stage of planning is carried out. The second stage of planning is used to avoid the orbital threat;
Step 5: Perform time-constrained reasoning and numerical effect reasoning involving resource variables on the planning problem, so that the resource variables satisfy the execution of durative actions as they change dynamically;
Step 5.1: Each durative action act in the mission planning model is decomposed into two non-temporal instantaneous actions of the form <pre, eff>, where act⊢ denotes the instantaneous start action and act⊣ denotes the instantaneous end action; each state in the plan is represented as S = <F, V, Q, P, C>; an action is applied only if its effects do not conflict with any invariant of any action in Q, F and V are updated according to its effects, and C is updated each time an action is added to the plan;
Step 5.2: For the resource variables V, the state contains vectors V_min and V_max that record their lower and upper bounds, and when there is continuous numeric change the value of a resource variable depends on time;
Step 6: Check the consistency of the time constraints and resource variable constraints of the interactions in each state, and delete the states that do not meet the time constraints;
Step 7: Use a temporal relaxed planning graph to guide the planning process through the search space to reach the target.
2. The heuristic spacecraft autonomous avoidance mission planning method under orbital threat environments according to claim 1, characterized in that the spacecraft autonomous avoidance architecture designed in Step 3 includes a visible light camera, a global camera, an infrared camera, a microwave radar, a lidar, a multi-sensor information fusion unit, a threat target behavior information calculation unit, a threat level inference unit, an avoidance behavior decision-making unit, and an action sequence planning unit. It employs four detection methods: global, infrared, laser, and microwave. The sensors are combined according to the space environment to search for and capture threat targets and to perform initial ranging and angle measurements. Threat identification is achieved through the complementary information provided by the different sensor devices. By fusing the sensor information, the velocity, distance, and azimuth of the threat target are obtained, from which the target's abnormal behavior characteristics, orbital parameters, and collision probability are derived. The visible light camera provides close-range imaging to acquire morphological features. The processed target information is combined with the spacecraft's own attitude and orbit parameters for fusion reasoning, giving a quantitative evaluation of the target's threat category and threat level. The specific avoidance behaviors that the spacecraft should take are reasoned about and decided, the future actions of the threat target are predicted, and the optimal trajectory for threat avoidance is solved. The spacecraft's own parameters are fed back to the threat level inference unit and the action sequence planning unit in real time.
3. The heuristic spacecraft autonomous avoidance mission planning method under orbital threat environments according to claim 1, characterized in that step 5.1 further includes: each planning step has a unique index, and each fact p in each state is annotated with the following information: the indices of the steps that most recently added and most recently deleted p, respectively; and a set of pairs, each recording a step that has p as a precondition, where the first element i of a pair is a step index and the second element denotes a time interval. Depending on the value of the second element, the pair records either that step i is the end of an interval over which p must be maintained, in which case i is the final step of an action for which p is an invariant condition, or that step i is the start of an interval over which p must be maintained, corresponding to a start or end condition associated with step i. When a start action is applied at planning step i, the following constraints are added to the plan: for each state fact p that the spacecraft must satisfy for the start action to be added, a time constraint is added relating the timestamp of the step in state S that most recently added p to the timestamp of step i, so that the step achieving p is ordered before step i; for each negative effect p, p is removed from the state and constraints on the timestamp of step i are added so that the deleting step i comes after any action that requires p, a negative effect being a spacecraft state that is deleted once the action is added; for each positive effect p, p is added to the state, a constraint is added relating the timestamp of step i to the timestamp of the step that most recently deleted p in state S, and step i is recorded as the step that achieves p, a positive effect being a spacecraft state that is added once the action is added; for each invariant p, if the action itself does not achieve p, a constraint is added so that the recorded step achieving p is ordered before step i, an invariant being a state that the spacecraft must maintain for the duration of the action.
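The bookkeeping described in claim 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `FactInfo`, `State`, and `apply_step`, the constraint encoding `(earlier, later, gap)` meaning t(later) - t(earlier) >= gap, and the separation constant `EPS` are all hypothetical, and invariant conditions are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional

EPS = 0.001  # assumed minimum separation between two ordered steps


@dataclass
class FactInfo:
    added_by: Optional[int] = None    # step that most recently added the fact
    deleted_by: Optional[int] = None  # step that most recently deleted the fact
    needed_by: list = field(default_factory=list)  # (step, gap) pairs relying on the fact


@dataclass
class State:
    holds: set = field(default_factory=set)          # facts currently true
    facts: dict = field(default_factory=dict)        # fact name -> FactInfo
    constraints: list = field(default_factory=list)  # (earlier, later, gap)

    def info(self, p):
        return self.facts.setdefault(p, FactInfo())

    def order(self, earlier, later, gap):
        """Record t(later) - t(earlier) >= gap; skip unknown or self orderings."""
        if earlier is not None and earlier != later:
            self.constraints.append((earlier, later, gap))


def apply_step(state, i, preconditions=(), del_effects=(), add_effects=()):
    """Apply step i and add the ordering constraints it implies."""
    for p in preconditions:
        if p not in state.holds:
            raise ValueError(f"precondition {p!r} does not hold")
        info = state.info(p)
        state.order(info.added_by, i, EPS)  # achiever of p comes before step i
        info.needed_by.append((i, EPS))     # p must survive until step i
    for p in del_effects:
        info = state.info(p)
        for j, gap in info.needed_by:
            state.order(j, i, gap)          # deleter comes after every user of p
        state.order(info.added_by, i, EPS)
        info.needed_by.clear()
        info.deleted_by = i
        state.holds.discard(p)
    for p in add_effects:
        info = state.info(p)
        state.order(info.deleted_by, i, EPS)  # re-adding p comes after its deletion
        info.added_by = i
        state.holds.add(p)
```

For example, applying a step 0 that adds a hypothetical fact `camera_on`, a step 1 that requires it, and a step 2 that deletes it leaves the orderings 0 before 1 and 1 before 2 in `state.constraints`, which is the partial order a later consistency check would work on.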
4. The heuristic spacecraft autonomous avoidance mission planning method under orbital threat environments according to claim 3, characterized in that step 5.2 includes: when an action is added at step i, constraints are set for a variable v according to the following cases: 1) if the effect of the action depends on the value of v: a constraint is added to the set of time constraints of the state, forcing the action to execute after the action on variable v, and the corresponding entries are added to the sets recorded for v, the state recording for v the index of the most recent step with an instantaneous effect on v and the timestamp of that step; 2) if the action has an instantaneous numeric effect on v: a constraint is added to the set of time constraints so that updates to v are applied sequentially, and further constraints, expressed in terms of a constant, the timestamp of step j, and the timestamp of step s, are added to the set; 3) if the step starts an action that has an invariant condition on v: the corresponding entries are added to the sets recorded for v, and, if the action has no update effect on v, a further entry is added, these being expressed in terms of the timestamp of step e and the timestamp of a step index recorded in state S; 4) if the step starts an action that produces a continuous effect on v: if the action has no instantaneous update effect on v, an entry is added so that v is updated in order, and the corresponding entries are added to the sets recorded for v; 5) if the step ends an action that started at step k and produces a continuous effect on v: the corresponding entries are added to the sets recorded for v; 6) if the step ends an action that has an invariant condition on v: i is added to a set of step indices recorded in the state, and the pair (k, i) is removed from the corresponding set.
5. The heuristic spacecraft autonomous avoidance mission planning method under orbital threat environments according to claim 4, characterized in that step six includes: state S is temporally consistent only if the steps of the action plan P that reaches state S can be assigned values, each representing the execution time of the corresponding step, that respect the time constraints C and the resource constraints; after the ordering constraints have been constructed, the consistency of time and resources is checked, and any state that does not satisfy the time constraints is immediately removed from the search.
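The temporal part of the consistency check in claim 5 can be illustrated with a small sketch. The claim does not say which algorithm is used; the code below swaps in a standard technique for simple temporal constraints (Bellman-Ford negative-cycle detection on a distance graph). The function name `consistent_schedule` and the tuple encoding `(a, b, lb, ub)` are assumptions made for this example, and resource constraints are not modelled.

```python
import math


def consistent_schedule(num_steps, constraints):
    """Return one feasible list of step timestamps, or None if none exists.

    constraints: iterable of (a, b, lb, ub), each meaning
    lb <= t[b] - t[a] <= ub (use math.inf for "no upper bound").
    """
    if num_steps == 0:
        return []
    edges = []
    for a, b, lb, ub in constraints:
        if ub != math.inf:
            edges.append((a, b, ub))   # t[b] - t[a] <= ub
        edges.append((b, a, -lb))      # t[a] - t[b] <= -lb
    # Starting every node at 0 is equivalent to a virtual source
    # connected to each step by a zero-weight edge.
    dist = [0.0] * num_steps
    for _ in range(num_steps):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            earliest = min(dist)
            return [d - earliest for d in dist]  # shift so the first step is at 0
    return None  # still relaxing after num_steps rounds: negative cycle


# A state whose constraints admit no schedule would be pruned from the search.
schedule = consistent_schedule(3, [(0, 1, 1, 5), (1, 2, 2, math.inf)])
pruned = consistent_schedule(2, [(0, 1, 3, 5), (1, 0, 0, 1)]) is None
```

A search procedure along these lines would call the check on each newly generated state and discard the state whenever `None` is returned.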
6. The heuristic spacecraft autonomous avoidance mission planning method under orbital threat environments according to claim 5, characterized in that step six also includes: each of the time constraints C established in state S is expressed as a lower bound and an upper bound on the time interval between two steps a and b; when reasoning about continuous change in numeric resources subject to time constraints, linear programming (LP) is used to capture both the time constraints and the numeric constraints, including the interaction between them.
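Written out, a bound of the kind described in claim 6, and one way a continuously changing resource could be coupled to it in a linear program, might look as follows. The symbols $t(\cdot)$, $lb_{a,b}$, $ub_{a,b}$, $v_a$, $v_b$, $\delta v$, $v^{\min}$ and $v^{\max}$ are notational assumptions introduced here, since the original formula does not survive in this text; the constant-rate resource relation in the second line is an illustrative assumption about how time and numeric constraints can interact, not a statement of the claimed formulation.

```latex
\begin{aligned}
& lb_{a,b} \;\le\; t(b) - t(a) \;\le\; ub_{a,b}, \qquad lb_{a,b} \le ub_{a,b} \\
& v_b \;=\; v_a + \delta v \,\bigl(t(b) - t(a)\bigr) \\
& v^{\min} \;\le\; v_a \;\le\; v^{\max}, \qquad v^{\min} \;\le\; v_b \;\le\; v^{\max}
\end{aligned}
```

With a constant rate $\delta v$, every relation above is linear in the timestamps and the resource values, so a single LP can test whether a schedule exists that satisfies the time bounds and keeps the resource within its limits.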
7. The heuristic spacecraft autonomous avoidance mission planning method under orbital threat environments according to claim 5, characterized in that step seven includes: the temporal relaxed planning heuristic consists of two phases, graph expansion and solution extraction. In the graph expansion phase, a temporal relaxed planning graph is constructed to determine which facts and actions are reachable. The temporal relaxed planning graph consists of alternating fact layers and action layers. Each fact layer consists of the propositions that can hold and the upper and lower bounds on the variables v. Each action layer contains the actions whose preconditions are satisfied in the preceding fact layer. Preconditions include propositional preconditions and numerical preconditions. A propositional precondition is satisfied if the relevant fact is contained in the preceding fact layer. A numerical precondition is satisfied if some assignment of its variables that is consistent with the upper and lower bounds satisfies it.
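The numerical-precondition test in claim 7 can be made concrete with a short sketch. It assumes, purely for illustration, that a numerical precondition is a linear inequality of the form sum(w * v) >= threshold and that each variable carries a `(lower, upper)` bound from the preceding fact layer; the function names and the variables `fuel` and `power` are hypothetical.

```python
def optimistic_max(weights, bounds):
    """Largest value of sum(w * v) over all assignments within the bounds."""
    total = 0.0
    for var, w in weights.items():
        if w == 0:
            continue  # variable does not influence the expression
        lower, upper = bounds[var]
        # A positive weight is maximised at the upper bound,
        # a negative weight at the lower bound.
        total += w * (upper if w > 0 else lower)
    return total


def numeric_precondition_reachable(weights, threshold, bounds):
    """True if some assignment within the bounds satisfies
    sum(w * v) >= threshold."""
    return optimistic_max(weights, bounds) >= threshold


# Hypothetical bounds on two resources in some fact layer.
layer_bounds = {"fuel": (20.0, 80.0), "power": (0.0, 50.0)}
can_maneuver = numeric_precondition_reachable(
    {"fuel": 1.0, "power": -0.5}, 70.0, layer_bounds
)
```

Because only the most favourable assignment is examined, the test is optimistic: it can admit an action that is not executable in the real plan, which is the intended behaviour of a relaxation used to guide search.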
8. The heuristic spacecraft autonomous avoidance mission planning method under orbital threat environments according to claim 7, characterized in that step seven also includes: during graph expansion, once the bounds of all variables in layer i+1 have been calculated, graph expansion continues iteratively, finding the actions applicable in layer i+1 and hence the facts in layer i+2, and so on. Graph expansion terminates in one of two cases: a fact layer satisfies all propositional and numerical goals; or adding further layers would not cause any further preconditions to be satisfied, that is, no new propositions appear and the accumulated bounds on the variables do not satisfy any additional numerical preconditions. In the second case, the planning problem cannot be solved.
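The graph expansion loop and its two termination cases can be sketched as follows. This is a simplified, propositional-only illustration: delete effects are ignored as the relaxation, while variable bounds, action durations, and solution extraction are left out. The function name `expand_relaxed_graph`, the action encoding, and the example facts are assumptions made for this sketch.

```python
def expand_relaxed_graph(initial_facts, actions, goals):
    """Build fact layers until the goals are reached or nothing new appears.

    actions: dict mapping an action name to (preconditions, add_effects).
    Returns the list of fact layers if the goals become reachable,
    or None if expansion stalls first.
    """
    facts = set(initial_facts)
    goals = set(goals)
    layers = [set(facts)]
    while True:
        if goals <= facts:
            return layers  # case 1: a fact layer satisfies all goals
        # Action layer: every action whose preconditions hold in this fact layer.
        applicable = [
            name for name, (pre, _) in actions.items() if set(pre) <= facts
        ]
        new_facts = set(facts)
        for name in applicable:
            new_facts |= set(actions[name][1])
        if new_facts == facts:
            return None  # case 2: no new facts, so the goals are unreachable
        facts = new_facts
        layers.append(set(facts))


# Hypothetical avoidance actions.
example_actions = {
    "power_sensor": ({"idle"}, {"sensor_on"}),
    "detect": ({"sensor_on"}, {"threat_known"}),
    "maneuver": ({"threat_known", "fuel_ok"}, {"safe"}),
}
reachable = expand_relaxed_graph({"idle", "fuel_ok"}, example_actions, {"safe"})
stalled = expand_relaxed_graph({"idle"}, example_actions, {"safe"})
```

The loop always terminates because each iteration either returns or strictly enlarges a fact set drawn from a finite pool of facts.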