Planetary soft landing control method and system based on reinforcement learning, and storage medium
A reinforcement-learning-based control method, applied in the field of deep space exploration, addresses the problems of training that is difficult to converge, complex models, and guidance laws whose performance cannot be guaranteed, and achieves strong real-time performance, optimized control performance, and damage avoidance.
Embodiment
[0118] 1) Experimental environment settings
[0119] In this example, DDPG, TD3 and SAC are selected for training and testing. The value function network shared by the three algorithms is designed as shown in Figure 1, the policy network structure of DDPG and TD3 is shown in Figure 2, and the SAC policy network structure is shown in Figure 3. The neural networks are implemented in Python using PyTorch.
[0120] The software environment for the simulation tests of all algorithms in this example is Ubuntu 16.04, and the hardware environment is an Intel(R) Core(TM) i5-9300H CPU, an NVIDIA GeForce GTX 1660 Ti GPU, and 16.0 GB of RAM.
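As an illustration of the three network roles named above, the following is a minimal PyTorch sketch, not the patented design. The layer widths, the state and action dimensions, and the class names (`QNetwork`, `DeterministicPolicy`, `GaussianPolicy`) are assumptions made for this example only; the actual structures are those defined in Figures 1 to 3.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration; the patent's figures define the real ones.
STATE_DIM, ACTION_DIM, HIDDEN = 6, 3, 256


class QNetwork(nn.Module):
    """Value (critic) network: maps a (state, action) pair to a scalar Q-value."""

    def __init__(self, state_dim=STATE_DIM, action_dim=ACTION_DIM, hidden=HIDDEN):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))


class DeterministicPolicy(nn.Module):
    """DDPG/TD3-style actor: maps a state to one action bounded to [-1, 1]."""

    def __init__(self, state_dim=STATE_DIM, action_dim=ACTION_DIM, hidden=HIDDEN):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),
        )

    def forward(self, state):
        return self.net(state)


class GaussianPolicy(nn.Module):
    """SAC-style actor: outputs a Gaussian and samples a tanh-squashed action."""

    def __init__(self, state_dim=STATE_DIM, action_dim=ACTION_DIM, hidden=HIDDEN):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mean = nn.Linear(hidden, action_dim)
        self.log_std = nn.Linear(hidden, action_dim)

    def forward(self, state):
        h = self.body(state)
        # Clamp the log-std to keep the standard deviation numerically sane.
        return self.mean(h), self.log_std(h).clamp(-20.0, 2.0)

    def sample(self, state):
        mean, log_std = self(state)
        dist = torch.distributions.Normal(mean, log_std.exp())
        raw = dist.rsample()          # reparameterised sample, keeps gradients
        action = torch.tanh(raw)      # squash into [-1, 1]
        # Change-of-variables correction for the tanh squashing.
        log_prob = dist.log_prob(raw) - torch.log(1.0 - action.pow(2) + 1e-6)
        return action, log_prob.sum(dim=-1, keepdim=True)
```

In this sketch the same critic class serves all three algorithms, mirroring the statement that they share one value function network design, while the actor differs between the deterministic (DDPG, TD3) and stochastic (SAC) cases.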
[0121] 2) Experimental results and analysis
[0122] The training curves of the DDPG, TD3 and SAC algorithms are shown in Figure 4, Figure 5 and Figure 6, respectively. The DDPG episode reward starts to rise at 10,000 episodes, increases markedly between 10,000 and 20,000 episodes, and then gradually and steadily converges to about 300, and the training ends after abou...
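Convergence statements of this kind are usually read off a smoothed reward curve. The following is a minimal sketch, assuming a trailing moving average and a hypothetical tolerance band around a target reward; the helper names, the window size, and the toy reward list are illustrative and are not taken from the patent or its figures.

```python
from collections import deque


def moving_average(rewards, window):
    """Return the trailing mean of `rewards` over at most `window` episodes."""
    buf, total, out = deque(), 0.0, []
    for r in rewards:
        buf.append(r)
        total += r
        if len(buf) > window:
            total -= buf.popleft()   # drop the oldest reward from the window
        out.append(total / len(buf))
    return out


def first_converged_episode(rewards, window, target, tol):
    """Index of the first episode whose full-window average is within
    `tol` of `target`, or None if that never happens."""
    for i, mean in enumerate(moving_average(rewards, window)):
        if i + 1 >= window and abs(mean - target) <= tol:
            return i
    return None


# Synthetic toy data (not experimental results): reward jumps from 0 to 300.
toy_rewards = [0.0] * 5 + [300.0] * 10
episode = first_converged_episode(toy_rewards, window=5, target=300.0, tol=15.0)
```

The window length trades responsiveness against noise: a short window reacts quickly but may report convergence on a transient spike, while a long window delays the reported episode.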