Method, device and storage medium for continuous space action planning of intelligent agent
An action planning technology for intelligent agents, applied in the fields of neural learning methods, computing based on specific mathematical models, and program-controlled manipulators. It addresses problems such as control delay and the inability to use interaction information, with the effects of reducing control delay and estimating state values more accurately.
Embodiment 1
[0065] As shown in Figure 1 and Figure 4, this embodiment provides a continuous space action planning method for an agent, which includes the following steps:
[0066] S1. Combine the state observations from the agent's continuous space action process into a vector to form the state S_t, and combine the drive control quantities from the same process into a vector to form the action a_t;
[0067] S2. Construct and train a neural network model comprising a policy network module and a value network module. The policy network module outputs the action probability distribution for a given state, and the value network module computes the value of a given state. Repeat S2 to train and update the neural network on data collected from interaction with the environment;
[0068] S3. KR-PV-UCT simulation, which comprises the following four processes in sequence (as shown in Figure 3):
[0069] Selection process: s...
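Steps S1 and S2 can be illustrated with a minimal sketch. It is not the patented implementation: the function and class names (`build_state`, `build_action`, `LinearPolicyValue`) are hypothetical, the single linear layers stand in for the trained strategy (policy) and value network modules, and the diagonal Gaussian is only one assumed form of the "action probability distribution". The KR-PV-UCT search of S3 is not covered here.

```python
import math
import random


def build_state(observations):
    """S1: flatten named observation groups into one state vector.

    Keys are visited in sorted order so the layout is deterministic.
    """
    state = []
    for name in sorted(observations):
        state.extend(float(x) for x in observations[name])
    return state


def build_action(controls):
    """S1: pack the drive control quantities into one action vector."""
    return [float(c) for c in controls]


class LinearPolicyValue:
    """Hypothetical stand-in for the policy and value modules of S2.

    Assumption: the policy is a diagonal Gaussian whose mean is linear
    in the state; the value is a linear function of the state.
    """

    def __init__(self, state_dim, action_dim, seed=0):
        rng = random.Random(seed)
        self.w_mu = [[rng.uniform(-0.1, 0.1) for _ in range(state_dim)]
                     for _ in range(action_dim)]
        self.log_std = [0.0] * action_dim  # std = exp(0) = 1 initially
        self.w_v = [rng.uniform(-0.1, 0.1) for _ in range(state_dim)]

    def policy(self, state):
        """Return (mean, std) of the Gaussian over actions for this state."""
        mean = [sum(w * s for w, s in zip(row, state)) for row in self.w_mu]
        std = [math.exp(ls) for ls in self.log_std]
        return mean, std

    def value(self, state):
        """Return the scalar value estimate for this state."""
        return sum(w * s for w, s in zip(self.w_v, state))

    def sample_action(self, state, rng):
        """Draw one continuous action from the policy distribution."""
        mean, std = self.policy(state)
        return [rng.gauss(m, s) for m, s in zip(mean, std)]


# Usage with made-up observation names and values.
obs = {"joint_angles": [0.1, -0.2], "joint_velocities": [0.0, 0.3]}
s_t = build_state(obs)
model = LinearPolicyValue(state_dim=len(s_t), action_dim=2)
a_t = build_action(model.sample_action(s_t, random.Random(1)))
v_t = model.value(s_t)
```

In a trained system the weights would be updated from environment interaction data as S2 describes; here they are random and serve only to show the data flow from observations to a state vector, an action sample, and a value estimate.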
Embodiment 2
[0118] This embodiment provides a device for continuous space action planning of an agent, comprising a memory and a processor. The memory stores a computer program, and the processor, when executing the computer program, implements the steps of the method for continuous space action planning of an agent described in Embodiment 1. The method is the same as in Embodiment 1 and is not repeated here.
Embodiment 3
[0120] This embodiment provides a storage medium for continuous space action planning of an agent, on which a computer program is stored. When the computer program is executed by a processor, it implements the steps of the method for continuous space action planning of an agent described in Embodiment 1. The method is the same as in Embodiment 1 and is not repeated here.


