The invention provides a data-based Q-function adaptive dynamic programming method for achieving optimal control. The method mainly comprises the following steps: (1) initializing a stable control strategy; (2) initializing the weights of the actor and critic neural networks from the existing control strategy; (3) generating a control action for the controlled system according to the current control strategy and the current system state, applying the control action to the controlled object, and observing the system state at the next time step; (4) adjusting the weights of the actor and critic neural networks; (5) judging whether the current iteration cycle is finished, proceeding to step (6) if it is and returning to step (3) if it is not; (6) judging whether the neural network weights generated by the two most recent iteration cycles differ significantly, returning to step (2) with the newly generated actor and critic neural networks if they do, and outputting the final actor and critic neural networks if they do not.
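
Steps (1) to (6) can be illustrated with a minimal Python sketch. It is not the claimed implementation and rests on several assumptions that the abstract does not state: a scalar linear plant x' = a x + b u with quadratic stage cost, a critic that is linear in quadratic features (standing in for the critic neural network), an actor that is a single feedback gain (standing in for the actor neural network), a recursive least-squares critic update, and Gaussian exploration noise. The names `features`, `q_adp`, and all parameter values are hypothetical.

```python
import numpy as np


def features(x, u):
    # Quadratic basis, so that Q(x, u) = w[0]*x^2 + 2*w[1]*x*u + w[2]*u^2.
    return np.array([x * x, 2.0 * x * u, u * u])


def q_adp(a=1.2, b=1.0, q=1.0, r=1.0, k0=1.0,
          steps_per_cycle=200, max_cycles=50, tol=1e-5, seed=0):
    """Illustrative data-based Q-function actor-critic loop (assumed setup)."""
    rng = np.random.default_rng(seed)

    def plant(x, u):
        # Assumed plant; the learner uses it only to generate data.
        return a * x + b * u

    k_policy = k0            # (1) initial policy u = -k0 * x, assumed stabilizing
    w_critic = np.zeros(3)   # critic weights
    prev = None
    for cycle in range(max_cycles):
        # (2) initialize the actor from the current policy and reset the
        # least-squares covariance used to train the critic.
        k_actor = k_policy
        P = 1e6 * np.eye(3)
        x = 1.0
        for _ in range(steps_per_cycle):  # (5) fixed-length iteration cycle
            # (3) control action from the current policy plus exploration
            # noise, applied to the plant; observe the next state.
            u = -k_policy * x + 0.5 * rng.standard_normal()
            x_next = plant(x, u)
            cost = q * x * x + r * u * u
            # (4a) critic: recursive least squares on the Bellman equation
            # Q(x, u) - Q(x', pi(x')) = cost for the policy being evaluated.
            d = features(x, u) - features(x_next, -k_policy * x_next)
            gain = P @ d / (1.0 + d @ P @ d)
            w_critic = w_critic + gain * (cost - d @ w_critic)
            P = P - np.outer(gain, d @ P)
            # (4b) actor: move to the minimizer of the critic's Q over u,
            # which is u = -(w[1] / w[2]) * x for this quadratic form.
            if w_critic[2] > 1e-8:
                k_actor = w_critic[1] / w_critic[2]
            x = x_next
        # (6) compare the weights of the two most recent cycles.
        weights = np.append(w_critic, k_actor)
        if prev is not None and np.linalg.norm(weights - prev) < tol:
            return k_actor, w_critic, cycle + 1
        prev = weights
        k_policy = k_actor   # restart from step (2) with the new actor
    return k_actor, w_critic, max_cycles
```

Calling `q_adp()` returns the learned feedback gain, the critic weights, and the number of iteration cycles run. Under the assumptions above, the gain should approach the one obtained from the discrete-time Riccati equation for the same plant, even though the learner never reads `a` or `b` directly.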