An intelligent power generation control method based on multi-agent
reinforcement learning with a time tunnel idea includes the following steps: determining a discrete
state set S; determining a discrete joint
action set A; collecting real-time operating data from each
area power grid, calculating the instantaneous value of the area
control error ACE(k) and the instantaneous value of the control performance standard CPS(k) for each area, and selecting an exploratory action a_k; in the current state s, obtaining a short-term reward
signal R(k) for a given area
power grid i; calculating and estimating the value function errors ρ_k and
δ_k; updating the Q-function
table and the time tunnel matrix e(s, a) over all state-action pairs (s, a); updating the Q value and the mixed strategy π(s_k, a_k) for the current state s; then updating the time tunnel element e(s_k, a_k); selecting a variable learning rate φ; and updating a decision change rate
Δ(s_k, a_k) and a decision-space
estimation slope
Δ²(s_k, a_k) according to a given function. The method aims to solve the
equilibrium problem in multi-area intelligent power generation control; it provides
stronger learning-rate adaptability and a higher learning speed, and it
converges faster and is more robust.
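
The learning steps listed above can be sketched in code. The abstract names the quantities but does not give their formulas, so the following is only one plausible reading, not the patented method itself. It assumes Peng-style Q(λ) errors for ρ_k and δ_k, a decaying trace for the time tunnel matrix, and a win-or-lose rule on the sign of Δ·Δ² for choosing φ. The class name `TimeTunnelAgent`, all parameter names, and all default values are hypothetical. The state index is assumed to come from a discretization of ACE(k) and CPS(k) that the caller supplies.

```python
import numpy as np


class TimeTunnelAgent:
    """Learner for one area power grid (hypothetical name, illustrative defaults)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, lam=0.9,
                 phi_win=0.01, phi_lose=0.04, seed=0):
        shape = (n_states, n_actions)
        self.Q = np.zeros(shape)                    # Q-function table
        self.e = np.zeros(shape)                    # time tunnel matrix e(s, a)
        self.pi = np.full(shape, 1.0 / n_actions)   # mixed strategy pi(s, a)
        self.d1 = np.zeros(shape)                   # decision change rate
        self.d2 = np.zeros(shape)                   # decision-space estimation slope
        self.alpha, self.gamma, self.lam = alpha, gamma, lam
        self.phi_win, self.phi_lose = phi_win, phi_lose
        self.rng = np.random.default_rng(seed)

    def select_action(self, s):
        """Sample an exploratory action a_k from the mixed strategy of state s."""
        return int(self.rng.choice(self.pi.shape[1], p=self.pi[s]))

    def update(self, s, a, reward, s_next):
        # Value function errors (assumed Peng-style Q(lambda) definitions).
        target = reward + self.gamma * self.Q[s_next].max()
        rho = target - self.Q[s, a]
        delta = target - self.Q[s].max()

        # Decay every trace, propagate delta to all pairs, then correct the
        # visited pair with rho and reinforce its trace element.
        self.e *= self.gamma * self.lam
        self.Q += self.alpha * delta * self.e
        self.Q[s, a] += self.alpha * rho
        self.e[s, a] += 1.0

        # Variable learning rate: small step when "winning" (assumed criterion:
        # change rate and slope have opposite signs), larger step otherwise.
        # It is chosen before the strategy update here because that update needs it.
        winning = self.d1[s, a] * self.d2[s, a] < 0
        phi = self.phi_win if winning else self.phi_lose

        # Shift probability mass toward the greedy action of the current state.
        n_actions = self.pi.shape[1]
        best = int(np.argmax(self.Q[s]))
        step = np.minimum(self.pi[s], phi / (n_actions - 1))
        step[best] = 0.0
        change = -step
        change[best] = step.sum()
        self.pi[s] += change

        # Track the strategy change and its slope for the visited pair.
        self.d2[s, a] = change[a] - self.d1[s, a]
        self.d1[s, a] = change[a]
        return rho, delta


# Usage with made-up sizes: 3 discrete states, 4 discrete actions.
agent = TimeTunnelAgent(n_states=3, n_actions=4)
a_k = agent.select_action(0)
agent.update(s=0, a=a_k, reward=-1.0, s_next=1)
```

In a multi-area setting, one such agent would run per area power grid, each with its own tables. Capping each step at the current probability keeps every row of the mixed strategy non-negative and summing to one without a separate projection.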