The invention provides a method for planning unmanned aerial vehicle tasks based on the Q(lambda) algorithm. The method comprises a step of environment modeling, a step of initializing a Markov decision process model, a step of Q(lambda)
algorithm iterative computation, and a step of computing the optimal path from the state value function. Specifically, the method comprises: initializing a grid space according to the minimum flight path segment length of the unmanned aerial vehicle, mapping grid coordinates to waypoints, and representing circular and polygonal threat regions; building a Markov decision model by representing the flight action space of the unmanned aerial vehicle, designing the state transition probabilities, and constructing the reward function; carrying out iterative computation on the constructed model with the Q(lambda) algorithm; and computing the optimal path of the unmanned aerial vehicle from the finally converged state value function. By following the optimal path so computed, the unmanned aerial vehicle can safely avoid the threat regions. The method has the advantage that the traditional
Q-learning algorithm and eligibility traces are combined, which increases the convergence speed of the value function, improves its convergence precision, and guides the unmanned aerial vehicle to avoid the threat regions and plan paths autonomously.
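
The four steps described above (environment modeling, Markov decision model, Q(lambda) iteration, path extraction) can be illustrated with a minimal tabular sketch. It is not the claimed implementation: the 8 x 8 grid, the single circular threat region, the four-direction action space, the deterministic transitions, the reward values, the hyperparameters, and all function names are assumptions chosen for illustration. The sketch uses the Watkins variant of Q(lambda) with replacing eligibility traces, and polygonal threat regions are omitted.

```python
import math
import random

# Every size, reward, and hyperparameter here is an illustrative assumption.
N = 8                                            # grid is N x N cells
START, GOAL = (0, 0), (7, 7)
THREAT_CENTER, THREAT_RADIUS = (3.5, 3.5), 1.6   # one circular threat region
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]     # simplified flight action space
ALPHA, GAMMA, LAMBDA, EPSILON = 0.1, 0.95, 0.8, 0.1


def in_threat(cell):
    """True if the grid cell lies inside the circular threat region."""
    return math.dist(cell, THREAT_CENTER) <= THREAT_RADIUS


def step(state, action):
    """Deterministic transition (an assumption): (next_state, reward, done)."""
    nxt = (state[0] + action[0], state[1] + action[1])
    if not (0 <= nxt[0] < N and 0 <= nxt[1] < N):
        return state, -5.0, False                # left the grid: stay in place
    if in_threat(nxt):
        return nxt, -100.0, True                 # entering a threat ends the episode
    if nxt == GOAL:
        return nxt, 100.0, True
    return nxt, -1.0, False                      # ordinary step cost


def greedy(Q, s):
    """Index of the action with the largest Q-value in state s."""
    return max(range(len(ACTIONS)), key=lambda a: Q[s][a])


def choose(Q, s, rng):
    """Epsilon-greedy action selection."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    return greedy(Q, s)


def train(episodes=3000, max_steps=200, seed=0):
    """Watkins Q(lambda) with replacing eligibility traces."""
    rng = random.Random(seed)
    Q = {(x, y): [0.0] * len(ACTIONS) for x in range(N) for y in range(N)}
    for _ in range(episodes):
        traces = {}                              # sparse eligibility traces
        s = START
        a = choose(Q, s, rng)
        for _ in range(max_steps):
            s2, r, done = step(s, ACTIONS[a])
            a_star = greedy(Q, s2)
            a2 = choose(Q, s2, rng)
            target = r if done else r + GAMMA * Q[s2][a_star]
            delta = target - Q[s][a]
            traces[(s, a)] = 1.0                 # replacing trace
            for (si, ai), e in list(traces.items()):
                Q[si][ai] += ALPHA * delta * e   # credit all recently visited pairs
                traces[(si, ai)] = GAMMA * LAMBDA * e
            if Q[s2][a2] != Q[s2][a_star]:
                traces.clear()                   # exploratory action: cut traces
            if done:
                break
            s, a = s2, a2
    return Q


def extract_path(Q, max_steps=N * N):
    """Follow the greedy policy, i.e. V(s) = max_a Q(s, a), from START."""
    path, s = [START], START
    for _ in range(max_steps):
        s, _reward, done = step(s, ACTIONS[greedy(Q, s)])
        path.append(s)
        if done:
            break
    return path


if __name__ == "__main__":
    print(extract_path(train()))
```

Calling `extract_path(train())` returns the list of grid cells visited under the greedy policy. In a fuller model, each cell would be mapped back to a waypoint whose spacing equals the minimum flight path segment length.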