The invention provides a mobile robot path planning algorithm based on single-chain sequential backtracking Q-learning. A two-dimensional environment is represented with a grid method, in which each environment block corresponds to a discrete location, and the state of the mobile robot at any moment is the grid location it occupies. Each search step of the mobile robot is based on the Q-learning iterative formula for a non-deterministic Markov decision process: sequential backtracking is carried out step by step along the single chain, from the Q value at the tail end of the chain (the current state) to the Q value at the head end of the chain, and this continues until the target state is reached. The mobile robot repeatedly searches for paths from the initial state to the target state, performing each search step as described above, and the Q values of the states are iteratively optimized until they converge. Compared with the classic Q-learning algorithm and the Q(λ) algorithm, the algorithm requires far fewer steps to find the optimal path, with shorter learning time and higher learning efficiency; the advantage is more pronounced in large environments.
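
The procedure summarized above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not specify the reward scheme, the exploration strategy, the parameter values, or how the chain is stored, so the goal reward of 1, the epsilon-greedy action choice, the values of `alpha`, `gamma`, `epsilon` and `max_steps`, and the names `step`, `train` and `greedy_path` are all assumptions made for illustration. The grid here is deterministic, although the update uses the learning-rate form of the Q-learning formula.

```python
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action, rows, cols, obstacles, goal):
    """Move on the grid; a blocked move leaves the robot in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < rows and 0 <= c < cols) or (r, c) in obstacles:
        r, c = state
    nxt = (r, c)
    # Assumed reward scheme: 1 on entering the goal, 0 otherwise.
    return nxt, (1.0 if nxt == goal else 0.0)


def train(rows, cols, obstacles, start, goal, episodes=100,
          alpha=0.5, gamma=0.9, epsilon=0.1, max_steps=1000, seed=0):
    """Q-learning in which every step backtracks along the episode's chain."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    q = {((r, c), a): 0.0
         for r in range(rows) for c in range(cols) for a in range(n)}

    def best(s):
        return max(q[(s, a)] for a in range(n))

    def update(s, a, reward, s2):
        # Q(s,a) <- (1 - alpha) * Q(s,a) + alpha * (r + gamma * max Q(s',.))
        target = reward + gamma * best(s2)
        q[(s, a)] = (1 - alpha) * q[(s, a)] + alpha * target

    for _ in range(episodes):
        state, chain = start, []  # chain of (state, action, reward, next)
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(n)  # assumed epsilon-greedy exploration
            else:
                top = best(state)
                a = rng.choice([i for i in range(n) if q[(state, i)] == top])
            nxt, reward = step(state, ACTIONS[a], rows, cols, obstacles, goal)
            chain.append((state, a, reward, nxt))
            # Sequential backtracking: update from the tail of the chain
            # (the current step) back to its head (the first step).
            for s, act, rew, s2 in reversed(chain):
                update(s, act, rew, s2)
            state = nxt
            if state == goal:
                break
    return q


def greedy_path(q, rows, cols, obstacles, start, goal):
    """Follow the highest-valued action from start; capped to avoid loops."""
    path, state = [start], start
    while state != goal and len(path) <= rows * cols:
        a = max(range(len(ACTIONS)), key=lambda i: q[(state, i)])
        state, _ = step(state, ACTIONS[a], rows, cols, obstacles, goal)
        path.append(state)
    return path
```

For example, `greedy_path(train(5, 5, {(1, 1)}, (0, 0), (4, 4)), 5, 5, {(1, 1)}, (0, 0), (4, 4))` returns the sequence of grid locations visited when following the learned Q values from the initial state. Re-applying the update along the whole chain at every step is what lets a reward found at the tail reach the head within a single episode; it also makes the cost per episode quadratic in the chain length, which this sketch accepts for clarity.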