The invention provides a mobile robot path planning algorithm based on single-chain sequential backtracking Q-learning. According to the mobile robot path planning algorithm based on the single-chain sequential backtracking Q-learning, a two-dimensional environment is expressed by using a grid method, each environment area block corresponds to a discrete location, the state of a mobile robot at some moment is expressed by an environment location where the robot is located, the search of each step of the mobile robot is based on a Q-learning iterative formula of a non-deterministic Markov decision process, progressively sequential backtracking is carried out from the Q value of the tail end of a single chain, namely the current state, to the Q value of the head end of the single chain until a target state is reached, the mobile robot cyclically and repeatedly finds out paths to the target state from an original state, the search of each step is carried out according to the steps, and Q values of states are continuously iterated and optimized until the Q values are converged. The mobile robot path planning algorithm based on the single-chain sequential backtracking Q-learning has the advantages that the number of steps required for optimal path searching is far less than that of a classic Q-learning algorithm and a Q(lambda) algorithm, the learning time is shorter, and the learning efficiency is higher; and particularly for large environments, the mobile robot path planning algorithm based on the single-chain sequential backtracking Q-learning has more obvious advantages.