The invention discloses a target tracking method based on spatio-temporal
feature fusion learning, and relates to the technical field of
computer vision and
pattern recognition. In the method, a spatio-temporal
feature fusion learning network is first constructed. The spatio-temporal features comprise
temporal features and spatial features. The
temporal features are extracted by combining AlexNet with a
recurrent neural network. The spatial features are divided into target object
spatial transformation features and background spatial features, which are extracted with YOLOv3 and AlexNet respectively. During initial training, a training
data set and the stochastic
gradient descent method are used to train the spatio-temporal
feature fusion learning network; once training is complete, the network has an initial ability to locate the target object. The
image sequence to be tracked is then input into the network for forward
processing, and the network outputs the position and confidence of the target object bounding box. The confidence determines whether the network performs
online learning, and the position of the bounding box locates the target object, thereby realizing tracking of the target object.
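
The tracking stage described above can be illustrated with a minimal sketch. It is not the claimed implementation: the names `Prediction`, `track_sequence`, `forward`, `online_update` and `should_learn` are hypothetical, the `(x, y, width, height)` box layout is an assumption, and the rule that maps confidence to the online-learning decision is left to the caller because the text does not state a threshold or its direction.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

# Assumed box layout: (x, y, width, height). The source does not specify one.
Box = Tuple[float, float, float, float]


@dataclass
class Prediction:
    box: Box           # bounding-box position output by the network
    confidence: float  # confidence output by the network


def track_sequence(
    frames: Iterable[object],
    forward: Callable[[object], Prediction],
    online_update: Callable[[object, Prediction], None],
    should_learn: Callable[[float], bool],
) -> List[Box]:
    """Run forward processing on each frame and gate online learning on confidence.

    `forward` stands in for the trained spatio-temporal network,
    `online_update` for one online-learning step, and `should_learn`
    for the (unspecified) confidence rule. All three are hypothetical.
    """
    boxes: List[Box] = []
    for frame in frames:
        pred = forward(frame)              # forward processing of the frame
        if should_learn(pred.confidence):  # confidence decides on online learning
            online_update(frame, pred)
        boxes.append(pred.box)             # box position locates the target
    return boxes
```

As a usage example, `track_sequence(frames, net_forward, net_update, lambda c: c < 0.5)` would trigger an online update whenever the confidence falls below 0.5. Both the threshold value and the "below" direction are illustrative assumptions.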