The invention discloses a satellite Internet of Things routing strategy based on Q-Learning. To address the satellite Internet of Things routing problem in complex environments, the strategy takes into account dynamic changes in the satellite Internet of Things topology and in node states, treats the whole satellite Internet of Things as a reinforcement learning environment, and treats satellite nodes and ground nodes as intelligent agents. The method for implementing the strategy comprises the following steps: firstly, initializing the satellite Internet of Things parameters; secondly, having each node maintain a Q value table and learn it with a Q value updating formula according to the hop count, distance, direction and buffer occupancy rate of the satellite nodes; and finally, forwarding data packets according to a greedy selection strategy using the learned Q value table. Moreover, the reward value is improved by considering the hop count of the satellite nodes, and the discount factor is improved by considering the distance and direction of the satellite nodes and the buffer occupancy rate, so that the Q value is optimized and efficient routing of the satellite Internet of Things is achieved. Therefore, the satellite Internet of Things routing strategy has good conversion and application prospects in fields such as aviation, spaceflight and the social economy.
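The three steps above (initialization, per-node Q table learning, greedy forwarding) can be illustrated with a minimal Python sketch. The abstract does not give the actual reward or discount-factor formulas, so everything below is an assumption made for illustration only: the function names `reward` and `discount`, the class `Node`, the learning rate `ALPHA`, the normalization constants, the equal weighting of the three discount terms, and the reading of "distance" and "direction" as the neighbor's distance to the destination and the angle between the link and the destination bearing. The update rule itself is the standard Q-Learning update with a per-link discount factor substituted for the usual constant.

```python
import math
from collections import defaultdict

ALPHA = 0.5  # learning rate (assumed value, not from the source)


def reward(hops_to_dest, max_hops=10):
    """Hypothetical hop-based reward: fewer remaining hops, higher reward."""
    return 1.0 - min(hops_to_dest, max_hops) / max_hops


def discount(distance_km, angle_rad, buffer_occupancy, max_distance_km=5000.0):
    """Hypothetical discount factor in [0, 1].

    Combines three assumed terms with equal weight:
    - distance: a neighbor closer to the destination scores higher
    - direction: a smaller angle towards the destination scores higher
    - buffer: a less occupied buffer (occupancy in [0, 1]) scores higher
    """
    d = 1.0 - min(distance_km, max_distance_km) / max_distance_km
    a = (1.0 + math.cos(angle_rad)) / 2.0
    b = 1.0 - buffer_occupancy
    return (d + a + b) / 3.0


class Node:
    """A satellite or ground node that keeps its own Q value table."""

    def __init__(self, name):
        self.name = name
        # Step 1 (initialization): every Q value starts at 0.
        self.q = defaultdict(float)  # key: (destination, neighbor)

    def best_q(self, dest, neighbors):
        """Largest Q value this node holds towards dest (0 if no neighbors)."""
        return max((self.q[(dest, n)] for n in neighbors), default=0.0)

    def update(self, dest, neighbor, r, gamma, neighbor_best_q):
        """Step 2: standard Q-Learning update with a per-link discount."""
        old = self.q[(dest, neighbor)]
        self.q[(dest, neighbor)] = old + ALPHA * (r + gamma * neighbor_best_q - old)

    def next_hop(self, dest, neighbors):
        """Step 3: greedy selection of the neighbor with the largest Q value."""
        return max(neighbors, key=lambda n: self.q[(dest, n)])


# Usage: node A learns about two candidate neighbors towards destination D.
a = Node("A")
# Neighbor B: 1 hop away from D, close, well aligned, lightly loaded.
a.update("D", "B", reward(1), discount(1000.0, 0.0, 0.2), neighbor_best_q=1.0)
# Neighbor C: 3 hops away from D, far, perpendicular, heavily loaded.
a.update("D", "C", reward(3), discount(4000.0, math.pi / 2, 0.8), neighbor_best_q=1.0)
chosen = a.next_hop("D", ["B", "C"])
```

In this toy setup the neighbor with fewer remaining hops, a shorter distance, a better direction and a lighter buffer ends up with the larger Q value and is the one picked by the greedy step. A practical deployment would also need some way for neighbors to exchange their best Q values, which the sketch simply passes in as `neighbor_best_q`.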