The invention discloses a monocular visual-inertial SLAM method for dynamic scenes. The method comprises the following steps: first, the visual front end extracts ORB feature points and performs object detection with a YOLO-v3 neural network to obtain a set of potentially static feature points; outliers are then rejected by RANSAC estimation of the essential matrix, and the remaining final static feature points are tracked. Meanwhile, to improve data-processing efficiency, the IMU measurements are pre-integrated. The system is then initialized by computing initial values including the attitude, velocity, gravity vector and gyroscope bias. Next, tightly coupled visual-inertial nonlinear optimization is performed and a map is built; loop closure detection and relocalization run concurrently, and finally a global pose graph optimization is carried out. By fusing deep learning with visual-inertial SLAM, the method can eliminate, to a certain extent, the influence of dynamic objects on SLAM localization and mapping, and improves the stability of the system during long-term operation.
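The first screening stage, in which detector output is used to obtain the potentially static feature point set, can be illustrated with a minimal sketch. It assumes that detections arrive as labelled axis-aligned bounding boxes in pixel coordinates and that a feature point is treated as potentially dynamic when it lies inside a box of a movable class. The function name `potential_static_points`, the tuple layouts and the contents of `DYNAMIC_CLASSES` are hypothetical choices made for illustration; the abstract does not specify them. The subsequent essential-matrix RANSAC stage is not shown; in practice it could be performed with a routine such as OpenCV's `cv2.findEssentialMat` using the RANSAC method.

```python
from typing import FrozenSet, List, Sequence, Tuple

# (x, y) pixel coordinates of a feature point, e.g. an ORB keypoint location.
Point = Tuple[float, float]
# (class_label, x_min, y_min, x_max, y_max) in pixel coordinates.
Detection = Tuple[str, float, float, float, float]

# Assumption: which classes count as potentially dynamic is a design choice
# made for this sketch, not something stated in the source.
DYNAMIC_CLASSES: FrozenSet[str] = frozenset({"person", "car", "bicycle"})


def potential_static_points(
    points: Sequence[Point],
    detections: Sequence[Detection],
    dynamic_classes: FrozenSet[str] = DYNAMIC_CLASSES,
) -> List[Point]:
    """Keep the points that lie outside every box of a dynamic class."""
    # Only boxes whose label is in the dynamic set are used for masking.
    boxes = [d[1:] for d in detections if d[0] in dynamic_classes]
    kept = []
    for x, y in points:
        inside = any(
            x1 <= x <= x2 and y1 <= y <= y2 for x1, y1, x2, y2 in boxes
        )
        if not inside:
            kept.append((x, y))
    return kept


# Illustrative data: one person box (dynamic) and one chair box (static class).
points = [(10.0, 10.0), (50.0, 60.0), (200.0, 120.0), (310.0, 40.0)]
detections = [
    ("person", 40.0, 40.0, 100.0, 150.0),
    ("chair", 180.0, 100.0, 240.0, 160.0),
]
static_candidates = potential_static_points(points, detections)
print(static_candidates)
```

In this example the point inside the person box is discarded, while the point inside the chair box is kept because a chair is not in the assumed dynamic set. Points that survive this stage are only candidates: a static-looking object may still move, which is why the method applies the RANSAC outlier rejection afterwards.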