The invention relates to the cross-disciplinary field of computer vision and deep learning, and in particular to a semantic mapping method based on visual SLAM and two-dimensional semantic segmentation. The method comprises the following steps:
S1, calibrating camera parameters and correcting camera distortion;
S2, acquiring an image frame sequence;
S3, preprocessing the images;
S4, judging whether the current image frame is a key frame; if so, proceeding to step S6, and if not, proceeding to step S5;
S5, performing motion blur compensation;
S6, performing semantic segmentation: extracting ORB feature points from the image frames and carrying out semantic segmentation with a mask region convolutional neural network (Mask R-CNN) model;
S7, pose calculation: estimating the camera pose with a sparse SLAM algorithm model;
S8, using the semantic information to assist dense semantic map construction, thereby building a three-dimensional semantic map from the global point cloud map.
The invention improves the performance of the unmanned aerial vehicle semantic mapping system and significantly improves the robustness of feature point extraction and matching in dynamic scenes.
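The branch at step S4 can be sketched as a small dispatch routine. This is only an illustrative sketch, not the patented implementation: the keyframe criterion (ratio of re-observed map points, maximum frame gap) and the thresholds `min_ratio` and `max_gap` are assumptions, chosen to mimic common visual-SLAM keyframe policies.

```python
def is_keyframe(tracked_ratio, frames_since_kf, min_ratio=0.4, max_gap=20):
    # Hypothetical criterion: promote a frame to key frame when tracking
    # quality drops (few map points re-observed) or when too many frames
    # have passed since the last key frame.
    return tracked_ratio < min_ratio or frames_since_kf >= max_gap

def process_frame(tracked_ratio, frames_since_kf):
    """Dispatch one frame through the S4 branch of the pipeline (S5 vs S6-S8)."""
    steps = []
    if is_keyframe(tracked_ratio, frames_since_kf):
        # Key frames go through the heavy path described in the abstract.
        steps.append("S6: ORB extraction + Mask R-CNN segmentation")
        steps.append("S7: sparse SLAM pose estimation")
        steps.append("S8: fuse semantics into the dense point-cloud map")
    else:
        # Ordinary frames only receive motion blur compensation.
        steps.append("S5: motion blur compensation")
    return steps
```

For example, a frame in which only 30% of the tracked features survive would be treated as a key frame, while a well-tracked frame shortly after the last key frame would only be blur-compensated.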