The invention discloses an image semantic segmentation method based on regions and a deep residual network. Region-based semantic segmentation methods extract mutually overlapping regions at multiple scales, so they can identify targets of multiple scales and obtain fine object segmentation boundaries. Methods based on fully convolutional networks use a convolutional neural network to learn features autonomously and can be trained end to end on the pixel-by-pixel classification task, but they usually produce rough segmentation boundaries. The invention combines the advantages of the two approaches: first, candidate regions are generated in an image by a region generation network; then features are extracted from the image by a deep residual network with dilated convolution to obtain a feature map; the feature of each region is obtained by combining the candidate regions with the feature map, and is mapped to every pixel in that region; finally, pixel-by-pixel classification is carried out using a global average pooling layer. In addition, a multi-model fusion method is used: the same network model is trained with different inputs to obtain a plurality of models, and their features are then fused at the classification layer to obtain the final segmentation result. Experimental results on the SIFT Flow and PASCAL Context data sets show that the proposed algorithm achieves relatively high average accuracy.
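The region-to-pixel mapping and the fusion step described above can be illustrated with a minimal NumPy sketch. It is not the disclosed network: the candidate boxes and the feature map are taken as given inputs rather than produced by a region generation network and a residual network, and several details are assumptions made only for illustration. These are that a region's feature is the average of the feature map inside its box, that a pixel covered by several overlapping regions receives the mean of their features, that the classifier is a per-pixel linear layer, and that fusion averages the per-model class scores. The function names `region_to_pixel_features`, `classify_pixels`, and `fuse_and_label` are hypothetical.

```python
import numpy as np


def region_to_pixel_features(feature_map, boxes):
    """Pool a feature per region, then map it back to every pixel it covers.

    feature_map: array of shape (C, H, W).
    boxes: iterable of non-empty (x0, y0, x1, y1) boxes, end-exclusive.
    Assumption: region feature = average over the box; overlapping
    regions are combined by taking the mean at each covered pixel.
    """
    c, h, w = feature_map.shape
    summed = np.zeros((c, h, w), dtype=np.float64)
    counts = np.zeros((h, w), dtype=np.float64)
    for x0, y0, x1, y1 in boxes:
        region_feat = feature_map[:, y0:y1, x0:x1].mean(axis=(1, 2))  # (C,)
        summed[:, y0:y1, x0:x1] += region_feat[:, None, None]
        counts[y0:y1, x0:x1] += 1
    covered = counts > 0
    pixel_features = np.zeros_like(summed)
    # Pixels outside every region keep a zero feature vector.
    pixel_features[:, covered] = summed[:, covered] / counts[covered]
    return pixel_features


def classify_pixels(pixel_features, weights, bias):
    """Per-pixel linear classifier (an assumed stand-in for the classification layer).

    weights: (K, C), bias: (K,). Returns class scores of shape (K, H, W).
    """
    return np.einsum("kc,chw->khw", weights, pixel_features) + bias[:, None, None]


def fuse_and_label(score_maps):
    """Average the score maps of several models and pick the best class per pixel."""
    return np.mean(np.stack(score_maps), axis=0).argmax(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(8, 6, 6))            # stand-in feature map
    boxes = [(0, 0, 4, 4), (2, 2, 6, 6), (0, 3, 6, 6)]  # overlapping candidates
    pixel_feats = region_to_pixel_features(features, boxes)
    # Two "models" with different (random) classifier weights, 3 classes each.
    scores = [
        classify_pixels(pixel_feats, rng.normal(size=(3, 8)), np.zeros(3))
        for _ in range(2)
    ]
    print(fuse_and_label(scores))  # (6, 6) array of class indices
```

Averaging scores is only one possible fusion rule; the abstract states that fusion happens at the classification layer but does not specify the operation, so a learned or weighted combination would fit the description equally well.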