The invention relates to a deep-learning-based method for sound event detection (SED) and localization. The method comprises the following steps: step 1, segmenting a data set; step 2, preprocessing, namely performing feature extraction on the data set containing the sound signals to obtain Log-Mel spectrograms and GCC-PHAT (generalized cross-correlation with phase transform) features; step 3, constructing a deep learning model, namely building a network architecture that combines a ResNet-style backbone with an RNN, with pooling, regularization and normalization modules inserted between layers to optimize feature extraction and improve nonlinearity; and step 4, two-step training, namely first training the SED task to obtain an optimal model and feeding its training result into the direction-of-arrival (DOA) task as a feature, and then training the DOA task to obtain the final optimal model. Because the method first extracts features suited to the tasks, robustness to reverberation is improved; the new network architecture addresses the loss of accuracy caused by deepening the network; and the prediction accuracy is ultimately improved.
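By way of illustration only, the GCC-PHAT feature named in step 2 can be sketched for one microphone pair as follows. This is a minimal sketch of the standard GCC-PHAT computation, not the specific implementation of the invention: the function name `gcc_phat`, the parameter `max_lag`, the stabilizing constant and the test signal are all assumptions introduced here, and the abstract does not state frame lengths, lag ranges or channel pairings.

```python
import numpy as np


def gcc_phat(x, y, max_lag):
    """GCC-PHAT of two 1-D signals for lags -max_lag..max_lag (in samples).

    Hypothetical helper for illustration; a positive peak lag means that
    x is a delayed copy of y.
    """
    n = len(x) + len(y)              # zero-pad so the correlation is linear, not circular
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)           # cross-power spectrum
    cross /= np.abs(cross) + 1e-12   # PHAT weighting: discard magnitude, keep phase
    cc = np.fft.irfft(cross, n=n)    # back to the lag domain
    # Negative lags sit at the end of cc; move them in front of lag 0.
    return np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))


# Usage: a noise signal and a copy delayed by an assumed 5 samples.
rng = np.random.default_rng(0)
reference = rng.standard_normal(1024)
delayed = np.concatenate((np.zeros(5), reference[:-5]))

features = gcc_phat(delayed, reference, max_lag=16)
estimated_delay = int(np.argmax(features)) - 16  # convert index to lag in samples
```

Because the PHAT weighting keeps only the phase of the cross-power spectrum, the correlation peak remains sharp regardless of the spectral shape of the source, which is one commonly cited reason for pairing this feature with the Log-Mel spectrogram when localizing sources in reverberant rooms.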