The invention discloses a speech emotion recognition method and device based on domain-adversarial learning. The method comprises the following steps: (1) obtaining a speech emotion database and dividing it into a source domain database and a target domain database; (2) for each speech signal, extracting IS10 features as global features; (3) dividing the speech signal in time into a plurality of short segments, each overlapping its preceding and following neighbors by 50%, and extracting the IS10 features of each short segment; (4) inputting the IS10 features of all the short segments into a bidirectional long short-term memory (LSTM) model, then into an attention mechanism model, and taking the output as local features; (5) concatenating the global features and the local features to form joint features; (6) establishing a neural network comprising a domain discriminator and an emotion classifier; (7) training the neural network, wherein the total loss of the network is obtained by subtracting the loss of the domain discriminator from the loss of the emotion classifier; and (8) obtaining the joint features of the speech signal to be recognized and inputting them into the trained neural network to obtain a predicted emotion category. The method yields more accurate recognition results.
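The following is a minimal sketch of steps (3), (4), (5) and (7), offered only as an illustration and not as the implementation of the invention. It assumes NumPy, omits IS10 feature extraction and the bidirectional LSTM, and uses a simple softmax-weighted sum as a stand-in for the attention mechanism, whose exact form is not given above. All function names and the scoring vector `w` are hypothetical.

```python
import numpy as np


def split_overlapping(signal, seg_len):
    """Step (3): cut a 1-D signal into segments of seg_len samples,
    each overlapping its neighbors by 50% (hop = seg_len // 2)."""
    signal = np.asarray(signal, dtype=float)
    if seg_len < 2 or signal.size < seg_len:
        raise ValueError("seg_len must be >= 2 and no longer than the signal")
    hop = seg_len // 2
    starts = range(0, signal.size - seg_len + 1, hop)
    return np.stack([signal[s:s + seg_len] for s in starts])


def attention_pool(hidden, w):
    """Step (4), attention part only (assumed form): score each segment
    vector with w, softmax the scores, and return the weighted sum.
    `hidden` has shape (num_segments, dim); in the described method it
    would be the bidirectional LSTM output, which is omitted here."""
    scores = hidden @ w
    scores = scores - scores.max()            # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha @ hidden                     # shape (dim,)


def joint_feature(global_feat, local_feat):
    """Step (5): concatenate global and local features."""
    return np.concatenate([global_feat, local_feat])


def total_loss(emotion_loss, domain_loss):
    """Step (7): emotion classifier loss minus domain discriminator loss."""
    return emotion_loss - domain_loss


# Usage with toy values (not real speech data).
segments = split_overlapping(np.arange(10), seg_len=4)
hidden = np.array([[1.0, 2.0], [3.0, 4.0]])   # placeholder segment vectors
local = attention_pool(hidden, w=np.zeros(2))
joint = joint_feature(np.ones(3), local)
loss = total_loss(emotion_loss=1.5, domain_loss=0.5)
```

Because the domain loss enters with a negative sign, minimizing the total loss favors features on which the domain discriminator performs poorly, that is, features that differ little between the source and target domains. How the discriminator itself is updated is not specified above; a gradient reversal layer is one common choice in domain-adversarial training.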