The invention discloses a system for reducing VOIP (voice over internet protocol) communication resource overhead, comprising an input layer, a convolution layer, a sampling sub-layer and an output layer, each layer being composed of a characteristic spectrum, each characteristic spectrum containing nerve cells; a method of using the system to reduce VOIP communication resource overhead includes specifically: 1, training a convolutional neural network; 2, initializing the convolutional neural network; 3, inputting voice to be measured into a VAD (voice activity detection) system; 4, extracting voice characteristic parameter MFCC and its first-order differential characteristic parameter from each frame in order; 5, composing the parameters of each frame into a one-dimensional characteristic map taken into the convolutional neural network system; 6, the convolutional neural network system outputting in order a result [x, y] of each frame of the voice to be detected, and the VAD system making judgment and recording the results. The system and method have the advantages that the convolutional neural network system is used in the VAD system for detecting, the misjudgment rate of the VAD system is reduced, calculation time and bandwidth are saved, and VOIP voice resource overhead can be reduced at the premise of ensuring communication quality.