The invention discloses a structured-pruning compression optimization method for the convolutional layers of a neural network. The method comprises the following steps: (1) sparse-value allocation for each convolutional layer: (1.1) training the original model, obtaining the weight parameters of each prunable convolutional layer, and computing from them an importance score for each convolutional layer; (1.2) sorting the layers by importance score in ascending order, dividing the range between the minimum and maximum scores into equal-width segments, assigning sparse values in ascending order to the convolutional layers of the successive segments, and obtaining, through model training and adjustment, the sparse-value configuration of all prunable convolutional layers; (2) structured pruning: selecting convolution filters according to the sparse values determined in step (1.2) and carrying out structured pruning training, wherein only one convolution filter granularity is used for each convolutional layer. With the optimization method provided by the invention, a deep neural network can be run more conveniently on a resource-limited platform, so that parameter storage space is saved and model inference is accelerated.