[0049] Example
[0050] This embodiment discloses a gender recognition method based on an integrated convolutional neural network. As shown in Figure 1, the steps are as follows:
[0051] S1. First, randomly combine the data sets to form several new training data sets, then select M convolutional neural network classifiers trained on these new training data sets as base classifiers: the first base classifier, the second base classifier, ..., the M-th base classifier. As shown in Figure 2, the base classifiers in this step are obtained as follows:
[0052] S11. Select a benchmark data set and several auxiliary data sets, and divide the benchmark data set into a benchmark training data set and a benchmark test data set. As shown in Figure 3, this embodiment selects the Feret data set as the benchmark data set, with the Adience data set and the AR data set as auxiliary data sets; the Feret data set is divided into a benchmark training data set and a benchmark test data set at a ratio of 4:1.
[0053] S12. Randomly combine the auxiliary data sets and add them to the benchmark training data set to form several new training data sets. In this embodiment, adding the auxiliary Adience and AR data sets to the Feret benchmark training data set yields four new training data sets: Feret; Feret + Adience; Feret + AR; and Feret + Adience + AR.
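The combination in step S12 can be sketched as follows. This is a minimal illustration, not the embodiment's actual code: the data-set names stand in for collections of labeled images, and every subset of the auxiliary sets (including the empty one) is appended to the benchmark training set.

```python
from itertools import combinations

def new_training_sets(benchmark, auxiliaries):
    """Form every training set obtained by adding a subset of the
    auxiliary data sets to the benchmark training data set."""
    sets = []
    for r in range(len(auxiliaries) + 1):
        for combo in combinations(auxiliaries, r):
            sets.append((benchmark,) + combo)
    return sets

# With two auxiliary sets (Adience, AR) this yields the four
# combinations used in the embodiment.
combos = new_training_sets("Feret", ["Adience", "AR"])
print(combos)
# [('Feret',), ('Feret', 'Adience'), ('Feret', 'AR'), ('Feret', 'Adience', 'AR')]
```

With n auxiliary sets this produces 2^n training sets, matching the four sets obtained here from two auxiliaries.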
[0054] S13. Randomly generate a certain number of convolutional neural network models. In this embodiment, two models are generated: the five-convolutional-layer model G_CNNS_5 and the six-convolutional-layer model G_CNNS_6.
[0055] S14. Use the new training data sets obtained in step S12 to train each convolutional neural network model generated in step S13, obtaining multiple convolutional neural network classifiers. In this embodiment, the four data sets obtained in step S12 are each used to train the two models generated in step S13, yielding eight convolutional neural network classifiers; each uses a SoftMax output classifier.
[0056] S15. Compute the recognition accuracy of each convolutional neural network classifier obtained in step S14 on the benchmark test data set. In this embodiment, the recognition accuracy of the eight classifiers obtained in step S14 is computed on the benchmark test data set drawn from the Feret data set.
[0057] S16. Select the M convolutional neural network classifiers with the highest recognition accuracy as base classifiers. In this embodiment M is 5, that is, the 5 classifiers ranking highest in recognition accuracy among the 8 obtained in this step are selected, giving 5 base classifiers.
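The selection in steps S15 and S16 amounts to ranking classifiers by benchmark-test accuracy and keeping the top M. A sketch, with hypothetical classifier names and accuracy values used purely for illustration:

```python
def select_base_classifiers(classifiers, accuracies, m):
    """Return the m classifiers with the highest benchmark accuracy."""
    ranked = sorted(zip(classifiers, accuracies),
                    key=lambda pair: pair[1], reverse=True)
    return [clf for clf, _ in ranked[:m]]

# Hypothetical accuracies for the 8 trained classifiers (illustrative only).
names = [f"clf{i}" for i in range(1, 9)]
accs = [0.91, 0.95, 0.89, 0.97, 0.93, 0.90, 0.96, 0.94]
print(select_base_classifiers(names, accs, 5))
# ['clf4', 'clf7', 'clf2', 'clf8', 'clf5']
```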
[0058] S2. Obtain a face image to be tested.
[0059] S3. During testing, input the face image to be tested into the M base classifiers obtained in step S1 to obtain their output categories X1, X2, ..., XM, then fuse the gender categories output by the M base classifiers into a single final gender category.
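The simple voting fusion used in this embodiment can be sketched as a majority vote over the M category outputs (with M odd, as here, ties over two gender categories cannot occur):

```python
from collections import Counter

def fuse_by_vote(categories):
    """Simple voting fusion: return the gender category predicted
    by the majority of the base classifiers."""
    return Counter(categories).most_common(1)[0][0]

# Example: 3 of 5 base classifiers output "male".
print(fuse_by_vote(["male", "male", "female", "male", "female"]))  # male
```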
[0060] The network tuning parameters of each randomly generated convolutional neural network model in this embodiment may be configured as shown in Table 1 below.
[0061] Table 1
[0062]
Meta-parameter | Parameter value | Remarks
max_iter | 50000 | Training stops after 50,000 iterations
base_lr | 0.001 | Learning rate set to 0.001
solver_mode | GPU | Training is performed on the GPU
weight_decay | 0.0005 | Weight decay of 0.0005 to prevent overfitting
snapshot | 10000 | A snapshot of the prediction model is saved every 10,000 iterations
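The meta-parameters in Table 1 correspond to Caffe solver settings; a solver.prototxt fragment with these values might look as follows (the `net` path and `snapshot_prefix` are placeholders, not taken from the embodiment):

```
net: "g_cnns_6_train_val.prototxt"  # placeholder path to the network definition
max_iter: 50000                     # stop training after 50,000 iterations
base_lr: 0.001                      # initial learning rate
solver_mode: GPU                    # train on the GPU
weight_decay: 0.0005                # weight decay to curb overfitting
snapshot: 10000                     # save a model snapshot every 10,000 iterations
snapshot_prefix: "g_cnns_6"         # placeholder prefix for snapshot files
```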
[0063] As shown in Figure 4, the layers of the six-convolutional-layer model G_CNNS_6 in step S13 of this embodiment are connected in sequence from the input layer to the output layer as follows:
[0064] The first data layer, Date21: this layer configures the training set, specifying LMDB (memory-mapped database) as the input data format, a uniform image size of 227*227, a batch size of 128, and the source path of the input training set.
[0065] The second data layer, Date22: this layer configures the validation set, likewise in LMDB format, with a uniform image size of 227*227, a batch size of 50, and the file directory source of the input validation set.
[0066] The first convolutional layer, conv21: the initial bias is 0 and the weights are initialized by a Gaussian filler with variance 0.01; the convolution kernel size is 11*11 and the number of kernels is 96, giving a 55*55 feature map after convolution. The kernel learning rate is 1 with decay factor 1; the bias learning rate is 2 with decay factor 0.
[0067] The first downsampling layer, P21: this layer downsamples with the Max_pooling algorithm, using a 3*3 pooling region and a stride of 2; after downsampling the feature map size becomes 27*27.
[0068] The first LRN (Local Response Normalization) layer, L21: this layer normalizes with LRN in the default ACROSS_CHANNELS mode, which extends the local region across adjacent channels while leaving the spatial extent unchanged. The number of channels summed over, local_size, is set to 5, and the scale parameter alpha to 0.0001.
[0069] The second convolutional layer, conv22: the kernel size is 5*5, the number of kernels is 256, and the edge padding is 2. In this layer group is set to 2, which partitions the input and output channels into groups so that output channels connect only to input channels of the same group.
[0070] The second downsampling layer, P22: downsamples with the Max_pooling algorithm, using a 3*3 pooling region and a stride of 2; after downsampling the feature map size becomes 13*13.
[0071] The second LRN layer, L22: normalizes with LRN in the default ACROSS_CHANNELS mode, with the same parameter values as the first LRN layer L21.
[0072] The third convolutional layer, conv23: the kernel size is 3*3, the number of kernels is 384, and the edge padding is 1. The kernel learning rate is set to 1 with decay factor 1; the bias learning rate is 2 with decay factor 0.
[0073] The fourth convolutional layer, conv24: the kernel size is 3*3, the number of kernels is 384, and the edge padding is 1; group is set to 2 to group the input and output channels.
[0074] The fifth convolutional layer, conv25: the kernel size is 3*3, the number of kernels is 256, and the edge padding is 1; group is set to 2 to group the input and output channels.
[0075] The sixth convolutional layer, conv26: the kernel size is 3*3, the number of kernels is 256, and the edge padding is 1; group is set to 2 to group the input and output channels.
[0076] The sixth downsampling layer, P26: downsamples with the Max_pooling algorithm, using a 3*3 pooling region and a stride of 2; after downsampling the feature map size becomes 6*6.
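The feature-map sizes quoted above (55*55, 27*27, 13*13, 6*6) can be checked with the standard convolution output-size formula. A stride of 4 for conv21 and floor rounding for the pooling layers are assumptions inferred from those sizes, since the text does not state them explicitly:

```python
def out_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * pad - kernel) // stride + 1

s = out_size(227, 11, stride=4)  # conv21, 11*11 kernel -> 55
assert s == 55
s = out_size(s, 3, stride=2)     # P21 max-pooling, 3*3, stride 2 -> 27
assert s == 27
s = out_size(s, 5, pad=2)        # conv22, 5*5 kernel, padding 2 -> 27
s = out_size(s, 3, stride=2)     # P22 max-pooling -> 13
assert s == 13
for _ in range(4):               # conv23..conv26, 3*3 kernels, padding 1: size unchanged
    s = out_size(s, 3, pad=1)
s = out_size(s, 3, stride=2)     # P26 max-pooling -> 6
assert s == 6
```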
[0077] The first fully connected layer, Q21: outputs 4096 units.
[0078] The first Dropout (random drop) layer, D21: output values are set to 0 with a given probability, which mitigates overfitting during training.
[0079] The second fully connected layer, Q22: outputs 4096 units.
[0080] And the second Dropout layer, D22: output values are set to 0 with a given probability, which mitigates overfitting during training.
[0081] Among these layers, the first through sixth convolutional layers conv21, conv22, conv23, conv24, conv25, and conv26 and the first and second fully connected layers Q21 and Q22 are each followed by a corresponding activation function layer: the first activation function layer ReLU21, the second ReLU22, the third ReLU23, the fourth ReLU24, the fifth ReLU25, the sixth ReLU26, the seventh ReLU27, and the eighth ReLU28, respectively. Each of these activation function layers applies the ReLU (rectified linear unit) activation function to activate the neurons of the layer it is connected to.
[0082] The five-convolutional-layer model G_CNNS_5 used in step S13 of this embodiment differs in structure from the six-convolutional-layer model G_CNNS_6 described above only in the number of convolutional layers; everything else is identical. In this embodiment, the layers of G_CNNS_5 are connected in sequence from the input layer to the output layer as: the first data layer Date11, the second data layer Date12, the first convolutional layer conv11, the first downsampling layer P11, the first LRN layer L11, the second convolutional layer conv12, the second downsampling layer P12, the second LRN layer L12, the third convolutional layer conv13, the fourth convolutional layer conv14, the fifth convolutional layer conv15, the fifth downsampling layer P15, the first fully connected layer Q11, the first Dropout layer D11, the second fully connected layer Q12, and the second Dropout layer D12. The convolutional layers conv11 through conv15 and the fully connected layers Q11 and Q12 are followed, respectively, by the first through seventh activation function layers ReLU11, ReLU12, ReLU13, ReLU14, ReLU15, ReLU16, and ReLU17. The corresponding layers of the five-convolutional-layer G_CNNS_5 and the six-convolutional-layer G_CNNS_6 share the same parameters.
[0083] In this embodiment, the five base classifiers selected in step S16 are the first base classifier CNN1, the second base classifier CNN2, the third base classifier CNN3, the fourth base classifier CNN4, and the fifth base classifier CNN5. As shown in Table 2 below, CNN1 is the six-convolutional-layer model G_CNNS_6 trained on the Feret data set; CNN2 is the five-convolutional-layer model G_CNNS_5 trained on the Feret data set + AR data set; CNN3 is G_CNNS_6 trained on the Feret data set + AR data set; CNN4 is G_CNNS_5 trained on the Feret data set + Adience data set; and CNN5 is G_CNNS_5 trained on the Feret data set + Adience data set + AR data set. The face image to be tested is input to each of the five base classifiers CNN1 through CNN5; the simple voting fusion method then fuses the gender categories output by the five base classifiers, voting on their recognition results to obtain a single final gender category, as shown in Table 2 below. The gender recognition accuracy of the method in this embodiment reaches 97.82%.
[0084] Table 2