Filter pruning method based on layer-by-layer forward recursion of multi-layer channel joint metric
By employing a multi-channel joint metric method with layer-by-layer forward recursion, combined with pruning rate and knowledge distillation, the importance of filters is accurately evaluated and filters are pruned. This solves the problem of high computational complexity of deep convolutional neural networks on edge devices and enables efficient deployment of lightweight models.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHENZHEN UNIV
- Filing Date
- 2026-02-04
- Publication Date
- 2026-06-19
AI Technical Summary
When existing deep convolutional neural networks are deployed on edge devices, their high computational complexity makes it difficult to slim them down without degrading model performance.
A multi-channel joint metric method with layer-by-layer forward recursion is adopted. By constructing a multi-channel joint metric model, combining pruning rate and knowledge distillation, filter pruning is performed, the importance of filters is accurately evaluated and a pruning mask matrix is generated, and the model is fine-tuned after pruning.
While maintaining model performance, computational complexity and the number of parameters are significantly reduced, enabling the deployment of lightweight models and improving the computational efficiency of edge devices.
Figure CN122242594A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of deep learning technology, and more specifically to a filter pruning method based on a multi-layer channel joint metric with layer-by-layer forward recursion.
Background Technology
[0002] In recent years, Deep Convolutional Neural Networks (DCNNs) and their derivative technologies have continued to generate a research boom in the field of artificial intelligence, constantly pushing the performance limits in core vision tasks such as image classification, object detection, and image segmentation. With the rapid development of technologies such as the Internet of Things (IoT), autonomous driving, and smart terminals, the intelligent upgrade of edge devices has created an urgent need for the lightweight deployment of deep learning models.
[0003] However, high-performance network models often come with a large number of parameters and high computational complexity, making them difficult to run efficiently on edge devices (such as embedded chips and mobile terminals) with limited computing power and storage capacity. Therefore, how to reduce computational complexity without affecting model performance has become an urgent problem to be solved.
Summary of the Invention
[0004] This invention provides a filter pruning method based on multi-channel joint metric with layer-by-layer forward recursion, in order to solve the problem of how to reduce computational complexity without affecting model performance.
[0005] In a first aspect, the present invention provides a filter pruning method based on a multi-channel joint metric using a layer-by-layer forward recursion, the method comprising:
[0006] Using the information receiving channels of the previous-layer filters, the information receiving and output channels of the current-layer filters, the information receiving channels of the next-layer filters, and the sparsity index of the information channels in the deep convolutional network to be pruned, constructing a multi-layer channel joint metric model with layer-by-layer forward recursion; setting the pruning rate of each convolutional layer using the BN layer parameters; combining the pruning rate and the multi-layer channel joint metric model to construct a dynamic pruning framework with layer-by-layer forward recursion, calculating the pruning mask matrix, and performing filter pruning; and fine-tuning the pruned deep convolutional neural network in combination with knowledge distillation.
[0007] This invention combines all information from the filters in the previous, current, and next layers of the deep convolutional network to be pruned, introduces an information channel sparsity index, and constructs a multi-channel joint metric model that iteratively advances layer by layer to accurately evaluate filter importance. By combining the pruning rate and the multi-channel joint metric model, a dynamic pruning framework that iteratively advances layer by layer is constructed to prune filters, eliminating the interference caused by pruned filters on the measurement of filter importance. Knowledge distillation is used to fine-tune the model, reducing the performance degradation caused by model pruning, and obtaining a high-performance lightweight model, thus realizing the pruning of deep convolutional neural network structures.
[0008] In one optional implementation, constructing the multi-layer channel joint metric model from the information receiving channels of the previous-layer filters, the information receiving and output channels of the current-layer filters, the information receiving channels of the next-layer filters, and the sparsity index of the information channels in the deep convolutional network to be pruned includes: defining the pruning mask for each layer of filters; measuring the information reception capability of the current-layer filter; measuring the information output capability of the current-layer filter; measuring the sparsity index of the information channels; and constructing the multi-layer channel joint metric model with layer-by-layer forward recursion.
[0009] This invention considers that the importance of a filter in a network is jointly determined by its information reception and information output capabilities. It measures the information reception and information output capabilities of the current layer filter, measures the sparsity index of the information channel, and reflects whether the information channel is suitable for the pruning process. It integrates the information reception capability, information output capability, and sparsity index of the information channel as a measure of the filter's importance, and constructs a multi-layer channel joint measurement model that is progressively forward-recursively applied to achieve a comprehensive and accurate assessment of importance.
[0010] In one optional implementation, the sparsity index of the information channel includes an information receiving channel sparsity index and an information output channel sparsity index. Measuring the sparsity index of the information channel includes: determining the ratio of the feature extraction capability of the largest sub-channel in the information receiving channel to the overall feature extraction capability of the information receiving channel as the sparsity index of the information receiving channel; and determining the ratio of the feature extraction capability of the largest sub-channel in the information output channel to the overall feature extraction capability of the information output channel as the sparsity index of the information output channel.
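As an illustration of the ratio described above, the following minimal Python sketch computes a sparsity index from the norms of a channel's sub-channels. The function name `sparsity_index`, the use of precomputed non-negative norms as the measure of feature extraction capability, and the handling of an all-zero channel are assumptions made for this example and are not taken from the original text.

```python
def sparsity_index(subchannel_norms):
    """Ratio of the largest sub-channel norm to the sum of all sub-channel norms.

    subchannel_norms: non-negative norms, one per sub-channel (assumed input).
    """
    total = sum(subchannel_norms)
    if total == 0:
        # Assumption for this sketch: an all-zero channel gets index 0.
        return 0.0
    return max(subchannel_norms) / total


# A uniform channel versus a channel dominated by one sub-channel.
uniform = sparsity_index([1.0, 1.0, 1.0, 1.0])
peaked = sparsity_index([3.7, 0.1, 0.1, 0.1])
print(uniform, peaked)
```

Under this definition a channel whose capability is concentrated in one sub-channel scores close to 1, while a uniform channel with M sub-channels scores 1/M.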
[0011] This invention takes into account that when the sub-channels retained and removed during pruning have similar feature extraction capabilities, the overall function of the information channel is greatly affected after pruning. It therefore introduces a sparsity index for the information channel and measures it from two aspects: the sparsity index of the information receiving channel and the sparsity index of the information output channel.
[0012] In one optional implementation, a multi-level channel joint metric model with layer-by-layer forward recursion is constructed, including: For the first-layer filter, the product of the information receiving capability, the sparsity index of the information output channel, and the information output capability of the first-layer filter is determined as the importance measure of the first-layer filter. For intermediate layer filters that are neither the first nor the last layer, the product of the current layer's information reception capability, information reception channel sparsity index, information output channel sparsity index, and information output capability is determined as the importance metric for the current layer filter. For the last layer filter, the product of the sparsity index of the information receiving channel and the information receiving capability of the last layer filter is determined as the importance measure of the last layer filter. Based on the importance metrics of the first-layer filter, intermediate-layer filters (not the first layer and not the last layer), and the last-layer filter, a multi-layer channel joint metric model is constructed through layer-by-layer forward recursion.
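The three cases above can be sketched as a small Python function. This is only an illustration: the function name `filter_importance`, the 1-based layer index, and the assumption that the capability and sparsity values have already been computed elsewhere are choices made for this example.

```python
def filter_importance(layer, num_layers, recv_cap, out_cap,
                      recv_sparsity, out_sparsity):
    """Importance of one filter, following the three cases in the text.

    layer is 1-based. All capability and sparsity values are hypothetical
    inputs assumed to be computed beforehand.
    """
    if num_layers < 2 or not 1 <= layer <= num_layers:
        raise ValueError("expected num_layers >= 2 and 1 <= layer <= num_layers")
    if layer == 1:
        # First layer: receiving capability x output sparsity x output capability.
        return recv_cap * out_sparsity * out_cap
    if layer == num_layers:
        # Last layer: receiving sparsity x receiving capability.
        return recv_sparsity * recv_cap
    # Intermediate layers: product of all four terms.
    return recv_cap * recv_sparsity * out_sparsity * out_cap


values = dict(recv_cap=2.0, out_cap=3.0, recv_sparsity=0.5, out_sparsity=0.25)
print(filter_importance(1, 3, **values))  # first layer
print(filter_importance(2, 3, **values))  # intermediate layer
print(filter_importance(3, 3, **values))  # last layer
```

The first layer omits the receiving-channel sparsity and the last layer omits both output terms, exactly as in the case distinction above.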
[0013] This invention measures the importance of different filter levels separately, covering both information reception and output capabilities, while also taking into account the sparsity of information channels. This makes the importance measurement more comprehensive and achieves a balance between lightweight design and high performance.
[0014] In one optional implementation, combining the pruning rate and the multi-layer channel joint metric model to construct the dynamic pruning framework, calculating the pruning mask matrix, and performing filter pruning includes: calculating the number of filters to be pruned from each convolutional layer; initializing the pruning mask matrix of the network model; establishing a generation model for the pruning mask vector of each filter layer; generating the pruning mask vector for the first-layer filters and then for the filters of each subsequent layer to obtain the pruning mask matrix; and performing filter pruning based on the pruning mask matrix.
[0015] This invention generates pruning mask vectors in layers, recursively generates mask matrices layer by layer, and performs filter pruning based on the pruning mask matrices, thereby achieving structured pruning. This simplifies the model and reduces computational complexity while ensuring model performance.
[0016] In one optional implementation, establishing the generation model for the pruning mask vector of each filter layer includes: calculating the importance score vector of each layer of filters and sorting the importance scores in descending order; setting the pruning mask value of filters whose importance score is greater than the importance score threshold to 1, where the threshold characterizes the importance scores of the filters that need to be retained; and setting the pruning mask value of filters whose importance score is not greater than the threshold to 0.
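A minimal Python sketch of this mask generation step follows. It keeps the top-scoring filters by rank, which matches the threshold rule whenever the scores are distinct. The function name `mask_vector` and the choice to round the number of pruned filters down are assumptions of this example; the text does not specify how the count is rounded.

```python
def mask_vector(scores, pruning_rate):
    """Return a 0/1 mask that keeps the highest-scoring filters of one layer."""
    if not 0.0 <= pruning_rate < 1.0:
        raise ValueError("pruning_rate must be in [0, 1)")
    num_filters = len(scores)
    # Assumption: the number of pruned filters is rounded down.
    num_pruned = int(pruning_rate * num_filters)
    num_kept = num_filters - num_pruned
    # Filter indices sorted by importance score, highest first.
    ranked = sorted(range(num_filters), key=lambda i: scores[i], reverse=True)
    mask = [0] * num_filters
    for i in ranked[:num_kept]:
        mask[i] = 1
    return mask


print(mask_vector([0.9, 0.1, 0.5, 0.3], pruning_rate=0.5))  # [1, 0, 1, 0]
```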
[0017] This invention clarifies the pruning mask value of the filter by using the importance score of each layer of the filter, providing a reliable data basis for pruning decisions and avoiding the erroneous pruning of core filters or the retention of invalid and redundant filters.
[0018] In one alternative implementation, filter pruning is performed based on the pruning mask matrix, including: When the pruning mask matrix value is 1, the corresponding filter is retained; When the pruning mask matrix value is 0, the corresponding filter is removed.
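The following sketch illustrates applying one layer's mask vector, under the assumption that a layer is stored as a list of filters and each filter as a list of kernels (one kernel per input). The function name `prune_layer` and this nested-list layout are hypothetical. Removing a filter also removes the kernels in the next layer that consumed its output, consistent with the description's statement that a pruned filter's information receiving and output channels are both removed.

```python
def prune_layer(current, following, mask):
    """Apply a 0/1 mask to one layer and slim the next layer accordingly.

    current:   list of filters; each filter is a list of kernels.
    following: the next layer in the same layout, or None for the last layer.
    mask:      one 0/1 entry per filter in `current`.
    """
    if len(mask) != len(current):
        raise ValueError("mask length must equal the number of filters")
    # Mask value 1 retains the filter, 0 removes it.
    kept = [filt for filt, m in zip(current, mask) if m == 1]
    if following is None:
        return kept, None
    # Each next-layer filter drops the kernels fed by removed filters.
    slimmed = [[k for k, m in zip(filt, mask) if m == 1] for filt in following]
    return kept, slimmed


current = [["a0"], ["a1"], ["a2"]]
following = [["b00", "b01", "b02"], ["b10", "b11", "b12"]]
kept, slimmed = prune_layer(current, following, [1, 0, 1])
print(kept)     # [['a0'], ['a2']]
print(slimmed)  # [['b00', 'b02'], ['b10', 'b12']]
```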
[0019] This invention reduces the complexity of pruning and improves the efficiency of model simplification by retaining or removing corresponding filters based on the pruning mask matrix value.
[0020] In a second aspect, the present invention provides a filter pruning device based on a multi-layer channel joint metric with layer-by-layer forward recursion, the device comprising: a model building module, configured to construct a multi-layer channel joint metric model with layer-by-layer forward recursion by utilizing the information receiving channels of the previous-layer filters, the information receiving and output channels of the current-layer filters, the information receiving channels of the next-layer filters, and the sparsity index of the information channels in the deep convolutional network to be pruned; a pruning rate setting module, configured to set the pruning rate of each convolutional layer using the BN layer parameters; a pruning module, configured to combine the pruning rate and the multi-layer channel joint metric model to construct a dynamic pruning framework with layer-by-layer forward recursion, calculate the pruning mask matrix, and perform filter pruning; and a fine-tuning module, configured to fine-tune the pruned deep convolutional neural network in combination with knowledge distillation.
[0021] In a third aspect, the present invention provides an electronic device, comprising a memory and a processor, wherein the memory and the processor are communicatively connected to each other, the memory stores computer instructions, and the processor executes the computer instructions to perform the filter pruning method based on a multi-layer channel joint metric with layer-by-layer forward recursion according to the first aspect or any corresponding embodiment thereof.
[0022] In a fourth aspect, the present invention provides a computer-readable storage medium storing computer instructions for causing a computer to execute the filter pruning method based on a multi-layer channel joint metric with layer-by-layer forward recursion according to the first aspect or any corresponding embodiment thereof.
Brief Description of the Drawings
[0023] To more clearly illustrate the specific embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the specific embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.
[0024] Figure 1 is a flowchart of a filter pruning method based on a multi-layer channel joint metric with layer-by-layer forward recursion according to an embodiment of the present invention; Figure 2 is a schematic diagram of the information flow of three adjacent convolutional layers according to an embodiment of the present invention; Figure 3 is a schematic diagram of a dynamic pruning framework with layer-by-layer forward recursion according to an embodiment of the present invention; Figure 4 is a structural block diagram of a filter pruning device based on a multi-layer channel joint metric with layer-by-layer forward recursion according to an embodiment of the present invention; Figure 5 is a schematic diagram of the hardware structure of an electronic device according to an embodiment of the present invention.
Detailed Implementation
[0025] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0026] It is understood that before using the technical solutions disclosed in the various embodiments of the present invention, users should be informed of the types, scope of use, and usage scenarios of the personal information involved in the present invention and their authorization should be obtained in accordance with relevant laws and regulations through appropriate means.
[0027] In intelligent application scenarios, deep neural network models are key to enhancing the intelligence of terminal devices. However, due to the large number of parameters, these models pose deployment challenges for devices with limited resources. They not only require a large amount of storage and computing resources but also increase energy consumption, affecting the long-term operation of the devices.
[0028] Model compression technology is a core solution that emerged to address this contradiction. This technology can significantly reduce the number of parameters and computational complexity of a model while maintaining its predictive accuracy, ultimately resulting in a lightweight model suitable for edge computing. By reducing the number and size of parameters, model compression effectively reduces computational complexity without compromising performance. This makes models easier to deploy on terminal devices, improves efficiency, and promotes the expansion of intelligent technology applications. Advances in this technology are of great significance in promoting the widespread application of artificial intelligence.
[0029] Currently, mainstream methods for model compression include parameter quantization, knowledge distillation, low-rank decomposition, and network pruning.
[0030] Network pruning, a commonly used method in model compression, removes redundant parameters from the network by constructing specific evaluation criteria, thereby reducing the model size. Based on the pruning granularity, network pruning can be broadly classified into four categories: weight pruning, convolutional kernel pruning, filter pruning, and layer pruning.
[0031] Weight pruning is called unstructured pruning, while kernel pruning, filter pruning, and layer pruning all fall under the category of structured pruning. Weight pruning first evaluates the importance of the parameters in the model, then sets the unimportant parameters to zero, performing pruning operations at a very fine level to generate a highly sparse parameter matrix.
[0032] Convolutional kernel pruning uses two-dimensional convolutional kernels as the basic unit and assigns zero values to unimportant kernel parameters according to predefined evaluation criteria; however, these parameters are still retained in the model. Because layer pruning has a significant impact on network architecture, it is rarely used directly in practical applications.
[0033] Filter pruning achieves model compression by directly removing redundant filters from convolutional layers. Pruned networks are easier to optimize using existing computing architectures, making filter pruning an important direction in the field of network pruning.
[0034] The core issue in filter pruning is identifying redundant or unimportant filters that need to be pruned; therefore, evaluating the importance of filters is crucial. Existing filter metrics include evaluating importance based on the L1 norm and the average rank of the activation maps. However, most existing metrics treat convolutional layers or feature maps as independent structures, neglecting inter-layer dependencies. Alternatively, they may only use unidirectional information, such as using statistical information from the next layer to guide the pruning method for the current layer, evaluating importance based on the filter's contribution to the next layer, or using the activation map of the previous layer to guide the pruning method for the current layer.
[0035] To address the aforementioned issues, this invention provides a filter pruning method based on multi-channel joint metric using layer-by-layer forward recursion. This method significantly reduces the number of parameters and floating-point operations in deep convolutional neural network models while maintaining or even improving model performance.
[0036] According to an embodiment of the present invention, a filter pruning method based on a multi-layer channel joint metric with layer-by-layer forward recursion is provided. It should be noted that the steps shown in the flowchart in the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions. Furthermore, although a logical order is shown in the flowchart, in some cases the steps shown or described may be executed in a different order than that shown here.
[0037] This embodiment provides a filter pruning method based on a multi-layer channel joint metric with layer-by-layer forward recursion. Figure 1 is a flowchart of this method according to an embodiment of the present invention. As shown in Figure 1, the process includes the following steps: Step S101: Using the information receiving channels of the previous-layer filters, the information receiving and output channels of the current-layer filters, the information receiving channels of the next-layer filters, and the sparsity index of the information channels in the deep convolutional network to be pruned, a multi-layer channel joint metric model with layer-by-layer forward recursion is constructed.
[0038] In this embodiment of the invention, before constructing the multi-layer channel joint metric model, a deep convolutional neural network is pre-trained to obtain the network model to be pruned. Candidate network models include single-branch networks such as VGGNet, multi-branch residual networks such as ResNet, and multi-branch densely connected networks such as DenseNet. These network models can be used for computer vision tasks such as image classification and object detection. An existing deep learning training framework, such as PyTorch, is used to train the selected network model to obtain the original model, which also serves as the teacher model for knowledge distillation.
[0039] In a DCNN, the filters in the current layer both receive and integrate information from the previous layer and output information to the next layer. As shown in Figure 2, layer $l$ is the current layer and has $N_l$ filters, denoted as $F^l = \{F_1^l, F_2^l, \dots, F_{N_l}^l\}$. The $n$-th filter of layer $l$ can be represented as $F_n^l = \{K_{n,1}^l, K_{n,2}^l, \dots, K_{n,M_l}^l\}$, where $K_{n,m}^l$ denotes the $m$-th convolutional kernel of the $n$-th filter of layer $l$, and $M_l$ is the number of inputs to layer $l$, satisfying $M_l = N_{l-1}$. In fact, all filters other than those in the first and last layers have two information channels: an information receiving channel (i.e., $R_n^l$) and an information output channel (i.e., $O_n^l$). During the pruning operation, when a filter is pruned, its corresponding information receiving channel and information output channel are removed.
[0040] The information receiving channel is responsible for receiving and integrating the feature information from the previous layer and generating new feature information. As shown in Figure 2, the information receiving channel of a layer-$l$ filter consists of a set of sub-channels in layer $l$. Each sub-channel is implemented by a convolutional kernel and is responsible for receiving feature information from the previous layer. Layer $l$ therefore has $N_l$ information receiving channels, which can be represented as $R^l = \{R_1^l, R_2^l, \dots, R_{N_l}^l\}$. Consistent with the filter representation, the $n$-th information receiving channel of layer $l$ can also be represented as $R_n^l = \{R_{n,1}^l, R_{n,2}^l, \dots, R_{n,M_l}^l\}$, where $R_{n,m}^l$ denotes the $m$-th sub-channel of the $n$-th filter of layer $l$.
[0041] The information output channel is responsible for distributing the feature information generated by the current-layer filter to the information receiving channels of the next-layer filters, thus participating in the generation of the next layer's feature information. As shown in Figure 2, the information output channel of layer $l$ distributes the feature information generated by a layer-$l$ filter and consists of a set of sub-channels in layer $l+1$, which transmit the feature information generated by that filter. The $n$-th information output channel of layer $l$ can therefore be represented as $O_n^l = \{O_{n,1}^l, O_{n,2}^l, \dots, O_{n,N_{l+1}}^l\}$, where $O_{n,k}^l$ denotes the sub-channel through which the $k$-th information receiving channel of layer $l+1$ receives the feature information generated by the $n$-th filter of layer $l$.
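To make the two channel views concrete, the sketch below indexes a toy pair of adjacent layers. It assumes each layer is stored as a list of filters and each filter as a list of kernels, one per filter of the preceding layer; the function names and the string labels standing in for kernels are hypothetical.

```python
def receiving_channel(layer_weights, n):
    """Sub-channels (kernels) of filter n in the current layer."""
    return list(layer_weights[n])


def output_channel(next_layer_weights, n):
    """Kernels in the next layer that consume the feature map of filter n."""
    return [filt[n] for filt in next_layer_weights]


# Current layer: 3 filters, each with 2 kernels (2 inputs from the previous layer).
layer_l = [["K_l[0,0]", "K_l[0,1]"],
           ["K_l[1,0]", "K_l[1,1]"],
           ["K_l[2,0]", "K_l[2,1]"]]
# Next layer: 2 filters, each with 3 kernels (one per filter of the current layer).
layer_next = [["K_next[0,0]", "K_next[0,1]", "K_next[0,2]"],
              ["K_next[1,0]", "K_next[1,1]", "K_next[1,2]"]]

print(receiving_channel(layer_l, 1))  # ['K_l[1,0]', 'K_l[1,1]']
print(output_channel(layer_next, 1))  # ['K_next[0,1]', 'K_next[1,1]']
```

In this layout the receiving channel is a row of the current layer and the output channel is a column of the next layer, which is why pruning one filter affects both layers.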
[0042] The information reception capability of the current layer filter depends not only on the feature extraction capability of the information reception channel but also on the intensity of the input information. When evaluating the information reception capability of a filter, there may be information reception channels with similar feature extraction capabilities, but the distribution of feature extraction capabilities in each sub-channel may differ. The feature intensity distribution of the input information of the current layer filter is constant. A high degree of matching between the feature extraction capability distribution of its information reception channel and the feature intensity distribution of the input information indicates a strong information reception capability of the current layer filter; conversely, a low degree of matching indicates a weak information reception capability.
[0043] The information contribution of a current-layer filter to subsequent layers depends not only on the feature allocation capability of the information output channel, but also on its significant contribution to the feature information generated by the next-layer filter, and the strength of the feature information generated by the next-layer filter. If the feature information generated by the current-layer filter is a "critical dependency" of the next layer, then the current-layer filter's information contribution to the next layer is high; conversely, if the feature information generated by the current-layer filter is "completely covered" by the feature information generated by other filters, then its information contribution is low. Simultaneously, the feature information generated by the current-layer filter is allocated to the filters of the next layer by the information output channel and passed to the downstream network. The strength of the feature information generated by the filters of the next layer determines the information contribution of the current-layer filter to subsequent layers.
[0044] In the information receiving and output channels, if the convolution kernel norm is uniformly distributed among the sub-channels, the corresponding feature mapping intensities tend to be consistent. During the pruning process of deep convolutional neural networks, due to the removal of some filters from the previous layer, the corresponding sub-channels in the information receiving channel will lose input information; simultaneously, affected by the pruning of filters in the next layer, the corresponding sub-channels in the information output channel will lose their allocated information contribution. Under the condition that the overall feature extraction capabilities of the information channels are comparable, channels with higher sparsity have a larger number of sub-channels with low mapping intensities. When pruning occurs, the information carried by the removed sub-channels is relatively less, thus reducing the degree of information loss. Conversely, for dense channels with a more uniform distribution of feature mapping intensities, the information carried by their sub-channels is more evenly distributed, and pruning will lead to a relatively more significant information loss. Therefore, by enhancing the sparsity of the information channels, the sensitivity of pruning operations to network information flow can be reduced at the structural level, thereby improving the robustness of the pruning process.
[0045] By integrating the information reception channels of the previous layer, the information reception and output channels of the current layer, and the information reception channel of the next layer in the deep convolutional network to be pruned, and combining the sparsity index of the information channels, a multi-layer channel joint metric model for evaluating the importance of filters is established.
[0046] Step S102: Set the pruning rate of the convolutional layer using the BN layer parameters.
[0047] In this embodiment of the invention, since the sensitivity of each convolutional layer in DCNN to pruning is different, it is necessary to set an appropriate pruning rate for each convolutional layer. A layer pruning sensitivity evaluation is constructed using the scaling factor and offset factor of the Batch Normalization (BN) layer. By setting the overall pruning rate, the pruning rate of each convolutional layer is allocated to confirm the pruning rate of each layer.
[0048] Step S103: Combining the pruning rate and the multi-channel joint metric model, construct a dynamic pruning framework that iteratively advances layer by layer, calculate the pruning mask matrix, and perform filter pruning.
[0049] In this embodiment of the invention, due to the inter-layer transitivity of deep convolutional neural networks, the pruned filters in the previous layer no longer contribute to the information extraction of the next layer. However, in the traditional one-time global evaluation process, the filters retained in the current layer still contain a large number of "zero-input" sub-channels corresponding to the pruned filters in their information receiving channels, which are called invalid sub-channels. Although these invalid sub-channels do not participate in effective feature extraction, they introduce redundant interference in importance evaluation, thereby affecting the accuracy of the filter information receiving capability measurement.
[0050] To avoid interference from front-layer pruning on filter importance measurement, it is necessary to dynamically update the dependencies between preceding and following layers during the pruning process, ensuring that pruned filters and their corresponding sub-channels no longer participate in the evaluation of subsequent layers. Based on this, a recursive layer-by-layer pruning strategy is adopted to construct a dynamic pruning framework that proceeds forward layer by layer. Filters at each layer are pruned according to the pruning rate and a multi-layer channel joint metric model, enabling the importance assessment of each filter layer to reflect the information flow state under the current network structure in real time. This eliminates the interference of pruning on filter importance measurement from an algorithmic perspective.
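The effect of the layer-by-layer recursion can be sketched in Python as follows. This is a simplified illustration, not the full joint metric: it assumes the importance score of a filter is the sum of the L1 norms of its receiving sub-channels, and it assumes the number of pruned filters is rounded down. The function name, the `dynamic` flag, and the data layout (`norms[l][n][m]` is the norm of sub-channel `m` of filter `n` in layer `l`) are hypothetical.

```python
def forward_recursive_masks(norms, rates, dynamic=True):
    """Generate one 0/1 mask vector per layer, front to back.

    With dynamic=True, sub-channels fed by already pruned filters of the
    previous layer are excluded from the score. With dynamic=False, every
    layer is scored on the unpruned network (one-shot evaluation).
    """
    masks = []
    active = None  # None means every input sub-channel is active
    for layer, rate in zip(norms, rates):
        scores = []
        for filt in layer:
            if active is None:
                scores.append(sum(filt))
            else:
                # Skip "zero-input" sub-channels of pruned upstream filters.
                scores.append(sum(v for v, a in zip(filt, active) if a == 1))
        num_kept = len(layer) - int(rate * len(layer))
        ranked = sorted(range(len(layer)), key=lambda i: scores[i], reverse=True)
        mask = [0] * len(layer)
        for i in ranked[:num_kept]:
            mask[i] = 1
        masks.append(mask)
        if dynamic:
            active = mask
    return masks


norms = [
    [[5.0], [1.0]],            # layer 1: two filters, one input each
    [[1.0, 9.0], [2.0, 1.0]],  # layer 2: two filters, two sub-channels each
]
print(forward_recursive_masks(norms, [0.5, 0.5], dynamic=True))   # [[1, 0], [0, 1]]
print(forward_recursive_masks(norms, [0.5, 0.5], dynamic=False))  # [[1, 0], [1, 0]]
```

In the toy data, the first filter of layer 2 looks important only because of a sub-channel fed by a filter that layer 1 has already pruned. The recursive variant ignores that invalid sub-channel and keeps the second filter instead.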
[0051] Step S104: Combine knowledge distillation to fine-tune the pruned deep convolutional neural network.
[0052] In this embodiment of the invention, during the fine-tuning stage of the network model after pruning, a strategy combining knowledge distillation and fine-tuning is adopted to organically combine the advantages of the two compression methods, resulting in better model performance and compression effect.
[0053] Knowledge distillation, a model compression technique in deep learning, aims to transfer knowledge from large, complex models (i.e., teacher models) to smaller, lightweight models (i.e., student models). The key to knowledge distillation is minimizing the difference in probability distributions between the outputs of the teacher and student models.
[0054] By setting the original model as the teacher model and the pruned model as the student model, and through fine-tuning, a lightweight network with excellent performance is finally obtained.
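One common way to express the difference between the teacher and student output distributions is the Kullback-Leibler divergence between temperature-softened softmax outputs. The sketch below shows that formulation in plain Python. The choice of KL divergence, the temperature parameter and its default value, and the function names are assumptions for illustration; the text does not state which distance or hyperparameters are used.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)  # teacher: original model
    q = softmax(student_logits, temperature)  # student: pruned model
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


teacher = [2.0, 0.5, -1.0]
print(distillation_loss(teacher, teacher))          # 0.0 for identical outputs
print(distillation_loss(teacher, [0.1, 0.2, 0.3]))  # positive for differing outputs
```

In practice this term is usually combined with the ordinary task loss during fine-tuning; the weighting between the two is a further hyperparameter not specified here.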
[0055] This embodiment provides a filter pruning method based on a layer-by-layer forward recursion multi-channel joint metric. By combining all information from the filters in the previous, current, and next layers of the deep convolutional network to be pruned, and introducing an information channel sparsity index, a layer-by-layer forward recursion multi-channel joint metric model is constructed to accurately evaluate filter importance. Combining the pruning rate and the multi-channel joint metric model, a layer-by-layer forward recursion dynamic pruning framework is constructed to prune filters, eliminating the interference caused by pruned filters on the filter importance metric. Knowledge distillation is used for model fine-tuning to reduce the performance degradation caused by model pruning, resulting in a high-performance lightweight model and achieving pruning of deep convolutional neural network structures.
[0056] This embodiment provides a filter pruning method based on multi-channel joint metric using layer-by-layer forward recursion. The process includes the following steps: Step S201: Using the information receiving channels of the previous layer filter, the information receiving and output channels of the current layer filter, the information receiving channel of the next layer filter, and the sparsity index of the information channels in the deep convolutional network to be pruned, a multi-layer channel joint metric model is constructed by forward recursion.
[0057] Specifically, step S201 includes: Step S2011: Define the pruning mask for each layer of filters; Step S2012: Measure the information reception capability of the current layer filter; Step S2013: Measure the information output capability of the current layer filter; Step S2014: Measure the sparsity index of the information channel; Step S2015: Construct a multi-layer channel joint metric model that is progressively forward-recursively applied.
[0058] In this embodiment of the invention, pruning masks for each layer of filters are defined. Taking a network model with $L$ convolutional layers as an example, the pruning mask matrix of the network model is defined as $H = \{h^{1}, h^{2}, \dots, h^{L}\}$, where the mask vector of the layer-$l$ filters is $h^{l} = [h^{l}_{1}, h^{l}_{2}, \dots, h^{l}_{N_l}]$, $h^{l}_{n} \in \{0, 1\}$, and $N_l$ is the number of filters in layer $l$.
[0059] First, the information receiving capability of the current layer filter is measured. The current layer filter integrates the input information into new feature information through its information receiving channel; the stronger the new feature information, the stronger the information receiving capability of the filter. This capability depends mainly on the feature extraction capability of the information receiving channel and on the strength of the input information. The feature extraction capability of an information receiving channel is typically quantified by its $L_p$ norm: the larger the norm of a sub-channel, the more significant the extracted features and the stronger the feature extraction capability of that sub-channel. The input information of the current layer is the output information of the previous layer, so its feature strength can be measured by the feature extraction capability of the information receiving channel of the previous layer filter: the stronger that capability, the higher the intensity of the output feature information. Therefore, the feature extraction capability of the $n$-th information receiving channel of layer $l$ is: $\lVert W^{l}_{n} \rVert_{1} = \sum_{m=1}^{C_l} \lVert W^{l}_{n,m} \rVert_{1}$ (1) where $\lVert W^{l}_{n} \rVert_{1}$ is the $L_1$ norm of the $n$-th information receiving channel of layer $l$, $W^{l}_{n,m}$ is the $m$-th sub-channel of the $n$-th information receiving channel of layer $l$, and $C_l$ is the number of inputs to the $l$-th convolutional layer.
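The $L_1$-norm measure of feature extraction capability described above can be illustrated with a short sketch. It assumes that a filter is stored as a list of 2-D kernels (one kernel per sub-channel) and that the capability of the whole receiving channel is the sum of the sub-channel $L_1$ norms; the data layout and the function names are hypothetical.

```python
def subchannel_l1(kernel):
    """L1 norm of one sub-channel, i.e. one 2-D kernel given as nested lists."""
    return sum(abs(w) for row in kernel for w in row)


def receiving_channel_capability(filter_weights):
    """Feature extraction capability of one information receiving channel.

    filter_weights: list of 2-D kernels, one per input sub-channel.
    Assumption: the channel-level value is the sum of sub-channel L1 norms.
    """
    return sum(subchannel_l1(kernel) for kernel in filter_weights)


# A filter with two 2x2 sub-channels.
example_filter = [
    [[1.0, -2.0], [0.0, 3.0]],   # sub-channel L1 norm: 6.0
    [[0.5, 0.5], [-1.0, 0.0]],   # sub-channel L1 norm: 2.0
]
capability = receiving_channel_capability(example_filter)  # 8.0
```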
[0060] For the first-layer filters, i.e. $l = 1$, the input information is the input image, so all input information is equally important and the filter's information reception capability is not affected by the input information. The information reception capability of the $n$-th filter of the first layer is therefore defined by formula (2). For filters that are not in the first layer, i.e. $1 < l \le L$, the information reception capability of the $n$-th filter of layer $l$ is defined by formula (3), which involves the pruning mask of the $m$-th filter of layer $l-1$. When the feature extraction capability distribution of the information receiving channel matches the mapping intensity distribution of the input information, the filter's information reception capability is strong. Conversely, when the two distributions do not match (i.e., high-intensity input information is received by sub-channels with low extraction capability, and weak-intensity input information is received by sub-channels with strong extraction capability), the filter's information reception capability is weak.
[0061] Secondly, the information output capability of the current layer filter is measured: the current layer filter distributes the generated feature information to the next layer filter through its information output channel. Typically, the feature extraction capability of the information output channel reflects the information contribution of the current layer filter to the next layer filter. However, the feature information generated by the next layer filter is composed of the feature information distributed by all filters in the current layer. If the feature extraction capability of the sub-channels of the information distributed by the current layer filter is weak, its distributed feature information will be replaced, resulting in a low information contribution to the next layer filter. If the feature extraction capability of the sub-channels of the information distributed by the current layer filter is strong, its distributed feature information is "key information," resulting in a high information contribution to the next layer filter.
[0062] The last-layer filters have no information output channel, so their information contribution to a next layer need not be calculated. For filters that are not in the last layer, i.e. $l \neq L$, the information contribution of the $n$-th filter of layer $l$ to the next layer is defined by formula (4), whose terms include the $i$-th sub-channel of the $n$-th information output channel of layer $l$ and the number of information receiving channels in convolutional layer $l+1$.
[0063] The $L_{1,\infty}$ norm of each information receiving channel of layer $l+1$, i.e., the feature extraction capability of the largest sub-channel in that information receiving channel, is: $\lVert W^{l+1}_{i} \rVert_{1,\infty} = \max_{m} \lVert W^{l+1}_{i,m} \rVert_{1}$ (5) where $W^{l+1}_{i,m}$ is the $m$-th sub-channel of the $i$-th information receiving channel of layer $l+1$.
[0064] If a current layer filter contributes significantly to the information of a filter in the next layer, but the feature information generated by that filter in the next layer is weak, the information will not be well transmitted to the downstream network. Conversely, when the strong feature information generated by the next layer carries the strong feature information allocated by the current layer filter, the current layer filter contributes significantly to the information of subsequent layers.
[0065] The strength of the feature information generated in the next layer is represented by the feature extraction capability of the information receiving channel of the next layer filter. Therefore, for filters that are not in the last layer, i.e. $l \neq L$, the information output capability of the $n$-th filter of layer $l$ to the next layer is defined by formula (6), in which the feature extraction capability of each information receiving channel of layer $l+1$ is obtained from formula (1) above.
[0066] This metric reflects the information contribution of the current layer filter to the next layer, and also reflects the ability of the feature information generated by the current layer filter to continue to propagate in subsequent layers.
[0067] Thirdly, the sparsity of the information channels is measured. The sparsity of an information channel reflects whether it can still effectively exercise its feature extraction capability after pruning, and whether the filter adapts well to the pruning process. For a sparse information channel, pruning readily removes sub-channels with low feature extraction capability and retains sub-channels with high feature extraction capability, so the overall function of the information channel changes little.
[0068] Finally, a metric model for the current layer filter is constructed: the current layer filter receives and integrates the feature information generated by the previous layer filter through the information receiving channel, generating new feature information. The strength of the new feature information is determined by the information receiving capability of the current layer filter. The higher the information receiving capability of the current layer filter, the stronger the new feature information. This feature information is transmitted to the next layer filter through the filter's information output channel, and the feature information generated by the next layer filter continues to be transmitted. The information output capability of the current layer filter determines the contribution of its generated feature information to the information of subsequent layers. Therefore, the importance of a filter in the network is jointly determined by the filter's information receiving capability and information output capability. During filter pruning, what is actually pruned are the information receiving channel and the information output channel; the sparsity of the information channel reflects whether the information channel is suitable for the pruning process.
[0069] The importance metric of the current layer filter is a combination of the sparsity of the information receiving channel, the information receiving capability of the filter, the information output capability of the filter, and the sparsity of the information output channel.
[0070] Because the importance of a filter in a network is jointly determined by its information reception and information output capabilities, this embodiment measures the information reception capability and the information output capability of the current layer filter, and also measures the sparsity index of the information channel, which reflects whether the information channel is suitable for the pruning process. The information reception capability, the information output capability, and the sparsity index of the information channel are integrated into the importance measure of the filter, and a multi-layer channel joint metric model with layer-by-layer forward recursion is constructed so that filter importance is evaluated accurately in all dimensions.
[0071] Specifically, the sparsity index of the information channel includes the sparsity index of the information receiving channel and the sparsity index of the information output channel. Step S2014 above includes: Step S20141: The ratio of the feature extraction capability of the largest sub-channel in the information receiving channel to the overall feature extraction capability of the information receiving channel is determined as the sparsity index of the information receiving channel. Step S20142: The ratio of the feature extraction capability of the largest sub-channel in the information output channel to the overall feature extraction capability of the information output channel is determined as the sparsity index of the information output channel.
[0072] In this embodiment of the invention, for information channels with a uniform distribution, the sub-channels retained and the sub-channels removed during pruning have similar feature extraction capabilities, so the overall function of the information channel is greatly affected by pruning.
[0073] Based on this viewpoint, a sparsity metric is introduced. The sparsity of the information receiving channels is: $S^{\mathrm{in},l}_{n} = \lVert W^{l}_{n} \rVert_{1,\infty} / \lVert W^{l}_{n} \rVert_{1}$ (7) where $\lVert W^{l}_{n} \rVert_{1}$ is the feature extraction capability of the $n$-th information receiving channel of layer $l$, and $\lVert W^{l}_{n} \rVert_{1,\infty}$ is the $L_{1,\infty}$ norm of that information receiving channel, i.e., the feature extraction capability of its largest sub-channel.
[0074] For non-last layers, the sparsity of the information output channels is: $S^{\mathrm{out},l}_{n} = \lVert \widetilde{W}^{l}_{n} \rVert_{1,\infty} / \lVert \widetilde{W}^{l}_{n} \rVert_{1}$ (8) where $\lVert \widetilde{W}^{l}_{n} \rVert_{1}$ is the feature extraction capability of the $n$-th information output channel of layer $l$: $\lVert \widetilde{W}^{l}_{n} \rVert_{1} = \sum_{i} \lVert \widetilde{W}^{l}_{n,i} \rVert_{1}$ (9) and $\lVert \widetilde{W}^{l}_{n} \rVert_{1,\infty}$ is the $L_{1,\infty}$ norm of that information output channel, i.e., the feature extraction capability of its largest sub-channel: $\lVert \widetilde{W}^{l}_{n} \rVert_{1,\infty} = \max_{i} \lVert \widetilde{W}^{l}_{n,i} \rVert_{1}$ (10) Here $\widetilde{W}^{l}_{n,i}$ denotes the $i$-th sub-channel of the $n$-th information output channel of layer $l$. This sparsity metric characterizes the norm distribution of a set of convolutional kernels. When the norms are uniformly distributed, the $L_1$ norms of the kernels are similar and the sparsity value approaches $1/C$, where $C$ is the number of input channels. Conversely, when the weights are concentrated in a small number of kernels, the sparsity metric approaches 1.
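The limiting behaviour of the sparsity metric (approaching $1/C$ for uniform norms and 1 for concentrated norms) can be checked with a small sketch. The function name is hypothetical, and returning 0.0 for an all-zero channel is an assumption made only to avoid a division by zero; the text does not cover that case.

```python
def sparsity_index(subchannel_norms):
    """Ratio of the largest sub-channel L1 norm to the sum of all of them.

    subchannel_norms: non-negative L1 norms, one per sub-channel of an
    information receiving channel or an information output channel.
    """
    if not subchannel_norms:
        raise ValueError("at least one sub-channel norm is required")
    total = sum(subchannel_norms)
    if total == 0:
        return 0.0  # assumption: an all-zero channel is treated as not sparse
    return max(subchannel_norms) / total


uniform = sparsity_index([2.0, 2.0, 2.0, 2.0])        # 0.25, i.e. 1/C with C = 4
concentrated = sparsity_index([10.0, 0.0, 0.0, 0.0])  # 1.0
```

A channel whose weight mass sits in one sub-channel scores 1, so removing its weak sub-channels barely changes what the channel computes; a uniform channel scores $1/C$ and loses a proportional share of its function with every sub-channel removed.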
[0075] Considering that the feature extraction capabilities of the sub-channels retained or removed during the pruning process are similar, and that pruning has a significant impact on the overall function of the information channel, a sparsity index for the information channel is introduced. The sparsity index of the information channel is comprehensively measured from two aspects: the sparsity index of the information receiving channel and the sparsity index of the information output channel.
[0076] Specifically, step S2015 above includes: Step S20151: For the first layer filter, the product of the information receiving capability, the sparsity index of the information output channel, and the information output capability of the first layer filter is determined as the importance measure of the first layer filter. Step S20152: For intermediate layer filters that are neither the first nor the last layer, the product of the current layer's information receiving capability, information receiving channel sparsity index, information output channel sparsity index, and information output capability is determined as the importance metric of the current layer filter. Step S20153: For the last layer filter, the product of the sparsity index of the information receiving channel and the information receiving capability of the last layer filter is determined as the importance measure of the last layer filter. Step S20154: Based on the importance metrics of the first-layer filter, intermediate layer filters that are neither the first nor the last layer, and the last-layer filter, construct a multi-layer channel joint metric model that is progressively forward-recursively applied.
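The three cases of steps S20151 to S20153 can be written as one function. This is a sketch under stated assumptions: layers are numbered from 1 to `num_layers` with `num_layers >= 2`, the four input quantities have already been computed elsewhere, and the function and parameter names are hypothetical.

```python
def filter_importance(layer, num_layers, reception,
                      output=None, s_in=None, s_out=None):
    """Importance of one filter as a product of its component measures.

    reception: information receiving capability of the filter.
    output:    information output capability (unused for the last layer).
    s_in:      sparsity index of the receiving channel (unused for layer 1).
    s_out:     sparsity index of the output channel (unused for the last layer).
    """
    if num_layers < 2 or not 1 <= layer <= num_layers:
        raise ValueError("expected 1 <= layer <= num_layers and num_layers >= 2")
    if layer == 1:
        # First layer: the receiving-channel sparsity is left out.
        return reception * s_out * output
    if layer == num_layers:
        # Last layer: there is no information output channel.
        return s_in * reception
    # Intermediate layers: all four measures are multiplied.
    return reception * s_in * s_out * output
```

Because the measures are multiplied, a filter scores high only when every factor is reasonably large: strong reception with weak output, or the reverse, yields a low product.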
[0077] In this embodiment of the invention, the information reception capability of the first-layer filters is unaffected by the input information. Furthermore, their information receiving channels are unaffected by pruning, so the sparsity of these channels does not affect the importance metric of the first-layer filters. Therefore, the importance metric of the $n$-th filter of the first layer is defined as: $I^{1}_{n} = R^{1}_{n} \cdot S^{\mathrm{out},1}_{n} \cdot O^{1}_{n}$ (11) For intermediate-layer filters that are in neither the first nor the last layer, i.e. $1 < l < L$, the importance metric of the $n$-th filter of layer $l$ is defined as: $I^{l}_{n} = S^{\mathrm{in},l}_{n} \cdot R^{l}_{n} \cdot O^{l}_{n} \cdot S^{\mathrm{out},l}_{n}$ (12) The last-layer filters have no information output channel; therefore, the importance metric of the $n$-th filter of layer $L$ is defined as: $I^{L}_{n} = S^{\mathrm{in},L}_{n} \cdot R^{L}_{n}$ (13) Here $R$ denotes the information reception capability of the filter, $O$ its information output capability, and $S^{\mathrm{in}}$ and $S^{\mathrm{out}}$ the sparsity indices of its information receiving channel and information output channel. A filter functions effectively only when it adapts to the pruning process and has both high information reception capability and high information output capability. A filter with a relatively large importance value plays a crucial role in the information transmission of the model. Filters with strong information receiving capability but poor information output capability, or with good information output capability but weak information receiving capability, are less effective. A smaller importance value indicates a higher likelihood of being pruned. This metric streamlines the model structure and improves running efficiency more effectively while preserving model performance.
[0078] By taking into account the differences between different filter levels, importance is measured separately, covering both information reception and information output capabilities, while also considering the sparsity index of the information channel. This makes the importance measurement more comprehensive and achieves a balance between lightweight design and high performance.
[0079] Step S202: Set the pruning rate of the convolutional layer using the BN layer parameters.
[0080] For details, see step S102 of the embodiment shown in Figure 1, which will not be described again here.
[0081] Step S203: Combining the pruning rate and the multi-channel joint metric model, construct a dynamic pruning framework that iteratively advances layer by layer, calculate the pruning mask matrix, and perform filter pruning.
[0082] Specifically, step S203 includes: Step S2031: Calculate the number of filters to be pruned in each convolutional layer; Step S2032: Initialize the pruning mask matrix of the network model; Step S2033: Establish the generation model of the pruning mask vector of each layer filter; Step S2034: Generate the pruning mask vector of the first layer filter, generate the pruning mask vector of the non-first layer filter, and obtain the pruning mask matrix. Step S2035: Perform filter pruning based on the pruning mask matrix.
[0083] In this embodiment of the invention, as shown in Figure 3, the pruning rate of each layer is obtained. Based on the pruning rate, the number of filters to be pruned in each layer is then determined: $n_l = N_l \times p_l$ (14) where $n_l$ is the number of filters to be pruned in the $l$-th convolutional layer, $N_l$ is the number of filters in the $l$-th convolutional layer, and $p_l$ is the pruning rate of the $l$-th convolutional layer.
[0084] The pruning mask matrix $H$ of the network model is initialized, in which the pruning mask vector of the layer-$l$ filters is $h^{l}$. The pruning masks of all filters are set to 1, i.e. $h^{l}_{n} = 1$ for every layer $l$ and every filter $n$, which indicates that all filters are retained in the initial state and none has yet been pruned.
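The per-layer pruning count can be computed as below. The sketch assumes that the count is the product of the filter count $N_l$ and the pruning rate $p_l$, rounded down to an integer; the rounding rule is not stated in the text and is an assumption, as is the function name.

```python
import math


def filters_to_prune(num_filters, pruning_rate):
    """Number of filters to remove from one convolutional layer.

    num_filters:  N_l, the number of filters in the layer.
    pruning_rate: p_l, a fraction in [0, 1].
    Assumption: the product is rounded down (floor).
    """
    if num_filters < 0 or not 0.0 <= pruning_rate <= 1.0:
        raise ValueError("expected num_filters >= 0 and 0 <= pruning_rate <= 1")
    return int(math.floor(num_filters * pruning_rate))


# Example: per-layer counts for a three-layer network.
layer_sizes = [64, 128, 256]
pruning_rates = [0.25, 0.5, 0.3]
prune_counts = [filters_to_prune(n, p) for n, p in zip(layer_sizes, pruning_rates)]
```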
[0085] A generation model for the pruning mask vectors of each layer of filters is established. The pruning mask vectors of the first layer filters are generated, and the pruning mask vectors of non-first layer filters are generated to obtain the pruning mask matrix. This pruning mask matrix represents whether the filter needs to be pruned.
[0086] By generating pruning mask vectors layer by layer and recursively generating mask matrices layer by layer, filter pruning is performed based on the pruning mask matrices, achieving structured pruning. This simplifies the model and reduces computational complexity while ensuring model performance.
[0087] Specifically, step S2033 includes: Step S20331: Calculate the importance score vector of each layer of filters, and sort the importance scores of each layer of filters in descending order; Step S20332: Set the pruning mask value of the filter whose importance score is greater than the importance score threshold to 1. The importance score threshold is used to characterize the importance score of the filter that needs to be retained. Step S20333: Set the pruning mask value of the filter whose importance score is not greater than the importance score threshold to 0.
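Steps S20331 to S20333 can be sketched as a single function. The exact position of the threshold in the sorted order is not recoverable from this text, so the sketch assumes the threshold is the score immediately after the last retained position, which keeps exactly the intended number of filters when there are no ties; the function name is hypothetical.

```python
def pruning_mask(scores, num_prune):
    """Binary mask for one layer: 1 keeps a filter, 0 prunes it.

    scores:    importance score of each filter in the layer.
    num_prune: number of filters to prune from the layer.
    """
    num_keep = len(scores) - num_prune
    if not 0 <= num_keep <= len(scores):
        raise ValueError("num_prune must lie between 0 and len(scores)")
    ranked = sorted(scores, reverse=True)  # descending importance
    if num_keep == len(scores):
        threshold = float("-inf")          # nothing is pruned
    else:
        # Assumption: the threshold is the (num_keep + 1)-th largest score.
        threshold = ranked[num_keep]
    # Strictly greater than the threshold -> 1, otherwise -> 0.
    return [1 if s > threshold else 0 for s in scores]


mask = pruning_mask([0.9, 0.1, 0.5, 0.7], num_prune=2)  # [1, 0, 0, 1]
```

With tied scores at the threshold, the strict comparison can retain fewer filters than intended; a production implementation would need an explicit tie-breaking rule.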
[0088] In this embodiment of the invention, taking the layer-$l$ filters as an example, the importance score vector of the layer-$l$ filters is sorted from highest to lowest as in formula (15). The importance score threshold is the importance score of the filter at a specified position in this ordering, a position determined by the number of filters retained in layer $l$. The pruning mask value of filters whose importance score is greater than the threshold is set to 1, and the pruning mask value of filters whose importance score is not greater than the threshold is set to 0. The pruning mask value of the $j$-th filter in the pruning mask vector of the layer-$l$ filters is then given by formula (16). To generate the pruning mask vector of the first-layer filters, the importance score vector of the first-layer filters is obtained according to formula (11) above, and the pruning mask vector of the first-layer filters is then obtained according to step S2032 above.
[0089] To generate the pruning mask vectors of the non-first-layer filters, for layer $l$ with $1 < l \le L$, the importance score vector of the layer-$l$ filters is calculated using formulas (12) and (13) from the already obtained pruning mask vector of layer $l-1$, and the pruning mask vector of the layer-$l$ filters is then obtained according to step S2032.
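The layer-by-layer recursion, in which each layer is scored only after the previous layer's mask is known, can be sketched as a loop. The scoring rule itself is passed in as a callback because the reception and output formulas are not reproduced here; `score_fn`, `toy_scores`, the toy weight table, and the threshold position are all assumptions for illustration.

```python
def forward_recursive_masks(layer_sizes, prune_counts, score_fn):
    """Generate one pruning mask per layer, from the first layer forward.

    score_fn(layer, prev_mask) must return the importance scores of the
    given layer (1-based), computed with the previous layer's mask applied
    (prev_mask is None for the first layer).
    """
    masks = []
    prev_mask = None
    for layer, (n_filters, n_prune) in enumerate(
            zip(layer_sizes, prune_counts), start=1):
        scores = score_fn(layer, prev_mask)
        if len(scores) != n_filters:
            raise ValueError("score_fn returned the wrong number of scores")
        keep = n_filters - n_prune
        ranked = sorted(scores, reverse=True)
        # Assumption: threshold is the score just after the last kept position.
        threshold = ranked[keep] if keep < n_filters else float("-inf")
        mask = [1 if s > threshold else 0 for s in scores]
        masks.append(mask)
        prev_mask = mask  # pruned filters no longer feed the next layer
    return masks


# Toy scoring: layer 1 has fixed scores; each layer-2 filter sums the
# weights it receives from layer-1 filters that are still retained.
LAYER1_SCORES = [3.0, 1.0, 2.0]
LAYER2_WEIGHTS = [[1.0, 5.0, 1.0], [2.0, 0.0, 2.0]]


def toy_scores(layer, prev_mask):
    if layer == 1:
        return list(LAYER1_SCORES)
    return [sum(h * w for h, w in zip(prev_mask, row)) for row in LAYER2_WEIGHTS]


masks = forward_recursive_masks([3, 2], [1, 1], toy_scores)
```

In this toy case the result is `[[1, 0, 1], [0, 1]]`. Scored against the unpruned first layer, the first layer-2 filter would have looked more important (7.0 versus 4.0); once the second layer-1 filter is pruned, its score drops to 2.0 and it is the one removed, which is the interference the recursive ordering is meant to eliminate.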
[0090] By determining the pruning mask value of the filter based on the importance score of each layer of the filter, a reliable data basis is provided for pruning decisions, avoiding the erroneous pruning of core filters or the retention of invalid and redundant filters.
[0091] Specifically, step S2035 includes: Step S20351: When the pruning mask matrix value is 1, the corresponding filter is retained; In step S20352, when the pruning mask matrix value is 0, the corresponding filter is removed.
[0092] In this embodiment of the invention, the network model is pruned based on the already obtained pruning mask matrix to obtain the pruned network. Each entry of the mask matrix indicates whether the corresponding filter of the corresponding layer needs to be pruned: when the value is 1, the filter is retained; when the value is 0, the filter is removed.
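Applying a mask to stored weights can be sketched as follows. The sketch assumes weights are held as nested lists indexed `[filter][input sub-channel]` and that removing a filter also removes the matching input sub-channel of every filter in the next layer; the function names and the data layout are hypothetical.

```python
def prune_filters(weights, mask):
    """Keep only the filters of a layer whose mask value is 1."""
    if len(weights) != len(mask):
        raise ValueError("one mask value per filter is required")
    return [f for f, keep in zip(weights, mask) if keep == 1]


def prune_next_layer_inputs(next_weights, mask):
    """Drop, in every next-layer filter, the sub-channels fed by pruned filters."""
    pruned = []
    for f in next_weights:
        if len(f) != len(mask):
            raise ValueError("one mask value per input sub-channel is required")
        pruned.append([k for k, keep in zip(f, mask) if keep == 1])
    return pruned


# Strings stand in for kernels so the effect of the mask is easy to read.
layer = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"]]       # 3 filters
next_layer = [["x0", "x1", "x2"], ["y0", "y1", "y2"]]    # 3 inputs each
layer_mask = [1, 0, 1]

kept = prune_filters(layer, layer_mask)
kept_next = prune_next_layer_inputs(next_layer, layer_mask)
```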
[0093] By retaining or removing the corresponding filters based on the pruning mask matrix values, the complexity of pruning is reduced, and the efficiency of model simplification is improved.
[0094] Step S204: Combine knowledge distillation to fine-tune the pruned deep convolutional neural network.
[0095] For details, see step S104 of the embodiment shown in Figure 1, which will not be described again here.
[0096] The filter pruning method based on multi-channel joint metric with layer-by-layer forward recursion provided in this embodiment effectively compresses and accelerates the network model, significantly reducing the number of model parameters and floating-point operations while maintaining accuracy.
[0097] It should be noted that this filter pruning method based on multi-channel joint metric of layer-by-layer forward recursion is applicable to a variety of common convolutional neural networks, such as VGG16 and ResNet56, and is also applicable to datasets of different sizes, such as the small and simple dataset CIFAR-10, the small and complex dataset CIFAR-100, and the large and complex dataset ImageNet.
[0098] This embodiment also provides a filter pruning device based on a multi-layer channel joint metric using a layer-by-layer forward recursion approach. This device is used to implement the above embodiments and preferred embodiments, and details already described will not be repeated. As used below, the term "module" can refer to a combination of software and / or hardware that performs a predetermined function. Although the device described in the following embodiments is preferably implemented in software, hardware implementation, or a combination of software and hardware, is also possible and contemplated.
[0099] This embodiment provides a filter pruning device based on multi-layer channel joint metric using layer-by-layer forward recursion. As shown in Figure 4, the device includes: a model building module 401, used to construct a multi-layer channel joint metric model by using the information receiving channels of the previous layer filter, the information receiving channels and output channels of the current layer filter, the information receiving channels of the next layer filter, and the sparsity index of the information channels in the deep convolutional network to be pruned; a pruning rate setting module 402, used to set the pruning rate of the convolutional layer using the BN layer parameters; a pruning module 403, used to combine the pruning rate and the multi-layer channel joint metric model to construct a dynamic pruning framework that proceeds forward layer by layer, calculate the pruning mask matrix, and perform filter pruning; and a fine-tuning module 404, used to fine-tune the pruned deep convolutional neural network in combination with knowledge distillation.
[0100] In some alternative implementations, the model building module 401 includes: The pruning mask definition unit is used to define the pruning mask for each layer of filters; The first metric unit is used to measure the information reception capability of the current layer filter. The second metric unit is used to measure the information output capability of the current layer filter. The third metric unit is used to measure the sparsity of information channels. The model building unit is used to construct a multi-level channel joint metric model that is progressively forward-recursively applied.
[0101] In some alternative implementations, the third measurement unit includes: The first determining subunit is used to determine the ratio of the feature extraction capability of the largest sub-channel in the information receiving channel to the overall feature extraction capability of the information receiving channel as the sparsity index of the information receiving channel. The second determining sub-unit is used to determine the ratio of the feature extraction capability of the largest sub-channel in the information output channel to the overall feature extraction capability of the information output channel as the sparsity index of the information output channel.
[0102] In some alternative implementations, the model building unit includes: The third determining subunit is used to determine the importance measure of the first layer filter by multiplying the information receiving capability, the sparsity index of the information output channel, and the information output capability of the first layer filter. The fourth determining subunit is used to determine the importance measure of the current layer filter by multiplying the information receiving capability, information receiving channel sparsity index, information output channel sparsity index, and information output capability of the current layer for intermediate layer filters that are not the first layer or the last layer. The fifth determining subunit is used to determine the importance measure of the last layer filter by multiplying the sparsity index of the information receiving channel of the last layer filter and the information receiving capability. The sixth determination subunit is used to construct a multi-channel joint metric model based on the importance metrics of the first-layer filter, intermediate layer filters that are not in the first layer and not in the last layer, and the last layer filter.
[0103] In some alternative implementations, the pruning module 403 includes: The computation unit is used to calculate the number of filters that need to be pruned from each convolutional layer; Initialization unit, used to initialize the pruning mask matrix of the network model; The model building unit is used to build the generation model of the pruning mask vectors of each layer of filters; The pruning mask vector generation unit is used to generate the pruning mask vector of the first layer filter, generate the pruning mask vector of the non-first layer filter, and obtain the pruning mask matrix. The pruning unit is used to prune the filter based on the pruning mask matrix.
[0104] In some optional implementations, the model building unit of the pruning module includes: a sorting subunit, used to calculate the importance score vector of each layer of filters and sort the importance scores of each layer of filters in descending order; a first setting subunit, used to set the pruning mask value of filters whose importance score is greater than the importance score threshold to 1, where the importance score threshold characterizes the importance score of the filters that need to be retained; and a second setting subunit, used to set the pruning mask value of filters whose importance score is not greater than the importance score threshold to 0.
[0105] In some alternative implementations, the pruning unit includes: a retaining subunit, used to retain the corresponding filter when the pruning mask matrix value is 1; and a removing subunit, used to remove the corresponding filter when the pruning mask matrix value is 0.
[0106] The filter pruning device based on multi-level channel joint metric based on layer-by-layer forward recursion provided in this embodiment of the invention can execute the filter pruning method based on multi-level channel joint metric based on layer-by-layer forward recursion provided in any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the method. Further functional descriptions of the above modules and units are the same as in the corresponding embodiments described above, and will not be repeated here.
[0107] Figure 5 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present invention.
[0108] Referring now to Figure 5, a structural schematic of an electronic device suitable for implementing embodiments of the present invention is shown. The electronic device may include a processor (e.g., a central processing unit, graphics processor, etc.) 501, which can perform various appropriate actions and processes according to a program stored in read-only memory (ROM) 502 or a program loaded from memory device 508 into random access memory (RAM) 503. The RAM 503 also stores various programs and data required for the operation of the electronic device. The processor 501, ROM 502, and RAM 503 are interconnected via a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
[0109] Typically, the following devices can be connected to the I/O interface 505: input devices 506 including, for example, touchscreens, touchpads, keyboards, mice, cameras, microphones, accelerometers, gyroscopes, etc.; output devices 507 including, for example, liquid crystal displays (LCDs), speakers, vibrators, etc.; memory devices 508 including, for example, magnetic tapes, hard disks, etc.; and communication devices 509. The communication device 509 allows the electronic device to communicate wirelessly or by wire with other devices to exchange data. Although Figure 5 shows an electronic device with various devices, it should be understood that not all of the devices shown are required to be implemented or present; more or fewer devices may alternatively be implemented or present.
[0110] In particular, according to embodiments of the present invention, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, embodiments of the present invention include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via a communication device 509, or installed from a memory 508, or installed from a ROM 502. When the computer program is executed by the processor 501, it performs the functions defined in the filter pruning method based on layer-by-layer forward recursion and multi-layer channel joint metric according to embodiments of the present invention.
[0111] The electronic device shown in Figure 5 is merely an example and should not be construed as limiting the functionality and scope of use of the embodiments of the present invention.
[0112] This invention also provides a computer-readable storage medium. The methods described above according to embodiments of the invention can be implemented in hardware or firmware, or implemented as recordable on a storage medium, or implemented as computer code downloaded over a network and originally stored on a remote storage medium or a non-transitory machine-readable storage medium and subsequently stored on a local storage medium. Thus, the methods described herein can be processed by software stored on a storage medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware. The storage medium can be a magnetic disk, optical disk, read-only memory, random access memory, flash memory, hard disk, or solid-state drive, etc.; further, the storage medium can also include combinations of the above types of memory. It is understood that computers, processors, microprocessor controllers, or programmable hardware include storage components capable of storing or receiving software or computer code. When the software or computer code is accessed and executed by the computer, processor, or hardware, the filter pruning method based on layer-by-layer forward recursion and multi-level channel joint metric shown in the above embodiments is implemented.
[0113] A portion of this invention can be applied as a computer program product, such as computer program instructions, which, when executed by a computer, can invoke or provide the methods and / or technical solutions according to the invention through the operation of the computer. Those skilled in the art will understand that the forms in which computer program instructions exist in a computer-readable medium include, but are not limited to, source files, executable files, installation package files, etc. Correspondingly, the ways in which computer program instructions are executed by a computer include, but are not limited to: the computer directly executing the instructions, or the computer compiling the instructions and then executing the corresponding compiled program, or the computer reading and executing the instructions, or the computer reading and installing the instructions and then executing the corresponding installed program. Here, the computer-readable medium can be any available computer-readable storage medium or communication medium accessible to a computer.
[0114] Although embodiments of the invention have been described in conjunction with the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the invention, and all such modifications and variations fall within the scope defined by the appended claims.
Claims
1. A filter pruning method based on a multi-layer channel joint metric with layer-by-layer forward recursion, characterized in that the method comprises: constructing a multi-layer channel joint metric model with layer-by-layer forward recursion by using the information receiving channels of the previous-layer filter, the information receiving and output channels of the current-layer filter, the information receiving channel of the next-layer filter, and the sparsity index of the information channels in the deep convolutional neural network to be pruned; setting the pruning rate of the convolutional layer using the BN layer parameters; combining the pruning rate and the multi-layer channel joint metric model to construct a dynamic pruning framework with layer-by-layer forward recursion, calculating the pruning mask matrix, and performing filter pruning; and fine-tuning the pruned deep convolutional neural network in combination with knowledge distillation.
2. The method according to claim 1, characterized in that constructing the multi-layer channel joint metric model with layer-by-layer forward recursion by using the information receiving channels of the previous-layer filter, the information receiving and output channels of the current-layer filter, the information receiving channel of the next-layer filter, and the sparsity index of the information channels in the deep convolutional neural network to be pruned comprises: defining the pruning mask of each layer of filters; measuring the information receiving capability of the current-layer filter; measuring the information output capability of the current-layer filter; measuring the sparsity index of the information channels; and constructing the multi-layer channel joint metric model with layer-by-layer forward recursion.
3. The method according to claim 2, characterized in that the sparsity index of the information channels comprises a sparsity index of the information receiving channel and a sparsity index of the information output channel, and measuring the sparsity index of the information channels comprises: determining the ratio of the feature extraction capability of the largest sub-channel in the information receiving channel to the overall feature extraction capability of the information receiving channel as the sparsity index of the information receiving channel; and determining the ratio of the feature extraction capability of the largest sub-channel in the information output channel to the overall feature extraction capability of the information output channel as the sparsity index of the information output channel.
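As a non-limiting illustration, the ratio in claim 3 can be sketched in Python as follows. The claim does not fix how "feature extraction capability" is computed; the sketch assumes, purely for illustration, the L1 norm of each sub-channel kernel. The names `l1_capability` and `sparsity_index` are hypothetical.

```python
def l1_capability(kernel):
    """Assumed capability measure: sum of absolute weights of one 2-D sub-channel kernel."""
    return sum(abs(w) for row in kernel for w in row)


def sparsity_index(capabilities):
    """Ratio of the largest sub-channel capability to the overall channel capability."""
    total = sum(capabilities)
    if total == 0:
        # An all-zero channel carries no information; return 0.0 to avoid dividing by zero.
        return 0.0
    return max(capabilities) / total


# Three 2x2 sub-channel kernels of one information receiving channel (illustrative values).
kernels = [
    [[0.5, -0.5], [0.0, 0.0]],   # capability 1.0
    [[1.0, -1.0], [0.5, 0.5]],   # capability 3.0
    [[1.0, 1.0], [-1.0, 1.0]],   # capability 4.0
]
capabilities = [l1_capability(k) for k in kernels]
receiving_sparsity = sparsity_index(capabilities)  # 4.0 / 8.0
```

Under this assumption the index lies between 1/n (capability spread evenly over n sub-channels) and 1 (capability concentrated in a single sub-channel).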
4. The method according to claim 2, characterized in that constructing the multi-layer channel joint metric model with layer-by-layer forward recursion comprises: for the first-layer filter, determining the product of the information receiving capability, the sparsity index of the information output channel, and the information output capability of the first-layer filter as the importance metric of the first-layer filter; for an intermediate-layer filter that is neither in the first layer nor in the last layer, determining the product of the information receiving capability, the sparsity index of the information receiving channel, the sparsity index of the information output channel, and the information output capability of the current-layer filter as the importance metric of the current-layer filter; for the last-layer filter, determining the product of the sparsity index of the information receiving channel and the information receiving capability of the last-layer filter as the importance metric of the last-layer filter; and constructing the multi-layer channel joint metric model with layer-by-layer forward recursion based on the importance metrics of the first-layer filter, the intermediate-layer filters, and the last-layer filter.
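As a non-limiting illustration, the three products recited in claim 4 can be written as a single Python function. The function name, the argument names, and the string labels for the layer position are hypothetical; how the capabilities and sparsity indices themselves are obtained is outside this sketch.

```python
def filter_importance(position, recv_cap, out_cap, recv_sparsity, out_sparsity):
    """Importance metric of one filter, selected by the position of its layer.

    position      -- "first", "middle", or "last" (hypothetical labels)
    recv_cap      -- information receiving capability
    out_cap       -- information output capability
    recv_sparsity -- sparsity index of the information receiving channel
    out_sparsity  -- sparsity index of the information output channel
    """
    if position == "first":
        # First layer: the receiving-channel sparsity index is not part of the product.
        return recv_cap * out_sparsity * out_cap
    if position == "middle":
        # Intermediate layer: all four factors are multiplied.
        return recv_cap * recv_sparsity * out_sparsity * out_cap
    if position == "last":
        # Last layer: only the receiving-side factors are multiplied.
        return recv_sparsity * recv_cap
    raise ValueError(f"unknown layer position: {position!r}")


# Illustrative values only.
first_score = filter_importance("first", 2.0, 4.0, 0.5, 0.25)
middle_score = filter_importance("middle", 2.0, 4.0, 0.5, 0.25)
last_score = filter_importance("last", 2.0, 4.0, 0.5, 0.25)
```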
5. The method according to claim 1, characterized in that combining the pruning rate and the multi-layer channel joint metric model to construct the dynamic pruning framework with layer-by-layer forward recursion, calculating the pruning mask matrix, and performing filter pruning comprises: calculating the number of filters to be pruned from each convolutional layer; initializing the pruning mask matrix of the network model; establishing a generation model for the pruning mask vector of each layer of filters; generating the pruning mask vector of the first-layer filter and the pruning mask vectors of the non-first-layer filters to obtain the pruning mask matrix; and performing filter pruning based on the pruning mask matrix.
6. The method according to claim 5, characterized in that establishing the generation model for the pruning mask vector of each layer of filters comprises: calculating the importance score vector of each layer of filters and sorting the importance scores of each layer of filters in descending order; setting to 1 the pruning mask value of each filter whose importance score is greater than an importance score threshold, wherein the importance score threshold characterizes the importance score of the filters that need to be retained; and setting to 0 the pruning mask value of each filter whose importance score is not greater than the importance score threshold.
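As a non-limiting illustration, the mask generation of claims 5 and 6 for a single layer can be sketched in Python as follows. Two points are assumptions made only for this sketch, since the claims do not specify them: the number of filters to prune is taken as the floor of the filter count times the pruning rate, and the importance score threshold is taken as the largest score among the filters to be pruned, so that a strict "greater than" comparison retains the remaining filters. The names `num_to_prune` and `mask_vector` are hypothetical.

```python
import math


def num_to_prune(num_filters, pruning_rate):
    """Assumed rounding rule: floor(num_filters * pruning_rate)."""
    return int(math.floor(num_filters * pruning_rate))


def mask_vector(scores, pruning_rate):
    """Return a 0/1 pruning mask for one layer, in the original filter order."""
    n_prune = num_to_prune(len(scores), pruning_rate)
    if n_prune == 0:
        # Nothing to prune: every filter is retained.
        return [1] * len(scores)
    ranked = sorted(scores, reverse=True)   # importance scores in descending order
    keep = len(scores) - n_prune
    threshold = ranked[keep]                # assumed threshold: best score among pruned filters
    # Mask is 1 when the score exceeds the threshold, otherwise 0.
    return [1 if s > threshold else 0 for s in scores]


scores = [0.9, 0.1, 0.5, 0.3]               # illustrative importance scores
mask = mask_vector(scores, pruning_rate=0.5)
```

With tied scores at the threshold, the strict comparison may retain fewer filters than the target count; a practical implementation would need an explicit tie-breaking rule.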
7. The method according to claim 6, characterized in that performing filter pruning based on the pruning mask matrix comprises: retaining the corresponding filter when the pruning mask matrix value is 1; and removing the corresponding filter when the pruning mask matrix value is 0.
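As a non-limiting illustration, applying one row of the pruning mask matrix as recited in claim 7 can be sketched in Python as follows. Filters are represented as plain list entries; the helper names are hypothetical. The second helper, which drops the matching input channels of the following layer, reflects common structured-pruning practice and is an assumption not recited in the claim.

```python
def prune_filters(filters, mask):
    """Keep filters whose mask value is 1 and remove those whose mask value is 0."""
    if len(filters) != len(mask):
        raise ValueError("mask length must equal the number of filters")
    return [f for f, m in zip(filters, mask) if m == 1]


def prune_input_channels(next_layer_filters, mask):
    """Assumed follow-up step: drop the input channels fed by removed filters."""
    return [[ch for ch, m in zip(f, mask) if m == 1] for f in next_layer_filters]


# Placeholder strings stand in for weight tensors.
layer_filters = ["f0", "f1", "f2", "f3"]
next_layer = [["a0", "a1", "a2", "a3"], ["b0", "b1", "b2", "b3"]]
mask = [1, 0, 1, 0]

kept_filters = prune_filters(layer_filters, mask)
kept_next = prune_input_channels(next_layer, mask)
```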
8. A filter pruning device based on a multi-layer channel joint metric with layer-by-layer forward recursion, characterized in that the device comprises: a model building module, configured to construct a multi-layer channel joint metric model with layer-by-layer forward recursion by using the information receiving channels of the previous-layer filter, the information receiving and output channels of the current-layer filter, the information receiving channels of the next-layer filter, and the sparsity index of the information channels in the deep convolutional neural network to be pruned; a pruning rate setting module, configured to set the pruning rate of the convolutional layer using the BN layer parameters; a pruning module, configured to combine the pruning rate and the multi-layer channel joint metric model to construct a dynamic pruning framework with layer-by-layer forward recursion, calculate the pruning mask matrix, and perform filter pruning; and a fine-tuning module, configured to fine-tune the pruned deep convolutional neural network in combination with knowledge distillation.
9. An electronic device, characterized in that it comprises: a memory and a processor that are communicatively connected to each other, the memory storing computer instructions, and the processor executing the computer instructions to perform the filter pruning method based on the multi-layer channel joint metric with layer-by-layer forward recursion according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions for causing a computer to perform the filter pruning method based on the multi-layer channel joint metric with layer-by-layer forward recursion according to any one of claims 1 to 7.