Task processing model generation method and device, electronic equipment, and storage medium
By pruning the deep neural network model during training, a pruned task processing model is generated, which solves the problem of large model parameters, reduces computational requirements, and improves model deployment efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIJING DAJIA INTERNET INFORMATION TECH CO LTD
- Filing Date
- 2023-02-03
- Publication Date
- 2026-06-12
AI Technical Summary
Deep neural network models have a large number of parameters and require a large amount of computation, making them difficult to deploy and apply on devices with limited computing power.
By acquiring sample multimedia data and hierarchical dependency information, and combining it with pruning parameter configuration information, the deep neural network model is pruned and trained to determine the target pruning parameter information and generate a pruned task processing model.
This effectively reduces the number of parameters in the task processing model and the computational load on the device, and improves the efficiency of model pruning.
Description
Technical Field
[0001] This disclosure relates to the field of computer technology, and in particular to a method, apparatus, electronic device, and storage medium for generating a task processing model.
Background Technology
[0002] With the rapid development of science and technology, artificial intelligence has been widely applied, and deep learning, as a major technology of artificial intelligence, has also received increasing attention and importance.
[0003] In related technologies, artificial intelligence models, such as deep neural network models, are trained and deployed to perform data processing, such as classification. However, with the development of deep learning, deep neural network models are becoming increasingly deep, with a large number of parameters and high computational demands, making them difficult to deploy and apply on devices with limited computing power.
Summary of the Invention
[0004] This disclosure provides a method, apparatus, electronic device, and storage medium for generating a task processing model, to at least solve the problem of the large number of parameters in deep task processing models in related technologies. The technical solution of this disclosure is as follows:
[0005] According to a first aspect of the present disclosure, a method for generating a task processing model is provided, the method comprising:
[0006] Acquire sample multimedia data;
[0007] Obtain the hierarchical dependency information corresponding to the first task processing model to be pruned, and obtain the pruning parameter configuration information; the hierarchical dependency information represents the data transmission dependency between at least two neural network layers in the first task processing model;
[0008] Based on the sample multimedia data, the pruning parameter configuration information, and the hierarchical dependency information, the first task processing model is subjected to pruning training to determine the target pruning parameter information.
[0009] The first task processing model is pruned based on the target pruning parameter information to obtain a pruned task processing model. The pruned task processing model is used to process the multimedia data to be processed and generate task processing results.
[0010] In some possible designs, the step of performing pruning training on the first task processing model based on the sample multimedia data, the pruning parameter configuration information, and the hierarchical dependency information to determine the target pruning parameter information includes:
[0011] Based on the pruning parameter configuration information and the hierarchical dependency information, initialize the pruning parameter information;
[0012] Based on the sample multimedia data and the pruning parameter information, the model parameters in the first task processing model are pruned and trained to obtain the pruned second task processing model and the model loss information corresponding to the second task processing model.
[0013] The pruning parameter information is updated based on the model loss information to obtain the target pruning parameter information.
[0014] In some possible designs, the step of pruning and training the model parameters in the first task processing model based on the sample multimedia data and the pruning parameter information to obtain the pruned second task processing model and the corresponding model loss information of the second task processing model includes:
[0015] Based on the pruning parameter information, the target neural network layer in the first task processing model is pruned to obtain the second task processing model.
[0016] The sample multimedia data is input into the second task processing model for preset processing to obtain the model loss information.
[0017] In some possible designs, the step of pruning the target neural network layer in the first task processing model based on the pruning parameter information to obtain the second task processing model includes:
[0018] Determine the weight parameter information corresponding to the target neural network layer;
[0019] Based on the pruning parameter information, the weight parameter information is pruned to obtain the pruned parameter information corresponding to the target neural network layer;
[0020] The first task processing model is updated based on the pruned parameter information to obtain the second task processing model.
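As a rough illustration of the three steps above (determine the layer's weights, prune them according to the pruning parameter information, and write the result back), the following sketch zeroes out the filters with the smallest L2 norm. The function name `prune_filters`, the nested-list weight layout, and the L2-norm ranking are assumptions made for illustration; the disclosure does not fix a particular rule at this point.

```python
import math


def prune_filters(weights, pruning_ratio):
    """Zero out the lowest-L2-norm filters of one layer.

    weights: list of filters, each a flat list of floats (hypothetical layout).
    pruning_ratio: fraction of filters to remove, between 0 and 1.
    Returns (pruned_weights, kept_indices).
    """
    num_filters = len(weights)
    num_pruned = int(num_filters * pruning_ratio)
    norms = [math.sqrt(sum(w * w for w in f)) for f in weights]
    # Rank filters by norm, smallest first; ties keep their original order.
    order = sorted(range(num_filters), key=lambda i: norms[i])
    pruned = set(order[:num_pruned])
    pruned_weights = [
        [0.0] * len(f) if i in pruned else list(f)
        for i, f in enumerate(weights)
    ]
    kept_indices = [i for i in range(num_filters) if i not in pruned]
    return pruned_weights, kept_indices


# Four filters with two weights each; half of them are pruned.
layer_weights = [[0.1, 0.1], [2.0, -1.0], [0.0, 0.05], [1.5, 1.5]]
pruned_weights, kept_indices = prune_filters(layer_weights, 0.5)
```

The input list is left unchanged, so the original weight parameter information and the pruned parameter information remain available side by side.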
[0021] In some possible designs, the step of inputting the sample multimedia data into the second task processing model for preset processing to obtain the model loss information includes:
[0022] The sample multimedia data is input into the second task processing model for preset processing to obtain the detection result corresponding to the sample multimedia data.
[0023] Obtain the label data corresponding to the sample multimedia data;
[0024] The detection results are compared with the label data to obtain the detection loss information corresponding to the second task processing model;
[0025] Based on the weight parameter information and the pruned parameter information, determine the pruning loss information corresponding to the second task processing model;
[0026] The model loss information is determined based on the detection loss information and the pruning loss information.
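One way to read the last two steps is as the sum of a task loss and a penalty that grows as the dense weights drift away from their pruned counterparts. The sketch below assumes a quadratic penalty scaled by `rho / 2`, which is common in ADMM-style pruning, and assumes the two losses are simply added; neither choice is stated in this disclosure, and `pruning_loss` and `model_loss` are hypothetical names.

```python
def pruning_loss(weights, pruned_weights, rho):
    """Quadratic penalty (rho / 2) * ||W - Z||^2 over flat weight lists."""
    squared = sum((w - z) ** 2 for w, z in zip(weights, pruned_weights))
    return 0.5 * rho * squared


def model_loss(detection_loss, weights, pruned_weights, rho=1e-2):
    """Total loss, assumed here to be detection loss plus pruning loss."""
    return detection_loss + pruning_loss(weights, pruned_weights, rho)


# W is the weight parameter information, Z the pruned parameter information.
W = [1.0, 0.2, -0.1]
Z = [1.0, 0.0, 0.0]
total = model_loss(0.7, W, Z, rho=1e-2)
```

With this form, a larger `rho` pushes the trained weights more strongly toward the pruned structure, at the cost of weighting the detection loss relatively less.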
[0027] In some possible designs, before pruning the target neural network layer in the first task processing model based on the pruning parameter information to obtain the second task processing model, the following steps are also included:
[0028] Based on the pruning parameter configuration information and the hierarchical dependency information, the first neural network layer to be pruned and the second neural network layer corresponding to the first neural network layer are determined in the first task processing model. There is a data transmission dependency between the second neural network layer and the first neural network layer.
[0029] The first neural network layer and the second neural network layer are determined as the target neural network layer.
[0030] In some possible designs, the pruning parameter configuration information includes a pruning method identifier. The step of pruning the target neural network layer in the first task processing model based on the pruning parameter information to obtain the second task processing model includes:
[0031] Determine the pruning processing rule information corresponding to the pruning method identifier;
[0032] Based on the pruning processing rule information and the pruning parameter information, the target neural network layer in the first task processing model is pruned to obtain the second task processing model.
[0033] In some possible designs, obtaining the pruning parameter configuration information includes:
[0034] Obtain the model training configuration file corresponding to the first task processing model, wherein the model training configuration file includes the inserted pruning configuration text;
[0035] Read the pruning configuration text to obtain the pruning parameter configuration information.
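A minimal sketch of these two steps, assuming the model training configuration file is JSON and that the inserted pruning configuration text sits under a top-level key. The key name `"pruning"`, the `"optimizer"` entry, and the function `read_pruning_config` are hypothetical.

```python
import json


def read_pruning_config(config_text, section="pruning"):
    """Extract the pruning section from a training configuration document."""
    config = json.loads(config_text)
    if section not in config:
        raise KeyError(f"no '{section}' section in the training configuration")
    return config[section]


# Hypothetical training configuration with pruning settings inserted.
training_config_text = """
{
  "optimizer": {"type": "sgd", "lr": 0.1},
  "pruning": {"pruning_method": "filter", "rho": 0.01, "admm_epoch": 9}
}
"""
pruning_config = read_pruning_config(training_config_text)
```

Keeping the pruning settings inside the existing training configuration means the training entry point needs no separate file to enable pruning.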
[0036] In some possible designs, obtaining the hierarchical dependency information corresponding to the first task processing model to be pruned includes:
[0037] Determine the input information and output information corresponding to each of the at least two neural network layers;
[0038] Determine the connection relationship information between the at least two neural network layers;
[0039] The hierarchical dependency information is determined based on the connection relationship information, the input information, and the output information.
[0040] According to a second aspect of the present disclosure, an apparatus for generating a task processing model is provided, the apparatus comprising:
[0041] The sample data acquisition module is configured to acquire sample multimedia data;
[0042] The configuration information acquisition module is configured to acquire the hierarchical dependency information corresponding to the first task processing model to be pruned, and to acquire the pruning parameter configuration information; the hierarchical dependency information represents the data transmission dependency relationship between at least two neural network layers in the first task processing model;
[0043] The pruning parameter determination module is configured to perform pruning training on the first task processing model based on the sample multimedia data, the pruning parameter configuration information, and the hierarchical dependency information, and determine the target pruning parameter information.
[0044] The model pruning module is configured to perform pruning processing on the first task processing model based on the target pruning parameter information to obtain a pruned task processing model. The pruned task processing model is used to process the multimedia data to be processed and generate task processing results.
[0045] In some possible designs, the pruning parameter determination module includes:
[0046] The parameter initialization submodule is configured to initialize pruning parameter information based on the pruning parameter configuration information and the hierarchical dependency information.
[0047] The pruning training module is configured to perform pruning training on the model parameters in the first task processing model based on the sample multimedia data and the pruning parameter information, so as to obtain the pruned second task processing model and the model loss information corresponding to the second task processing model.
[0048] The pruning parameter determination submodule is configured to update the pruning parameter information based on the model loss information to obtain the target pruning parameter information.
[0049] In some possible designs, the pruning training module includes:
[0050] The target network layer pruning unit is configured to perform pruning processing on the target neural network layer in the first task processing model based on the pruning parameter information to obtain the second task processing model.
[0051] The model loss information determination unit is configured to input the sample multimedia data into the second task processing model for preset processing to obtain the model loss information.
[0052] In some possible designs, the target network layer pruning unit includes:
[0053] The weight parameter determination subunit is configured to determine the weight parameter information corresponding to the target neural network layer.
[0054] The weight parameter pruning subunit is configured to perform pruning processing on the weight parameter information based on the pruning parameter information to obtain the pruned parameter information corresponding to the target neural network layer.
[0055] The model update subunit is configured to update the first task processing model based on the pruned parameter information to obtain the second task processing model.
[0056] In some possible designs, the model loss information determination unit includes:
[0057] The sample detection subunit is configured to input the sample multimedia data into the second task processing model for preset processing to obtain the detection result corresponding to the sample multimedia data.
[0058] The label data acquisition subunit is configured to acquire the label data corresponding to the sample multimedia data;
[0059] The detection loss determination subunit is configured to compare the detection results with the label data to obtain the detection loss information corresponding to the second task processing model;
[0060] The pruning loss determination subunit is configured to determine the pruning loss information corresponding to the second task processing model based on the weight parameter information and the pruned parameter information.
[0061] The model loss determination subunit is configured to determine the model loss information based on the detection loss information and the pruning loss information.
[0062] In some possible designs, the device further includes:
[0063] The network layer lookup module is configured to determine, based on the pruning parameter configuration information and the hierarchical dependency information, the first neural network layer to be pruned in the first task processing model and the second neural network layer corresponding to the first neural network layer, wherein there is a data transmission dependency between the second neural network layer and the first neural network layer.
[0064] The target network layer determination module is configured to determine the first neural network layer and the second neural network layer as the target neural network layer.
[0065] In some possible designs, the pruning parameter configuration information includes a pruning method identifier, and the pruning training module further includes:
[0066] The pruning rule determination submodule is configured to determine the pruning processing rule information corresponding to the pruning method identifier;
[0067] The target network layer pruning unit is configured to perform pruning processing on the target neural network layer in the first task processing model based on the pruning processing rule information and the pruning parameter information, so as to obtain the second task processing model.
[0068] In some possible designs, the configuration information acquisition module includes:
[0069] The model file acquisition submodule is configured to acquire the model training configuration file corresponding to the first task processing model, wherein the model training configuration file includes the inserted pruning configuration text.
[0070] The pruning configuration acquisition submodule is configured to read the pruning configuration text and obtain the pruning parameter configuration information.
[0071] In some possible designs, the configuration information acquisition module further includes:
[0072] The input / output information determination submodule is configured to determine the input information and output information corresponding to each of the at least two neural network layers.
[0073] The connection relationship information determination submodule is configured to determine the connection relationship information between the at least two neural network layers.
[0074] The hierarchical dependency determination submodule is configured to determine the hierarchical dependency information based on the connection relationship information, the input information, and the output information.
[0075] According to a third aspect of the present disclosure, an electronic device is provided, comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to execute the instructions to implement a method for generating a task processing model as described in any of the first aspects above.
[0076] According to a fourth aspect of the present disclosure, a computer-readable storage medium is provided, wherein when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the task processing model generation method described in any one of the first aspects of the present disclosure.
[0077] According to a fifth aspect of the present disclosure, a computer program product including instructions is provided, which, when run on a computer, causes the computer to perform the method for generating a task processing model as described in any of the first aspects of the present disclosure.
[0078] The technical solutions provided by the embodiments of this disclosure bring at least the following beneficial effects:
[0079] By determining the hierarchical dependency information corresponding to the task processing model, the data transmission dependency between network layers in the task processing model can be characterized. Based on the above hierarchical dependency information and the obtained preset pruning parameter configuration information, the pruning parameter information can be automatically determined, and the task processing model can be automatically pruned according to the pruning parameter information to obtain the pruned task processing model. This effectively reduces the number of parameters in the task processing model and the operating pressure on the equipment, and improves the model pruning efficiency.
[0080] It should be understood that the above general description and the following detailed description are exemplary and explanatory only, and are not intended to limit this disclosure.
Brief Description of the Drawings
[0081] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this disclosure and, together with the description, serve to explain the principles of this disclosure, and are not intended to unduly limit this disclosure.
[0082] Figure 1 is a schematic diagram illustrating an application environment according to an exemplary embodiment;
[0083] Figure 2 is a first flowchart of a method for generating a task processing model according to an exemplary embodiment;
[0084] Figure 3 is a second flowchart of a method for generating a task processing model according to an exemplary embodiment;
[0085] Figure 4 is an exemplary diagram of neural network layers that have interdependent relationships;
[0086] Figure 5 is a third flowchart of a method for generating a task processing model according to an exemplary embodiment;
[0087] Figure 6 is a block diagram of a task processing model generation apparatus according to an exemplary embodiment;
[0088] Figure 7 is a block diagram of an electronic device for model pruning according to an exemplary embodiment.
Detailed Description
[0089] To enable those skilled in the art to better understand the technical solutions of this disclosure, the technical solutions in the embodiments of this disclosure will be clearly and completely described below with reference to the accompanying drawings.
[0090] It should be noted that the terms "first," "second," etc., used in the specification, claims, and accompanying drawings of this disclosure are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of this disclosure described herein can be implemented in orders other than those illustrated or described herein. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this disclosure as detailed in the appended claims.
[0091] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for display, data used for analysis, etc.) involved in this disclosure are all information and data authorized by the user or fully authorized by all parties.
[0092] Please refer to Figure 1, which is a schematic diagram illustrating an application environment according to an exemplary embodiment. As shown in Figure 1, the application environment may include terminal 100 and server 200.
[0093] Terminal 100 can be used to provide model pruning services to any user. Specifically, terminal 100 can be, but is not limited to, electronic devices such as smartphones, desktop computers, tablets, laptops, smart speakers, digital assistants, augmented reality (AR) / virtual reality (VR) devices, and smart wearable devices, or software running on the aforementioned electronic devices, such as applications. Optionally, the operating system running on the electronic device can be, but is not limited to, Android, iOS, Linux, and Windows. Optionally, terminal 100 has a preset lightweight model framework installed, which is used to implement the above-mentioned task processing model generation method.
[0094] In an optional embodiment, server 200 can provide background services to terminal 100. Specifically, server 200 can be a standalone physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.
[0095] In addition, it should be noted that Figure 1 shows merely one application environment provided by this disclosure. In practical applications, other application environments may also be used, for example ones that include more terminals.
[0096] In the embodiments described in this specification, the terminal 100 and the server 200 can be directly or indirectly connected through wired or wireless communication, and this disclosure does not impose any restrictions.
[0097] Figure 2 is a first flowchart of a method for generating a task processing model according to an exemplary embodiment. Optionally, the method is used in an electronic device, which can be a terminal or a server. Optionally, the method is implemented based on a preset lightweight model framework. As shown in Figure 2, the method may include the following steps (210-230).
[0098] Step 210: Obtain sample multimedia data.
[0099] The aforementioned sample multimedia data can be sample images, sample videos, sample text, or mixed-modal multimedia data, etc.
[0100] Step 220: Obtain the hierarchical dependency information corresponding to the first task processing model to be pruned, and obtain the pruning parameter configuration information.
[0101] Optionally, the task processing model described above is a machine learning model, such as a neural network model or a statistical model.
[0102] In an exemplary embodiment, the device acquires a first task processing model to be pruned and determines the hierarchical dependency information corresponding to the first task processing model. This hierarchical dependency information characterizes the data transmission dependencies between at least two neural network layers in the first task processing model.
[0103] Optionally, the first task processing model can be a trained task processing model or an untrained task processing model.
[0104] In one possible implementation, the aforementioned sample multimedia data is input into an initial task processing model for pre-defined processing to obtain corresponding detection results. The label data corresponding to the sample multimedia data, i.e., the actual results, is compared with the detection results to obtain the model loss information corresponding to the task processing model. If the model loss information meets preset loss conditions, such as a loss value less than or greater than a preset threshold, the current model parameters can be saved, generating the aforementioned first task processing model. If the model loss information does not meet the preset loss conditions, the model parameters are updated until the model loss information meets the preset loss conditions, thereby generating the aforementioned first task processing model.
[0105] Optionally, the first task processing model is used to perform preset processing on the data to be detected, thereby obtaining the corresponding detection results. The data to be detected includes images, videos, text, and audio to be detected. The preset processing can be preset classification processing, object detection processing, etc., which are not limited in this embodiment.
[0106] Optionally, the aforementioned lightweight model framework includes a model parsing module. Based on the model parsing module, the aforementioned first task processing model, i.e. the original model, can be obtained and parsed to construct hierarchical dependency information.
[0107] Optionally, the class model information corresponding to the first task processing model is obtained; for example, a model of the torch.nn.Module class is input into the model parsing module. Based on this class model information, model dependency parsing is performed to obtain the hierarchical dependency information, that is, the hierarchical dependency information output by the model parsing module.
[0108] In an exemplary embodiment, as shown in Figure 3, the process of determining the hierarchical dependency information corresponding to the first task processing model may include the following steps (221-223). Figure 3 is a second flowchart of a method for generating a task processing model according to an exemplary embodiment.
[0109] Step 221: Determine the input information and output information corresponding to at least two neural network layers.
[0110] Optionally, the input information includes the input data corresponding to the neural network layer and the dimensional information corresponding to the input data. Optionally, the input data includes, but is not limited to, weight parameter data and bias parameter data. Optionally, the output information includes the output data corresponding to the neural network layer and the dimensional information corresponding to the output data. Optionally, the output data includes, but is not limited to, weight parameter data and bias parameter data.
[0111] Step 222: Determine the connection relationship information between at least two neural network layers.
[0112] Optionally, the connection relationship information represents how the at least two neural network layers are connected to one another.
[0113] Step 223: Determine the hierarchical dependency information based on the connection relationship information, input information, and output information.
[0114] Optionally, based on the aforementioned connection information, interconnected neural network layers among the at least two neural network layers can be identified. For each interconnected neural network layer, the data relationship between the output information of the previously connected neural network layer and the input information of the subsequently connected neural network layer is determined. Based on this data relationship, it can be determined whether a data transmission dependency exists between the interconnected neural network layers. If the aforementioned data relationship indicates that a data transmission dependency exists between the interconnected neural network layers, then a data transmission dependency exists between these interconnected neural network layers.
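The check described above can be sketched as follows, under the simplifying assumption that a data transmission dependency exists when the producer's output channel count equals the consumer's input channel count. The dictionary layout, the key names `in_channels` and `out_channels`, and the function `build_dependencies` are hypothetical; a real parser would inspect the model graph rather than a hand-written table.

```python
def build_dependencies(layers, connections):
    """Derive dependency information from layer I/O sizes and connections.

    layers: layer name -> {"in_channels": int, "out_channels": int}.
    connections: list of (producer, consumer) pairs in forward order.
    Returns: producer name -> list of dependent consumer names.
    """
    dependencies = {}
    for producer, consumer in connections:
        out_dim = layers[producer]["out_channels"]
        in_dim = layers[consumer]["in_channels"]
        # Assumed criterion: matching sizes mean the consumer reads the
        # producer's output directly, so pruning one constrains the other.
        if out_dim == in_dim:
            dependencies.setdefault(producer, []).append(consumer)
    return dependencies


layers = {
    "conv1": {"in_channels": 3, "out_channels": 8},
    "bn1": {"in_channels": 8, "out_channels": 8},
    "conv2": {"in_channels": 8, "out_channels": 16},
}
connections = [("conv1", "bn1"), ("bn1", "conv2")]
dependencies = build_dependencies(layers, connections)
```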
[0115] In one example, Figure 4 shows a schematic diagram of neural network layers that are interdependent. The first convolutional layer has 8 output channels, and the batch normalization layer also has 8 input channels. If pruning reduces the number of output channels of the first convolutional layer to 4, then the number of input channels of the second convolutional layer also needs to be automatically reduced to 4.
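The channel bookkeeping in this example can be sketched as a small propagation over the dependency information. The layer table, the `passthrough` flag (marking layers such as batch normalization whose output channel count must equal their input channel count), and the function `propagate_pruning` are assumptions for illustration, and the sketch assumes the dependencies form an acyclic chain.

```python
def propagate_pruning(layers, dependencies, pruned_layer, new_out_channels):
    """Return a copy of `layers` with channel counts updated after pruning."""
    updated = {name: dict(spec) for name, spec in layers.items()}
    updated[pruned_layer]["out"] = new_out_channels
    pending = [pruned_layer]
    while pending:
        producer = pending.pop()
        for consumer in dependencies.get(producer, []):
            # The consumer must accept exactly what the producer now emits.
            updated[consumer]["in"] = updated[producer]["out"]
            if updated[consumer]["passthrough"]:
                # Channel-preserving layer: forward the change downstream.
                updated[consumer]["out"] = updated[consumer]["in"]
                pending.append(consumer)
    return updated


layers = {
    "conv1": {"in": 3, "out": 8, "passthrough": False},
    "bn1": {"in": 8, "out": 8, "passthrough": True},
    "conv2": {"in": 8, "out": 16, "passthrough": False},
}
dependencies = {"conv1": ["bn1"], "bn1": ["conv2"]}
updated = propagate_pruning(layers, dependencies, "conv1", 4)
```

Here the second convolutional layer keeps its 16 output channels; only its input side shrinks to match the pruned upstream layer.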
[0116] The technical solution provided in this application can accurately determine the hierarchical dependency information corresponding to the task processing model by using the input information, output information and connection relationship information between each neural network layer, which helps to improve the accuracy of model pruning.
[0117] In an exemplary embodiment, the device also acquires pruning parameter configuration information. Optionally, the pruning parameter configuration information includes pruning processing configuration parameter information and pruning processing initialization parameter information.
[0118] Optionally, the pruning configuration parameter information includes pruning method identifier, pruning rate, and ADMM (Alternating Direction Method of Multipliers) parameters.
[0119] In one possible implementation, the pruning parameter configuration information includes, but is not limited to, the following pruning parameters to be configured:
[0120] The pruning method parameter (`pruning_method`) corresponds to the pruning method identifier and is used to determine the pruning processing method; the pruning criterion parameter (`pruning_criterion`) corresponds to the pruning criterion identifier and is used to determine the pruning criterion; the pruning engine parameter (`pruning_engine`) corresponds to the pruning engine identifier and is used to determine the pruning engine; the pruning penalty parameter (`rho`) is a hyperparameter that needs to be adjusted based on the training results and is used to adjust the pruning loss; the pruning parameter update frequency parameter (`admm_epoch`) corresponds to the number of batches between pruning parameter updates and is used to determine the batch period for updating the pruning parameters, and it is also a hyperparameter that needs to be adjusted based on the training results; the learning rate decrease parameter corresponds to the number of times the learning rate decreases within each `admm_epoch`; the default initial learning rate parameter determines the default initial learning rate; the learning rate decay ratio parameter determines the learning rate decay ratio within each `admm_epoch`; the neural network layer pruning parameter (`base_layer_pruning_ratio`) is used to set the active pruning ratio for different types of neural network layers; the specified neural network layer pruning decay ratio parameter (`special_layer_pruning_ratio`) is used to set an individual decay ratio for a specified neural network layer; the non-pruning layer setting parameter is used to determine the neural network layers in the task processing model that do not need to be pruned; and the dependency addition parameter is used to set additional data transmission dependencies for a specified neural network layer.
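To make the interaction of the learning rate parameters concrete, the sketch below assumes that the learning rate is reset at the start of every `admm_epoch` period and then multiplied by the decay ratio at evenly spaced points inside the period. This reset-and-step behavior, the reading of the learning rate decrease parameter as a count of constant-rate stages, the function name `learning_rate`, and the default values are all assumptions; the disclosure only names the parameters.

```python
def learning_rate(step, init_lr=0.1, admm_epoch=9, stages=3, lr_decay=0.1):
    """Learning rate at a training step under a per-period step schedule.

    step: index of the training unit in which admm_epoch is counted.
    stages: number of constant-rate stages inside one admm_epoch period.
    """
    position = step % admm_epoch              # position inside the period
    stage_length = max(1, admm_epoch // stages)
    stage = min(position // stage_length, stages - 1)
    return init_lr * (lr_decay ** stage)


# Ten consecutive steps: one full period followed by the start of the next.
schedule = [learning_rate(step) for step in range(10)]
```

Under these assumptions the rate steps down twice within a period and returns to its initial value when the next period begins, which is when the pruning parameters are updated.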
[0121] In one example, the configuration data for the pruning parameters to be configured is as follows:
[0122] “pruning_method”: “filter”, # Prune by convolution kernel;
[0123] "pruning_criterion": "12";
[0124] "pruning_engine": "numpy", # The package used for pruning calculations; options: numpy (a scientific computing library for Python) or pytorch (a Python machine learning library);
[0125] "rho": 1e-2, # Hyperparameter that needs to be adjusted based on training results;
[0126] "admm_epoch": 9, # Hyperparameter that needs to be adjusted based on training results;
[0127] "internal_stage": 3, # Number of times the learning rate decreases within each admm_epoch;
[0128] "init_lr": 0.1, # Default initial learning rate; usually passed in by the external model during ADMM Pruner initialization, in which case the default value is not used;
[0129] "lr_decay": 0.1, # The learning rate decay ratio within each admm_epoch;
[0130] "base_layer_pruning_ratio": {"Conv2D": 0.5, "ConvTranspose2d": 0.5, "Linear": 0.5}, # Set the active pruning ratio for different types of layers here;
[0131] "special_layer_pruning_ratio" assigns a separate decay ratio to certain specific layers;
[0132] "exclude_layer_pruning_ratio" specifies layers that do not need to be pruned, such as the last layer in a classification problem.
[0133] "extra_dependencies" adds extra dependencies for certain specific layers.
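The way these ratio settings could combine can be sketched as follows. This is a minimal illustration, not the framework's actual implementation: the layer names (`backbone.conv1`, `head.fc`), the value types chosen for `special_layer_pruning_ratio` and `exclude_layer_pruning_ratio`, and the helper `resolve_layer_ratios` are all assumptions introduced for the example.

```python
import json

# Hypothetical configuration text that mirrors the keys listed above. The
# values given for special_layer_pruning_ratio and exclude_layer_pruning_ratio
# are assumed; the source does not show their format.
CONFIG_TEXT = """
{
  "pruning_method": "filter",
  "rho": 0.01,
  "admm_epoch": 9,
  "base_layer_pruning_ratio": {"Conv2D": 0.5, "ConvTranspose2d": 0.5, "Linear": 0.5},
  "special_layer_pruning_ratio": {"backbone.conv1": 0.25},
  "exclude_layer_pruning_ratio": ["head.fc"]
}
"""


def resolve_layer_ratios(config, layers):
    """Map each layer name to a pruning ratio (0.0 means "do not prune").

    `layers` is a list of (layer_name, layer_type) pairs.
    Precedence assumed here: excluded layers, then per-layer overrides,
    then the per-type base ratio.
    """
    base = config.get("base_layer_pruning_ratio", {})
    special = config.get("special_layer_pruning_ratio", {})
    excluded = set(config.get("exclude_layer_pruning_ratio", []))
    ratios = {}
    for name, layer_type in layers:
        if name in excluded:
            ratios[name] = 0.0
        elif name in special:
            ratios[name] = special[name]
        else:
            ratios[name] = base.get(layer_type, 0.0)
    return ratios


config = json.loads(CONFIG_TEXT)
layers = [
    ("backbone.conv1", "Conv2D"),
    ("backbone.conv2", "Conv2D"),
    ("head.fc", "Linear"),
]
layer_ratios = resolve_layer_ratios(config, layers)
print(layer_ratios)
```

In this sketch the last layer of the classifier is excluded, one convolution gets its own ratio, and every other layer falls back to the ratio for its layer type.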
[0134] In one possible implementation, the pruning initialization parameter information includes, but is not limited to, the following initialization parameters to be configured: model identifier parameter (model(torch.nn.Module)); configuration path parameter (config); training phase parameter, used to determine the pruning phase; model input parameter (inputs_tuple); PID (Process Identification) parameter; resume parameter (resume); original initial learning rate parameter (external_init_lr); and address parameter (pretrained state).
[0135] `model(torch.nn.Module)`: the complete model; `config`: the path to the ADMM configuration file (.json); `stage(str)`: the ADMM training stage, one of `pretrain`, `prune`, or `retrain`; `inputs_tuple`: the inputs required for the model's forward pass, for example, if the input `x.shape = (N, 3, 256, 256)`, then `inputs_tuple = (torch.ones(1, 3, 256, 256),)`, and multiple inputs are passed together as a tuple; `PID`: the PID used when the script file is executed automatically; `load_model`: whether the model is loaded within the framework, defaults to False; `resume`: whether to resume training from a breakpoint, defaults to False; `external_init_lr`: the initial learning rate used during the original model training; `pretrained state`: the checkpoint address to be loaded, only effective when `load flag == True`.
[0136] In an exemplary embodiment, as shown in Figure 3, the implementation process corresponding to step 220 above may also include the following steps (224 to 225).
[0137] Step 224: Obtain the model training configuration file corresponding to the first task processing model.
[0138] Optionally, the model training configuration file includes inserted pruning configuration text. The file contains the original training code text corresponding to the first task processing model, and the pruning configuration text is inserted into this original training code text to configure pruning. This approach is user-friendly and requires little modification to the original model training code: only a few lines of pruning configuration code need to be inserted to train and prune using the preset lightweight model framework.
[0139] In one possible implementation, the pre-defined lightweight model framework includes an API (Application Programming Interface) layer, which is mainly used to: provide framework interfaces to users of the framework externally, and internally handle the ADMM pruning call logic at different stages of the training process, that is, to determine which methods need to be executed at different stages.
[0140] Optionally, the interfaces corresponding to the lightweight model framework include, but are not limited to, the following interfaces:
[0141] The first interface (on_train_begin()): loads the model's pre-trained parameters and any breakpoint, and initializes the pruned parameter (Z) and pruning parameter (U) variables;
[0142] The second interface (on_epoch_begin()): updates the Z and U variables and adjusts the learning rate;
[0143] The third interface (on_batch_begin()): reserved for completeness of the interfaces at different stages and to facilitate the expansion of functionality;
[0144] The fourth interface (on_batch_end()): reserved for completeness of the interfaces at different stages and to facilitate the expansion of functionality;
[0145] The fifth interface (on_epoch_end()): saves the model checkpoint for each round, calculates the pruning loss (ADMM loss), and returns it to the caller;
[0146] The sixth interface (on_admm_loss()): calculates the pruning loss;
[0147] The seventh interface (on_train_end()): plots the change in ADMM loss as the epoch increases during training, and saves the final model checkpoint.
[0148] Optionally, the pruning configuration text includes interface call text, which is used to call the interface corresponding to the lightweight model framework. By inserting the interface call text into the original training code text, the configuration and processing of pruning for the first task processing model can be realized.
[0149] In one example, the pruning configuration text includes parameter configuration code (used to configure the specific data corresponding to the above parameters), pruning object initialization code (used to pass in the above-configured pruning parameters, inserted after model building and before optimizer building, calling the above first interface), second interface call code (inserted before the start of each epoch update cycle in the original code), third interface call code (inserted before the start of each batch in the original code), sixth interface call code (inserted after each original detection loss calculation is completed in the original code), fourth interface call code (inserted after the end of each batch in the original code), fifth interface call code (inserted after the end of each epoch update cycle in the original code), and seventh interface call code (inserted after model training in the original code).
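The insertion points described above can be sketched with a toy training loop. The class `RecordingPruner` and the function `train` are hypothetical stand-ins: they only record which interface is called and when, and the argument lists of the real interfaces are not given in the source, so the no-argument signatures here are an assumption.

```python
class RecordingPruner:
    """Hypothetical stand-in for the framework's API layer; it only records calls."""

    def __init__(self):
        self.calls = []

    def on_train_begin(self):
        self.calls.append("on_train_begin")  # load parameters, initialize Z and U

    def on_epoch_begin(self):
        self.calls.append("on_epoch_begin")  # update Z and U, adjust learning rate

    def on_batch_begin(self):
        self.calls.append("on_batch_begin")  # reserved hook

    def on_admm_loss(self):
        self.calls.append("on_admm_loss")
        return 0.0  # placeholder pruning loss

    def on_batch_end(self):
        self.calls.append("on_batch_end")  # reserved hook

    def on_epoch_end(self):
        self.calls.append("on_epoch_end")  # save the checkpoint for this round

    def on_train_end(self):
        self.calls.append("on_train_end")  # plot ADMM loss, save final checkpoint


def train(pruner, num_epochs, batches):
    """Toy loop; each batch value doubles as its detection loss."""
    pruner.on_train_begin()  # after model building, before optimizer building
    last_loss = None
    for _ in range(num_epochs):
        pruner.on_epoch_begin()
        for batch in batches:
            pruner.on_batch_begin()
            detection_loss = float(batch)  # stand-in for forward pass and loss
            last_loss = detection_loss + pruner.on_admm_loss()
            # The backward pass and optimizer step would go here.
            pruner.on_batch_end()
        pruner.on_epoch_end()
    pruner.on_train_end()
    return last_loss


pruner = RecordingPruner()
final_loss = train(pruner, num_epochs=1, batches=[0.7, 0.4])
print(pruner.calls)
```

The original training loop keeps its shape; only the seven hook calls are added, which is the "few lines of pruning configuration code" the text refers to.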
[0150] Step 225: Read the pruning configuration text to obtain the pruning parameter configuration information.
[0151] Based on the aforementioned lightweight model framework, pruning configuration text can be read to obtain pruning parameter configuration information, facilitating subsequent pruning processing. The ease of use and user-friendly access of the lightweight model framework reduce the technical skill requirements for users. Furthermore, it achieves the goal of minimizing code insertion; the lightweight model framework optimizes the code structure, allowing relevant personnel to quickly insert pruning code into the original configuration file for training and pruning when there is a need for model optimization acceleration, saving considerable manpower for relevant technical personnel. Using this easy-to-use, easy-to-maintain, and easy-to-extend lightweight model framework for model pruning allows relevant professionals to avoid investing excessive time and energy in repetitive tasks within highly similar business areas, enabling them to focus more on exploring new technologies.
[0152] The technical solution provided in this disclosure allows the device to obtain pre-configured pruning parameter configuration information by reading the pruning configuration text inserted in the model training configuration file, thereby automatically performing pruning processing and effectively improving the efficiency of model pruning processing.
[0153] Step 230: Based on the sample multimedia data, pruning parameter configuration information, and hierarchical dependency information, perform pruning training on the first task processing model to determine the target pruning parameter information.
[0154] Deep neural network structures contain a large number of redundant parameters. During inference, only a small portion of the weights participate in the effective computation and have a major impact on the inference result. Pruning removes redundant weights, nodes, or layers from the network structure, reducing the network size and computational complexity, thus achieving a balance between inference performance and speed.
[0155] The aforementioned target pruning parameters indicate which parameters need to be deleted and which need to be retained in the model. These target pruning parameters are continuously updated during iterative pruning training until the task processing model, pruned according to the target pruning parameters, meets the preset training conditions.
[0156] In an exemplary embodiment, as shown in Figure 3, step 230 above may include the following steps (231 to 233).
[0157] Step 231: Initialize the pruning parameter information based on the pruning parameter configuration information and hierarchical dependency information.
[0158] Optionally, the aforementioned lightweight model framework includes a model pruning module, which is mainly used for: model initialization, loading pruning configuration, determining pruning methods, obtaining model dependencies and performing post-processing, updating ADMM pruning algorithm parameters, and pruning the parameters according to the model weights.
[0159] Model initialization includes establishing ADMM parameters, loading pruning configurations, determining pruning methods, and resolving pruning rates for each layer of the model. Loading pruning configurations involves loading hyperparameters such as rho, admm_epoch, and base_layer_pruning_ratio. Determining pruning methods involves using the pruning configuration to determine the pruning method for the pruning object. Obtaining and post-processing model dependencies involves passing the model through the model parsing module to obtain the model dependencies as output. Updating the ADMM pruning algorithm parameters, i.e., updating the ADMM parameters, is the core of the ADMM pruning algorithm implementation. Pruning the model based on model weights involves applying the pruning method determined above to the model to obtain the pruned model.
[0160] Optionally, a pruning object is created and initialized based on the pruning parameter configuration information and a preset pruning object class. This pruning object includes the aforementioned initialization pruning parameter information, specifically including various pruning parameters and preset pruning methods.
[0161] Based on the neural network layer selection parameters and hierarchical dependencies in the above pruning parameter configuration information, the target neural network layer to be pruned in the first task processing model is determined.
[0162] Step 232: Based on the sample multimedia data and pruning parameter information, perform pruning training on the model parameters in the first task processing model to obtain the pruned second task processing model and the corresponding model loss information.
[0163] During the pruning training process, the device can prune the first task processing model based on pruning parameter information, and verify the accuracy of the pruned task processing model by processing sample multimedia data to calculate model loss data.
[0164] In an exemplary embodiment, as shown in Figure 5, step 232 above may include the following steps (2321 to 2322). Figure 5 is a flowchart illustrating a method for generating a task processing model according to an exemplary embodiment.
[0165] Step 2321: Prune the target neural network layer in the first task processing model based on the pruning parameter information to obtain the second task processing model.
[0166] In the task processing model, the input and output of each neural network layer are dependent on each other. For example, a change in the output dimension of a certain layer will cause a change in the input dimension of the connected layers. Therefore, it is necessary to combine the hierarchical dependency information to perform unified pruning on the neural network layers in the first task processing model that have data transmission dependencies on each other, so as to ensure that the dimension of the data transmitted between neural network layers with data transmission dependencies is consistent.
[0167] The aforementioned target neural network layer refers to the neural network layer that needs pruning, as determined by the pruning parameter configuration information, and the neural network layer with which it has a transmission relationship, as determined by the hierarchical dependency information.
[0168] In an exemplary embodiment, the process for determining the target neural network layer is as follows:
[0169] Based on the pruning parameter configuration information and hierarchical dependency information, the first neural network layer to be pruned and the corresponding second neural network layer in the first task processing model are determined; wherein, there is a data transmission dependency between the second neural network layer and the first neural network layer; the first neural network layer and the second neural network layer are used as the target neural network layers.
[0170] In practical applications, the inputs and outputs of various neural network layers in a task processing model are dependent on each other. For example, a change in the output dimension of a certain layer will cause a change in the input dimension of its connected layers. The technical solution provided in this application, through pruning parameter configuration information and layer dependency information, can determine the first neural network layer to be pruned in the first task processing model, as well as the second neural network layer that has a data transmission dependency with the first neural network layer. By using the first and second neural network layers together as the target neural network layers to be pruned, the correctness of model pruning can be effectively improved, avoiding the situation where the input and output data dimensions of interdependent neural network layers in the pruned model are inconsistent.
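The selection of target layers can be sketched as a lookup over dependency pairs. The function name `find_target_layers`, the layer names, and the representation of hierarchical dependency information as a list of pairs are assumptions made for illustration; only direct dependencies are followed here.

```python
def find_target_layers(layers_to_prune, dependencies):
    """Return the layers to prune plus the layers they share a dependency with.

    `dependencies` is an iterable of (layer_a, layer_b) pairs, each meaning the
    two layers have a data transmission dependency on each other.
    """
    neighbours = {}
    for a, b in dependencies:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    targets = set()
    for first_layer in layers_to_prune:
        targets.add(first_layer)  # the first neural network layer
        targets.update(neighbours.get(first_layer, set()))  # its second layers
    return sorted(targets)


# Hypothetical dependency information for a small model.
dependencies = [("conv1", "bn1"), ("conv1", "conv2"), ("conv3", "fc")]
target_layers = find_target_layers(["conv1"], dependencies)
print(target_layers)
```

Pruning `conv1` alone would leave `bn1` and `conv2` expecting the old channel count, so all three are returned as one group to be pruned consistently.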
[0171] Once the target neural network layer is determined, it can be pruned accordingly. In one possible implementation, the device determines the weight parameter information corresponding to the target neural network layer, and prunes the weight parameter information based on the pruning parameter information to obtain the pruned parameter information corresponding to the target neural network layer; thereby updating the first task processing model based on the pruned parameter information to obtain the second task processing model.
[0172] The weight parameter information corresponding to the target neural network layer includes the weight parameter data and bias term parameter data for each layer. The pruning parameter information can indicate which weight parameter data and bias term parameter data need to be deleted, or which should be retained. The device can delete the corresponding parameter data from the weight parameter information according to the pruning parameter information, thereby obtaining the pruned parameter information for the target neural network layer. The pruned parameter information retains the parameter data that were not pruned, such as the unpruned weight parameter data and bias term parameter data.
[0173] Optionally, during pruning, pruning mask parameter data, i.e., the aforementioned pruning parameter information, is initialized. Based on this pruning mask parameter data, the parameter data in the weight parameter information is pruned to obtain the pruned parameter information corresponding to the target neural network layer. The aforementioned pruning mask parameter data is used to adjust the parameter information corresponding to the target neural network layer and can be dynamically adjusted during the model pruning process.
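As a minimal sketch of mask-based pruning, the example below builds a per-filter mask and applies it to the weight and bias data of one layer. Ranking filters by L2 norm is an assumption for this example, and the helper names are hypothetical; weights are plain nested lists so the sketch has no dependencies.

```python
import math


def build_filter_mask(weight, pruning_ratio):
    """Return a 0/1 mask per filter, zeroing the filters with the smallest L2 norm."""
    norms = [math.sqrt(sum(w * w for w in flt)) for flt in weight]
    num_pruned = int(len(weight) * pruning_ratio)
    order = sorted(range(len(weight)), key=lambda i: norms[i])
    pruned = set(order[:num_pruned])
    return [0 if i in pruned else 1 for i in range(len(weight))]


def apply_mask(weight, bias, mask):
    """Zero the weights and bias of masked filters; shapes stay unchanged."""
    masked_weight = [
        [w if m else 0.0 for w in flt] for flt, m in zip(weight, mask)
    ]
    masked_bias = [b if m else 0.0 for b, m in zip(bias, mask)]
    return masked_weight, masked_bias


# Four filters with two weights each, plus one bias term per filter.
weight = [[0.1, -0.1], [2.0, 1.0], [0.0, 0.05], [-1.5, 0.5]]
bias = [0.01, 0.2, 0.0, -0.3]

mask = build_filter_mask(weight, pruning_ratio=0.5)
masked_weight, masked_bias = apply_mask(weight, bias, mask)
print(mask, masked_weight, masked_bias)
```

Because the mask is separate from the weights, it can be recomputed at any point during pruning training, which matches the description of the mask as dynamically adjustable.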
[0174] The technical solution provided in this disclosure can accurately determine the target neural network layer that needs to be pruned in the task processing model by using pruning parameter configuration information and hierarchical dependency information, and then perform corresponding pruning processing, which helps to improve the accuracy and efficiency of model pruning.
[0175] In another possible implementation, the pruning process can be configured by selecting a pruning method identifier in the pruning configuration parameter information. Specifically, the pruning parameter configuration information includes a pruning method identifier, which is used to determine the pruning process. The device determines the pruning rule information corresponding to the pruning method identifier, and then, based on the pruning rule information and the pruning parameter information, performs pruning on the target neural network layer in the first task processing model to obtain the second task processing model.
[0176] Optionally, the aforementioned pruning rules are preset pruning algorithms, such as a preset code block or a pre-packaged pruning interface. After determining the pruning rules based on the pruning method identifier, pruning can be performed according to the pruning method configured in the pruning rules. By configuring the pruning method identifier in the pruning parameter configuration information, the selection and determination of the pruning method can be achieved, effectively improving the flexibility of model pruning.
[0177] Optionally, the above pruning process includes a first pruning process; correspondingly, the pruning method identifier includes the first pruning method identifier corresponding to the first pruning process. Optionally, the first pruning process refers to pruning based on the Alternating Direction Method of Multipliers (ADMM). Correspondingly, the above lightweight model framework includes the implementation code text corresponding to the first pruning process. Optionally, the ADMM pruning process used in this lightweight model framework is used to find redundant weight parameters in the network structure of the first task processing model.
[0178] Optionally, the above pruning process also includes a second pruning process orthogonal to the first pruning process. Optionally, the lightweight model framework includes the implementation code text corresponding to the first and second pruning processes. The lightweight model framework integrates the ADMM pruning optimization method and various pruning methods orthogonal to it, forming a highly user-friendly tool platform that can support functions such as model pruning, distillation, quantization, and model conversion.
[0179] Optionally, the pruning methods orthogonal to ADMM pruning in the lightweight model framework include, but are not limited to, the following methods:
[0180] Filter pruning: A structured pruning method that uses filters as the basic unit for pruning convolutional layers, rather than individual weights.
[0181] Channel pruning: A structured pruning method that uses the input channel as the basic unit when pruning convolutional layers, rather than individual weights.
[0182] bcr pruning: an unstructured pruning method.
[0183] In one example, for structured pruning, weighted connections are removed according to filter / channel, which has the effect of changing the shape of the layer's input and output, as well as the weight matrix. Unstructured pruning removes individual weighted connections from the network by setting them to 0.
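The difference in effect on layer shapes can be illustrated with two small layers stored as nested lists. The function names and the magnitude threshold used for the unstructured case are assumptions for this sketch, not the framework's code.

```python
def prune_filters(weight, keep):
    """Structured: keep only the listed filters (rows); the output size shrinks."""
    return [row for i, row in enumerate(weight) if i in keep]


def prune_input_channels(weight, keep):
    """Structured: keep only the listed input channels (columns)."""
    return [[w for j, w in enumerate(row) if j in keep] for row in weight]


def prune_weights(weight, threshold):
    """Unstructured: zero weights below the magnitude threshold; shape unchanged."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weight]


# layer1 has 3 filters; layer2 consumes layer1's 3 outputs as input channels.
layer1 = [[0.9, -0.02, 0.4], [0.01, 0.03, -0.02], [-0.7, 0.5, 0.05]]
layer2 = [[0.3, 0.1, -0.6], [0.2, -0.4, 0.8]]

keep = {0, 2}  # drop layer1's second filter
layer1_structured = prune_filters(layer1, keep)
layer2_structured = prune_input_channels(layer2, keep)
layer1_unstructured = prune_weights(layer1, threshold=0.1)

print(len(layer1_structured), len(layer2_structured[0]))
print(layer1_unstructured)
```

Removing a filter from `layer1` forces the matching input channel out of `layer2`, which is why structured pruning needs the hierarchical dependency information, while unstructured pruning leaves every shape intact.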
[0184] After pruning the task processing model, the pruned second task processing model can be tested and trained using sample multimedia data, as shown in step 2322 below.
[0185] Step 2322: Input the sample multimedia data into the second task processing model for preset processing to obtain model loss information.
[0186] Optionally, the above model loss information is obtained by fusing model detection loss information and model pruning loss information. The process of determining the model loss information can be as follows:
[0187] The sample multimedia data is input into the second task processing model for preset processing to obtain the detection results corresponding to the sample multimedia data. The sample multimedia data can be sample images, sample videos, etc. The preset processing can be classification detection processing, target object detection processing, etc., and this embodiment does not limit this. The detection results can be category information or object information corresponding to the sample multimedia data.
[0188] On the one hand, the detection loss information corresponding to the second task processing model can be determined based on the detection results. Optionally, the label data corresponding to the sample multimedia data can be obtained; the detection results are compared with the label data to obtain the detection loss information corresponding to the second task processing model. The aforementioned label data is the actual result corresponding to the detection result. For example, the actual category or actual object corresponding to the sample multimedia data.
[0189] On the other hand, based on the weight parameter information and the post-pruning parameter information, the pruning loss information corresponding to the second task processing model can be determined. The weight parameter information is the parameter information of the target neural network layer before pruning. Comparing the weight parameter data before pruning with the post-pruning parameter data can measure the pruning effect. The above pruning process is used to find redundant weights in the network structure. In model training, it is specifically implemented by adding ADMM Loss (i.e., the above-mentioned pruning loss information) to the original model's Loss (i.e., the above-mentioned detection loss information). By comparing the above-mentioned parameter information and the post-pruning parameter information, the above-mentioned pruning loss information can be determined.
[0190] After obtaining the two types of losses mentioned above, the model loss information can be determined based on the detection loss information and the pruning loss information. Optionally, the detection loss information and the pruning loss information can be fused, such as by adding them, to obtain the aforementioned model loss information. Optionally, the model loss information can be determined according to the following formula:
[0191] $$L = f(\{W_i\},\{b_i\}) + \sum_{i} \frac{\rho}{2}\,\lVert W_i - Z_i + U_i \rVert_F^2$$
[0192] Where $f$ represents the original training loss function, i.e., the detection loss function; $f(\{W_i\},\{b_i\})$ represents the original training loss information, i.e., the detection loss information; the summation term represents the ADMM constraint loss information applied to the first task processing model, i.e., the pruning loss information mentioned above; $W_i$ represents the original weight parameters corresponding to the i-th layer of the neural network; $b_i$ represents the bias parameter corresponding to the i-th layer of the neural network; $Z_i$ represents the pruned weight parameters corresponding to the i-th layer of the neural network; $U_i$ represents the pruning bias parameter corresponding to the i-th layer of the neural network; and $\rho$ represents the pruning penalty parameter.
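As a worked illustration of fusing the two losses, the sketch below assumes the pruning loss takes the standard ADMM penalty form, (ρ/2)·‖W_i − Z_i + U_i‖² summed over the pruned layers, and adds it to a given detection loss. The function names and the numeric values are hypothetical.

```python
def admm_penalty(W, Z, U, rho):
    """Return rho/2 * ||W - Z + U||^2 (squared Frobenius norm) for one layer."""
    total = 0.0
    for w_row, z_row, u_row in zip(W, Z, U):
        for w, z, u in zip(w_row, z_row, u_row):
            total += (w - z + u) ** 2
    return 0.5 * rho * total


def model_loss(detection_loss, layers, rho):
    """Add the ADMM penalty of every (W, Z, U) layer triple to the detection loss."""
    return detection_loss + sum(admm_penalty(W, Z, U, rho) for W, Z, U in layers)


# One layer: the second filter is zeroed in Z, so only it contributes a penalty.
W = [[1.0, 0.5], [0.2, -0.1]]
Z = [[1.0, 0.5], [0.0, 0.0]]
U = [[0.0, 0.0], [0.0, 0.0]]

pruning_loss = admm_penalty(W, Z, U, rho=1e-2)
total_loss = model_loss(0.8, [(W, Z, U)], rho=1e-2)
print(pruning_loss, total_loss)
```

The penalty is zero for weights that already match their pruned target, so minimizing the fused loss pulls only the to-be-pruned weights toward zero while the detection loss preserves accuracy.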
[0193] The technical solution provided in this disclosure can accurately determine the model loss information in the task processing model by determining the detection loss information and pruning loss information corresponding to the task processing model. This enables timely and accurate feedback on the model effect after pruning relevant parameters, which helps to improve the accuracy and efficiency of model pruning.
[0194] Step 233: Update the pruning parameter information based on the model loss information to obtain the target pruning parameter information.
[0195] If the model loss information does not meet the preset loss conditions, update the parameter information of the above neural network layer and the pruned parameter information according to the model loss information, that is, update the weights W, b and Z, U, and continue from the above step 2321 until the model loss information meets the preset loss conditions.
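The source does not spell out how Z and U are updated. A minimal sketch, assuming the standard ADMM scheme (Z is the projection of W + U onto the pruning constraint, then U accumulates the residual W − Z), could look like the following; the filter-level projection by L2 norm and all names are assumptions.

```python
import math


def project_filters(matrix, pruning_ratio):
    """Zero the rows (filters) with the smallest L2 norm; keep the rest as-is."""
    norms = [math.sqrt(sum(v * v for v in row)) for row in matrix]
    num_pruned = int(len(matrix) * pruning_ratio)
    order = sorted(range(len(matrix)), key=lambda i: norms[i])
    pruned = set(order[:num_pruned])
    return [
        [0.0] * len(row) if i in pruned else list(row)
        for i, row in enumerate(matrix)
    ]


def update_z_u(W, U, pruning_ratio):
    """One assumed ADMM step: Z <- project(W + U), then U <- U + W - Z."""
    w_plus_u = [[w + u for w, u in zip(wr, ur)] for wr, ur in zip(W, U)]
    Z = project_filters(w_plus_u, pruning_ratio)
    U_new = [
        [u + w - z for w, u, z in zip(wr, ur, zr)]
        for wr, ur, zr in zip(W, U, Z)
    ]
    return Z, U_new


W = [[1.0, 0.5], [0.2, -0.1]]
U = [[0.0, 0.0], [0.0, 0.0]]
Z, U = update_z_u(W, U, pruning_ratio=0.5)
print(Z, U)
```

In such a scheme the weights W and b are updated by the optimizer on every batch, while Z and U would be refreshed at the cadence set by `admm_epoch`.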
[0196] Optionally, the preset loss condition is that the model loss data, i.e., the aforementioned model loss information, is greater than or equal to a preset loss threshold. When the model loss data is greater than or equal to the preset loss threshold, a pruned task processing model is generated based on the pruned parameter information.
[0197] Step 240: Prune the first task processing model based on the target pruning parameter information to obtain the pruned task processing model.
[0198] Optionally, the pruned task processing model described above is used to process the multimedia data to be processed and generate task processing results. The multimedia data to be processed can be images, videos, text, or mixed-modal multimedia data, etc. The task processing results can be classification results, object recognition results, etc., corresponding to the multimedia data to be processed.
[0199] If the model loss information meets the preset loss conditions, the pruning parameter information at this time can be determined as the target pruning parameter information, and the second task processing model or the first task processing model can be pruned based on the target pruning parameter information to generate the pruned task processing model.
[0200] Optionally, based on the target pruning parameter information, including the pruned weight parameters and the pruned bias parameters, the pruning mask data can be determined according to the pruned weight parameters and the pruned bias parameters, and the current parameters of the target neural network layer can be pruned based on the pruning mask data to obtain the pruned task processing model.
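A minimal sketch of this final step, assuming the mask is derived from the pruned weight parameters Z alone and that pruning removes whole filters, is shown below. The helper names and values are hypothetical.

```python
def filter_mask(Z):
    """Return 1 for filters that survive in Z and 0 for filters that are all zero."""
    return [1 if any(v != 0.0 for v in row) else 0 for row in Z]


def hard_prune(W, b, mask):
    """Physically remove the masked filters and their bias terms."""
    W_kept = [row for row, m in zip(W, mask) if m]
    b_kept = [v for v, m in zip(b, mask) if m]
    return W_kept, b_kept


# Current layer parameters after pruning training, and the target Z.
W = [[0.98, 0.51], [0.003, -0.002], [-0.4, 0.7]]
b = [0.1, 0.0, -0.2]
Z = [[1.0, 0.5], [0.0, 0.0], [-0.4, 0.7]]

final_mask = filter_mask(Z)
W_final, b_final = hard_prune(W, b, final_mask)
print(final_mask, W_final, b_final)
```

After pruning training the weights of the pruned filter are already close to zero, so removing them changes the layer's output very little.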
[0201] In addition, the pruned task processing model can be fine-tuned (without applying ADMM constraints) to obtain the final pruned task processing model. This solves the problem of excessively large AI models and slow inference speed in actual business applications, making it easier to place the server-side model on the terminal, accelerating inference speed, reducing model power consumption, and improving user experience.
[0202] The technical solution provided in this disclosure can effectively reduce the number of parameters in the target network layer by pruning the parameters in the task processing model, thereby improving the model pruning efficiency. By calculating the model loss, it can be determined whether the model under the current model parameters meets the preset loss condition. Based on the pruned parameters corresponding to the model loss meeting the preset loss condition, the pruned task processing model is determined. This effectively reduces the number of parameters in the task processing model while ensuring the model accuracy.
[0203] After obtaining the pruned task processing model, it can be deployed and applied. The data to be detected, such as images, videos, text, or audio, is input into the task processing model, and corresponding preset processing, such as object detection and classification, is performed to obtain the detection results. Examples include the classification results and object recognition results corresponding to the data to be detected mentioned above.
[0204] For example, the first task processing model is an image classification model used to identify whether an image belongs to a preset category. However, due to the large number of parameters, it is difficult to deploy on the terminal. The above method can be used to prune it to obtain a pruned image classification model, which reduces the number of parameters of the image classification model and reduces the operating pressure on the terminal device.
[0205] In summary, the technical solution provided by the embodiments of this disclosure can characterize the data transmission dependencies between network layers in the task processing model by determining the hierarchical dependency information corresponding to the task processing model. Based on the hierarchical dependency information and the obtained preset pruning parameter configuration information, the pruning parameter information can be automatically determined, and the task processing model can be automatically pruned according to the pruning parameter information to obtain the pruned task processing model. This effectively reduces the number of parameters in the task processing model and the operating pressure on the device, and improves the model pruning efficiency.
[0206] Figure 6 is a block diagram of a task processing model generation apparatus according to an exemplary embodiment. Referring to Figure 6, the apparatus 600 includes:
[0207] The sample data acquisition module 610 is configured to acquire sample multimedia data.
[0208] The configuration information acquisition module 620 is configured to acquire hierarchical dependency information corresponding to the first task processing model to be pruned, and acquire pruning parameter configuration information; the hierarchical dependency information represents the data transmission dependency relationship between at least two neural network layers in the first task processing model.
[0209] The pruning parameter determination module 630 is configured to perform pruning training processing on the first task processing model based on the sample multimedia data, the pruning parameter configuration information and the hierarchical dependency information, and determine the target pruning parameter information.
[0210] The model pruning module 640 is configured to perform pruning processing on the first task processing model based on the target pruning parameter information to obtain a pruned task processing model. The pruned task processing model is used to process the multimedia data to be processed and generate task processing results.
[0211] In some possible designs, the pruning parameter determination module includes:
[0212] The parameter initialization submodule is configured to initialize pruning parameter information based on the pruning parameter configuration information and the hierarchical dependency information.
[0213] The pruning training module is configured to perform pruning training on the model parameters in the first task processing model based on the sample multimedia data and the pruning parameter information, so as to obtain the pruned second task processing model and the model loss information corresponding to the second task processing model.
[0214] The pruning parameter determination submodule is configured to update the pruning parameter information based on the model loss information to obtain the target pruning parameter information.
[0215] In some possible designs, the pruning training module includes:
[0216] The target network layer pruning unit is configured to perform pruning processing on the target neural network layer in the first task processing model based on the pruning parameter information to obtain the second task processing model.
[0217] The model loss information determination unit is configured to input the sample multimedia data into the second task processing model for preset processing to obtain the model loss information.
[0218] In some possible designs, the target network layer pruning unit includes:
[0219] The weight parameter determination subunit is configured to determine the weight parameter information corresponding to the target neural network layer.
[0220] The weight parameter pruning subunit is configured to perform pruning processing on the weight parameter information based on the pruning parameter information to obtain the pruned parameter information corresponding to the target neural network layer.
[0221] The model update subunit is configured to update the first task processing model based on the pruned parameter information to obtain the second task processing model.
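The three subunits above can be illustrated with a small sketch. It assumes magnitude-based pruning (zeroing the weights with the smallest absolute value) and a model stored as a plain dictionary from layer name to a flat weight list; neither choice is stated in this disclosure, and the names `prune_weights` and `update_model` are hypothetical.

```python
from typing import Dict, List


def prune_weights(weights: List[float], sparsity: float) -> List[float]:
    """Zero the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    dropped = set(order[:n_prune])
    return [0.0 if i in dropped else w for i, w in enumerate(weights)]


def update_model(model: Dict[str, List[float]], layer: str, sparsity: float) -> Dict[str, List[float]]:
    """Return a second model whose target layer holds the pruned parameters."""
    second_model = dict(model)  # the first model is left unchanged
    second_model[layer] = prune_weights(model[layer], sparsity)
    return second_model


first_model = {"conv1": [0.5, -0.1, 0.05, -0.9]}
second_model = update_model(first_model, "conv1", sparsity=0.5)
```

Keeping the first model untouched makes it possible to compare the original and pruned weights afterwards, for example when computing a pruning loss.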
[0222] In some possible designs, the model loss information determination unit includes:
[0223] The sample detection subunit is configured to input the sample multimedia data into the second task processing model for preset processing to obtain the detection result corresponding to the sample multimedia data.
[0224] The label data acquisition subunit is configured to acquire the label data corresponding to the sample multimedia data.
[0225] The detection loss determination subunit is configured to compare the detection result with the label data to obtain the detection loss information corresponding to the second task processing model.
[0226] The pruning loss determination subunit is configured to determine the pruning loss information corresponding to the second task processing model based on the weight parameter information and the pruned parameter information.
[0227] The model loss determination subunit is configured to determine the model loss information based on the detection loss information and the pruning loss information.
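As an illustration of how the two loss terms might be combined, the sketch below uses a mean squared error for the detection loss, a squared difference between the original and pruned weights for the pruning loss, and a weighted sum for the model loss. These specific formulas and the weighting factor `alpha` are assumptions; this disclosure only states that the model loss is determined from both terms.

```python
from typing import List


def detection_loss(results: List[float], labels: List[float]) -> float:
    """Mean squared error between detection results and label data (assumed form)."""
    return sum((r - y) ** 2 for r, y in zip(results, labels)) / len(labels)


def pruning_loss(weights: List[float], pruned: List[float]) -> float:
    """Squared distance between original and pruned weights (assumed form)."""
    return sum((w - q) ** 2 for w, q in zip(weights, pruned))


def model_loss(
    results: List[float],
    labels: List[float],
    weights: List[float],
    pruned: List[float],
    alpha: float = 0.5,  # hypothetical weighting factor
) -> float:
    """Combine both terms as a weighted sum (one possible combination)."""
    return detection_loss(results, labels) + alpha * pruning_loss(weights, pruned)
```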
[0228] In some possible designs, the device further includes:
[0229] The network layer lookup module is configured to determine, based on the pruning parameter configuration information and the hierarchical dependency information, the first neural network layer to be pruned in the first task processing model and the second neural network layer corresponding to the first neural network layer, wherein there is a data transmission dependency between the second neural network layer and the first neural network layer.
[0230] The target network layer determination module is configured to determine the first neural network layer and the second neural network layer as the target neural network layer.
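The lookup described above can be sketched as follows. The sketch assumes that the configuration lists the layers to prune by name and that the hierarchical dependency information is a dictionary mapping each layer to the layers that depend on its output; both representations and the function name `find_target_layers` are hypothetical.

```python
from typing import Dict, List


def find_target_layers(layers_to_prune: List[str], dependencies: Dict[str, List[str]]) -> List[str]:
    """Collect each configured layer together with the layers that depend on it."""
    targets: List[str] = []
    for first_layer in layers_to_prune:
        # The second layers are those with a data transmission dependency on the first.
        for name in [first_layer, *dependencies.get(first_layer, [])]:
            if name not in targets:
                targets.append(name)
    return targets


dependencies = {"conv1": ["bn1", "conv2"], "conv2": []}
target_layers = find_target_layers(["conv1"], dependencies)
```

Pruning the dependent layers together with the configured layer keeps the tensor shapes passed between them consistent.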
[0231] In some possible designs, the pruning parameter configuration information includes a pruning method identifier, and the pruning training module further includes:
[0232] The pruning rule determination submodule is configured to determine the pruning processing rule information corresponding to the pruning method identifier.
[0233] The target network layer pruning unit is configured to perform pruning processing on the target neural network layer in the first task processing model based on the pruning processing rule information and the pruning parameter information, so as to obtain the second task processing model.
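One way to map a pruning method identifier to a pruning rule is a lookup table of scoring functions, sketched below. The identifiers `l1_norm` and `l2_norm`, the channel-level granularity, and the function `prune_channels` are assumptions chosen for illustration and are not taken from this disclosure.

```python
import math
from typing import Callable, Dict, List

Channel = List[float]

# Hypothetical rule table: identifier -> importance score of one channel.
PRUNING_RULES: Dict[str, Callable[[Channel], float]] = {
    "l1_norm": lambda ch: sum(abs(w) for w in ch),
    "l2_norm": lambda ch: math.sqrt(sum(w * w for w in ch)),
}


def prune_channels(channels: List[Channel], method_id: str, sparsity: float) -> List[Channel]:
    """Remove the lowest-scoring channels according to the selected rule."""
    score = PRUNING_RULES[method_id]  # raises KeyError for an unknown identifier
    n_prune = int(len(channels) * sparsity)
    order = sorted(range(len(channels)), key=lambda i: score(channels[i]))
    dropped = set(order[:n_prune])
    return [ch for i, ch in enumerate(channels) if i not in dropped]
```

With the channels `[[3.0, 0.0], [2.0, 2.0]]` and a sparsity of 0.5, the two rules remove different channels, which shows why the identifier needs to be part of the configuration.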
[0234] In some possible designs, the configuration information acquisition module includes:
[0235] The model file acquisition submodule is configured to acquire the model training configuration file corresponding to the first task processing model, wherein the model training configuration file includes the inserted pruning configuration text.
[0236] The pruning configuration acquisition submodule is configured to read the pruning configuration text and obtain the pruning parameter configuration information.
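Reading pruning configuration text that has been inserted into a training configuration file might look like the sketch below. The JSON format, the `pruning` section name, and every key inside it are hypothetical; this disclosure does not specify the file format.

```python
import json
from typing import Any, Dict

# Hypothetical training configuration with an inserted "pruning" section.
TRAINING_CONFIG_TEXT = """
{
  "model": "first_task_processing_model",
  "epochs": 10,
  "pruning": {
    "method": "l1_norm",
    "initial_sparsity": 0.1,
    "layers": ["conv1"]
  }
}
"""


def read_pruning_config(config_text: str, section: str = "pruning") -> Dict[str, Any]:
    """Parse the training configuration and return only the pruning section."""
    config = json.loads(config_text)
    if section not in config:
        raise KeyError(f"no '{section}' section in the training configuration")
    return config[section]


pruning_config = read_pruning_config(TRAINING_CONFIG_TEXT)
```

Keeping the pruning settings in the same file as the training settings means pruning can be enabled without changing the training code itself.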
[0237] In some possible designs, the configuration information acquisition module further includes:
[0238] The input / output information determination submodule is configured to determine the input information and output information corresponding to each of the at least two neural network layers.
[0239] The connection relationship information determination submodule is configured to determine the connection relationship information between the at least two neural network layers.
[0240] The hierarchical dependency determination submodule is configured to determine the hierarchical dependency information based on the connection relationship information, the input information, and the output information.
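A minimal sketch of deriving hierarchical dependency information from these three inputs is given below. It assumes each layer's input and output information is a channel count and that the connection relationship information is a list of (source, destination) pairs; a dependency is recorded when a connected layer consumes exactly the channels its predecessor produces. These representations and the function name `build_dependencies` are assumptions.

```python
from typing import Dict, List, Tuple


def build_dependencies(
    io_info: Dict[str, Dict[str, int]],
    connections: List[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Map each layer to the connected layers whose input matches its output."""
    dependencies: Dict[str, List[str]] = {name: [] for name in io_info}
    for src, dst in connections:
        # Pruning output channels of src changes the input that dst receives.
        if io_info[src]["out"] == io_info[dst]["in"]:
            dependencies[src].append(dst)
    return dependencies


io_info = {
    "conv1": {"in": 3, "out": 16},
    "bn1": {"in": 16, "out": 16},
    "conv2": {"in": 16, "out": 32},
}
connections = [("conv1", "bn1"), ("bn1", "conv2")]
layer_dependencies = build_dependencies(io_info, connections)
```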
[0241] In summary, the technical solution provided by the embodiments of this disclosure can characterize the data transmission dependencies between network layers in the task processing model by determining the hierarchical dependency information corresponding to the task processing model. Based on the hierarchical dependency information and the obtained preset pruning parameter configuration information, the pruning parameter information can be automatically determined, and the task processing model can be automatically pruned according to the pruning parameter information to obtain the pruned task processing model. This effectively reduces the number of parameters in the task processing model and the operating pressure on the device, and improves the model pruning efficiency.
[0242] It should be noted that the apparatus provided in the above embodiments is only illustrated by the division of the above functional modules when implementing its functions. In actual applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus and method embodiments provided in the above embodiments belong to the same concept, and the specific implementation process can be found in the method embodiments, which will not be repeated here.
[0243] Figure 7 is a block diagram illustrating an electronic device for model pruning according to an exemplary embodiment. The electronic device may be a terminal, and its internal structure may be as shown in Figure 7. The electronic device includes a processor, a memory, a network interface, a display screen, and an input device connected via a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and internal memory. The non-volatile storage medium stores the operating system and computer programs. The internal memory provides an environment for running the operating system and computer programs stored in the non-volatile storage medium. The network interface is used to communicate with external terminals via a network connection. When the computer program is executed by the processor, it implements a method for generating a task processing model. The display screen can be a liquid crystal display (LCD) or an e-ink display. The input device can be a touch layer covering the display screen; a button, trackball, or touchpad mounted on the device's casing; or an external keyboard, touchpad, or mouse.
[0244] Those skilled in the art will understand that the structure shown in Figure 7 is merely a block diagram of the portion of the structure related to the present disclosure and does not constitute a limitation on the electronic device to which the present disclosure is applied. A specific electronic device may include more or fewer components than those shown in the figure, combine certain components, or have a different arrangement of components.
[0245] In an exemplary embodiment, an electronic device is also provided, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to execute the instructions to implement a method for generating a task processing model as described in the embodiments of this disclosure.
[0246] In an exemplary embodiment, a computer-readable storage medium is also provided, wherein when the instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the task processing model generation method of the embodiments of this disclosure.
[0247] In an exemplary embodiment, a computer program product containing instructions is also provided, which, when run on a computer, causes the computer to execute the method for generating the task processing model in the embodiments of this disclosure.
[0248] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. This computer program can be stored in a non-volatile computer-readable storage medium. When executed, the computer program can carry out the processes of the above method embodiments. Any reference to memory, storage, databases, or other media used in the embodiments provided in this application can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
[0249] Other embodiments of this disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of this disclosure that follow the general principles of this disclosure and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of this disclosure are indicated by the following claims.
[0250] It should be understood that this disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of this disclosure is limited only by the appended claims.
Claims
1. A method for generating a task processing model, characterized in that the method comprises: acquiring sample multimedia data; obtaining hierarchical dependency information corresponding to a first task processing model to be pruned, and obtaining pruning parameter configuration information, wherein the hierarchical dependency information represents the data transmission dependency between at least two neural network layers in the first task processing model; performing pruning training on the first task processing model based on the sample multimedia data, the pruning parameter configuration information, and the hierarchical dependency information to determine target pruning parameter information; and pruning the first task processing model based on the target pruning parameter information to obtain a pruned task processing model, wherein the pruned task processing model is used to process multimedia data to be processed and generate task processing results; wherein the step of performing pruning training on the first task processing model based on the sample multimedia data, the pruning parameter configuration information, and the hierarchical dependency information to determine the target pruning parameter information comprises: initializing pruning parameter information based on the pruning parameter configuration information and the hierarchical dependency information; performing pruning training on the model parameters in the first task processing model based on the sample multimedia data and the pruning parameter information to obtain a pruned second task processing model and model loss information corresponding to the second task processing model; and updating the pruning parameter information based on the model loss information to obtain the target pruning parameter information.
2. The method according to claim 1, characterized in that the step of performing pruning training on the model parameters in the first task processing model based on the sample multimedia data and the pruning parameter information to obtain the pruned second task processing model and the model loss information corresponding to the second task processing model comprises: pruning a target neural network layer in the first task processing model based on the pruning parameter information to obtain the second task processing model; and inputting the sample multimedia data into the second task processing model for preset processing to obtain the model loss information.
3. The method according to claim 2, characterized in that the step of pruning the target neural network layer in the first task processing model based on the pruning parameter information to obtain the second task processing model comprises: determining weight parameter information corresponding to the target neural network layer; pruning the weight parameter information based on the pruning parameter information to obtain pruned parameter information corresponding to the target neural network layer; and updating the first task processing model based on the pruned parameter information to obtain the second task processing model.
4. The method according to claim 3, characterized in that the step of inputting the sample multimedia data into the second task processing model for preset processing to obtain the model loss information comprises: inputting the sample multimedia data into the second task processing model for preset processing to obtain a detection result corresponding to the sample multimedia data; obtaining label data corresponding to the sample multimedia data; comparing the detection result with the label data to obtain detection loss information corresponding to the second task processing model; determining pruning loss information corresponding to the second task processing model based on the weight parameter information and the pruned parameter information; and determining the model loss information based on the detection loss information and the pruning loss information.
5. The method according to claim 2, characterized in that, before pruning the target neural network layer in the first task processing model based on the pruning parameter information to obtain the second task processing model, the method further comprises: determining, based on the pruning parameter configuration information and the hierarchical dependency information, a first neural network layer to be pruned in the first task processing model and a second neural network layer corresponding to the first neural network layer, wherein there is a data transmission dependency between the second neural network layer and the first neural network layer; and determining the first neural network layer and the second neural network layer as the target neural network layer.
6. The method according to claim 2, characterized in that the pruning parameter configuration information includes a pruning method identifier, and the step of pruning the target neural network layer in the first task processing model based on the pruning parameter information to obtain the second task processing model comprises: determining pruning processing rule information corresponding to the pruning method identifier; and pruning the target neural network layer in the first task processing model based on the pruning processing rule information and the pruning parameter information to obtain the second task processing model.
7. The method according to any one of claims 1 to 6, characterized in that the step of obtaining the pruning parameter configuration information comprises: obtaining a model training configuration file corresponding to the first task processing model, wherein the model training configuration file includes inserted pruning configuration text; and reading the pruning configuration text to obtain the pruning parameter configuration information.
8. The method according to any one of claims 1 to 6, characterized in that the step of obtaining the hierarchical dependency information corresponding to the first task processing model to be pruned comprises: determining input information and output information corresponding to each of the at least two neural network layers; determining connection relationship information between the at least two neural network layers; and determining the hierarchical dependency information based on the connection relationship information, the input information, and the output information.
9. A device for generating a task processing model, characterized in that the device comprises: a sample data acquisition module configured to acquire sample multimedia data; a configuration information acquisition module configured to acquire hierarchical dependency information corresponding to a first task processing model to be pruned, and to acquire pruning parameter configuration information, wherein the hierarchical dependency information represents the data transmission dependency between at least two neural network layers in the first task processing model; a pruning parameter determination module configured to perform pruning training on the first task processing model based on the sample multimedia data, the pruning parameter configuration information, and the hierarchical dependency information to determine target pruning parameter information; and a model pruning module configured to prune the first task processing model based on the target pruning parameter information to obtain a pruned task processing model, wherein the pruned task processing model is used to process multimedia data to be processed and generate task processing results; wherein performing pruning training on the first task processing model based on the sample multimedia data, the pruning parameter configuration information, and the hierarchical dependency information to determine the target pruning parameter information comprises: initializing pruning parameter information based on the pruning parameter configuration information and the hierarchical dependency information; performing pruning training on the model parameters in the first task processing model based on the sample multimedia data and the pruning parameter information to obtain a pruned second task processing model and model loss information corresponding to the second task processing model; and updating the pruning parameter information based on the model loss information to obtain the target pruning parameter information.
10. An electronic device, characterized in that it comprises: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions to implement the method for generating a task processing model according to any one of claims 1 to 8.
11. A computer-readable storage medium, characterized in that, when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the method for generating a task processing model according to any one of claims 1 to 8.