Model computation efficiency testing method, system, apparatus and device, and medium and product
By using a model computation efficiency test method, the performance of the model computation system is quantitatively evaluated, which solves the problem of the lack of performance test indicators in the existing technology and realizes the comparability evaluation and improvement guidance of the model computation system.
Patent Information
- Authority / Receiving Office: WO
- Patent Type: Applications
- Current Assignee / Owner: INSPUR SUZHOU INTELLIGENT TECH CO LTD
- Filing Date: 2025-12-11
- Publication Date: 2026-06-18
Figure CN2025141773_18062026_PF_FP_ABST
Abstract
Description
A method, system, apparatus, device, medium, and product for testing model computation efficiency.
[0001] Cross-reference to related applications
[0002] This application claims priority to Chinese Patent Application No. 202411823889.3, filed on December 11, 2024, entitled "A method, system, apparatus, device, medium and product for testing model computation efficiency", the entire contents of which are incorporated herein by reference.
Technical Field
[0003] This application relates to the field of artificial intelligence technology, and in particular to a method, system, apparatus, device, medium and product for testing model computation efficiency.
Background Technology
[0004] With the rapid development of artificial intelligence (AI) technology, AI models are exhibiting trends such as large data processing volumes and high computational complexity. The computational demands on hardware devices for model calculation tasks are increasing rapidly, requiring significant resource consumption. Currently, evaluation metrics for model computation performance are typically focused on the model itself, such as model accuracy. Related technologies lack evaluation metrics for the model computation system comprised of the hardware and software used to perform model calculations, as well as performance testing methods for such systems.
[0005] How to test and quantify the model computation performance of a model computation system is a technical problem that those skilled in the art need to solve.
Summary of the Invention
[0006] The purpose of this application is to provide a model computation efficiency testing method, system, apparatus, device, medium, and product for testing and evaluating the model computation performance of a model computation system.
[0007] To address the aforementioned technical problems, this application provides a model computation efficiency testing method, comprising:
[0008] Obtain the dataset;
[0009] Control the device under test to run the software framework under test to call the model under test and dataset to perform model calculation tasks;
[0010] Determine the model performance parameters and computational performance parameters based on the model computation task;
[0011] Determine the model computation efficiency of the model computation system under test based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the model under test, so as to obtain the test results of the model computation system under test;
[0012] where the model computation system under test includes the model under test, the software framework under test, and the device under test.
[0013] In one aspect, determining the model computation efficiency of the model computation system under test based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the model under test includes:
[0014] determining the model parameter efficiency of the model computation system under test based on the model performance parameters and the model parameter quantity evaluation parameters;
[0015] determining the model computation efficiency based on the model parameter efficiency and the computational performance parameters.
[0016] In another aspect, the greater the model parameter efficiency, the greater the model computation efficiency.
[0017] In another aspect, the model computation efficiency is determined based on the model parameter efficiency and the computational performance parameters, and is obtained by the following formula:
[0018] Model computation efficiency = Model parameter efficiency × Computational efficiency;
[0019] where the computational efficiency is the computational performance parameter.
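The product in [0018] can be sketched in Python as follows. This is a minimal illustration only; the function name and the non-negativity check are assumptions and do not come from the application.

```python
def model_computation_efficiency(parameter_efficiency: float,
                                 computational_efficiency: float) -> float:
    """Model computation efficiency as the product described in [0018].

    Both inputs are assumed to be non-negative, dimensionless values.
    """
    if parameter_efficiency < 0 or computational_efficiency < 0:
        raise ValueError("efficiencies must be non-negative")
    return parameter_efficiency * computational_efficiency
```

Because the result is a product of two non-negative factors, a larger model parameter efficiency yields a larger model computation efficiency for the same computational efficiency, which matches [0016].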
[0020] In another aspect, determining the model parameter efficiency of the model computation system under test based on the model performance parameters and the model parameter quantity evaluation parameters includes:
[0021] normalizing the model performance parameters and the model parameter quantity evaluation parameters respectively;
[0022] determining the model parameter efficiency based on the normalized model performance parameters, the normalized model parameter quantity evaluation parameters, and the computational performance parameters.
[0023] In another aspect, normalizing the model performance parameters includes:
[0024] determining the normalized model performance parameters based on the model performance parameters measured in the model computation task and the reference model performance parameters of the model under test.
[0025] In another aspect, the measured model performance parameter is the model accuracy of the model under test, and the reference model performance parameter is the reference accuracy of the model under test.
[0026] In another aspect, normalizing the model parameter quantity evaluation parameters includes:
[0027] determining the normalized model parameter quantity evaluation parameters based on the number of activated parameters and the number of reference parameters of the model under test;
[0028] where the number of activated parameters is the number of parameters of the model under test that participate in the computation during the model computation task.
[0029] In another aspect, the model parameter efficiency of the model computation system under test is determined based on the model performance parameters and the model parameter quantity evaluation parameters, and is obtained by the following formula:
[0030] where the model accuracy is the model performance parameter measured in the model computation task, the reference accuracy is the reference model performance parameter of the model under test, and the number of activated parameters is the number of parameters of the model under test that participate in the computation during the model computation task.
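One way to read the normalization in [0021] to [0028] is sketched below. The ratio form is an assumption made for illustration: accuracy is divided by the reference accuracy, the number of activated parameters is divided by the number of reference parameters, and the first ratio is divided by the second. The function and argument names are hypothetical.

```python
def model_parameter_efficiency(accuracy: float,
                               reference_accuracy: float,
                               activated_params: float,
                               reference_params: float) -> float:
    """Illustrative model parameter efficiency from normalized quantities.

    Assumed form (not confirmed by the text):
        (accuracy / reference_accuracy) / (activated_params / reference_params)
    """
    if min(reference_accuracy, activated_params, reference_params) <= 0:
        raise ValueError("reference values and parameter counts must be positive")
    normalized_accuracy = accuracy / reference_accuracy
    normalized_params = activated_params / reference_params
    return normalized_accuracy / normalized_params
```

Under this assumed form, a model that reaches the reference accuracy while activating half of the reference parameter count scores 2.0, so fewer activated parameters at equal accuracy are rewarded.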
[0031] In another aspect, determining the model computation efficiency of the model computation system under test based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters includes:
[0032] if the model performance parameters are within the range of the baseline model performance parameters, determining the model computation efficiency based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters;
[0033] if the model performance parameters are not within the range of the baseline model performance parameters, setting the model performance parameters to 0, and then determining the model computation efficiency based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters.
[0034] In another aspect, the model performance parameter is the model accuracy of the model under test, which is determined by the following formula:
Model accuracy = Measured model accuracy, if Measured model accuracy ≥ Baseline accuracy; Model accuracy = 0, otherwise;
[0035] where the measured model accuracy is the accuracy of the model under test measured in the model computation task, and the baseline accuracy is the minimum allowable accuracy of the model under test.
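The gating rule in [0032] to [0035] can be sketched as a small helper. The function name is hypothetical, and treating a measured accuracy exactly equal to the baseline as acceptable is an assumption based on the baseline being described as the minimum allowable accuracy.

```python
def gated_model_accuracy(measured_accuracy: float,
                         baseline_accuracy: float) -> float:
    """Return the measured accuracy, or 0 if it falls below the baseline.

    Assumption: equality with the baseline counts as within range.
    """
    if measured_accuracy >= baseline_accuracy:
        return measured_accuracy
    return 0.0
```

With this gate, a system that trades away accuracy below the allowed minimum cannot obtain a non-zero score simply by running faster.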
[0036] In another aspect, determining the computational performance parameters includes:
[0037] determining the computational performance parameters based on the computational performance parameters measured while the model computation system under test performs the model computation task and the theoretical peak performance parameters of the computing system of the device under test.
[0038] In another aspect, the computational performance parameters are determined based on the computational performance parameters measured while the model computation system under test performs the model computation task and the theoretical peak performance parameters of the computing system of the device under test, and are obtained by the following formula:
Computational efficiency = Measured computational performance / Theoretical peak performance of the computing system;
[0039] where the computational efficiency is the computational performance parameter.
[0040] In another aspect, the measured computational performance parameter is the number of floating-point operations per second that the device under test performs while executing the model computation task;
[0041] the theoretical peak performance parameter of the computing system is the theoretical peak number of floating-point operations per second of the device under test.
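A minimal sketch of the computational efficiency follows, assuming it is the ratio of measured floating-point operations per second to the theoretical peak of the device under test. Deriving the measured rate from a total operation count and an elapsed time is also an assumption, as are all names.

```python
def measured_flops(total_floating_point_ops: float,
                   elapsed_seconds: float) -> float:
    """Measured floating-point operations per second for one task run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return total_floating_point_ops / elapsed_seconds


def computational_efficiency(measured: float, theoretical_peak: float) -> float:
    """Assumed ratio of measured FLOPS to the theoretical peak FLOPS."""
    if theoretical_peak <= 0:
        raise ValueError("theoretical_peak must be positive")
    return measured / theoretical_peak
```

For example, a task that performs 2e15 floating-point operations in 10 seconds on a device with a 4e14 FLOPS peak gives a ratio of 0.5. Both values must refer to the same data type for the ratio to be meaningful.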
[0042] In another aspect, determining the measured computational performance parameters and the theoretical peak performance parameters of the computing system includes:
[0043] obtaining the theoretical peak performance parameters of the computing system of the device under test for a first data type;
[0044] if all actual data types in the model computation task are the first data type, using the computational performance parameters corresponding to the actual data types in the model computation task as the measured computational performance parameters;
[0045] if any actual data type in the model computation task is not the first data type, converting the computational performance parameters corresponding to that actual data type into equivalent computational performance parameters for the first data type to obtain the measured computational performance parameters.
[0046] In another aspect, the first data type is single-precision floating-point.
[0047] In another aspect, determining the measured computational performance parameters and the theoretical peak performance parameters of the computing system includes:
[0048] obtaining the theoretical peak performance parameters of the computing system of the device under test for a second data type;
[0049] if all actual data types in the model computation task are the second data type, using the computational performance parameters corresponding to the actual data types in the model computation task as the measured computational performance parameters;
[0050] if any actual data type in the model computation task is not the second data type, converting the computational performance parameters corresponding to that actual data type into the computational performance parameters corresponding to the second data type to obtain the measured computational performance parameters.
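The data-type handling in [0043] to [0050] can be sketched as below. The text does not state how the conversion is performed, so the per-type multiplicative factors, the type labels, and the function name are all hypothetical; the factors are supplied by the caller rather than fixed here.

```python
from typing import Mapping


def equivalent_measured_flops(flops_by_dtype: Mapping[str, float],
                              reference_dtype: str,
                              conversion_factors: Mapping[str, float]) -> float:
    """Express measured FLOPS in terms of a single reference data type.

    flops_by_dtype: measured FLOPS per actual data type used in the task.
    conversion_factors: hypothetical multipliers mapping each non-reference
        data type to reference-type-equivalent FLOPS (caller-supplied).
    """
    total = 0.0
    for dtype, flops in flops_by_dtype.items():
        if dtype == reference_dtype:
            total += flops  # already in the reference data type
        else:
            total += flops * conversion_factors[dtype]
    return total
```

If every operation already uses the reference data type, the function returns the measured value unchanged, which corresponds to [0044] and [0049].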
[0051] In another aspect, determining the model computation efficiency of the model computation system under test based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters, so as to obtain the test results of the model computation system under test, includes:
[0052] determining the overall model computation efficiency of the model computation system under test based on its model computation efficiency under multiple test scenarios;
[0053] using the overall model computation efficiency as the test result;
[0054] where each test scenario corresponds one-to-one with a dataset.
[0055] In another aspect, determining the model computation efficiency of the model computation system under test based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters, so as to obtain the test results of the model computation system under test, includes:
[0056] determining the overall model computation efficiency of the model computation system under test based on its model computation efficiency under multiple test scenarios;
[0057] using the overall model computation efficiency as the test result;
[0058] where one test scenario corresponds to one or more datasets.
[0059] In another aspect, determining the model computation efficiency for a single test scenario includes:
[0060] if the test scenario corresponds to one dataset, using the model computation efficiency corresponding to that dataset as the model computation efficiency of the test scenario;
[0061] if the test scenario corresponds to multiple datasets, determining the model computation efficiency of the test scenario based on the model computation efficiency of each dataset.
[0062] In another aspect, determining the model computation efficiency of the test scenario based on the model computation efficiency of each dataset corresponding to the test scenario includes:
[0063] using the arithmetic mean of the model computation efficiencies of the datasets as the model computation efficiency of the test scenario.
[0064] In another aspect, determining the model computation efficiency of the test scenario based on the model computation efficiency of each dataset corresponding to the test scenario includes:
[0065] using the geometric mean of the model computation efficiencies of the datasets as the model computation efficiency of the test scenario.
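The two per-scenario aggregation options in [0060] to [0065] can be sketched as follows. The function name and the `method` argument are illustrative choices, not part of the application.

```python
import math
from typing import Sequence


def scenario_efficiency(dataset_efficiencies: Sequence[float],
                        method: str = "arithmetic") -> float:
    """Combine per-dataset model computation efficiencies for one scenario."""
    if not dataset_efficiencies:
        raise ValueError("at least one dataset is required")
    if len(dataset_efficiencies) == 1:
        # A scenario with a single dataset takes that dataset's value.
        return dataset_efficiencies[0]
    n = len(dataset_efficiencies)
    if method == "arithmetic":
        return sum(dataset_efficiencies) / n
    if method == "geometric":
        return math.prod(dataset_efficiencies) ** (1.0 / n)
    raise ValueError(f"unknown method: {method!r}")
```

The geometric mean penalizes a scenario more strongly when one dataset scores poorly, and it becomes 0 if any dataset's efficiency is 0, whereas the arithmetic mean does not.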
[0066] In another aspect, determining the model computation efficiency of the test scenario based on the model computation efficiency of each dataset corresponding to the test scenario includes:
[0067] determining the single-scenario model performance parameters of the test scenario based on the model performance parameters corresponding to each dataset of the test scenario;
[0068] determining the single-scenario computational performance parameters of the test scenario based on the computational performance parameters corresponding to each dataset of the test scenario;
[0069] determining the model computation efficiency of the test scenario based on the single-scenario model performance parameters, the single-scenario computational performance parameters, and the model parameter quantity evaluation parameters.
[0070] In another aspect, determining the single-scenario model performance parameters of the test scenario based on the model performance parameters corresponding to each dataset of the test scenario includes: using the arithmetic mean of the model performance parameters corresponding to the datasets as the single-scenario model performance parameters;
[0071] determining the single-scenario computational performance parameters of the test scenario based on the computational performance parameters corresponding to each dataset of the test scenario includes: using the arithmetic mean of the computational performance parameters of the datasets as the single-scenario computational performance parameters.
[0072] In another aspect, determining the single-scenario model performance parameters of the test scenario based on the model performance parameters corresponding to each dataset of the test scenario includes: using the geometric mean of the model performance parameters corresponding to the datasets as the single-scenario model performance parameters;
[0073] determining the single-scenario computational performance parameters of the test scenario based on the computational performance parameters corresponding to each dataset of the test scenario includes: using the geometric mean of the computational performance parameters of the datasets as the single-scenario computational performance parameters.
[0074] In another aspect, determining the overall model computation efficiency of the model computation system under test based on its model computation efficiency in multiple test scenarios includes:
[0075] using the arithmetic mean of the model computation efficiencies corresponding to the multiple test scenarios as the overall model computation efficiency.
[0076] In another aspect, the arithmetic mean of the model computation efficiencies corresponding to the multiple test scenarios is used as the overall model computation efficiency, which is calculated by the following formula:
[0077] where $MCE_t$ represents the overall model computation efficiency, $MCE_i$ represents the model computation efficiency of the model computation system under test in the i-th test scenario, and $f_{1i}$ represents the first weight coefficient of the model computation system under test in the i-th test scenario.
[0078] In another aspect, determining the overall model computation efficiency of the model computation system under test based on its model computation efficiency in multiple test scenarios includes:
[0079] using the geometric mean of the model computation efficiencies corresponding to the multiple test scenarios as the overall model computation efficiency.
[0080] In another aspect, the geometric mean of the model computation efficiencies corresponding to the multiple test scenarios is used as the overall model computation efficiency, which is calculated by the following formula:
[0081] where $MCE_t$ represents the overall model computation efficiency, $MCE_i$ represents the model computation efficiency of the model computation system under test in the i-th test scenario, ∏(·) represents cumulative multiplication, and $f_{2i}$ represents the second weight coefficient of the model computation system under test in the i-th test scenario.
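The weighted aggregation across test scenarios can be sketched as follows. The exact formulas are assumptions: the arithmetic variant is taken to be a weighted sum of the per-scenario values, the geometric variant a product of per-scenario values raised to their weights, and the weights are assumed to sum to 1. Function names are hypothetical.

```python
import math
from typing import Sequence


def _check(values: Sequence[float], weights: Sequence[float]) -> None:
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and equal length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")


def overall_weighted_arithmetic(values: Sequence[float],
                                weights: Sequence[float]) -> float:
    """Assumed form: sum of weight_i * value_i over all scenarios."""
    _check(values, weights)
    return sum(w * v for v, w in zip(values, weights))


def overall_weighted_geometric(values: Sequence[float],
                               weights: Sequence[float]) -> float:
    """Assumed form: product of value_i ** weight_i over all scenarios."""
    _check(values, weights)
    return math.prod(v ** w for v, w in zip(values, weights))
```

With equal weights of 1/n, these reduce to the plain arithmetic and geometric means of the per-scenario efficiencies.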
[0082] In another aspect, controlling the device under test to run the software framework under test so as to call the model under test and the dataset to perform the model computation task includes:
[0083] loading the software framework under test onto the device under test and initializing it;
[0084] starting the software framework under test so that it calls the model under test and the dataset to perform the model computation task.
[0085] In another aspect, loading the software framework under test onto the device under test and initializing it includes:
[0086] loading the library files of the model computation system under test onto the device under test;
[0087] obtaining the types of the interface functions of the model computation system under test by calling the interface of the library files;
[0088] initializing each interface function according to its type.
[0089] In another aspect, the types of interface functions include at least: initialization functions, completion functions, data loading functions, data unloading functions, and model computation control functions;
[0090] the initialization function is the interface function that performs initialization based on the test configuration information; the completion function is the interface function executed after the model computation system under test has completed the model computation task; the data loading function is the interface function that loads the model computation data; the data unloading function is the interface function that unloads the model computation data; and the model computation control function is the interface function that performs the model computation task.
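The lookup of interface functions by type, as described in [0086] to [0090], might be sketched like this. The enum member names, the attribute names looked up on the library object, and the use of `getattr` are assumptions for illustration; an actual implementation would depend on how the library files expose their interface.

```python
from enum import Enum
from typing import Callable, Dict


class InterfaceType(Enum):
    # Hypothetical attribute names for the five interface function types.
    INITIALIZE = "initialize"
    COMPLETE = "complete"
    LOAD_DATA = "load_data"
    UNLOAD_DATA = "unload_data"
    RUN_MODEL = "run_model"


def bind_interface(library: object) -> Dict[InterfaceType, Callable]:
    """Collect one callable per interface type from a loaded library object."""
    bound: Dict[InterfaceType, Callable] = {}
    for kind in InterfaceType:
        func = getattr(library, kind.value, None)
        if not callable(func):
            raise AttributeError(f"library does not expose {kind.value!r}")
        bound[kind] = func
    return bound
```

Failing early when a required function type is missing keeps the later test stages from running against a partially bound framework.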
[0091] In another aspect, starting the software framework under test to call the model under test to perform the model computation task includes:
[0092] starting the software framework under test to perform the model computation task;
[0093] obtaining the first result returned after the software framework under test calls the data loading function to load the model computation data;
[0094] obtaining the second result returned after the software framework under test calls the model computation control function to execute the model computation task;
[0095] waiting for the third result output after the software framework under test has completed the model computation task;
[0096] obtaining the fourth result returned after the software framework under test calls the data unloading function to unload the model computation data;
[0097] and calculating the model accuracy of the model under test based on the execution results of the model computation task includes:
[0098] calculating the model accuracy based on at least one of the first, second, third, and fourth results.
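The call sequence in [0092] to [0098] can be sketched as a small driver. The callables stand in for the framework's interface functions, and the assumption that the third result carries the predictions used for the accuracy calculation is purely illustrative; the text only requires that at least one of the four results be used.

```python
from typing import Any, Callable, Dict, Sequence


def run_model_task(load_data: Callable[[], Any],
                   run_model: Callable[[], Any],
                   wait_done: Callable[[], Sequence[int]],
                   unload_data: Callable[[], Any],
                   labels: Sequence[int]) -> Dict[str, Any]:
    """Run the four stages in order and derive an accuracy value."""
    first = load_data()      # result of loading the model computation data
    second = run_model()     # result of starting the model computation task
    third = wait_done()      # output once the task has completed
    fourth = unload_data()   # result of unloading the model computation data
    # Assumption: the third result is a sequence of predicted labels.
    correct = sum(1 for p, y in zip(third, labels) if p == y)
    accuracy = correct / len(labels) if labels else 0.0
    return {"results": (first, second, third, fourth), "accuracy": accuracy}
```

A usage example with stub callables: `run_model_task(lambda: "loaded", lambda: "started", lambda: [1, 0, 1, 1], lambda: "unloaded", [1, 1, 1, 1])` yields an accuracy of 0.75.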
[0099] To address the aforementioned technical problems, this application also provides a model computation efficiency testing system, including a test device and a device under test;
[0100] the device under test is used to run the software framework under test to perform model computation tasks based on the model under test and the dataset;
[0101] the test device is used to obtain the dataset and control the device under test to run the software framework under test to call the model under test and the dataset to perform model computation tasks; determine the model performance parameters and computational performance parameters based on the model computation task; and determine the model computation efficiency of the model computation system under test based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the model under test, so as to obtain the test results of the model computation system under test;
[0102] where the model computation system under test includes the model under test, the software framework under test, and the device under test.
[0103] To address the aforementioned technical problems, this application also provides a model computation efficiency testing apparatus, comprising:
[0104] an acquisition unit, used to obtain the dataset;
[0105] a control unit, used to control the device under test to run the software framework under test to call the model under test and the dataset to perform model computation tasks;
[0106] a determination unit, used to determine the model performance parameters and computational performance parameters based on the model computation task, and to determine the model computation efficiency of the model computation system under test based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the model under test, so as to obtain the test results of the model computation system under test;
[0107] where the model computation system under test includes the model under test, the software framework under test, and the device under test.
[0108] To address the aforementioned technical problems, this application also provides a model computation efficiency testing device, comprising:
[0109] a memory, used to store a computer program;
[0110] a processor, used to execute the computer program, where the computer program, when executed by the processor, implements the steps of any of the above-described model computation efficiency testing methods.
[0111] To address the aforementioned technical problems, this application also provides a non-volatile readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the above-described model computation efficiency testing methods.
[0112] To address the aforementioned technical problems, this application also provides a computer program product, including a computer program that, when executed by a processor, implements the steps of any of the above-described model computation efficiency testing methods.
[0113] The model computation efficiency testing method provided in this application has the following beneficial effects. After the software framework under test is controlled to call the model under test and execute the model computation task on the device under test according to the dataset, the model performance parameters and computational performance parameters are determined from the model computation task, which quantifies both the performance of the model under test and the performance of the hardware and software architecture. The model computation efficiency of the model computation system under test is then determined based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the model under test, yielding the test results of the model computation system under test. This provides a quantitative evaluation of the model computation performance of the model computation system under test, which comprises the model under test, the software framework under test, and the device under test. As an evaluation scheme applicable to various model computation tasks, the model computation efficiency metric and its testing method improve the comparability of different models running on different devices, provide a reference for improving models and model computation systems, and enable quantitative evaluation of both the accuracy and the multi-dimensional computing power performance of different models, thereby offering guidance for improving model capabilities and reducing application costs.
[0114] The model computation efficiency testing system, apparatus, device, non-volatile readable storage medium, and computer program product provided in this application have the aforementioned beneficial effects, which will not be repeated here.
Brief Description of the Drawings
[0115] To more clearly illustrate the technical solutions of the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0116] Figure 1 is a flowchart of a model computation efficiency testing method provided in an embodiment of this application;
[0117] Figure 2 is a timing diagram of a model computation efficiency testing method provided in an embodiment of this application;
[0118] Figure 3 is a schematic diagram of a model computation efficiency testing system provided in an embodiment of this application;
[0119] Figure 4 is a schematic diagram of the structure of a model computation efficiency testing device provided in an embodiment of this application.
Detailed Description
[0120] The core of this application is to provide a model computation efficiency testing method, system, apparatus, device, medium, and product for testing and evaluating the model computation performance of a model computation system.
[0121] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0122] To facilitate understanding of the technical solutions provided in the embodiments of this application, some key terms used in the embodiments of this application will be explained here first.
[0123] Computing power: Used to measure the ability of a computing system (such as a computer, server, data center, etc.) to process information and perform calculations. The most commonly used unit is floating point operations per second (FLOPS).
[0124] Model computation efficiency: A metric for evaluating the efficiency of large models, reflecting the overall efficiency of the model under test on a given hardware system and software framework. The objects of evaluation are the model itself, the software framework under test, and the AI bare-metal server under test (the computing power base). It is a comprehensive evaluation of these three types of performance factors.
[0125] Floating point operations per second (FLOPS) is a metric that measures the processing power of a computer system or computing device. It refers to the number of floating-point operations (addition, subtraction, multiplication, division, etc.) a system or device can perform per unit of time.
[0126] Actual computational performance (unit can be TFLOPS, Tera Floating-point Operations Per Second): the counted number of floating-point operations, such as addition, subtraction, multiplication, and division, that the model performs per unit time during inference or training.
[0127] Actual computational total (unit can be TFLOPs, tera floating-point operations): the total number of floating-point operations, such as addition, subtraction, multiplication, and division, actually performed by the model during training or inference.
[0128] Theoretical peak performance of the computing system (unit can be TFLOPS): the theoretical value calculated from the characteristics of the chip hardware itself, such as the clock frequency and number of cores, and provided by the corresponding chip manufacturer; the theoretical peak performance for single-precision floating point can be used as the basis.
[0129] Figure 1 is a flowchart of a model computation efficiency testing method provided in an embodiment of this application.
[0130] As shown in Figure 1, the model computation efficiency testing method provided in this application includes:
[0131] S101: Obtain the dataset;
[0132] S102: Control the device under test to run the software framework under test to call the model under test and dataset to perform model calculation tasks;
[0133] S103: Determine the model performance parameters and computational performance parameters based on the model computation task;
[0134] S104: Determine the model computation efficiency of the model computation system under test based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the model under test, so as to obtain the test results of the model computation system under test.
[0135] The model computation system under test includes the model under test, the software framework under test, and the device under test.
[0136] To address the lack of evaluation schemes for model computation efficiency in related technologies, this application proposes model computation efficiency as an evaluation index. The model computation efficiency provided in this application can serve as a unified metric for model computation systems that use different models, software frameworks, and devices, improving the comparability of model computations performed by different devices and providing guidance for improving models, software frameworks, and hardware.
[0137] Once the model under test, the software framework under test, and the device under test included in the model computation system under test are determined, the model computation efficiency testing method provided in the embodiments of this application can be executed.
[0138] The model computation efficiency testing method provided in this application embodiment can be implemented in software by designing and deploying test scripts, and can be executed in hardware on the device under test or on a separate test device.
[0139] When performing tests on the computational system of the model under test, the test environment requirements may include:
[0140] The ambient temperature was (25±5)℃ (Degree Celsius), the relative humidity was 45%~75%, and the atmospheric pressure was 86kPa~106kPa (kilopascal).
[0141] For devices under test with a nominal power of less than 1.5 kW (kiloWatt), the test power supply should be AC (220±1%) V (Volt); for devices under test with a nominal power of more than 1.5 kW, the test power supply should be AC (220±4%) V.
[0142] For devices under test with a nominal power of less than 1.5kW, the total harmonic distortion of the test power supply should not exceed 2%; for devices under test with a nominal power of more than 1.5kW, the total harmonic distortion of the test power supply should not exceed 5%.
[0143] The power input frequency of all tested devices should be (50±1.0%) Hz (Hertz).
[0144] The testing environment should avoid interference factors such as strong magnetic fields and strong vibrations.
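The environment requirements above can be expressed as a simple pre-test check. The following is a minimal sketch and not part of the application: the names `EnvironmentReading` and `check_environment` are hypothetical, and a nominal power of exactly 1.5 kW is treated as belonging to the higher-power class, which is an assumption because the text does not state which limits apply at that boundary.

```python
from dataclasses import dataclass


@dataclass
class EnvironmentReading:
    """Hypothetical container for measured test-environment values."""
    temperature_c: float      # ambient temperature, degrees Celsius
    humidity_pct: float       # relative humidity, percent
    pressure_kpa: float       # atmospheric pressure, kPa
    supply_voltage_v: float   # AC supply voltage, volts
    thd_pct: float            # total harmonic distortion of the supply, percent
    frequency_hz: float       # supply input frequency, Hz


def check_environment(env, nominal_power_kw):
    """Return descriptions of violated requirements (empty list if none)."""
    # Assumption: a nominal power of exactly 1.5 kW uses the higher-power limits.
    low_power = nominal_power_kw < 1.5
    voltage_tolerance = 0.01 if low_power else 0.04
    thd_limit_pct = 2.0 if low_power else 5.0

    problems = []
    if not 20.0 <= env.temperature_c <= 30.0:
        problems.append("ambient temperature outside (25±5) ℃")
    if not 45.0 <= env.humidity_pct <= 75.0:
        problems.append("relative humidity outside 45%~75%")
    if not 86.0 <= env.pressure_kpa <= 106.0:
        problems.append("atmospheric pressure outside 86 kPa~106 kPa")
    if abs(env.supply_voltage_v - 220.0) > 220.0 * voltage_tolerance:
        problems.append("supply voltage outside tolerance")
    if env.thd_pct > thd_limit_pct:
        problems.append("total harmonic distortion too high")
    if abs(env.frequency_hz - 50.0) > 50.0 * 0.01:
        problems.append("supply frequency outside (50±1.0%) Hz")
    return problems
```

A test script could call such a check before starting the model computation task and refuse to record results while the returned list is non-empty.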
[0145] Requirements for the device being tested may include:
[0146] The device under test must include a complete set of components, including a central processing unit (CPU), memory, hard drive, and operating system.
[0147] The device under test should use the standard power supply, and all power supplies should be connected to AC power during the test.
[0148] The device under test should have all software options set to default. Power management and/or power-saving features should be tested only in the state in which they are enabled by default on the device under test.
[0149] The device under test should be tested using the operating system stated by the manufacturer.
[0150] It should be noted that the tested model in this application embodiment can be any type of model. From the perspectives of function, structure, and application domain, it can include, but is not limited to, supervised learning models (classification models, regression models, etc.), unsupervised learning models (clustering models, dimensionality reduction models, association rule learning models, etc.), reinforcement learning models (policy learning models, value learning models, policy optimization models, etc.), semi-supervised learning models (models trained by combining labeled and unlabeled data, such as self-trained models, etc.), ensemble learning models (combining multiple models to improve performance, such as Bagging (Bootstrap Aggregating), Boosting, Stacking (Stacked Generalization), etc.), deep learning models (convolutional neural networks, recurrent neural networks, long short-term memory networks, gated recurrent units, Transformer models, etc.), generative models (generative adversarial networks, variational autoencoders, etc.), interpretable models (decision trees, rule engines, etc.), probabilistic models (Bayesian networks, hidden Markov models, etc.), meta-learning models, etc.
[0151] The software framework under test in this application embodiment is software used to call the model under test to perform model calculation tasks based on the dataset.
[0152] In the tested model computing system of this application embodiment, the number of devices under test can be one or more. When there are multiple devices under test, they can be devices of the same type or heterogeneous computing devices. In some embodiments of this application, the device under test can be a server, which can be an artificial intelligence bare metal server (computing power base).
[0153] In the embodiments of this application, the model computation task can be either a model training task or a model inference task.
[0154] The model training task involves adjusting the model's parameters through optimization algorithms to enable it to accurately learn knowledge from the dataset and thus accurately process unknown data. In this embodiment, when there are multiple devices under test, the model training task can be either synchronous or asynchronous. In synchronous training, all devices under test simultaneously read the model parameters and perform calculations using the same parameters. After each iteration, all devices under test synchronously update their model parameters. In asynchronous training, different devices under test independently read the model parameters and perform calculations without waiting for other devices to complete; updates to the model parameters are shared asynchronously with other devices under test.
[0155] The model inference task involves processing unknown data using the tested model (a trained model). In this embodiment, when there are multiple devices under test, the model inference task can be a synchronous inference task or an asynchronous inference task. In a synchronous inference task, the device under test acquires the next set of data to be processed only after the current model inference is completed. At any given time only one inference request is executing, and other inference requests must wait for the current request to complete before starting; synchronous inference is therefore typically deployed in a linear, blocking manner. An asynchronous inference task allows inference to execute in parallel with other threaded tasks, thereby improving resource utilization. In asynchronous inference, while one set of data is being inferred by the model, the program immediately acquires the next set of data to be processed and preprocesses it, then waits for the previous inference calculation to complete before submitting the new data to the model for inference.
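The difference between the two inference modes can be illustrated with a minimal single-device sketch. It is not taken from the application: `preprocess` and `infer` are hypothetical placeholders for the real preprocessing and model inference steps, and a single worker thread stands in for the device executing inference.

```python
from concurrent.futures import ThreadPoolExecutor


def preprocess(raw):
    """Hypothetical placeholder for data preprocessing."""
    return [x * 2 for x in raw]


def infer(batch):
    """Hypothetical placeholder for one model inference call."""
    return sum(batch)


def synchronous_inference(batches):
    """Acquire and preprocess the next batch only after the current inference finishes."""
    return [infer(preprocess(raw)) for raw in batches]


def asynchronous_inference(batches):
    """Preprocess the next batch while the previous inference is still running."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = None  # future of the inference currently in flight
        for raw in batches:
            batch = preprocess(raw)               # overlaps with the in-flight inference
            if pending is not None:
                results.append(pending.result())  # wait for the previous inference
            pending = pool.submit(infer, batch)   # then submit the new batch
        if pending is not None:
            results.append(pending.result())
    return results
```

Both functions return the same results in the same order; only the overlap between preprocessing and inference differs, which is what affects resource utilization.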
[0156] In this embodiment, the required dataset is determined based on at least one of the tested model, the tested software framework, the tested device, and the model computation task. In practical applications, the same model may be capable of performing multiple application tasks. For example, a language model can be used to perform problem-solving tasks, code generation tasks, language understanding tasks, etc., and different application tasks correspond to different datasets. Therefore, in S101, the acquired dataset can be the dataset corresponding to the application task applicable to the tested model. Furthermore, depending on the data type used in the model computation task and the type of data to be processed, one application task can correspond to one or more datasets. For example, an image processing task can correspond to a cat image dataset and a dog image dataset. One application task can correspond to datasets of different data types, such as a dataset of single-precision floating-point data and a dataset of integer data.
[0157] In S102, the test script described above is used to control the device under test to run the software framework under test in order to call the model under test and the dataset to perform model calculation tasks.
[0158] After the tested model computing system completes the model computation task, the model computation efficiency of the tested model computing system is calculated.
[0159] The model computation efficiency index proposed in this application evaluates the performance of a model computing system from the perspectives of both the model itself and the hardware/software framework. It serves as a comparability indicator for different combinations of model computing systems and evaluates performance in terms of computing power, accuracy, and efficiency. The index can be further expressed as follows: the faster the model computation task executes, the higher the model computation efficiency; the fewer resources the model computation task consumes, the higher the model computation efficiency; and the higher the utilization of limited resources by the model computation task, the higher the model computation efficiency. Therefore, in this application embodiment, the model computation efficiency of the tested model computing system is determined based on the model performance parameters, the model parameter quantity evaluation parameters, and the computational performance parameters of the tested model computing system.
[0160] In this embodiment, model performance parameters are used to quantitatively represent the performance of the model under test. The performance of the model under test can be represented from the perspectives of the accuracy of the results, the reliability of the results, and the stability of the model. In practical applications, the type of model performance parameters is selected based on the requirements of the model's computational task and the characteristics of the data to be processed by the model under test.
[0161] In some embodiments of this application, the model performance parameter may be model accuracy. Model accuracy may include, but is not limited to, at least one of the following: accuracy, precision, recall, F1 score, receiver operating characteristic (ROC) curve and area under the curve (AUC), mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), logarithmic loss (Log Loss), average precision (AP) and mean average precision (mAP), and coefficient of determination (R²). In general, model accuracy is a key metric for measuring model performance, representing how correctly the model processes data (such as classification or prediction).
[0162] Accuracy is primarily used to measure classification tasks. For binary classification problems, accuracy can be simply expressed as the number of correctly classified samples divided by the total number of samples.
[0163] Precision measures the proportion of samples predicted as positive by the model that are actually positive. It is calculated by dividing the number of samples correctly predicted as positive (True Positives) by the number of samples that the model predicts as positive. The latter is the sum of the number of samples correctly predicted as positive (True Positives) and the number of samples that the model incorrectly predicts as positive (False Positives).
[0164] Recall measures the proportion of samples that are actually positive that the model correctly predicts as positive. It is calculated by dividing the number of samples correctly predicted as positive (True Positives) by the number of samples that are actually positive. The latter is the sum of the number of samples correctly predicted as positive (True Positives) and the number of samples incorrectly predicted as negative (False Negatives).
[0165] The F1 score is used to comprehensively consider the performance of precision and recall. It can be expressed as the harmonic mean of precision and recall, and is calculated as 2 × precision × recall / (precision + recall).
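The definitions of precision, recall, and F1 score above can be sketched as follows. The function name `precision_recall_f1` is hypothetical, and returning 0.0 when a denominator is zero is a convention assumed here, not something stated in the application.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall and F1 score from confusion-matrix counts."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives

    # Assumed convention: an undefined ratio (zero denominator) is reported as 0.0.
    precision = true_positives / predicted_positive if predicted_positive else 0.0
    recall = true_positives / actual_positive if actual_positive else 0.0

    if precision + recall == 0.0:
        f1 = 0.0
    else:
        # Harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, with 8 true positives, 2 false positives, and 8 false negatives, precision is 0.8 and recall is 0.5.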
[0166] The ROC curve is a graphical tool used to evaluate the performance of classification models. It visually demonstrates a model's classification performance by depicting the relationship between the True Positive Rate (TPR) and False Positive Rate (FPR) at different classification thresholds. When evaluating model performance, the area under the ROC curve (AUC) can serve as a comprehensive indicator, allowing for direct comparison of the performance of multiple models using a single value. AUC values range from 0 to 1, where 0.5 corresponds to random guessing and a higher AUC indicates better performance.
[0167] Mean squared error (MSE) is mainly used to measure the difference between the model's predicted values and the true values in regression tasks. The calculation formula is: $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$, where $y_i$ is the true value, $\hat{y}_i$ is the predicted value, and $n$ is the number of samples.
[0168] The root mean square error (RMSE) is the square root of the mean square error, providing the magnitude of the error in the same units as the original data. The calculation formula is: $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$.
[0169] Mean absolute error (MAE) is the average of the absolute values of the differences between the model's predicted values and the true values. The calculation formula is: $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$, where $y_i$ is the true value, $\hat{y}_i$ is the predicted value, and $n$ is the number of samples.
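The three regression error metrics described above can be sketched in a few lines. The function names are hypothetical, and the inputs are assumed to be equal-length, non-empty sequences of true and predicted values.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the MSE, in the same units as the original data."""
    return math.sqrt(mean_squared_error(y_true, y_pred))


def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

For true values [3.0, -0.5, 2.0, 7.0] and predictions [2.5, 0.0, 2.0, 8.0], the MSE is 0.375 and the MAE is 0.5.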
[0170] Log loss, also known as logistic loss or cross-entropy loss, is a loss function used in machine learning to evaluate the performance of classification models. It is particularly suitable for binary classification problems, but can also be extended to multi-class classification problems. Log loss measures the difference between the model's predicted probabilities and the actual labels.
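For the binary case, log loss is commonly written as the negative mean of y·log(p) + (1 − y)·log(1 − p), where y is the actual label (0 or 1) and p is the predicted probability of the positive class. The sketch below follows that standard definition; the function name `binary_log_loss` and the clipping constant `eps` are assumptions introduced here to avoid taking the logarithm of zero.

```python
import math


def binary_log_loss(y_true, p_pred, eps=1e-15):
    """Binary log loss: -(1/n) * sum(y*log(p) + (1-y)*log(1-p))."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Assumed safeguard: clip probabilities away from 0 and 1.
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)
```

A lower value indicates that the predicted probabilities are closer to the actual labels.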
[0171] Average precision (AP) and mean average precision (mAP) are important metrics for evaluating the performance of classification and object detection models, especially in information retrieval, image classification, and object detection. AP measures performance on each class. Specifically, AP is the area under the precision-recall curve. To calculate AP, precision and recall values are calculated independently for each class, then the precision-recall curve is plotted, and the area under the curve is calculated to obtain the AP value for that class. mAP is the average of the AP values across multiple classes and therefore takes the performance of all classes into account. In object detection tasks, mAP is a key metric for evaluating the model's performance across all classes: the AP value is calculated for each class separately, and the values are then averaged to obtain the final mAP value.
[0172] The coefficient of determination (R²), also known as goodness of fit, is a statistic used in regression analysis to assess how well a model fits. It represents the proportion of the variance of the dependent variable explained by the model, essentially the squared correlation between the model's predicted and actual values. R² typically takes a value between 0 and 1. The closer the value is to 1, the better the model fits, meaning the higher the proportion of variance explained by the model.
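A common way to compute the coefficient of determination is one minus the ratio of the residual sum of squares to the total sum of squares. The application does not spell out a formula, so the sketch below uses this standard definition as an assumption; the function name `r_squared` is hypothetical, and the true values are assumed not to be all identical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: variation left unexplained by the model.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: variation of the true values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Under this definition a perfect fit gives exactly 1, and a model that always predicts the mean of the true values gives 0.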
[0173] In the embodiments of this application, one or more model accuracies can be used as model performance parameters, and the combination of different types of model accuracies can be designed according to the type of model computation task.
[0174] In this embodiment, the model parameter quantity evaluation parameter can be used to evaluate the number of parameters in the tested model. For the same model, different model parameters may be used in the actual model computation. Therefore, in this embodiment, the model parameter quantity evaluation parameter can be the number of activated parameters of the tested model or a function calculated based on the number of activated parameters. The number of activated parameters refers to the number of parameters of the tested model that participate in the computation during the model computation task.
[0175] In summary, model performance parameters and model parameter quantity evaluation parameters are used to evaluate the tested model itself. In this embodiment, the evaluation parameter calculated based on the model performance parameters and model parameter quantity evaluation parameters can be defined as model parameter efficiency. In this embodiment, model parameter efficiency is determined by the model itself and reflects the contribution of the parameter quantity to the model performance in the corresponding scenario. The fewer the parameters and the better the model performance, the higher the model parameter efficiency.
[0176] In this embodiment, determining the model computation efficiency of the tested model computing system in step S104 based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the tested model may include: determining the model parameter efficiency of the tested model computing system based on the model performance parameters and the model parameter quantity evaluation parameters; and determining the model computation efficiency based on the model parameter efficiency and the computational performance parameters. It can be set that a higher model parameter efficiency results in a higher model computation efficiency.
[0177] In this embodiment, the computational performance parameter is used to evaluate the computational performance of the device under test, and is used to quantitatively represent the performance of the device under test when running the software framework under test to perform model computation tasks based on the model under test and the dataset.
[0178] In some embodiments of this application, the step of determining the computational performance parameter may include: determining the computational performance parameter based on the measured computational performance parameter of the tested model computing system performing the model computation task and the theoretical peak performance parameter of the computing system of the device under test. In this embodiment, computational efficiency can be defined to represent the hardware performance utilization rate of the device under test when running the software framework under test to perform the model computation task based on the model under test and the dataset. The computational efficiency can be the ratio of the measured computational performance parameter to the theoretical peak performance parameter of the computing system, or can be calculated as a function of this ratio.
[0179] In this embodiment of the application, the higher the model performance of the tested model itself, the fewer the model parameters (i.e., less space occupied and less computation), and the higher the utilization rate of the hardware performance of the tested device on the tested software framework, the higher the model computation efficiency of the tested model computing system. Therefore, the model computation efficiency can reflect the overall efficiency of the tested model on the software and hardware framework.
[0180] In some embodiments of this application, in S104, the model computation efficiency of the tested model computing system is determined based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters, and can be calculated using the following formula:
[0181] Model computation efficiency = Model parameter efficiency × Computational efficiency;
[0182] Here, computational efficiency is the computational performance parameter.
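The product form of the formula above can be sketched directly. The function name and the example values are illustrative assumptions; how the two factors are obtained is described in later paragraphs.

```python
def model_computation_efficiency(model_parameter_efficiency, computational_efficiency):
    """Overall index: model parameter efficiency multiplied by computational efficiency."""
    if model_parameter_efficiency < 0 or computational_efficiency < 0:
        raise ValueError("both factors are expected to be non-negative")
    return model_parameter_efficiency * computational_efficiency


# Illustrative values only: a parameter efficiency of 1.2 and a hardware
# utilization (computational efficiency) of 0.3.
example = model_computation_efficiency(1.2, 0.3)
```

Because the two factors are multiplied, a system scores highly only when the model uses its parameters well and the hardware is well utilized; a value of zero in either factor yields an overall result of zero.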
[0183] Accordingly, there are methods for calculating the model performance parameters, the model parameter quantity evaluation parameters, and the computational performance parameters, a method for calculating the model parameter efficiency based on the model performance parameters and the model parameter quantity evaluation parameters, and a method for calculating the model computation efficiency based on the model parameter efficiency and the computational performance parameters. Specific calculation methods will be described in the following embodiments of this application.
[0184] As described above, after the tested model computing system completes the model computation task, the true efficiency of the tested model on the tested device and the tested software framework can be reflected by determining the model computation efficiency of the tested model computing system, thereby realizing a comprehensive performance evaluation of the tested model itself, the tested software framework, and the tested device.
[0185] In S103, after the model calculation task is completed, the model performance parameters and calculation performance parameters can be determined according to the execution process or execution result of the model calculation task.
[0186] In S104, the model parameter quantity evaluation parameters are determined based on the participation of the model parameters of the tested model in the model computation task. The model computation efficiency of the tested model computing system is then determined based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the tested model.
[0187] Then, the model computation efficiency can be used directly as the test result of the tested model computing system, or as one of the indicators of the test result.
[0188] In the model computation efficiency testing method provided in this application, after the software framework under test is controlled to call the model under test to execute the model computation task on the device under test based on the dataset, the model performance parameters and the computational performance parameters are determined according to the model computation task, so as to quantify the performance of the model under test and the performance of the hardware and software architecture respectively. Based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the model under test, the model computation efficiency of the tested model computing system is determined to obtain the test result of the tested model computing system. This achieves a quantitative evaluation of the model computation performance of the tested model computing system, which includes the model under test, the software framework under test, and the device under test. The model computation efficiency and the testing method provided in this application, as an evaluation scheme applicable to various model computation tasks, enhance the comparability of different models running on different devices, provide a reference for improving models and model computing systems, and can quantitatively evaluate the accuracy and multi-dimensional computing power performance of different models, thereby providing effective guidance for improving model capabilities and reducing application costs.
[0189] Based on the above embodiments, this application further describes the calculation steps for model parameter efficiency.
[0190] As described in the above embodiments, the evaluation parameter calculated based on the model performance parameters and the model parameter quantity evaluation parameter can be defined as the model parameter efficiency. The model parameter efficiency can be defined as reflecting the contribution of the parameter quantity to the model performance in the corresponding scenario. The fewer the parameters and the better the model performance, the higher the model parameter efficiency. Based on this, in the embodiments of this application, the model parameter efficiency can be expressed as the ratio of the model performance parameters to the model parameter quantity evaluation parameter.
[0191] To eliminate the influence of units, in this embodiment of the application, determining the model parameter efficiency of the tested model computing system based on model performance parameters and model parameter quantity evaluation parameters may include: normalizing the model performance parameters and model parameter quantity evaluation parameters respectively, and determining the model parameter efficiency based on the normalized model performance parameters, normalized model parameter quantity evaluation parameters, and computing performance parameters.
[0192] The normalization of model performance parameters can include: determining the normalized model performance parameters based on the measured model performance parameters in the model computation task and the reference model performance parameters of the tested model. If the reference model performance parameters and the measured model performance parameters are of the same type, then the measured model performance parameters can be used as the model accuracy of the tested model, and the reference model performance parameters can be used as the reference accuracy of the tested model. For example, if the measured model performance parameters use accuracy, then the reference model performance parameters also use accuracy.
[0193] Reference model performance parameters are used to eliminate the influence of units on measured model performance parameters. Therefore, specific tested models (with the same model parameters) can use the same reference model performance parameters, or tested models performing similar model calculation tasks can use the same reference model performance parameters.
[0194] The normalization of the model parameter quantity evaluation parameter can include: determining the normalized model parameter quantity evaluation parameter based on the number of activated parameters and the number of reference parameters of the tested model, where the number of activated parameters refers to the number of parameters of the tested model that participate in the computation during the model computation task. The number of reference parameters is used to eliminate the influence of the unit of the number of activated parameters, and can be the total number of parameters of the tested model.
[0195] In this embodiment of the application, the model parameter efficiency of the tested model computing system is determined based on model performance parameters and model parameter quantity evaluation parameters, and can be calculated using the following formula:
[0196] Here, model accuracy refers to the measured model performance parameters in the model computation task, reference accuracy refers to the reference model performance parameters of the tested model, and the number of activated parameters refers to the number of parameters of the tested model involved in the computation in the model computation task.
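One plausible reading of the description above, offered as an assumption rather than as the application's formula, is that the model parameter efficiency is the normalized model performance divided by the normalized parameter quantity. In the sketch below the function name, the argument names, and the choice of the total parameter count as the reference parameter quantity are all hypothetical.

```python
def model_parameter_efficiency(model_accuracy, reference_accuracy,
                               activated_parameters, reference_parameters):
    """Assumed form: (accuracy / reference accuracy) / (activated / reference parameters)."""
    if reference_accuracy <= 0 or activated_parameters <= 0 or reference_parameters <= 0:
        raise ValueError("reference accuracy and parameter counts must be positive")
    # Normalization removes the units of both quantities.
    normalized_performance = model_accuracy / reference_accuracy
    normalized_parameter_quantity = activated_parameters / reference_parameters
    return normalized_performance / normalized_parameter_quantity
```

Under this assumed form, a model that reaches a measured accuracy of 0.9 against a reference accuracy of 0.75 while activating 13 billion of 52 billion total parameters would score (0.9 / 0.75) / (13 / 52) = 4.8; fewer activated parameters or higher accuracy both raise the value, which matches the qualitative behavior described in the text.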
[0197] Based on the above embodiments, since models with excessively low performance cannot be practically applied and have no practical value, calculating their model computation efficiency is meaningless. Therefore, in this embodiment, a baseline model performance parameter range can be defined as the minimum standard for the model performance of the tested model. In this embodiment, determining the model computation efficiency of the tested model computing system in S104 based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the tested model can include: if the model performance parameter is within the baseline model performance parameter range, determining the model computation efficiency based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters; if the model performance parameter is not within the baseline model performance parameter range, setting the model performance parameter to 0 and then determining the model computation efficiency based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters.
[0198] In specific implementation, if the model performance parameter is positively correlated with the model performance, i.e., the better the model performance, the larger the model performance parameter, then the baseline model performance parameter range can be represented by a baseline model performance parameter. In this case, determining the model computation efficiency of the tested model computing system in S104 based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters can include: if the model performance parameter is greater than or equal to the baseline model performance parameter, determining the model computation efficiency based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters; if the model performance parameter is less than the baseline model performance parameter, setting the model performance parameter to 0 and then determining the model computation efficiency based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters.
[0199] If the model performance parameter is the model accuracy, then the baseline model performance parameter is the baseline accuracy. In step S104, determining the model computation efficiency of the tested model computing system based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the tested model can include: if the measured accuracy of the model is greater than or equal to the baseline accuracy, using the measured accuracy of the model as the model accuracy, and determining the model computation efficiency based on the model accuracy, the computational performance parameters, and the model parameter quantity evaluation parameters; if the measured accuracy of the model is less than the baseline accuracy, using 0 as the model accuracy, and determining the model computation efficiency based on the model accuracy, the computational performance parameters, and the model parameter quantity evaluation parameters.
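[0200] When the model performance parameter is the model accuracy of the tested model, the model accuracy can be determined by the following formula: $$\text{Model accuracy} = \begin{cases} \text{Measured accuracy of the model}, & \text{measured accuracy of the model} \ge \text{baseline accuracy} \\ 0, & \text{measured accuracy of the model} < \text{baseline accuracy} \end{cases}$$
[0201] Here, the measured accuracy of the model is the accuracy of the tested model measured in the model computation task, and the baseline accuracy is the minimum allowable accuracy of the tested model.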
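The baseline rule described above amounts to a simple gate on the measured accuracy. A minimal sketch follows; the function name `gated_model_accuracy` is hypothetical.

```python
def gated_model_accuracy(measured_accuracy, baseline_accuracy):
    """Use the measured accuracy if it reaches the baseline, otherwise use 0."""
    if measured_accuracy >= baseline_accuracy:
        return measured_accuracy
    # Below the minimum allowable accuracy the model has no practical value,
    # so its accuracy (and hence its overall efficiency score) is set to zero.
    return 0.0
```

Because the model accuracy enters the model parameter efficiency as a numerator, a gated value of 0 drives the resulting model computation efficiency to 0 as well.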
[0202] In the embodiments of this application, different baseline model performance parameter ranges can be set for different tested models, different model computation tasks, or different datasets.
[0203] Based on the above embodiments, this application further describes the calculation steps for the computational performance parameters.
[0204] As described in the above embodiments, the computational performance parameters can be determined based on the measured computational performance parameters of the computational system of the tested model performing the model computation task and the theoretical peak performance parameters of the computational system of the tested device.
[0205] In one implementation, the ratio of the measured computational performance parameter to the theoretical peak performance parameter of the computing system can be used as the computational performance parameter. Computational efficiency can be defined to represent the hardware performance utilization rate of the tested device when running the tested software framework to execute the model computation task based on the tested model and the dataset. Therefore, computational efficiency can be the ratio of the measured computational performance parameter to the theoretical peak performance parameter of the computing system, or it can be calculated as a function of this ratio.
[0206] In this embodiment of the application, the computational performance parameter is determined based on the measured computational performance parameter of the tested model computing system performing the model computation task and the theoretical peak performance parameter of the computing system of the tested device, and can be calculated using the following formula: Computational efficiency = Measured computational performance parameter / Theoretical peak performance parameter of the computing system.
[0207] In this embodiment, for the device under test, the theoretical peak performance parameters of the computing system are used to quantify the theoretical peak computing power of the device under test, while the measured computing performance parameters are used to quantify the actual computing power invested by the device under test when executing model computing tasks. For the model under test, the measured computing performance parameters are used to quantify the amount of computation per unit time during the process of the software framework under test calling the model under test to execute model computing tasks.
[0208] In this embodiment, the theoretical peak performance parameters of the computing system can be parameters provided by the manufacturer of the device under test. These parameters are typically theoretical values calculated based on the hardware configuration of the device under test (such as the frequency and number of cores of the central processing unit). The measured computing performance parameters are determined based on the execution process or results of the model computing task and are parameters actually obtained through testing.
[0209] In some embodiments of this application, the measured computational performance parameters can be expressed as the number of floating-point operations per second (FLOPS) performed by the device under test (DUT) to execute model computation tasks, and the theoretical peak performance parameters of the computational system can be expressed as the theoretical peak value of the number of FLOPS performed by the DUT. Here, FLOPS refers to the number of floating-point operations (addition, subtraction, multiplication, division, etc.) that a system or device can perform per unit time. In other embodiments of this application, the measured computational performance parameters can also be expressed as the number of trillions of floating-point operations per second (TFLOPS) performed by the DUT to execute model computation tasks, and the theoretical peak performance parameters of the computational system can also be expressed as the theoretical peak value of the number of trillions of floating-point operations per second performed by the DUT.
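When both quantities are expressed in FLOPS, the ratio of measured to theoretical peak performance can be sketched as below. The function names are hypothetical, and deriving the measured FLOPS from a total operation count divided by elapsed time is an assumption consistent with the "amount of computation per unit time" wording, not a procedure prescribed by the application.

```python
TERA = 1e12  # one TFLOPS equals 1e12 FLOPS


def measured_flops(total_floating_point_operations, elapsed_seconds):
    """Floating-point operations actually performed per second during the task."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return total_floating_point_operations / elapsed_seconds


def computational_efficiency(measured, theoretical_peak):
    """Ratio of measured performance to theoretical peak (same unit for both)."""
    if theoretical_peak <= 0:
        raise ValueError("theoretical peak must be positive")
    return measured / theoretical_peak


# Illustrative numbers only: 3e15 operations in 10 s on a device whose
# manufacturer states a peak of 1000 TFLOPS.
achieved = measured_flops(3e15, 10.0)                         # 3e14 FLOPS = 300 TFLOPS
utilization = computational_efficiency(achieved, 1000 * TERA)
```

The ratio is unit-free as long as both values use the same unit, so expressing both in FLOPS or both in TFLOPS gives the same result.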
[0210] Actual testing revealed that the data type used during model computation affects both model performance parameters and computational performance parameters. Therefore, data type must be considered when determining the theoretical peak performance parameters and measured computational performance parameters of the computing system.
[0211] In some embodiments of this application, the steps for determining the measured computational performance parameters and the theoretical peak performance parameters of the computational system include: obtaining the theoretical peak performance parameters of the computational system corresponding to the first data type of the device under test; if all actual data types in the model computation task are the first data type, using the computational performance parameters corresponding to the actual data types as the measured computational performance parameters; if any actual data type in the model computation task is not the first data type, treating the computational performance parameters corresponding to the actual data types as equivalent to computational performance parameters corresponding to the first data type, thereby obtaining the measured computational performance parameters. That is, regardless of the actual data type used in the model computation task, it is treated as the first data type when determining the measured computational performance parameters, which are then combined with the theoretical peak performance parameters of the computational system to determine the computational performance parameters. This simplifies the steps for determining the model computation efficiency, so that test results can be obtained quickly.
[0212] The first data type can be selected from the theoretical peak performance parameters of the computing system for various data types provided by the manufacturer of the device under test. In practical applications, the first data type can be, but is not limited to, Integer, Floating-point, Character, String, or Array. If the first data type is Floating-point, it can specifically be one of Single Precision Floating Point, Double Precision Floating Point, Extended Precision Floating Point, Quadruple Precision Floating Point, or Half Precision Floating Point.
[0213] In some embodiments of this application, the first data type may be a single-precision floating-point type.
[0214] In some other embodiments of this application, the steps for determining the measured computational performance parameters and the theoretical peak performance parameters of the computational system may further include: obtaining the theoretical peak performance parameters of the computational system corresponding to the second data type of the device under test; if all actual data types in the model computation task are the second data type, then the computational performance parameters corresponding to the actual data types in the model computation task are used as the measured computational performance parameters; if there are actual data types in the model computation task that are not the second data type, then the computational performance parameters corresponding to the actual data types in the model computation task are converted into computational performance parameters corresponding to the second data type to obtain the measured computational performance parameters. In practical applications, the computing power corresponding to different data types can be obtained through testing, that is, there is a definite conversion relationship between the computational performance parameters corresponding to different data types. Therefore, when determining the computational performance parameters of the model computational system under test, the computational performance parameters corresponding to the actual data types used in the model computation task can be converted into computational performance parameters corresponding to the second data type of the selected computational system's theoretical peak performance parameters before determining the measured computational performance parameters. This allows for more refined calculation of computational performance parameters, thereby achieving more accurate calculation of model computational efficiency.
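The conversion described in the preceding paragraph can be sketched in C as follows. This is only an illustration under stated assumptions: the `DataType` enumeration, the `kToFp32Ratio` table, and the function `measured_reference_tflops` are hypothetical names, FP32 is assumed to be the second data type, and the ratios are placeholders rather than measured values. In practice the ratios would come from testing the device under test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical data types; FP32 is assumed to be the second (reference) data type. */
typedef enum { DT_FP64 = 0, DT_FP32 = 1, DT_FP16 = 2, DT_COUNT = 3 } DataType;

/* Placeholder conversion ratios (not measured values): one operation in the
   given type is counted as this many FP32-equivalent operations. */
static const double kToFp32Ratio[DT_COUNT] = { 2.0, 1.0, 0.5 };

/* Total FP32-equivalent tera-operations divided by elapsed seconds gives the
   measured computing performance in FP32-equivalent TFLOPS. */
double measured_reference_tflops(const double *tera_ops, const DataType *types,
                                 size_t n, double seconds) {
    assert(seconds > 0.0);
    double total = 0.0;
    for (size_t i = 0; i < n; ++i) {
        assert(types[i] < DT_COUNT);
        total += tera_ops[i] * kToFp32Ratio[types[i]];
    }
    return total / seconds;
}
```

Under the first-data-type approach of paragraph [0211], the same function would apply with every ratio set to 1.0, since all operations are then counted as if they were of the reference type.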
[0215] The second data type can be selected from the theoretical peak performance parameters of the computing system for various data types provided by the manufacturer of the device under test. In practical applications, the second data type can be, but is not limited to, Integer, Floating-point, Character, String, or Array. If the second data type is Floating-point, it can specifically be one of Single Precision Floating Point, Double Precision Floating Point, Extended Precision Floating Point, Quadruple Precision Floating Point, or Half Precision Floating Point.
[0216] In some embodiments of this application, the second data type may be a single-precision floating-point type.
[0217] Based on the above embodiments, this application provides a method for determining the model computation efficiency that is suitable for practical applications.
[0218] After the model computing system under test has completed the model computation task, the model computation efficiency can be calculated using the following formula:
[0219] Model computation efficiency = Model parameter efficiency × Computational efficiency. (1)
[0220] For the model parameter efficiency and the computational efficiency, refer to the description of the above embodiments of this application. In the embodiments of this application, the model parameter efficiency can be directly proportional to the model accuracy and inversely proportional to the number of activated parameters. Therefore, in order to eliminate the influence of units, the model accuracy and the number of activated parameters can be compared with a reference accuracy and a number of reference parameters, respectively, and the model parameter efficiency can be obtained by the following formula:
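The formula itself is not reproduced in this text. One form consistent with the description (proportional to model accuracy, inversely proportional to the number of activated parameters, each normalized by its reference value) would be the following; it is an assumed reconstruction, not the applicant's published formula:

```latex
\text{Model parameter efficiency} =
\frac{\text{Model accuracy} \,/\, \text{Reference accuracy}}
     {\text{Number of activated parameters} \,/\, \text{Number of reference parameters}}
```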
[0221] Based on the proportion of activated parameters to reference parameters, models can be divided into dense models and sparse models. If the model under test is a dense model, then the number of activated parameters is the total number of parameters in the model under test.
[0222] Computational efficiency can be calculated using the following formula:
[0223] Among them, the measured computing performance parameters can be expressed as the number of trillion floating-point operations per second (TFLOPS) performed by the tested device in executing the model computing task, and the theoretical peak performance parameters of the computing system can be expressed as the theoretical peak value of the number of trillion floating-point operations per second of the tested device.
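The formula is not reproduced in this text. Given the two quantities named above, it can reasonably be read as the ratio of measured to theoretical peak performance, both in the same unit; the notation below is an assumed reconstruction:

```latex
\text{Computational efficiency} =
\frac{\text{Measured computing performance (TFLOPS)}}
     {\text{Theoretical peak performance of the computing system (TFLOPS)}}
```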
[0224] Model accuracy can be determined by the following formula:
[0225] In other words, a measured model accuracy that is lower than the baseline accuracy is set to zero.
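Formula (1) and the quantities it depends on can be sketched in C as follows. All function and parameter names are hypothetical. The accuracy rule follows paragraph [0225], and the computational efficiency is taken as measured over peak performance; the exact form of the model parameter efficiency is an assumption, because the corresponding formula is not reproduced in this text.

```c
#include <assert.h>

/* Paragraph [0225]: a measured accuracy below the baseline counts as zero. */
double model_accuracy(double measured_acc, double baseline_acc) {
    return measured_acc < baseline_acc ? 0.0 : measured_acc;
}

/* Assumed form: accuracy and activated-parameter count are each normalized
   by a reference value so that the result is unit-free. */
double model_parameter_efficiency(double accuracy, double ref_accuracy,
                                  double activated_params, double ref_params) {
    assert(ref_accuracy > 0.0 && activated_params > 0.0 && ref_params > 0.0);
    return (accuracy / ref_accuracy) / (activated_params / ref_params);
}

/* Measured performance over theoretical peak, both in the same unit. */
double computational_efficiency(double measured_tflops, double peak_tflops) {
    assert(peak_tflops > 0.0);
    return measured_tflops / peak_tflops;
}

/* Formula (1): model computation efficiency. */
double model_computation_efficiency(double param_eff, double comp_eff) {
    return param_eff * comp_eff;
}
```

For a dense model, `activated_params` would be the total parameter count of the model under test, as stated in paragraph [0221].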
[0226] In practical applications, observations have shown that, besides the device under test and the software framework under test, the factors affecting model performance include the number of model parameters, the type of application scenario, the type of dataset, and the data type used for model computation. That is, under different combinations of these factors, the same model under test may perform differently, which in turn affects the measured model computation efficiency. Therefore, in this embodiment, the model computing system under test can be evaluated more comprehensively by determining its comprehensive model computation efficiency.
[0227] In some embodiments of this application, determining, in step S104, the model computation efficiency of the model computing system under test based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the model under test, to obtain the test result of the model computing system under test, may include: determining the comprehensive model computation efficiency of the model computing system under test based on its model computation efficiency in multiple test scenarios; and using the comprehensive model computation efficiency as the test result; wherein the test scenarios correspond one-to-one with the datasets. That is, a dataset can be selected for each of multiple test scenarios to perform the model computation task, the model computation efficiency of each test scenario is obtained, and these efficiencies are then combined into the comprehensive model computation efficiency.
[0228] In some other embodiments of this application, determining, in step S104, the model computation efficiency of the model computing system under test based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the model under test, to obtain the test result of the model computing system under test, may further include: determining the comprehensive model computation efficiency of the model computing system under test based on its model computation efficiency in multiple test scenarios; and using the comprehensive model computation efficiency as the test result; wherein one test scenario corresponds to one or more datasets. That is, for multiple test scenarios, one or more datasets can be selected for each test scenario to perform the model computation task. For a test scenario that uses multiple datasets, the model computation efficiencies of the datasets are first combined to obtain the model computation efficiency of that test scenario, and the model computation efficiencies of the test scenarios are then combined to obtain the comprehensive model computation efficiency.
[0229] Taking a language model as an example, the test scenario can be set as follows:
[0230] Test Scenario 1: The model calculation task is a problem-solving task, and the dataset used may include, but is not limited to, GSM8K (Grade School Math 8K), Math (Mathematics Set), and Math23K (Mathematics 23K).
[0231] Test Scenario 2: The model computation task is a code generation task, and the dataset used may include, but is not limited to, HumanEval (a benchmark of hand-written programming problems) and MBPP (Mostly Basic Python Problems).
[0232] Test Scenario 3: The model computation task is a large-scale multi-task language understanding task, and the dataset used may include, but is not limited to, MMLU.
[0233] Test Scenario 4: The model computation task is a reading comprehension task, and the dataset used may include, but is not limited to, ARC-C (AI2 Reasoning Challenge-Challenge Set).
[0234] The steps for determining the model computation efficiency corresponding to a single test scenario may include: if the test scenario corresponds to one dataset, using the model computation efficiency corresponding to that dataset as the model computation efficiency corresponding to the test scenario; if the test scenario corresponds to multiple datasets, determining the model computation efficiency corresponding to the test scenario based on the model computation efficiency corresponding to each dataset. In some embodiments of this application, the latter step may include: using the arithmetic mean of the model computation efficiencies corresponding to the datasets as the model computation efficiency corresponding to the test scenario. In other embodiments of this application, it may instead include: using the geometric mean of the model computation efficiencies corresponding to the datasets as the model computation efficiency corresponding to the test scenario.
[0235] In some further embodiments of this application, model performance parameters and computational performance parameters can be aggregated separately. Accordingly, determining, in step S104, the model computation efficiency of the model computing system under test based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the model under test, to obtain the test result of the model computing system under test, may further include: determining the single-scene model performance parameters corresponding to the test scenario based on the model performance parameters corresponding to each dataset of the test scenario; determining the single-scene computational performance parameters corresponding to the test scenario based on the computational performance parameters corresponding to each dataset of the test scenario; and determining the model computation efficiency corresponding to the test scenario based on the single-scene model performance parameters, the single-scene computational performance parameters, and the model parameter quantity evaluation parameters. That is, by first calculating the single-scene model performance parameters and the single-scene computational performance parameters separately, a different aggregation method can be used for each kind of per-dataset parameter, so as to better reflect the influence of each kind of parameter on the model computation efficiency.
[0236] In some embodiments of this application, determining the single-scene model performance parameters corresponding to the test scenario based on the model performance parameters corresponding to each dataset corresponding to the test scenario includes: using the arithmetic mean of the model performance parameters corresponding to each dataset as the single-scene model performance parameter; determining the single-scene computational performance parameters corresponding to the test scenario based on the computational performance parameters corresponding to each dataset corresponding to the test scenario includes: using the arithmetic mean of the computational performance parameters corresponding to each dataset as the single-scene computational performance parameter.
[0237] In some other embodiments of this application, determining the single-scene model performance parameters corresponding to the test scenario based on the model performance parameters corresponding to each dataset of the test scenario includes: using the geometric mean of the model performance parameters corresponding to each dataset as the single-scene model performance parameter; determining the single-scene computational performance parameters corresponding to the test scenario based on the computational performance parameters corresponding to each dataset of the test scenario includes: using the geometric mean of the computational performance parameters corresponding to each dataset as the single-scene computational performance parameter.
[0238] In this embodiment of the application, determining the comprehensive model computation efficiency of the model computing system under test based on its model computation efficiency in multiple test scenarios may include: using the arithmetic mean of the model computation efficiencies corresponding to the multiple test scenarios as the comprehensive model computation efficiency, which can be calculated using the following formula:
[0239]
[0240] Here, MCE_t represents the comprehensive model computation efficiency, MCE_i represents the model computation efficiency of the model computing system under test in the i-th test scenario, and f_1i represents the first weight coefficient of the model computing system under test in the i-th test scenario. In the embodiments of this application, the first weight coefficients corresponding to the test scenarios may be equal or unequal.
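The formula is not reproduced in this text. With MCE_t as the comprehensive model computation efficiency, MCE_i as the efficiency in the i-th test scenario, and f_1i as the first weight coefficient, one consistent reading is a weighted arithmetic mean. The number of test scenarios n and the normalization of the weights are assumptions:

```latex
MCE_t = \sum_{i=1}^{n} f_{1i}\, MCE_i, \qquad \sum_{i=1}^{n} f_{1i} = 1
```

With equal weights, f_{1i} = 1/n, this reduces to the plain arithmetic mean.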
[0241] In this embodiment of the application, determining the comprehensive model computation efficiency of the model computing system under test based on its model computation efficiency in multiple test scenarios may further include: using the geometric mean of the model computation efficiencies corresponding to the multiple test scenarios as the comprehensive model computation efficiency, which can be obtained through the following formula:
[0242]
[0243] Here, MCE_t represents the comprehensive model computation efficiency, MCE_i represents the model computation efficiency of the model computing system under test in the i-th test scenario, ∏(·) represents cumulative multiplication, and f_2i represents the second weight coefficient of the model computing system under test in the i-th test scenario. In the embodiments of this application, the second weight coefficients corresponding to the test scenarios may be equal or unequal.
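The formula is not reproduced in this text. With MCE_t as the comprehensive model computation efficiency, MCE_i as the efficiency in the i-th test scenario, and f_2i as the second weight coefficient, one consistent reading is a weighted geometric mean. The number of test scenarios n and the normalization of the weights are assumptions:

```latex
MCE_t = \prod_{i=1}^{n} MCE_i^{\,f_{2i}}, \qquad \sum_{i=1}^{n} f_{2i} = 1
```

With equal weights, f_{2i} = 1/n, this reduces to the n-th root of the product of the per-scenario efficiencies.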
[0244] In this application, two methods are proposed for determining the comprehensive model computation efficiency: the arithmetic mean and the geometric mean. In practical applications, the geometric mean is better suited than the arithmetic mean to combining data of different scales or units. Therefore, in some embodiments of this application, if the difference between at least two model computation efficiencies corresponding to the model computing system under test exceeds a preset threshold, the geometric mean of the model computation efficiencies is used as the comprehensive model computation efficiency. This reduces the influence of test errors, or of the model computing system under test performing unusually well or poorly in certain test scenarios or on certain datasets, and thereby yields a more representative comprehensive model computation efficiency.
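The threshold rule described above can be sketched in C as follows. This is a minimal illustration with hypothetical function names, assuming equal weights for all test scenarios and taking "the difference between at least two efficiencies" as the gap between the largest and smallest value. The n-th root is computed with Newton's method only to keep the sketch free of a math-library dependency.

```c
#include <assert.h>
#include <stddef.h>

/* n-th root of a non-negative value by Newton's method. */
static double nth_root(double v, size_t n) {
    if (v == 0.0 || n == 1) return v;
    double r = v > 1.0 ? v : 1.0;  /* start at or above the root */
    for (int it = 0; it < 200; ++it) {
        double p = 1.0;
        for (size_t k = 1; k < n; ++k) p *= r;  /* r^(n-1) */
        double next = ((double)(n - 1) * r + v / p) / (double)n;
        if (next >= r) break;  /* no further decrease: converged */
        r = next;
    }
    return r;
}

double arithmetic_mean(const double *x, size_t n) {
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) sum += x[i];
    return sum / (double)n;
}

double geometric_mean(const double *x, size_t n) {
    assert(n > 0);
    double prod = 1.0;
    for (size_t i = 0; i < n; ++i) {
        assert(x[i] >= 0.0);
        prod *= x[i];
    }
    return nth_root(prod, n);
}

/* Largest pairwise difference, i.e. maximum minus minimum. */
double spread(const double *x, size_t n) {
    assert(n > 0);
    double lo = x[0], hi = x[0];
    for (size_t i = 1; i < n; ++i) {
        if (x[i] < lo) lo = x[i];
        if (x[i] > hi) hi = x[i];
    }
    return hi - lo;
}

/* Geometric mean when the efficiencies differ by more than the threshold,
   arithmetic mean otherwise. */
double comprehensive_efficiency(const double *x, size_t n, double threshold) {
    return spread(x, n) > threshold ? geometric_mean(x, n)
                                    : arithmetic_mean(x, n);
}
```

The same helpers could be applied one level down, to combine per-dataset efficiencies into the efficiency of a single test scenario.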
[0245] In this embodiment of the application, in addition to the comprehensive model computation efficiency, the model computation efficiency corresponding to each test scenario can also be output as a test result.
[0246] The model computation efficiency testing method provided in this embodiment of the application can achieve a more comprehensive evaluation of the model computation efficiency of the model computing system under test.
[0247] Based on the above embodiments, this application further describes the steps for controlling the model computing system under test to perform model computing tasks.
[0248] In this embodiment of the application, S102, which controls the device under test to run the software framework under test to call the model under test and the dataset to perform model calculation tasks, may include: loading the software framework under test on the device under test and performing initialization of the software framework under test; starting the software framework under test to call the model under test and the dataset to perform model calculation tasks.
[0249] The process of loading the software framework under test on the device under test and performing initialization of the software framework under test may include: loading the library files of the computing system under test on the device under test; obtaining the types of interface functions of the computing system under test through the calling interfaces of the library files; and initializing the interface functions according to the types of the interface functions.
[0250] After the model computing system under test is deployed, it is provided as a dynamic library that includes the required structure and the implementations of the specified interface functions. When the test device starts running the test script, it dynamically loads the library file (Lib library) of the model computing system under test. The Lib library provides an access interface; if the software framework under test is written in C or C++, the Lib library provides a C interface. This interface can be named "GetFuncs", meaning "get the interface functions", and can be declared as:
[0251] void GetFuncs(SUTInterface *interface);
[0252] SUTInterface is a structure that needs to be implemented by the software framework under test and populated with pointers to the interface functions required by the embodiments of this application. The model computing system under test needs to implement these interface functions to perform model computing tasks and cooperate in determining the model computing efficiency.
[0253] In this embodiment, the required interface functions may include, but are not limited to: an initialization function (Initialize), a finalization function (Finalize), a data loading function (LoadData), a data unloading function (UnLoadData), and a model calculation control function. The initialization function performs initialization according to the test configuration information; the finalization function performs cleanup after the model computing system under test has completed the model calculation task; the data loading function loads model calculation data; the data unloading function unloads model calculation data; and the model calculation control function performs the model calculation task.
[0254] The model computation control function is determined based on the type of model computation task, and the type description can be found in the above embodiments. Taking the model computation task as a model inference task as an example, the model computation control function can include a synchronous inference control function (Predict) and an asynchronous inference control function (PredictAsync). The synchronous inference control function performs synchronous inference based on the input data, fills in the corresponding information, and calls a callback function (func) to return the inference result. The asynchronous inference control function performs asynchronous inference based on the input data, fills in the corresponding information, and calls a callback function (func) to return the inference result.
[0255] The interface functions may also include a system theoretical peak return function (GetSystemFLOPS), which the model computing system under test uses to respond to the test script and return the theoretical peak performance parameters of the computing system of the device under test. This interface function is optional: if it is not provided, the test device can obtain the theoretical peak performance parameters of the computing system of the device under test by other means and store them locally for use in determining the model computation efficiency.
[0256] In some embodiments of this application, the structure SUTInterface can be populated with the following pointers:
[0257] struct SUTInterface {
[0258] RetStatus (*Initialize)(char *config_path);
[0259] RetStatus (*Finalize)(void);
[0260] RetStatus (*LoadData)(const int ids[], size_t nLength);
[0261] RetStatus (*UnLoadData)(const int ids[], size_t nLength);
[0262] RetStatus (*Predict)(QueryMetaData metaData[], Response *func);
[0263] RetStatus (*PredictAsync)(QueryMetaData metaData[], Response *func);
[0264] RetStatus (*WaitforComplete)(void);
[0265] size_t (*GetSystemFLOPS)(void);
[0266] };
[0267] Here, `RetStatus` is the return status type, `char*` is a pointer to a character string, and `config_path` is the path to the configuration file. `Initialize` is the initialization function; it is called with the path to the test configuration file for the current test, and the model computing system under test parses the configuration information and performs other initialization preparations. `Finalize` is the finalization function, used for resource release and other operations after the model computing system under test completes the test. `LoadData` is the data loading function, used by the model computing system under test to load the data corresponding to the passed identifiers (`ids`). `const` marks the parameter as read-only, `int[]` denotes an integer array, `size_t` is the data type used for the size or length of an object, and `nLength` holds the length of the `ids` array. `UnLoadData` is the data unloading function, used by the model computing system under test to unload the data for the specified identifiers (`ids`). `Predict` is the synchronous inference control function, used by the model computing system under test to perform synchronous inference on the passed data, fill in the corresponding information, and call the callback function `func` to return the inference result. `QueryMetaData[]` denotes an array of query metadata, `metaData` is the metadata argument, and `Response *func` is a pointer to the response callback. `PredictAsync` is the asynchronous inference control function, used by the model computing system under test to perform asynchronous inference on the input data, fill in the corresponding information, and call the callback function `func` to return the inference result. `WaitforComplete` waits for the model computing task to complete. `GetSystemFLOPS` is the system theoretical peak return function, used by the model computing system under test to return the theoretical peak performance parameters of the computing system of the device under test after receiving a request for them from the test device.
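The following C sketch shows how a model computing system under test might populate the structure through `GetFuncs`. It is an illustration only: the source does not define `RetStatus`, `QueryMetaData`, or `Response`, so the definitions below are hypothetical, as are all `sut_*` stub functions, the `record_response` callback, and the placeholder peak value. Because the `Predict` signature carries no array length, the stub handles only the first query.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical supporting types; the source leaves them undefined. */
typedef enum { RET_OK = 0, RET_ERROR = 1 } RetStatus;
typedef struct { int id; } QueryMetaData;
typedef void Response(int id, const char *result);

typedef struct SUTInterface {
    RetStatus (*Initialize)(char *config_path);
    RetStatus (*Finalize)(void);
    RetStatus (*LoadData)(const int ids[], size_t nLength);
    RetStatus (*UnLoadData)(const int ids[], size_t nLength);
    RetStatus (*Predict)(QueryMetaData metaData[], Response *func);
    RetStatus (*PredictAsync)(QueryMetaData metaData[], Response *func);
    RetStatus (*WaitforComplete)(void);
    size_t (*GetSystemFLOPS)(void);
} SUTInterface;

static size_t g_loaded = 0;  /* number of samples currently loaded */

static RetStatus sut_initialize(char *config_path) {
    return config_path ? RET_OK : RET_ERROR;  /* a real system would parse the file */
}
static RetStatus sut_finalize(void) { g_loaded = 0; return RET_OK; }
static RetStatus sut_load(const int ids[], size_t n) {
    (void)ids; g_loaded += n; return RET_OK;
}
static RetStatus sut_unload(const int ids[], size_t n) {
    (void)ids; g_loaded -= (n < g_loaded ? n : g_loaded); return RET_OK;
}
static RetStatus sut_predict(QueryMetaData metaData[], Response *func) {
    if (!metaData || !func) return RET_ERROR;
    func(metaData[0].id, "stub-result");  /* return the result via the callback */
    return RET_OK;
}
static RetStatus sut_wait(void) { return RET_OK; }
static size_t sut_peak(void) { return 100; }  /* placeholder value, unit assumed */

/* Entry point exported by the library: fill in every function pointer. */
void GetFuncs(SUTInterface *interface) {
    assert(interface != NULL);
    interface->Initialize = sut_initialize;
    interface->Finalize = sut_finalize;
    interface->LoadData = sut_load;
    interface->UnLoadData = sut_unload;
    interface->Predict = sut_predict;
    interface->PredictAsync = sut_predict;  /* stub: same behavior as Predict */
    interface->WaitforComplete = sut_wait;
    interface->GetSystemFLOPS = sut_peak;
}

/* Example callback that a test device might pass to Predict. */
static int g_last_id = -1;
static void record_response(int id, const char *result) {
    (void)result;
    g_last_id = id;
}
```

A function-pointer table of this kind lets the test device drive any system under test through one entry point, without linking against framework-specific symbols.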
[0268] After the model computing system under test has been initialized, the software framework under test is started to call the model under test to execute the model calculation task. This may include: starting the software framework under test to execute the model calculation task; obtaining the first result returned by the software framework under test after it calls the data loading function to load model calculation data; obtaining the second result produced when the software framework under test calls the model calculation control function to execute the model calculation task; waiting for the third result output by the software framework under test after it has finished executing the model calculation task; and obtaining the fourth result returned by the software framework under test after it calls the data unloading function to unload model calculation data. Accordingly, calculating the model accuracy of the model under test based on the execution results of the model calculation task may include: calculating the model accuracy based on at least one of the first, second, third, and fourth results.
[0269] Figure 2 is a timing diagram of a model computation efficiency testing method provided in an embodiment of this application.
[0270] As shown in Figure 2, in some embodiments of this application, after the computing system of the model under test is deployed, the step of the test device running the test script to execute the model computing efficiency test method may include the following steps S201 to S218.
[0271] S201: Start. The test device is started to begin testing the model computation efficiency of the model computing system under test.
[0272] S202: Load library files. The test device loads the library files of the model computing system under test.
[0273] S203: Obtain interface functions. The test device obtains the interface functions of the model computing system under test.
[0274] S204: Return interface functions. The device under test returns the interface functions.
[0275] S205: Call the initialization function. The test device calls the initialization function of the model computing system under test to initialize it.
[0276] S206: Return initialization result. The device under test returns the initialization result.
[0277] The following S207 to S214 are the steps included in one iteration of the iterative calculation of the model calculation task.
[0278] S207: Call the data loading function. The test device calls the data loading function of the model computing system under test to load the data corresponding to the passed identifiers (ids).
[0279] S208: Return data loading result. The device under test returns the data loading result to the test device.
[0280] The steps S209 to S210 below are the steps performed for each input data (each sample in the model training task, and each piece of data to be processed in the model inference task) in one iteration.
[0281] S209: Call the model calculation control function. The test device calls the model calculation control function of the model computing system under test to perform the model calculation for the current input data.
[0282] S210: Save calculation results. The test device saves the calculation results of the model computing system under test for the current input data.
[0283] S211: Wait for completion. The test device waits for the model computing system under test to complete the model computing task, and sends a status acquisition request to it to determine the execution progress of the task.
[0284] S212: Return the result of the model computing task. The model computing system under test returns the result after completing the model computing task.
[0285] S213: Call the data unloading function. The test device calls the data unloading function of the model computing system under test to unload the data with the specified identifiers (ids).
[0286] S214: Return data unloading result. The device under test returns the data unloading result to the test device.
[0287] S215: Call the system theoretical peak return function. The test device calls the system theoretical peak return function of the model computing system under test to obtain the theoretical peak performance parameters of the computing system of the device under test.
[0288] S216: Return the theoretical peak performance parameters of the computing system. The device under test returns the theoretical peak performance parameters of the computing system to the test device.
[0289] S217: Calculate the model computation efficiency. The test device calculates the model computation efficiency of the model computing system under test according to the method described in any of the above embodiments.
[0290] S218: Generate a test report. The test device summarizes the test results of the model computing system under test, which may include, but are not limited to, the model computation efficiency and the comprehensive model computation efficiency, and generates a corresponding test report.
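The call sequence of steps S205 to S216 can be sketched from the test device's side as follows. This is a minimal C illustration under stated assumptions: the supporting types, the `run_test` driver, the `save_result` callback, and the `mock_*` functions are all hypothetical; the interface is assumed to be already populated (S202 to S204, for example through dynamic loading); and calling `Finalize` at the end is an assumption, since the step list does not name it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical supporting types; the source leaves them undefined. */
typedef enum { RET_OK = 0, RET_ERROR = 1 } RetStatus;
typedef struct { int id; } QueryMetaData;
typedef void Response(int id, const char *result);

typedef struct SUTInterface {
    RetStatus (*Initialize)(char *config_path);
    RetStatus (*Finalize)(void);
    RetStatus (*LoadData)(const int ids[], size_t nLength);
    RetStatus (*UnLoadData)(const int ids[], size_t nLength);
    RetStatus (*Predict)(QueryMetaData metaData[], Response *func);
    RetStatus (*PredictAsync)(QueryMetaData metaData[], Response *func);
    RetStatus (*WaitforComplete)(void);
    size_t (*GetSystemFLOPS)(void);
} SUTInterface;

static size_t g_results = 0;  /* S210: number of results saved so far */
static void save_result(int id, const char *result) {
    (void)id; (void)result;
    ++g_results;
}

/* One iteration of the test flow against a populated interface. */
RetStatus run_test(const SUTInterface *sut, char *config_path,
                   const int ids[], size_t n, size_t *peak_out) {
    if (sut->Initialize(config_path) != RET_OK) return RET_ERROR;   /* S205, S206 */
    if (sut->LoadData(ids, n) != RET_OK) return RET_ERROR;          /* S207, S208 */
    for (size_t i = 0; i < n; ++i) {                                /* S209, S210 */
        QueryMetaData q[1] = { { ids[i] } };
        if (sut->Predict(q, save_result) != RET_OK) return RET_ERROR;
    }
    if (sut->WaitforComplete() != RET_OK) return RET_ERROR;         /* S211, S212 */
    if (sut->UnLoadData(ids, n) != RET_OK) return RET_ERROR;        /* S213, S214 */
    *peak_out = sut->GetSystemFLOPS();                              /* S215, S216 */
    return sut->Finalize();  /* assumed cleanup step, not listed in S201 to S218 */
}

/* Mock system under test, used only to exercise the driver. */
static RetStatus mock_init(char *p) { return p ? RET_OK : RET_ERROR; }
static RetStatus mock_void(void) { return RET_OK; }
static RetStatus mock_data(const int ids[], size_t n) {
    (void)ids; (void)n;
    return RET_OK;
}
static RetStatus mock_predict(QueryMetaData m[], Response *f) {
    f(m[0].id, "ok");
    return RET_OK;
}
static size_t mock_peak(void) { return 100; }  /* placeholder value */

SUTInterface make_mock_sut(void) {
    SUTInterface s = { mock_init, mock_void, mock_data, mock_data,
                       mock_predict, mock_predict, mock_void, mock_peak };
    return s;
}
```

Steps S217 and S218 would then combine the saved results and the returned peak value into the model computation efficiency and the test report.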
[0291] The model computation efficiency testing method provided in this application not only proposes evaluation indicators for model computation efficiency but also gives a method for testing it. This provides a unified evaluation standard applicable to various model computing systems, which can help model users select more efficient models and assist model developers in optimizing models, software frameworks, computing devices, etc., so as to promote the green development of artificial intelligence, promote the efficient use of resources, and enhance model quality.
[0292] It should be noted that in the embodiments of the model computation efficiency testing method in this application, some steps or features may be omitted or not executed. The division into hardware or software functional modules is for ease of explanation and is not the only way to implement the model computation efficiency testing method provided in the embodiments of this application.
[0293] The above details various embodiments of the model computation efficiency testing method. Based on this, this application also discloses model computation efficiency testing systems, apparatus, devices, non-volatile readable storage media, and computer program products corresponding to the above methods.
[0294] Figure 3 is a schematic diagram of the structure of a model computation efficiency testing system provided in an embodiment of this application.
[0295] As shown in Figure 3, the model computation efficiency testing system provided in this embodiment of the application may include a test device 301 and a device under test 302.
[0296] The device under test 302 is used to run the software framework under test to perform model calculation tasks based on the model under test and the dataset.
[0297] The test device 301 is used to acquire the dataset and to control the device under test 302 to run the software framework under test, which calls the model under test and the dataset to perform the model calculation task; to determine the model performance parameters and the calculation performance parameters based on the model calculation task; and to determine the model calculation efficiency of the model calculation system under test based on the model performance parameters, the calculation performance parameters, and the model parameter quantity evaluation parameters of the model under test, so as to obtain the test results of the model calculation system under test.
[0298] The model calculation system under test includes the model under test, the software framework under test, and the device under test 302.
[0299] In a specific implementation, the test device 301 sends a test control command to the device under test 302 to control the device under test 302 to execute the model calculation task and return the model calculation results to the test device 301. For the steps by which the device under test 302 executes the model calculation task and the steps by which the test device 301 tests the model calculation efficiency of the model calculation system under test, refer to the description of any of the above method embodiments of this application.
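The control flow between the test device and the software framework under test can be sketched as follows. This is an in-process illustration under stated assumptions, not the implementation of this application. The `FrameworkUnderTest` protocol, its method names, and the `DoublingFramework` stand-in are hypothetical. They mirror the interface function types named in the claims: initialization, data loading, model calculation control, completion, and data unloading.

```python
from typing import Any, Dict, List, Protocol


class FrameworkUnderTest(Protocol):
    """Hypothetical interface; the method names are illustrative only."""

    def initialize(self, config: Dict[str, Any]) -> None: ...
    def load_data(self, samples: List[int]) -> int: ...
    def run(self) -> List[int]: ...
    def finish(self) -> None: ...
    def unload_data(self) -> None: ...


class DoublingFramework:
    """Stand-in framework whose 'model' doubles each input sample."""

    def __init__(self) -> None:
        self.samples: List[int] = []
        self.events: List[str] = []

    def initialize(self, config: Dict[str, Any]) -> None:
        self.events.append("initialize")

    def load_data(self, samples: List[int]) -> int:
        self.samples = list(samples)
        self.events.append("load_data")
        return len(self.samples)

    def run(self) -> List[int]:
        self.events.append("run")
        return [2 * s for s in self.samples]

    def finish(self) -> None:
        self.events.append("finish")

    def unload_data(self) -> None:
        self.samples = []
        self.events.append("unload_data")


def run_model_task(framework: FrameworkUnderTest,
                   config: Dict[str, Any],
                   samples: List[int]) -> Dict[str, Any]:
    """Drive one model calculation task and collect what the framework returns."""
    framework.initialize(config)
    loaded = framework.load_data(samples)
    try:
        outputs = framework.run()
        framework.finish()
    finally:
        # Unload the data even if the calculation fails part-way.
        framework.unload_data()
    return {"loaded": loaded, "outputs": outputs}


framework = DoublingFramework()
result = run_model_task(framework, {"batch_size": 1}, [1, 2, 3])
print(result)
```

In a real deployment the test device and the device under test would typically be separate machines exchanging control commands. The in-process call here only illustrates the order of the interface calls.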
[0300] The simulation efficiency testing device provided in this application embodiment may include:
[0301] The acquisition unit is used to acquire the dataset;
[0302] The control unit is used to control the device under test to run the software framework under test in order to call the model under test and the dataset to perform model calculation tasks.
[0303] The determination unit is used to determine the model performance parameters and the computational performance parameters based on the model computation task, and to determine the model computation efficiency of the tested model computing system based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the tested model, so as to obtain the test results of the tested model computing system.
[0304] The tested model computing system includes the tested model, the tested software framework, and the tested equipment.
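The calculation performed by the determination unit can be illustrated with a short sketch. Several points are assumptions made for illustration and are not confirmed by this passage: each normalization is taken to be a simple ratio to its reference value, the model parameter efficiency is taken to be normalized accuracy divided by normalized parameter quantity, and the computational efficiency is taken to be measured floating-point operations per second divided by the theoretical peak. All function and parameter names are hypothetical. The product form and the rule that an out-of-range model performance parameter is set to 0 follow the claims.

```python
def normalized_accuracy(measured: float, reference: float, benchmark: float) -> float:
    """Return measured/reference, or 0.0 if the measured accuracy is below the benchmark."""
    if reference <= 0:
        raise ValueError("reference accuracy must be positive")
    if measured < benchmark:
        return 0.0
    return measured / reference


def computational_efficiency(measured_flops: float, peak_flops: float) -> float:
    """Assumed ratio of measured FLOPS to theoretical peak FLOPS."""
    if peak_flops <= 0:
        raise ValueError("theoretical peak must be positive")
    return measured_flops / peak_flops


def model_computation_efficiency(
    measured_accuracy: float,
    reference_accuracy: float,
    benchmark_accuracy: float,
    active_params: float,
    reference_params: float,
    measured_flops: float,
    peak_flops: float,
) -> float:
    """Model parameter efficiency multiplied by computational efficiency."""
    if active_params <= 0 or reference_params <= 0:
        raise ValueError("parameter counts must be positive")
    accuracy = normalized_accuracy(measured_accuracy, reference_accuracy, benchmark_accuracy)
    # Assumed form: normalized accuracy per unit of normalized parameter quantity.
    parameter_efficiency = accuracy / (active_params / reference_params)
    return parameter_efficiency * computational_efficiency(measured_flops, peak_flops)


# Illustrative inputs only; they are not measurements.
efficiency = model_computation_efficiency(
    measured_accuracy=0.75, reference_accuracy=0.75, benchmark_accuracy=0.5,
    active_params=2e9, reference_params=1e9,
    measured_flops=50e12, peak_flops=100e12,
)
print(efficiency)
```

Under these assumptions, a system that meets its reference accuracy with twice the reference parameter quantity while sustaining half of its theoretical peak scores 0.25. A run whose accuracy falls below the benchmark scores 0 regardless of its hardware utilization.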
[0305] It should be noted that in the various embodiments of the simulation efficiency testing device provided in this application, the division of units is only a logical functional division, and other division methods can be used. The connection between different units can be electrical, mechanical, or other connection methods. Separate units can be located in the same physical location or distributed across multiple network nodes. Each unit can be implemented in hardware or as a software functional unit. That is, some or all of the units provided in this application embodiment can be selected according to actual needs, and corresponding connection or integration methods can be used to achieve the purpose of the solution in this application embodiment.
[0306] Since the apparatus embodiments correspond to the method embodiments, refer to the description of the method embodiments for details of the apparatus embodiments, which are not repeated here.
[0307] Figure 4 is a schematic diagram of the structure of a simulation efficiency testing device provided in an embodiment of this application.
[0308] As shown in Figure 4, the simulation efficiency testing device provided in this application embodiment includes: a memory 410 for storing a computer program 411; and a processor 420 for executing the computer program 411. When the computer program 411 is executed by the processor 420, it implements the steps of the simulation efficiency testing method provided in any of the above embodiments.
[0309] The processor 420 may include one or more processing cores, such as a 3-core processor or an 8-core processor. The processor 420 may be implemented in at least one hardware form selected from a Digital Signal Processor (DSP), a Field-Programmable Gate Array (FPGA), and a Programmable Logic Array (PLA). The processor 420 may also include a main processor and a coprocessor. The main processor, also known as the Central Processing Unit (CPU), is used to process data in the awake state; the coprocessor is a low-power processor used to process data in the standby state. In some embodiments, the processor 420 may integrate a Graphics Processing Unit (GPU) responsible for rendering and drawing the content to be displayed on the screen. In some embodiments, the processor 420 may also include an Artificial Intelligence (AI) processor for handling computational operations related to machine learning.
[0310] The memory 410 may include one or more non-volatile readable storage media, which may be non-transitory. The memory 410 may also include high-speed random access memory and non-volatile memory, such as one or more disk storage devices or flash memory devices. In this embodiment, the memory 410 is used to store at least the following computer program 411, wherein, after being loaded and executed by the processor 420, the computer program 411 is able to implement the relevant steps in the simulation efficiency testing method disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 410 may also include an operating system 412 and data 413, and the storage method may be temporary storage or permanent storage. The operating system 412 may be Windows or other types of operating systems. The data 413 may include, but is not limited to, the data involved in the above methods.
[0311] In some embodiments, the simulation efficiency testing device may further include a display screen 430, a power supply 440, a communication interface 450, an input / output interface 460, a sensor 470, and a communication bus 480.
[0312] Those skilled in the art will understand that the structure shown in Figure 4 does not constitute a limitation on the simulation efficiency testing device, which may include more or fewer components than shown.
[0313] The simulation efficiency testing device provided in this application includes a memory and a processor. When the processor executes the program stored in the memory, it can implement the steps of the simulation efficiency testing method provided in the above embodiments, and the effect is the same as above.
[0314] This application provides a non-volatile readable storage medium storing a computer program thereon. When executed by a processor, the computer program can implement the steps of the simulation efficiency testing method provided in any of the above embodiments.
[0315] The non-volatile readable storage medium may include: USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks or optical disks, and other media that can store program code.
[0316] For a description of the non-volatile readable storage medium provided in the embodiments of this application, please refer to the above method embodiments. The effect it achieves is the same as the simulation efficiency test method provided in the embodiments of this application, and will not be repeated here.
[0317] This application provides a computer program product, including a computer program that, when executed by a processor, implements the steps of the simulation efficiency testing method provided in any of the above embodiments.
[0318] For a description of the computer program product provided in the embodiments of this application, please refer to the above method embodiments. The effects it achieves are the same as the simulation efficiency testing method provided in the embodiments of this application, and will not be repeated here.
[0319] The foregoing provides a detailed description of a simulation efficiency testing method, system, apparatus, device, medium, and product provided in this application. The various embodiments in the specification are described in a progressive manner, with each embodiment focusing on its differences from other embodiments. Similar or identical parts between embodiments can be referred to interchangeably. For the systems, apparatus, devices, non-volatile readable storage media, and computer program products disclosed in the embodiments, since they correspond to the methods disclosed in the embodiments, the descriptions are relatively simple, and relevant parts can be referred to in the method section. It should be noted that those skilled in the art can make several improvements and modifications to this application without departing from the principles of this application, and these improvements and modifications also fall within the protection scope of this application.
[0320] It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element.
Claims
1. A method for testing computational efficiency, characterized in that it comprises: obtaining a dataset; controlling a device under test to run a software framework under test to call a model under test and the dataset to perform a model calculation task; determining model performance parameters and computational performance parameters based on the model calculation task; and determining the computational efficiency of the tested model computing system based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the tested model, so as to obtain the test results of the tested model computing system; wherein the tested model computing system includes the tested model, the tested software framework, and the tested device.
2. The simulation efficiency testing method according to claim 1, characterized in that, The computational efficiency of the tested model's computing system is determined based on the model performance parameters, the computational performance parameters, and the model parameter evaluation parameters of the tested model, including: The model parameter efficiency of the computational system of the tested model is determined based on the model performance parameters and the model parameter quantity evaluation parameters. The model computation efficiency is determined based on the model parameter efficiency and the computational performance parameters.
3. The simulation efficiency testing method according to claim 2, characterized in that, The greater the efficiency of the model parameters, the greater the computational efficiency.
4. The simulation efficiency testing method according to claim 2, characterized in that, The model computation efficiency is determined based on the model parameter efficiency and the computational performance parameters, and is calculated using the following formula: Model computation efficiency = Model parameter efficiency × Computational efficiency; Wherein, computational efficiency is the computational performance parameter.
5. The simulation efficiency testing method according to claim 2, characterized in that, Determining the model parameter efficiency of the tested model computing system based on the model performance parameters and the model parameter quantity evaluation parameters includes: The model performance parameters and the model parameter quantity evaluation parameters are normalized respectively. The efficiency of the model parameters is determined based on the normalized model performance parameters, the normalized model parameter quantity evaluation parameters, and the computational performance parameters.
6. The simulation efficiency testing method according to claim 5, characterized in that, The model performance parameters are normalized, including: The normalized model performance parameters are determined based on the measured model performance parameters in the model calculation task and the reference model performance parameters of the tested model.
7. The simulation efficiency testing method according to claim 6, characterized in that, The measured model performance parameters are the model accuracy of the tested model, and the reference model performance parameters are the reference accuracy of the tested model.
8. The simulation efficiency testing method according to claim 5, characterized in that, The model parameter quantity evaluation parameters are normalized, including: The normalized model parameter quantity evaluation parameters are determined based on the activation parameter quantity and the reference parameter quantity of the tested model. The activation parameter quantity refers to the number of parameters of the tested model that participate in the calculation during the model calculation task.
9. The simulation efficiency testing method according to claim 2, characterized in that, The model parameter efficiency of the tested model computing system is determined based on the model performance parameters and the model parameter quantity evaluation parameters, and is calculated using the following formula: Wherein, the model accuracy is the measured model performance parameter in the model calculation task, the reference accuracy is the reference model performance parameter of the tested model, and the number of activation parameters is the number of parameters in which the tested model participates in the calculation in the model calculation task.
10. The simulation efficiency testing method according to claim 1, characterized in that, The computational efficiency of the tested model's computing system is determined based on the model performance parameters, the computational performance parameters, and the model parameter evaluation parameters of the tested model, including: If the model performance parameters are within the range of the benchmark model performance parameters, then the model computation efficiency is determined based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters. If the model performance parameters are not within the range of the baseline model performance parameters, then the model performance parameters are set to 0, and the model computation efficiency is determined based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters.
11. The simulation efficiency testing method according to claim 1, characterized in that, The model performance parameter is the model accuracy of the tested model, and the model accuracy is determined by the following formula: Wherein, the measured accuracy of the model is the measured accuracy of the tested model in the model calculation task, and the benchmark accuracy is the minimum allowable accuracy of the tested model.
12. The simulation efficiency testing method according to claim 1, characterized in that, The steps for determining the computational performance parameters include: The computational performance parameters are determined based on the measured computational performance parameters of the computational system of the tested model performing the model computation task and the theoretical peak performance parameters of the computational system of the tested device.
13. The simulation efficiency testing method according to claim 12, characterized in that, The computational performance parameters are determined based on the measured computational performance parameters of the computational system of the tested model performing the model computation task and the theoretical peak performance parameters of the computational system of the tested device, and are obtained by the following formula: Wherein, the computational efficiency is the computational performance parameter.
14. The simulation efficiency testing method according to claim 12, characterized in that, The measured computational performance parameter is the number of floating-point operations per second performed by the tested device on the model computation task; The theoretical peak performance parameter of the computing system is the theoretical peak number of floating-point operations per second of the device under test.
15. The simulation efficiency testing method according to claim 12, characterized in that, The steps for determining the measured computational performance parameters and the steps for determining the theoretical peak performance parameters of the computing system include: Obtain the theoretical peak performance parameters of the computing system corresponding to the first data type of the device under test; If all actual data types in the model calculation task are the first data type, then the calculation performance parameter corresponding to the actual data type in the model calculation task is taken as the measured calculation performance parameter. If there is an actual data type in the model calculation task that is not the first data type, then the calculation performance parameter corresponding to that actual data type in the model calculation task is converted into an equivalent calculation performance parameter corresponding to the first data type, so as to obtain the measured calculation performance parameter.
16. The simulation efficiency testing method according to claim 15, characterized in that, The first data type is a single-precision floating-point type.
17. The simulation efficiency testing method according to claim 12, characterized in that, The steps for determining the measured performance parameters and the steps for determining the theoretical peak performance parameters of the calculation system include: Obtain the theoretical peak performance parameters of the computing system corresponding to the second data type of the device under test; If all actual data types in the model calculation task are the second data type, then the calculation performance parameter corresponding to the actual data type in the model calculation task shall be the measured calculation performance parameter. If there is an actual data type in the model calculation task that is not the second data type, then the calculation performance parameters corresponding to the actual data type in the model calculation task are converted into the calculation performance parameters corresponding to the second data type to obtain the measured calculation performance parameters.
18. The simulation efficiency testing method according to claim 1, characterized in that, The computational efficiency of the tested model computing system is determined based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the tested model, in order to obtain the test results of the tested model computing system, including: Based on the computational efficiency of the tested model computing system under multiple test scenarios, the overall computational efficiency of the tested model computing system is determined. The overall computational efficiency is used as the test result; The test scenarios correspond one-to-one with the datasets.
19. The simulation efficiency testing method according to claim 1, characterized in that, The computational efficiency of the tested model computing system is determined based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the tested model, in order to obtain the test results of the tested model computing system, including: Based on the computational efficiency of the tested model computing system under multiple test scenarios, the overall computational efficiency of the tested model computing system is determined. The overall computational efficiency is used as the test result; One of the test scenarios corresponds to one or more of the datasets.
20. The simulation efficiency testing method according to claim 19, characterized in that, The steps for determining the computational efficiency corresponding to a single test scenario include: If the test scenario corresponds to a dataset, then the simulation efficiency corresponding to the dataset is taken as the simulation efficiency corresponding to the test scenario; If the test scenario corresponds to multiple datasets, then the simulation efficiency corresponding to the test scenario is determined based on the simulation efficiency corresponding to each dataset.
21. The simulation efficiency testing method according to claim 20, characterized in that, Determining the computational efficiency of the test scenario based on the computational efficiency of each dataset corresponding to the test scenario includes: The arithmetic mean of the computational efficiencies corresponding to each dataset is taken as the computational efficiency corresponding to the test scenario.
22. The simulation efficiency testing method according to claim 20, characterized in that, Determining the computational efficiency of the test scenario based on the computational efficiency of each dataset corresponding to the test scenario includes: The geometric mean of the computational efficiency corresponding to each dataset is taken as the computational efficiency corresponding to the test scenario.
23. The simulation efficiency testing method according to claim 20, characterized in that, Determining the computational efficiency of the test scenario based on the computational efficiency of each dataset corresponding to the test scenario includes: The single-scenario model performance parameters corresponding to the test scenario are determined based on the model performance parameters corresponding to each dataset of the test scenario. The single-scenario computational performance parameters corresponding to the test scenario are determined based on the computational performance parameters corresponding to each dataset of the test scenario. The computational efficiency corresponding to the test scenario is determined based on the single-scenario model performance parameters, the single-scenario computational performance parameters, and the model parameter quantity evaluation parameters.
24. The simulation efficiency testing method according to claim 23, characterized in that, Determine the single-scene model performance parameter corresponding to the test scenario based on the model performance parameters corresponding to each dataset corresponding to the test scenario, including: using the arithmetic mean of the model performance parameters corresponding to each dataset as the single-scene model performance parameter; Determining the single-scene computational performance parameter corresponding to the test scenario based on the computational performance parameters corresponding to each dataset of the test scenario includes: using the arithmetic mean of the computational performance parameters corresponding to each dataset as the single-scene computational performance parameter.
25. The simulation efficiency testing method according to claim 23, characterized in that, Determine the single-scene model performance parameter corresponding to the test scenario based on the model performance parameters corresponding to each dataset corresponding to the test scenario, including: using the geometric mean of the model performance parameters corresponding to each dataset as the single-scene model performance parameter; Determining the single-scene computational performance parameter corresponding to the test scenario based on the computational performance parameters corresponding to each dataset of the test scenario includes: using the geometric mean of the computational performance parameters corresponding to each dataset as the single-scene computational performance parameter.
26. The simulation efficiency testing method according to claim 18 or 19, characterized in that, Based on the computational efficiency of the tested model computing system under multiple test scenarios, the overall computational efficiency of the tested model computing system is determined, including: The arithmetic mean of the simulation efficiency corresponding to multiple test scenarios is taken as the comprehensive simulation efficiency.
27. The simulation efficiency testing method according to claim 26, characterized in that, The overall simulation efficiency is calculated using the arithmetic mean of the simulation efficiencies corresponding to multiple test scenarios, obtained through the following formula: Wherein, MCE_t represents the overall computational efficiency, MCE_i represents the computational efficiency of the tested model computing system in the i-th test scenario, and f_1i represents the first weight coefficient of the tested model computing system in the i-th test scenario.
28. The simulation efficiency testing method according to claim 18 or 19, characterized in that, Based on the computational efficiency of the tested model computing system under multiple test scenarios, the overall computational efficiency of the tested model computing system is determined, including: The geometric mean of the simulation efficiency corresponding to multiple test scenarios is taken as the comprehensive simulation efficiency.
29. The simulation efficiency testing method according to claim 28, characterized in that, The overall simulation efficiency is calculated using the geometric mean of the simulation efficiencies corresponding to multiple test scenarios, obtained through the following formula: Wherein, MCE_t represents the overall computational efficiency, MCE_i represents the computational efficiency of the tested model computing system in the i-th test scenario, ∏(·) represents cumulative multiplication, and f_2i represents the second weight coefficient of the tested model computing system in the i-th test scenario.
30. The simulation efficiency testing method according to claim 1, characterized in that, Controlling the device under test to run the software framework under test to invoke the model under test and the dataset to perform model computation tasks includes: The device under test loads the software framework under test and performs initialization of the software framework under test; The software framework under test is started, and the model under test and the dataset are invoked to perform the model calculation task.
31. The simulation efficiency testing method according to claim 30, characterized in that, The process of loading the software framework under test on the device under test and performing initialization of the software framework under test includes: Load the library files of the test model computing system onto the device under test; The type of the interface function of the computing system of the tested model is obtained through the calling interface of the library file; The interface function is initialized according to its type.
32. The simulation efficiency testing method according to claim 31, characterized in that, The types of the interface functions include at least: initialization functions, completion functions, data loading functions, data unloading functions, and model calculation control functions; The initialization function is the interface function that performs initialization according to the test configuration information; the completion function is the interface function that executes the task after the tested model calculation system has completed the model calculation task; the data loading function is the interface function that loads the model calculation data; the data unloading function is the interface function that unloads the model calculation data; and the model calculation control function is the interface function that executes the model calculation task.
33. The simulation efficiency testing method according to claim 32, characterized in that, The process of launching the software framework under test and invoking the model under test to execute the model computation task includes: The software framework under test is started to execute the model calculation task; Obtain the first result returned by the tested software framework after calling the data loading function to load the model's calculated data; Obtain the second result of the tested software framework calling the model computation control function to execute the model computation task; Wait for the third result output by the software framework under test after it has completed the model calculation task; Obtain the fourth result returned by the tested software framework calling the data unloading function to unload the model's computational data; The model accuracy of the tested model is calculated based on the execution results of the model calculation task, including: The model accuracy is calculated based on at least one of the first result, the second result, the third result, and the fourth result.
34. A simulation efficiency testing system, characterized in that it comprises a testing device and a device under test; The device under test is configured to run the software framework under test to perform model calculation tasks based on the model under test and the dataset. The testing device is configured to acquire the dataset and control the device under test to run the software framework under test to call the model under test and the dataset to perform model calculation tasks; to determine model performance parameters and calculation performance parameters according to the model calculation tasks; and to determine the model calculation efficiency of the model calculation system under test according to the model performance parameters, the calculation performance parameters, and the model parameter quantity evaluation parameters of the model under test, so as to obtain the test results of the model calculation system under test. The model calculation system under test includes the model under test, the software framework under test, and the device under test.
35. A simulation efficiency testing device, characterized in that it comprises: an acquisition unit configured to acquire a dataset; a control unit configured to control the device under test to run the software framework under test in order to invoke the model under test and the dataset to perform model calculation tasks; and a determining unit configured to determine model performance parameters and computational performance parameters based on the model computation task, and to determine the computational efficiency of the tested model computing system based on the model performance parameters, the computational performance parameters, and the model parameter quantity evaluation parameters of the tested model, so as to obtain the test results of the tested model computing system. The tested model computing system includes the tested model, the tested software framework, and the tested device.
36. A simulation efficiency testing device, characterized in that it comprises: a memory configured to store a computer program; and a processor configured to execute the computer program, wherein the computer program, when executed by the processor, implements the steps of the simulation efficiency testing method as described in any one of claims 1 to 33.
37. A non-volatile storage medium having a computer program stored thereon, characterized in that, When the computer program is executed by the processor, it implements the steps of the simulation efficiency testing method as described in any one of claims 1 to 33.
38. A computer program product, comprising a computer program, characterized in that, When the computer program is executed by the processor, it implements the steps of the simulation efficiency testing method as described in any one of claims 1 to 33.
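The aggregation options recited in claims 26 to 29 (an arithmetic or a geometric mean of the per-scenario efficiencies, each with per-scenario weight coefficients) can be sketched as follows. The formulas themselves are not reproduced in the text above, so this sketch assumes the conventional weighted forms normalized by the sum of the weights. The function names and the example values are hypothetical.

```python
import math
from typing import Sequence


def _check(values: Sequence[float], weights: Sequence[float]) -> float:
    """Validate inputs and return the sum of the weights."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and of equal length")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    return float(sum(weights))


def weighted_arithmetic_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Assumed form: sum(f_1i * MCE_i) / sum(f_1i)."""
    total = _check(values, weights)
    return sum(w * v for v, w in zip(values, weights)) / total


def weighted_geometric_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Assumed form: prod(MCE_i ** f_2i) ** (1 / sum(f_2i))."""
    total = _check(values, weights)
    if any(v < 0 for v in values):
        raise ValueError("efficiencies must be non-negative")
    if any(v == 0 and w > 0 for v, w in zip(values, weights)):
        return 0.0  # one zero-efficiency scenario drives the product to zero
    # Sum the logarithms instead of multiplying directly to avoid underflow.
    log_sum = sum(w * math.log(v) for v, w in zip(values, weights) if w > 0)
    return math.exp(log_sum / total)


# Illustrative per-scenario efficiencies and equal weights.
scenario_mce = [0.25, 1.0]
weights = [1.0, 1.0]
print(weighted_arithmetic_mean(scenario_mce, weights))
print(weighted_geometric_mean(scenario_mce, weights))
```

The two options behave differently when one scenario performs poorly. The arithmetic mean lets a strong scenario offset a weak one, whereas the geometric mean is pulled toward the weakest scenario and becomes 0 if any weighted scenario has an efficiency of 0.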