A Deep Learning-Based Energy Efficiency Optimization Method for AI Computing Equipment
By allocating a dedicated hardware computing partition to each AI computing task, collecting power consumption and performance data in real time, constructing an energy efficiency ratio sequence, and adjusting the processor frequency accordingly, the method addresses the low energy efficiency of AI computing devices running deep learning tasks and enables more efficient energy management.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Applications (China)
- Current Assignee / Owner: ZHEJIANG WULUO SMART CITY TECHNOLOGY CO LTD
- Filing Date: 2026-04-01
- Publication Date: 2026-06-30
AI Technical Summary
Existing AI computing devices lack energy efficiency benchmarks for specific tasks and operational phases when performing deep learning tasks, resulting in frequency mismatch, low energy efficiency, and energy waste, making it difficult to meet computing power requirements at different execution phases.
A dedicated hardware computing partition is allocated to each computing task, real-time power consumption and performance throughput data are collected, an energy efficiency ratio sequence is constructed, a matching benchmark energy efficiency curve is retrieved, and frequency adjustment instructions are generated to optimize the processor frequency and improve energy efficiency.
The method maintains a reasonable energy efficiency level at each operating stage, reduces ineffective power consumption, improves the utilization of computing resources, makes frequency regulation more targeted and stable, and improves long-term operating quality.
Figure CN122308588A_ABST
Description
Technical Field
[0001] This invention relates to the field of deep learning technology, and more specifically to a method for optimizing the energy efficiency of AI computing devices based on deep learning.
Background Technology
[0002] With the increasing prevalence of deep learning applications, AI computing devices typically need to maintain high load operation for extended periods when performing large-scale model training and inference tasks. This leads to a continuous increase in overall energy consumption, becoming a significant factor limiting the deployment cost and operational efficiency of computing systems. Currently, when running deep learning computational tasks, the processor frequency of existing AI computing devices is often adjusted based on static configuration or simple system-level strategies, which fails to fully reflect the actual computing power requirements and energy efficiency characteristics of specific computational tasks at different execution stages.
[0003] Because deep learning tasks often exhibit significant stage-specific differences during execution, with varying degrees of dependence on scalar, vector, or tensor computation units at different stages, the relationship between performance throughput and power consumption also changes accordingly. This leads to situations where, under a unified frequency strategy, some stages experience low energy efficiency or even energy waste. Furthermore, the lack of energy efficiency benchmarks for specific tasks and operational stages makes it difficult for the system to promptly determine whether its current operating state is within a reasonable energy efficiency range, thus hindering effective decisions regarding frequency adjustments.
Summary of the Invention
[0004] The purpose of this invention is to provide a method for optimizing the energy efficiency of AI computing devices based on deep learning, thereby solving the aforementioned technical problems.
[0005] The objective of this invention can be achieved through the following technical solution. A method for optimizing the energy efficiency of AI computing devices based on deep learning includes the following steps: allocating a dedicated hardware computing partition to the AI computing task to be executed, and starting to collect the real-time power consumption sensor data stream of that hardware computing partition; during execution of the computing task, synchronously collecting the processor frequency sequence of the hardware computing partition and the performance throughput sequence reflecting task progress; calculating the real-time energy efficiency ratio sequence from the real-time power consumption sensor data stream and the performance throughput sequence, where the real-time energy efficiency ratio is the performance throughput achieved per unit of power consumption; matching the processor frequency sequence with the real-time energy efficiency ratio sequence to construct a coordinate point sequence; retrieving, from the pre-built task energy efficiency benchmark library, the benchmark energy efficiency curve that matches the current computing task and the current running stage; and determining the target frequency adjustment value based on the coordinate point sequence and the benchmark energy efficiency curve, and generating a frequency adjustment instruction.
[0006] As a further aspect of the present invention, the real-time energy efficiency ratio sequence is calculated as follows. The performance throughput sequence is obtained by periodically querying the application layer progress counter of the computing task, which counts the number of processed samples or tokens: the difference between the counts of two adjacent query periods is divided by the period duration to obtain the performance throughput, and the performance throughput values are arranged in chronological order to form the performance throughput sequence. Each data point in the performance throughput sequence is then divided by the real-time power consumption sensor reading at the same moment to obtain a real-time energy efficiency ratio value. The real-time energy efficiency ratio values are arranged in chronological order to obtain a real-time energy efficiency ratio value sequence, which is filtered by a moving average to form the real-time energy efficiency ratio sequence.
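The calculation in this paragraph can be written compactly. The following is only a sketch: the symbols are assumptions introduced here for illustration and are not notation from the patent. N_k is the cumulative progress counter value at the k-th query, \Delta t is the query period, P_k is the power reading at the same moment, and W = 2m + 1 is the moving-average window length. The centered alignment follows the detailed embodiment, which outputs each smoothed point at the window's center timestamp.

```latex
T_k = \frac{N_k - N_{k-1}}{\Delta t}, \qquad
E_k = \frac{T_k}{P_k}, \qquad
\bar{E}_k = \frac{1}{2m+1} \sum_{j=-m}^{m} E_{k+j}
```

Here T_k is the performance throughput, E_k the raw real-time energy efficiency ratio value, and \bar{E}_k the smoothed value that enters the real-time energy efficiency ratio sequence.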
[0007] As a further aspect of the present invention, the process of constructing a sequence of coordinate points is as follows: Each data point of the processor frequency sequence is paired with a data point of the real-time energy efficiency ratio sequence at the same moment. Each pairing result forms a coordinate point, which contains a processor frequency value and a real-time energy efficiency ratio value. Arrange all the coordinate points generated in chronological order to form a coordinate point sequence.
[0008] As a further aspect of the present invention, the task energy efficiency benchmark library is pre-constructed as follows: a benchmark computing task is selected and executed on a hardware computing partition in an interference-free environment; while the benchmark computing task executes in the hardware computing partition, the processor is fixed at one frequency level, the average power consumption and average performance throughput of a task stage are recorded, and the ratio of the average performance throughput to the average power consumption is calculated as the stage energy efficiency ratio; all available processor frequency levels are traversed, with the same task stage of the benchmark computing task executed repeatedly, so that one stage energy efficiency ratio is obtained for each frequency level; each frequency level and its corresponding stage energy efficiency ratio are plotted in a coordinate system with processor frequency as the horizontal axis and stage energy efficiency ratio as the vertical axis; all plotted data points are connected in order of processor frequency from low to high to form the benchmark energy efficiency curve for that task stage; the benchmark energy efficiency curve is associated with the task feature identifier of the benchmark computing task and stored together with it. The set of all stored associations between task feature identifiers and benchmark energy efficiency curves constitutes the task energy efficiency benchmark library.
[0009] As a further aspect of the present invention: the process of dividing the task phases is as follows: The performance profiler monitors the usage ratio of different computing units in the hardware computing partition during the execution of the benchmark computing task. The computing units are scalar computing units, vector computing units, and tensor computing units. The moment when the change in the proportion of computing unit usage exceeds a preset proportion change threshold is marked as the stage boundary, and the stage boundary divides the execution process of the benchmark computing task into multiple consecutive task stages. Each task phase is assigned a unique identifier as its index number.
[0010] As a further aspect of the present invention, the benchmark energy efficiency curve is retrieved as follows: the hash value of the program code of the current computing task is calculated to obtain the current task hash value; the task energy efficiency benchmark library is searched using the current task hash value to obtain all task feature identifiers corresponding to that hash value and their associated benchmark energy efficiency curves; the stage in which the computing task is currently executing is identified to obtain the current stage index number; and the benchmark energy efficiency curve matching the current stage index number is selected from the obtained benchmark energy efficiency curves.
[0011] As a further aspect of the present invention, the frequency adjustment instruction is generated as follows: the latest coordinate point in the coordinate point sequence is obtained, and its processor frequency value is extracted as the current frequency; on the retrieved benchmark energy efficiency curve, the reference point closest to the current frequency is found, and its energy efficiency ratio value is recorded as the reference energy efficiency ratio; the real-time energy efficiency ratio value of the latest coordinate point is obtained as the current energy efficiency ratio; when the current energy efficiency ratio is lower than the reference energy efficiency ratio, the point with the highest energy efficiency ratio on the benchmark energy efficiency curve is found, and the processor frequency value corresponding to that point is determined as the target frequency adjustment value; an instruction is generated to adjust the processor frequency of the hardware computing partition to the target frequency adjustment value.
[0012] The beneficial effects of this invention compared to the prior art are as follows: This invention dynamically characterizes and judges the energy efficiency status of deep learning computing tasks during actual operation, enabling AI computing devices to maintain a more reasonable energy efficiency level at different operating stages, thereby effectively reducing ineffective power consumption caused by frequency mismatch. This invention can identify deviations between current performance and power consumption for specific computing tasks and their operating stages, avoiding the continuous maintenance of unreasonable operating states during stages of low computing power demand or declining energy efficiency, thus improving the utilization efficiency of computing resources. Simultaneously, by introducing task- and stage-related energy efficiency references, this invention provides clear energy efficiency judgment criteria for devices during operation, thereby enhancing the pertinence and stability of frequency adjustment decisions and reducing energy efficiency degradation caused by environmental fluctuations or load changes. This invention also enables continuous optimization of energy efficiency status while ensuring continuous execution of computing tasks, reducing the impact of frequent adjustments on system stability and helping to improve the overall operating quality of AI computing devices in long-term operation scenarios. Through these effects, this invention makes AI computing devices more aligned with actual computing needs when performing deep learning tasks, achieving a better balance between performance and energy consumption, and has good practical value and promotional significance.
Attached Figure Description
[0013] The invention will now be further described with reference to the accompanying drawings.
[0014] Figure 1 is a flowchart of the energy efficiency optimization method for AI computing devices based on deep learning according to the present invention. Figure 2 is a flowchart of the process of generating frequency adjustment instructions according to the present invention.
Detailed Implementation
[0015] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0016] As shown in Figures 1-2, the present invention is a method for optimizing the energy efficiency of AI computing devices based on deep learning, comprising the following steps: Allocate a dedicated hardware computing partition for the AI computing task to be executed, and start collecting real-time power consumption sensor data streams for that hardware computing partition. Specifically, before issuing execution instructions for an AI computing task to be executed, a hardware computing partition that can be exclusively used is selected from the AI computing device based on the resource requirements of the computing task. This hardware computing partition is then bound to the AI computing task, ensuring that the computing task uses only the processor and computing units within the hardware computing partition during execution, avoiding resource competition with other computing tasks. A hardware computing partition refers to an independent computing area within a physical processor or accelerator, partitioned by hardware or a driver layer. This computing area has an independently configurable processor frequency and its power consumption can be independently monitored. After binding, the power sensor identification information corresponding to the hardware computing partition is read, and a power data acquisition channel is established based on this identification information. This allows the power sensor to continuously output power data reflecting the instantaneous power consumption changes of the hardware computing partition according to a preset sampling period.
[0017] During the execution of the computing task, the processor frequency sequence of the hardware computing partition and the performance throughput sequence reflecting the task progress are collected synchronously. Specifically, after the AI computing task enters the execution state on the hardware computing partition, the current frequency of the processor in the hardware computing partition is read and written to the frequency recording buffer according to the preset sampling period. The current frequency of the processor refers to the actual operating frequency point provided by the hardware clock domain and queried by the driver layer. The frequency recording buffer outputs the processor frequency sequence in the order of timestamps. The real-time power consumption sensor reading corresponding to the timestamp is read within the same sampling period and aligned with the frequency timestamp.
[0018] The real-time energy efficiency ratio sequence is calculated based on the real-time power consumption sensor data stream and the performance throughput sequence. The real-time energy efficiency ratio is the performance throughput achieved per unit of power consumption. In another preferred embodiment of the present invention, the real-time energy efficiency ratio sequence is calculated as follows. The performance throughput sequence reflecting task progress is provided by an application-layer progress counter, a monotonically increasing count variable maintained by the application layer of the computing task that accumulates the number of work units already processed. A work unit is either a training sample or an inference token. At each periodic query, the current count value, the previous count value, and the query timestamp are recorded. The count difference between two adjacent queries is divided by the interval between them to obtain the performance throughput data point for that time period, and the data points are arranged in timestamp order to form the performance throughput sequence. Each performance throughput data point is matched by timestamp with the real-time power consumption sensor reading at the same moment and divided by that reading to obtain a real-time energy efficiency ratio (EER) value, which represents the performance throughput achieved per unit of power consumption at that sampling moment. The real-time EER values are arranged in chronological order to form a real-time EER value sequence. This sequence is then smoothed by a moving average with a fixed window length: the consecutive EER values within the window are averaged, and the smoothed EER data point is output at the window's center timestamp. The continuously output data points form the real-time EER sequence.
[0019] Understandably, synchronously collecting processor frequency, performance throughput, and power consumption readings on the same time base and aligning them with timestamps ensures that the energy efficiency ratio at each moment corresponds to a clear change in operating frequency and task progress, thereby avoiding energy efficiency distortion caused by delays in different sampling channels. The application-layer progress counter describes task progress using the number of completed work units, allowing throughput to directly reflect the processing rate of the task at the current stage. The energy efficiency ratio, combined with real-time power consumption readings, characterizes the energy utilization status of the task at the current frequency. A moving average smooths out instantaneous jitter, forming a stable energy efficiency sequence.
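The throughput, energy efficiency ratio, and smoothing steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `CounterSample`, `throughput_sequence`, `eer_values`, and `moving_average` are hypothetical, power readings are assumed to be already aligned to the same timestamps as the counter queries, and the window length is assumed to be odd so that a center timestamp exists.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CounterSample:
    timestamp: float  # query time in seconds
    count: int        # cumulative samples or tokens processed so far


def throughput_sequence(samples: List[CounterSample]) -> List[Tuple[float, float]]:
    """Work units per second between each pair of adjacent counter queries."""
    out = []
    for prev, cur in zip(samples, samples[1:]):
        dt = cur.timestamp - prev.timestamp
        if dt <= 0:
            continue  # skip duplicate or out-of-order queries
        out.append((cur.timestamp, (cur.count - prev.count) / dt))
    return out


def eer_values(throughput: List[Tuple[float, float]],
               power_by_ts: Dict[float, float]) -> List[Tuple[float, float]]:
    """Divide each throughput point by the power reading at the same timestamp."""
    out = []
    for ts, tp in throughput:
        watts = power_by_ts.get(ts)
        if watts is None or watts <= 0:
            continue  # no usable power reading for this moment
        out.append((ts, tp / watts))
    return out


def moving_average(points: List[Tuple[float, float]],
                   window: int = 3) -> List[Tuple[float, float]]:
    """Centered moving average; each output uses the window's center timestamp."""
    half = window // 2
    out = []
    for i in range(half, len(points) - half):
        chunk = points[i - half:i + half + 1]
        out.append((points[i][0], sum(v for _, v in chunk) / len(chunk)))
    return out
```

A centered window delays each output by half a window, which is the price of the jitter suppression the paragraph describes; a real controller would have to choose the window length with that delay in mind.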
[0020] The processor frequency sequence is matched with the real-time energy efficiency ratio sequence to construct a coordinate point sequence; In another preferred embodiment of the present invention, the process of constructing the coordinate point sequence is as follows: Each data point of the processor frequency sequence is paired with a data point of the real-time energy efficiency ratio sequence at the same moment. Each pairing result forms a coordinate point, which contains a processor frequency value and a real-time energy efficiency ratio value. Arrange all the coordinate points generated in chronological order to form a coordinate point sequence.
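As an illustration of the pairing step, the sketch below joins the two sequences on their timestamps. The type `CoordinatePoint` and the function `build_coordinate_points` are hypothetical names, and dropping frequency samples that have no energy efficiency ratio value at the same timestamp is an assumption; the source does not say how unmatched samples are handled.

```python
from typing import List, NamedTuple, Tuple


class CoordinatePoint(NamedTuple):
    timestamp: float
    frequency_mhz: float
    eer: float


def build_coordinate_points(freq_seq: List[Tuple[float, float]],
                            eer_seq: List[Tuple[float, float]]) -> List[CoordinatePoint]:
    """Pair frequency and EER values that share a timestamp, in time order."""
    eer_by_ts = dict(eer_seq)
    points = [
        CoordinatePoint(ts, freq, eer_by_ts[ts])
        for ts, freq in freq_seq
        if ts in eer_by_ts  # assumption: unmatched frequency samples are dropped
    ]
    points.sort(key=lambda p: p.timestamp)
    return points
```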
[0021] Retrieve the benchmark energy efficiency curve that matches the current computing task and the current running stage from the pre-built task energy efficiency benchmark library; In a preferred embodiment of the present invention, the pre-construction process of the task energy efficiency benchmark library is as follows: A deep learning training or inference program is selected as the benchmark computing task and started to execute under the condition that the hardware computing partition is in an exclusive state and no other tasks occupy the computing resources of the hardware computing partition. The interference-free environment means that no other computing tasks other than the benchmark computing task are scheduled to enter the hardware computing partition and the power consumption sensor and performance acquisition channel are in a stable and available state.
[0022] During execution, the performance profiler continuously reads the usage ratio of scalar computing units, vector computing units, and tensor computing units within the hardware computing partition. The usage ratio is the proportion of time that the corresponding computing unit is busy within the sampling period. The change in the usage ratio between adjacent sampling periods is compared with a preset ratio change threshold. Sampling moments that exceed the threshold are recorded as stage boundaries. The execution process is divided into continuous task stages based on the stage boundaries, and a unique task stage index number is assigned to each task stage.
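The stage division can be sketched as a threshold test on consecutive profiler samples. The source does not state how the change across the three computing units is combined into one value; taking the largest absolute change among the scalar, vector, and tensor usage ratios is an assumption made here, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Usage = Tuple[float, float, float]  # (scalar, vector, tensor) busy-time ratios


def detect_stage_boundaries(usage: Sequence[Usage], threshold: float) -> List[int]:
    """Return sample indices where the usage ratio change exceeds the threshold."""
    boundaries = []
    for i in range(1, len(usage)):
        # Assumption: the change is the largest per-unit absolute difference.
        change = max(abs(a - b) for a, b in zip(usage[i], usage[i - 1]))
        if change > threshold:
            boundaries.append(i)
    return boundaries


def assign_stage_indices(n_samples: int, boundaries: List[int]) -> List[int]:
    """Give every sample the index number of the task stage it belongs to."""
    boundary_set = set(boundaries)
    indices = []
    stage = 0
    for i in range(n_samples):
        if i in boundary_set:
            stage += 1  # a boundary starts a new stage
        indices.append(stage)
    return indices
```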
[0023] For any task stage, the processor is locked to one frequency level and held there. A frequency level is a discrete frequency setting supported by the processor. Once the task stage begins, real-time power consumption readings and the performance throughput data obtained from the application layer progress counter are collected continuously and aggregated by timestamp to obtain the average power consumption and average performance throughput for that stage. The stage energy efficiency ratio is calculated as the ratio of the average performance throughput to the average power consumption and stored in association with the current frequency level. This process is repeated at every available frequency level for the same task stage, yielding one stage energy efficiency ratio per frequency level. Each frequency level and its stage energy efficiency ratio are written into a coordinate data set that uses processor frequency as the horizontal axis and stage energy efficiency ratio as the vertical axis; the points are arranged from low to high frequency and connected sequentially to form the benchmark energy efficiency curve for that task stage. This benchmark energy efficiency curve is then associated with the task feature identifier of the benchmark computing task, which uniquely identifies that task. The set of all associated records of task feature identifiers and their corresponding benchmark energy efficiency curves constitutes the task energy efficiency benchmark library and serves as input for subsequent retrieval.
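A rough sketch of the curve construction follows. It assumes the per-frequency measurements have already been collected; `phase_eer`, `build_benchmark_curve`, and `store_curve` are hypothetical names, and the nested dictionary keyed by task feature identifier and stage index is only one possible layout for the benchmark library.

```python
from typing import Dict, List, Sequence, Tuple

Curve = List[Tuple[float, float]]  # (frequency level, stage energy efficiency ratio)


def phase_eer(power_readings: Sequence[float],
              throughput_readings: Sequence[float]) -> float:
    """Average throughput divided by average power for one stage at one frequency."""
    avg_power = sum(power_readings) / len(power_readings)
    avg_throughput = sum(throughput_readings) / len(throughput_readings)
    return avg_throughput / avg_power


def build_benchmark_curve(
        runs: Dict[float, Tuple[Sequence[float], Sequence[float]]]) -> Curve:
    """runs maps each frequency level to (power readings, throughput readings)."""
    # Sorting the tuples orders the curve points from low to high frequency.
    return sorted((freq, phase_eer(p, t)) for freq, (p, t) in runs.items())


def store_curve(library: Dict[str, Dict[int, Curve]],
                task_id: str, stage_index: int, curve: Curve) -> None:
    """Associate a curve with a task feature identifier and a stage index."""
    library.setdefault(task_id, {})[stage_index] = curve
```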
[0024] It is worth noting that executing a benchmark computing task under interference-free conditions with a fixed processor operating frequency ensures that power consumption and performance variations are solely caused by frequency factors. This results in a stable and reproducible performance-energy consumption relationship, and the stage energy efficiency ratios formed at different frequency levels accurately reflect the energy efficiency characteristics of the task in its corresponding operating stage. By dividing the computing task execution process into stages, the computing patterns and hardware resource usage characteristics within each stage remain relatively consistent, avoiding the mixing of different computing characteristics in the same energy efficiency description, thereby improving the representativeness of the energy efficiency curve. Using a performance profiler to identify moments where the proportion of computing unit usage changes significantly as stage boundaries ensures that stage division is based on hardware execution behavior rather than manual settings, helping to guarantee the differentiation of computing power requirements in different stages. The benchmark energy efficiency curve constructed based on the above method can serve as a reference benchmark in subsequent operations to determine the energy efficiency deviation under the current operating state, providing a clear reference direction for frequency adjustment. This supports AI computing devices in achieving a more reasonable energy efficiency state during deep learning tasks, avoiding energy efficiency degradation caused by inappropriate frequency selection.
[0025] In another preferred embodiment of the present invention, the benchmark energy efficiency curve is retrieved as follows: the hash value of the program code of the current computing task is calculated to obtain the current task hash value; the task energy efficiency benchmark library is searched using the current task hash value to obtain all task feature identifiers corresponding to that hash value and their associated benchmark energy efficiency curves; the stage in which the computing task is currently executing is identified to obtain the current stage index number; and the benchmark energy efficiency curve matching the current stage index number is selected from the obtained benchmark energy efficiency curves.
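The lookup can be illustrated as a two-level dictionary access. The source does not name a hash algorithm, so SHA-256 from Python's standard `hashlib` is an assumption, as are the function names and the simplified library layout (task hash, then stage index, then curve).

```python
import hashlib
from typing import Dict, List, Optional, Tuple

Curve = List[Tuple[float, float]]  # (frequency, energy efficiency ratio)


def task_hash(program_code: bytes) -> str:
    """Hash the task's program code (SHA-256 is an assumed choice)."""
    return hashlib.sha256(program_code).hexdigest()


def retrieve_curve(library: Dict[str, Dict[int, Curve]],
                   program_code: bytes,
                   stage_index: int) -> Optional[Curve]:
    """Find the curves for this task hash, then pick the current stage's curve."""
    curves_by_stage = library.get(task_hash(program_code))
    if curves_by_stage is None:
        return None  # task not present in the benchmark library
    return curves_by_stage.get(stage_index)
```

Returning `None` for an unknown task or stage leaves the fallback policy open; the source does not describe what happens when no curve matches.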
[0026] In one optional implementation, the task feature identifier can be determined not only by the hash value generated from the program code of the computing task, but also by combining the runtime characteristics exhibited by the computing task during operation. These runtime characteristics include, but are not limited to: the distribution of the actual usage ratio of different computing units in each task stage, the statistical characteristics of performance throughput per unit time, the trend characteristics of power consumption changes, and the response characteristics of processor frequency changes.
[0027] When retrieving the task energy efficiency benchmark library, the system can first perform preliminary screening of candidate benchmark energy efficiency curves based on the hash value of the program code. Then, based on the real-time runtime characteristics of the current computing task during its execution phase, it compares the similarity with the runtime characteristics corresponding to the candidate benchmark energy efficiency curves to determine the benchmark energy efficiency curve that is closest to the runtime characteristics of the current computing task. Through this method, even when the same program code corresponds to different input scales, different parallel configurations, or different execution environments, the retrieved benchmark energy efficiency curves can still be guaranteed to have effective reference value in energy efficiency optimization decisions.
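The second screening step can be sketched as a nearest-match search over runtime feature vectors. The source does not specify a similarity measure or a feature encoding; cosine similarity over a plain numeric vector is an assumption chosen for illustration, and `select_closest_curve` is a hypothetical name.

```python
import math
from typing import List, Optional, Sequence, Tuple

Curve = List[Tuple[float, float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_closest_curve(candidates: List[Tuple[Sequence[float], Curve]],
                         current_features: Sequence[float]) -> Optional[Curve]:
    """Among hash-matched candidates, pick the curve with the most similar features."""
    best = max(candidates,
               key=lambda c: cosine_similarity(c[0], current_features),
               default=None)
    return None if best is None else best[1]
```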
[0028] The target frequency adjustment value is determined based on the coordinate point sequence and the reference energy efficiency curve, and a frequency adjustment command is generated.
[0029] In another preferred embodiment of the present invention, the process of generating frequency adjustment instructions is as follows: After constructing the coordinate point sequence, the coordinate point with the largest timestamp is read from the sequence as the latest coordinate point. The latest coordinate point refers to the frequency and energy efficiency observation results corresponding to the current sampling period. The processor frequency value is read from this coordinate point and determined as the current frequency. Based on the benchmark energy efficiency curve obtained from the current computing task and current running stage, the curve data point closest to the current frequency value is retrieved from the benchmark energy efficiency curve. "Closest" means the difference between the frequency value of the curve data point and the current frequency is the smallest. The energy efficiency ratio value corresponding to this curve data point is read and used as the reference energy efficiency ratio. Simultaneously, the real-time energy efficiency ratio value is read from the latest coordinate point and used as the current energy efficiency ratio. The real-time energy efficiency ratio value reflects the energy efficiency level of the computing task at the current frequency and current running state. When it is determined that the current energy efficiency ratio is lower than the reference energy efficiency ratio, all curve data points are traversed along the benchmark energy efficiency curve and their energy efficiency ratio values are compared. The curve data point with the largest energy efficiency ratio value is selected, and the processor frequency value corresponding to this curve data point is determined as the target frequency adjustment value. A frequency adjustment instruction is generated based on the target frequency adjustment value. 
The frequency adjustment instruction is a control instruction that can be recognized by the processor frequency control interface. It is used to adjust the operating frequency of the processor in the hardware computing partition from the current frequency to the target frequency adjustment value, and output the instruction to the frequency control execution unit for execution.
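The decision logic of this embodiment reduces to a nearest-point lookup followed by a maximum search. The sketch below is a minimal illustration; `target_frequency` is a hypothetical name, the curve is assumed to be a non-empty list of (frequency, energy efficiency ratio) pairs, and returning `None` when no adjustment is needed is a convention chosen here.

```python
from typing import List, Optional, Tuple

Curve = List[Tuple[float, float]]  # (frequency, energy efficiency ratio)


def target_frequency(latest_point: Tuple[float, float],
                     curve: Curve) -> Optional[float]:
    """Return the target frequency adjustment value, or None if no change is needed."""
    current_freq, current_eer = latest_point
    # Reference point: the curve point whose frequency is closest to the current one.
    _, reference_eer = min(curve, key=lambda p: abs(p[0] - current_freq))
    if current_eer >= reference_eer:
        return None  # current efficiency is not below the reference
    # Otherwise target the frequency with the highest EER on the curve.
    best_freq, _ = max(curve, key=lambda p: p[1])
    return best_freq
```

For example, with the curve `[(1000, 3.0), (1200, 3.5), (1500, 2.0)]` and a latest point of `(1450, 1.5)`, the reference point is the one at 1500 with a ratio of 2.0, so the function returns 1200.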
[0030] Understandably, by comparing the frequency and energy efficiency observations under real-time operating conditions with the benchmark energy efficiency curve, the current operating state can be mapped to a pre-established energy efficiency reference space. When the actual operating energy efficiency deviates from the benchmark level, a more reasonable operating frequency range can be determined based on the known energy efficiency distribution relationship in the benchmark curve. The benchmark energy efficiency curve reflects the energy efficiency characteristics at different frequencies during a specific task phase. Selecting the frequency with the best energy efficiency performance as the adjustment target provides a clear reference for frequency adjustment, thereby guiding the processor's operating state to converge towards a more reasonable energy efficiency region. This helps to achieve continuous energy efficiency optimization of AI computing devices during task execution.
[0031] In a preferred embodiment, when the current real-time energy efficiency ratio is detected to be lower than the reference energy efficiency ratio on the benchmark energy efficiency curve, the processor frequency does not have to be set in a single step to the frequency with the highest energy efficiency ratio on the benchmark energy efficiency curve. Instead, that frequency serves as a target reference frequency for energy efficiency optimization and indicates the direction of processor frequency adjustment.
[0032] Specifically, the system can gradually adjust the processor frequency based on the frequency difference between the current processor frequency and the target reference frequency, combined with preset frequency adjustment step size limits, adjustment periods, and system stability constraints. After each frequency adjustment, real-time power consumption sensor data and performance throughput data are re-acquired, and the real-time energy efficiency ratio sequence is updated. When the real-time energy efficiency ratio is detected to reach or approach the expected energy efficiency level of the benchmark energy efficiency curve within the corresponding frequency range, or when preset performance constraints are met, further frequency adjustments are stopped. Through this gradual frequency adjustment method, frequency oscillations caused by instantaneous operational disturbances, measurement errors, or load changes during task phases can be avoided, thereby ensuring the stability and controllability of the energy efficiency optimization process.
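The gradual adjustment can be sketched as a bounded-step loop. Everything here is illustrative: `next_frequency` and `step_until_converged` are hypothetical names, `measure_eer` stands in for re-acquiring power and throughput data after each step, and the 5% tolerance used to decide that the ratio has "approached" the expected level is an assumed value, not one given in the source.

```python
from typing import Callable, List


def next_frequency(current: int, target: int, max_step: int) -> int:
    """Move toward the target by at most max_step (the step size limit)."""
    delta = target - current
    if abs(delta) <= max_step:
        return target
    return current + max_step if delta > 0 else current - max_step


def step_until_converged(current: int, target: int, max_step: int,
                         measure_eer: Callable[[int], float],
                         expected_eer: float,
                         tolerance: float = 0.05) -> List[int]:
    """Step toward the target, stopping early once the measured EER is close enough."""
    trace = [current]
    while current != target:
        current = next_frequency(current, target, max_step)
        trace.append(current)
        # Re-measure after each adjustment; stop when within tolerance of the goal.
        if measure_eer(current) >= expected_eer * (1 - tolerance):
            break
    return trace
```

Stopping as soon as the measured ratio is close enough, rather than always running to the target, is what keeps the number of adjustments small.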
[0033] The above formulas are all calculated on dimensionless values. They were derived by software simulation on a large amount of collected data so as to approximate real-world conditions as closely as possible. The preset parameters and thresholds in the formulas are set by those skilled in the art according to the actual situation. The foregoing has provided a detailed description of one embodiment of the present invention, but this description is merely a preferred embodiment and should not be construed as limiting the scope of the invention. All equivalent variations and modifications made within the scope of the present invention should still fall within the scope of the present invention.
Claims
1. A method for optimizing the energy efficiency of AI computing devices based on deep learning, characterized in that it comprises the following steps: allocating a dedicated hardware computing partition for the AI computing task to be executed, and starting collection of a real-time power consumption sensor data stream for that hardware computing partition; during execution of the computing task, synchronously collecting the processor frequency sequence of the hardware computing partition and a performance throughput sequence reflecting task progress; calculating a real-time energy efficiency ratio sequence from the real-time power consumption sensor data stream and the performance throughput sequence, where the real-time energy efficiency ratio is the performance throughput achieved per unit of power consumption; matching the processor frequency sequence with the real-time energy efficiency ratio sequence to construct a coordinate point sequence; retrieving, from a pre-built task energy efficiency benchmark library, the benchmark energy efficiency curve that matches the current computing task and its current running stage; and determining a target frequency adjustment value based on the coordinate point sequence and the benchmark energy efficiency curve, and generating a frequency adjustment instruction.
2. The method for optimizing the energy efficiency of AI computing devices based on deep learning according to claim 1, characterized in that the real-time energy efficiency ratio sequence is calculated as follows: each data point in the performance throughput sequence is divided by the real-time power consumption sensor reading at the same moment to obtain a real-time energy efficiency ratio value; the real-time energy efficiency ratio values are arranged in chronological order to obtain a real-time energy efficiency ratio value sequence; the value sequence is then smoothed with a moving average filter to form the real-time energy efficiency ratio sequence; the performance throughput sequence is obtained by periodically querying the application-layer progress counter of the computing task, which counts the number of processed samples or tokens; the difference between the counts of two adjacent query periods is divided by the period duration to obtain the performance throughput; and the performance throughput values are arranged in chronological order to obtain the performance throughput sequence.
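For illustration only, and not as part of the claim, the calculation in claim 2 can be sketched in Python as follows. The function names are hypothetical, the power readings are assumed to be positive and already aligned with the end of each query period, and the trailing-window form of the moving average is one possible choice that the claim does not specify.

```python
def throughput_from_counter(counts, period_s):
    """Samples (or tokens) per second between adjacent progress-counter queries."""
    return [(b - a) / period_s for a, b in zip(counts, counts[1:])]


def energy_efficiency_ratio(throughput, power_w):
    """Throughput per watt, pairing readings taken at the same moment."""
    return [t / p for t, p in zip(throughput, power_w)]


def moving_average(values, window):
    """Trailing moving average; early points average over what is available."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / min(i + 1, window))
    return out
```

For example, counter readings of 0, 100, 300, and 600 samples taken one second apart give throughputs of 100, 200, and 300 samples per second, which are then divided by the matching power readings and smoothed.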
3. The method for optimizing the energy efficiency of AI computing devices based on deep learning according to claim 2, characterized in that the coordinate point sequence is constructed as follows: each data point of the processor frequency sequence is paired with the data point of the real-time energy efficiency ratio sequence at the same moment; each pairing result forms a coordinate point, which contains a processor frequency value and a real-time energy efficiency ratio value; and all generated coordinate points are arranged in chronological order to form the coordinate point sequence.
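For illustration only, the pairing in claim 3 might look like the short Python sketch below. It assumes, hypothetically, that both sequences are held as dictionaries keyed by a shared timestamp; moments present in only one sequence are skipped.

```python
def build_coordinate_points(freq_by_time, eer_by_time):
    """Pair frequency and ratio values that share a timestamp, in time order."""
    shared = sorted(freq_by_time.keys() & eer_by_time.keys())
    return [(freq_by_time[t], eer_by_time[t]) for t in shared]
```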
4. The method for optimizing the energy efficiency of AI computing devices based on deep learning according to claim 3, characterized in that the task energy efficiency benchmark library is pre-built as follows: a benchmark computing task is selected and executed on a hardware computing partition in an interference-free environment; while the benchmark computing task executes on the hardware computing partition, the processor is fixed at one frequency level, the average power consumption and average performance throughput of one task stage are recorded, and the ratio of the average performance throughput to the average power consumption is calculated as the stage energy efficiency ratio; all available processor frequency levels are traversed and the same task stage of the benchmark computing task is executed repeatedly, yielding multiple stage energy efficiency ratios for that task stage at different frequency levels, each stage energy efficiency ratio corresponding to one frequency level; each frequency level and its corresponding stage energy efficiency ratio are marked on a coordinate system with processor frequency as the horizontal axis and stage energy efficiency ratio as the vertical axis; all marked data points are connected in order of processor frequency from low to high to form the benchmark energy efficiency curve for that task stage; and the benchmark energy efficiency curve is associated with the task feature identifier of the benchmark computing task and stored together with it, the set of all stored associations between task feature identifiers and benchmark energy efficiency curves constituting the task energy efficiency benchmark library.
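For illustration only, the curve construction in claim 4 can be sketched in Python as follows. The data layout (a dictionary from frequency level to recorded samples, and a nested dictionary for the library) and all function names are assumptions made for this sketch.

```python
from statistics import mean


def stage_eer(power_samples_w, throughput_samples):
    """Average throughput divided by average power for one task stage."""
    return mean(throughput_samples) / mean(power_samples_w)


def build_benchmark_curve(runs):
    """runs maps a frequency level to (power_samples, throughput_samples).

    Returns (frequency, stage energy efficiency ratio) points ordered
    from the lowest frequency to the highest.
    """
    return sorted((f, stage_eer(p, t)) for f, (p, t) in runs.items())


def store_curve(library, task_id, stage_index, curve):
    """Associate a curve with a task feature identifier and stage index."""
    library.setdefault(task_id, {})[stage_index] = curve
```

Sorting by frequency corresponds to connecting the marked points from low to high frequency; any interpolation between points is left open here.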
5. The method for optimizing the energy efficiency of AI computing devices based on deep learning according to claim 4, characterized in that the task stages are divided as follows: a performance profiler monitors the usage ratio of the different computing units in the hardware computing partition during execution of the benchmark computing task, the computing units being scalar computing units, vector computing units, and tensor computing units; each moment at which the change in the computing unit usage ratio exceeds a preset ratio change threshold is marked as a stage boundary, and the stage boundaries divide the execution of the benchmark computing task into multiple consecutive task stages; and each resulting task stage is assigned a unique identifier as its stage index number.
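For illustration only, one way to realize the stage division in claim 5 is sketched below in Python. It assumes the profiler output is already available as a list of (scalar, vector, tensor) usage fractions per sampling instant, and it measures the "change" as the largest per-unit difference between adjacent samples; both choices are assumptions, since the claim does not fix them.

```python
def find_stage_boundaries(usage_samples, threshold):
    """Return sample indices at which a new task stage begins."""
    boundaries = []
    for i in range(1, len(usage_samples)):
        prev, cur = usage_samples[i - 1], usage_samples[i]
        change = max(abs(c - p) for p, c in zip(prev, cur))
        if change > threshold:
            boundaries.append(i)
    return boundaries


def assign_stage_indices(n_samples, boundaries):
    """Give every sample the index number of the stage it belongs to."""
    cuts = set(boundaries)
    indices = []
    stage = 0
    for i in range(n_samples):
        if i in cuts:
            stage += 1  # a boundary starts the next consecutive stage
        indices.append(stage)
    return indices
```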
6. The method for optimizing the energy efficiency of AI computing devices based on deep learning according to claim 5, characterized in that the benchmark energy efficiency curve is retrieved as follows: a hash value of the program code of the current computing task is calculated to obtain the current task hash value; based on the current task hash value, a search is performed in the task energy efficiency benchmark library to obtain all task feature identifiers corresponding to the current task hash value and their associated benchmark energy efficiency curves; the stage in which the current computing task is executing is identified to obtain the current stage index number; and the benchmark energy efficiency curve that matches the current stage index number is selected from the obtained benchmark energy efficiency curves.
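For illustration only, the lookup in claim 6 can be sketched in Python as follows. The claim does not name a hash function, so SHA-256 is an assumption here, as is the library layout (hash value mapped to a dictionary of stage index to curve) and the function names.

```python
import hashlib


def task_hash(program_code: bytes) -> str:
    """Hash of the task's program code (SHA-256 chosen for this sketch)."""
    return hashlib.sha256(program_code).hexdigest()


def retrieve_curve(library, program_code, stage_index):
    """Return the benchmark curve for this task and stage, or None if absent."""
    stages = library.get(task_hash(program_code))
    if stages is None:
        return None
    return stages.get(stage_index)
```

Returning `None` for an unknown task or stage leaves it to the caller to decide on a fallback, which the claim does not address.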
7. The method for optimizing the energy efficiency of AI computing devices based on deep learning according to claim 6, characterized in that the frequency adjustment instruction is generated as follows: the latest coordinate point in the coordinate point sequence is obtained, and its processor frequency value is extracted as the current frequency; on the retrieved benchmark energy efficiency curve, the reference point whose frequency is closest to the current frequency is found, and the energy efficiency ratio of that reference point is recorded as the reference energy efficiency ratio; the real-time energy efficiency ratio value of the latest coordinate point is obtained as the current energy efficiency ratio; when the current energy efficiency ratio is lower than the reference energy efficiency ratio, the point with the highest energy efficiency ratio on the benchmark energy efficiency curve is found, and the processor frequency value corresponding to that point is determined as the target frequency adjustment value; and an instruction is generated to adjust the processor frequency of the hardware computing partition to the target frequency adjustment value.
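For illustration only, the decision logic in claim 7 can be sketched in Python as follows. Points and curves are assumed to be lists of (frequency, energy efficiency ratio) tuples, and the function name and the `None` return for "no adjustment needed" are choices made for this sketch.

```python
def target_frequency(points, curve):
    """Return the target frequency adjustment value, or None if no change is needed.

    points: coordinate point sequence in time order, as (freq, eer) tuples.
    curve:  benchmark energy efficiency curve, as (freq, eer) tuples.
    """
    current_freq, current_eer = points[-1]  # latest coordinate point
    # Reference point: the curve point whose frequency is nearest the current one.
    _, ref_eer = min(curve, key=lambda p: abs(p[0] - current_freq))
    if current_eer >= ref_eer:
        return None
    # Otherwise aim for the frequency with the highest benchmark ratio.
    best_freq, _ = max(curve, key=lambda p: p[1])
    return best_freq
```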