A high-efficiency integrated power supply and heat dissipation management system for a large model fine-tuning platform system

By using a dual-power input interface, a shunt resistor + amplifier current sharing control circuit, a supercapacitor energy storage cluster, and an active filter unit with a 12V lithium battery, combined with a GPU water-cooling and CPU air-cooling split design, the problems of scattered power management, low heat dissipation efficiency, and high noise in the large model fine-tuning platform system are solved. This achieves integrated power management and improved stability, significantly improving the server's operational reliability and environmental adaptability.

CN224399807U · Active · Publication Date: 2026-06-23 · STATE GRID CORPORATION OF CHINA +3

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Utility model (China)
Current Assignee / Owner: STATE GRID CORPORATION OF CHINA
Filing Date: 2025-05-20
Publication Date: 2026-06-23

AI Technical Summary

Technical Problem

In existing large-scale model fine-tuning platform systems, server power management is decentralized and has poor adaptability, there is a prominent contradiction between heat dissipation coverage and noise, and power failure protection and status reporting are insufficient, which affects power supply efficiency and system reliability.

Method used

It adopts a dual power input interface, a shunt resistor + amplifier current sharing control loop, a supercapacitor energy storage cluster and a 12V lithium battery active filter unit, combined with a GPU water cooling and CPU air cooling split design, and uses an STM32 microcontroller to achieve dynamic current sharing, voltage smoothing and high-precision monitoring, and integrates I2C bus and PWM interface to optimize power management and heat dissipation control.

Benefits of technology

It achieves integrated and stable power management, synergistic optimization of heat dissipation efficiency and noise control, enhanced integrated reliability of monitoring and control, and improved modularity and anti-interference capability of the chassis structure, significantly improving the stability and environmental adaptability of server operation.



Abstract

The utility model provides a high-efficiency integrated power supply and heat dissipation management system for a large model fine-tuning platform system, comprising an interconnected power management module, heat dissipation management module, and control module. The power management module includes a dual power input interface, a current sharing control unit, an active filter unit, and a monitoring unit. The heat dissipation management module includes a temperature sensor and a split heat dissipation assembly consisting of a GPU water cooling module and a CPU air cooling module. The control module includes a microcontroller.

Description

Technical Field

[0001] This utility model relates to the technical fields of power management and heat dissipation management, and in particular to a high-efficiency integrated power and heat dissipation management system for a large model fine-tuning platform system.

Background Technology

[0002] With the rapid development of artificial intelligence technology, large model fine-tuning platform systems play a crucial role in data processing and model training. As the core equipment of large model fine-tuning platforms, servers require long-term, high-power computation (especially GPUs, which need to run under continuous high load during the fine-tuning phase), placing higher demands on power supply and heat dissipation management for stable operation.

[0003] Currently, most large-scale model fine-tuning platforms employ a separate design for server power supply and thermal management, which presents the following specific problems in practical applications:

[0004] Poor adaptability of distributed power management: When fine-tuning large models, the GPU needs to be continuously powered at high power (e.g., the power consumption of a single GPU can reach 300W). However, traditional discrete power supplies can only independently power the CPU and GPU, and cannot dynamically coordinate the output current of the dual power supplies. This can easily lead to overload of some power supplies and redundancy of others, affecting power supply efficiency and system reliability.

[0005] The contradiction between heat dissipation coverage and noise is prominent: When fine-tuning large models, the GPU core, memory and power supply unit generate high load heat simultaneously. Traditional all-in-one air cooling solutions, due to their limited coverage area and fixed fan speed, cannot solve the heat dissipation needs of multiple heat sources at the same time, often resulting in local overheating or high noise problems (such as high-speed fan noise exceeding 50dB), affecting the adaptability of deployment in office scenarios.

[0006] Insufficient power outage protection and status reporting: Fine-tuning of large models requires long-term continuous operation. Traditional systems rely solely on simple power outage protection in the event of a sudden power outage, and cannot report critical statuses (such as current load and temperature) in real time and complete a safe shutdown, which poses a risk of data loss.

Utility Model Content

[0007] To address the shortcomings and deficiencies of existing server power management, such as decentralized power management, low heat dissipation efficiency, high noise under high load, and lack of integrated monitoring, this utility model provides a highly efficient integrated power and heat dissipation management system for large-scale model fine-tuning platform systems. Through the coordinated structural design of power supply, heat dissipation, and control modules, it achieves a dual improvement in power supply stability and heat dissipation efficiency.

[0008] The core innovations of this system are as follows. First, the power management module adopts a dual power input interface (compatible with 1000W and 800W general server power supplies) and integrates a "shunt resistor + amplifier" current sharing control loop and an active filter unit consisting of a supercapacitor energy storage cluster (2.7V / 500F single cells connected in series and parallel) and a 12V / 20Ah lithium battery. Combined with the INA180A4 current detection chip and a voltage divider resistor monitoring module, this forms a physical structure for dynamic current sharing, voltage stabilization, and high-precision monitoring. Second, the heat dissipation management module adopts a split design of GPU water cooling and CPU air cooling. The water cooling head covers the GPU core, memory, and power supply unit, and is driven by a high-flow ceramic bearing pump paired with a 120mm PWM silent fan (1500RPM±10%); the CPU air-cooling module is independently designed with heat sinks and a low-speed fan. A vertical side airflow design (external blowing from the bottom / middle / top + exhaust from the top) and a bracket-style mounting structure for multiple temperature sensors (AHT20) achieve efficient heat dissipation and noise control. Third, the control module is based on an STM32 microcontroller that integrates an I2C bus (for connecting temperature sensors), a PWM interface (for driving the fan), and an ADC interface (for receiving current / voltage signals). In addition, the system uses a pre-fabricated cable management board, a chassis split into upper and lower compartments (GPU and CPU areas are layered), and a layered, isolated wiring structure (power lines / signal lines / coolant piping) to ensure system integration and operational reliability.

[0009] Through the above-mentioned structural innovation, this utility model effectively solves the problems of uneven power supply, inefficient heat dissipation, and excessive noise in office environments in large model fine-tuning platforms, and significantly improves the stability and environmental adaptability of server operation.

[0010] The present invention specifically adopts the following technical means:

[0011] A high-efficiency integrated power and thermal management system for large-scale model fine-tuning platform systems includes:

[0012] The following modules are interconnected: power management module, thermal management module, and control module.

[0013] The power management module includes a dual power input interface, a current sharing control unit, an active filter unit, and a monitoring unit.

[0014] The heat dissipation management module includes a split heat dissipation component consisting of a GPU water cooling module and a CPU air cooling module, as well as a temperature sensor.

[0015] The control module includes a microcontroller.

[0016] Furthermore, the dual power input interfaces are dedicated interfaces for 1000W and 800W general server power supplies, and each power supply line has an independent control node, which is connected to the microcontroller.

[0017] Furthermore, the current sharing control unit includes: a shunt resistor and an amplifier sampling circuit independently set for each power supply, the sampling circuit interacting with the microcontroller via onboard communication; the active filtering unit includes a supercapacitor energy storage cluster and a fast-charging lithium battery pack, the energy storage cluster and the lithium battery pack being connected to the power supply side via a high-speed current sampling circuit.

[0018] Furthermore, the monitoring unit includes:

[0019] Current detection module: The INA180A4 current detection chip is connected in series in the main power supply path of the GPU. After passing through the shunt resistor, it outputs a linear analog voltage signal to the microcontroller ADC.

[0020] Voltage monitoring module: The main power supply path converts the voltage into a low-voltage signal that is compatible with the microcontroller's acquisition through voltage divider resistors, and then inputs it to the microcontroller's ADC.

[0021] Furthermore, in the GPU water cooling module: the water cooling block covers the GPU core, memory and power supply unit, drives the coolant circulation through a ceramic bearing pump, and is equipped with a 120mm PWM silent fan;

[0022] The CPU air-cooling module is equipped with independent heat dissipation fins and a fan, which are physically isolated from the GPU water-cooling module.

[0023] Furthermore, the airflow layout of the heat dissipation management module is a vertical side-mounted design, and the fans include a lower external blowing fan, a middle external blowing fan, an upper external blowing fan, and a top exhaust fan; a guide plate is provided in the airflow duct to optimize the exhaust of residual heat from uncovered chips; the temperature sensor is mounted on the GPU power supply board, CPU heat sink, and chassis exhaust vent via a fixed bracket or clip, and communicates with the microcontroller via an I2C bus.

[0024] Furthermore, the piping interface of the GPU water cooling module adopts a quick-release structure.

[0025] Furthermore, the chassis structure adopted includes:

[0026] The GPU area and the CPU motherboard area are arranged in separate compartments, one above the other.

[0027] Power lines, signal lines, and coolant lines are isolated in layers, and high-current wire harnesses are fixed to the cable management board by prefabricated clips;

[0028] The side of the chassis has a circular grille ventilation hole for four fans, and the top has a grille-shaped ventilation area for two fans.

[0029] Furthermore, the control module includes an STM32 microcontroller, which integrates:

[0030] I2C bus interface: connects to the temperature sensor;

[0031] PWM control interface: connects to and drives the cooling fan;

[0032] ADC interface: receives the output signals of the current detection module and the voltage monitoring module.

[0033] Furthermore, the control module also includes:

[0034] Backup power module: The lithium battery pack is fixed to the bottom of the chassis by a metal bracket and connected to the main control board through plug-in terminals;

[0035] Serial communication module: used to connect to the host computer.

[0036] Compared with the prior art, the present invention and its preferred embodiments have at least the following beneficial effects:

[0037] Integrated power management and improved stability: The dual power input interface and independent control node design enable unified management of power supplies for both the CPU and GPU motherboards; the physical structure of the current sharing control unit, consisting of a shunt resistor and an amplifier sampling circuit, combined with onboard communication and microcontroller interaction, solves the problem of uneven current distribution when power supplies are connected in parallel; the integrated structure of the active filter unit, consisting of a supercapacitor energy storage cluster and a fast-charging lithium battery pack, is connected to the power supply side through a high-speed current sampling circuit, effectively smoothing out voltage fluctuations caused by large-scale load cycle changes, thus improving power supply stability from a hardware perspective.

[0038] Synergistic optimization of heat dissipation efficiency and noise control: The structural design of the split heat dissipation component (GPU water cooling module covering the core / memory / power supply unit + CPU air cooling module independently set) specifically addresses the simultaneous heat dissipation needs of multiple heat sources in high-load GPUs; the vertical side airflow layout (lower / middle / upper external blowing + top exhaust) combined with the physical structure of the air guide plate optimizes the efficiency of heat dissipation from uncovered chips; the temperature sensor is installed in a key position through a fixed bracket and communicates with the microcontroller via I2C, realizing dynamic PWM speed control of the cooling fan, avoiding the high noise problem of traditional fixed-speed fans, and improving adaptability to office scenarios.

[0039] Enhanced reliability of integrated monitoring and control: The control module, based on an STM32 microcontroller, integrates I2C (for connecting temperature sensors), PWM (for driving fans), and ADC (for receiving current / voltage signals) interfaces, enabling real-time acquisition and coordinated control of power supply parameters and heat dissipation status. The design of the backup power module (lithium battery pack fixed by a metal bracket and connected to the main control board) ensures the reporting of critical states and safe shutdown of the system during power outages, solving the problem of lack of early warning protection in traditional solutions.

[0040] Modular and anti-interference enhancements in chassis structure: The layout of upper and lower compartments (GPU area and CPU area are layered), combined with the layered isolation of power lines / signal lines / coolant pipes and the cable routing structure with pre-fabricated clip-on cable management boards, optimizes space utilization and electromagnetic isolation, reduces the risk of interference between different modules, and improves the overall integration and operational reliability of the system.

Attached Figure Description

[0041] The present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments:

[0042] Figure 1 is a schematic diagram of the principle of an embodiment of the present utility model;

[0043] Figure 2 is a simplified circuit diagram of this embodiment;

[0044] Figure 3 is the first reference view of the internal structure of the chassis of this utility model;

[0045] Figure 4 is the second reference view of the internal structure of the chassis of this utility model;

[0046] Figure 5 is the third reference view of the internal structure of the chassis of this utility model;

[0047] Figure 6 is the fourth reference view of the internal structure of the chassis of this utility model;

[0048] Figure 7 is the fifth reference view of the internal structure of the chassis of this utility model;

[0049] Figure 8 is a block diagram of the current sharing control principle of a preferred embodiment of this utility model;

[0050] Figure 9 is a power circuit architecture diagram of a preferred embodiment of the present utility model.

Detailed Implementation

[0051] In the following, specific embodiments of this application will be described in detail with reference to the accompanying drawings. Based on these detailed descriptions, those skilled in the art will be able to clearly understand and implement this application. Without departing from the principles of this application, features from various embodiments can be combined to obtain new implementations, or certain features from some embodiments can be substituted to obtain other preferred implementations.

[0052] This invention aims to solve the problems of server power management and heat dissipation management in existing large model fine-tuning platform systems. It proposes a highly efficient integrated power and heat dissipation management system, which improves the server's operating efficiency and reliability by unifying power supply and dynamically controlling heat dissipation.

[0053] Its basic implementation points include:

[0054] 1. Centralized power supply management: Unified power supply for server CPU motherboard and GPU motherboard to avoid instability caused by distributed power supply.

[0055] 2. Real-time Monitoring and Early Warning: A microcontroller is used to monitor input voltage and operating current, and in conjunction with a host computer, real-time monitoring and early warning of the server's power supply status are achieved. In the event of a power outage, the backup power module can provide temporary power to the microcontroller, ensuring a safe system shutdown.

[0056] 3. Dynamic heat dissipation control: By placing sensor modules in the motherboard and chassis, the temperature is sensed in real time, and the speed of the cooling fan is dynamically adjusted according to the temperature to maximize heat dissipation and improve the stability and lifespan of the server when running under high load.

[0057] The design will be described in detail below through specific embodiments.

[0058] To meet the requirements of large-scale model fine-tuning platforms for efficient, reliable power supply and heat dissipation management, this utility model proposes a system structure integrating power monitoring, distribution, and intelligent temperature control management. It primarily achieves precise power supply and real-time monitoring of high-power loads such as GPUs on server platforms, and uses advanced sensors to realize intelligent cooling fan speed control. The specific technical solutions for the basic functional modules shown in Figures 1 and 2 are as follows:

[0059] Power Input and Distribution Management Module: The system supports multiple power inputs, such as 1000W and 800W general-purpose server power supplies, connected via dedicated input interfaces. After distribution management, each power supply feeds high-load subsystems such as the motherboard, CPU, and GPU. Each power supply line has an independent control and monitoring node (such as a switching element SW) connected to the microcontroller's I / O interface, and the microcontroller's hardware logic performs time-sharing switching control to meet the power supply needs of different operating stages. This effectively prevents power overload caused by simultaneous startup of the motherboard and GPU.
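For illustration only, the time-sharing switching of the supply lines can be sketched as a staggered enable sequence. The rail names, the 0.5 s stagger interval, and the `set_switch` callback are hypothetical; the source does not specify the stages or their timing.

```python
from typing import Callable, List, Tuple


def staged_power_up(
    rails: List[str],
    set_switch: Callable[[str, bool], None],
    stagger_s: float = 0.5,  # assumed interval between stages
) -> List[Tuple[float, str]]:
    """Close one supply-line switch per stage and return the planned schedule.

    No real delay is performed here; on hardware a timer would wait until
    each scheduled time before the next switch is closed.
    """
    schedule = []
    for stage, rail in enumerate(rails):
        set_switch(rail, True)  # close the switching element for this line
        schedule.append((stage * stagger_s, rail))
    return schedule


# Hypothetical order: motherboard first, GPU last, so they never start together.
closed: List[str] = []
plan = staged_power_up(["motherboard", "cpu", "gpu"], lambda rail, on: closed.append(rail))
```

Starting the GPU line in a later stage than the motherboard line is what keeps the two startup loads from coinciding.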

[0060] Current Detection and Power Measurement Module: To ensure the safe and efficient operation of core components such as the GPU, this embodiment is designed for a GPU with a maximum power consumption of 300W and a maximum current of 25A. A high-precision INA180A4 current sensing chip is used, connected in series in the GPU's main power supply path to collect the operating current flowing through the GPU in real time. Combined with a low-resistance precision shunt resistor, the INA180A4 outputs an analog voltage signal linearly related to the actual current. After conversion by the microcontroller's ADC, high-precision current monitoring within the 0~25A range is achieved. The system uses this scheme for current acquisition in all core power supply branches, ensuring balanced and reliable power distribution across the entire platform.
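As a minimal sketch of the conversion chain described above (shunt voltage, amplifier gain, ADC count), the following assumes a 0.5 mΩ shunt, a 3.3 V ADC reference, and a 12-bit ADC. None of these three values is given in the source. The 200 V/V gain is the fixed gain of the A4 variant of the INA180.

```python
INA180A4_GAIN = 200.0   # V/V, fixed gain of the INA180 A4 variant
SHUNT_OHMS = 0.0005     # assumed shunt value (not stated in the source)
ADC_VREF = 3.3          # assumed ADC reference voltage, volts
ADC_MAX_COUNT = 4095    # assumed 12-bit ADC

GPU_MAX_POWER_W = 300.0
GPU_RAIL_V = 12.0
gpu_max_current_a = GPU_MAX_POWER_W / GPU_RAIL_V  # 25 A, matching the stated range


def adc_to_current(count: int) -> float:
    """Convert a raw ADC count into the current through the shunt, in amperes."""
    if not 0 <= count <= ADC_MAX_COUNT:
        raise ValueError("ADC count out of range")
    v_out = count / ADC_MAX_COUNT * ADC_VREF          # amplifier output voltage
    return v_out / (INA180A4_GAIN * SHUNT_OHMS)       # I = Vout / (gain * Rshunt)


def current_to_adc(amps: float) -> int:
    """Inverse mapping, useful for choosing alarm thresholds in counts."""
    v_out = amps * SHUNT_OHMS * INA180A4_GAIN
    return round(v_out / ADC_VREF * ADC_MAX_COUNT)


full_scale_count = current_to_adc(gpu_max_current_a)
```

With these assumed values, 25 A produces 2.5 V at the amplifier output, which stays inside the assumed 3.3 V ADC range.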

[0061] Voltage Monitoring Module: The system uses a resistor divider method for voltage monitoring in each main power supply path. By matching precision voltage divider resistors, high-voltage power (such as 12V, 5V, etc.) is converted into a low-voltage signal that the microcontroller can acquire and input to the microcontroller's ADC module, enabling real-time detection of the main power supply line voltage. Combined with current monitoring, accurate assessment of power consumption for each path can be achieved, supporting remote real-time querying and historical data comparison analysis by the host computer.
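The resistor divider scaling and the combined power estimate can be sketched as follows. The 30 kΩ / 10 kΩ divider, the 3.3 V reference, and the 12-bit ADC are assumptions chosen for illustration and do not come from the source.

```python
def divider_ratio(r_top: float, r_bottom: float) -> float:
    """Fraction of the rail voltage that appears at the ADC pin."""
    return r_bottom / (r_top + r_bottom)


def rail_voltage(count: int, r_top: float, r_bottom: float,
                 vref: float = 3.3, max_count: int = 4095) -> float:
    """Recover the rail voltage from a raw ADC count."""
    v_adc = count / max_count * vref
    return v_adc / divider_ratio(r_top, r_bottom)


def rail_power(volts: float, amps: float) -> float:
    """Power drawn on one path from its monitored voltage and current."""
    return volts * amps


# A 12 V rail through an assumed 30k/10k divider is scaled to 3.0 V at the ADC.
ratio_12v = divider_ratio(30e3, 10e3)
v_12v = rail_voltage(3723, 30e3, 10e3)
p_gpu = rail_power(12.0, 25.0)
```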

[0062] Temperature detection and intelligent fan speed control module: AHT20 digital temperature sensors are installed at key heat sources (such as GPU, CPU, main power supply circuit, etc.) on the motherboard and inside the chassis. The AHT20 communicates with the microcontroller via the I2C bus to collect temperature data in real time. Based on the actual temperature after signal processing, the microcontroller dynamically adjusts the speed of the cooling fan (which supports PWM control) to achieve on-demand cooling, effectively avoiding overheating and reducing unnecessary power consumption. When the temperature is too high, multiple alarm levels can be triggered, and the fan automatically speeds up to protect the hardware.
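One possible form of the temperature-to-fan-speed mapping is a linear duty-cycle ramp with alarm levels on top. This is a sketch under assumed thresholds (40 °C, 75 °C, 70 °C, 85 °C) and an assumed minimum duty of 20%; the source gives no numeric control curve.

```python
from typing import List, Tuple


def fan_duty(temp_c: float, t_min: float = 40.0, t_max: float = 75.0,
             duty_min: float = 0.2, duty_max: float = 1.0) -> float:
    """Linear ramp from duty_min at t_min to duty_max at t_max (assumed curve)."""
    if temp_c <= t_min:
        return duty_min
    if temp_c >= t_max:
        return duty_max
    frac = (temp_c - t_min) / (t_max - t_min)
    return duty_min + frac * (duty_max - duty_min)


def alarm_level(temp_c: float, warn: float = 70.0, critical: float = 85.0) -> int:
    """0 = normal, 1 = warning, 2 = critical (assumed thresholds)."""
    if temp_c >= critical:
        return 2
    if temp_c >= warn:
        return 1
    return 0


def control_step(temps_c: List[float]) -> Tuple[float, int]:
    """Drive the fan from the hottest sensor; any alarm forces full speed."""
    hottest = max(temps_c)
    level = alarm_level(hottest)
    duty = 1.0 if level > 0 else fan_duty(hottest)
    return duty, level
```

For example, `control_step([45.0, 72.0])` returns full duty with alarm level 1, while readings below the lower threshold keep the fan at its minimum duty.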

[0063] Microcontroller Control and Backup Power Supply Module: The entire power monitoring and heat dissipation management system uses an STM32 microcontroller as the core control unit. All sensing and measurement signals are acquired, processed, and judged by the microcontroller. To ensure normal system operation in abnormal situations such as main power failure, the microcontroller and main monitoring and control circuits are equipped with a lithium battery backup power supply module, realizing UPS functionality. Even in the event of a main power failure, it can still perform safety actions such as status reporting and power outage warnings. These microcontroller control processes are all mature existing technologies, thus achieving low implementation costs.

[0064] Sensor Interface and Signal Processing Module: The sensor interface module is compatible with various external sensors (including temperature, humidity, wind speed, etc.). All external input signals are amplified and filtered before being sent to the microcontroller, improving signal quality and data accuracy. The system has abundant I / O interfaces, allowing for easy expansion of monitoring points and heat dissipation units.

[0065] Data communication and host computer management: The microcontroller communicates with the host computer via serial port or other bus methods, uploading key parameters such as voltage, current and temperature in real time. The host computer can monitor, analyze and schedule configurations through system software, supporting functions such as fault alarm, status query and historical data archiving, providing decision-making basis for platform operation and maintenance.

[0066] As shown in Figures 3-7, this embodiment departs from traditional all-in-one quad-card parallel platforms, in which the GPU and CPU share the same heat dissipation environment and temperature control usually relies on running the system fans harder, resulting in higher ambient noise. Because quad-card parallel platforms are convenient for debugging and for plugging and unplugging hardware, they are often deployed in personal work areas or vacant office spaces, where their high noise output adversely affects the office environment and personnel. Therefore, the innovative focus of this embodiment is a hardware system that is suitable for office environments and still ensures stable fine-tuning performance.

[0067] In a further structural design of this embodiment, the GPU and CPU are separated. Because the CPU is not a primary computing unit during large-scale model fine-tuning and generates relatively little heat, active air cooling is sufficient to meet its cooling requirements. The GPU, as a high-load computing unit, conventionally relies on high-speed fans to maintain stable performance during fine-tuning, which not only generates significant noise but also accelerates fan wear and failure. In contrast, a water-cooling solution significantly increases the contact area between the heatsink and the GPU core and memory units, effectively improving heat dissipation efficiency. The GPU power supply unit must also withstand large current fluctuations during fine-tuning and therefore generates significant heat; insufficient heat dissipation reduces its power supply capacity and degrades GPU performance. Therefore, in this embodiment the water-cooling head covers the GPU core, memory, and power supply unit, achieving sufficient thermal coverage and efficient heat dissipation.

[0068] For the airflow design around the water-cooling system, this embodiment adopts a vertical side-mounted configuration with lower, middle, and upper outward-blowing fans, plus a top exhaust fan that promptly expels high-temperature air. This layout helps remove the residual heat generated by high-heat chips (such as signal switching chips and separate control chips) that are not covered by the water cooling system, and improves overall heat exchange efficiency. Preferably, all fans support PWM speed control and dynamically adjust their speed based on the temperature sensors inside the chassis to achieve intelligent temperature control and noise optimization.

[0069] To address the uneven temperature distribution within the chassis, temperature sensors are placed at key locations such as the split motherboard, the fan exhaust vents, and the power supply control board. For example, the AHT20 temperature sensors are fixed with plastic clips to the GPU power supply area of the split motherboard, the metal mesh cover of the fan exhaust vent, and the surface of the heatsink on the power supply control board, so that the readings reflect the heat dissipation effect in the actual space. The temperature sensors communicate with the microcontroller via an I2C bus. The microcontroller integrates hysteresis logic control circuitry (such as a delay unit composed of comparator U3 and capacitor C5); this hardware signal processing avoids frequent switching of the PWM speed control signal and also extends the fan's lifespan. (Note: hysteresis logic is an existing control logic; this solution implements its physical carrier through the hardware circuitry integrated into the microcontroller.) In summary, this design achieves stable fine-tuning performance and efficient, reliable hardware operation in an office environment while keeping noise levels under control.
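The source implements hysteresis in hardware (comparator plus capacitor). The following is only a software model of the same behavior, included to show why two thresholds prevent rapid toggling. The 60 °C and 55 °C thresholds are assumptions.

```python
class HysteresisSwitch:
    """Two-threshold switch: turns on above `on_above`, off below `off_below`.

    Between the two thresholds the previous state is held, so a reading that
    hovers around a single setpoint cannot toggle the output repeatedly.
    """

    def __init__(self, on_above: float, off_below: float, initial: bool = False):
        if off_below >= on_above:
            raise ValueError("off_below must be lower than on_above")
        self.on_above = on_above
        self.off_below = off_below
        self.state = initial

    def update(self, value: float) -> bool:
        if self.state:
            if value < self.off_below:
                self.state = False
        elif value > self.on_above:
            self.state = True
        return self.state


# Assumed thresholds: request high fan speed above 60 C, release it below 55 C.
boost = HysteresisSwitch(on_above=60.0, off_below=55.0)
readings = [58.0, 61.0, 59.0, 57.0, 54.0, 56.0, 61.0]
states = [boost.update(t) for t in readings]
```

In this trace the output changes only three times, although the readings cross 60 °C and fall back several times.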

[0070] In a preferred embodiment of this invention, the power lines, signal lines, and coolant lines are arranged in layers within the chassis and isolated from one another to avoid electromagnetic interference and temperature rise. For example, the chassis has three layers of isolation plates (made of FR4, 1mm thick): the bottom layer is the power line layer (50mm wide), the middle layer is the signal line layer (30mm wide), and the top layer is the coolant line layer (40mm wide), with each layer physically separated by the isolation plates.

[0071] In a preferred embodiment, prefabricated clips and cable management boards are introduced to ensure the mechanical safety and simulated thermal stability of the high-current power supply harness in hot and cold airflow environments.

[0072] In a preferred embodiment, the water cooling pipe interface adopts a quick-release design (such as a quick-release snap structure with a male (plastic) connector and a female (metal) connector that can be quickly plugged in and out by pressing the spring sheet). This facilitates hardware replacement and maintenance in office settings and forms a differentiated structural innovation. The heat dissipation contact structure adopts a "CPU-GPU split + GPU-power supply-memory integrated water cooling head" structure specifically designed for high-performance server fine-tuning scenarios.

[0073] In a preferred embodiment of this invention, the internal areas of the chassis are clearly divided into physical compartments, with the GPU area and CPU motherboard area separated into upper and lower compartments, which is more conducive to modular assembly and thermal management.

[0074] Key components and parameter specifications:

[0075] The active power filter (APF) module uses supercapacitors with specific capacity and voltage rating (energy storage clusters formed by connecting 2.7V / 500F cells in series and parallel) and a 12V / 20Ah fast-charging lithium battery pack, with a maximum charging and discharging current of 10A.
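To give a sense of scale for these ratings, the stored energy can be estimated with E = ½CV². The source does not state how many cells are connected in series or parallel, so the 5-series, 1-parallel arrangement below (13.5 V rated, enough to sit on a 12 V bus) is purely an assumption.

```python
CELL_V = 2.7      # rated cell voltage from the specification above
CELL_F = 500.0    # rated cell capacitance from the specification above


def bank_capacitance(cell_f: float, series: int, parallel: int) -> float:
    """Series strings divide capacitance; parallel strings multiply it."""
    return cell_f * parallel / series


def bank_voltage(cell_v: float, series: int) -> float:
    """Rated voltage of a series string."""
    return cell_v * series


def stored_energy_j(capacitance_f: float, volts: float) -> float:
    """Energy in a capacitor: E = 1/2 * C * V^2."""
    return 0.5 * capacitance_f * volts * volts


# Assumed arrangement: 5 cells in series, 1 string in parallel.
c_bank = bank_capacitance(CELL_F, series=5, parallel=1)   # 100 F
v_bank = bank_voltage(CELL_V, series=5)                   # 13.5 V
e_bank = stored_energy_j(c_bank, v_bank)                  # about 9.1 kJ at rated voltage

battery_wh = 12.0 * 20.0        # 12 V / 20 Ah pack: 240 Wh nominal
battery_peak_w = 12.0 * 10.0    # 10 A limit at 12 V: 120 W
```

Under this assumption the supercapacitor bank holds far less energy than the battery, which fits its role of absorbing short fluctuations rather than providing long backup time.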

[0076] The water cooling system uses a high-flow ceramic bearing pump and is equipped with a 120mm PWM silent temperature-measuring fan with a speed of 1500RPM±10% and a static air pressure of 1.53mm H2O.

[0077] The actual noise level of the entire machine is ≤45dB(A) under normal fine-tuning conditions, which meets the human comfort standards for ordinary office spaces.

[0078] From another perspective, and for reference, this embodiment also proposes a more optimized design to address the typical technical challenges of paralleling two server power supplies. It proposes a control architecture combining the "democratic current sharing method" and "active power filtering (APF)", as shown in Figure 8, to achieve current sharing control of multiple input power supplies and a smooth, stable output voltage.

[0079] 1. Solves the problems of current sharing and voltage fluctuations that exist in the simple parallel connection of traditional power supplies.

[0080] When traditional multi-input power supplies are connected directly in parallel, differences in internal impedance between the supplies prevent the output current from being evenly distributed. Instead, the current divides in inverse proportion to the internal impedances, so some power supplies are overloaded and others are underloaded, and they cannot operate within their optimal range. Furthermore, because the load of a server running large models changes periodically, the load current fluctuates periodically, which further causes output voltage oscillation and inrush current and affects system stability and reliability.
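The uneven split can be reproduced with a simple model: two supplies with identical open-circuit voltage, each with its own internal resistance, feeding one load. The 10 mΩ and 20 mΩ resistances and the 60 A load are illustrative assumptions.

```python
from typing import List


def parallel_split(load_a: float, r_internal: List[float]) -> List[float]:
    """Current per supply when equal-voltage sources are paralleled directly.

    Each supply carries a share proportional to its conductance (1/R), so
    the supply with the lower internal resistance carries more current.
    """
    conductances = [1.0 / r for r in r_internal]
    total = sum(conductances)
    return [load_a * g / total for g in conductances]


# Assumed values: 60 A load, internal resistances of 10 mOhm and 20 mOhm.
shares = parallel_split(60.0, [0.010, 0.020])
```

Here the lower-resistance supply carries 40 A and the other only 20 A, even though an even split would be 30 A each.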

[0081] To address the aforementioned issues, this embodiment innovatively incorporates a reasonable current sharing control mechanism at the dual server power input. By combining load characteristics with dynamic current sharing and voltage stability as objectives, it systematically improves power supply quality.

[0082] 2. A combined control strategy of "democratic current sharing" and "active power filtering (APF)" is adopted.

[0083] Democratic current sharing method: This embodiment avoids the shortcomings of the traditional droop method by adopting the democratic current sharing method (this method is existing technology; it is not an innovative design of this solution, nor is it an object of protection of this utility model). This method lets each input power supply participate in overall current coordination and distribution. Through real-time communication and feedback adjustment, the output currents of the two power supplies are dynamically adjusted and balanced, improving overall system efficiency and reliability. Preferably, the current sharing control unit uses a shunt resistor + amplifier sampling circuit set independently for each power supply (such as an INA180A4 chip paired with a series shunt resistor). The sampling circuit communicates with the microcontroller through onboard communication circuits (such as an I2C bus or SPI interface) to achieve this dynamic adjustment and balancing (note: the democratic current sharing method is existing control logic, and this solution provides its physical carrier through the above hardware structure).
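
The specification does not give the control law, so the following is only a minimal Python sketch of one common reading of democratic current sharing: the maximum-current (automatic master-slave) scheme, in which the supply with the highest current sets the reference and the others trim their voltage setpoints upward toward it. The plant model, the function names, the gain, and all numeric values are assumptions made for illustration.

```python
def bus_currents(setpoints_v, r_int_ohm, r_load_ohm):
    """Solve the shared output node and return each supply's current."""
    g_sum = sum(1.0 / r for r in r_int_ohm) + 1.0 / r_load_ohm
    v_bus = sum(v / r for v, r in zip(setpoints_v, r_int_ohm)) / g_sum
    return [(v - v_bus) / r for v, r in zip(setpoints_v, r_int_ohm)]


def share_step(setpoints_v, currents_a, gain_v_per_a=0.005):
    """One sharing update: the highest current acts as the reference.

    Every other supply raises its voltage setpoint in proportion to how
    far its current lags the reference, so the 'master' emerges
    automatically rather than being fixed in advance.
    """
    reference = max(currents_a)
    return [v + gain_v_per_a * (reference - i)
            for v, i in zip(setpoints_v, currents_a)]


# Hypothetical plant: two 12 V supplies, 10 and 15 mOhm, 0.12 Ohm load.
r_int = [0.010, 0.015]
setpoints = [12.0, 12.0]
before = bus_currents(setpoints, r_int, 0.12)
for _ in range(50):
    currents = bus_currents(setpoints, r_int, 0.12)
    setpoints = share_step(setpoints, currents)
after = bus_currents(setpoints, r_int, 0.12)
print("before:", [round(i, 2) for i in before])
print("after: ", [round(i, 2) for i in after])
```

In this toy model the two currents start well apart and converge to nearly equal values. Setpoints only ever move upward here, so a real design would also need to bound the trim range; that detail is outside what the source describes.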

[0084] Active Power Filter (APF): Active power filtering technology is introduced into the power supply side energy storage module. It utilizes the energy storage characteristics of supercapacitors and batteries to actively compensate for current oscillations caused by the cyclical changes of large model loads, smooth out DC bus voltage fluctuations, and improve power supply stability and response speed.
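
The compensation idea in [0084] can be sketched at signal level as follows. This is an illustrative Python model only, not the converter control of this embodiment: it assumes a first-order low-pass filter decides which part of the load current the supplies deliver, with the energy storage branch covering the remainder. The function name, the filter coefficient, and the load waveform are all hypothetical.

```python
def apf_split(load_current_a, alpha=0.02):
    """Split a load current trace into supply and storage components.

    An exponential moving average (first-order low-pass) tracks the slow
    part of the load, which the supplies deliver; the storage branch
    injects or absorbs the remaining fast part.
    """
    supply, storage = [], []
    average = load_current_a[0]
    for i_load in load_current_a:
        average += alpha * (i_load - average)
        supply.append(average)
        storage.append(i_load - average)
    return supply, storage


# Hypothetical periodic load: 80 A for 10 samples, then 40 A for 10 samples.
load = [80.0 if (n // 10) % 2 == 0 else 40.0 for n in range(400)]
supply, storage = apf_split(load)
load_ripple = max(load[-40:]) - min(load[-40:])
supply_ripple = max(supply[-40:]) - min(supply[-40:])
print(f"load ripple {load_ripple:.1f} A, supply ripple {supply_ripple:.1f} A")
```

Under these assumptions the supplies see only a small fraction of the load swing, while the storage branch alternately sources and sinks the difference. A smaller `alpha` smooths the supply current further at the cost of a larger demand on the storage branch.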

[0085] The combination of these two elements forms a multi-layered, multi-angle power supply regulation structure, which not only solves the problem of uneven current distribution but also effectively suppresses voltage disturbances caused by the load, ensuring the continuous and stable operation of the server.

[0086] 3. The power circuit architecture employs a high-efficiency half-bridge isolated DC/DC converter.

[0087] To ensure power conversion efficiency and system safety, this design employs a half-bridge isolated DC/DC converter in the power section, which satisfies isolation requirements while providing good power conversion performance. As shown in Figure 9, this topology is adapted to the control strategy and can work with the active filter module to achieve fast response and high-precision adjustment.

[0088] 4. The control system architecture and signal flow are clear and reasonable.

[0089] This embodiment features a comprehensive signal sampling and feedback mechanism, including sampled signals (represented by dashed lines) and power flow direction (represented by solid lines), forming a closed-loop control system. The controller comprehensively manages the input power supply and energy storage module, achieving dynamic balancing of the load current and real-time stable control of the DC bus voltage.

[0090] As a circuit implementation of the above, in the preferred form of this embodiment:

[0091] The active power filter (APF) unit module has its own high-speed current sampling circuit and communicates with the main control microcontroller through onboard communication, which facilitates unified signal control and redundant protection.

[0092] The power supply current sharing sampling adopts a separate shunt resistor + amplifier scheme, and each power supply has an independent feedback control loop, ensuring the accuracy of current sharing through hardware.
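
To illustrate how a shunt resistor + amplifier reading might be turned into a current value for the feedback loop, the following Python sketch converts a raw ADC code to amperes. The INA180A4's fixed gain of 200 V/V comes from the part's datasheet; the 12-bit resolution, 3.3 V reference, shunt value, and function names are assumptions not stated in this specification.

```python
INA180A4_GAIN = 200.0      # V/V, fixed gain of the A4 variant
ADC_VREF_V = 3.3           # assumed ADC reference voltage
ADC_FULL_SCALE = 4095      # assumed 12-bit ADC
R_SHUNT_OHM = 0.00015      # hypothetical 0.15 mOhm shunt


def adc_to_current(raw_code):
    """Convert a raw ADC code to the current through the shunt, in amperes."""
    v_out = raw_code / ADC_FULL_SCALE * ADC_VREF_V
    return v_out / (INA180A4_GAIN * R_SHUNT_OHM)


def share_error(raw_code_a, raw_code_b):
    """Signed current imbalance between two supplies, in amperes."""
    return adc_to_current(raw_code_a) - adc_to_current(raw_code_b)


print(f"full scale: {adc_to_current(ADC_FULL_SCALE):.1f} A")
print(f"imbalance:  {share_error(2200, 1500):.2f} A")
```

The shunt value sets the trade-off between measurable range and dissipation: with the assumed values, the full ADC range corresponds to roughly 110 A per supply.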

[0093] In the description of this specification, references to terms such as "an embodiment," "example," "specific example," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of this disclosure. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.

[0094] The foregoing has shown and described the basic principles, main features, and advantages of this disclosure. Those skilled in the art should understand that this disclosure is not limited to the above embodiments. The embodiments and descriptions in the specification are merely illustrative of the principles of this disclosure. Various changes and modifications can be made to this disclosure without departing from its spirit and scope, and all such changes and modifications fall within the scope of this disclosure as claimed.

[0095] This utility model is not limited to the above-described preferred embodiment. Anyone can derive other forms of efficient integrated power supply and heat dissipation management system for large model fine-tuning platform systems under the guidance of this utility model. All equivalent changes and modifications made within the scope of the patent application of this utility model shall fall within the scope of this utility model.

Claims

1. A high-efficiency integrated power supply and heat dissipation management system for a large model fine-tuning platform system, characterized in that it comprises the following interconnected modules: a power management module, a heat dissipation management module, and a control module. The power management module includes a dual power input interface, a current sharing control unit, an active filter unit, and a monitoring unit. The heat dissipation management module includes a split heat dissipation component consisting of a GPU water cooling module and a CPU air cooling module, as well as a temperature sensor. The control module includes a microcontroller.

2. The high-efficiency integrated power supply and heat dissipation management system for a large model fine-tuning platform system according to claim 1, characterized in that: the dual power input interfaces are dedicated interfaces for 1000 W and 800 W general-purpose server power supplies; each power supply line has an independent control node, and the control node is connected to the microcontroller.

3. The high-efficiency integrated power supply and heat dissipation management system for a large model fine-tuning platform system according to claim 1, characterized in that: The current sharing control unit includes: a shunt resistor and amplifier sampling circuit independently set for each power supply, and the sampling circuit interacts with the microcontroller through onboard communication; the active filter unit includes a supercapacitor energy storage cluster and a fast-charging lithium battery pack, and the energy storage cluster and the lithium battery pack are connected to the power supply side through a high-speed current sampling circuit.

4. The high-efficiency integrated power supply and heat dissipation management system for a large model fine-tuning platform system according to claim 1, characterized in that: the monitoring unit includes: a current detection module, in which an INA180A4 current detection chip is connected in the main power supply path of the GPU and, by way of the shunt resistor, outputs a linear analog voltage signal to the microcontroller ADC; and a voltage monitoring module, in which the voltage of the main power supply path is converted by voltage divider resistors into a low-voltage signal within the microcontroller's acquisition range and then input to the microcontroller's ADC.

5. The high-efficiency integrated power supply and heat dissipation management system for a large model fine-tuning platform system according to claim 1, characterized in that: In the GPU water cooling module: the water cooling block covers the GPU core, memory and power supply unit, drives the coolant circulation through a ceramic bearing pump, and is equipped with a 120mm PWM silent fan; The CPU air-cooling module is equipped with independent heat dissipation fins and a fan, which are physically isolated from the GPU water-cooling module.

6. The high-efficiency integrated power supply and heat dissipation management system for a large model fine-tuning platform system according to claim 1, characterized in that: the heat dissipation management module has a vertical side-mounted airflow layout, and the fans include a lower external fan, a middle external fan, an upper external fan, and a top exhaust fan. A guide plate is provided in the airflow duct to optimize the exhaust of residual heat from uncovered chips. The temperature sensor is mounted on the GPU power supply board, CPU heat sink, and chassis exhaust vent via a fixed bracket or clip, and communicates with the microcontroller via an I2C bus.

7. The high-efficiency integrated power supply and heat dissipation management system for a large model fine-tuning platform system according to claim 1, characterized in that: The piping interface of the GPU water cooling module adopts a quick-release structure.

8. The high-efficiency integrated power supply and heat dissipation management system for a large model fine-tuning platform system according to claim 1, characterized in that: the chassis structure used includes: a GPU area and a CPU motherboard area arranged in separate compartments, one above the other; power lines, signal lines, and coolant lines isolated in layers, with high-current wire harnesses fixed to the cable management board by prefabricated clips; and circular grille ventilation holes for four fans on the side of the chassis, with a grille-shaped ventilation area for two fans on the top.

9. The high-efficiency integrated power supply and heat dissipation management system for a large model fine-tuning platform system according to claim 1, characterized in that: the control module includes an STM32 microcontroller, which integrates: an I2C bus interface, connected to the temperature sensor; a PWM control interface, connected to drive the cooling fan; and an ADC interface, connected to the output signals of the current detection module and the voltage monitoring module.

10. The high-efficiency integrated power supply and heat dissipation management system for a large model fine-tuning platform system according to claim 9, characterized in that: the control module also includes: a backup power module, in which the lithium battery pack is fixed to the bottom of the chassis by a metal bracket and connected to the main control board through plug-in terminals; and a serial communication module, used to connect to the host computer.