A type of server
By implementing an orthogonal layout design within the server chassis, integrating compute nodes, switching nodes, power supplies, and liquid cooling pipes, the problem of low server computing power density is solved, achieving efficient space utilization and ease of maintenance, and improving overall computing power density and scalability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- INSPUR SUZHOU INTELLIGENT TECH CO LTD
- Filing Date
- 2026-05-29
- Publication Date
- 2026-06-30
AI Technical Summary
There is a conflict between space and computing power in servers, resulting in low computing power density. The space utilization of rack-mounted supernode servers is insufficient and cable management is complicated, while the computing power of all-in-one servers cannot meet the training needs of large models and their scalability is limited.
Multiple compute nodes, switching nodes, power supplies, and liquid cooling pipes are integrated within the chassis and orthogonally arranged through a backplane assembly. This optimizes the position of the power supplies and liquid cooling pipes, enabling a compact and coordinated arrangement of compute, switching, power supply, and heat dissipation resources, thereby improving computing density.
Increasing the computing power density of server systems within a limited space, while taking into account system expansion efficiency, maintenance convenience, and deployment adaptability, reducing cable crossover interference, and improving the space utilization and maintenance efficiency of the entire rack.
Figure CN122308569A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the technical field of servers, and more particularly to a server.
Background Technology
[0002] With the increasing prevalence of large-scale model training, complex scientific computing, and real-time data processing, servers are facing computing power bottlenecks. Servers can include rack-mounted supernode servers and standalone all-in-one servers. Rack-mounted supernode servers offer large computing power, but require 28U of space, and centralized cable management necessitates disassembling the entire cable tray to replace faulty components. All-in-one servers offer flexible deployment and simple maintenance, but their computing power cannot meet the needs of large-scale model training, and their scalability is limited.
[0003] In the related art, servers face a conflict between space and computing power, which results in low computing power density.
Summary of the Invention
[0004] This application provides a server to at least address the problem of low computing power density in servers in related technologies.
[0005] This application provides a server, including a chassis, which includes multiple computing nodes, multiple switching nodes, a first power supply group, a second power supply group, a first liquid cooling pipe, a second liquid cooling pipe, and a backplane assembly.
[0006] Multiple switching nodes are arranged on the first side of the backplane assembly, a first power supply group and a second power supply group are arranged on both sides of the multiple switching nodes along a first direction, and a first liquid cooling pipe and a second liquid cooling pipe are arranged on both sides of the multiple switching nodes along a first direction.
[0007] The distance between the first liquid cooling pipe and the second liquid cooling pipe along the first direction is smaller than the dimension of the computing node along the first direction.
[0008] Multiple computing nodes are arranged along a second direction on the second side of the backplane assembly, with the first and second directions perpendicular to each other.
[0009] This application integrates multiple computing nodes, multiple switching nodes, a first power supply group, a second power supply group, a first liquid cooling pipe, a second liquid cooling pipe, and a backplane assembly within a chassis. Multiple switching nodes are positioned on the first side of the backplane assembly, and multiple computing nodes are positioned on the second side of the backplane assembly along a second direction perpendicular to the first direction. Furthermore, the relative positions of the power supply groups and liquid cooling pipes on both sides of the switching nodes, as well as the spacing between the first and second liquid cooling pipes, are defined. This allows for a compact and coordinated arrangement of computing, switching, power supply, and heat dissipation resources within a limited chassis space, thereby improving the computing power density of the server system while also considering system expansion efficiency, maintenance convenience, and deployment adaptability.
Attached Figure Description
[0010] To more clearly illustrate the embodiments of this application, the accompanying drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0011] Figure 1 is a schematic diagram of the structure of a rack-mounted supernode server provided in an embodiment of this application;
[0012] Figure 2 is a schematic diagram of the structure of a computing node in a rack-mounted supernode server provided in an embodiment of this application;
[0013] Figure 3 is a schematic diagram of the structure of an all-in-one server provided in an embodiment of this application;
[0014] Figure 4 is a schematic diagram of the structure of a server provided in an embodiment of this application;
[0015] Figure 5 is a schematic diagram of the structure of a backplane assembly provided in an embodiment of this application;
[0016] Figure 6 is a schematic diagram of another backplane assembly provided in an embodiment of this application;
[0017] Figure 7 is a schematic diagram of the structure of a computing node provided in an embodiment of this application;
[0018] Figure 8 is a schematic diagram of another computing node structure provided in an embodiment of this application;
[0019] Figure 9 is a schematic diagram of the structure of a processor motherboard provided in an embodiment of this application;
[0020] Figure 10 is a schematic diagram of the structure of a network card interface module provided in an embodiment of this application;
[0021] Figure 11 is a schematic diagram of the structure of a first substrate provided in an embodiment of this application;
[0022] Figure 12 is a schematic diagram of the structure of a switching node provided in an embodiment of this application;
[0023] Figure 13 is a topology diagram of a server architecture provided in an embodiment of this application;
[0024] Figure 14 is a schematic diagram of another server interconnection topology provided in an embodiment of this application.
[0025] The above figures include the following reference numerals:
[0026] 101 - Top-of-rack switch; 102 - Cable management rack; 103 - Power supply rack; 104 - Computing node;
[0027] 105 - Forwarding node; 106 - Blank space; 107 - Power supply frame; 108 - Copper busbar;
[0028] 201 - Network cable interface card; 202 - InfiniBand high-speed network slot; 203 - Storage medium; 204 - Power supply board; 205 - Switching board;
[0029] 206 - Fan; 207 - Motherboard; 208 - Operation and maintenance management board; 209 - Water inlet pipe; 210 - Power supply clip;
[0030] 211 - High-density connector; 212 - Water outlet pipe;
[0031] 301 - Expansion Interface; 302 - Accelerator Node; 303 - Processor Node;
[0032] 304 - Power Supply Unit Slot; 305 - Hard Disk Drive; 306 - Network Interface Card; 307 - PCIe Slot;
[0033] 308 - Fan assembly; 309 - Power supply unit; 310 - Power supply unit connector;
[0034] 401 - Chassis; 402 - Compute Node; 403 - Switching Node; 404 - First Power Supply Group;
[0035] 405 - Second power supply group; 406 - First liquid cooling pipe; 407 - Second liquid cooling pipe; 408 - Backplane assembly;
[0036] 501 - First backplane assembly; 502 - Second backplane assembly; 503 - Horizontal backplane;
[0037] 601 - Switching node connector; 602 - First power supply backplane; 603 - First compute node backplane;
[0038] 604 - First connector; 605 - First power connector; 606 - Second power backplane;
[0039] 607 - Second compute node backplane; 608 - Second connector; 609 - Second power connector;
[0040] 610 - First compute node connector; 611 - Second compute node connector; 612 - Management controller;
[0041] 613 - First power supply unit; 614 - Second power supply unit;
[0042] 701 - Processor motherboard; 702 - Substrate module; 7021 - First substrate; 7022 - Second substrate;
[0043] 703 - Middle back plate; 704 - First liquid cooling plate;
[0044] 802 - Second liquid cooling plate; 803 - Network card interface module; 804 - Network card; 805 - Hard disk module;
[0045] 901 - Processor; 902 - Memory; 903 - Input power connector; 904 - Power supply connector;
[0046] 9041 - First power supply connector; 9042 - Second power supply connector; 9043 - Third power supply connector;
[0047] 905 - Signal connector; 9051 - First signal connector; 9052 - Second signal connector;
[0048] 9053 - Third signal connector; 906 - Power converter; 907 - First hard disk;
[0049] 1001 - Network interface; 1002 - First signal connector group; 1003 - Second signal connector group;
[0050] 1101 - First switch chip; 1102 - Fourth signal connector;
[0051] 1103 - Fifth signal connector; 1104 - First orthogonal connector; 1105 - Accelerator;
[0052] 1106 - Power connector; 1107 - Programmable logic device;
[0053] 1201 - Second Orthogonal Connector; 1202 - Switching Processor; 1203 - Hardware Manager;
[0054] 1204 - Third liquid cooling pipe; 1301 - Power supply assembly; 1302 - Hardware management controller;
[0055] 1303 - Sideband Management Interface; 1304 - Power Supply Module; 1305 - Clock Buffer;
[0056] 1306 - Management bus signal for the I2C intelligent platform; 1307 - Global reset signal;
[0057] 1308 - Chassis / Node ID signal; 1309 - Presence detection signal;
[0058] 1310 - UART serial port signal 0; 1311 - UART serial port signal 1;
[0059] 1312 - Second UART serial port signal; 1313 - Power enable signal;
[0060] 1314 - 100 MHz clock signal.
Detailed Implementation
[0061] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the protection scope of this application.
[0062] It should be noted that terms such as "center," "longitudinal," "lateral," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," "clockwise," "counterclockwise," "axial," "radial," and "circumferential" indicate orientations or positional relationships based on those shown in the accompanying drawings. They are used only for convenience in describing this application and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore should not be construed as limiting this application. The terms "installed," "connected," and "linked" should be interpreted broadly: for example, a connection can be fixed, detachable, or integral; mechanical or electrical; direct or indirect through an intermediate medium; or an internal connection between two elements. The terms "parallel," "perpendicular," and "equal" cover both the stated condition and conditions close to it within an acceptable deviation range, where the acceptable deviation range is determined by those skilled in the art taking into account the measurement under discussion and the error associated with the measurement of a particular quantity (i.e., the limitations of the measurement system). For example, "parallel" includes absolute parallelism and approximate parallelism, where an acceptable deviation range for approximate parallelism can be, for example, within 5°; "perpendicular" includes absolute perpendicularity and approximate perpendicularity, where an acceptable deviation range for approximate perpendicularity can also be, for example, within 5°. "Equal" includes absolute equality and approximate equality, where an acceptable deviation range for approximate equality can be, for example, a difference between the two equal items that is less than or equal to 5% of either one. Those skilled in the art will understand the specific meaning of the above terms in this application based on the specific circumstances.
[0063] To enable those skilled in the art to better understand the present application, the present application will be further described in detail below with reference to the accompanying drawings and specific embodiments.
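The tolerance conventions of paragraph [0062] can be illustrated with a minimal Python sketch. The function names and the vector representation are assumptions made for illustration only; the 5° and 5% values are the examples quoted in that paragraph, and "5% of either one" is read here as 5% of the larger magnitude, which is only one possible reading.

```python
import math

ANGLE_TOLERANCE_DEG = 5.0  # example angular deviation quoted in paragraph [0062]
RELATIVE_TOLERANCE = 0.05  # example 5% deviation quoted in paragraph [0062]


def angle_between_deg(u, v):
    """Return the angle in degrees between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


def is_parallel(u, v, tol_deg=ANGLE_TOLERANCE_DEG):
    """Treat directions within tol_deg of 0 or 180 degrees as parallel."""
    angle = angle_between_deg(u, v)
    return min(angle, 180.0 - angle) <= tol_deg


def is_perpendicular(u, v, tol_deg=ANGLE_TOLERANCE_DEG):
    """Treat directions within tol_deg of 90 degrees as perpendicular."""
    return abs(angle_between_deg(u, v) - 90.0) <= tol_deg


def is_equal(a, b, rel_tol=RELATIVE_TOLERANCE):
    """Assumed reading: the difference is at most rel_tol of the larger magnitude."""
    return abs(a - b) <= rel_tol * max(abs(a), abs(b))
```

For example, `is_perpendicular((1, 0), (0, 1))` holds for the first and second directions of an ideal orthogonal layout, while `is_perpendicular((1, 0), (1, 1))` does not.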
[0064] With the increasing prevalence of large-scale model training, complex scientific computing, and real-time data processing, servers are facing computing power bottlenecks. Servers can include rack-mounted supernode servers and standalone all-in-one servers. Rack-mounted supernode servers offer large computing power, but require 28U of space, and centralized cable management necessitates disassembling the entire cable tray to replace faulty components. All-in-one servers offer flexible deployment and simple maintenance, but their computing power cannot meet the needs of large-scale model training, and their scalability is limited.
[0065] An example of a rack-mounted supernode server is described below with reference to Figure 1.
[0066] Figure 1 is a schematic diagram of a rack-mounted supernode server provided in an embodiment of this application. Referring to Figure 1, the figure can include front and rear views of a 64-card rack-mounted supernode server.
[0067] The height of the rack-mounted supernode server is 46U, and the whole machine is compatible with 21-inch, 1.2m standard racks. U is a rack unit, which is an industry standard unit used to measure the height of the equipment in a standard rack.
[0068] In the front view, the rack-mounted supernode server may include a top-of-rack switch 101, a cable management rack 102, a power supply rack 103, computing nodes 104, forwarding nodes 105, and blank spaces 106.
[0069] The top-of-rack switch 101 can be deployed at the top of the rack to connect all servers within the rack and access the upper-layer network, serving as a node for network communication within the rack.
[0070] The cable management rack 102 can be used to organize and fix cables, preventing messy wiring.
[0071] The power supply rack 103 can be used to deploy rack-level power supply equipment, and the power supply rack 103 has a height of 2U.
[0072] Computing node 104 can be a server unit that undertakes data computing tasks, providing computing power support for business applications. The solution deploys a total of 16 computing nodes, each with a height of 1U, a depth of 950mm, and a width of 536mm.
[0073] An example of a computing node in a rack-mounted supernode server is described below with reference to Figure 2.
[0074] Figure 2 is a schematic diagram of the structure of a computing node in a rack-mounted supernode server provided in an embodiment of this application. Referring to Figure 2, the figure can include the computing node 104.
[0075] Computing node 104 can integrate multiple key components to achieve efficient computing, network communication, and operation and maintenance management.
[0076] The computing node 104 may include a network cable interface card 201, InfiniBand high-speed network slots 202, storage media 203, a power supply board 204, a switching board 205, fans 206, a motherboard 207, operation and maintenance management boards 208, a water inlet pipe 209, a power supply clip 210, high-density connectors 211, and a water outlet pipe 212.
[0077] In Figure 2, the network cable interface card 201 at the upper left of the left area can handle high-speed network communication at 200 Gbps, ensuring efficient data transmission and reception; the four InfiniBand high-speed network slots 202 below it can be designed for high-performance computing scenarios, supporting low-latency, high-bandwidth node interconnection; and the four storage media 203 at the lower left can provide large-capacity storage with high read/write performance for the computing node 104.
[0078] The central area can be the core functional area of the computing node 104. The power supply board 204 in the upper middle can allocate input power and provide power supply and management for the various components. The switching board 205 to its right can use PCI Express (PCIe) bus technology, a high-speed serial computer expansion bus standard, to expand PCIe devices and implement high-speed data exchange, improving the efficiency of resource sharing between devices. The motherboard 207 in the center can be the computing core, used for data computing, processing, and system resource scheduling. The eight fans 206 on the left side of the motherboard 207 cool high-heat components through forced air cooling, keeping the computing node 104 at a stable operating temperature.
[0079] The right area, which includes the water inlet pipe 209 and the water outlet pipe 212, can be used for liquid cooling and for operation and maintenance management. The water inlet pipe 209 at the upper right can introduce coolant into the internal liquid cooling module of the computing node 104 to dissipate heat from high-heat components such as the processor. The power supply clip 210 in the middle of the right side is used to fix the power connection components, avoiding poor contact and ensuring a stable power supply. The four high-density connectors 211 on the side can support multi-pin, high-bandwidth data / power transmission to meet the high-speed interconnection requirements of the components. The water outlet pipe 212 at the lower right can discharge the coolant after it has absorbed heat, sending it to external heat dissipation equipment to complete the circulation. In addition, the four operation and maintenance management boards 208 in the middle of the right side can handle the operation, management, and maintenance of the computing node 104, providing functions such as status monitoring, fault diagnosis, and configuration management to ensure the maintainability of the computing node 104.
[0080] Forwarding node 105 can be used for network switching within or across racks, ensuring efficient data transmission between different devices. This solution deploys a total of 8 such switching nodes.
[0081] Blank space 106 can be used to indicate that no equipment has been deployed in this U-position, which may be reserved for expansion, a spare location, or a location where no equipment needs to be installed for the time being.
[0082] In the rear view, the rack-mounted supernode server also includes a power supply frame 107 and a copper busbar 108.
[0083] The power supply frame 107 can be used as a tray-type / trough-type bracket to support various types of cables. All interconnect cables for the 16 compute nodes and 8 switching nodes can be connected to the power supply frame 107 via high-density connectors, enabling centralized management of the entire cabinet's cables.
[0084] For example, cables can be network cables, fiber optic cables, power cables, etc.
[0085] The copper busbar 108 can be used to deliver electrical energy from the power supply to various components of the server.
[0086] The rack-mounted supernode server can adopt a centralized power supply architecture, in which a 54V power supply frame 107 and a copper busbar 108 form the power supply core. Power is distributed through the large cross-section copper busbar 108, and each node draws power through alligator clip contact.
[0087] Considering the structural features of the rack and the requirement for 64 cards, this solution has shortcomings in terms of space utilization. The problems lie in the excessive space occupied by multiple devices and the additional space encroached upon by auxiliary components at the rear, as specifically manifested below:
[0088] First, the nodes occupy too much rack space. The entire rack is 46U high, and the 16 compute nodes (1U each) and 8 switching nodes require at least 28U of rack space. The compute and switching nodes alone occupy roughly three-fifths (28 / 46 ≈ 60.9%) of the rack's height, directly compressing the space available for auxiliary equipment and reserved blank slots.
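The rack-space arithmetic above can be reproduced with a short Python sketch. All input figures come from the text; the space attributed to the switching nodes is derived by subtraction and is not stated in the source, and the variable names are illustrative.

```python
RACK_HEIGHT_U = 46         # total rack height, from the text
COMPUTE_NODES = 16         # number of compute nodes, from the text
COMPUTE_NODE_HEIGHT_U = 1  # height of each compute node, from the text
NODE_SPACE_U = 28          # minimum space for compute plus switching nodes, from the text
ACCELERATOR_CARDS = 64     # card count of the rack-mounted supernode server

compute_space_u = COMPUTE_NODES * COMPUTE_NODE_HEIGHT_U
# Derived by subtraction; the source does not state this value directly.
switching_space_u = NODE_SPACE_U - compute_space_u
node_share = NODE_SPACE_U / RACK_HEIGHT_U
remaining_u = RACK_HEIGHT_U - NODE_SPACE_U
cards_per_rack_u = ACCELERATOR_CARDS / RACK_HEIGHT_U

print(f"compute nodes: {compute_space_u}U, switching nodes: {switching_space_u}U")
print(f"share of rack height used by nodes: {node_share:.1%}")
print(f"height left for everything else: {remaining_u}U")
print(f"accelerator cards per rack unit: {cards_per_rack_u:.2f}")
```

Under these figures, at most 18U remains for the top-of-rack switch, cable management racks, power supply racks, and blank spaces combined.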
[0089] Second, the auxiliary components on the back of the rack further encroach on the usable space inside the rack. According to the rear structure design of the computing node 104 and the forwarding node 105, the power supply frame 107 and the copper busbar 108 need to be fixedly installed on the back of the rack, and must fully cover and connect all interconnected and powered device nodes in the entire rack, including the top-of-rack switch 101 deployed on the front, all computing nodes 104, all forwarding nodes 105, and related auxiliary equipment on the back, in order to provide power supply and signal transmission for the entire rack.
[0090] In this installation method, the power supply frame 107 and copper busbar 108 not only occupy the vertical height space at the back of the rack, but also a certain amount of horizontal depth space, further reducing the available capacity inside the rack. As a result, the remaining U-slots and depth space of the rack are insufficient to meet the needs of other auxiliary equipment (such as monitoring equipment and backup power supply) deployment and cable redundancy reservation. At the same time, it cannot provide sufficient expansion space for subsequent upgrade operations such as adding computing nodes and switching nodes, ultimately resulting in a significant reduction in the overall rack space utilization.
[0091] An example of an all-in-one server is described below with reference to Figure 3.
[0092] Figure 3 is a schematic diagram of the structure of an all-in-one server provided in an embodiment of this application. Referring to Figure 3, the figure can include front and rear views of an 8-card all-in-one server.
[0093] The 8-card all-in-one server can be adapted to a general 19-inch 1m rack. It is 890mm long, 447mm wide, and 8U high, and consists of a functional module on the front and a heat dissipation and power supply module on the back.
[0094] The front of the all-in-one server may include an expansion interface 301, an accelerator node 302, a processor node 303, a power supply unit slot 304, a hard disk 305, a network interface card 306, and a PCIe slot 307.
[0095] In the front layout of the all-in-one server, the 4U accelerator node 302 can provide powerful parallel computing capabilities for scenarios such as artificial intelligence training and inference.
[0096] Accelerator nodes can employ hardware acceleration devices such as graphics processing units (GPUs) and field-programmable gate arrays (FPGAs).
[0097] The expansion interface 301 can support cross-chassis accelerator networking, enabling multi-chassis computing power interconnection and operation and maintenance management expansion.
[0098] The 3U processor node 303 can be used for data processing, instruction execution, and resource scheduling.
[0099] Multiple hard disks 305 can meet the high read / write requirements of large-scale data in AI training / inference.
[0100] Multiple network interface cards 306 and multiple PCIe slots 307 enable low-latency, high-bandwidth direct memory access data interaction between nodes.
[0101] The power supply unit slot 304 can provide stable power support for the device.
[0102] The back of the all-in-one server may include a fan assembly 308, a power supply unit 309, and a power supply unit connector 310.
[0103] In the rear layout of the all-in-one server, the fan assembly 308 can cool high-heat components such as the accelerator node 302 and the processor node 303 through forced air cooling, ensuring long-term reliable operation of the equipment.
[0104] The power supply unit 309 can supply power to low-voltage components such as fans and control circuits.
[0105] The power supply unit connector 310 can deliver sufficient power to high-power computing modules such as accelerator node 302 and processor node 303.
[0106] In certain large language model training scenarios, the computing power of a single all-in-one server is insufficient to meet performance requirements; likewise, when the computing power of a single card is relatively low, more accelerator nodes need to be stacked to increase computing power. If an all-in-one server based on the current architecture were expanded to a 64-card configuration, the height of the single machine would increase significantly, and the high space efficiency that characterizes all-in-one servers would be lost. Moreover, the existing power supply architecture cannot meet the power requirements of a 64-card configuration, so expansion is clearly limited.
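The expansion limitation can be made concrete with a rough estimate, sketched below in Python. The 8-card, 8U chassis and the 4U accelerator node come from the text; the assumption that height scales linearly with card count is an illustrative simplification that the source does not state, and the function name is hypothetical.

```python
import math


def estimate_scaled_height(target_cards, base_cards=8, chassis_u=8, accel_u=4):
    """Estimate height when scaling the 8-card all-in-one design.

    Assumes (hypothetically) that height grows linearly with card count.
    """
    units = math.ceil(target_cards / base_cards)
    return {
        "base_units": units,                    # number of 8-card building blocks
        "stacked_chassis_u": units * chassis_u, # stacking whole 8U chassis
        "accelerator_only_u": units * accel_u,  # accelerator nodes alone
    }


estimate = estimate_scaled_height(64)
print(estimate)
```

Under this assumption, a 64-card configuration needs eight 8-card building blocks: 64U if whole chassis are stacked, or 32U for the accelerator nodes alone, before any processor, power, or cooling space is counted.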
[0107] In related technologies, there is a conflict between space and computing power in servers, resulting in low computing power density.
[0108] To address the aforementioned technical problems, this application provides a server that integrates multiple computing nodes, multiple switching nodes, a first power supply group, a second power supply group, a first liquid cooling pipe, a second liquid cooling pipe, and a backplane assembly within a chassis. The multiple switching nodes are positioned on the first side of the backplane assembly, and the multiple computing nodes are positioned on the second side of the backplane assembly along a second direction perpendicular to the first direction. Furthermore, the relative positions of the power supply group and liquid cooling pipe on both sides of the switching nodes, as well as the spacing between the first and second liquid cooling pipes, are defined. This allows for a compact and coordinated arrangement of computing, switching, power supply, and heat dissipation resources within a limited chassis space, thereby improving the computing power density of the server system while also considering system expansion efficiency, maintenance convenience, and deployment adaptability.
[0109] To enable those skilled in the art to better understand the present application, the present application will be further described in detail below with reference to the accompanying drawings and specific embodiments.
[0110] Figure 4 is a schematic diagram of the structure of a server provided in an embodiment of this application. Referring to Figure 4, the figure includes a server.
[0111] The server includes a chassis 401, which contains multiple computing nodes 402, multiple switching nodes 403, a first power supply group 404, a second power supply group 405, a first liquid cooling pipe 406, a second liquid cooling pipe 407, and a backplane assembly 408.
[0112] Multiple switching nodes 403 are disposed on the first side of the backplane assembly 408, a first power supply group 404 and a second power supply group 405 are disposed on both sides of the multiple switching nodes 403 along a first direction, and a first liquid cooling pipe 406 and a second liquid cooling pipe 407 are disposed on both sides of the multiple switching nodes 403 along the first direction.
[0113] The distance between the first liquid cooling pipe 406 and the second liquid cooling pipe 407 along the first direction is smaller than the dimension of the computing node 402 along the first direction.
[0114] Multiple computing nodes 402 are arranged on the second side of the backplane assembly 408 along a second direction, with the first direction and the second direction being perpendicular.
[0115] The chassis 401 can serve as the overall installation carrier for the server, providing installation space and structural support for its internal components and ensuring the stability of the components after assembly.
[0116] The chassis 401 can be the outermost layer of the entire machine. Its interior is partitioned along the preset first and second directions, allowing each module to be installed in its corresponding position by plugging, screw connection, slide rail insertion, or quick-release locking, and to interface with the backplane assembly 408, the power supply groups, and the liquid cooling pipes. In terms of form, the chassis 401 can be a metal cabinet, an aluminum alloy frame shell, or a steel load-bearing housing.
[0117] In one possible embodiment, the chassis 401 may be formed by bending sheet metal to form the enclosure.
[0118] In one possible embodiment, the chassis 401 may be assembled from profiles to form a frame.
[0119] In one possible embodiment, the chassis 401 may be formed into a load-bearing cavity by integral stamping or casting.
[0120] The chassis 401 can be made of cold-rolled steel plate, aluminum-magnesium alloy or stainless steel, and can be coated with anti-corrosion coating, conductive coating or insulating coating on the inner surface.
[0121] The chassis 401 is compatible with standard 19-inch rack environments. The height, depth, and width of its internal mounting positions are compactly configured according to the stacking relationship of compute nodes 402, switching nodes 403, and power supply and heat dissipation units. The wall thickness of the chassis 401 can vary from one to three millimeters depending on the load-bearing and heat dissipation requirements. The spacing of the module mounting positions can be reserved with U-level height or millimeter-level precision to achieve high-density integration in a limited space.
[0122] The first liquid cooling pipe 406, the first power supply group 404, the multiple switching nodes 403, the second power supply group 405, and the second liquid cooling pipe 407 are sequentially arranged on the first side of the backplane assembly 408 along the first direction.
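The arrangement on the first side of the backplane assembly 408 can be expressed as a small consistency check, sketched below in Python. The component order and the spacing condition come from paragraphs [0113] and [0122]; the `Span` type, the function name, the millimeter coordinates, and the reading of "distance between the pipes" as the gap between their inner edges are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Order along the first direction, as listed in paragraph [0122].
EXPECTED_ORDER = [
    "first_liquid_cooling_pipe",   # 406
    "first_power_supply_group",    # 404
    "switching_nodes",             # 403
    "second_power_supply_group",   # 405
    "second_liquid_cooling_pipe",  # 407
]


@dataclass(frozen=True)
class Span:
    """Extent of one component along the first direction (hypothetical model)."""
    name: str
    start_mm: float
    end_mm: float


def check_first_side_layout(spans, compute_node_dim_mm):
    """Return a list of problems; an empty list means the layout is consistent."""
    if sorted(s.name for s in spans) != sorted(EXPECTED_ORDER):
        return ["expected exactly the five described components"]
    problems = []
    ordered = sorted(spans, key=lambda s: s.start_mm)
    if [s.name for s in ordered] != EXPECTED_ORDER:
        problems.append("components are not in the described order")
    for left, right in zip(ordered, ordered[1:]):
        if right.start_mm < left.end_mm:
            problems.append(f"{left.name} overlaps {right.name}")
    by_name = {s.name: s for s in spans}
    # Assumed reading: pipe spacing is the gap between the pipes' inner edges.
    pipe_gap_mm = (by_name["second_liquid_cooling_pipe"].start_mm
                   - by_name["first_liquid_cooling_pipe"].end_mm)
    if pipe_gap_mm >= compute_node_dim_mm:
        problems.append("pipe spacing is not smaller than the computing node dimension")
    return problems


# Hypothetical dimensions, chosen only to exercise the check.
example = [
    Span("first_liquid_cooling_pipe", 0.0, 20.0),
    Span("first_power_supply_group", 20.0, 90.0),
    Span("switching_nodes", 90.0, 340.0),
    Span("second_power_supply_group", 340.0, 410.0),
    Span("second_liquid_cooling_pipe", 410.0, 430.0),
]
print(check_first_side_layout(example, compute_node_dim_mm=440.0))
```

With these example values the pipe gap is 390 mm, which is below the assumed 440 mm computing node dimension, so the check reports no problems.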
[0123] The first power supply group 404 can be arranged along the first direction on one side of multiple switching nodes 403.
[0124] The first power supply group 404 can be a set of power supply units that provides power conversion and distribution for at least some of the computing nodes 402 and switching nodes 403. Its function is to convert externally input power into the DC voltages required by the modules inside the server and to perform voltage regulation, filtering, and protection.
[0125] The first power supply group 404 is arranged adjacent to multiple switching nodes 403, and is located between or adjacent to the first liquid cooling pipe 406 and the switching nodes 403, which can shorten part of the power supply path and facilitate unified heat dissipation management.
[0126] In one possible embodiment, the first power supply group 404 may include a plurality of parallel redundant power supply modules.
[0127] The length of the first power supply group 404 is usually coordinated with the length of the switching node 403 along the first direction. Its height and depth are set according to the power density, heat sink arrangement and plug-in maintenance space to ensure that it can meet the power supply requirements of high load without occupying too much computing node installation area.
[0128] The second power supply group 405 may refer to a collection of power supply units arranged on the other side of the multiple switching nodes 403 along the first direction and opposite to the first power supply group 404.
[0129] The function of the second power supply group 405 is to form a distributed power supply structure together with the first power supply group to balance the power supply load of different areas and provide redundant or partitioned power supply capabilities for the internal modules of the server.
[0130] The second power supply group 405 can provide power input to the corresponding compute node 402 and switch node 403 by connecting to the backplane assembly 408, power bus interface or independent power supply bus, and can be replaced independently during maintenance.
[0131] The second power supply group 405 and the first power supply group 404 can maintain a corresponding relationship in terms of their arrangement length along the first direction, and coordinate with the chassis side wall, the shape of the switching node 403 and the direction of the liquid cooling pipes, so as to achieve a balanced arrangement on both sides and avoid local congestion.
[0132] The first liquid cooling pipe 406 and the second liquid cooling pipe 407 are arranged on both sides of the plurality of switching nodes 403 along the first direction.
[0133] The first liquid cooling pipe 406 can be arranged along the first direction on the first side of the first power supply group.
[0134] The first liquid cooling pipe 406 can be used as a liquid cooling passage component to transport cooling medium and exchange heat with adjacent heat sources. Its function is to introduce external cold sources or circulating cooling systems into the chassis and remove heat from the first power supply group 404 and its adjacent areas, thereby maintaining the power supply unit within the allowable temperature rise range.
[0135] The first liquid cooling pipe 406 is typically installed in conjunction with a cold plate or quick-connect fitting and can extend along the length of the chassis to allow the cooling medium to enter and exit in a predetermined direction.
[0136] The diameter, wall thickness, and bending radius of the first liquid cooling pipe 406 must be compatible with the flow requirements, pressure loss, and internal space of the chassis. Its position is located outside the first power supply group 404 to reduce structural interference with the switching node 403 and the computing node 402, and to facilitate quick connection with the external liquid cooling circulation system.
[0137] The second liquid cooling pipe 407 can be arranged along the first direction on the second side of the second power supply group.
[0138] The second liquid cooling pipe 407 can be used as a liquid cooling passage component to cool the second power supply group 405 and its adjacent heat source. Its function is to form a symmetrically distributed cooling system together with the first liquid cooling pipe 406, so that both power supply areas can obtain stable heat dissipation capacity and reduce heat diffusion to the computing node area.
[0139] The second liquid cooling pipe 407 is usually arranged adjacent to the outer boundary of the second power supply group 405 and is connected to the external cooling system through a quick-connect connector, a shunt connector or an independent circuit to achieve the circulation of the cooling medium.
[0140] The routing and installation spacing of the second liquid cooling pipe 407 should match the outer contour of the second power supply group 405 to avoid obstructing the disassembly and assembly passage and maintenance line of sight while maintaining cooling efficiency.
[0141] The distance between the first liquid cooling pipe 406 and the second liquid cooling pipe 407 along the first direction is less than the dimension of the computing node 402 along the first direction.
[0142] The dimension of the computing node 402 along the first direction is greater than the distance between the first liquid cooling pipe 406 and the second liquid cooling pipe 407, so that the computing node can be connected to the liquid cooling pipe and form a suitable insertion depth and maintenance space with the backplane assembly.
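The arrangement order in [0122] and the size relation in [0141] and [0142] can be sketched as a small consistency check. This is only an illustrative sketch: the `Component` class, the helper names, and every width in millimetres are hypothetical placeholders and are not taken from this application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    width_mm: float  # extent along the first direction (hypothetical value)


# Order along the first direction on the first side of backplane assembly 408.
FIRST_SIDE = [
    Component("first liquid cooling pipe 406", 20.0),
    Component("first power supply group 404", 80.0),
    Component("switching nodes 403", 200.0),
    Component("second power supply group 405", 80.0),
    Component("second liquid cooling pipe 407", 20.0),
]


def pipe_gap_mm(row):
    """Distance between the two pipes: the widths of everything between them."""
    return sum(c.width_mm for c in row[1:-1])


def compute_node_spans_pipes(row, compute_node_width_mm):
    """True when the compute node is wider than the pipe-to-pipe distance."""
    return compute_node_width_mm > pipe_gap_mm(row)
```

With these placeholder widths the pipe-to-pipe distance is 360 mm, so a 400 mm computing node satisfies the relation and a 300 mm one does not.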
[0143] Multiple computing nodes 402 can be arranged on the second side of the backplane assembly 408 along the second direction, and the first direction and the second direction are perpendicular to each other, forming an orthogonal layout.
[0144] Multiple computing nodes 402 can be used as functional units to perform data processing, storage access and parallel computing tasks. Their role is to serve as the main computing power carrier of the server and to establish electrical and data signal connections with multiple switching nodes through the backplane assembly.
[0145] Multiple compute nodes 402 can be arranged along the second direction on the side of the backplane assembly facing away from the switching node. They can be installed in a board-type, drawer-type, or tray-type manner and fixed to the chassis by guide rails, slots, or limit brackets, making it easy to pull them out from the side of the chassis for maintenance.
[0146] The thickness and length of a single compute node 402 are typically matched with the chassis depth to meet the requirements for insertion depth and heat dissipation clearance with the backplane components. Multiple compute nodes 402 can be arranged in a compact parallel array, while reserving the necessary space for airflow or liquid cooling interfaces to ensure installation consistency and ease of replacement under high-density deployment.
[0147] The first power supply group 404 and the second power supply group 405 work together. In addition to the first power supply group 404 supplying power to the switching node 403, the two also jointly provide dual power support for multiple computing nodes 402, improving the redundancy and reliability of power supply, avoiding the downtime of computing node 402 due to the failure of a single power supply group, and ensuring the continuous operation of the server.
[0148] Multiple switching nodes 403 can be disposed on the first side of the backplane assembly. The multiple switching nodes 403 can be used as network switching functional units to implement data forwarding and interconnection communication between multiple computing nodes 402 and between computing nodes 402 and external networks. The function of the multiple switching nodes 403 is to aggregate, switch, and distribute the high-speed data streams of computing nodes 402, and to form short-path, highly consistent signal connections through the backplane assembly.
[0149] Multiple switching nodes 403 can be located on the side of the backplane assembly facing away from the computing nodes 402, can be arranged side by side as several adjacent modules along the first direction, and can be connected to the backplane assembly through plug-in ports, cable ports or board-to-board connectors.
[0150] Multiple switching nodes 403 can be matched with the array size of computing nodes 402, and their width and height can be controlled according to the port layout density of the backplane assembly, so that they form a communication layer corresponding to computing nodes 402 inside the chassis, thereby reducing cross-area cabling requirements and improving the consistency of switching links.
[0151] The backplane assembly 408 can be located in the middle of the chassis. The backplane assembly 408 can be used as a basic connection component to establish electrical interconnection and mechanical positioning between multiple switching nodes on the first side and multiple computing nodes on the second side. Its function is to provide a unified interface for high-speed data exchange, power distribution and structural support, and to serve as the reference interface for the module partitioning on both sides of the chassis.
[0152] The backplane assembly 408 can be fixed to the chassis frame or the intermediate load-bearing beam. Multiple switching nodes 403 are connected to the backplane through their first side ports, and multiple computing nodes 402 are connected to the backplane through their second side ports, thereby enabling data links and some power links to complete transmission along a shorter path.
[0153] The length and port layout of the backplane assembly 408 should be consistent with the module spacing of the switching node 403 and the computing node 402 to ensure plugging accuracy, signal integrity and assembly reliability, while providing a clear boundary for maintenance and replacement.
[0154] Multiple switching nodes 403 and multiple compute nodes 402 can be orthogonally connected through a backplane assembly 408. The backplane assembly 408 provides a precise interface for the docking of switching nodes 403 and compute nodes 402, enabling high-speed signal interaction and data transmission between them, ensuring that multiple compute nodes 402 work collaboratively. At the same time, the switching nodes 403 facilitate signal transfer between compute nodes 402 and external systems, improving the overall computing and interaction efficiency of the server.
[0155] This orthogonal layout design not only makes reasonable use of the internal space of the chassis 401 and improves space utilization, but also reduces signal interference between components. At the same time, it facilitates the individual disassembly and maintenance of each component, taking into account both structural rationality and functional reliability, and is suitable for the high-load and high-stability working requirements of the server.
[0156] The server provided in this application embodiment features a hierarchical layout centered on the backplane assembly. Multiple switching nodes are located on the first side of the backplane assembly, multiple computing nodes on the second side, and a first power supply group and a second power supply group are positioned on either side of the switching nodes. The first and second liquid cooling pipes are further arranged outside the power supply groups. This creates a layered, partitioned layout within the chassis, with each functional module unfolding in different directions and isolated from the others. Power paths, communication paths, and cooling paths are spatially ordered, reducing the probability of internal cable crossings and device obstruction, minimizing ineffective reserved space, and increasing the integration density of computing units, switching units, and power supply / heat dissipation units within the chassis. Furthermore, since the switching nodes are concentrated on one side of the backplane assembly and the computing nodes on the other, related signal links can be connected over short distances through the backplane assembly, reducing wiring complexity and improving accessibility during maintenance. The power supply groups and liquid cooling pipes are located further out, eliminating the need for extensive disassembly of computing and switching nodes when replacing power modules or inspecting piping, thereby improving overall maintenance efficiency and engineering adaptability.
[0157] The backplane assembly 408 is described below with reference to Figure 5.
[0158] Figure 5 is a schematic diagram of a backplane assembly provided in an embodiment of this application. Referring to Figure 5, the backplane assembly 408 includes a first backplane group 501 and a second backplane group 502, which are arranged in parallel.
[0159] The first power supply group 404 is disposed on the first side of the first backplane group 501, and the second power supply group 405 is disposed on the first side of the second backplane group 502.
[0160] Multiple computing nodes 402 are located on the second side of the first backplane group 501 and the second backplane group 502.
[0161] The coordinate system shown in Figure 5 helps in understanding the server's structure. The first direction X represents the server's width, the second direction Z represents the server's height, and the third direction Y represents the server's length. The first direction X, the second direction Z, and the third direction Y are perpendicular to each other.
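The three directions can be represented as axis vectors to make the perpendicularity explicit. This is a minimal sketch; assigning the directions to these particular unit vectors is an assumption made for illustration, since the description only states that X, Z, and Y are mutually perpendicular.

```python
# Assumed unit vectors; only mutual perpendicularity comes from the description.
X = (1, 0, 0)  # first direction: server width
Z = (0, 0, 1)  # second direction: server height
Y = (0, 1, 0)  # third direction: server length


def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(p * q for p, q in zip(a, b))


def mutually_perpendicular(*axes):
    """True when every pair of the given axes has a zero dot product."""
    return all(
        dot(a, b) == 0
        for i, a in enumerate(axes)
        for b in axes[i + 1:]
    )
```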
[0162] The first backplane group 501 and the second backplane group 502 are arranged in parallel and symmetrically distributed, together providing an installation and docking carrier for the computing node 402 and the power supply group.
[0163] The first power supply group 404 is located on the first side of the first backplane group 501, and the second power supply group 405 is located on the first side of the second backplane group 502. This layout enables the power supply group and the backplane group to be precisely connected, which facilitates the rapid transmission of power signals, while avoiding interference with other components and improving space utilization.
[0164] Multiple computing nodes 402 are respectively located on the second side of the first backplane group 501 and the second backplane group 502, and are precisely connected to the first backplane group 501 and the second backplane group 502. They can obtain power support from the first power supply group 404 and the second power supply group 405 through the backplane group. Furthermore, the multiple computing nodes 402 are orthogonally connected to multiple switching nodes 403 to realize data interaction.
[0165] The backplane assembly structure, consisting of the first backplane group 501 and the second backplane group 502, not only has good structural stability, but also enables partitioned docking of power supply groups and computing nodes, clearly defines the functional division of each component, reduces signal interference, and facilitates the disassembly, assembly, maintenance and expansion of each component. It enables high-density deployment within the same chassis, improves space utilization and increases the computing power density of the server.
[0166] In one possible implementation, backplane assembly 408 includes a horizontal backplane 503 to which a plurality of switching nodes are connected;
[0167] The horizontal backplane 503 is located between the first backplane group 501 and the second backplane group 502 along the first direction;
[0168] The first power supply group supplies power to the horizontal backplane through the first backplane group, and / or the second power supply group supplies power to the horizontal backplane through the second backplane group.
[0169] The horizontal backplane 503 can be an intermediate backplane unit arranged between the two backplane groups and used to carry the interconnection of the switching nodes and to relay the power supply.
[0170] The horizontal backplane 503 can provide a unified electrical connection interface for multiple switching nodes, enabling each switching node to complete high-speed signal interconnection and power input or distribution through the same backplane.
[0171] The horizontal backplane 503 can extend along the first direction and be installed between the first backplane group 501 and the second backplane group 502.
[0172] The length of the horizontal backplane 503 in the first direction is generally not less than the total width of the horizontal arrangement of multiple switching nodes.
[0173] The first power supply group 404 can input power to the horizontal backplane 503 through the first backplane group 501, or the second power supply group 405 can supply power to the horizontal backplane 503 through the second backplane group 502, or the two power supply groups can work together to supply power to the horizontal backplane 503. This allows the horizontal backplane 503 to serve as a unified power junction and signal exchange point inside the chassis. This intermediate interconnection structure reduces the cross-area wiring length between switching nodes, reduces connection loss and transmission impedance discontinuity, and improves power supply consistency and interconnection stability.
[0174] When the system starts up, the first power supply group and / or the second power supply group transmit DC power to the horizontal backplane via the corresponding backplane group. Each switching node then obtains working power and exchanges data signals through the connection interface with the horizontal backplane. This forms a continuous power distribution path and a high-speed interconnection path within the backplane assembly, enabling multiple switching nodes to work stably and collaboratively within a small structural space. This improves the internal communication efficiency, power supply reliability, and overall integration of the server.
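The "and / or" power feed of the horizontal backplane 503 can be sketched as follows. This is an illustrative model only: it assumes an embodiment in which both feed paths are wired, and the function names and path tuples are hypothetical.

```python
# Hypothetical path labels; reference numerals follow the description.
PATH_VIA_FIRST = (
    "first power supply group 404",
    "first backplane group 501",
    "horizontal backplane 503",
)
PATH_VIA_SECOND = (
    "second power supply group 405",
    "second backplane group 502",
    "horizontal backplane 503",
)


def active_feed_paths(first_group_ok: bool, second_group_ok: bool) -> list:
    """Return the power paths that currently reach the horizontal backplane."""
    paths = []
    if first_group_ok:
        paths.append(PATH_VIA_FIRST)
    if second_group_ok:
        paths.append(PATH_VIA_SECOND)
    return paths


def switching_nodes_powered(first_group_ok: bool, second_group_ok: bool) -> bool:
    """Switching nodes draw power from the horizontal backplane,
    so a single live feed path is sufficient in this model."""
    return bool(active_feed_paths(first_group_ok, second_group_ok))
```

In this model the switching nodes lose power only when neither power supply group can feed the horizontal backplane.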
[0175] The horizontal backplane 503, the first backplane group 501, and the second backplane group 502 are described below with reference to Figure 6.
[0176] Figure 6 is a schematic diagram of another backplane assembly provided in an embodiment of this application. Referring to Figure 6, the figure shows the backplane assembly.
[0177] A switching node connector 601 corresponding to each switching node 403 is provided on the first side of the horizontal backplane 503.
[0178] Multiple switching nodes 403 are connected to their respective switching node connectors 601.
[0179] In this design, multiple switching nodes 403 are precisely connected to their corresponding switching node connectors 601. This design allows the horizontal backplane 503 to supply power to each switching node 403 through independent switching node connectors 601, while also facilitating the disassembly, assembly, and maintenance of individual switching nodes 403, thus improving maintenance convenience.
[0180] In one possible implementation, the first backplane group 501 includes a first power backplane 602 and a first computing node backplane 603.
[0181] The first power supply backplane 602 and the first computing node backplane 603 are parallel to each other and electrically connected;
[0182] The first power supply group 404 is electrically connected to the first power supply backplane 602 and is located on the side opposite to the first computing node backplane 603.
[0183] The first power backplane 602 is located on the first side of the first backplane group 501, corresponding to the first power group 404. The first computing node backplane 603 is located on the second side of the first backplane group 501, corresponding to the computing node 402 disposed on that side. The first power backplane 602 and the first computing node backplane 603 are electrically connected through the first connector 604, so that the power of the first power group 404 can be conducted to the first computing node backplane 603 through the first power backplane 602 and the first connector 604, thereby supplying power to the corresponding computing node 402.
[0184] The first power backplane 602 is provided with a plurality of first power connectors 605. The first power group 404 is fixedly mounted on the first power backplane 602 through the plurality of first power connectors 605, which not only ensures the stability of the power group installation, but also realizes the accurate transmission of power.
[0185] In one possible implementation, the second backplane group 502 includes a second power backplane 606 and a second computing node backplane 607.
[0186] The second power supply backplane 606 and the second computing node backplane 607 are parallel to each other and electrically connected.
[0187] The second power supply group 405 is electrically connected to the second power supply backplane 606 and is located on the side opposite to the second computing node backplane 607.
[0188] The second power supply backplane 606 is located on the first side of the second backplane group 502 and is connected to the second power supply group 405. The second computing node backplane 607 is located on the second side of the second backplane group 502 and is connected to the computing node 402 located on that side.
[0189] The second power supply backplane 606 and the second computing node backplane 607 are electrically connected through the second connector 608, enabling the conduction of power from the second power supply group 405 to the corresponding computing node 402.
[0190] The second power backplane 606 is provided with multiple second power connectors 609. The second power group 405 is fixedly mounted on the second power backplane 606 through these multiple second power connectors 609, ensuring that the power group is firmly installed and the power transmission is stable. Together with the first backplane group 501, it realizes the power supply and computing node docking functions of the whole machine.
[0191] In one possible implementation, the first computing node backplane 603 is electrically connected to a portion of the computing nodes 402; the second computing node backplane 607 is electrically connected to the remaining computing nodes 402.
[0192] The first computing node backplane 603 can be disposed within the first backplane group 501 and arranged parallel to the first power supply backplane 602. Some computing nodes are installed on the corresponding side of the first computing node backplane 603 or connected to it through connectors. The second computing node backplane 607 can be disposed within the second backplane group 502 and arranged parallel to the second power supply backplane 606. The remaining computing nodes are installed on the corresponding side of the second computing node backplane 607 or connected to it through connectors, so that different computing nodes can complete electrical connection and mechanical support on the two backplane units respectively.
[0193] Based on the above analysis, it can be seen that this structure can improve the organization and maintainability of computing node access without changing the overall server architecture, and helps to improve the overall deployment efficiency, connection reliability and adaptability to high computing scenarios.
[0194] In one possible implementation, a first computing node connector 610 corresponding to each computing node 402 is provided on the first computing node backplane 603, and a second computing node connector 611 corresponding to each computing node 402 is provided on the second computing node backplane 607.
[0195] Multiple computing nodes 402 are respectively connected to the corresponding first computing node connector 610 and second computing node connector 611.
[0196] The first computing node connector 610 and the second computing node connector 611 correspond one-to-one with the computing node 402, and their layout positions are precisely adapted to the installation positions of the computing node 402 to ensure smooth docking.
[0197] In terms of docking, multiple computing nodes 402 are bidirectionally docked with corresponding first computing node connectors 610 and second computing node connectors 611, that is, one end of each computing node 402 is connected to the corresponding first computing node connector 610 on the first computing node backplane 603, and the other end is connected to the corresponding second computing node connector 611 on the second computing node backplane 607.
[0198] On the one hand, this design achieves a stable connection between the computing node 402 and the two computing node backplanes, improving installation reliability. On the other hand, it enables the computing node 402 to obtain power from the first power supply group 404 and the second power supply group 405 through the first computing node connector 610 and the second computing node connector 611, further improving the stability and efficiency of server operation.
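The bidirectional docking described in [0197] and [0198] can be sketched as a mapping from each computing node to one connector position on each computing node backplane. This is a hedged illustration: the position labels such as `610#0`, the node identifiers, and both function names are hypothetical.

```python
def dock_compute_nodes(node_ids):
    """Give each compute node one connector 610 and one connector 611 position.

    The "610#i" / "611#i" labels are hypothetical position names.
    """
    return {
        node: {"first": f"610#{i}", "second": f"611#{i}"}
        for i, node in enumerate(node_ids)
    }


def powered_nodes(docking, first_group_ok, second_group_ok):
    """A node stays powered while at least one of its two feeds is live."""
    live = {"first": first_group_ok, "second": second_group_ok}
    return [
        node
        for node, feeds in docking.items()
        if any(live[side] for side in feeds)
    ]
```

Because every node is docked on both sides, the loss of one power supply group leaves all nodes powered in this model; only the loss of both groups removes power.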
[0199] In one possible implementation, the number of first power connectors 605 provided on the first power backplane 602 is greater than the number of second power connectors 609 provided on the second power backplane 606.
[0200] The differentiated design between the first power backplane 602 and the second power backplane 606 is tailored to their respective power supply requirements.
[0201] In one possible implementation, the number of power supply units in the first power supply group 404 is greater than the number of power supply units in the second power supply group 405.
[0202] In one possible implementation, the first power backplane 602 is provided with M first power connectors 605, where M is an integer greater than 1.
[0203] The second power backplane 606 is provided with M-1 second power connectors 609.
[0204] The first power supply group 404 needs to supply power to the corresponding computing node 402 and switching node 403, and M first power connectors 605 ensure the stability and load capacity of power transmission. The second power supply group 405 supplies power to the corresponding computing node 402, for which M-1 second power connectors 609 are sufficient. At the same time, requiring M to be greater than 1 avoids power interruption caused by the failure of a single connector, balancing redundancy against cost.
[0205] M first power connectors 605 correspond to M first power units 613.
[0206] The first power supply backplane 602 is equipped with a first power supply bus, and M first power supply units 613 are electrically connected to the first power supply bus to form a unified power supply transmission link.
[0207] M first power supply units 613 work together to transmit power to the first power supply bus, and then to the first computing node backplane 603, ultimately providing stable power to the computing nodes installed on the first computing node backplane 603 and ensuring the normal operation of the computing nodes.
[0208] The power of multiple first power supply units 613 is integrated through the first power supply bus to achieve power distribution and load balancing, avoid failure caused by excessive load on a single power supply unit, and ensure that power can be transmitted stably and efficiently to the first computing node backplane 603.
[0209] In this way, multiple first power supply units 613 work together to increase power supply capacity and meet the power supply needs of multiple computing nodes at the same time. Furthermore, the integration function of the first power supply bus achieves load balancing, extends the service life of the first power supply units 613, and improves power supply reliability. In addition, each first power supply unit 613 can be maintained and replaced individually, which reduces the difficulty of operation and maintenance, and the failure of a single power supply unit does not affect the normal operation of the overall power supply link.
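The load balancing and single-unit fault tolerance described for the first power supply bus can be sketched numerically. This sketch assumes ideal equal sharing among healthy units; the function names, the wattages, and the unit rating are hypothetical and are not figures from this application.

```python
def per_unit_load_w(total_load_w, healthy_units):
    """Load carried by each unit, assuming ideal equal sharing on the bus."""
    if healthy_units < 1:
        raise ValueError("at least one healthy power supply unit is required")
    return total_load_w / healthy_units


def survives_single_failure(total_load_w, units, unit_rating_w):
    """True if the remaining units can carry the load after one unit fails."""
    if units < 2:
        return False
    return per_unit_load_w(total_load_w, units - 1) <= unit_rating_w
```

For example, with a hypothetical 9000 W load and units rated at 3000 W each, four units tolerate the loss of one unit (3000 W per remaining unit), whereas three units do not (4500 W per remaining unit).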
[0210] M-1 second power connectors 609 correspond to M-1 second power units 614.
[0211] The working principle of the second power supply unit 614 can be referred to the working principle of the first power supply unit 613 mentioned above, and will not be repeated here.
[0212] In one possible implementation, a management controller 612 is also provided on the second power backplane 606, wherein the management controller 612 is electrically connected to at least one of the plurality of computing nodes 402, the plurality of switching nodes 403, the first power group 404, and the second power group 405.
[0213] The management controller 612 can monitor the operating status of each computing node 402, the signal transmission status of each switching node 403, and the power supply voltage and load status of the first power supply group 404 and the second power supply group 405 in real time. It can adjust the working mode of each component according to the operating requirements and promptly report abnormal status. At the same time, it can realize fault location, which facilitates later inspection and maintenance, greatly improves the controllability and operation and maintenance efficiency of the server, and fits the design concept of high efficiency and reliability of the whole machine.
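The monitoring and fault-location role of the management controller 612 can be sketched as a range check over readings. This is a minimal, hypothetical sketch: the `Reading` type, the metric names, and all limits are assumptions made for illustration and do not describe the controller's actual firmware or interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    source: str   # component that produced the reading
    metric: str   # hypothetical metric name
    value: float
    low: float    # hypothetical lower limit
    high: float   # hypothetical upper limit


def find_faults(readings):
    """Return the readings whose value lies outside its allowed range."""
    return [r for r in readings if not (r.low <= r.value <= r.high)]


def report(readings):
    """Format one line per abnormal reading, naming the component at fault."""
    return [
        f"{r.source}: {r.metric}={r.value} outside [{r.low}, {r.high}]"
        for r in find_faults(readings)
    ]
```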
[0214] Since the second power backplane 606 has M-1 second power connectors 609, one less than the first power backplane 602, the management controller 612 can be placed on the side of the second power backplane 606. This allows the first power backplane 602 to centrally arrange a sufficient number of first power connectors 605, fully ensuring that it meets the power supply load requirements of both the computing node 402 and the switching node 403, without needing to reserve installation space for the management controller. The second power backplane 606 can then use the space freed up by the one less power connector to install the management controller 612, achieving reasonable use of space. This not only does not affect the normal power supply of the second power group 405, but also allows the management controller 612 to interface with the core components in close proximity, balancing power supply reliability and efficient management.
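The connector counts in [0202], [0203], and [0214] can be sketched as a slot plan. The sketch assumes, purely for illustration, that both power backplanes offer the same number of mounting positions, so that the position not used by a power connector on the second power backplane 606 holds the management controller 612; the function name is hypothetical.

```python
def backplane_slot_plan(m):
    """Return the position lists for the first and second power backplanes.

    Assumes both backplanes have m positions (a simplification).
    """
    if m <= 1:
        raise ValueError("M must be an integer greater than 1")
    first = ["first power connector 605"] * m
    second = ["second power connector 609"] * (m - 1) + ["management controller 612"]
    return first, second
```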
[0215] The computing node 402 is described below with reference to Figure 7.
[0216] Figure 7 is a schematic diagram of a computing node provided in an embodiment of this application. Referring to Figure 7, the figure shows the computing node 402.
[0217] Computing node 402 includes a processor motherboard 701, a baseboard module 702, and a middle backplane 703;
[0218] The processor motherboard 701 and the baseboard module 702 are disposed on both sides of the middle backplane 703 along a third direction and are electrically connected through the middle backplane 703. The third direction is perpendicular to both the first direction and the second direction.
[0219] The first baseboard 7021 and the second baseboard 7022 in the baseboard module 702 are stacked along the second direction, and multiple accelerators are disposed on the first baseboard 7021 and the second baseboard 7022.
[0220] The processor motherboard 701 can be used to carry the server's main processor and provide control and computing interfaces to the outside world to realize instruction scheduling and storage access of computing nodes.
[0221] The processor motherboard 701 can be installed on the first or upper side of the computing node and is arranged opposite to the middle backplane 703 in the third direction. The two are electrically connected through a board-to-board connector or a high-speed signal connector, so that the high-speed bus signal, power supply signal and management signal of the processor side are introduced into the middle backplane and then distributed to the baseboard module 702.
[0222] During operation, the processor motherboard 701 performs processor calculations, memory access, and node control, and transmits the calculation results and control information to the baseboard module 702 through the middle backplane, thus forming the core control unit of the computing node.
[0223] The baseboard module 702 can be used to centrally arrange graphics processors, accelerators or other heterogeneous computing units in a limited space to improve the parallel processing capability of computing nodes.
[0224] The baseboard module 702 and the processor motherboard 701 are located on opposite sides of the middle backplane 703, and signals and power are relayed through the middle backplane 703.
[0225] The baseboard module 702 can be implemented as a double-layer stacked module.
[0226] Since the first baseboard 7021 and the second baseboard 7022 are stacked along the second direction, the overall thickness of the baseboard module 702 is mainly determined by the stack thickness of the two boards and the installation space of the heat dissipation components. It usually needs to match the installation height of the computing nodes in the chassis to avoid interference with adjacent components.
[0227] During operation, the baseboard module 702 carries multiple accelerators to perform parallel computing. The stacked structure shortens the interconnection distance between boards and reduces signal path loss, thereby improving data exchange efficiency and computing density. This is conducive to achieving a higher heterogeneous computing power configuration in a limited space.
[0228] The middle backplane 703 can be an intermediate electrical connection board located between the processor motherboard 701 and the baseboard module 702.
[0229] The middle backplane 703 can be sandwiched between the processor motherboard 701 and the baseboard module 702 along the third direction and electrically connected to both of them. Specifically, the transmission of signals and power can be accomplished through high-speed connectors, blind-mating terminals or inter-board docking structures.
[0230] When the system starts up, the processor motherboard 701 first distributes power and control signals to the baseboard module 702 through the backplane 703. Then, multiple accelerators in the baseboard module 702 simultaneously receive tasks and perform parallel calculations on the first baseboard 7021 and the second baseboard 7022. The calculation results are then returned to the processor motherboard via the backplane for aggregation, scheduling, and output. Since the processor motherboard and the baseboard module are respectively located on both sides of the backplane, and the baseboard module uses a stacked first and second baseboard to support multiple accelerators, it is possible to shorten the length of critical signal links, reduce interconnection complexity, and improve the deployment density and data exchange efficiency of heterogeneous computing units while maintaining a compact form factor of the computing node. This makes the computing node more suitable for high-density deployment and stable operation in high-performance servers.
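The task flow described above (distribution through the backplane 703, parallel computation on both substrates, aggregation on the processor motherboard) can be sketched roughly as follows. This is only an illustrative model: the accelerator count per substrate, the round-robin assignment, and the names `Substrate`, `dispatch`, and `aggregate` are assumptions, not details given in this application.

```python
from dataclasses import dataclass


@dataclass
class Substrate:
    name: str
    accelerators: int  # assumed count; the application does not fix a number


def dispatch(tasks, substrates):
    """Assign tasks round-robin to every accelerator slot on both substrates."""
    slots = [(s.name, i) for s in substrates for i in range(s.accelerators)]
    plan = {slot: [] for slot in slots}
    for n, task in enumerate(tasks):
        plan[slots[n % len(slots)]].append(task)
    return plan


def aggregate(plan, work):
    """Gather per-accelerator results back on the processor motherboard side."""
    return sorted(work(task) for assigned in plan.values() for task in assigned)


substrates = [
    Substrate("first substrate 7021", 4),
    Substrate("second substrate 7022", 4),
]
plan = dispatch(range(16), substrates)
results = aggregate(plan, lambda t: t * t)  # placeholder workload
```

The sketch only mirrors the scatter and gather roles; the actual scheduling policy is not specified in the description.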
[0231] In one possible implementation, a plurality of accelerators are disposed on a first side of a first substrate 7021, and a plurality of accelerators are disposed on a second side of a second substrate 7022.
[0232] A first substrate 7021 and a second substrate 7022 are stacked together, with the first side of the first substrate 7021 facing the second side of the second substrate 7022.
[0233] The processor motherboard 701 is connected to multiple accelerators on the first substrate 7021 and multiple accelerators on the second substrate 7022 via the middle backplate 703.
[0234] The substrate module 702 can be used to carry and arrange board-level components of multiple accelerators. Its function is to integrate the accelerators in a layered manner within a limited space to increase the device density of computing nodes and shorten the signal path between them and the processor motherboard.
[0235] The first substrate 7021 and the second substrate 7022 can be implemented using high-density interconnect printed circuit boards, metal substrates or composite laminates, respectively. After being stacked, they are kept at a predetermined distance by support pillars, positioning pins, screw connectors or snap-on brackets, and the first side of the first substrate 7021 faces the second side of the second substrate 7022, thereby forming a face-to-face double-layer arrangement structure.
[0236] The first substrate 7021 and the second substrate 7022 are carrier board structures used to mount multiple accelerators respectively. Together, they form a double-layer substrate module to support high-density computing devices and provide an electrical interconnection interface. Their function is to reduce the congestion caused by a single-layer planar arrangement by distributing the accelerators on two opposing substrates, allowing similar devices to be arranged in a vertically partitioned manner. This also enables the processor motherboard 701 to access the upper and lower layers of accelerators in parallel through the backplane 703, thereby improving board-level integration efficiency. The first substrate 7021 is located on the upper layer or one side of the stacked structure, and the second substrate 7022 is located on the lower layer or the opposite side. The two are fixed together by spacers, positioning frames, or stacked connectors. The first side of the first substrate 7021 faces the second side of the second substrate 7022, so that the working surfaces of the two accelerator-carrying components correspond to each other.
[0237] The processor motherboard 701 establishes stable connections with multiple accelerators on the first substrate 7021 and multiple accelerators on the second substrate 7022 via a backplane 703. The backplane 703 acts as a signal interaction hub, precisely transmitting control commands and computational tasks from the processor motherboard 701 to each accelerator, while simultaneously aggregating and feeding back the computational results from each accelerator to the processor motherboard 701. This achieves efficient collaboration between the processor motherboard and the accelerators, further unleashing the computational potential of the computing node and ensuring that the computing node can stably handle high-load computational demands.
[0238] Because the first substrate 7021 and the second substrate 7022 are stacked, with the first side of the first substrate 7021 facing the second side of the second substrate 7022, the accelerators form a compact and orderly distribution across the upper and lower layers. Devices that would otherwise be deployed on a single plane are spread over two adjacent working surfaces, which reduces the device density on each single-layer board. In this process, the backplane 703 acts as a cross-layer transfer mechanism. On the one hand, it transmits the high-speed data link, control signals, and management signals of the processor motherboard 701 to the accelerators on the two substrates. On the other hand, it transmits the status information, calculation results, or fault information generated by each accelerator back to the processor motherboard 701, enabling unified scheduling. Since the accelerators are arranged on opposing sides of the first substrate 7021 and the second substrate 7022, wiring can be routed along the backplane 703 over short distances, reducing the cross-wiring, signal detours, and device obstruction that tend to occur in traditional single-layer high-density arrangements. The arrangement also leaves clearer space for inter-board heat dissipation, local airflow organization, or liquid cooling plate placement.
[0239] Based on the above analysis, it can be seen that this structure can increase the number of accelerators integrated without significantly increasing the area occupied by the computing node plane, and achieve unified connection of the upper and lower double-layer accelerators through the middle backplane. Therefore, it helps to improve the computing power density of the computing node, improve the internal wiring order, and reduce the degree of interference to single-layer devices during maintenance.
[0240] In one possible implementation, computing node 402 further includes a first liquid cooling plate 704, which is disposed between a first substrate 7021 and a second substrate 7022.
[0241] The first liquid cooling plate 704 is connected to the first liquid cooling pipe and the second liquid cooling pipe respectively.
[0242] The first liquid cooling plate 704 is disposed between the first substrate 7021 and the second substrate 7022, and is closely attached to the opposite surfaces of the two substrates. It can directly absorb the heat generated by the accelerators on the two substrates during operation, achieve efficient heat dissipation, and avoid high temperature affecting the accelerator's computing performance and the lifespan of the substrate module.
[0243] The first liquid cooling plate 704 is connected to the first liquid cooling pipe and the second liquid cooling pipe, respectively, and is connected to the server's overall liquid cooling circulation system. During operation, the liquid cooling medium flows into the first liquid cooling plate 704 through the first liquid cooling pipe, absorbs heat, and is then discharged through the second liquid cooling pipe, completing the heat circulation and discharge.
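The heat pickup described above follows the usual steady-state energy balance for a single-phase coolant, Q = m_dot * c_p * (T_out - T_in). A minimal sketch of that relation is given below; the water-like specific heat, the heat load, the flow rate, and the inlet temperature are illustrative assumptions and do not come from this application.

```python
def coolant_outlet_temp_c(heat_w, flow_kg_s, inlet_c, cp_j_per_kg_k=4186.0):
    """Outlet temperature of a liquid cooling plate at steady state.

    heat_w        -- heat absorbed from the accelerators, in watts (assumed)
    flow_kg_s     -- coolant mass flow through the plate, in kg/s (assumed)
    inlet_c       -- temperature entering via the first liquid cooling pipe
    cp_j_per_kg_k -- specific heat; default approximates water (assumption)
    """
    if flow_kg_s <= 0:
        raise ValueError("flow rate must be positive")
    return inlet_c + heat_w / (flow_kg_s * cp_j_per_kg_k)


# Hypothetical operating point: 4 kW absorbed, 0.1 kg/s flow, 35 C inlet.
outlet_c = coolant_outlet_temp_c(heat_w=4000.0, flow_kg_s=0.1, inlet_c=35.0)
rise_c = outlet_c - 35.0
```

For a fixed heat load, the temperature rise across the plate scales inversely with flow rate, which is why the pipe connections to the server's overall circulation loop matter for the plate's effectiveness.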
[0244] The computing node is further explained below in conjunction with Figure 8.
[0245] Figure 8 is a schematic diagram of another computing node structure provided in an embodiment of this application. Referring to Figure 8, the figure shows the computing node 402.
[0246] The computing node also includes a network card interface module 803, a network card 804, and a hard disk module 805, wherein:
[0247] The processor motherboard 701, the second liquid cooling plate 802, and the network card interface module 803 are stacked along the second direction;
[0248] The network card 804 and hard disk module 805 are laid flat on the side of the network card interface module 803, away from the processor motherboard 701;
[0249] The second liquid cooling plate 802 is connected to the first liquid cooling plate 704;
[0250] The sum of the dimensions of the hard disk module 805, the network card interface module 803, and the processor motherboard 701 along the second direction is less than or equal to the dimension of the baseboard module 702 along the second direction.
[0251] Among them, the network card interface module 803 can be used to carry the network card electrical connection interface and realize the interconnection between the computing node and the external network. Its function is to provide stable plug-in, fixation and signal conversion conditions for the network card 804, and transmit the network signal to the processor motherboard 701 or related circuits on the backplane via the interface.
[0252] The network card 804 can be a communication component that provides network connectivity, enabling data transmission between the server and external networks, storage, or other nodes.
[0253] The hard disk module 805 can be a module for installing storage media, whose function is to provide local data storage for computing nodes.
[0254] The processor motherboard 701 and the network card interface module 803 are stacked along the second direction. The network card 804 and the hard disk module 805 are laid flat on the side of the network card interface module 803 away from the processor motherboard 701. The first signal connector group and the second signal connector group on the second surface of the network card interface module 803 are located on the side close to the middle backplate, thus forming a plug-in structure facing the middle backplate.
[0255] The sum of the dimensions of the processor motherboard 701, the network card interface module 803, and the hard disk module 805 along the second direction is less than or equal to the dimension of the substrate module 702 along the second direction, so as to ensure that they can be stacked and accommodated.
[0256] The network card 804 can be installed on top of the network card interface module 803 and communicate with the processor motherboard 701 or the accelerator board through the first signal connector.
[0257] The network card 804 and the hard disk module 805 can be laid flat on the side of the network card interface module 803 away from the processor motherboard 701. The hard disk module 805 can communicate with the processor motherboard 701 through the second signal connector.
[0258] The processor motherboard 701 and the network card interface module 803 are stacked along the second direction. During operation, the network card interface module 803, as the unit carrying the network function, connects the external network interface to the internal high-speed channel. The network card 804 is installed flat on the side of the network card interface module 803 away from the processor motherboard 701 and, with a small installation thickness, completes link establishment, rate negotiation, and message transmission and reception. The hard disk module 805 is also installed flat on this side and handles reads and writes to the local storage medium, so that both the network communication link and the storage link fit within the limited space along the second direction.
[0259] Since the sum of the dimensions of the hard disk module 805, the network card interface module 803, and the processor motherboard 701 along the second direction is less than or equal to the dimension of the baseboard module 702 along the second direction, each module can be stably accommodated within the installation range reserved by the baseboard module 702 after assembly, and maintain the necessary assembly gap and heat dissipation space to avoid collision of the stacked structure or exceeding the outer envelope.
[0260] Based on the above arrangement, network data can quickly enter the processor motherboard 701 via the network card interface module 803 and the network card 804, and storage data can be exchanged with the processor motherboard 701 via the hard disk module 805. This enables the computing node 402 to achieve compact integration of network and storage functions while maintaining a high computing power configuration, thereby improving the functional density per unit volume and reducing wiring complexity, and thus improving the overall assembly efficiency and system integration.
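The dimensional condition stated above (the summed dimensions of the hard disk module 805, the network card interface module 803, and the processor motherboard 701 along the second direction must not exceed that of the baseboard module 702) can be checked with a few lines of code. The millimetre values below are purely hypothetical placeholders; the application gives no numeric dimensions.

```python
def fits_within_baseboard(stack_mm, baseboard_mm):
    """Return (fits, margin) for a stack measured along the second direction.

    stack_mm     -- mapping of component name to its dimension in mm
    baseboard_mm -- dimension of the baseboard module 702 in mm
    """
    total = sum(stack_mm.values())
    return total <= baseboard_mm, baseboard_mm - total


# Hypothetical dimensions, for illustration only.
stack = {
    "hard disk module 805": 15,
    "network card interface module 803": 10,
    "processor motherboard 701": 20,
}
ok, margin = fits_within_baseboard(stack, baseboard_mm=50)
```

A non-negative margin corresponds to the assembly gap and heat dissipation space mentioned in the description; a negative margin would mean the stack exceeds the outer envelope of the baseboard module.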
[0261] In one possible implementation, computing node 402 further includes a second liquid-cooled plate 802, wherein,
[0262] The second liquid cooling plate 802 is disposed between the network card interface module 803 and the processor motherboard 701 along the second direction;
[0263] The second liquid cooling plate 802 is connected to the first liquid cooling plate 704.
[0264] The second liquid cooling plate 802 can be used to conduct heat and cool the heat source on the processor motherboard 701 side. Its function is to work with the first liquid cooling plate 704 to form a graded liquid cooling path to reduce the temperature rise of the processor motherboard and its adjacent interface areas and improve heat distribution.
[0265] The second liquid cooling plate 802 can be arranged between the processor motherboard 701 and the network card interface module 803, and is connected to the first liquid cooling plate 704 through pipe joints, liquid guiding channels or thermal coupling structures. After the first liquid cooling plate undertakes the main heat dissipation on the substrate module side, the second liquid cooling plate further compensates for the local hot spots near the motherboard and interface area.
[0266] Based on the above structural configuration, after the second liquid cooling plate 802 is connected to the first liquid cooling plate 704, the two jointly form a through-type liquid cooling heat conduction link: the heat of the accelerators in the substrate module 702 is first discharged through the first liquid cooling plate 704, and the second liquid cooling plate 802 then further cools the processor motherboard 701 and its adjacent areas. At the same time, the network card interface module 803 is located between the processor motherboard 701 on one side and the network card 804 and hard disk module 805 on the other, so it combines signal connection, fixed support, and spatial isolation, and prevents the network card 804 and the hard disk module 805 from directly occupying the main heat dissipation channel. Because the processor motherboard 701, the second liquid cooling plate 802, and the network card interface module 803 are stacked, and the network card 804 and the hard disk module 805 are laid flat on the side away from the processor motherboard 701, the stacking height of the entire computing node is controlled. The first height formed by the hard disk module 805, the network card interface module 803, the second liquid cooling plate 802, and the processor motherboard 701 can be less than or equal to the second height formed by the baseboard module 702, so that the newly added network and storage functions do not exceed the original height boundary of the baseboard module. Thus, during system startup and operation, the data flow can enter the network card interface module 803 via the network card 804 and be forwarded to the processor motherboard 701, and storage access requests can be answered locally by the hard disk module 805, while the heat generated by the processor motherboard and its control circuits is dissipated step by step through the second liquid cooling plate 802 and the first liquid cooling plate 704. The signal paths, storage paths, and heat dissipation paths thereby form a layout that is relatively independent yet compact and coordinated.
[0267] This approach not only improves the utilization of internal space in computing nodes and reduces interference from stacked devices to surrounding structures, but also integrates network and storage functions without significantly increasing node height. Furthermore, the use of dual liquid cooling plates reduces localized heat buildup, thereby enhancing the uniformity of heat dissipation, connection stability, and ease of assembly and maintenance of servers under high-density deployment conditions.
[0268] The processor motherboard 701, the second liquid cooling plate 802, and the network card interface module 803 are stacked in sequence and closely fitted together, which not only makes reasonable use of the internal vertical space of the computing node, but also enables efficient collaboration among the components.
[0269] The network card 804 and hard disk module 805 are centrally located on the side of the network card interface module 803 away from the processor motherboard 701. This layout enables the partitioned installation of the network card and hard disk, avoiding interference with components such as the processor motherboard and liquid cooling plate. It also facilitates the individual disassembly and maintenance of the network card and hard disk module. The network card 804 is used to realize the network connection between the computing node 402 and external devices, ensuring smooth data exchange. The hard disk module 805 is used to store computing data and related programs, improving the storage capacity of the computing node.
[0270] The network card 804 can include a first network card and a second network card.
[0271] For example, the first network card can be an InfiniBand high-speed network card, and the second network card can be a 25G network card.
[0272] The second liquid cooling plate 802 is connected to the first liquid cooling plate 704, enabling coordinated heat dissipation between the two liquid cooling plates and forming a complete internal liquid cooling loop for the computing node. The second liquid cooling plate 802 can directly absorb the heat generated by the processor motherboard 701 and the network card interface module 803 during operation. Through its connection with the first liquid cooling plate 704, the heat is conducted to the first liquid cooling plate 704, and then discharged to the overall heat dissipation system via the first and second liquid cooling pipes, further improving the heat dissipation efficiency of the computing node.
[0273] The assembly of computing nodes is performed through the following steps:
[0274] Step 1: Fix the backplate 703 to the chassis using structural components, connect the processor motherboard 701 to the backplate 703, stack and assemble the first substrate 7021, the first liquid cooling plate 704, and the second substrate 7022, and connect the assembled first substrate 7021, the first liquid cooling plate 704, and the second substrate 7022 to the backplate 703.
[0275] Step 2: Assemble the second liquid cooling plate 802 onto the processor motherboard 701;
[0276] Step 3: Assemble the network card 804 and hard disk module 805 onto the network card interface module 803;
[0277] Step 4: Assemble the assembled network card interface module 803 onto the second liquid cooling plate 802, and make the network card 804 and hard disk module 805 contact the second liquid cooling plate 802 respectively to achieve heat dissipation.
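The four steps above imply an ordering between sub-operations, which can be expressed as a small dependency table and checked programmatically. The sketch below is one reading of the steps; the prerequisite table `PREREQS` and the helper `is_valid_order` are illustrative assumptions, not part of this application.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each sub-operation maps to the operations that must be finished first.
# This table is an interpretation of Steps 1 to 4, not a quoted requirement.
PREREQS = {
    "fix backplane 703 to chassis": set(),
    "connect motherboard 701 to backplane 703": {"fix backplane 703 to chassis"},
    "stack 7021 / 704 / 7022": set(),
    "connect stacked substrates to backplane 703": {
        "fix backplane 703 to chassis",
        "stack 7021 / 704 / 7022",
    },
    "mount cooling plate 802 on motherboard 701": {
        "connect motherboard 701 to backplane 703",
    },
    "mount 804 and 805 on module 803": set(),
    "mount module 803 on cooling plate 802": {
        "mount cooling plate 802 on motherboard 701",
        "mount 804 and 805 on module 803",
    },
}


def is_valid_order(order, prereqs):
    """True if every operation appears once and after all its prerequisites."""
    position = {step: i for i, step in enumerate(order)}
    if len(order) != len(prereqs) or set(position) != set(prereqs):
        return False
    return all(position[p] < position[s] for s, ps in prereqs.items() for p in ps)


# The order in which the description lists the operations.
DOCUMENTED_ORDER = [
    "fix backplane 703 to chassis",
    "connect motherboard 701 to backplane 703",
    "stack 7021 / 704 / 7022",
    "connect stacked substrates to backplane 703",
    "mount cooling plate 802 on motherboard 701",
    "mount 804 and 805 on module 803",
    "mount module 803 on cooling plate 802",
]

# Any topological order of the table is also acceptable under this reading.
one_valid_order = list(TopologicalSorter(PREREQS).static_order())
```

Under this reading, the substrate stack and the network card and hard disk sub-assembly have no mutual dependency, so they could be prepared in parallel before being attached.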
[0278] The processor motherboard 701 is further explained below in conjunction with Figure 9.
[0279] Figure 9 is a schematic diagram of a processor motherboard provided in an embodiment of this application. Referring to Figure 9, the figure shows the processor motherboard 701.
[0280] The processor motherboard 701 includes a processor 901, multiple memory modules 902, an input power connector 903, multiple power supply connectors 904, and multiple signal connectors 905;
[0281] An input power connector 903, multiple power supply connectors 904, and multiple signal connectors 905 are disposed on the first side of the processor motherboard 701, which is the side closest to the backplate 703.
[0282] The processor 901 is located in the central area of the processor motherboard 701, and multiple memory modules 902 are located on both sides of the processor 901.
[0283] The processor motherboard 701 may also include a power converter 906 and a first hard disk 907.
[0284] The power converter 906 can convert the first voltage value corresponding to the power supply to the second voltage value corresponding to the motherboard.
[0285] The processor 901, multiple memory modules 902, multiple power supply connectors 904, and multiple signal connectors 905 are all located on the first side of the processor motherboard 701, which is close to the second liquid cooling plate 802, so as to facilitate contact with the second liquid cooling plate 802 and achieve efficient heat dissipation.
[0286] The input power connector 903, multiple power supply connectors 904 and multiple signal connectors 905 are located on the first side of the processor motherboard 701, which is the side close to the backplane 703, so as to facilitate precise docking with the backplane 703 and realize stable power and signal transmission.
[0287] The processor 901 is located in the central area of the processor motherboard 701, and multiple memory modules 902 are respectively located on both sides of the processor 901, forming a symmetrical layout. This layout can shorten the signal transmission path between the processor 901 and the memory modules 902, reduce signal interference, improve data read and write efficiency, and ensure efficient collaboration between the processor and the memory.
[0288] In one possible implementation, the plurality of power supply connectors 904 include a first power supply connector 9041, a second power supply connector 9042, and a third power supply connector 9043;
[0289] The first power supply connector 9041 is used to supply power to the hard disk module 805;
[0290] The second power supply connector 9042 is used to supply power to the network card 804;
[0291] The third power supply connector 9043 is used to supply power to the network card interface module 803.
[0292] The first power supply connector 9041 is used to supply power to the hard disk module 805, ensuring the data storage and retrieval functions of the hard disk module.
[0293] The second power supply connector 9042 is used to supply power to the network card 804, ensuring a stable network connection for the network card.
[0294] The third power supply connector 9043 is used to supply power to the network card interface module 803 and to support the signal conversion function of the network card interface module.
[0295] The multiple signal connectors 905 include a first signal connector 9051, a second signal connector 9052, and a third signal connector 9053;
[0296] The first signal connector 9051 is used to connect the network card 804;
[0297] The second signal connector 9052 is used to connect the hard disk module 805;
[0298] The third signal connector 9053 is used to connect the substrate module 702.
[0299] The first signal connector 9051 is used to connect the network card 804 to enable high-speed data interaction between the processor motherboard and the network card.
[0300] The second signal connector 9052 is used to connect the hard disk module 805 to ensure smooth data transmission between the processor and the hard disk.
[0301] The third signal connector 9053 is used to connect the substrate module 702 to realize the signal interface between the processor motherboard and the accelerator on the substrate module, supporting high-performance computing tasks.
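The connector assignments listed above can be summarized as a lookup table. The sketch below simply restates the reference numerals from the description; the dictionary layout and the helper `connectors_for` are illustrative choices, not something the application defines.

```python
# Power supply connectors 904 on the processor motherboard 701 and their loads.
POWER = {
    "9041": "hard disk module 805",
    "9042": "network card 804",
    "9043": "network card interface module 803",
}

# Signal connectors 905 on the processor motherboard 701 and their peers.
SIGNAL = {
    "9051": "network card 804",
    "9052": "hard disk module 805",
    "9053": "substrate module 702",
}


def connectors_for(target):
    """List every motherboard connector that serves the given component."""
    return sorted(
        ref
        for table in (POWER, SIGNAL)
        for ref, served in table.items()
        if served == target
    )


nic_connectors = connectors_for("network card 804")
disk_connectors = connectors_for("hard disk module 805")
```

Laid out this way, the table shows that the network card 804 and the hard disk module 805 each have one dedicated power connector and one dedicated signal connector, which matches the clear division of power and signal interfaces noted in the description.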
[0302] The layout design of the processor motherboard not only ensures efficient collaboration of core components, but also improves the overall reliability and maintainability of the computing node through clear division of power and signal interfaces, which is highly consistent with the overall system's efficient and reliable design philosophy.
[0303] The network card interface module 803 is further explained below in conjunction with Figure 10.
[0304] Figure 10 is a schematic diagram of a network card interface module provided in an embodiment of this application. Referring to Figure 10, the figure shows the network card interface module 803.
[0305] The first side of the network card interface module 803 is provided with multiple network interfaces 1001 and multiple network interface controllers, and the multiple network interface controllers are respectively connected to the corresponding network interfaces 1001.
[0306] Multiple network interfaces 1001 are arranged sequentially on the first side of the network card interface module 803, which is the side away from the middle backplane;
[0307] A first signal connector group 1002 and a second signal connector group 1003 are provided on the second side of the network card interface module 803;
[0308] The first signal connector group 1002 and the second signal connector group 1003 are disposed on the second side of the network card interface module 803, which is the side closer to the middle backplate;
[0309] The first signal connector group 1002 is used to connect the first substrate, and the second signal connector group 1003 is used to connect the second substrate.
[0310] The network card interface module 803 has multiple network interfaces 1001 and multiple network interface controllers on its first side. The network interface controllers are connected to the corresponding network interfaces 1001 to parse, forward, and control network signals and to keep the network connection stable and fast.
[0311] Multiple network interfaces 1001 are arranged sequentially on the first side of the network card interface module 803. This first side is the side away from the middle backplane, which facilitates direct connection with external network cables and makes network expansion more convenient.
[0312] The second side of the network card interface module 803 is provided with a first signal connector group 1002 and a second signal connector group 1003. Both are located on the second side of the network card interface module 803, which is the side close to the middle backplate, so as to facilitate precise docking with the middle backplate and the base plate module.
[0313] The first signal connector group 1002 is used to connect to the first substrate, and the second signal connector group 1003 is used to connect to the second substrate, so as to realize high-speed signal interaction between the network card interface module and the accelerator on the substrate module, efficiently transmit network data to each accelerator for processing, and at the same time feed back the calculation results to the network interface, ensuring the network and computing collaboration efficiency of the computing node.
[0314] The layout design of this network card interface module not only realizes the core functions of network expansion and signal transfer, but also optimizes space utilization through double-sided layout, improves the overall integration and maintainability of computing nodes, and is highly consistent with the design concept of high efficiency and reliability of the whole machine.
[0315] The first substrate 7021 is explained below in conjunction with Figure 11.
[0316] Figure 11 is a schematic diagram of the structure of a first substrate provided in an embodiment of this application. Referring to Figure 11, the figure shows the first substrate 7021.
[0317] A first switch chip 1101 and multiple fourth signal connectors 1102 are also provided on the first side of the first substrate 7021;
[0318] Multiple fourth signal connectors 1102 are respectively disposed on both sides of the first switch chip 1101;
[0319] One end of the fourth signal connector 1102 is connected to the connector in the first signal connector group 1002, and the other end is connected to the first switch chip 1101.
[0320] The multiple fourth signal connectors 1102 are symmetrically arranged on both sides of the first switch chip 1101. The layout is neat, and the connectors sit close to the first switch chip 1101, which shortens the signal transmission path and reduces signal interference.
[0321] One end of the fourth signal connector 1102 is connected to the corresponding connector in the first signal connector group 1002 of the network card interface module 803, and the other end is connected to the first switch chip 1101. Its core function is to transfer the network signal transmitted by the network card interface module to the first switch chip 1101, and the first switch chip 1101 performs signal distribution and processing to realize efficient interaction between the network signal and the components of the first substrate.
[0322] In one possible implementation, a fifth signal connector 1103 and a plurality of first orthogonal connectors 1104 are also provided on the first side of the first substrate;
[0323] The fifth signal connector 1103 is disposed on the first side of the first substrate 7021, and a plurality of first orthogonal connectors 1104 are disposed on the second side of the first substrate 7021. The first side of the first substrate 7021 is the side close to the middle back plate, and the second side of the first substrate 7021 is the side close to the switching node 403.
[0324] The fifth signal connector 1103 is used to connect to the third signal connector 9053 of the processor motherboard 701;
[0325] Multiple first orthogonal connectors 1104 are used to connect multiple switching nodes 403.
[0326] The fifth signal connector 1103 is disposed on the first side of the first substrate 7021, which is the side close to the middle back plate; a plurality of first orthogonal connectors 1104 are disposed on the second side of the first substrate 7021, which is the side close to the switching node 403. The layout position is precisely adapted to the docking components, thereby improving docking efficiency and stability.
[0327] The fifth signal connector 1103 can be used to connect to the third signal connector 9053 of the processor motherboard 701, so as to realize the signal communication between the first substrate 7021 and the processor motherboard 701, transmit the control instructions of the processor to each component of the first substrate, and at the same time feed back the working data of the accelerator and switch chip on the first substrate to the processor.
[0328] Multiple first orthogonal connectors 1104 can be used to connect multiple switching nodes 403, realize orthogonal signal interaction between the first substrate and the switching nodes 403, ensure smooth data transmission between the computing node 402 and the switching node 403, and support the overall signal coordination.
[0329] The first substrate also includes an accelerator 1105, a power connector 1106, and a programmable logic device 1107.
[0330] Accelerator 1105 can be used to enhance the computing power of the first substrate and work with other accelerators to complete high-load computing tasks.
[0331] The accelerator can employ hardware acceleration devices such as graphics processing units (GPUs) and field-programmable gate arrays (FPGAs).
[0332] The power connector 1106 can be used to access power and provide stable power support for all electronic components on the first substrate.
[0333] The programmable logic device 1107 can be flexibly configured with signal processing logic according to actual operating requirements, improving the adaptability and expandability of the first substrate and adapting to the computing and signal processing needs in different scenarios.
[0334] The overall layout of the first substrate 7021 takes into account multiple functions such as signal transmission, computing, and power supply. The components are clearly divided and closely connected, which not only improves space utilization but also ensures the stable implementation of each function. The collaborative design with the substrate module, processor motherboard, network card interface module and switching node further improves the operating efficiency and reliability of the computing node and the whole machine.
[0335] In one possible implementation, a second switch chip and a plurality of sixth signal connectors are also disposed on the second side of the second substrate;
[0336] Multiple sixth signal connectors are respectively located on both sides of the second switch chip;
[0337] One end of the sixth signal connector is connected to the connector in the second signal connector group, and the other end is connected to the second switch chip.
[0338] The working principle of the second substrate is similar to that of the first substrate, and will not be elaborated here.
[0339] The switching node 403 is described below in conjunction with Figure 12.
[0340] Figure 12 is a schematic diagram of a switching node provided in an embodiment of this application. Referring to Figure 12, the figure shows the switching node 403.
[0341] The switching node 403 includes a second orthogonal connector 1201, a switching processor 1202, a hardware manager 1203, and a third liquid cooling pipe 1204;
[0342] The second orthogonal connector 1201 is disposed on the first side of the switching node 403, and the third liquid cooling pipe 1204 is disposed on the second side of the switching node 403. The first side of the switching node 403 is the side closer to the computing node, and the second side of the switching node 403 is the side farther away from the computing node.
[0343] Placing the second orthogonal connector 1201 on the first side of the switching node 403 and the third liquid cooling pipe 1204 on the second side avoids interference between components and lets each mate precisely with its corresponding docking component.
[0344] The second orthogonal connector 1201 is used to connect to the computing nodes; specifically, it connects to the first orthogonal connectors on the first and second substrates. This orthogonal connection enables high-speed signal interaction between the switching node and the computing nodes, ensuring smooth data transmission and supporting collaborative computing across multiple computing nodes.
[0345] The switching processor 1202 can be used to receive and analyze various signals, complete signal distribution and forwarding, regulate signal transmission paths, and improve signal interaction efficiency.
[0346] The hardware manager 1203 can be used to monitor the operating status of each component of the switching node and promptly report fault information, which facilitates troubleshooting and repair by maintenance personnel and helps ensure the stable operation of the switching node.
[0347] The third liquid cooling pipe 1204 can be used to dissipate heat from various electronic components of the switching node, and promptly remove the heat generated by components such as the switching processor 1202 during operation, so as to avoid high temperature affecting signal transmission and equipment life. It works in conjunction with the whole liquid cooling system to form all-round heat dissipation support.
[0348] The overall layout and functional design of the switching node 403 not only achieves efficient connection with the computing node, but also ensures its own operational stability and heat dissipation reliability. It is highly consistent with the collaborative design concept of various server components, further improving the signal interaction efficiency and operational stability of the whole machine.
[0349] The server architecture as a whole is described below in conjunction with Figure 13.
[0350] Figure 13 is a schematic diagram of a server architecture topology provided in an embodiment of this application. Referring to Figure 13, the architecture includes multiple computing nodes 402, multiple switching nodes 403, power supply components 1301, a management controller 612, and a clock buffer 1305. The power supply components 1301 include a first power supply group and a second power supply group.
[0351] The computing node 402 is the core computing carrier of the server. Its internal structure includes a processor motherboard 701, a first substrate 7021, a second substrate 7022, a first signal connector group 1002 and a second signal connector group 1003 in the network card interface module, a network card 804, and a hard disk module 805. Each component has a clear division of labor and works closely together to form a complete computing unit.
[0352] The processor motherboard 701 is the control and computing core of the computing node 402, integrating a processor 901, multiple memory modules 902, a power converter 906, multiple power supply connectors 904, multiple signal connectors 905, and a hardware management controller 1302. The hardware management controller 1302 is responsible for monitoring and controlling overall system operation. It establishes communication connections with the processor 901, the power supply components 1301, and the hardware of each node, and can collect system-wide operating parameters in real time, enabling rapid fault diagnosis, remote operation and maintenance management, and intelligent power control, which helps keep the server running stably.
[0353] The first substrate 7021 serves as the signal switching and control core within the computing node 402, comprising a first switch chip 1101, multiple fourth signal connectors 1102, a fifth signal connector 1103, multiple first orthogonal connectors 1104, an accelerator 1105, a power connector 1106, and a programmable logic device 1107. The fourth signal connectors 1102 employ a bidirectional connection design, with one end connecting to the first signal connector group 1002 of the network card interface module and the other end connecting to the first switch chip 1101, enabling efficient switching and distribution of network signals and ensuring smooth network data transmission.
[0354] The programmable logic device 1107 can be the core component for internal management and control of the computing node 402. It can be used for logic control, signal routing and status management of various components. Its functions are mainly implemented by its integrated sideband management interface 1303 and multiple sets of dedicated control signals.
[0355] The sideband management interface 1303 can receive the I2C Intelligent Platform Management Bus signal 1306 from the hardware management controller 1302 to realize out-of-band management. Through this signal it receives upper-level management instructions and data, which form the instruction basis on which the programmable logic device 1107 executes its control operations.
[0356] The sideband management interface 1303 can receive a global reset signal 1307, which is used to perform a reset operation on the entire computing node or specific internal functional modules, and can quickly restore the normal operation of abnormal modules, ensuring the stability of node operation.
[0357] The sideband management interface 1303 can receive the chassis/node ID signal 1308, which identifies the physical chassis and the computing node number where the programmable logic device 1107 is located. This gives each node an identity for multi-node cluster management, making it easier for the upper-level controller to distinguish nodes and manage the cluster in a unified way.
[0358] The programmable logic device 1107 connects to the accelerators 1105 through multiple sets of dedicated signals, enabling presence detection, status monitoring, and instruction issuance for the accelerators 1105 and supporting their resource scheduling and fault diagnosis. These may include the following signals:
[0359] The presence detection signals 1309 of the multiple accelerators 1105 detect in real time whether each accelerator 1105 is properly connected to the node. The detection result is the basis for power control and accelerator 1105 resource scheduling, and avoids performing invalid control operations on accelerators 1105 that are not present.
[0360] The 0th UART serial port signal 1310 and the 1st UART serial port signal 1311 of each of the multiple accelerators 1105 are mainly used to receive the operating status data and debugging information reported by the accelerators 1105, realize the real-time monitoring of the operating status of each accelerator 1105, and facilitate the rapid detection of accelerator 1105 abnormalities.
[0361] The second UART serial port signal 1312 of each of the multiple accelerators 1105 is used to send control commands and configuration parameters to that accelerator 1105, including but not limited to debugging commands and working mode switching parameters, so as to realize fine-grained control of the accelerators 1105.
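The signal roles described above can be illustrated with a minimal Python sketch. The names `AcceleratorLink` and `dispatch_command`, the slot numbering, and the command string are hypothetical and do not come from this application. The sketch only models one rule stated in the text: commands go out on the second UART channel, and only to accelerators whose presence detection signal is asserted.

```python
from dataclasses import dataclass, field


@dataclass
class AcceleratorLink:
    """Hypothetical model of the signals between the PLD 1107 and one accelerator 1105."""
    slot: int
    present: bool  # presence detection signal 1309
    # Status and debug data reported on UART 0 / UART 1 (signals 1310, 1311).
    status_log: list = field(default_factory=list)
    # Commands and parameters sent on UART 2 (signal 1312).
    sent_commands: list = field(default_factory=list)


def dispatch_command(links, command):
    """Send a command to every present accelerator and return the skipped slots."""
    skipped = []
    for link in links:
        if link.present:
            link.sent_commands.append(command)
        else:
            # No control operation is attempted on an absent accelerator.
            skipped.append(link.slot)
    return skipped


# Four accelerator slots on one substrate; slot 2 is assumed to be empty.
links = [AcceleratorLink(slot=i, present=(i != 2)) for i in range(4)]
skipped = dispatch_command(links, "SET_MODE debug")
```

In this sketch `skipped` lists the empty slots, which an upper-level controller could use for resource scheduling or fault diagnosis.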
[0362] The programmable logic device 1107 can also be used to control the power supply and clock signal distribution within the node, ensuring stable power supply and timing synchronization of each component, and can include the following signals:
[0363] The power enable signal 1313 directly controls turning the power module 1304 on and off. The programmable logic device 1107 controls the start and stop of the power module 1304 according to the presence status of the accelerators 1105 and upper-level management instructions, so as to save energy and protect the power supply under fault conditions.
[0364] The 100MHz clock signal 1314 is output by the programmable logic device 1107 to the clock buffer 1305. After being buffered and driven by the buffer, it provides a synchronous clock reference to ensure the timing stability and normal operation of the module.
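The power enable decision described above can be sketched as a small Boolean function. This is an illustration under stated assumptions rather than the logic actually implemented in the programmable logic device 1107: the function name, the parameter names, and the explicit `fault_detected` input are hypothetical, since the text names only the presence status and the upper-level management instruction as inputs and mentions fault protection as a goal.

```python
def power_enable(accelerator_present: bool,
                 management_allows_power: bool,
                 fault_detected: bool = False) -> bool:
    """Return the assumed level of the power enable signal 1313.

    Assumption: the power module 1304 is enabled only when the accelerator
    is present, the upper-level management instruction permits power, and
    no fault has been flagged.
    """
    return accelerator_present and management_allows_power and not fault_detected


# An empty slot stays unpowered (energy saving).
empty_slot = power_enable(accelerator_present=False, management_allows_power=True)

# A populated slot is powered when management permits it.
normal_slot = power_enable(accelerator_present=True, management_allows_power=True)

# A flagged fault forces power off (protection).
faulty_slot = power_enable(True, True, fault_detected=True)
```

A real device would likely add sequencing and debounce behavior, which this sketch does not attempt to model.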
[0365] An example of the server of this application is described below in conjunction with Figure 14.
[0366] Figure 14 is a schematic diagram of another server interconnection topology provided in an embodiment of this application. Referring to Figure 14, the server may include 8 computing nodes 402 and 6 switching nodes 403. Each computing node 402 has a first substrate 7021 and a second substrate 7022 fixedly integrated inside, and each substrate includes 4 accelerators 1105.
[0367] The accelerator 1105 is an OAM2.0 standard accelerator. With 2 substrates of 4 accelerators each, every computing node 402 carries 8 accelerators, so the 8 computing nodes 402 integrate a total of 8×8=64 OAM2.0 standard accelerators, meeting the configuration requirement of 64 GPUs.
[0368] The accelerator 1105 can be designed to support up to x48 lanes of interconnection resources. Each of the 6 switching nodes 403 integrates one switching chip, and the design supports mainstream 51.2T switching chips to meet high-bandwidth interconnection requirements.
[0369] The interconnection bandwidth between the accelerator 1105 and the switching chip in any switching node 403 is 8x100G. At the same time, each computing node 402 is interconnected with 6 switching nodes 403 respectively, ensuring that an efficient and stable fully interconnected topology is formed between the 8 computing nodes, 6 switching nodes and 64 accelerators, giving full play to the computing power advantage of the 64-card all-in-one server.
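The figures in this example can be cross-checked with a short calculation. The sketch below assumes that "8x100G" means 8 lanes at 100 Gb/s between one accelerator and the switching chip of one switching node; that reading is an interpretation and is not stated explicitly in the text. The constant and function names are hypothetical.

```python
# Figures stated in the example.
COMPUTING_NODES = 8
SUBSTRATES_PER_NODE = 2
ACCELERATORS_PER_SUBSTRATE = 4
SWITCHING_NODES = 6

# Assumption: "8x100G" is 8 lanes at 100 Gb/s per accelerator per switching node.
LANES_PER_ACCELERATOR_PER_SWITCH = 8
LANE_RATE_GBPS = 100


def interconnect_budget():
    """Derive accelerator count, lanes per accelerator, and per-switch load."""
    accelerators = (COMPUTING_NODES * SUBSTRATES_PER_NODE
                    * ACCELERATORS_PER_SUBSTRATE)
    # Each accelerator reaches every switching node, so its lanes add up.
    lanes_per_accelerator = SWITCHING_NODES * LANES_PER_ACCELERATOR_PER_SWITCH
    # Each switching chip terminates 8 lanes from every accelerator.
    per_switch_gbps = (accelerators * LANES_PER_ACCELERATOR_PER_SWITCH
                       * LANE_RATE_GBPS)
    return {
        "accelerators": accelerators,
        "lanes_per_accelerator": lanes_per_accelerator,
        "per_switch_gbps": per_switch_gbps,
    }


budget = interconnect_budget()
# budget == {"accelerators": 64, "lanes_per_accelerator": 48, "per_switch_gbps": 51200}
```

Under this reading the numbers are mutually consistent: 6 switching nodes × 8 lanes uses the full x48 lanes of each accelerator, and 64 accelerators × 8 lanes × 100 Gb/s is 51,200 Gb/s, which matches the capacity of a 51.2T switching chip.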
[0370] This approach achieves efficient, fully interconnected communication across 64 accelerators, while also improving space utilization and increasing computing power density.
[0371] The server provided in this application has been described in detail above. Specific examples have been used to illustrate the principles and implementation methods of this application. The descriptions of the embodiments above are merely for the purpose of helping to understand the method and core ideas of this application. It should be noted that those skilled in the art can make various improvements and modifications to this application without departing from its principles, and these improvements and modifications also fall within the protection scope of the claims of this application.
Claims
1. A server, characterized in that, The system includes a chassis (401), which contains multiple computing nodes (402), multiple switching nodes (403), a first power supply group (404), a second power supply group (405), a first liquid cooling pipe (406), a second liquid cooling pipe (407), and a backplane assembly (408). The plurality of switching nodes (403) are disposed on the first side of the backplane assembly (408), the first power supply group (404) and the second power supply group (405) are disposed on both sides of the plurality of switching nodes (403) along the first direction, and the first liquid cooling pipe (406) and the second liquid cooling pipe (407) are disposed on both sides of the plurality of switching nodes (403) along the first direction. The distance between the first liquid cooling pipe (406) and the second liquid cooling pipe (407) along the first direction is smaller than the dimension of the computing node (402) along the first direction; The plurality of computing nodes (402) are disposed on the second side of the backplane assembly (408) along a second direction, wherein the first direction and the second direction are perpendicular.
2. The server according to claim 1, characterized in that, The backplane assembly (408) includes a first backplane group (501) and a second backplane group (502), which are arranged in parallel. The first power supply group (404) is disposed on the first side of the first backplane group (501), and the second power supply group (405) is disposed on the first side of the second backplane group (502). The plurality of computing nodes (402) are disposed on the second side of the first backplane group (501) and the second backplane group (502).
3. The server according to claim 2, characterized in that, The backplane assembly (408) includes a horizontal backplane (503), and the plurality of switching nodes are connected to the horizontal backplane (503); The horizontal backplane (503) is located between the first backplane group (501) and the second backplane group (502) along the first direction; The first power supply group supplies power to the horizontal backplane through the first backplane group, and/or the second power supply group supplies power to the horizontal backplane through the second backplane group.
4. The server according to claim 2, characterized in that, The first backplane group (501) includes a first power supply backplane (602) and a first computing node backplane (603), wherein, The first power supply backplane (602) and the first computing node backplane (603) are parallel to each other and electrically connected; The first power supply group (404) is electrically connected to the first power supply backplane (602) and is located on the side opposite to the first computing node backplane (603).
5. The server according to claim 4, characterized in that, The second backplane group (502) includes a second power supply backplane (606) and a second computing node backplane (607), wherein, The second power supply backplane (606) and the second computing node backplane (607) are parallel to each other and electrically connected; The second power supply group (405) is electrically connected to the second power supply backplane (606) and is located on the side opposite to the second computing node backplane (607).
6. The server according to claim 5, characterized in that, The first computing node backplane (603) is electrically connected to some computing nodes (402); the second computing node backplane (607) is electrically connected to the remaining computing nodes (402).
7. The server according to claim 5, characterized in that, The number of first power connectors (605) provided on the first power supply backplane (602) is greater than the number of second power connectors (609) provided on the second power supply backplane (606).
8. The server according to claim 7, characterized in that, The number of power units in the first power supply group (404) is greater than the number of power units in the second power supply group (405), and the management controller (612) is disposed on the second power supply backplane (606); The management controller is electrically connected to at least one of the plurality of computing nodes (402), the plurality of switching nodes (403), the first power supply group (404), and the second power supply group (405).
9. The server according to any one of claims 1-8, characterized in that, The computing node (402) includes a processor motherboard (701), a substrate module (702), and a middle backplane (703), wherein, The processor motherboard (701) and the substrate module (702) are disposed on both sides of the middle backplane (703) along a third direction and are electrically connected through the middle backplane (703). The third direction is perpendicular to the first direction and the second direction, respectively. The first substrate (7021) and the second substrate (7022) in the substrate module (702) are stacked along the second direction, and a plurality of accelerators are provided in the first substrate (7021) and the second substrate (7022).
10. The server according to claim 9, characterized in that, A plurality of accelerators are disposed on a first side of the first substrate (7021), and a plurality of accelerators are disposed on a second side of the second substrate (7022); The first substrate (7021) and the second substrate (7022) are stacked together, with the first side of the first substrate (7021) facing the second side of the second substrate (7022); The processor motherboard (701) is connected to multiple accelerators on the first substrate (7021) and multiple accelerators on the second substrate (7022) via the middle backplane (703).
11. The server according to claim 10, characterized in that, The computing node (402) further includes a first liquid cooling plate (704), which is disposed between the first substrate (7021) and the second substrate (7022); The first liquid cooling plate (704) is connected to the first liquid cooling pipe and the second liquid cooling pipe respectively.
12. The server according to claim 11, characterized in that, The computing node (402) also includes a network card interface module (803), a network card (804), and a hard disk module (805), wherein, The processor motherboard (701) and the network card interface module (803) are stacked along the second direction; The network card (804) and the hard disk module (805) are laid flat on the side of the network card interface module (803) away from the processor motherboard (701); The sum of the dimensions of the hard disk module (805), the network card interface module (803), and the processor motherboard (701) along the second direction is less than or equal to the dimension of the substrate module (702) along the second direction.
13. The server according to claim 12, characterized in that, The computing node (402) also includes a second liquid cooling plate (802), wherein, The second liquid cooling plate (802) is disposed between the network card interface module (803) and the processor motherboard (701) along the second direction; The second liquid cooling plate (802) is connected to the first liquid cooling plate (704).
14. The server according to claim 12, characterized in that, The processor motherboard (701) includes a processor (901), multiple memory modules (902), an input power connector (903), multiple power supply connectors (904), and multiple signal connectors (905). The input power connector (903), the plurality of power supply connectors (904) and the plurality of signal connectors (905) are disposed on the first side of the processor motherboard (701), the first side of the processor motherboard (701) being the side closest to the middle backplane (703); The processor (901) is located in the central area of the processor motherboard (701), and the plurality of memory modules (902) are respectively located on both sides of the processor (901).
15. The server according to claim 14, characterized in that, The plurality of signal connectors (905) includes a first signal connector (9051), a second signal connector (9052), and a third signal connector (9053). The first signal connector (9051) is used to connect the network card (804). The second signal connector (9052) is used to connect the hard disk module (805); The third signal connector (9053) is used to connect the substrate module (702).
16. The server according to claim 12, characterized in that, The network card interface module (803) has multiple network interfaces (1001) and multiple network interface controllers on its first side, and the multiple network interface controllers are respectively connected to the corresponding network interfaces (1001); The plurality of network interfaces (1001) are arranged sequentially on the first side of the network card interface module (803), and the first side of the network card interface module (803) is the side away from the middle backplane (703); The second side of the network card interface module (803) is provided with a first signal connector group (1002) and a second signal connector group (1003). The first signal connector group (1002) and the second signal connector group (1003) are disposed on the second side of the network card interface module (803), and the second side of the network card interface module (803) is the side close to the middle backplane (703); The first signal connector group (1002) is used to connect the first substrate (7021), and the second signal connector group (1003) is used to connect the second substrate (7022).
17. The server according to claim 16, characterized in that, The first substrate (7021) is also provided with a first switch chip (1101) and a plurality of fourth signal connectors (1102) on the first side. The plurality of fourth signal connectors (1102) are respectively disposed on both sides of the first switch chip (1101); One end of the fourth signal connector (1102) is connected to the connector in the first signal connector group (1002), and the other end is connected to the first switch chip (1101).
18. The server according to claim 17, characterized in that, A second switch chip and multiple sixth signal connectors are also provided on the second side of the second substrate (7022); The plurality of sixth signal connectors are respectively disposed on both sides of the second switch chip; One end of the sixth signal connector is connected to the connector in the second signal connector group (1003), and the other end is connected to the second switch chip.
19. The server according to claim 16, characterized in that, The first substrate (7021) is also provided with a fifth signal connector (1103) and a plurality of first orthogonal connectors (1104) on the first side. The fifth signal connector (1103) is disposed on the first side of the first substrate (7021), and the plurality of first orthogonal connectors (1104) are disposed on the second side of the first substrate (7021). The first side of the first substrate (7021) is the side close to the middle backplane (703), and the second side of the first substrate (7021) is the side close to the switching node (403). The fifth signal connector (1103) is used to connect to the third signal connector (9053) of the processor motherboard (701). The plurality of first orthogonal connectors (1104) are used to connect the plurality of switching nodes (403).
20. The server according to any one of claims 1-8, characterized in that, The switching node (403) includes a second orthogonal connector (1201), a switching processor (1202), a hardware manager (1203), and a third liquid cooling pipe (1204). The second orthogonal connector (1201) is disposed on the first side of the switching node (403), and the third liquid cooling pipe (1204) is disposed on the second side of the switching node (403). The first side of the switching node (403) is the side closer to the computing node (402), and the second side of the switching node (403) is the side farther away from the computing node (402).