A MicroLED optical interconnect ring network system for AI large model training
A MicroLED optical interconnect ring network system that addresses the bandwidth and latency bottlenecks in AI large-scale model training, providing efficient, reliable, and flexible data interconnection between nodes and supporting parallel computing across large-scale training clusters.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHENZHEN HUACHUANGXINGUANG TECH CO LTD
- Filing Date
- 2026-04-23
- Publication Date
- 2026-06-30
AI Technical Summary
Existing technologies lack dedicated designs for MicroLED optical interconnect ring network devices in AI large-scale model training, resulting in insufficient bandwidth, high latency, and poor scalability, making it difficult to meet the requirements of high bandwidth, low latency, and high scalability.
Parallel optical signal transmission is achieved by using MicroLED optical transmitting modules and MicroPD optical receiving modules. Combined with the ring network data processing core, buffer and synchronization modules, and ring network connection and expansion modules, a flexible and scalable optical interconnection ring network system is constructed, supporting multi-node parallel access and dynamic topology expansion.
It achieves ultra-high bit-width parallel transmission, minimizes transmission latency, improves multi-node synchronization accuracy and scalability, enhances signal integrity, reduces power consumption, adapts to different training scales, and improves training efficiency and stability.
Figure CN122316480A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of optical interconnect technology, specifically to a MicroLED optical interconnect ring network system for training large AI models.
Background Technology
[0002] With the explosive development of AI technology, the parameter scale of large models has jumped from tens of billions to hundreds of billions and trillions. The training process requires the parallel collaboration of massive computing resources, which places unprecedentedly stringent requirements on the bandwidth, latency, scalability, and reliability of data interconnection between nodes. The core characteristics of AI large model training are "computation-intensive + data-intensive". During the training process, frequent gradient exchange, parameter synchronization, and data fragmentation and transmission across nodes are required. The performance of the interconnection system directly determines the training efficiency and model convergence speed.
[0003] Current interconnection solutions for large AI model training clusters are mainly based on traditional electrical interconnection technologies. These solutions rely on the SerDes (Serializer / Deserializer) serial communication protocol, combined with PCB (Printed Circuit Board) traces, silicon interposers, or high-speed cables to achieve node interconnection. In addition, some solutions attempt to use traditional optical modules for long-distance interconnection, but these still adhere to the traditional architecture of "electrical-serial-parallel conversion-optical transmission-serial-parallel conversion-electrical".
[0004] In terms of current technology, MicroLED (micro-light-emitting diode) array technology has demonstrated enormous potential in fields such as display and short-range optical communication due to its advantages of high brightness, high response speed, and high integration. Its arrayed structure is naturally suited for parallel data transmission, providing bandwidth density far exceeding that of traditional electrical interconnects, offering a new technological path to solve the interconnect bottleneck in AI large-scale model training. However, existing technologies have not yet developed dedicated MicroLED optical interconnect ring network equipment for AI large-scale model training scenarios. Traditional optical interconnect solutions have the following limitations: First, they lack deep adaptation to AI training cluster node topologies (such as ring networks and mesh networks), resulting in insufficient routing flexibility; second, they are not optimized for the parallel transmission characteristics of training data, still relying on serial-to-parallel conversion processes, leading to latency that cannot meet extreme requirements; and third, they lack targeted processing mechanisms for training data (such as data filtering and cache scheduling), making it difficult to adapt to the dynamic and bursty nature of data transmission in AI training.
[0005] Traditional electrical interconnect solutions are gradually approaching their performance limits in AI large-scale model training scenarios: On the one hand, SerDes serial transmission requires complex parallel-to-serial / serial-to-parallel conversion and clock data recovery (CDR) circuits, which significantly increase conversion delay and power consumption when processing ultra-high bit-width training data; on the other hand, the bandwidth density of PCB traces and silicon interposers is limited, and high-speed electrical signal transmission is susceptible to crosstalk and loss, leading to deterioration of signal integrity, increased bit error rate, and difficulty in supporting parallel collaborative training of large-scale nodes.
[0006] In summary, the contradiction between the high bandwidth, low latency, and high scalability requirements of AI large-scale model training and the performance bottlenecks of existing electrical interconnection solutions is becoming increasingly prominent. Meanwhile, existing optical interconnection technologies lack specialized designs for AI training scenarios, and there is an urgent need for an innovative MicroLED optical interconnection ring network device to achieve efficient, reliable, and flexible interconnection between training cluster nodes.
Summary of the Invention
[0007] To address the shortcomings of existing technologies, this invention provides a MicroLED optical interconnect ring network system for AI large-scale model training, which solves the problems mentioned in the background.
Technical solution
[0008] To achieve the above objectives, the present invention provides the following technical solution: a MicroLED optical interconnect ring network system for AI large-scale model training, characterized in that it comprises:
The MicroLED light emitting module, which converts data from electrical signals into parallel optical signals and emits them. This module includes a parallel driving circuit and a MicroLED array. The parallel driving circuit employs an arrayed parallel driving design and a direct current driving method. The MicroLED array is a high-density MicroLED chip array driven by direct current so that each light-emitting unit can be independently controlled.
The optical coupling structure, which includes a microlens array, a coupling interface, and a multi-core imaging fiber to achieve efficient coupling of optical signals. The microlens array is composed of multiple microlenses arranged in a planar array. Each multi-core imaging fiber contains hundreds of fiber cores arranged in a specific pattern. Each MicroLED light-emitting unit corresponds to one microlens and one core of the multi-core imaging fiber.
The MicroPD optical receiving module, which is paired with the MicroLED light emitting module and converts received parallel optical signals into electrical signals for output. It includes an optical receiving unit, a transimpedance amplifier array, and a parallel output interface. The optical receiving unit employs a high-density MicroPD array matched to the MicroLED array to receive optical signals and convert them into weak electrical signals. The transimpedance amplifier array is integrated with and matched to the MicroPD array; it amplifies the weak electrical signals with low noise and converts them into stable voltage signals. The parallel output interface outputs the amplified parallel electrical signals directly to the ring network data processing core.
The ring network data processing core, which is the core control unit responsible for the routing, filtering, processing, and scheduling of data within the ring network. It includes multi-channel data input / output, UID identification and data filtering, and a serial-to-parallel conversion (SerDes) adaptation unit. The multi-channel data input / output supports parallel access from multiple AI training nodes. The UID identification and data filtering function has a built-in UID checking unit that extracts the target node UID appended to the data frame and compares it with the UID of this device and the UIDs of the other nodes in the ring network. The SerDes adaptation unit supports bidirectional conversion between serial and parallel data to meet the serial interface requirements of some traditional training nodes.
The caching and synchronization module, which includes a data receiving unit, a data processing unit, a high-speed storage unit, and a data synchronization unit, ensuring efficient data transmission and consistency.
[0009] The ring network connection and expansion module, based on the interconnected ring network device logic, is responsible for constructing a flexible and scalable ring network topology, supporting dynamic expansion of the cluster size. It includes ring network ports, topology expansion functionality, and node access / exit mechanisms. The ring network ports are configured with at least two ports, connected to adjacent node devices via multi-core imaging optical fibers to form a closed ring network topology. The topology expansion function supports multi-ring network interconnection; by adding ring network gateway devices, multiple independent ring networks can be connected into a larger-scale mesh network topology. The node access / exit mechanism supports hot-swapping, allowing training nodes to dynamically access or exit the ring network.
[0010] Preferably, the UID identification and data filtering comparison method is as follows: if the target UID is consistent with the device, the data is received and processed; if the target UID is different from the device, it is quickly forwarded to the next node through the ring network connection chain until it reaches the target node; if an invalid UID or redundant data is detected, it is directly discarded to avoid invalid transmission occupying bandwidth.
[0011] Preferably, the SerDes adapter unit first identifies the data transmission interface type supported by the traditional training node connected to it through an automatic detection mechanism; once the interface type is determined, the SerDes adapter unit will automatically adjust its working mode according to the detection result.
[0012] Preferably, the data receiving unit is responsible for obtaining the raw data stream from an external system or network interface. This unit is equipped with a dedicated data parsing function, which can identify different types of data formats and perform preliminary preprocessing. The valid information after preprocessing will be passed to the next stage data processing unit.
[0013] Preferably, the data processing unit is responsible for further processing the received information, including but not limited to data cleaning, format conversion, and necessary encryption / decryption processes. In addition, in order to improve the query efficiency in subsequent steps, the unit may also perform some basic index creation activities. The processed data is then sent to the high-speed storage unit.
[0014] Preferably, the high-speed storage unit is used to temporarily store data that has been processed but has not yet been synchronized. It uses high-speed memory as the main medium to achieve fast read and write access and to temporarily store the received training data.
[0015] Preferably, the data synchronization unit will start when the conditions are met, copy the latest state in the cache to the target database or other persistent storage solution, and implement strict error detection and recovery measures throughout the synchronization process to ensure that the task can be completed smoothly even in the case of network instability.
[0016] Preferably, the caching and synchronization module further includes a timing calibration unit and a global synchronization clock; the timing calibration unit has a built-in adjustable delay chain to ensure that the data received by all nodes in the ring network are in the same timing, thus meeting the training parameter synchronization requirements.
[0017] Preferably, the global synchronization clock receives an external reference clock signal, generates a unified synchronization clock for the ring network, and distributes it to all node devices through the accompanying synchronization clock output module to achieve timing synchronization of data transmission and processing, thereby improving parallel training efficiency.
[0018] Preferably, the emission wavelength of the MicroLED light-emitting unit is selected from the 850nm near-infrared band.
[0019] This invention provides a MicroLED optical interconnect ring network system for training large AI models, which has the following beneficial effects: 1. It breaks through the bandwidth bottleneck and achieves ultra-high bit-width parallel transmission. With a fully parallel MicroLED array transmission architecture, no serial-to-parallel conversion is required, and a single system supports data transmission at bit widths of 1024 bits and above. Compared with the traditional SerDes electrical interconnection solution, bandwidth density is several times higher, which meets the massive data transmission needs of training large models with hundreds of billions of parameters and greatly improves training data throughput.
[0020] 2. Significantly reduced transmission latency and a shorter model training cycle: The fully parallel transmission architecture eliminates time-consuming steps such as serial-to-parallel conversion and CDR clock recovery, giving low optical transmission latency and short end-to-end latency within the ring network compared with traditional solutions. The lower latency reduces the waiting time of training nodes and greatly improves model convergence speed.
[0021] 3. Improve multi-node synchronization accuracy and optimize parallel training efficiency; by calibrating timing skew through a global synchronization clock and adjustable delay chain, the timing synchronization error of all nodes in the ring network is small, ensuring the consistency of parameter synchronization; the non-blocking ring network topology and UID precise routing avoid data transmission blockage, improve the utilization rate of parallel computing resources compared with traditional solutions, and significantly improve the overall computing power output of the training cluster.
[0022] 4. High scalability to adapt to different training needs: Supports dynamic expansion of ring network topology, with 32 or 64 nodes connected to a single ring network, and interconnection of multiple ring networks can be expanded to thousands of nodes; the device supports bandwidth configuration on demand, and the parallel transmission bit width can be adjusted through the SPI interface to adapt to the training needs of models with billions to trillions of parameters, without the need for customized design, shortening the development cycle and reducing deployment costs.
[0023] 5. High-density, low-power design enhances cluster deployment capabilities; the compact structure of the MicroLED array and multi-core imaging fiber reduces the size of the device compared to traditional interconnect devices, supporting rack-mounted high-density deployment, with more than 64 devices per rack.
[0024] 6. Enhance signal integrity and improve data transmission reliability; optical signal transmission is unaffected by electromagnetic interference and crosstalk. Combined with the high response speed of MicroLED / MicroPD arrays, the data transmission error rate is reduced, improving the error rate by several orders of magnitude compared to traditional electrical interconnection solutions. With the help of data frame synchronization headers and verification mechanisms, model convergence deviations caused by training data transmission errors are completely avoided, improving training stability.
Attached Figure Description
[0025] Figure 1 is a schematic diagram of the core transmission link of the MicroLED optical interconnect ring network system of the present invention;
Figure 2 is a schematic diagram of the UID identification and data filtering process for ring network data;
Figure 3 is a schematic diagram of the multi-node ring network topology and interconnection of the present invention;
Figure 4 is a schematic diagram of the data transmission logic for training large AI models.
In the figures, the “…” symbol indicates that the number of nodes in the system architecture is scalable, reflecting the flexibility and scalability of the topology.
Detailed Implementation
[0026] The technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments.
[0027] In this application, the terms "upper," "lower," "inner," "middle," "outer," "front," and "rear," etc., indicate the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings. These terms are primarily for the purpose of better describing this application and its embodiments, and are not intended to limit the indicated device, element, or component to having a specific orientation, or to be constructed and operated in a specific orientation.
[0028] Furthermore, in addition to indicating location or positional relationship, some of the aforementioned terms may also have other meanings. For example, the term "above" may also be used in some cases to indicate a certain dependency or connection relationship. Those skilled in the art can understand the specific meaning of these terms in this application based on the specific circumstances.
[0029] Example 1: Referring to Figures 1 to 4, this invention provides a technical solution: a MicroLED optical interconnect ring network system for AI large-scale model training, comprising:
MicroLED light emitting module: This module is the core emitting unit for parallel optical transmission (for its structure, see Figure 1 and the MicroLED array output section in Figure 3). It receives ultra-high bit-width training data (e.g., 1024 bits or more) from AI training nodes via a high-speed parallel interface. Data types include gradient data, parameter update data, and data fragments. This data is converted from electrical signals to parallel optical signals and emitted. The module includes a parallel driving circuit and a MicroLED array. The parallel driving circuit employs an array-based parallel driving design and a direct current driving method. The array-based parallel driving design eliminates the need for serial-to-parallel conversion: it directly drives each MicroLED light-emitting unit in the array, so data is transmitted in its original parallel format and latency is minimized. The direct current driving method ensures uniform brightness across the MicroLED light-emitting units in the array. The MicroLED array is a high-density MicroLED chip array (e.g., 1024×1024 pixel scale), with each MicroLED light-emitting unit corresponding to 1 bit of data. The array is driven by direct current so that each light-emitting unit can be controlled independently. This is achieved by integrating one or more transistors at each pixel location, which allows precise adjustment of the current of each individual pixel and therefore of its brightness and on / off state. The use of active matrix technology not only improves display quality but also reduces power consumption, making the MicroLED array more suitable for scenarios requiring long-term stable operation.
Optical coupling structure: As shown in Figure 1, it includes a microlens array, a coupling interface, and a multi-core imaging fiber. Connecting the microlens end faces, the coupling interface, and the multi-core imaging fiber achieves efficient optical signal coupling and reduces transmission loss. The microlens array consists of multiple microlenses arranged in a planar array, with each lens corresponding to one MicroLED light-emitting unit, so that the light emitted by each unit is accurately focused onto the corresponding fiber core. Each multi-core imaging fiber contains hundreds of cores arranged in a specific pattern, with each MicroLED light-emitting unit corresponding to one core. This provides sufficient transmission capacity and allows a one-to-one connection with each light-emitting point in the MicroLED array, thereby achieving efficient data transmission.
[0030] MicroPD optical receiving module: This module works as a pair with the MicroLED light emitting module (for its structure, see Figure 1 and the MicroPD array input section in Figure 3) and is responsible for converting received parallel optical signals into electrical signals and outputting them. It includes an optical receiving unit, a transimpedance amplifier (TIA) array, and a parallel output interface. The optical receiving unit uses a high-density MicroPD array matched to the MicroLED array. Each MicroPD unit corresponds to one core of a multi-core imaging fiber, receiving the optical signal and converting it into a weak electrical signal. The TIA array is integrated with and matched to the MicroPD array; it amplifies the weak electrical signals with low noise and converts them into stable voltage signals, preserving signal integrity for subsequent circuit processing. The parallel output interface outputs the amplified parallel electrical signals directly to the ring network data processing core without serial-to-parallel conversion, maintaining data parallelism and forming a fully parallel electrical-optical-electrical transmission link with the transmitter.
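The one-bit-per-unit mapping between the MicroLED array and the MicroPD array described above can be illustrated with a minimal Python sketch. The function names, the MSB-first bit order, the fixed 1024-bit width, and the 0.5 decision threshold are all illustrative assumptions; the source does not specify bit ordering or how the amplified levels are sliced into bits.

```python
BIT_WIDTH = 1024  # assumed parallel word width (the text says "1024 bits or more")


def word_to_emitters(word: bytes) -> list[int]:
    """Map each bit of a parallel word to one emitter on/off state (MSB first)."""
    if len(word) * 8 != BIT_WIDTH:
        raise ValueError("word must be exactly BIT_WIDTH bits")
    return [(byte >> (7 - i)) & 1 for byte in word for i in range(8)]


def detectors_to_word(levels: list[float], threshold: float = 0.5) -> bytes:
    """Threshold each detector's amplified level into one bit and repack to bytes."""
    if len(levels) != BIT_WIDTH:
        raise ValueError("expected one level per detector")
    bits = [1 if level > threshold else 0 for level in levels]
    out = bytearray()
    for i in range(0, BIT_WIDTH, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)
```

Because every bit has its own emitter, fiber core, and detector, no serialization step appears anywhere in the sketch; the whole word moves in a single step.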
[0031] Ring network data processing core: This is the core control unit, integrating the ring network data processor of Figure 2 and the core data processing function of Figure 3. It is responsible for the routing, filtering, processing, and scheduling of data within the ring network, and includes multi-channel data input / output, UID identification and data filtering, and a serial-to-parallel conversion (SerDes) adaptation unit. The multi-channel data input / output supports parallel access from multiple AI training nodes (e.g., 8 or 16 ultra-high bandwidth input / output channels) and can simultaneously receive and forward multiple parallel training data streams, achieving non-blocking data exchange. For UID identification and data filtering (see Figure 2), the device has a built-in UID (Unique Device Identifier) checking unit. When receiving data, it first extracts the target node UID attached to the data frame and compares it with the UID of this device and the UIDs of the other nodes in the ring network. The serial-to-parallel conversion adaptation unit, also called the SerDes adapter unit, has a built-in optional SerDes module to address the serial interface requirements of some traditional training nodes. It supports bidirectional conversion between serial and parallel data, ensuring compatibility between the device and traditional nodes. For nodes that support parallel interfaces, the serial-to-parallel conversion process is skipped and parallel transmission is performed directly.
[0032] The caching and synchronization module includes a data receiving unit, a data processing unit, a high-speed storage unit, and a data synchronization unit, ensuring efficient data transmission and consistency.
[0033] Ring network connection and expansion module: Based on the ring network connection chain of Figure 3 and the interconnected ring network device logic of Figure 4, this module is responsible for building a flexible and scalable ring network topology that supports dynamic expansion of the cluster size. It includes ring network ports, topology expansion functionality, and node access / exit mechanisms. At least two ring network ports are configured (one input and one output), connected to adjacent node devices via multi-core imaging optical fibers to form a closed ring network topology, ensuring data transmission redundancy and reliability. The topology expansion functionality supports multi-ring network interconnection; by adding ring network gateway devices, multiple independent ring networks can be connected into a larger-scale mesh network topology, accommodating training clusters from tens to thousands of nodes without redesigning the core architecture. The node access / exit mechanism supports hot-swapping, allowing training nodes to dynamically join or leave the ring network. The device automatically updates the UID routing table and adjusts the data forwarding path to ensure uninterrupted cluster training.
[0034] The UID identification and data filtering comparison method is as follows: if the target UID matches the current device, the data is received and processed; if the target UID differs from the current device, it is quickly forwarded to the next node via the ring network connection chain until it reaches the target node; if an invalid UID or redundant data is detected, it is directly discarded to avoid invalid transmission consuming bandwidth. During the UID identification and data filtering process, the data frame structure is defined as follows: each data frame consists of a synchronization header, a target node UID field, a payload, and a checksum. The synchronization header is used by the receiving end to detect the start position of the frame; the target node UID field follows immediately, and this field has a fixed length of 16 bytes, used to uniquely identify each device in the network; the payload carries the actual transmitted data information; and the checksum at the end of the frame is calculated using the CRC-32 algorithm to ensure data integrity.
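The frame layout and the accept / forward / discard rule described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the synchronization header value, the big-endian byte order of the checksum, and the choice to compute CRC-32 over header, UID, and payload together are assumptions, since the source only names the four fields, the 16-byte UID length, and the CRC-32 algorithm.

```python
import struct
import zlib

SYNC_HEADER = b"\xA5\x5A\xA5\x5A"  # assumed pattern; the source does not define it
UID_LEN = 16                        # fixed UID field length stated in the text


def build_frame(target_uid: bytes, payload: bytes) -> bytes:
    """Assemble sync header + target UID + payload + CRC-32 trailer."""
    if len(target_uid) != UID_LEN:
        raise ValueError("UID must be 16 bytes")
    body = SYNC_HEADER + target_uid + payload
    return body + struct.pack(">I", zlib.crc32(body))


def classify_frame(frame: bytes, local_uid: bytes, known_uids: set[bytes]) -> str:
    """Return 'accept', 'forward', or 'discard' for a received frame."""
    min_len = len(SYNC_HEADER) + UID_LEN + 4
    if len(frame) < min_len or not frame.startswith(SYNC_HEADER):
        return "discard"
    body = frame[:-4]
    (crc,) = struct.unpack(">I", frame[-4:])
    if zlib.crc32(body) != crc:
        return "discard"  # checksum mismatch: corrupted frame
    target = body[len(SYNC_HEADER):len(SYNC_HEADER) + UID_LEN]
    if target == local_uid:
        return "accept"   # this device is the target node
    if target in known_uids:
        return "forward"  # pass along the ring toward the target
    return "discard"      # invalid or unknown UID
```

A device would call `classify_frame` on every received frame, handing accepted frames to the caching and synchronization module and re-emitting forwarded frames on its output port.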
[0035] Regarding the generation, maintenance, and update mechanism of the UID routing table, a basic routing table containing the UIDs of all connected devices and their corresponding physical address mappings is automatically built during system initialization. Whenever a new device joins or an existing device is removed from the network (supporting hot-swapping), the system collects the list of currently online devices through broadcast queries and dynamically adjusts the routing table accordingly. Furthermore, to ensure the accuracy of routing information, the system performs a full network scan every certain period (e.g., every hour) to proactively check and update the UID routing table. For situations where temporary network failures cause some nodes to become unreachable, the system can perform intelligent predictive processing based on the most recent successful communication records, minimizing the need for large-scale reconfiguration due to changes in the status of individual nodes.
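A minimal sketch of the routing table maintenance described above, assuming the result of a broadcast query is available as a mapping from UID to physical address. The class name `UidRoutingTable`, the integer address type, and the returned joined / left sets are hypothetical; the source does not define the table's data layout, and the predictive handling of temporarily unreachable nodes is not modeled here.

```python
from typing import Optional


class UidRoutingTable:
    """Maps device UID to physical address (here an assumed integer port index)."""

    def __init__(self) -> None:
        self._routes: dict[bytes, int] = {}

    def rebuild(self, online: dict[bytes, int]) -> tuple[set[bytes], set[bytes]]:
        """Replace the table with the devices reported by a broadcast query.

        Returns the UIDs that joined and the UIDs that left since the last rebuild.
        """
        joined = set(online) - set(self._routes)
        left = set(self._routes) - set(online)
        self._routes = dict(online)
        return joined, left

    def lookup(self, uid: bytes) -> Optional[int]:
        """Return the physical address for a UID, or None if it is not online."""
        return self._routes.get(uid)
```

The same `rebuild` call would serve initialization, hot-swap events, and the periodic full-network scan, since each of them ends with a fresh list of online devices.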
[0036] The SerDes adapter unit first identifies the data transmission interface type supported by the traditional training nodes it is connected to through an automatic detection mechanism. This process involves sending a series of predefined probe signals to the target node and determining whether the node supports a parallel interface or only a serial interface based on the received response. Once the interface type is determined, the SerDes adapter unit automatically adjusts its operating mode according to the detection result. If a parallel interface is detected, the parallel data transmission path is directly enabled, skipping any unnecessary serial-to-parallel conversion steps; conversely, when the detection result indicates a serial interface, the SerDes module is activated to perform the necessary serial-to-parallel or parallel-to-serial data format conversion tasks. Furthermore, to minimize the impact of this mode switching on the overall timing of the ring network, this design also includes an adaptive clock synchronization mechanism. This mechanism can dynamically adjust the clock frequency and phase relationship according to the currently selected operating mode (i.e., parallel or serial), ensuring stable and reliable data stream transmission regardless of the mode. Specifically, during the transition from one mode to another, there is a brief but controllable transition period during which the clock control system recalibrates the synchronization status between all relevant components, thereby ensuring that the data transmission efficiency within the entire network architecture is not significantly affected.
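The detect-then-switch behavior of the SerDes adapter unit can be sketched as below. The probe reply strings and the function names are hypothetical, since the source does not define the probe protocol, and the adaptive clock recalibration during mode switching is left out of the sketch.

```python
from enum import Enum


class InterfaceMode(Enum):
    PARALLEL = "parallel"
    SERIAL = "serial"


def select_mode(probe_response: str) -> InterfaceMode:
    """Pick the operating mode from a node's reply to the probe signals.

    The reply strings are assumptions made for illustration only.
    """
    if probe_response == "parallel":
        return InterfaceMode.PARALLEL
    if probe_response == "serial":
        return InterfaceMode.SERIAL
    raise ValueError(f"unrecognized probe response: {probe_response!r}")


def to_node(bits: list[int], mode: InterfaceMode) -> list[list[int]]:
    """Deliver a parallel word to a node as a list of transfer beats.

    Parallel mode bypasses conversion and sends all bits in one beat;
    serial mode emits one bit per beat (parallel-to-serial conversion).
    """
    if mode is InterfaceMode.PARALLEL:
        return [list(bits)]
    return [[b] for b in bits]
```

The number of beats returned makes the latency difference visible: one beat for a parallel node versus one beat per bit for a serial node.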
[0037] The data receiving unit is responsible for acquiring raw data streams from external systems or network interfaces. This unit is equipped with dedicated data parsing functions, capable of identifying different types of data formats and performing preliminary preprocessing, such as removing invalid characters or decompressing compressed data. The preprocessed valid information will then be passed to the next stage data processing unit.
[0038] The data processing unit is responsible for further processing the received information, including but not limited to data cleaning (such as deleting duplicates), format conversion (to adapt to the needs of different application scenarios), and necessary encryption / decryption processes. In addition, in order to improve the query efficiency in subsequent steps, the unit may also perform some basic index creation activities. The processed data is then sent to the high-speed storage unit.
[0039] The high-speed storage unit is used to temporarily store processed but not yet synchronized data. It uses high-speed memory as the main medium to achieve fast read and write access, temporarily stores received training data, absorbs bursts in data transmission, and avoids data loss caused by mismatched node processing speeds. This unit supports multiple strategies for managing its internal space, such as the LRU (least recently used) algorithm, which evicts the entries that have gone longest without access so that recently used data stays available. To ensure data security, a redundant backup mechanism is also provided to prevent information loss due to a single point of failure.
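As an illustration of the LRU space management mentioned above, here is a minimal sketch built on Python's `collections.OrderedDict`. The class name `LruBuffer` and its string keys are assumptions; the source names the LRU algorithm but not how entries are keyed, and the redundant backup mechanism is not modeled.

```python
from collections import OrderedDict
from typing import Optional


class LruBuffer:
    """Fixed-capacity buffer that evicts the least recently used entry when full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items = OrderedDict()  # key -> data, ordered oldest to newest use

    def put(self, key: str, value: bytes) -> None:
        """Store a value, marking the key as most recently used."""
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used entry

    def get(self, key: str) -> Optional[bytes]:
        """Return the stored value and mark it as recently used, or None if absent."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)
        return self._items[key]
```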
[0040] The data synchronization unit is activated when conditions are met (e.g., a predetermined time interval is reached or a certain amount of new data is accumulated), and copies the latest state in the cache to the target database or other persistent storage solutions. The synchronization process usually involves incremental update technology, that is, only the parts that have changed since the last synchronization are transmitted, thereby reducing bandwidth consumption and speeding up the response speed of the entire system. During the entire synchronization, strict error detection and recovery measures will also be implemented to ensure that the task can be completed smoothly even in the case of network instability.
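The incremental update idea above, transmitting only what changed since the last synchronization, can be sketched with a per-entry version counter. The version counter and the dictionary representation of the cache and the persistent store are assumptions made for illustration; the source does not say how changes are tracked, and error detection and recovery are omitted.

```python
def incremental_sync(cache: dict[str, tuple[int, bytes]],
                     store: dict[str, tuple[int, bytes]]) -> list[str]:
    """Copy only entries whose version is newer than the persistent copy.

    Each value is (version, data). Returns the keys that were transmitted.
    """
    changed = [key for key, (version, _) in cache.items()
               if key not in store or store[key][0] < version]
    for key in changed:
        store[key] = cache[key]
    return changed
```

Running the function twice in a row transmits nothing the second time, which is the bandwidth saving the paragraph describes.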
[0041] The aforementioned caching and synchronization module also includes a timing calibration unit and a global synchronization clock. The timing calibration unit has a built-in adjustable delay chain whose delay parameters are configured via an SPI interface to compensate for timing skew caused by different transmission paths, including small time offsets under high-precision requirements, so that data arrives with consistent timing at all nodes in the ring network and the synchronization requirements of training parameters are met. The delay chain supports a continuously adjustable mode, allowing users to configure delay parameters according to actual needs. Calibration uses a closed-loop feedback mechanism: the system automatically detects and adjusts the delay differences between nodes against a reference clock signal until data transmission timing is consistent across all nodes. This process does not require an external synchronization clock source, although in some application scenarios an external high-quality synchronization clock can be used as the reference to improve calibration accuracy and stability. The design also has an adaptive learning function that dynamically adjusts the delay settings as the network environment changes, maintaining timing consistency when the network topology changes or physical-layer conditions fluctuate.
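The closed-loop calibration described above can be illustrated with a small simulation. It is a sketch under stated assumptions: the delay chain is modeled as discrete taps of `step_ps` picoseconds (the source describes a continuously adjustable mode), all arrivals are aligned to the slowest path, and the function name and units are hypothetical.

```python
def calibrate_delays(path_delays_ps: list[float], step_ps: float = 5.0,
                     max_iters: int = 1000) -> list[int]:
    """Return per-node tap counts that align all arrivals to the slowest path.

    Each iteration measures the residual skew of every node and moves its
    delay chain by one tap until every residual is within half a tap.
    """
    half_step = step_ps / 2
    taps = [0] * len(path_delays_ps)
    target = max(path_delays_ps)  # the slowest path cannot be sped up
    for _ in range(max_iters):
        errors = [target - (delay + tap * step_ps)
                  for delay, tap in zip(path_delays_ps, taps)]
        if all(abs(err) <= half_step for err in errors):
            return taps
        # Feedback step: add a tap where a node is still early,
        # remove one where it has overshot, otherwise hold.
        taps = [tap + 1 if err > half_step
                else max(tap - 1, 0) if err < -half_step
                else tap
                for tap, err in zip(taps, errors)]
    raise RuntimeError("calibration did not converge")
```

For example, with path delays of 100, 80, and 62 ps and 5 ps taps, the loop settles on 0, 4, and 8 taps, leaving at most 2 ps of residual skew.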
[0042] The global synchronization clock receives an external reference clock signal, generates a unified synchronization clock for the ring network, and distributes it to all node devices through the accompanying synchronization clock output module to achieve timing synchronization of data transmission and processing, thereby improving the efficiency of parallel training.
[0043] The MicroLED light-emitting unit uses an 850nm near-infrared wavelength to balance transmission efficiency and anti-interference capabilities.
[0044] Working process: The workflow of this invention in distributed training of large AI models is described below with reference to the data transmission logic shown in Figure 4. Data transmission phase: AI training nodes (computation nodes / parameter servers) generate training data (gradients, parameters, etc.) and output it to the MicroLED light emission module of this device through a parallel interface; the target node's UID identifier and synchronization header information are attached to the data frame.
[0045] Photoelectric conversion and transmission: The MicroLED light emitting module drives the MicroLED array to convert electrical signals into optical signals, which are then transmitted to the next device in the ring network through the microlens coupling interface and multi-core imaging optical fiber.
[0046] Data routing and filtering: The MicroPD optical receiving module of the receiving device converts the optical signal into an electrical signal and transmits it to the ring network data processing core; the UID identification unit compares the target UID in the data frame with the device's UID; if it is the target node, the data is sent to the buffer and synchronization module; if it is forwarded data, it is directly forwarded to the next node through the ring network connection chain until it reaches the target node.
[0047] Data synchronization and output: The buffer and synchronization module performs timing calibration on the received data to ensure synchronization with the local clock, and then outputs the data to the local training node through the parallel interface to complete a data transmission.
[0048] Multi-node collaboration: All devices within the ring network follow a unified synchronous clock, enabling parallel data interaction between multiple nodes; when the training cluster expands, new devices are connected through the ring network port and automatically integrated into the routing system without interrupting the existing training process.
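The routing steps of the workflow above can be sketched in Python as a frame travelling hop by hop around a list of devices, each of which compares the frame's target UID with its own. The `Frame` and `RingDevice` types, the `SYNC_HEADER` value, and the rule that a frame is dropped once it has passed every other device without a match are assumptions made for illustration; the optical conversion, timing calibration, and clock distribution are not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SYNC_HEADER = 0xA5  # hypothetical synchronization header value


@dataclass
class Frame:
    target_uid: int
    payload: bytes
    sync_header: int = SYNC_HEADER


@dataclass
class RingDevice:
    uid: int
    received: List[bytes] = field(default_factory=list)


def send(ring: List[RingDevice], src_index: int, frame: Frame) -> Optional[int]:
    """Forward a frame around the ring; return the hop count, or None if dropped."""
    n = len(ring)
    for hops in range(1, n):
        device = ring[(src_index + hops) % n]
        if device.uid == frame.target_uid:
            # Target node: hand the payload to the buffer and synchronization stage.
            device.received.append(frame.payload)
            return hops
        # Otherwise the device forwards the frame to the next node.
    # Assumption: a frame that matches no device is discarded.
    return None


if __name__ == "__main__":
    ring = [RingDevice(uid) for uid in (10, 20, 30, 40)]
    print(send(ring, 0, Frame(target_uid=30, payload=b"gradients")))
    # A new device joins between the first and second nodes.
    ring.insert(1, RingDevice(15))
    print(send(ring, 0, Frame(target_uid=30, payload=b"parameters")))
```

Inserting a device into the list stands in for the cluster expansion step: later frames simply take one more hop, and the sender's logic does not change.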
[0049] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and inventive concept of the present invention, should be covered within the scope of protection of the present invention.
Claims
1. A MicroLED optical interconnect ring network system for AI large-scale model training, characterized in that it comprises:
a MicroLED light-emitting module, which converts data from electrical signals into parallel optical signals and emits them efficiently, and which includes a parallel driving circuit and a MicroLED array; the parallel driving circuit adopts an arrayed parallel driving design and a direct current driving method; the MicroLED array adopts a high-density MicroLED chip array and is driven by direct current so that each light-emitting unit can be controlled independently;
an optical coupling structure, which includes a microlens array, a coupling interface, and a multi-core imaging fiber to achieve efficient coupling of optical signals; the microlens array is composed of multiple microlenses arranged in a planar array; each multi-core imaging fiber contains hundreds of fiber cores arranged in a specific pattern; each MicroLED light-emitting unit corresponds to one core of the multi-core imaging fiber and one microlens;
a MicroPD optical receiving module, which works in a pair with the MicroLED light-emitting module, converts the received parallel optical signals into electrical signals and outputs them, and includes an optical receiving unit, a transimpedance amplifier array, and a parallel output interface; the optical receiving unit adopts a high-density MicroPD array matched with the MicroLED array to receive optical signals and convert them into weak electrical signals; the transimpedance amplifier array is integrated with the matched MicroPD array to amplify the weak electrical signals with low noise and convert them into stable voltage signals; the parallel output interface outputs the amplified parallel electrical signals directly to the ring network data processing core;
a ring network data processing core, which is the core control unit responsible for the routing, filtering, processing, and scheduling of data within the ring network, and which includes multi-channel data input / output, UID identification and data filtering, and a serial-to-parallel conversion (SerDes) adaptation unit; the multi-channel data input / output supports parallel access from multiple AI training nodes; the UID identification and data filtering has a built-in UID checking unit that extracts the target node UID appended to the data frame and compares it with the UIDs of other nodes in the system and the ring network; the SerDes adaptation unit supports bidirectional conversion between serial and parallel data to meet the serial interface requirements of some traditional training nodes;
a caching and synchronization module, which includes a data receiving unit, a data processing unit, a high-speed storage unit, and a data synchronization unit to ensure efficient data transmission and consistency; and
a ring network connection and expansion module, which, based on the interconnected ring network device logic, constructs a flexible and scalable ring network topology and supports dynamic expansion of the cluster size, and which includes ring network ports, a topology expansion function, and a node access / exit mechanism; at least two ring network ports are provided, connected to adjacent node devices via multi-core imaging optical fibers to form a closed ring network topology; the topology expansion function supports multi-ring network interconnection, whereby adding ring network gateway devices connects multiple independent ring networks into a larger-scale mesh network topology; the node access / exit mechanism supports hot-swapping, allowing training nodes to dynamically access or exit the ring network.
2. The MicroLED optical interconnect ring network system for AI large-scale model training according to claim 1, characterized in that: the UID identification and data filtering performs the comparison as follows: if the target UID matches the UID of the device, the data is received and processed; if the target UID differs from the UID of the device, the data is quickly forwarded to the next node through the ring network connection chain until it reaches the target node; if an invalid UID or redundant data is detected, the data is discarded directly so that invalid transmissions do not occupy bandwidth.
3. The MicroLED optical interconnect ring network system for AI large-scale model training according to claim 2, characterized in that: the SerDes adaptation unit first identifies, through an automatic detection mechanism, the data transmission interface type supported by the traditional training node connected to it; once the interface type is determined, the SerDes adaptation unit automatically adjusts its working mode according to the detection result.
4. The MicroLED optical interconnect ring network system for AI large-scale model training according to claim 1, characterized in that: the data receiving unit obtains raw data streams from external systems or network interfaces; the unit has a dedicated data parsing function that identifies different data formats and performs preliminary preprocessing; the preprocessed valid information is passed to the data processing unit of the next stage.
5. The MicroLED optical interconnect ring network system for AI large-scale model training according to claim 4, characterized in that: the data processing unit further processes the received information, including but not limited to data cleaning, format conversion, and any necessary encryption / decryption; in addition, to improve query efficiency in subsequent steps, the unit may also create basic indexes; the processed data is then sent to the high-speed storage unit.
6. The MicroLED optical interconnect ring network system for AI large-scale model training according to claim 5, characterized in that: the high-speed storage unit temporarily stores data that has been processed but not yet synchronized; it uses high-speed memory as the main medium to achieve fast read and write access and to temporarily store the received training data.
7. The MicroLED optical interconnect ring network system for AI large-scale model training according to claim 6, characterized in that: the data synchronization unit is activated when the trigger conditions are met and copies the latest state in the cache to the target database or other persistent storage; throughout synchronization, strict error detection and recovery measures are applied so that the task completes even when the network is unstable.
8. The MicroLED optical interconnect ring network system for AI large-scale model training according to claim 7, characterized in that: the caching and synchronization module also includes a timing calibration unit and a global synchronization clock; the timing calibration unit has a built-in adjustable delay chain to ensure that the data received by all nodes in the ring network has consistent timing, thus meeting the training parameter synchronization requirements.
9. The MicroLED optical interconnect ring network system for AI large-scale model training according to claim 8, characterized in that: the global synchronization clock receives an external reference clock signal, generates a unified synchronization clock for the ring network, and distributes it to all node devices through the accompanying synchronization clock output module to achieve timing synchronization of data transmission and processing, thereby improving the efficiency of parallel training.
10. The MicroLED optical interconnect ring network system for AI large-scale model training according to claim 1, characterized in that: the emission wavelength of the MicroLED light-emitting unit is in the 850nm near-infrared band.