Instruction processing method, information processing unit, chip and chip system

By introducing an indirect interaction mode into a multi-chip system and utilizing the information processing unit for data encapsulation and parsing, the problem of limited cross-chip memory access is solved, and efficient and flexible instruction processing is achieved.

CN122308912A · Pending · Publication Date: 2026-06-30 · BEIJING XINGYUN INTEGRATED CIRCUIT CO LTD +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING XINGYUN INTEGRATED CIRCUIT CO LTD
Filing Date
2026-03-13
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

In multi-chip or multi-accelerator computing systems, the resource allocation mechanism of the server platform or switching unit keeps the allocatable BAR space small, which restricts cross-chip storage access. Existing solutions are either costly or introduce additional control overhead and latency.

Method used

An indirect interaction mode is introduced: an information processing unit is provided both on the chip that initiates the interaction and on the chip that is accessed. The initiating unit generates a data packet, the switching unit maps it to the target information processing unit, and that unit parses it, so the large-capacity storage space never has to be mapped directly.

Benefits of technology

It effectively reduces the mapping space usage, improves the applicability, scalability and processing efficiency of instruction processing methods, and reduces control overhead and latency.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122308912A_ABST
Patent Text Reader

Abstract

This application discloses an instruction processing method, an information processing unit, a chip, and a chip system. The instruction processing method introduces instruction-based encapsulation, mapping, and parsing operations, so that when the first chip accesses the storage unit of the second chip, the mapping stage does not need to cover the second chip's large storage address space directly. Instead, the first chip encapsulates a data packet based on the target instruction, the data packet is written to the second information processing unit of the second chip through the mapping of the switching unit, and that unit parses the packet to trigger the actual access to the storage unit. Cross-chip interaction is thus completed while occupying only a small mapping address space. Compared with existing direct-mapping access methods, this effectively reduces mapping space usage, avoids reliance on large BAR spaces or frequent sliding window switching, and improves the applicability, scalability, and processing efficiency of the instruction processing method.

Description

Technical Field

[0001] This application relates to the field of integrated circuit technology, and in particular to an instruction processing method, an information processing unit, a chip, and a chip system.

Background Technology

[0002] In a multi-chip or multi-accelerator computing system, computing unit a (e.g., a GPGPU chip) needs to access memory units frequently. The target memory unit may belong to computing unit a itself, to another computing unit c, or to a memory expansion unit b (e.g., an I/O chip) connected to computing unit a. To support access to memory units other than its own, multiple chips are often connected and access each other through a switching unit (e.g., a PCIe switch). The switching unit can allocate a BAR address space for the connected chips, enabling data exchange between them. For example, a can directly access b's memory unit through the address mapping mechanism of the BAR space.

[0003] Due to limitations in the resource allocation mechanisms of server platforms / switching units, the capacity of the BAR (Base Address Register) space is typically small, making it difficult to cover all the physical memory on b or c. When an instruction accesses data beyond the mapping range of the current BAR, the instruction cannot be mapped directly to the target address of the storage unit, and the access is restricted.

[0004] Existing technologies mainly alleviate the above problems in two ways. First, they rely on high-end server platforms that support larger BAR allocation capabilities to reduce the situation of restricted instruction access. However, this method has high requirements for hardware platforms and significantly increases system costs. Second, they use a sliding window method to segment and map the memory space, allowing BAR mapping to access different storage areas at different times. However, this method requires frequent switching of mapping relationships during access. When the access pattern is discontinuous or changes randomly, it is easy to introduce additional control overhead and instruction processing latency, affecting the overall execution efficiency.

Summary of the Invention

[0005] This application provides an instruction processing method, an information processing unit, a chip, and a chip system, which can effectively reduce the mapping space occupation, avoid dependence on large BAR space or frequent sliding window mechanisms, and thus improve the applicability, scalability, and processing efficiency of the instruction processing method.

[0006] In a first aspect, this application provides an instruction processing method applied to a first information processing unit of a first chip that processes a first target instruction. The instruction processing method includes:

[0007] The first data packet is generated based on the first target instruction, where the first target instruction is the instruction corresponding to the storage unit of the second chip, and the target write address of the first data packet points to the second information processing unit of the second chip. The first data packet is sent to the switching unit so that the first data packet is written into the second information processing unit of the second chip through the mapping of the switching unit. The second information processing unit parses the first data packet and sends the instruction information corresponding to the first target instruction obtained from the parsing to the storage unit of the second chip, thereby enabling the storage unit to execute the first target instruction based on the instruction information.

[0008] In a second aspect, this application provides another instruction processing method, wherein the first target instruction is processed by the first information processing unit of the first chip, and the instruction processing method is applied to the second information processing unit of the second chip that executes the first target instruction. The instruction processing method includes: parsing the first data packet written by the first information processing unit via the switching unit to restore the instruction information corresponding to the first target instruction, wherein the first data packet is generated by the first information processing unit based on the first target instruction, the first target instruction is the instruction corresponding to the storage unit of the second chip, and the target write address of the first data packet points to the second information processing unit of the second chip; and sending the instruction information to the storage unit of the second chip so that the storage unit executes the first target instruction based on the instruction information.

[0009] In a third aspect, this application provides an information processing unit, which includes a first information processing unit or a second information processing unit. The first information processing unit is disposed on the first chip that processes the first target instruction and is used to: encapsulate and generate a first data packet based on the first target instruction, wherein the first target instruction is an instruction corresponding to the storage unit of the second chip, and the target write address of the first data packet points to the second information processing unit of the second chip; and send the first data packet to the switching unit so that the first data packet is written to the second information processing unit of the second chip through the mapping of the switching unit, whereupon the second information processing unit parses the first data packet and sends the instruction information corresponding to the first target instruction obtained from the parsing to the storage unit of the second chip, thereby enabling the storage unit to execute the first target instruction based on the instruction information.
The second information processing unit is disposed in the second chip, which receives and executes the first target instruction processed by the first information processing unit of the first chip. The second information processing unit is used to: parse the first data packet written by the first information processing unit via the switching unit and restore the instruction information corresponding to the first target instruction; and send the instruction information to the storage unit of the second chip, so that the storage unit executes the first target instruction based on the instruction information. Here the first data packet is generated by the first information processing unit based on the first target instruction, the first target instruction is the instruction corresponding to the storage unit of the second chip, and the target write address of the first data packet points to the second information processing unit of the second chip.

[0010] In a fourth aspect, this application provides a chip, which is a first chip. The first chip includes an instruction parsing unit, a first information processing unit, and a first interconnect port; the instruction parsing unit is communicatively connected to the first information processing unit, and the first information processing unit is communicatively connected to the first interconnect port. The instruction parsing unit is used to determine that an instruction is the first target instruction when the instruction points to a preset address range, and to send the first target instruction to the first information processing unit. The first information processing unit is used to encapsulate the received first target instruction into a first data packet and send the first data packet to the first interconnect port; the first target instruction is an instruction corresponding to the storage unit of the second chip, and the target write address of the first data packet points to the second information processing unit of the second chip. The first interconnect port is used to send the first data packet to the connected switching unit, so that the switching unit writes the first data packet to the second information processing unit according to the target write address, thereby enabling the second information processing unit to parse the first data packet and send the instruction information corresponding to the first target instruction obtained from the parsing to the storage unit of the second chip, so that the storage unit can execute the first target instruction.

[0011] In a fifth aspect, this application provides a chip, which is a second chip, including a second interconnect port, a second information processing unit, and a storage unit; the second interconnect port is communicatively connected to the second information processing unit, and the second information processing unit is communicatively connected to the storage unit. The second interconnect port receives the first data packet sent by the connected switching unit and writes the first data packet into the second information processing unit. The first data packet is generated by the first information processing unit based on the first target instruction. The first target instruction is the instruction corresponding to the storage unit of the second chip. The target write address of the first data packet points to the second information processing unit of the second chip. The second information processing unit is used to parse the first data packet written by the second interconnect port and to send the instruction information corresponding to the first target instruction, restored by the parsing, to the storage unit. The storage unit is used to receive the instruction information and execute the first target instruction.

[0012] In a sixth aspect, this application provides a chip system including a switching unit, a first chip as described in the fourth aspect, and a second chip as described in the fifth aspect, wherein the first chip and the second chip communicate via the switching unit.

[0013] The advantages of this application compared to existing technologies are as follows. For two chips that need to interact through a switching unit, corresponding information processing units can be set up according to their respective interaction roles. The chip initiating the interaction is denoted as the first chip, which has a first information processing unit; the chip being accessed is denoted as the second chip, which has a second information processing unit. During the interaction, the first information processing unit generates a first data packet based on a first target instruction. The first target instruction represents the access requirement to the storage unit of the second chip, and the target write address of the first data packet points to the second information processing unit. Based on this, the first information processing unit can write the first data packet to the second information processing unit through the mapping of the switching unit. In this process, the switching unit does not need to be aware of the specific storage address layout of the second chip; it needs to know only the target write address of the first data packet. The second information processing unit parses the first data packet, reconstructs the instruction, and triggers access to the storage unit.

[0014] In this way, the first chip does not need to directly access the large storage space of the second chip through the mapping of the switching unit. It only needs to write the first data packet to the second information processing unit through the mapping of the switching unit. Therefore, cross-chip interaction can be completed by occupying only a small mapping address space. Compared with the existing direct mapping access method, it can effectively reduce the mapping space occupation and improve the applicability, scalability and flexibility of the instruction processing method in multi-chip scenarios.

[0015] It is understood that the beneficial effects of the second to sixth aspects can be found in the relevant descriptions of the first aspect above, and will not be repeated here.

Description of the Drawings

[0016] To more clearly illustrate the technical solutions in the embodiments of this application, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0017] Figure 1 is a flowchart illustrating an instruction processing method provided in an embodiment of this application;
Figure 2 is a diagram illustrating cross-card access by segmenting and mapping storage space using a sliding window method;
Figure 3 is a schematic flowchart of the instruction processing method executed by the second information processing unit provided in the embodiments of this application;
Figure 4 is a schematic diagram of the structure of the first chip provided in an embodiment of this application;
Figure 5 is a schematic diagram of the structure of the second chip provided in an embodiment of this application;
Figure 6 is a schematic diagram of the structure of a chip system provided in an embodiment of this application;
Figure 7 is a schematic diagram of another chip system provided in an embodiment of this application.

Detailed Implementation

[0018] In the following description, specific details such as particular system architectures and techniques are set forth for illustrative purposes and not for limitation, in order to provide a thorough understanding of the embodiments of this application. However, those skilled in the art will understand that this application may also be implemented in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, apparatuses, circuits, and methods have been omitted so as not to obscure the description of this application with unnecessary detail.

[0019] The modules and units mentioned in this application may include one or more hardware circuits, including but not limited to: application-specific integrated circuits, digital signal processors, field-programmable gate arrays, discrete logic circuits, state machines, and any combination of the foregoing circuits; these circuits are specifically designed, configured and interconnected to perform one or more specific functions disclosed in this application.

[0020] In multi-chip or multi-accelerator unit computing systems, chips are typically interconnected via switching units to support access to memory units outside their own, and cross-chip memory access is achieved through BAR address mapping mechanisms. However, limited by the BAR space capacity that can be allocated by the server platform / switching unit, the switching unit struggles to cover the large-capacity storage resources of the target chip or storage expansion unit, resulting in access restriction when the instruction access range exceeds the mapped space. Existing solutions either rely on high-end hardware platforms with larger BAR allocation capabilities, which are costly and have limited applicability; or they use a sliding window approach to segment and map memory space, but this requires frequent adjustments to the mapping relationship when access patterns are discontinuous or change randomly, easily introducing additional control overhead and access latency, affecting overall execution efficiency.

[0021] To address this problem, this application proposes an instruction processing method applied to a first information processing unit of a first chip. The first information processing unit encapsulates data based on a first target instruction, generates a first data packet pointing to a second information processing unit of a second chip, and sends the first data packet to a switching unit, which acts as the bridge between the two chips. The switching unit maps the first data packet to the second information processing unit, which then parses it and triggers actual access to the memory unit. This instruction processing method differs from the traditional direct interaction mode in that it interacts indirectly. During the interaction, the switching unit is only responsible for handling the address mapping between the two information processing units, without directly mapping specific target memory units, thus significantly reducing the required mapping space. Furthermore, since frequent switching of mapping relationships is not required during the interaction, control overhead and instruction processing latency are also reduced, improving instruction processing efficiency. The instruction processing method proposed in this application will be described below through specific embodiments.

[0022] To illustrate the technical solutions proposed in this application, the following description will use the first information processing unit as the execution subject to illustrate various embodiments.

[0023] Figure 1 shows a schematic flowchart of the instruction processing method provided in this application. The instruction processing method includes:
Step 110: The first information processing unit encapsulates and generates the first data packet based on the first target instruction.

[0024] Step 120: The first information processing unit sends the first data packet to the switching unit so that the first data packet is written into the second information processing unit of the second chip through the mapping of the switching unit, and the second information processing unit parses the first data packet and sends the instruction information corresponding to the first target instruction obtained from the parsing to the storage unit of the second chip, thereby enabling the storage unit to execute the first target instruction based on the instruction information.

[0025] For two chips that need to interact via a switching unit, various scenarios are possible. For example, a GPGPU chip on a compute card may need to access storage resources on a corresponding I/O chip on the same card; two GPGPU chips on different compute cards may need to exchange data; or a GPGPU chip may access I/O chips on other compute cards. Typically, cross-chip interaction is achieved through a switching unit (such as a PCIe switch). Existing solutions either rely on high-configuration server platforms with greater BAR resource allocation capabilities to cover the target storage space, resulting in high hardware costs and limited applicability; or, as shown in Figure 2, segment and map the storage space using a sliding window method. With the latter, the mapping relationship needs to be switched frequently during access, and when the access pattern is discontinuous or changes randomly, it is easy to introduce additional control overhead and access latency.
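
The cost of the sliding window approach can be illustrated with a small simulation. This is only a sketch under stated assumptions: a single window, a hypothetical 256 MiB window size, a hypothetical 64 GiB remote memory, and a made-up helper name `count_remaps`; none of these values come from the application.

```python
import random


def count_remaps(addresses, window_size):
    """Count how often a single sliding window must be re-pointed.

    The window covers [k * window_size, (k + 1) * window_size) for some k;
    any access outside the currently mapped window forces one remap.
    """
    current_window = None
    remaps = 0
    for addr in addresses:
        window = addr // window_size
        if window != current_window:
            current_window = window
            remaps += 1
    return remaps


WINDOW = 256 * 2**20   # hypothetical 256 MiB BAR window
MEMORY = 64 * 2**30    # hypothetical 64 GiB of remote memory
STRIDE = 4096

# Sequential accesses stay inside one window for a long time.
sequential = [i * STRIDE for i in range(10_000)]

# Scattered accesses land in a different window almost every time.
rng = random.Random(0)
scattered = [rng.randrange(MEMORY) for _ in range(10_000)]

print("sequential remaps:", count_remaps(sequential, WINDOW))
print("scattered remaps:", count_remaps(scattered, WINDOW))
```

Under these assumptions the sequential trace needs only the initial mapping, while the scattered trace remaps on nearly every access, which is the control overhead the paragraph above describes.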

[0026] Clearly, the root cause of the aforementioned limited interaction problem lies in the limited BAR space capacity allocated to each chip by the server platform / switching unit. Existing solutions all revolve around how to cover more storage space within the limited BAR space, which is insufficient to fundamentally eliminate the access limitation problem and all have obvious drawbacks. Based on this, this application changes the traditional design approach, no longer attempting to expand or frequently reuse the BAR space, but instead reducing the BAR space occupancy requirements during interaction, thereby reducing the problem of cross-chip interaction being limited by the BAR space size.

[0027] To achieve the above objectives, this application introduces an indirect interaction mode that differs from traditional direct interaction. Based on the roles of the two chips in the interaction process, corresponding information processing units are set up, with the switching unit handling the address mapping between the information processing units. For large-scale interconnected chip systems, larger system storage capacity improves overall system performance and suits scenarios such as large-scale model computation. In principle, therefore, the more storage on other chips the first chip can access, the better, so the accessible storage space can be very large, for example 32 TB or more. In practice, however, the number of chips to be accessed is small relative to the storage capacity of each chip, and each accessed chip needs only one corresponding information processing unit. Compared with direct mapping of large-capacity storage space, the address space required by the information processing units used for encapsulation / parsing during information interaction is therefore far smaller, which significantly reduces the BAR space requirement and removes the bottleneck of traditional solutions limited by BAR resources.
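
A back-of-the-envelope comparison makes the saving concrete. The sketch below assumes 32 TiB of remote storage, 8 accessed chips, and a 64 KiB window per information processing unit; the chip count and the window size are illustrative assumptions, not figures from the application.

```python
TIB = 2**40
KIB = 2**10


def direct_mapping_bytes(remote_capacity_bytes):
    """Direct mapping must cover every byte of remote storage."""
    return remote_capacity_bytes


def indirect_mapping_bytes(num_accessed_chips, unit_window_bytes):
    """Indirect mapping only covers one receive window per accessed chip."""
    return num_accessed_chips * unit_window_bytes


direct = direct_mapping_bytes(32 * TIB)        # 32 TiB of remote storage
indirect = indirect_mapping_bytes(8, 64 * KIB)  # assumed: 8 chips, 64 KiB each

print("direct mapping  :", direct, "bytes")
print("indirect mapping:", indirect, "bytes")
print("reduction factor:", direct // indirect)
```

The mapped space in the indirect case grows with the number of accessed chips rather than with their storage capacity, which is the point the paragraph above makes.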

[0028] Building upon this foundation, this application provides a unified encapsulation of the target instructions involved in cross-chip interaction. Instruction semantics, memory access requirements, and control information are encapsulated into a data format that can be transmitted via the switching unit. This changes the object mapped by the switching unit from a specific memory address to an information processing unit address. Based on this change, the entire chip system can use an "encapsulation-mapping-parsing" processing mode to ensure that instructions are accurately restored and triggered for execution upon arrival at the target chip. This achieves a complete, reliable, and highly scalable cross-chip instruction processing flow without relying on large-scale BAR mapping.

[0029] Specifically, the chip initiating the interaction is denoted as the first chip, which has a first information processing unit; the chip being accessed is denoted as the second chip, which has a second information processing unit. For example, when the first chip has a memory access requirement for the second chip's memory unit, it generates a first target instruction. The first target instruction is any memory access instruction targeting that memory unit. This memory access instruction can be generated by the first chip itself or derived from instruction information received from other chips. For example, the first chip can receive a data access request from a central processing unit (CPU) transmitted by a switching unit (such as a PCIe switch) and determine that the memory access instruction points to the second chip connected to the first chip. The first information processing unit encapsulates data according to the first target instruction, generating a first data packet. This not only unifies the originally scattered instruction semantics and memory access information into a data format that can be transmitted via the switching unit, but also makes it possible to change the object mapped by the switching unit.

[0030] The target write address of the first data packet points to the second information processing unit of the second chip. After the first information processing unit sends the first data packet to the switching unit, the switching unit can write the first data packet to the corresponding address of the second information processing unit based on the pre-established mapping relationship. The second information processing unit parses the first data packet, recovers the instruction information corresponding to the first target instruction, and sends the instruction information to the storage unit of the second chip to trigger the storage unit to execute the first target instruction.
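
The flow described above can be sketched end to end as a toy model. All class names, the packet fields, and the address `0x9000_0000` are hypothetical, and returning read data through the call stack is a simplification that the text above does not describe; the sketch only shows that the switching unit needs a single mapping entry per second information processing unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    target_write_address: int   # points at the second info unit, not at memory
    actual_access_address: int  # address inside the second chip's storage unit
    opcode: str                 # "read" or "write"
    length: int
    payload: bytes = b""


class StorageUnit:
    def __init__(self, size):
        self.mem = bytearray(size)

    def execute(self, opcode, address, length, payload):
        if opcode == "write":
            if len(payload) != length:
                raise ValueError("payload length does not match length field")
            self.mem[address:address + length] = payload
            return None
        if opcode == "read":
            return bytes(self.mem[address:address + length])
        raise ValueError(f"unsupported opcode: {opcode}")


class SecondInfoUnit:
    """Parses a packet and forwards the recovered instruction to storage."""

    def __init__(self, storage):
        self.storage = storage

    def receive(self, packet):
        return self.storage.execute(
            packet.opcode, packet.actual_access_address,
            packet.length, packet.payload)


class SwitchingUnit:
    """Knows only which info unit sits behind each target write address."""

    def __init__(self):
        self.routes = {}

    def map(self, write_address, info_unit):
        self.routes[write_address] = info_unit

    def write(self, packet):
        return self.routes[packet.target_write_address].receive(packet)


class FirstInfoUnit:
    """Encapsulates an instruction and hands the packet to the switch."""

    def __init__(self, switch, target_write_address):
        self.switch = switch
        self.target_write_address = target_write_address

    def issue(self, opcode, address, length, payload=b""):
        packet = Packet(self.target_write_address, address,
                        opcode, length, payload)
        return self.switch.write(packet)


UNIT2_ADDR = 0x9000_0000  # hypothetical write address of the second info unit
storage = StorageUnit(1024)
switch = SwitchingUnit()
switch.map(UNIT2_ADDR, SecondInfoUnit(storage))
unit1 = FirstInfoUnit(switch, UNIT2_ADDR)

unit1.issue("write", 0x10, 4, b"\x01\x02\x03\x04")
data = unit1.issue("read", 0x10, 4)
print(data)
```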

[0031] In this embodiment, the first chip does not need to directly access the large storage space of the second chip through the mapping of the switching unit. It only needs to write the first data packet to the second information processing unit through the mapping of the switching unit. Therefore, cross-chip interaction can be completed even with a small mapping address space. Compared with the existing direct mapping access method, it can effectively reduce the mapping space occupation and improve the applicability, scalability and flexibility of the instruction processing method in multi-chip scenarios.

[0032] In some embodiments, to ensure that an instruction processing step can be completed smoothly and accurately through the "encapsulation-mapping-parsing" processing mode, the instruction information may include the actual access address of the storage unit and the first target instruction. Specifically, the first information processing unit encapsulates and generates a first data packet based on the first target instruction, including:
Step A1: The first information processing unit converts the offset address corresponding to the first target instruction into the actual access address of the storage unit based on the preset address mapping rules and the accessible address set of the first chip.

[0033] To ensure that the second information processing unit can accurately trigger the execution of the first target instruction by the target memory unit after parsing and restoring the instruction information, the actual access address of the memory unit corresponding to the first target instruction can be determined during the instruction encapsulation stage. For the first chip, the reference addresses of the memory units of other chips that it can access can be uniformly regarded as a set of accessible addresses. Based on this, the first information processing unit combines the offset address carried in the first target instruction with the preset address mapping rules to parse and convert the address information, thereby mapping the offset address, which only has logical meaning, to the actual access address corresponding to the specific memory unit, ensuring the accuracy and executability of cross-chip memory access operations.

[0034] For example, the first chip has an address space storage unit, which can use several bits to represent the address space corresponding to the total storage capacity accessible to an instruction (including the first chip's own storage units and the storage units of other chips, such as the second chip). If the address information in the first target instruction points to the address space corresponding to the storage unit of another chip, the access request requires cross-chip access, and the first target instruction can be sent to the first information processing unit. For example, a total storage capacity of 32 TB can be represented using 45 bits. Assuming that the local storage unit capacity of the first chip is 8 TB and the rest is the total capacity of the storage units accessible across chips, the local address range can be 0x000000000000 to 0x07FFFFFFFFFF, and the cross-chip address range can be 0x080000000000 to 0x1FFFFFFFFFFF. If the access address of an access request falls in the cross-chip address range, the access request can be sent as the first target instruction to the first information processing unit.
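
The address split in this example can be checked with a few lines of code. The sketch below uses the 45-bit space and the 8 TB local capacity from the example (taken as binary units, which is what the hexadecimal ranges imply); the function name `classify` is illustrative.

```python
ADDRESS_BITS = 45
TOTAL_BYTES = 1 << ADDRESS_BITS   # 32 TiB addressable in total
LOCAL_BYTES = 8 * 2**40           # 8 TiB of local storage on the first chip

LOCAL_RANGE = range(0, LOCAL_BYTES)
CROSS_CHIP_RANGE = range(LOCAL_BYTES, TOTAL_BYTES)


def classify(address):
    """Decide whether an access stays local or becomes a first target instruction."""
    if address in LOCAL_RANGE:
        return "local"
    if address in CROSS_CHIP_RANGE:
        return "cross-chip"
    raise ValueError(f"address {address:#x} is outside the 45-bit space")


for addr in (0x07FFFFFFFFFF, 0x080000000000, 0x1FFFFFFFFFFF):
    print(f"{addr:#014x} -> {classify(addr)}")
```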

[0035] For example, the first information processing unit includes a configuration register that provides the base address information of the storage unit. If data needs to be read starting from address A, and A points to another chip, the first information processing unit can use address A together with the address ranges of each chip and each storage unit to determine the target chip, the target storage unit on that chip, the starting address of the target storage unit, and the write address of the second information processing unit corresponding to the target chip. Based on the interconnect architecture and the communication connection between the first chip and the target chip, the corresponding switching unit, and even the specific switching port (such as a PCIe port), can be determined. Specifically, the first information processing unit can extract the corresponding offset address from address A. For example, when address A is 0xFFF0 (the offset that remains after the first information processing unit removes the high-order address information), and the base address of the target storage unit in the corresponding configuration register is 0x100_0000, the actual access address used when the first information processing unit generates and forwards the first data packet is the sum of the base address and the offset address, i.e., 0x100_FFF0.
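
A minimal sketch of this base-plus-offset translation follows. The `RegionConfig` record, the window start and size, and the write address `0x9000_0000` are assumptions made for illustration; only the base address 0x100_0000, the offset 0xFFF0, and the result 0x100_FFF0 come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionConfig:
    """One hypothetical configuration-register entry of the first info unit."""
    window_start: int          # first address of the region as seen by the first chip
    window_size: int           # size of the region in bytes
    storage_base: int          # base address of the target storage unit
    target_write_address: int  # where the second info unit receives packets


def translate(address, regions):
    """Return (actual_access_address, target_write_address) for an address."""
    for region in regions:
        offset = address - region.window_start
        if 0 <= offset < region.window_size:
            return region.storage_base + offset, region.target_write_address
    raise LookupError(f"address {address:#x} matches no configured region")


regions = [
    RegionConfig(window_start=0x0800_0000_0000, window_size=0x1_0000_0000,
                 storage_base=0x100_0000, target_write_address=0x9000_0000),
]

# Stripping the high-order bits of this address leaves the offset 0xFFF0.
actual, write_addr = translate(0x0800_0000_FFF0, regions)
print(hex(actual), hex(write_addr))
```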

[0036] Step A2: The first information processing unit encapsulates the actual access address, the first target instruction, and the target write address to obtain the first data packet.

[0037] In addition to the actual access address, the first data packet may also encapsulate the target write address of the second information processing unit determined by the second chip and the specific instruction of the first target instruction itself, so that after the second information processing unit completes the parsing, it can send the corresponding instruction information to the storage unit, thereby triggering the storage unit to execute the first target instruction. For example, the specific instruction may be "read data of length B".
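
One possible byte layout for such a packet is sketched below with Python's `struct` module. The field order, field widths, little-endian encoding, and opcode values are all assumptions; the application does not specify a wire format.

```python
import struct

# Assumed header: target write address (u64), actual access address (u64),
# opcode (u8), length (u32); little-endian, no padding.
HEADER = struct.Struct("<QQBI")
OP_READ, OP_WRITE = 0, 1


def encapsulate(target_write_address, actual_access_address,
                opcode, length, payload=b""):
    """Build a first data packet: fixed header followed by optional write data."""
    header = HEADER.pack(target_write_address, actual_access_address,
                         opcode, length)
    return header + payload


def parse(packet):
    """Recover the instruction information on the second info unit side."""
    target, address, opcode, length = HEADER.unpack_from(packet)
    return {
        "target_write_address": target,
        "actual_access_address": address,
        "opcode": opcode,
        "length": length,
        "payload": packet[HEADER.size:],
    }


# "Read data of length B" with B = 64, aimed at the address from step A1.
packet = encapsulate(0x9000_0000, 0x100_FFF0, OP_READ, 64)
info = parse(packet)
print(len(packet), info)
```

Because the header has a fixed size, the second information processing unit can parse it without knowing anything about the first chip's address map.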

[0038] In this embodiment, the first information processing unit determines the physical memory access information of the specific target memory unit by parsing and determining the offset address to the actual access address. This avoids complex address derivation on the second chip side, reducing parsing complexity and error risk. Furthermore, the actual access address, target instruction, and target write address are uniformly encapsulated into a first data packet, so that the switching unit only needs to forward based on a small number of fixed information processing unit addresses, thereby effectively improving the scalability, reliability, and interaction efficiency of cross-chip memory access.
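Step A2 can be illustrated with a small encapsulation sketch. The text does not specify a packet layout, so the wire format below (field order, field widths, opcode values) and the example addresses are assumptions chosen only to show the three encapsulated items travelling together.

```python
import struct
from dataclasses import dataclass

# Hypothetical wire format (not specified in the text), little-endian, unpadded:
# opcode (1 byte), length (4 bytes), actual access address (8 bytes),
# target write address (8 bytes).
PACKET_FORMAT = "<BIQQ"
OP_READ, OP_WRITE = 0, 1


@dataclass(frozen=True)
class FirstDataPacket:
    opcode: int                 # the first target instruction, e.g. a read
    length: int                 # "read data of length B"
    actual_access_address: int  # base address + offset
    target_write_address: int   # window of the second information processing unit

    def encode(self) -> bytes:
        return struct.pack(PACKET_FORMAT, self.opcode, self.length,
                           self.actual_access_address, self.target_write_address)

    @classmethod
    def decode(cls, raw: bytes) -> "FirstDataPacket":
        return cls(*struct.unpack(PACKET_FORMAT, raw))


# 0x9000_0000 is an invented window address for the receiving unit.
pkt = FirstDataPacket(OP_READ, 64, 0x100_FFF0, 0x9000_0000)
raw = pkt.encode()
```

Because the packet is an opaque byte string, anything between the two information processing units can forward it without understanding its fields.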

[0039] In some embodiments, to ensure that the first information processing unit can accurately write the first data packet to the second information processing unit through the mapping of the switching unit, the first information processing unit sends the first data packet to the switching unit so that the switching unit writes the first data packet to the second information processing unit of the second chip, including: Step B1: The first information processing unit initiates a write operation to the target write address according to the interconnection topology of the first chip, the second chip, and the switching unit, and sends the first data packet as write data, so that the switching unit routes the write operation to the second information processing unit according to the mapping relationship between the pre-allocated BAR space and the second information processing unit.

[0040] In a typical multi-chip system, there can be one or more switching units, the specific number depending on the system topology. For example, a single compute card may deploy 1-2 PCIe switches, while in multi-compute card or cascaded interconnect scenarios, there may be multiple levels or multiple switching units. Regardless of the number of switching units, after the system powers on, each switching unit and its downstream devices need to be enumerated according to the bus protocol specifications. The essential function of enumeration is to allow the host to uniformly discover all devices in the system, determine the hierarchical relationship, resource requirements, and allocable address space of each device, and complete the unified allocation and initialization of resources such as BARs, thereby avoiding address conflicts and ensuring the addressability and consistency of the entire system.

[0041] Based on this, after the system powers on and completes the enumeration of the switching units, the host allocates and establishes a corresponding BAR address space for the information processing unit of the second chip according to the enumeration results. In this embodiment, the BAR address does not directly map to large-capacity storage resources, but serves as the access entry point for the second information processing unit on the first chip side, establishing a stable mapping relationship with the second information processing unit through the address forwarding table of the switching unit. The first information processing unit can accurately determine the BAR mapping window of the second information processing unit based on the global address view, BAR mapping window, and mapping relationship of each information processing unit formed during the enumeration phase, as well as the identified target chip, thereby providing a clear, reliable, and dynamically adjustable addressing basis for subsequent cross-chip data forwarding and instruction interaction.

[0042] After obtaining the BAR mapping window, the first information processing unit initiates a write operation using the address of that window as the target write address, and sends the pre-encapsulated first data packet to the switching unit as the write data. The switching unit does not need to parse the data content; it only routes and forwards the write request based on the pre-established mapping relationship between the target write address and the second information processing unit, and finally writes the first data packet to the second information processing unit, which then completes the parsing and the subsequent instruction execution process.

[0043] In this embodiment, the first information processing unit abstracts cross-chip interaction into a standard write operation on the BAR mapping window of the second information processing unit. This allows the switching unit to handle only the mapping and forwarding of a fixed, small-scale address space, without having to participate in the mapping management of a large-capacity storage space. This significantly reduces BAR resource consumption and switching unit complexity, while leveraging mature BAR access and write operation mechanisms to improve the stability, scalability, and versatility of cross-chip instruction transmission.
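The forwarding behaviour described above, in which the switching unit routes by address and never inspects the payload, can be sketched as follows. The class names, window bases, and the 4 KiB window size are assumptions for illustration, not values from the application.

```python
from dataclasses import dataclass, field


@dataclass
class InfoUnit:
    """Stand-in for a receiving information processing unit (a small buffer)."""
    name: str
    inbox: list = field(default_factory=list)


@dataclass
class BarWindow:
    base: int
    size: int
    unit: InfoUnit


class SwitchingUnit:
    """Forwards writes by address only; the payload is never inspected."""

    def __init__(self, windows):
        self.windows = list(windows)

    def write(self, address: int, payload: bytes) -> str:
        for window in self.windows:
            if window.base <= address < window.base + window.size:
                window.unit.inbox.append((address - window.base, payload))
                return window.unit.name
        raise ValueError(f"no BAR window maps address {address:#x}")


unit_b = InfoUnit("chip2.info_unit")
unit_c = InfoUnit("chip3.info_unit")
switch = SwitchingUnit([
    BarWindow(0x9000_0000, 0x1000, unit_b),  # small, fixed window per unit
    BarWindow(0x9000_1000, 0x1000, unit_c),
])
routed_to = switch.write(0x9000_0000, b"first data packet")
```

The mapping table holds one small window per information processing unit, independent of how much storage sits behind each unit.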

[0044] In some embodiments, to ensure the clarity and traceability of instruction execution status feedback or data return paths, the first data packet may also encapsulate the target receiving address of the first information processing unit. Based on this receiving address, the second information processing unit can re-encapsulate the relevant information after obtaining the execution result or return data, and accurately transmit it back to the first information processing unit through the switching unit, thereby realizing the effective association between requests and responses during cross-chip interaction. When the first target instruction is a read instruction, after the first information processing unit sends the first data packet to the switching unit, the method further includes: Step C1: The first information processing unit receives and parses the second data packet written by the second information processing unit through the switching unit, and restores the target data corresponding to the read instruction.

[0045] Since the target receiving address corresponding to the first information processing unit is pre-encapsulated in the first data packet, the second information processing unit does not need to additionally query or maintain complex backhaul mapping relationships. It only needs to use the target receiving address as the target write address to accurately route the second data packet to the first information processing unit by means of the existing address mapping mechanism of the switching unit.

[0046] Based on this, after parsing the first target instruction, the second information processing unit can initiate a corresponding read operation to the storage unit, enabling the storage unit to retrieve the target data based on the read instruction and return the target data to the second information processing unit. The second information processing unit then encapsulates the target data, generates a second data packet, and writes the second data packet to the first information processing unit through the switching unit.

[0047] After receiving the second data packet written by the switching unit, the first information processing unit parses the data packet content and reconstructs the target data corresponding to the read instruction, thus completing a closed loop of a cross-chip read operation. By encapsulating the target receiving address in the first data packet, the return path is determined at the instruction initiation stage, realizing an interaction method where the request carries loop information. This design avoids the second information processing unit dynamically maintaining a return address table for different requests, reducing state saving and control logic complexity, while ensuring a stable and predictable data feedback path, which is beneficial to improving the reliability of cross-chip interaction and overall processing efficiency.

[0048] In some embodiments, if the first target instruction is a write instruction, the storage unit can, after completing the write operation, report the corresponding instruction execution status to the second information processing unit based on the write result, such as write success, write failure, or exception type. The second information processing unit generates a third data packet based on this execution status, and similarly uses the target receiving address carried in the first data packet as the target write address for the return transmission. Then, it writes the third data packet to the first information processing unit through the mapping of the switching unit. After receiving and parsing the third data packet, the first information processing unit can determine the actual execution result of the write instruction, thereby achieving status awareness and result confirmation for cross-chip write operations.

[0049] By introducing an execution status feedback mechanism in write scenarios, the interaction mode not only supports unidirectional data transmission but also forms a complete instruction processing closed loop of "initiating an instruction—executing an instruction—feedback on execution status," ensuring the controllability and traceability of write operations. Simultaneously, the unified adoption of BAR-mapped address feedback avoids additional state synchronization and channel management overhead, facilitating highly reliable and low-complexity cross-chip write interactions in multi-chip systems.
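The two feedback paths (returned data for reads, execution status for writes) can be sketched together. The dictionary fields, status strings, and reply address below are assumptions for illustration; the point shown is that the reply address comes from the request itself, so the responder keeps no return-address table.

```python
from typing import Dict, Tuple

# All field names and status codes are assumptions for illustration.
STATUS_OK, STATUS_FAIL = "write_ok", "write_failed"


def handle_request(request: Dict, memory: bytearray) -> Tuple[int, Dict]:
    """Execute a parsed request locally and build the packet to send back.

    Returns (target write address for the reply, reply packet).
    """
    addr, reply_to = request["address"], request["reply_address"]
    if request["op"] == "read":
        data = bytes(memory[addr:addr + request["length"]])
        return reply_to, {"kind": "second_data_packet", "data": data}
    # Write: report success or failure as the execution status.
    payload = request["data"]
    if addr + len(payload) > len(memory):
        return reply_to, {"kind": "third_data_packet", "status": STATUS_FAIL}
    memory[addr:addr + len(payload)] = payload
    return reply_to, {"kind": "third_data_packet", "status": STATUS_OK}


memory = bytearray(16)
REPLY = 0x8000_0000  # invented receiving address of the first unit
write_reply = handle_request(
    {"op": "write", "address": 4, "data": b"\xAA\xBB", "reply_address": REPLY},
    memory)
read_reply = handle_request(
    {"op": "read", "address": 4, "length": 2, "reply_address": REPLY},
    memory)
bad_reply = handle_request(
    {"op": "write", "address": 15, "data": b"\x01\x02", "reply_address": REPLY},
    memory)
```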

[0050] In some embodiments, if both the first chip and the second chip are processing chips with integrated memory units (e.g., both are GPGPU chips), the interaction is no longer limited to one-way access. The second chip can also act as the initiator, actively sending a second target instruction to the first chip, pointing to the first chip's memory unit. To this end, a third information processing unit is further provided on the first chip, and a fourth information processing unit is provided on the second chip.

[0051] In this two-way interactive scenario, when the second chip generates a second target instruction for the storage unit of the first chip, the fourth information processing unit, which functions identically to the first information processing unit, can parse and encapsulate the second target instruction according to the aforementioned processing mechanism, and forward it to the third information processing unit of the first chip via the switching unit. The third information processing unit then parses the received data, reconstructs the corresponding instruction information, and triggers access to the storage unit of the first chip. Thus, any processing chip with storage capabilities can flexibly switch between being the initiator or the passive party at different interaction stages, achieving peer-to-peer and scalable cross-chip memory access and data interaction within a unified indirect interaction framework.

[0052] Corresponding to the instruction processing method executed by the first information processing unit described above, Figure 3 shows a schematic flowchart of an instruction processing method executed by a second information processing unit. The instruction processing method includes: Step 310: The second information processing unit parses the first data packet written by the first information processing unit via the switching unit and restores the instruction information corresponding to the first target instruction.

[0053] The second information processing unit, acting as the receiving-side entry point, parses and processes the first data packet written by the first information processing unit via the mapping of the switching unit. Since the first data packet is uniformly encapsulated and generated by the first information processing unit on the first chip side based on the first target instruction, it already contains instruction semantics, memory access requirements, and necessary control information. The second information processing unit does not need to rely on the switching unit or the first chip's direct perception of the second chip's storage structure; it only needs to unpack and semantically parse the data packet to reconstruct the complete instruction information corresponding to the first target instruction, thereby achieving correct identification and reconstruction of cross-chip instructions.

[0054] Step 320: The second information processing unit sends the instruction information to the storage unit of the second chip, so that the storage unit executes the first target instruction based on the instruction information.

[0055] After restoring the instruction information, the second information processing unit sends the instruction information to the storage unit inside the second chip, so that the storage unit can execute the corresponding first target instruction based on the received instruction information, such as completing a data read or write operation. This process occurs locally on the second chip and no longer relies on the address mapping capability of the switching unit, thereby ensuring the accuracy of instruction execution and the efficiency of storage access.

[0056] In this embodiment, through the above two steps, a cross-chip memory access instruction is decomposed into two stages: "data packet parsing and restoration" and "local instruction execution." As a result, the switching unit only needs to perform data forwarding and address mapping between information processing units, without needing to map the address space of large-capacity storage units. This approach avoids large-scale storage space mapping while achieving reliable transmission and localized execution of instruction semantics. It not only significantly reduces dependence on BAR space but also improves the flexibility, scalability, and overall execution efficiency of cross-chip interaction.
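Steps 310 and 320 can be sketched as two small functions on the receiving side. The header layout, opcode values, and the `StorageUnit` stand-in are assumptions; the text does not define a packet format.

```python
import struct

HEADER = "<BIQ"  # hypothetical: opcode, length, actual access address
OP_READ, OP_WRITE = 0, 1


class StorageUnit:
    """Stand-in for the second chip's local storage unit."""

    def __init__(self, size: int):
        self.cells = bytearray(size)

    def execute(self, info: dict):
        addr, length = info["address"], info["length"]
        if info["op"] == OP_READ:
            return bytes(self.cells[addr:addr + length])
        self.cells[addr:addr + length] = info["data"]
        return None


def step_310_parse(raw: bytes) -> dict:
    """Restore the instruction information from the first data packet."""
    size = struct.calcsize(HEADER)
    op, length, address = struct.unpack(HEADER, raw[:size])
    return {"op": op, "length": length, "address": address, "data": raw[size:]}


def step_320_execute(info: dict, storage: StorageUnit):
    """Hand the instruction information to the local storage unit."""
    return storage.execute(info)


storage = StorageUnit(32)
write_pkt = struct.pack(HEADER, OP_WRITE, 3, 8) + b"abc"
read_pkt = struct.pack(HEADER, OP_READ, 3, 8)
step_320_execute(step_310_parse(write_pkt), storage)
read_back = step_320_execute(step_310_parse(read_pkt), storage)
```

Nothing in either step touches the switching unit: once the packet has landed, execution is entirely local to the second chip.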

[0057] In some embodiments, the encapsulation process of the first data packet is completed by the first information processing unit, and the specific implementation of the first data packet being written to the second information processing unit through the switching unit has been described in detail in the foregoing embodiments, and will not be repeated here.

[0058] In some embodiments, when the first target instruction is a read instruction, after sending the instruction information to the storage unit of the second chip, the method further includes: Step D1: The second information processing unit receives the target data read by the storage unit.

[0059] After the first target instruction is executed by the storage unit, the second information processing unit receives the target data returned by the storage unit. This target data is the data obtained by the storage unit after executing the read instruction, ensuring that the subsequent feedback content corresponds one-to-one with the original memory access request, and providing a data basis for result return.

[0060] Step D2: The second information processing unit generates a second data packet based on the target data.

[0061] Step D3: The second information processing unit writes the second data packet into the first information processing unit through the switching unit.

[0062] The second information processing unit encapsulates the received target data to generate a second data packet. The second data packet contains the target data itself, as well as necessary control or identification information such as the corresponding instruction identifier and status information. Its target receiving address points to the first information processing unit, so that the second information processing unit can accurately write the second data packet into the first information processing unit through the predetermined address mapping relationship of the switching unit, thereby completing the cross-chip return of the read result.

[0063] In this embodiment, the read result of the storage unit is encapsulated by the second information processing unit and then returned to the first information processing unit through the switching unit, which realizes closed-loop interaction for cross-chip data reading.

[0064] It should be understood that the sequence number of each step in the above embodiments does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.

[0065] Corresponding to the instruction processing method in the above embodiments, an information processing unit is provided, which includes a first information processing unit or a second information processing unit. The first information processing unit is disposed on the first chip that processes the first target instruction. The first information processing unit is used to: encapsulate and generate a first data packet based on the first target instruction, wherein the first target instruction is an instruction corresponding to the storage unit of the second chip, and the target write address of the first data packet points to the second information processing unit of the second chip. The first information processing unit sends the first data packet to the switching unit so that the first data packet is written to the second information processing unit of the second chip through the mapping of the switching unit, and the second information processing unit parses the first data packet, and sends the instruction information corresponding to the first target instruction obtained from the parsing to the storage unit of the second chip, thereby enabling the storage unit to execute the first target instruction based on the instruction information. 
The first target instruction is processed by the first information processing unit of the first chip; the second information processing unit is disposed in the second chip to receive and execute the first target instruction; the second information processing unit is used to: parse the first data packet written by the first information processing unit based on the switching unit, and restore the instruction information corresponding to the first target instruction; send the instruction information to the storage unit of the second chip, so that the storage unit executes the first target instruction based on the instruction information; the first data packet is generated by the first information processing unit based on the first target instruction, the first target instruction is the instruction corresponding to the storage unit of the second chip, and the target write address of the first data packet points to the second information processing unit of the second chip.

[0066] It is understandable that within the same chip, the information processing unit may only have a unit for actively encapsulating and sending data packets, or it may only have a unit for parsing and restoring target instructions and returning data, or it may have both units for actively encapsulating and sending data packets and units for parsing and restoring target instructions and returning data. The specific configuration can be tailored to the functional definitions of each chip, or an enable switch can be used to turn on or off the corresponding information processing unit to adapt to diverse chip interconnection and interaction systems and needs.

[0067] Corresponding to the instruction processing method in the above embodiments, and referring to Figure 4, a chip is provided, which is a first chip. The first chip includes an instruction parsing unit 41, a first information processing unit 42, and a first interconnect port 43. The instruction parsing unit 41 is communicatively connected to the first information processing unit 42, and the first information processing unit 42 is communicatively connected to the first interconnect port 43.

[0068] The instruction parsing unit 41 is used to determine that the corresponding instruction is the first target instruction when the instruction points to a preset address range, and to send the first target instruction to the first information processing unit.

[0069] The first information processing unit 42 is used to encapsulate the received first target instruction into a first data packet and send the first data packet to the first interconnect port; the first target instruction is an instruction corresponding to the storage unit of the second chip, and the target write address of the first data packet points to the second information processing unit of the second chip.

[0070] The first interconnect port 43 is used to send the first data packet to the connected switching unit, so that the switching unit writes the first data packet to the second information processing unit according to the target write address, thereby enabling the second information processing unit to parse the first data packet and send the instruction information corresponding to the first target instruction obtained from the parsing to the storage unit of the second chip, so that the storage unit can execute the first target instruction.

[0071] In this embodiment, the identification, encapsulation, and forwarding of cross-chip memory access instructions are achieved through the cooperation of the instruction parsing unit and the information processing unit within the first chip. The storage space visible to the first chip can be divided into two categories: local storage space and the storage space of other chips. Access goes through the switching unit only when an instruction needs to access the storage space of other chips. Therefore, upon receiving an instruction, the instruction parsing unit first determines whether the instruction points to the storage space of other chips (i.e., a preset address range). If so, the instruction parsing unit identifies it as a target instruction for the storage unit of the second chip and passes it to the first information processing unit, which encapsulates it into a data packet whose target write address points to the information processing unit of the second chip. This data packet is sent to the switching unit through the first interconnect port, and the switching unit completes the address mapping and data forwarding. In this way, the first chip does not need to directly access the large-capacity storage space of the second chip through the mapping of the switching unit. Instead, the information processing unit acts as an intermediary for cross-chip interaction, enabling indirect instruction transmission and execution. This effectively reduces the switching unit's requirement for BAR space, lowers mapping complexity, and improves the applicability, scalability, and overall access efficiency of the multi-chip system under resource-constrained conditions.
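The decision made by the instruction parsing unit can be sketched as a simple range check. The address ranges and the returned labels below are assumptions for illustration; the text only states that instructions pointing to a preset address range are treated as target instructions.

```python
# Hypothetical address map: local storage first, other chips' storage above it.
LOCAL_RANGE = range(0x0000_0000, 0x4000_0000)
REMOTE_RANGE = range(0x4000_0000, 0x8000_0000)  # the "preset address range"


def dispatch(address: int) -> str:
    """Decide where an instruction goes, based only on its address."""
    if address in REMOTE_RANGE:
        # Target instruction: hand over for encapsulation.
        return "first_information_processing_unit"
    if address in LOCAL_RANGE:
        # Ordinary local access: no switching unit involved.
        return "local_storage"
    raise ValueError(f"address {address:#x} is not mapped")


local_route = dispatch(0x0000_1000)
remote_route = dispatch(0x4000_0000)
```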

[0072] Corresponding to the instruction processing method in the above embodiments, and referring to Figure 5, another chip is provided, which is a second chip. The second chip includes a second interconnect port 51, a second information processing unit 52, and a storage unit 53. The second interconnect port 51 is communicatively connected to the second information processing unit 52, and the second information processing unit 52 is communicatively connected to the storage unit 53.

[0073] The second interconnect port 51 receives the first data packet sent by the connected switching unit and writes the first data packet into the second information processing unit 52. The first data packet is generated by the first information processing unit based on the first target instruction. The first target instruction is the instruction corresponding to the storage unit of the second chip. The target write address of the first data packet points to the second information processing unit 52 of the second chip.

[0074] The second information processing unit 52 is used to parse the first data packet written by the second interconnect port 51, so as to send the instruction information corresponding to the first target instruction obtained by parsing and restoring to the storage unit 53.

[0075] Storage unit 53 is used to receive instruction information to execute the first target instruction.

[0076] In this embodiment, the second chip receives the first data packet forwarded by the switching unit through the second interconnect port and writes it into the second information processing unit. The second information processing unit parses the data packet to reconstruct the instruction information for the local storage unit and triggers the storage unit to execute the corresponding target instruction. By unifying cross-chip memory access requests to the second information processing unit for parsing and scheduling, the switching unit can avoid directly mapping large-capacity storage space. This allows the switching unit to complete cross-chip instruction transmission by only needing to be aware of the limited address range of the information processing unit, thereby effectively reducing BAR resource consumption, simplifying address mapping and access control logic, and improving the scalability and execution efficiency of multi-chip systems in complex interconnect scenarios.

[0077] Figure 6 shows a chip system provided in one embodiment of this application. The chip system includes a switching unit, and a first chip and a second chip as described in the foregoing embodiments, which are interconnected through the switching unit.

[0078] In this chip system, by setting corresponding information processing units in the first and second chips respectively, the switching unit need only handle data forwarding and address mapping between the information processing units, transforming cross-chip interaction from traditional "direct memory access" to "indirect memory access driven by instruction encapsulation and parsing". This design avoids the switching unit directly mapping large-capacity storage space to BARs, and can complete cross-chip instruction transmission and data interaction even with limited mapping resources, thereby significantly reducing BAR space occupation and alleviating the resource pressure on the server platform. At the same time, the system does not require frequent switching of mapping relationships, which reduces control overhead and access latency, and it can achieve higher access flexibility, scalability, and overall operating efficiency in multi-chip, multi-accelerator scenarios.

[0079] It is understood that in some embodiments of the chip system, there may be multiple first chips and multiple second chips. The first chips may be fully interconnected or interconnected via forwarding. One first chip may have one or more correspondingly connected second chips. The address space of the second information processing unit may be divided into two segments: the first subspace stores data packets corresponding to the storage units of the second chip, and the second subspace stores data packets corresponding to the storage units of other chips that the second chip can access as a forwarding chip. When a target instruction points to a storage unit of the second chip, the corresponding data packet can be written to the first subspace through BAR mapping, and the second information processing unit performs parsing and restoration. If instead the target instruction points to a storage unit of a third chip for which the second chip acts as a forwarding chip, the data packet can be written to the second subspace through BAR mapping. In this case, the second information processing unit does not perform parsing; the second chip directly forwards the data packet and writes it, through BAR mapping, to the first subspace of the information processing unit of the corresponding third chip for parsing and restoration. This enables efficient instruction forwarding and helps adapt to more flexible chip system architectures.
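The two-subspace behaviour can be sketched as follows. The subspace size, the callback names, and the returned labels are assumptions; the text only states that packets written to the first subspace are parsed locally and packets written to the second subspace are forwarded unparsed.

```python
SUBSPACE_SIZE = 0x800  # hypothetical: each subspace is half of a 4 KiB window


def on_window_write(offset: int, packet: bytes, parse, forward):
    """Parse packets landing in the first subspace; forward the rest unparsed."""
    if 0 <= offset < SUBSPACE_SIZE:
        return ("parsed", parse(packet))
    if SUBSPACE_SIZE <= offset < 2 * SUBSPACE_SIZE:
        return ("forwarded", forward(packet))
    raise ValueError(f"offset {offset:#x} is outside the window")


forwarded_packets = []


def forward_to_third_chip(packet: bytes) -> str:
    """Stand-in for re-writing the packet to the third chip's first subspace."""
    forwarded_packets.append(packet)
    return "chip3.first_subspace"


def parse_locally(packet: bytes) -> str:
    return packet.decode()


local = on_window_write(0x010, b"read 64", parse_locally, forward_to_third_chip)
relayed = on_window_write(0x810, b"read 64", parse_locally, forward_to_third_chip)
```

In this sketch the forwarding decision depends only on where the packet was written, so the forwarding chip never needs to decode a packet that is not addressed to its own storage.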

[0080] In some embodiments, the storage capacity of the second chip is greater than the maximum BAR space that the switching unit can allocate to the second chip; the first chip is a GPGPU chip, the second chip is a GPGPU chip or an IO chip, the GPGPU chip is provided with a first information processing unit and a third information processing unit, the IO chip is provided with a second information processing unit, and the third information processing unit and the second information processing unit have the same function; the chip system includes n GPGPU chips and m IO chips, where n≥1 and m≥2.

[0081] In this chip system, the storage capacity of the second chip is much larger than the maximum BAR space that the switching unit can allocate. By setting up functionally coordinated information processing units on the chips on both sides of the interaction, the chip system constructs an indirect interaction mechanism based on instruction encapsulation and parsing, which effectively avoids the problem of large-capacity storage space not being able to be mapped as a whole. Regardless of whether the second chip is a GPGPU chip or an I/O chip, it can complete a unified interaction process with the first chip through the corresponding information processing unit, allowing the first chip to initiate access without needing to know the specific storage layout of the other chip. The GPGPU chip can be equipped with multiple computing cores and local storage units to handle a large number of parallel computing tasks.

[0082] For a multi-chip system containing n GPGPU chips and m I/O chips, this architecture enables many-to-many, scalable cross-chip memory access and data interaction with only limited BAR resources provided by the switching unit. This significantly improves the system's applicability, scalability, and overall operating efficiency in large-scale heterogeneous chip scenarios. Preferably, the I/O chips, as the main bandwidth and memory expansion chips, do not require dedicated information processing units for actively encapsulating and sending data packets, thus reducing the area requirements of the I/O chips.

[0083] For example, Figure 7 illustrates a chip system comprising two GPGPU chips and two I/O chips. The first GPGPU chip can access either I/O chip via a first switching unit (PCIe Switch1), and this access is typically unidirectional; the first GPGPU chip can also access the second GPGPU chip via a second switching unit (PCIe Switch2). In some other embodiments, the GPGPU chips and I/O chips can also be connected via a UCIe interconnect switching/routing module.

[0084] For example, the information processing unit is a mailbox unit, the first information processing unit is a mailbox sending unit (mbs), and the second information processing unit is a mailbox receiving unit (mbr). The first GPGPU chip includes a first LD/ST unit, mbs1, mbr1, two first interconnect ports, and a memory unit 1 (DDR1); each I/O chip includes a second interconnect port, mbr2, and a memory unit 2 (DDR2); the second GPGPU chip includes a second LD/ST unit, mbs2, mbr3, and a third interconnect port.

[0085] When the first GPGPU chip reads data, it first generates an AXI read1 instruction. The first LD/ST unit determines whether this instruction points to the memory space of another chip. If not, the instruction is passed directly to DDR1 for execution. If it does (for example, it points to an I/O chip or another GPGPU chip), the first LD/ST unit identifies it as the first target instruction and sends AXI read1 to mbs1 for data encapsulation. The encapsulated content includes the actual access address corresponding to AXI read1, the AXI read1 instruction, and the target write address. After data encapsulation, mbs1 generates a data packet pkt1 and determines through which PCIe switch pkt1 should be sent to mbr2. After determining that it is PCIe Switch1, mbs1 generates AXI write1 with pkt1 as its write data and sends it to PCIe Switch1 through the first interconnect port. Through the BAR mapping corresponding to mbr2 on PCIe Switch1, the request is directed to the target write address of mbr2, and AXI write1 is transmitted through PCIe Switch1.

[0086] After AXI write1 is executed, pkt1 has been written into mbr2. mbr2 parses pkt1 and, based on the actual access address recovered by parsing, generates a new AXI read2 instruction (whose execution effect is equivalent to that of AXI read1) and sends it to DDR2. After DDR2 executes AXI read2, it returns the target data. mbr2 encapsulates this data to generate pkt2, and then generates AXI write2, addressed to the target receive address of mbs1, based on pkt2. mbr2 then sends pkt2 to PCIe Switch1 through the second interconnect port, and PCIe Switch1 writes the data to mbs1 through BAR mapping. After AXI write2 is executed, mbs1 receives and parses pkt2, and returns the target data as the response data (resp data) of AXI read1 to the LD/ST unit of the first GPGPU chip.
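The read round trip described above can be modeled in software as a pair of mailbox objects behind a toy switch. This is a hedged sketch: the classes `Switch`, `MailboxReceiver`, and `MailboxSender`, the packet layouts, and the two mailbox addresses are hypothetical, and the model is synchronous, whereas real hardware would handle the writes asynchronously.

```python
import struct

# Assumed request layout: actual access address (8), length (4), target receive address (8).
REQ = struct.Struct("<QIQ")

class Switch:
    """Toy switching unit: routes a write to the mailbox that owns the BAR-mapped address."""
    def __init__(self):
        self.bar_map = {}

    def map(self, addr, mailbox):
        self.bar_map[addr] = mailbox

    def write(self, addr, payload):
        self.bar_map[addr].on_write(payload)

class MailboxReceiver:
    """Plays the role of mbr2: parses pkt1, reads local memory, replies with pkt2."""
    def __init__(self, memory, switch):
        self.memory, self.switch = memory, switch

    def on_write(self, pkt1):
        actual, length, reply_addr = REQ.unpack(pkt1)
        data = bytes(self.memory[actual:actual + length])   # stands in for AXI read2
        pkt2 = struct.pack("<I", len(data)) + data
        self.switch.write(reply_addr, pkt2)                  # stands in for AXI write2

class MailboxSender:
    """Plays the role of mbs1: sends pkt1 and later parses pkt2 into response data."""
    def __init__(self, switch):
        self.switch, self.response = switch, None

    def on_write(self, pkt2):
        (n,) = struct.unpack_from("<I", pkt2)
        self.response = pkt2[4:4 + n]

    def remote_read(self, mbr_addr, reply_addr, actual, length):
        self.response = None
        self.switch.write(mbr_addr, REQ.pack(actual, length, reply_addr))  # AXI write1
        return self.response

MBR2_ADDR, MBS1_ADDR = 0xF000_0000, 0xF100_0000   # assumed BAR-mapped mailbox addresses
switch1 = Switch()
ddr2 = bytearray(range(256))                       # remote memory with recognizable contents
mbs1 = MailboxSender(switch1)
switch1.map(MBR2_ADDR, MailboxReceiver(ddr2, switch1))
switch1.map(MBS1_ADDR, mbs1)
data = mbs1.remote_read(MBR2_ADDR, MBS1_ADDR, actual=0x10, length=4)
```

Only the two mailbox addresses appear in the switch's mapping table; the 256-byte remote memory is never mapped, yet `data` ends up holding the four bytes stored at offset 0x10.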

[0087] When the second GPGPU chip writes data to the DDR1 memory of the first GPGPU chip, it first generates an AXI write3 instruction. The second LD/ST unit determines whether this instruction points to the memory space of another chip. If it does (the case where it does not is not described again here), the second LD/ST unit identifies it as the second target instruction and sends AXI write3 to mbs2 for data encapsulation. The encapsulated content includes the actual access address corresponding to AXI write3, the AXI write3 instruction, and the target write address. After encapsulation, mbs2 generates a data packet pkt3 and selects the PCIe switch through which pkt3 will be sent to mbr1. Having selected PCIe Switch2, mbs2 generates AXI write4 with pkt3 as its data and sends it to PCIe Switch2 through the first interconnect port. The BAR mapping that corresponds to mbr1 on PCIe Switch2 directs the request to the target write address of mbr1, and AXI write4 is transmitted through PCIe Switch2.

[0088] After AXI write4 is executed, pkt3 has been written into mbr1. mbr1 parses the data packet and, based on the actual access address recovered by parsing, generates a new AXI write5 instruction, which is sent to DDR1 so that DDR1 performs the corresponding write operation.
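For the write direction, the packet must carry the payload as well as the address. The sketch below shows one way the encapsulation on the sending side and the parsing on the receiving side could look. The 12-byte header layout and the function names `encapsulate_write` and `parse_and_apply` are assumptions for illustration, not part of this application.

```python
import struct

# Assumed header layout (little-endian): actual access address (8), payload length (4).
HDR = struct.Struct("<QI")

def encapsulate_write(actual_addr: int, data: bytes) -> bytes:
    """Sending side (role of mbs2): wrap the write address and payload into one packet."""
    return HDR.pack(actual_addr, len(data)) + data

def parse_and_apply(pkt: bytes, memory: bytearray) -> None:
    """Receiving side (role of mbr1): recover the write and apply it to local memory."""
    actual_addr, n = HDR.unpack_from(pkt)
    payload = pkt[HDR.size:HDR.size + n]
    if len(payload) != n or actual_addr + n > len(memory):
        raise ValueError("malformed packet or out-of-range access")
    memory[actual_addr:actual_addr + n] = payload   # stands in for AXI write5

ddr1 = bytearray(64)                                  # toy stand-in for DDR1
pkt3 = encapsulate_write(0x20, b"\xde\xad\xbe\xef")
parse_and_apply(pkt3, ddr1)
```

The bounds check is an addition of this sketch. Because the receiving mailbox is the only component that touches local memory, it is a natural place to validate the recovered address before issuing the write.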

[0089] Based on this chip system, the entire data interaction (data memory access) is implemented through encapsulation, mapping, and parsing, which avoids the complexity of directly accessing cross-chip memory spaces, reduces BAR space occupation, and improves system access efficiency, flexibility, and scalability. This indirect access method simplifies data flow management in multi-chip systems and effectively reduces control overhead and access latency.
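The reduction in BAR space occupation can be illustrated with simple arithmetic. The figures below (three remote chips with 64 GiB of memory each, and a 4 KiB mapped window per mailbox) are purely hypothetical assumptions chosen for the example; this application does not state concrete sizes.

```python
KIB, GIB = 1 << 10, 1 << 30

def direct_mapping_bytes(remote_memory_sizes):
    """Direct access: every remote memory must be fully covered by mapped address space."""
    return sum(remote_memory_sizes)

def mailbox_mapping_bytes(remote_chip_count, mailbox_window):
    """Indirect access: only one mailbox window per remote chip needs to be mapped."""
    return remote_chip_count * mailbox_window

remote_memories = [64 * GIB, 64 * GIB, 64 * GIB]   # assumed: three remote chips, 64 GiB each
direct = direct_mapping_bytes(remote_memories)
indirect = mailbox_mapping_bytes(len(remote_memories), 4 * KIB)   # assumed 4 KiB per mailbox
print(f"direct: {direct // GIB} GiB, mailbox: {indirect // KIB} KiB")
```

Under these assumed numbers the mapped space no longer grows with the size of the remote memories, only with the number of mailboxes.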

[0090] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the above-described division of functional units and modules is merely an example. In practical applications, the above functions can be assigned to different functional units and modules as needed, that is, the internal structure of the above device can be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit. Furthermore, the specific names of the functional units and modules are only for easy differentiation and are not intended to limit the scope of protection of this application. The specific working process of the units and modules in the above system can be referred to the corresponding process in the foregoing method embodiments, and will not be repeated here.

[0091] This application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps described in the various method embodiments above.

[0092] This application provides a computer program product that, when run on an electronic device, enables the electronic device to perform the steps described in the various method embodiments above.

[0093] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a computer-readable storage medium, and when executed by a processor, it can implement the steps of the various method embodiments described above. The computer program includes computer program code, which can be in the form of source code, object code, executable files, or certain intermediate forms. The computer-readable medium can include at least: any entity or device capable of carrying the computer program code to a photographic device / electronic device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium, such as a USB flash drive, a portable hard drive, a magnetic disk, or an optical disk.

[0094] In the above embodiments, the descriptions of each embodiment have different focuses. For parts that are not described in detail or recorded in a certain embodiment, please refer to the relevant descriptions of other embodiments.

[0095] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0096] In the embodiments provided in this application, it should be understood that the disclosed apparatus/network devices and methods can be implemented in other ways. For example, the apparatus/network device embodiments described above are merely illustrative. For instance, the division of modules or units described above is only a logical functional division, and in actual implementation there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.

[0097] The units described above as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0098] The above embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application, and should all be included within the protection scope of this application.

Claims

1. An instruction processing method, characterized in that it is applied to a first information processing unit of a first chip that processes a first target instruction, the instruction processing method comprising: encapsulating and generating a first data packet based on the first target instruction, wherein the first target instruction is an instruction corresponding to the storage unit of the second chip, and the target write address of the first data packet points to the second information processing unit of the second chip; and sending the first data packet to the switching unit so that the first data packet is written into the second information processing unit of the second chip through the mapping of the switching unit, and the second information processing unit parses the first data packet and sends the instruction information corresponding to the first target instruction obtained from the parsing to the storage unit of the second chip, thereby enabling the storage unit to execute the first target instruction based on the instruction information.

2. The instruction processing method as described in claim 1, characterized in that the instruction information includes the actual access address of the storage unit and the first target instruction; and encapsulating and generating the first data packet based on the first target instruction includes: converting, based on the preset address mapping rules and the accessible address set of the first chip, the offset address corresponding to the first target instruction into the actual access address of the storage unit, wherein the accessible address set includes the reference address of the storage unit of any chip accessible by the first chip, and the target write address is determined based on the second chip; and encapsulating the actual access address, the first target instruction, and the target write address to obtain the first data packet.

3. The instruction processing method as described in claim 1, characterized in that sending the first data packet to the switching unit, so that the switching unit writes the first data packet into the second information processing unit of the second chip, includes: initiating, based on the interconnection topology of the first chip, the second chip, and the switching unit, a write operation to the target write address, and sending the first data packet as write data, so that the switching unit routes the write operation to the second information processing unit according to the mapping relationship between the pre-allocated BAR space and the second information processing unit.

4. The instruction processing method according to any one of claims 1 to 3, characterized in that the first target instruction is a read instruction, the first data packet includes a target receiving address, and after sending the first data packet to the switching unit, the method further comprises: receiving and parsing a second data packet written to the target receiving address through the switching unit, and restoring the target data corresponding to the read instruction; wherein the second data packet is generated by the second information processing unit based on the received target data, and the target data is sent by the storage unit after executing the read instruction.

5. An instruction processing method, characterized in that a first target instruction is processed by the first information processing unit of the first chip, the instruction processing method is applied to the second information processing unit of the second chip that executes the first target instruction, and the instruction processing method comprises: parsing the first data packet written via the switching unit to restore the instruction information corresponding to the first target instruction, wherein the first data packet is generated by the first information processing unit based on the first target instruction, the first target instruction is the instruction corresponding to the storage unit of the second chip, and the target write address of the first data packet points to the second information processing unit of the second chip; and sending the instruction information to the storage unit of the second chip so that the storage unit executes the first target instruction based on the instruction information.

6. An information processing unit, characterized in that the information processing unit includes a first information processing unit or a second information processing unit; the first information processing unit is disposed on the first chip that processes the first target instruction, and is used to: encapsulate and generate a first data packet based on the first target instruction, wherein the first target instruction is an instruction corresponding to the storage unit of the second chip, and the target write address of the first data packet points to the second information processing unit of the second chip; and send the first data packet to the switching unit so that the first data packet is written to the second information processing unit of the second chip through the mapping of the switching unit, and the second information processing unit parses the first data packet and sends the instruction information corresponding to the first target instruction obtained from the parsing to the storage unit of the second chip, thereby enabling the storage unit to execute the first target instruction based on the instruction information; the second information processing unit is disposed in the second chip that receives and executes the first target instruction, the first target instruction being processed by the first information processing unit of the first chip; the second information processing unit is used to: parse the first data packet written by the first information processing unit via the switching unit, and restore the instruction information corresponding to the first target instruction; and send the instruction information to the storage unit of the second chip, so that the storage unit executes the first target instruction based on the instruction information; wherein the first data packet is generated by the first information processing unit based on the first target instruction, the first target instruction is an instruction corresponding to the storage unit of the second chip, and the target write address of the first data packet points to the second information processing unit of the second chip.

7. A chip, characterized in that the chip is a first chip, which includes an instruction parsing unit, a first information processing unit, and a first interconnect port; the instruction parsing unit is communicatively connected to the first information processing unit, and the first information processing unit is communicatively connected to the first interconnect port; the instruction parsing unit is used to determine that an instruction is a first target instruction when the instruction points to a preset address range, and to send the first target instruction to the first information processing unit; the first information processing unit is used to encapsulate the received first target instruction into a first data packet and send the first data packet to the first interconnect port, wherein the first target instruction is an instruction corresponding to the storage unit of the second chip, and the target write address of the first data packet points to the second information processing unit of the second chip; the first interconnect port is used to send the first data packet to the connected switching unit, so that the switching unit writes the first data packet to the second information processing unit according to the target write address, thereby enabling the second information processing unit to parse the first data packet and send the instruction information corresponding to the first target instruction obtained from the parsing to the storage unit of the second chip, so that the storage unit can execute the first target instruction.

8. A chip, characterized in that the chip is a second chip, which includes a second interconnect port, a second information processing unit, and a storage unit; the second interconnect port is communicatively connected to the second information processing unit, and the second information processing unit is communicatively connected to the storage unit; the second interconnect port is used to receive the first data packet sent by the connected switching unit and write the first data packet into the second information processing unit, wherein the first data packet is generated by the first information processing unit based on a first target instruction, the first target instruction is an instruction corresponding to the storage unit of the second chip, and the target write address of the first data packet points to the second information processing unit of the second chip; the second information processing unit is used to parse the first data packet written by the second interconnect port, so as to send the instruction information corresponding to the first target instruction obtained by parsing and restoring to the storage unit; the storage unit is used to receive the instruction information to execute the first target instruction.

9. A chip system, characterized in that it includes a switching unit, a first chip as described in claim 7, and a second chip as described in claim 8, the first chip and the second chip communicating via the switching unit.

10. The chip system as described in claim 9, characterized in that the storage capacity of the second chip is greater than the maximum BAR space that the switching unit can allocate to the second chip; the first chip is a GPGPU chip, the second chip is a GPGPU chip or an I/O chip, the GPGPU chip is provided with a first information processing unit and a third information processing unit, the I/O chip is provided with a second information processing unit, and the third information processing unit and the second information processing unit have the same function; the chip system includes n GPGPU chips and m I/O chips, where n≥1 and m≥2.