Cross-platform communication system, method, electronic device, storage medium, and program product

By using the abstraction and adaptation layers of the cross-platform communication system, the communication difficulties between the GPU host driver and the embedded coprocessor caused by the heterogeneity of the operating system kernel are solved, realizing transparent operation and stable communication between different operating systems, and reducing the cost and risk of cross-platform porting.

CN122240551A · Pending · Publication Date: 2026-06-19 · MOORE THREADS TECH CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: MOORE THREADS TECH CO LTD
Filing Date: 2026-04-27
Publication Date: 2026-06-19

Application Information

Patent Timeline

27 Apr 2026: Application
19 Jun 2026: Publication

CN122240551A

IPC: G06F15/163; G06F9/54; G06F9/48; G06F9/4401

AI Tagging

Application Domain

Program initiation/switching; Interprogram communication


AI Technical Summary

Technical Problem

In existing technologies, the heterogeneity of operating system kernels makes cross-platform communication between GPU host drivers and embedded coprocessors difficult, resulting in high porting costs, performance uncertainty, and fragmented resource management strategies.

Method used

A cross-platform communication system is provided, including an abstraction layer and an adaptation layer. The abstraction layer provides a unified interface for host drivers under different operating systems, and the adaptation layer implements interface mapping, encapsulates the kernel mechanisms of different operating systems, and realizes cross-platform communication between host drivers and coprocessors.

Benefits of technology

It enables transparent operation across different operating systems, avoiding the adaptation costs and reliability risks of cross-platform porting, ensuring communication stability and consistency, and reducing the development and maintenance burden.


Abstract

This disclosure relates to a cross-platform communication system, method, electronic device, storage medium, and program product, belonging to the field of computer science. The cross-platform communication system is used to achieve cross-platform communication between host drivers under different operating systems and coprocessors in the hardware layer. It includes an abstraction layer and an adaptation layer. The abstraction layer provides a unified first interface for host drivers under different operating systems. The adaptation layer receives calls from the abstraction layer and maps the first interface of the abstraction layer to a second interface under different operating systems to support communication between the abstraction layer and the coprocessors. Embodiments of this disclosure provide a unified cross-platform communication system for interaction between host drivers and coprocessors, enabling host drivers to operate transparently at the source-code level across different operating system architectures without concern for the specific implementation of the underlying operating system. This avoids the adaptation costs and reliability risks associated with cross-platform porting.


Description

Technical Field

[0001] This disclosure relates to the field of computers, and more particularly to a cross-platform communication system, method, electronic device, storage medium, and program product.

Background Technology

[0002] In related technologies, the heterogeneity of the operating system kernel is typically exposed directly to the basic communication layer of the graphics processing unit (GPU) driver. This deep coupling leads to significant challenges in porting, testing, and verifying driver code for GPU embedded coprocessors across platforms, making it difficult to guarantee consistent and reliable performance across different operating systems. For example, the heterogeneity of the operating system kernel may cause communication difficulties between the GPU host driver and the embedded coprocessor.

Summary of the Invention

[0003] In view of this, this disclosure presents a cross-platform communication system, method, electronic device, storage medium, and program product.

[0004] According to one aspect of this disclosure, a cross-platform communication system is provided. This system enables cross-platform communication between host drivers under different operating systems and coprocessors in the hardware layer. The cross-platform communication system includes an abstraction layer and an adaptation layer. The abstraction layer is connected to the host driver and provides a unified first interface for host drivers under different operating systems. This first interface supports communication between the adaptation layer and the host driver. The adaptation layer receives calls from the abstraction layer and maps the first interface of the abstraction layer to a second interface under different operating systems to support communication between the abstraction layer and the coprocessors.

[0005] In one possible implementation, the abstraction layer is further configured to: receive request messages from the host driver under different operating systems through the first interface; notify the coprocessor corresponding to the request message so that the coprocessor writes response data to the buffer of the adaptation layer after completing the request message, and triggers a hardware interrupt of the adaptation layer; read the response data of the request message from the buffer in response to the hardware interrupt; and wake up the host driver to process the response data.

[0006] In one possible implementation, the first interface includes an entry function, and the abstraction layer is further configured to: call the unified entry function to receive a request message from the host driver; select a control channel corresponding to the event type of the request message; and write the request message into a buffer in the control channel of the adaptation layer.
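The entry-function behavior described above can be sketched in user-space Python. This is only a minimal illustration under stated assumptions: the event-type names, the `ControlChannel` class, and the `unified_entry` function are hypothetical and do not come from the disclosure, and a `deque` stands in for the adaptation-layer buffer.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class RequestMessage:
    event_type: str  # hypothetical event types, e.g. "power" or "thermal"
    payload: bytes


class ControlChannel:
    """One control channel owning a buffer (modelled here as a deque)."""

    def __init__(self, name):
        self.name = name
        self.buffer = deque()


# Hypothetical mapping from event type to control channel.
CHANNELS = {
    "power": ControlChannel("power"),
    "thermal": ControlChannel("thermal"),
}


def unified_entry(msg):
    """Pick the channel matching the event type and enqueue the request there."""
    channel = CHANNELS.get(msg.event_type)
    if channel is None:
        raise ValueError(f"no control channel for event type {msg.event_type!r}")
    channel.buffer.append(msg)
    return channel
```

In this sketch the host driver only ever calls `unified_entry`; which channel and which buffer are used is decided behind the interface.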

[0007] In one possible implementation, the first interface further includes an interrupt bottom-half function, and the abstraction layer is further configured to: receive an acknowledgment signal generated by a hardware interrupt of the adaptation layer, the acknowledgment signal indicating that the top-half processing of the hardware interrupt has completed; based on the acknowledgment signal, invoke the unified interrupt bottom-half function to trigger the task scheduler of the adaptation layer to invoke the bottom-half handler based on the second interface; and receive response data read by the bottom-half handler from the buffer of the adaptation layer.

[0008] In one possible implementation, calling the unified interrupt bottom-half function to trigger the task scheduler of the adaptation layer to call the bottom-half handler based on the second interface includes: calling the unified interrupt bottom-half function in a deferred procedure call (DPC) model or a tasklet model to trigger the task scheduler of the adaptation layer to call the bottom-half handler under different operating systems based on the second interface.
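The split between the interrupt top half and the deferred bottom half can be modelled as follows. This is a simplified sketch, assuming a plain FIFO queue as a stand-in for the adaptation layer's task scheduler; every class and function name here is hypothetical, and no real kernel API is called.

```python
from collections import deque


class DeferredScheduler:
    """Stand-in for the adaptation layer's task scheduler (hypothetical)."""

    def __init__(self):
        self._pending = deque()

    def schedule(self, handler, *args):
        # Record the work; it runs later, outside the interrupt path.
        self._pending.append((handler, args))

    def run_pending(self):
        results = []
        while self._pending:
            handler, args = self._pending.popleft()
            results.append(handler(*args))
        return results


def top_half(irq_status):
    """Top half: do the minimum (clear the pending flag) and acknowledge."""
    irq_status["pending"] = False
    return "ack"


def unified_bottom_half(ack, scheduler, handler, buffer):
    """Defer the real work only after the top half has acknowledged."""
    if ack != "ack":
        raise RuntimeError("top half has not completed")
    scheduler.schedule(handler, buffer)


def read_response(buffer):
    """Bottom-half handler: take one response out of the buffer."""
    return buffer.popleft()
```

The point of the sketch is ordering: nothing is read from the buffer until the scheduler later runs the deferred handler.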

[0009] In one possible implementation, receiving response data read by the bottom-half handler from the buffer of the adaptation layer includes: receiving response data that the bottom-half handler reads from the buffer at the location identified by a message identifier, wherein the message identifier is a globally unique message identifier assigned by the abstraction layer to the request message in response to receiving the request message from the host driver, and the response data is written to the buffer by the coprocessor after completing the request message from the host driver.
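One way to picture identifier-indexed responses is a table of slots keyed by message identifier. The sketch below is an assumption-laden illustration: `ResponseTable` and its methods are hypothetical names, the identifier is unique only within this one Python process, and a dictionary stands in for the buffer.

```python
import itertools


class ResponseTable:
    """Response slots keyed by a unique message identifier (hypothetical)."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._slots = {}

    def register_request(self):
        """Assign an identifier when a request arrives and reserve its slot."""
        msg_id = next(self._next_id)
        self._slots[msg_id] = None
        return msg_id

    def write_response(self, msg_id, data):
        """Store the response for a completed request in its slot."""
        if msg_id not in self._slots:
            raise KeyError(f"unknown message id {msg_id}")
        self._slots[msg_id] = data

    def read_response(self, msg_id):
        """Return the response for msg_id and release the slot."""
        return self._slots.pop(msg_id)
```

Because each response is looked up by identifier, responses that complete out of order are still matched to the request that caused them.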

[0010] In one possible implementation, the first interface further includes a response function, which is used to wake up the host driver to process response data in either an event notification mode or a polling mode.

[0011] In one possible implementation, the first interface further includes a wait function, which is used to trigger the host driver to wait for response data in either an event notification mode or a polling mode.
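The two waiting modes, together with the response function of the preceding paragraph, can be approximated in user space with `threading.Event`. This is a sketch only: the `Completion` class, the `mode` parameter, and the polling interval are assumptions for illustration and are not defined by the disclosure.

```python
import threading
import time


class Completion:
    """Pairs a response function with a wait function (hypothetical)."""

    def __init__(self):
        self._event = threading.Event()
        self.data = None

    def respond(self, data):
        """Response function: publish the data and wake any waiter."""
        self.data = data
        self._event.set()

    def wait(self, timeout, mode="event", poll_interval=0.001):
        """Wait function: block on the event, or poll until the deadline."""
        if mode == "event":
            done = self._event.wait(timeout)
        elif mode == "poll":
            deadline = time.monotonic() + timeout
            done = self._event.is_set()
            while not done and time.monotonic() < deadline:
                time.sleep(poll_interval)
                done = self._event.is_set()
        else:
            raise ValueError(f"unknown mode {mode!r}")
        # None signals a timeout in this sketch.
        return self.data if done else None
```

Event mode lets the waiter sleep until it is woken, while polling mode trades CPU time for not depending on a wake-up; the caller chooses per request.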

[0012] In one possible implementation, the first interface also includes a unified memory management function for allocating memory from a pre-created kernel cache or non-paged pool to build a cross-platform asynchronous execution environment.
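Allocating from a pre-created pool can be illustrated with a fixed set of blocks created up front. The sketch assumes fixed-size blocks and a fail-fast policy when the pool is empty; `FixedPool` and its methods are hypothetical and only imitate, in user space, what a kernel cache or non-paged pool would provide.

```python
class FixedPool:
    """Pre-created pool of fixed-size blocks (hypothetical illustration)."""

    def __init__(self, block_size, block_count):
        self.block_size = block_size
        # All blocks are created once, before any request is served.
        self._free = [bytearray(block_size) for _ in range(block_count)]

    def alloc(self):
        """Hand out a free block, or None if the pool is exhausted."""
        return self._free.pop() if self._free else None

    def free(self, block):
        """Return a block to the pool for reuse."""
        if len(block) != self.block_size:
            raise ValueError("block does not belong to this pool")
        self._free.append(block)
```

Since `alloc` never waits and never grows the pool, it stays usable from contexts that must not block, which is the property an asynchronous execution environment needs.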

[0013] According to another aspect of this disclosure, a cross-platform communication method is provided, the method being applied to a cross-platform communication system. The cross-platform communication system is used to realize cross-platform communication between host drivers under different operating systems and various coprocessors in the hardware layer. The cross-platform communication system includes an abstraction layer and an adaptation layer. The method includes: the abstraction layer providing a unified first interface for host drivers under different operating systems, wherein the first interface supports communication between the adaptation layer and the host driver, and the abstraction layer is connected to the host driver; the adaptation layer receiving a call from the abstraction layer and mapping the first interface of the abstraction layer to a second interface under different operating systems to support communication between the abstraction layer and the coprocessors.

[0014] In one possible implementation, the method further includes: the abstraction layer receiving request messages from the host driver under different operating systems through the first interface; the abstraction layer notifying the coprocessor corresponding to the request message so that the coprocessor writes response data to the buffer of the adaptation layer after completing the request message, and triggers a hardware interrupt of the adaptation layer; the abstraction layer reading, in response to the hardware interrupt, the response data of the request message from the buffer; and the abstraction layer waking up the host driver to process the response data.

[0015] In one possible implementation, the first interface includes an entry function, and the method further includes: the abstraction layer calling the unified entry function to receive a request message from the host driver; the abstraction layer selecting a control channel corresponding to the event type of the request message; and the abstraction layer writing the request message into a buffer in the control channel of the adaptation layer.

[0016] In one possible implementation, the first interface further includes an interrupt bottom-half function, and the method further includes: the abstraction layer receiving an acknowledgment signal generated by a hardware interrupt of the adaptation layer, the acknowledgment signal indicating that the top-half processing of the hardware interrupt has completed; the abstraction layer, based on the acknowledgment signal, calling the unified interrupt bottom-half function to trigger the task scheduler of the adaptation layer to call the bottom-half handler based on the second interface; and the abstraction layer receiving response data read by the bottom-half handler from the buffer of the adaptation layer.

[0017] In one possible implementation, calling the unified interrupt bottom-half function to trigger the task scheduler of the adaptation layer to call the bottom-half handler based on the second interface includes: calling the unified interrupt bottom-half function in a deferred procedure call (DPC) model or a tasklet model to trigger the task scheduler of the adaptation layer to call the bottom-half handler under different operating systems based on the second interface.

[0018] In one possible implementation, receiving response data read by the bottom-half handler from the buffer of the adaptation layer includes: receiving response data that the bottom-half handler reads from the buffer at the location identified by a message identifier, wherein the message identifier is a globally unique message identifier assigned by the abstraction layer to the request message in response to receiving the request message from the host driver, and the response data is written to the buffer by the coprocessor after completing the request message from the host driver.

[0019] In one possible implementation, the first interface further includes a response function, which is used to wake up the host driver to process response data in either an event notification mode or a polling mode.

[0020] In one possible implementation, the first interface further includes a wait function, which is used to trigger the host driver to wait for response data in either an event notification mode or a polling mode.

[0021] In one possible implementation, the first interface also includes a unified memory management function for allocating memory from a pre-created kernel cache or non-paged pool to build a cross-platform asynchronous execution environment.

[0022] According to another aspect of this disclosure, an electronic device is provided, the electronic device including the cross-platform communication system described above.

[0023] According to another aspect of this disclosure, a non-volatile computer-readable storage medium is provided, on which a computer program is stored, which, when executed by a processor, enables an electronic device to have the functions achievable by the cross-platform communication system as described above.

[0024] According to another aspect of this disclosure, a computer program product is provided, including a computer program or a non-volatile computer-readable storage medium carrying the computer program, wherein when the computer program is executed by a processor, it enables an electronic device to have the functions achievable by the cross-platform communication system as described above.

[0025] The cross-platform communication system provided in this disclosure can be used to realize cross-platform communication between host drivers and coprocessors in the hardware layer under different operating systems. The cross-platform communication system includes an abstraction layer and an adaptation layer. The abstraction layer connects to the host driver and provides a unified first interface for host drivers under different operating systems. This first interface supports communication between the adaptation layer and the host driver. The adaptation layer receives calls from the abstraction layer and maps the first interface of the abstraction layer to a second interface under different operating systems to support communication between the abstraction layer and the coprocessors. In this way, a unified cross-platform communication system can be provided for the interaction between the host driver and the coprocessor. It allows for transparent source-code-level operation across different operating system architectures without needing to concern oneself with the specific implementation of the underlying operating system, avoiding the adaptation costs and reliability risks associated with cross-platform porting.

[0026] Other features and aspects of this disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.

Brief Description of the Drawings

[0027] The accompanying drawings, which are included in and form part of this specification, illustrate exemplary embodiments, features, and aspects of this disclosure together with the specification and serve to explain the principles of this disclosure.

[0028] Figure 1 shows a block diagram of a cross-platform communication system according to an embodiment of the present disclosure.

[0029] Figure 2 shows a schematic diagram of an application scenario of a cross-platform communication system according to an embodiment of the present disclosure.

[0030] Figure 3 shows a schematic diagram of a cross-platform communication system according to an embodiment of the present disclosure.

[0031] Figure 4 shows a schematic diagram of inter-process communication in operating systems in the related technologies.

[0032] Figure 5 shows a schematic diagram of inter-process communication according to an embodiment of the present disclosure.

[0033] Figure 6 shows a schematic diagram of another inter-process communication according to an embodiment of the present disclosure.

[0034] Figure 7 shows a flowchart of a cross-platform communication method according to an embodiment of the present disclosure.

[0035] Figure 8 shows a block diagram of an electronic device according to an embodiment of the present disclosure.

Detailed Description

[0036] Various exemplary embodiments, features, and aspects of this disclosure will now be described in detail with reference to the accompanying drawings. The same reference numerals in the drawings denote elements that have the same or similar functions. Although various aspects of the embodiments are shown in the drawings, they are not necessarily drawn to scale unless specifically indicated otherwise.

[0037] As used herein, the terms “comprising,” “including,” “having,” or variations thereof are open-ended and include one or more of the stated features, integers, elements, steps, components, or functions, but do not exclude the presence or addition of one or more other features, integers, elements, steps, components, functions, or groups thereof.

[0038] When an element is referred to as “connected,” “coupled,” “responding,” or a variation thereof relative to another element, it may be directly connected, coupled, or responding to another element, or there may be an intermediate element present.

[0039] Although the terms first, second, third, etc., may be used herein to describe various elements / operations, these elements / operations should not be limited by these terms. These terms are only used to distinguish one element / operation from another. Therefore, without departing from the teachings of the inventive concept, a first element / operation in some embodiments may be referred to as a second element / operation in other embodiments.

[0040] The term “exemplary” as used herein means “serving as an example, embodiment, or illustration.” Any embodiment illustrated herein as “exemplary” is not necessarily to be construed as superior to or better than other embodiments.

[0041] Furthermore, to better illustrate this disclosure, numerous specific details are set forth in the following detailed description. Those skilled in the art will understand that this disclosure can be practiced without certain specific details. In some instances, methods, means, components, and circuits well known to those skilled in the art have not been described in detail in order to highlight the main points of this disclosure.

[0042] It should be noted that the information (including but not limited to user device information, user personal information, etc.), data (including but not limited to data used for analysis, data stored, data displayed, etc.) and signals involved in this application are all authorized by the user or fully authorized by all parties, and the collection, use and processing of related data must comply with the relevant laws, regulations and standards of the relevant regions.

[0043] In heterogeneous operating system kernel driver architectures, such as the Windows Display Driver Model (WDDM) in Windows, the Direct Rendering Manager (DRM) in Linux, and the Kernel Mode Driver (KMD) in Linux, the communication between the host driver (HD) and the on-die management controller (ODMC) relies on a discrete communication scheme directly implemented by the operating system's native mechanisms. Specifically:

[0044] First, the operating system's interrupt bottom-half handling mechanism is discrete. For example, in the Windows WDDM driver stack, asynchronous communication heavily relies on the Windows-specific Deferred Procedure Call (DPC) mechanism. This mechanism runs at an elevated Interrupt Request Level (IRQL) and effectively defers processing, but its execution context is strictly limited and it is coupled with an event-based (KEVENT) synchronization model, resulting in a long interrupt response path. This makes it difficult to meet the latency-sensitive control requirements of the Graphics Processing Unit (GPU) for power state switching.

[0045] As another example, the Linux KMD driver stack commonly uses the software interrupt (softirq) and tasklet model. While the tasklet model is known for its lightweight nature and high concurrency, its scheduling semantics, concurrency guarantees, and interaction with the wait queue differ fundamentally from the Windows DPC model.

[0046] Second, the synchronization and task scheduling models of the operating system are incompatible. For example, for the Windows WDDM and Linux KMD driver stacks, their interface semantics and behaviors are inconsistent in terms of core synchronization primitives (such as Windows' KeWaitForSingleObject and Linux's wait_event_timeout) and work item scheduling interfaces (such as Windows' IoQueueWorkItem and Linux's queue_work). This forces driver developers to maintain two sets of functionally equivalent but completely independent communication state machines and concurrency control logic.

[0047] Third, the operating system's memory and buffer management strategies are fragmented. The ring buffer used to achieve high-performance communication, together with its memory allocation strategies (such as non-paged memory on Windows and atomic-context allocation with GFP_ATOMIC on Linux), cache coherence maintenance, and producer-consumer synchronization logic, must be implemented and optimized independently for the memory management model of each operating system kernel, making it impossible to build a unified and optimized data transmission channel.

[0048] Based on the above discrete technical solution, the GPU host driver of the operating system has the following aspects that need improvement when communicating with the embedded coprocessor:

[0049] First, when porting firmware-level management functions such as GPU power and thermal control across platforms, the porting costs are high due to differences in operating system architecture. For example, due to the lack of a unified abstraction for interrupt bottom-half mechanisms (such as DPC and Tasklet) and synchronization primitives, driver developers must deeply understand and maintain two independent kernel programming models, resulting in serious code redundancy and a huge burden of development, testing, and long-term maintenance.

[0050] Second, system-level performance and determinism are difficult to guarantee. Due to differences in task scheduling strategies and latency guarantees between heterogeneous kernels, the same communication logic exhibits vastly different response latency and throughput in Windows and Linux environments. This performance uncertainty severely limits the reliability and performance optimization potential of GPU power management, thermal control, and other functions with extremely high real-time requirements.

[0051] Third, resource management strategies are fragmented. Because core resources required for communication, such as memory pools and work items, rely on operating system-specific allocation and management interfaces, unified scheduling and global optimization across lifecycles cannot be achieved, thus affecting the overall resource utilization efficiency and stability of the system.

[0052] To address the communication difficulties between GPU host drivers and embedded coprocessors caused by the heterogeneity of operating system kernels, this disclosure provides a cross-platform communication system. The cross-platform communication system includes an abstraction layer and an adaptation layer. The abstraction layer connects to the host driver and provides a unified first interface for host drivers under different operating systems. This first interface supports communication between the adaptation layer and the host driver. The adaptation layer receives calls from the abstraction layer and maps the first interface of the abstraction layer to a second interface under different operating systems to support communication between the abstraction layer and the coprocessor.

[0053] In this way, a unified cross-platform asynchronous communication framework is provided for the interaction between host drivers (such as GPU host drivers) and on-chip coprocessors (such as various coprocessors integrated into GPUs). It can achieve transparent operation at the source code level without having to care about the specific implementation of the underlying operating system, thus avoiding the adaptation costs and reliability risks brought about by cross-platform porting.

[0054] Figure 1 shows a block diagram of a cross-platform communication system according to an embodiment of the present disclosure. As shown in Figure 1, the cross-platform communication system is used to realize cross-platform communication between host drivers under different operating systems and coprocessors in the hardware layer. The cross-platform communication system includes an abstraction layer 1 and an adaptation layer 2. The abstraction layer 1 is connected to the host driver and is used to provide a unified first interface for host drivers under different operating systems. The first interface supports communication between the adaptation layer 2 and the host driver. The adaptation layer 2 receives calls from the abstraction layer 1 and is used to map the first interface of the abstraction layer 1 to a second interface under different operating systems to support communication between the abstraction layer 1 and the coprocessors.

[0055] Figure 2 shows a schematic diagram of an application scenario of a cross-platform communication system according to an embodiment of this disclosure. As shown in Figure 2, the cross-platform communication system may include an abstraction layer 1 and an adaptation layer 2. Through the abstraction layer 1 and the adaptation layer 2, the cross-platform communication system can realize cross-platform communication between the host driver (e.g., host driver layer 3) and each coprocessor in the hardware layer 4.

[0056] The host driver deploys software programs that drive the hardware operations within the computer. The host driver can be a host driver layer 3 under different operating systems, such as the GPU host driver layer under Windows or Linux. The GPU host driver layer can run in the host's Central Processing Unit (CPU). In this example, host driver layer 3 may include a power management module, a performance tuning module, and a thermal control module. The power management module can accept calls from the user layer to provide drivers related to GPU power management; the performance tuning module can accept calls from the user layer to provide drivers related to GPU performance parameter tuning; and the thermal control module can accept calls from the user layer to provide drivers related to GPU heat dissipation.

[0057] Abstraction Layer 1, also known as the Unified Operating System Abstraction Layer, is the interface layer located between the operating system kernel (e.g., host driver) and hardware circuitry (e.g., coprocessor). Specifically, Abstraction Layer 1 connects Host Driver Layer 3 and Adaptation Layer 2, providing a unified first interface for Host Driver Layer 3 under different operating systems. It then maps this unified first interface to second interfaces under different operating systems via Adaptation Layer 2, enabling communication with the hardware layer 4 coprocessor. Abstraction Layer 1 provides a set of Application Programming Interfaces (APIs) completely independent of the operating system as the first interface. For example, Abstraction Layer 1 may include memory management interfaces, synchronization waiting interfaces, task scheduling interfaces, and event notification interfaces. Functional modules of Host Driver Layer 3 (e.g., power management modules, performance tuning modules, thermal control modules) can rely on this Abstraction Layer 1 to provide a stable first interface, achieving complete decoupling of business logic from the underlying platform.

[0058] The adaptation layer 2, also known as the operating system kernel adaptation layer, serves as the translation mechanism of the cross-platform communication system, implementing interface mapping and encapsulating kernel mechanisms. Interface mapping maps the first interface of abstraction layer 1 to the second interface of the native kernel of each operating system. For example, if abstraction layer 1 calls the unified interrupt bottom-half function `unified_tasklet_schedule`, adaptation layer 2 can map `unified_tasklet_schedule` to the native function `tasklet_schedule()` on the Linux side and to the native function `DxgkCbQueueDpc()` on the Windows side. Furthermore, adaptation layer 2 can encapsulate the synchronization and scheduling mechanisms specific to each operating system. For instance, adaptation layer 2 can abstract the Linux wait function `wait_event` and the Windows wait function `KeWaitForSingleObject` into a single interface function `unified_ipc_msg_wait_event`, serving as the wait function for adaptation layer 2.
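The interface mapping described in this paragraph can be sketched as a per-OS backend selected behind a unified call. The Python below is only a structural illustration: the backend classes and the `unified_wait` method name are hypothetical, and the native functions named in the text are represented by log strings, not actually invoked.

```python
class LinuxBackend:
    """Stand-in for the Linux side; records which native call would be made."""

    def schedule_bottom_half(self, log):
        log.append("tasklet_schedule")

    def wait_event(self, log):
        log.append("wait_event")


class WindowsBackend:
    """Stand-in for the Windows side; records which native call would be made."""

    def schedule_bottom_half(self, log):
        log.append("DxgkCbQueueDpc")

    def wait_event(self, log):
        log.append("KeWaitForSingleObject")


class AdaptationLayer:
    """Maps unified calls onto whichever backend was selected at start-up."""

    def __init__(self, backend):
        self._backend = backend

    def unified_tasklet_schedule(self, log):
        self._backend.schedule_bottom_half(log)

    def unified_wait(self, log):  # hypothetical name for the unified wait call
        self._backend.wait_event(log)


def driver_logic(adaptation, log):
    """Host-driver-side code: identical for every operating system."""
    adaptation.unified_tasklet_schedule(log)
    adaptation.unified_wait(log)
    return log
```

Calling `driver_logic` with a `LinuxBackend` or a `WindowsBackend` runs the same driver code but records different native calls, which is the decoupling the paragraph describes.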

[0059] Hardware layer 4 can include various coprocessors. As shown in Figure 2, the coprocessors may include a system management controller (SMC), a communication management controller (CMC), etc. The embodiments of this disclosure do not specifically limit the type of coprocessor.

[0060] For example, a coprocessor may include one or more dedicated microcontrollers or embedded coprocessors integrated within processor hardware such as a Graphics Processing Unit (GPU), General-Purpose Graphics Processing Unit (GPGPU), Tensor Processing Unit (TPU), or Neural Network Processing Unit (NPU). Such a coprocessor does not directly participate in graphics rendering or computation tasks, but rather is responsible for auxiliary management and control functions of the chip. These functions include, but are not limited to: Power State Management (PSM), Dynamic Voltage and Frequency Scaling (DVFS), Thermal and Power Monitoring (TPM), Error Logging (EL), and secure communication with other system components (such as the motherboard chipset).

[0061] In this architecture, the host driver layer 3 and the abstraction layer 1 have a strict unidirectional dependency. The driver code in the host driver layer 3 depends on the first interface of the abstraction layer 1, completely eliminating the need to be aware of the underlying operating system, thus enabling "code once, run across platforms." Similarly, the abstraction layer 1 and the adaptation layer 2 also have a strict unidirectional dependency. The abstraction layer 1 calls the second interface (e.g., a specific operating system's native API) under different operating systems through the adaptation layer 2 to implement platform-specific functions. This strict unidirectional dependency helps ensure transparent migration of driver logic and consistency in cross-platform behavior.

[0062] In one possible implementation, the abstraction layer 1 is further configured to: receive request messages from the host driver under different operating systems through the first interface; notify the coprocessor corresponding to the request message so that the coprocessor writes response data to the buffer of the adaptation layer 2 after completing the request message, and triggers a hardware interrupt of the adaptation layer 2; read the response data of the request message from the buffer in response to the hardware interrupt; and wake up the host driver to process the response data.

[0063] For example, host drivers under different operating systems can call a unified first interface. Thus, whether the host driver runs under Windows or Linux, it calls the same function, and abstraction layer 1 receives the request messages through that function. These request messages may include messages requesting power state management from the coprocessor, messages requesting adjustment of dynamic voltage and frequency from the coprocessor, messages requesting adjustment of chip temperature from the coprocessor, messages requesting communication from the coprocessor, etc. The embodiments of this disclosure do not limit the types of request messages.

[0064] When the host driver calls the first interface of the abstraction layer 1, the abstraction layer 1 does not care about the type of the operating system. Instead, in response to the host driver's call, it writes the host driver's request message carried by the first interface into the buffer of the adaptation layer 2.

[0065] Buffers can alleviate slowdowns when the host driver sends requests to the coprocessor. For example, suppose that obtaining the response data for request message B requires the response data for request message A. Without a buffer, request message A is not processed until request message B reaches the point where it needs that data, so obtaining the response data for request message A is delayed. To reduce this waiting time, a buffer can be set up: the response data for request message A is stored in the buffer as soon as it is produced, and request message B reads it from the buffer when needed. This process follows the producer-consumer model.

[0066] The producer-consumer model uses a buffer as an intermediary to decouple producers and consumers. Producers generate data and place it in the buffer, while consumers retrieve data from the buffer for processing; direct interaction between the two is unnecessary. When the buffer is not full, the producer can continue producing; when the buffer is full, the producer enters a waiting state. Consumers retrieve data from the buffer and process it; when the buffer is empty, consumers enter a waiting state. The buffer automatically adjusts the production and consumption speed through a blocking mechanism to avoid resource waste or overload, ensuring the orderliness of requested messages and data consistency in a concurrent environment.

[0067] Optionally, the buffer can be a first-in-first-out queue, and may include a circular buffer, a linked buffer, a fixed-length buffer, etc. The embodiments of this disclosure do not limit this.
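
As an illustration of a first-in-first-out circular buffer used in a producer-consumer arrangement, the following is a minimal single-threaded sketch in C. The names `ring_buffer`, `rb_put`, `rb_get`, and `RB_CAPACITY` are hypothetical, and the sketch omits the locking or memory barriers that a real driver shared with a coprocessor would need. Instead of blocking, the functions return `false` when the buffer is full or empty, which is the point at which a producer or consumer would enter a waiting state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RB_CAPACITY 4u  /* assumed to be a power of two so masking works */

struct ring_buffer {
    uint32_t slots[RB_CAPACITY];
    size_t head;  /* count of items consumed so far */
    size_t tail;  /* count of items produced so far */
};

/* Producer side: returns false when the buffer is full. */
static bool rb_put(struct ring_buffer *rb, uint32_t msg)
{
    assert(rb != NULL);
    if (rb->tail - rb->head == RB_CAPACITY)
        return false;
    rb->slots[rb->tail & (RB_CAPACITY - 1u)] = msg;
    rb->tail++;
    return true;
}

/* Consumer side: returns false when the buffer is empty. */
static bool rb_get(struct ring_buffer *rb, uint32_t *out)
{
    assert(rb != NULL && out != NULL);
    if (rb->tail == rb->head)
        return false;
    *out = rb->slots[rb->head & (RB_CAPACITY - 1u)];
    rb->head++;
    return true;
}
```

Messages come out in the order they were put in, so the ordering of request messages is preserved by construction.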

[0068] Abstraction layer 1 writes request messages from the host driver into a buffer and can notify the coprocessor to be requested via register operations or memory mapping. Once the coprocessor has processed the request message, it can write the response data of the request message back to the buffer (for example, a circular buffer) and trigger a hardware interrupt. Abstraction layer 1 can then respond to the hardware interrupt triggered by the coprocessor, read the response data of the request message from the buffer, and wake up the host driver to process the response data.

[0069] In this way, the host driver does not need to pay attention to the specific implementation of the underlying operating system, and can achieve transparent operation at the source code level across different operating systems, avoiding the adaptation costs and reliability risks brought about by cross-platform porting.

[0070] In one possible implementation, the first interface further includes an interrupt bottom-half function, and the abstraction layer 1 is further configured to: receive an acknowledgment signal generated by a hardware interrupt of the adaptation layer 2, the acknowledgment signal being used to indicate that the hardware interrupt has completed the upper-half processing operation; based on the acknowledgment signal, invoke the unified interrupt bottom-half function to trigger the task scheduler of the adaptation layer 2 to invoke the bottom-half handler based on the second interface; and receive response data read by the bottom-half handler from the buffer of the adaptation layer 2.

[0071] For example, a hardware interrupt triggered by adaptation layer 2 can call an interrupt service routine (ISR) to perform the minimum necessary hardware acknowledgment operations, i.e., the upper-half processing operations, and generate an acknowledgment signal. In response to receiving the acknowledgment signal, abstraction layer 1 can call the unified interrupt bottom-half function. This function can be mapped to different native functions in different operating systems. For example, in Windows, it is mapped to the callback function `DxgkCbQueueDpc`, which is used to queue deferred procedure calls (DPCs) in the Windows Display Driver Model (WDDM). In Linux, it is mapped to `tasklet_schedule` or a work item, thus realizing unified triggering of heterogeneous kernel mechanisms.

[0072] Interrupt service routines are responsible for immediately responding to hardware interrupts and performing critical and urgent tasks, such as reading device status or saving register values. Their execution time must be as short as possible to avoid blocking other interrupts or affecting response speed. After the urgent task is completed, abstraction layer 1 calls the unified interrupt bottom-half function, triggering the task scheduler of adaptation layer 2 to call the bottom-half handler based on the second interface. This defers the execution of non-urgent but time-consuming operations (such as data backup and read/write operations). By separating urgent and non-urgent tasks, this design ensures the timeliness of interrupt handling while reducing the time for which coprocessor hardware resources are occupied, improving communication efficiency and stability.

[0073] Abstraction layer 1, by invoking a unified interrupt bottom-half function, can trigger the task scheduler of adaptation layer 2 to call the bottom-half handler based on the second interface. This bottom-half handler is responsible for reading the response data of the request message from the buffer. For example, the scheduled bottom-half handler can run in a safe execution context, responsible for reading and parsing the response data from the circular buffer.

[0074] Abstraction layer 1 can receive the response data of the request message that the bottom-half handler reads from the buffer. This response data is written by the coprocessor after it has finished processing the request message.
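
The split between upper-half and bottom-half processing can be sketched in user-space C as follows. This is only an illustration: the fields of `struct ipc_ctrl` and all function signatures are assumptions, the function names are borrowed from this description for readability, and "scheduling" is reduced to setting a flag, whereas a real driver would hand the work to the kernel's deferred-execution mechanism.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-channel state for this sketch. */
struct ipc_ctrl {
    bool irq_acked;           /* set by the upper half */
    bool bottom_half_queued;  /* set when deferred work is scheduled */
    int  rx_data;             /* stand-in for the receive buffer contents */
    int  delivered;           /* data handed to the abstraction layer */
};

/* Bottom half: the slower work, run later in a safe execution context. */
static void ipc_tasklet_handler(struct ipc_ctrl *c)
{
    assert(c != NULL);
    if (!c->bottom_half_queued)
        return;
    c->delivered = c->rx_data;      /* read and parse the response */
    c->bottom_half_queued = false;
}

/* Stand-in for the unified bottom-half function: it only queues work. */
static void unified_tasklet_schedule(struct ipc_ctrl *c)
{
    assert(c != NULL);
    c->bottom_half_queued = true;
}

/* Upper half: acknowledge the hardware, then defer everything else. */
static void ipc_irq_handler(struct ipc_ctrl *c)
{
    assert(c != NULL);
    c->irq_acked = true;            /* minimal hardware acknowledgment */
    unified_tasklet_schedule(c);
}
```

After `ipc_irq_handler()` returns, the interrupt has been acknowledged but no response data has been consumed yet; the data is only delivered once `ipc_tasklet_handler()` runs.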

[0075] By providing a unified interrupt bottom-half function that is mapped to different native functions under different operating systems, unified triggering of heterogeneous kernel mechanisms is achieved.

[0076] Abstraction layer 1, upon receiving the response data from the request message, can wake up the host driver to process the response data. Abstraction layer 1 can wake up the host driver using either an event-driven or polling-driven approach. For example, in event-driven mode, abstraction layer 1 can wake up blocked threads in the host driver via an event object. In polling mode, abstraction layer 1 can use write-release semantics to update flag variables, causing waiting threads in the host driver to exit their busy-wait loop.

[0077] The abstraction layer 1 of this disclosure acts as a deeply abstracted communication middleware, bridging the architectural differences between kernels of different operating systems (e.g., Windows and Linux) in terms of interrupt latency handling, thread synchronization, and task scheduling. This provides a cross-platform asynchronous communication system with consistent behavior and semantics, reentrancy, and low latency for the interaction between the GPU host driver and the on-chip management controller. This method unifies the encapsulation of heterogeneous kernel mechanisms such as interrupt bottom-half mechanisms and task scheduling models under different operating systems, enabling transparent migration and stable operation of the GPU host driver across heterogeneous operating systems and resolving cross-platform adaptation bottlenecks.

[0078] The following provides an exemplary description of a cross-platform communication system according to an embodiment of this disclosure.

[0079] In one possible implementation, the first interface further includes memory management functions for allocating memory from a pre-created kernel cache or a non-paged pool to build a cross-platform asynchronous execution environment. For example, the memory management functions may include a memory allocation function `unified_os_ipc_msg_alloc()` and a memory release function `unified_os_ipc_msg_free()`, which allocate memory from, and release memory to, a pre-created kernel cache (Linux) or a non-paged pool (Windows).

[0080] For example, abstraction layer 1 can call a unified memory management function, which can allocate work item memory in the non-paged pool through the kernel API IoAllocateWorkItem() in the Windows WDDM driver stack. Then, with the help of adaptation layer 2, IoQueueWorkItem() can be called to submit it to the work queue of delayed procedure calls managed by the system (such as CriticalWorkQueue) for execution.

[0081] In the Linux KMD driver stack, a work queue with specific attributes (such as concurrency or ordering) can be created through the wrapper interface os_alloc_workqueue(), and tasks can be submitted to the queue using os_queue_work().

[0082] Furthermore, the operations of this solution can be encapsulated and scheduled using the unified interface `unified_os_ipc_dispatch_queue_work()`. This interface can automatically select the optimal underlying queue based on the task's real-time requirements (indicated by the `need_concurrency` parameter), thereby achieving consistent asynchronous execution semantics across heterogeneous kernels.
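
A minimal sketch of queue selection driven by a `need_concurrency` flag is given below in C. The structures, the two-queue layout, and the function signature are assumptions made for illustration; this description names only the interface and the parameter. Submission is reduced to incrementing a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum queue_kind { QUEUE_ORDERED, QUEUE_CONCURRENT };

/* Hypothetical stand-in for a native work queue. */
struct work_queue {
    enum queue_kind kind;
    unsigned submitted;   /* number of tasks submitted so far */
};

/* Hypothetical dispatch context holding one queue of each kind. */
struct dispatch_ctx {
    struct work_queue ordered;
    struct work_queue concurrent;
};

/* Picks the underlying queue from need_concurrency and records the
 * submission; returns the queue that was chosen. */
static struct work_queue *
unified_os_ipc_dispatch_queue_work(struct dispatch_ctx *ctx,
                                   bool need_concurrency)
{
    assert(ctx != NULL);
    struct work_queue *q =
        need_concurrency ? &ctx->concurrent : &ctx->ordered;
    q->submitted++;
    return q;
}
```

The caller states only what it needs (ordering or concurrency), and the choice of native queue stays inside the dispatch function.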

[0083] In this way, not only is memory availability guaranteed at any interrupt request level (IRQL), but the caching mechanism also significantly improves performance in frequent communication scenarios.

[0084] In one possible implementation, the first interface includes an entry function, and the abstraction layer 1 is further configured to: call the unified entry function to receive a request message from the host driver; select a control channel corresponding to the event type of the request message according to the event type of the request message; and write the request message into the buffer of the control channel of the adaptation layer 2.

[0085] For example, host drivers under different operating systems can call the unified entry function ipc_message_transmit(msg) in abstraction layer 1. Abstraction layer 1 can automatically route to the corresponding control channel based on the event type (e.g., EVENT_TYPE_HOST_TO_embedded coprocessor) in the request message msg.

[0086] The control channels may include System Management Controller (SMC) channels, Communication Management Controller (CMC) channels, etc.

[0087] In this adaptation layer 2, multiple control channels can exist. Each control channel may include at least some of the following: a dedicated buffer (e.g., a circular buffer), hardware interrupts, a task scheduler, interrupt vectors, task units, and lock resources. Different control channels are isolated from each other and do not affect each other. These control channels can be dynamically detected and initialized when the host driver is loaded, providing high flexibility and scalability, and enabling scalable design for multiple coprocessors.

[0088] Optionally, if a control channel corresponding to the event type of the request message exists in the adaptation layer 2, a control channel corresponding to the event type of the request message can be selected from multiple control channels in the adaptation layer 2; if a control channel corresponding to the event type of the request message does not exist in the adaptation layer 2, a control channel corresponding to the event type of the request message can be created in the adaptation layer 2.
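
The select-or-create behavior for control channels can be sketched in C as follows. The enumeration values, the structures, and the function `ipc_select_channel()` are hypothetical; real initialization would also set up the buffer, interrupt vector, task unit, and lock resources of the channel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event types, one per control channel. */
enum event_type { EVENT_SMC, EVENT_CMC, EVENT_TYPE_COUNT };

struct ctrl_channel {
    bool initialized;
    enum event_type type;
    unsigned init_count;   /* how often initialization ran (illustration) */
};

struct adaptation_layer {
    struct ctrl_channel channels[EVENT_TYPE_COUNT];
};

/* Returns the channel for the event type, creating it on first use.
 * Returns NULL for an unknown event type. */
static struct ctrl_channel *
ipc_select_channel(struct adaptation_layer *al, enum event_type type)
{
    assert(al != NULL);
    if ((int)type < 0 || (int)type >= (int)EVENT_TYPE_COUNT)
        return NULL;
    struct ctrl_channel *ch = &al->channels[type];
    if (!ch->initialized) {
        ch->type = type;
        ch->initialized = true;
        ch->init_count++;
    }
    return ch;
}
```

Repeated requests of the same event type reuse the existing channel, while different event types get separate, isolated channel objects.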

[0089] The request message `msg` can be a concrete object instantiated from the abstraction layer 1 structure `msg_dispatch_info`. Abstraction layer 1 can encapsulate platform-related task units through the `struct msg_dispatch_info` structure. Its internal work member is defined at compile time as a pointer to the work item structure of the specific operating system, achieving complete decoupling between business logic and platform implementation.
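
One way to resolve the work member at compile time is conditional compilation, sketched below in C. Only the structure name `msg_dispatch_info` comes from this description; its fields, the `os_work_item_t` typedef, the `fake_*` types, and the helper functions are assumptions. A real driver would presumably use the native types (for example `PIO_WORKITEM` on Windows or `struct work_struct` on Linux) instead of the stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the native work item types. */
struct fake_windows_work_item { int irql; };
struct fake_linux_work_struct { int flags; };

/* The platform type is chosen once, at compile time. */
#if defined(_WIN32)
typedef struct fake_windows_work_item os_work_item_t;
#else
typedef struct fake_linux_work_struct os_work_item_t;
#endif

struct msg_dispatch_info {
    unsigned msg_id;
    os_work_item_t *work;   /* platform-specific work item */
    void (*handler)(struct msg_dispatch_info *info);
    int handled;
};

/* Example handler: platform-independent business logic. */
static void mark_handled(struct msg_dispatch_info *info)
{
    info->handled = 1;
}

/* Runs the handler without looking inside the work item. */
static void msg_dispatch_run(struct msg_dispatch_info *info)
{
    assert(info != NULL && info->handler != NULL);
    info->handler(info);
}
```

The business logic only ever sees `os_work_item_t *`, so the same source compiles unchanged on either platform.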

[0090] After a control channel is allocated for a request message, the request message can be written to a buffer dedicated to that control channel. This buffer may include a transmit ring buffer and a receive ring buffer; abstraction layer 1 can write the request message to the transmit ring buffer.

[0091] In this way, abstraction layer 1 can provide a unified entry function for different operating systems. Developers only need to call a unified interface to access hardware functions. Driver code for the same coprocessor can be ported to different operating systems without writing adaptation code for each system, thus exhibiting good compatibility. Furthermore, matching dedicated control channels to request messages of different event types facilitates concurrent communication with multiple embedded coprocessors.

[0092] In one possible implementation, receiving response data read by the bottom-half handler from the buffer of the adaptation layer 2 includes: receiving response data, read by the bottom-half handler from the buffer, at the location indicated by the message identifier, wherein the message identifier is a globally unique message identifier assigned by the abstraction layer 1 to the request message in response to receiving the request message from the host driver, and the response data is written to the buffer by the coprocessor after completing the request message from the host driver.

[0093] For example, abstraction layer 1 can assign a globally unique message identifier to each request message by calling the identifier function `id_map_alloc()`; the identifier is subsequently used to match request messages with response data. In this way, abstraction layer 1 can obtain, through the bottom-half handler, the response data at the position in the buffer indicated by the message identifier of the request message.

[0094] As can be seen, message identifiers can ensure the uniqueness of request messages, avoid conflicts, and improve reliability in high-concurrency scenarios.
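
Matching responses to requests by message identifier can be sketched in C as follows. Only the function name `id_map_alloc()` appears in this description; the `struct id_map` layout, the fixed slot count, the other helper functions, and all signatures are assumptions. The sketch is single-threaded, and identifiers are unique only among in-flight requests as long as the 32-bit counter does not wrap onto a request that is still pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ID_MAP_SLOTS 8u

struct id_map {
    bool     in_use[ID_MAP_SLOTS];
    bool     done[ID_MAP_SLOTS];       /* response has arrived */
    uint32_t ids[ID_MAP_SLOTS];
    int      responses[ID_MAP_SLOTS];
    uint32_t next_id;                  /* increasing counter, 0 reserved */
};

/* Returns the slot index holding id, or -1 if it is not in flight. */
static int id_map_find(const struct id_map *m, uint32_t id)
{
    assert(m != NULL);
    for (size_t i = 0; i < ID_MAP_SLOTS; i++)
        if (m->in_use[i] && m->ids[i] == id)
            return (int)i;
    return -1;
}

/* Allocates an identifier; returns 0 when no slot is free. */
static uint32_t id_map_alloc(struct id_map *m)
{
    assert(m != NULL);
    for (size_t i = 0; i < ID_MAP_SLOTS; i++) {
        if (!m->in_use[i]) {
            if (++m->next_id == 0)
                m->next_id = 1;        /* skip the reserved value 0 */
            m->in_use[i] = true;
            m->done[i] = false;
            m->ids[i] = m->next_id;
            return m->ids[i];
        }
    }
    return 0;
}

/* Called when a response carrying id arrives. */
static bool id_map_store_response(struct id_map *m, uint32_t id, int response)
{
    int i = id_map_find(m, id);
    if (i < 0)
        return false;
    m->responses[i] = response;
    m->done[i] = true;
    return true;
}

/* Called by the waiter; releases the slot once the response is taken. */
static bool id_map_take_response(struct id_map *m, uint32_t id, int *out)
{
    assert(out != NULL);
    int i = id_map_find(m, id);
    if (i < 0 || !m->done[i])
        return false;
    *out = m->responses[i];
    m->in_use[i] = false;
    return true;
}
```

Because lookup is by identifier rather than by arrival order, responses that complete out of order still reach the request that issued them.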

[0095] In one possible implementation, the first interface further includes a wait function, which is used to trigger the host driver to wait for response data in either an event notification mode or a polling mode. The first interface also includes a response function, which is used to wake up the host driver to process the response data in either an event notification mode or a polling mode.

[0096] For example, after the abstraction layer 1 notifies the coprocessor targeted by the request message, it can provide a unified wait function for the host driver. The wait function causes the host driver to wait for the response data of the request message in either event notification mode or polling mode. The abstraction layer 1 waking up the host driver to process the response data includes: providing a unified response function for the host driver, which is used to wake up the host driver to process the response data of the request message in either event notification mode or polling mode.

[0097] For synchronous calls that require a response, the host driver can call a unified wait function in abstraction layer 1. Abstraction layer 1 can provide either an event notification mode or a polling mode.

[0098] For polling mode, host drivers under different operating systems can call the unified wait function unified_ipc_msg_wait_event_polling() in abstraction layer 1 to implement it. Internally, it can use high-precision clock sources under various operating systems (such as the clock source KeQueryInterruptTimePrecise under Windows) for busy waiting, which is suitable for microsecond-level scenarios that are extremely sensitive to latency.
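
A busy-wait on a flag variable can be sketched with C11 atomics as follows. The names `poll_slot`, `poll_slot_complete`, and `poll_slot_wait` are hypothetical, and a bounded spin count stands in for the high-precision clock deadline that a real driver would use. The release store on the waker side pairs with the acquire load on the waiter side, so a waiter that observes the flag also observes the payload written before it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct poll_slot {
    int response;        /* payload, written before the flag */
    atomic_bool ready;   /* flag polled by the waiting thread */
};

/* Waker side: publish the payload, then release-store the flag. */
static void poll_slot_complete(struct poll_slot *s, int response)
{
    assert(s != NULL);
    s->response = response;
    atomic_store_explicit(&s->ready, true, memory_order_release);
}

/* Waiter side: spin at most max_spins times. Returns true and copies the
 * payload if the flag was observed, false if the spin budget ran out. */
static bool poll_slot_wait(struct poll_slot *s, unsigned long max_spins,
                           int *out)
{
    assert(s != NULL && out != NULL);
    for (unsigned long i = 0; i < max_spins; i++) {
        if (atomic_load_explicit(&s->ready, memory_order_acquire)) {
            *out = s->response;
            return true;
        }
    }
    return false;
}
```

Busy-waiting keeps the waiting thread on the CPU, which is why it suits only short, latency-sensitive waits.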

[0099] For the event notification mode, host drivers under different operating systems can call the unified wait function unified_ipc_msg_wait_event() in abstraction layer 1 to implement it. It encapsulates various blocking mechanisms of operating system kernels (such as KeWaitForSingleObject under Windows and wait_event_timeout under Linux), and actively yields the host driver's resources to perform other tasks during the waiting period, thereby improving resource utilization.

[0100] For example, if the host driver, acting as the communication initiator, adopts a certain synchronous mode, the host driver's thread will enter a controlled waiting state after sending a request. For instance, in Windows, this can be achieved by blocking on an event object using `KeWaitForSingleObject()` and setting a precise timeout. In Linux, it can be achieved by sleeping on the wait queue using `os_wait_event_timeout()`.

[0101] Once the response data for the request message is ready, abstraction layer 1 can call the unified response function `unified_ipc_schedule_response_msg()`. This response function is a unified abstraction for the wake-up mechanism. For example, in event notification mode, blocked threads can be woken up using the `KeSetEvent()` function in Windows or the `wake_up()` function in Linux. In polling mode, write-release semantics can be used to update flag variables, causing waiting threads to exit the busy-wait loop.
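
The way a single response function can cover both wake-up paths is sketched below in C. The function name is taken from this description, but its signature, the `struct ipc_waiter` layout, and the `wait_mode` enumeration are assumptions. The event path is reduced to setting a boolean where a real driver would call the native wake-up function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

enum wait_mode { WAIT_MODE_EVENT, WAIT_MODE_POLLING };

/* Hypothetical per-request waiter state. */
struct ipc_waiter {
    enum wait_mode mode;
    bool event_signaled;    /* stand-in for a native event or wait queue */
    atomic_bool poll_done;  /* flag spun on in polling mode */
};

/* Unified response function: one entry point, two wake-up mechanisms. */
static void unified_ipc_schedule_response_msg(struct ipc_waiter *w)
{
    assert(w != NULL);
    switch (w->mode) {
    case WAIT_MODE_EVENT:
        /* A real driver would call the native wake-up function here. */
        w->event_signaled = true;
        break;
    case WAIT_MODE_POLLING:
        /* Release store so the waiter sees data written before the flag. */
        atomic_store_explicit(&w->poll_done, true, memory_order_release);
        break;
    }
}
```

The caller does not need to know which mode the waiter chose; the mode is recorded with the waiter when the wait begins.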

[0102] Therefore, by providing unified wait and response functions in abstraction layer 1, completely consistent semantics can be presented to the host driver, making it applicable to various operating systems. Furthermore, abstraction layer 1 provides two synchronization mechanisms, polling and event notification, supporting waiting strategies that range from microsecond-level precise timing to thread-level blocking scheduling, so as to adapt to communication scenarios with different real-time requirements. In addition, synchronous waiting can be achieved through the wait and response functions, which helps provide predictable low-latency responses and stable communication bandwidth for GPU tasks such as power management and thermal control, reducing performance jitter caused by platform differences.

[0103] In one possible implementation, calling a unified interrupt bottom-half function to trigger the task scheduler of the adaptation layer 2 to call the bottom-half handler based on the second interface includes: calling the unified interrupt bottom-half function in a deferred procedure call or task chaining model to trigger the task scheduler of the adaptation layer 2 to call the bottom-half handler under different operating systems based on the second interface.

[0104] For example, when the coprocessor completes a request and triggers a hardware interrupt, its interrupt service routine only performs the necessary hardware response and then calls the unified interrupt bottom-half function interface `unified_tasklet_schedule()`. This interface maps to the task chaining model-related native functions `tasklet_schedule()` or `queue_work()` in Linux environments, and to the deferred procedure call-related native functions `DxgkCbQueueDpc()` or `IoQueueWorkItem()` in Windows environments. The task scheduler in adaptation layer 2 responds to the interrupt bottom-half function interface `unified_tasklet_schedule()`. It can execute `tasklet_schedule()` or `queue_work()` through the second interface to run the bottom-half handler under the Linux operating system, or it can execute `DxgkCbQueueDpc()` or `IoQueueWorkItem()` through the second interface to run the bottom-half handler under the Windows operating system.

[0105] In this way, unified behavior abstraction and interface encapsulation are provided for interrupt bottom-half calls under different operating systems (such as the task chaining model in Linux and deferred procedure calls in Windows), which is beneficial for cross-platform work item distribution.

[0106] Figure 3 shows a schematic diagram of a cross-platform communication system according to an embodiment of the present disclosure. Taking Figure 3 as an example, the cross-platform communication system of this disclosure is described below. The adaptation layer 2 may include a ring buffer, hardware interrupts, and a task scheduler.

[0107] During the request sending phase, the GPU host driver calls the unified entry function `ipc_message_transmit()`, regardless of the operating system type. The cross-platform communication system automatically handles request message routing, message identifier allocation, and buffer management, triggering the embedded coprocessor through an abstract doorbell notification mechanism.

[0108] For example, the GPU host driver calls the unified entry function `ipc_message_transmit(msg)`, which automatically routes the request message `msg` to the corresponding control channel (e.g., SMC channel or CMC channel) based on its event type. Abstraction layer 1 can assign a globally unique message identifier ID to each request message by calling the identifier function `id_map_alloc()`, which is used for matching subsequent request messages with response data.

[0109] The cross-platform communication system also supports delayed initialization of the control channel. If the control channel already exists, channel matching can be performed directly based on the event type of the request message to select the corresponding control channel; if the control channel is not ready, the initialization function ipc_do_ctrl_init() will be automatically called to complete operations such as ring buffer verification, lock initialization, and task unit registration to ensure the robustness of the communication link.

[0110] Abstraction layer 1 can write request messages into the transmit ring buffer (TX RingBuffer) in the corresponding control channel and trigger the doorbell register to notify the target coprocessor (e.g., the SMC coprocessor).

[0111] When a thread in the GPU host driver calls the unified wait function `unified_ipc_msg_wait_event()`, the host driver enters a controlled wait state. This wait mode can include event notification mode and polling mode (high-precision busy wait). For example, in event notification mode, after abstraction layer 1 calls the unified wait function `unified_ipc_msg_wait_event()`, it can be mapped to the actual wait function `KeWaitForSingleObject` in Windows, and to the wait function `wait_event_timeout` in Linux, but presents completely consistent semantics to the GPU host driver.

[0112] During the interrupt bottom-half scheduling phase, after the coprocessor completes its processing, a hardware interrupt is triggered. The hardware interrupt calls the interrupt service routine `ipc_irq_handler(ipc_ctrl)`, which performs only minimal processing. Upon receiving the acknowledgment signal confirming the completion of the interrupt service routine `ipc_irq_handler(ipc_ctrl)`, abstraction layer 1 immediately calls the unified interrupt bottom-half function `unified_tasklet_schedule()`. This function maps to the native callback function `DxgkCbQueueDpc` in Windows and to the native scheduling function `tasklet_schedule` in Linux, thus achieving unified triggering of heterogeneous kernel mechanisms.

[0113] During the response processing phase, the bottom-half handler (such as `ipc_tasklet_handler`) triggered by the interrupt bottom-half function `unified_tasklet_schedule()` reads response data from the receive ring buffer (RX Ring Buffer) and calls the unified response function `unified_ipc_schedule_response_msg()`. Depending on the waiting mode, it sets the event flag or clears the polling variable, thus precisely waking up the previously waiting driver thread and completing the entire communication transaction. For example, in event notification mode, the blocked thread can be woken up using the `KeSetEvent()` function in Windows or the `wake_up()` function in Linux. In polling mode, write-release semantics can be used to update the flag variable, causing the waiting thread to exit the busy-waiting loop. Regardless of whether the underlying implementation uses an event object or a wait queue (event notification mode) or a flag variable (polling mode), the correct waiting thread can be precisely woken up, ensuring the integrity and reliability of the transaction.

[0114] Figure 4 illustrates inter-process communication in operating systems in the related art. As shown in Figure 4, the Windows WDDM driver adopts a discrete communication scheme, which may include a request initiation phase, an interrupt and deferred procedure call (DPC) processing phase, and a response polling and processing phase.

[0115] During the request initiation phase, the host driver can send a function request message (such as a power management request message GetPower) and call the inter-process communication function IpcTransmitMsg to select synchronous or asynchronous mode based on the communication mode. Then, it can call the sendIpcMessage function to write the function request message to the request queue and trigger a soft interrupt to notify the embedded coprocessor via the softIrqRaise function. For synchronous requests, the thread does not directly wait for a response but instead notifies a separate manager thread via the event notification function KeSetEvent.

[0116] During the interrupt and deferred procedure call processing phase, the embedded coprocessor triggers a hardware interrupt, calling the interrupt handling callback function DxgkInterruptRoutine. The interrupt service routine calls the interrupt handling callback function ProcessInterrupt for processing, and subsequently the scheduling function scheduleDpc can schedule a deferred procedure call. Further processing is performed in the deferred procedure call routine (such as DpcRoutine), and the event notification function KeSetEvent is called again to notify the manager thread.

[0117] During the response polling and processing phase, a separate embedded coprocessor manager thread exists. This manager thread checks the request queue in a loop, either by waiting for events or by polling. When the manager thread detects a response, it calls the inter-process communication callback function IPCMessageHandler, which then calls the reply message processing function ProcessReplyMsg to process the response data.

[0118] Figure 5 shows a schematic diagram of inter-process communication according to an embodiment of this disclosure. As shown in Figure 5, Windows inter-process communication in the cross-platform communication system can include a request sending phase, an interrupt bottom-half scheduling phase, and a response processing phase.

[0119] During the request sending phase, the host driver can send a function request message (such as a get_freq request message to obtain the coprocessor's operating frequency). This can be achieved by calling the inter-process communication transmission function ipc_transmit, which then enters the unified entry point ipc_transmit_ext. The unified entry point function ipc_message_transmit routes the message to the corresponding channel based on its type, calling the send request message function ipc_send_message. Finally, the message is written to the request queue via the function call rb_ops->put, triggering a soft interrupt to notify the embedded coprocessor.

[0120] During the interrupt bottom-half scheduling phase, the embedded coprocessor triggers a hardware interrupt, calling the interrupt handler callback function DxgkInterruptRoutine. After the interrupt service routine (ISR) performs hardware verification, it calls the native Windows operating system function DxgkCbQueueDpc to schedule a deferred procedure call, ultimately executing the unified deferred procedure call service routine unified_os_dpc_routine.

[0121] During the response processing phase, the unified deferred procedure call service routine `unified_os_dpc_routine` calls the bottom-half handler `ipc_tasklet_handler` to read response data from the response queue. The waiting thread is then awakened by the unified response function `unified_ipc_schedule_response_msg`, which returns the data and completes the transaction.

[0122] Figure 6 shows a schematic diagram of another inter-process communication according to an embodiment of this disclosure. As shown in Figure 6, Linux inter-process communication in the cross-platform communication system can include a request sending phase, an interrupt bottom-half scheduling phase, and a response processing phase.

[0123] During the request sending phase, the host driver can send a function request message. After calling the inter-process communication transmission function `ipc_transmit` to enter the unified entry point `ipc_transmit_ext`, the unified entry function `ipc_message_transmit` routes the message to the corresponding channel based on the message type and calls the send request message function `ipc_send_message`. Then, it writes the message to the send circular buffer via the function call `rb_ops->put`, triggering a soft interrupt to notify the embedded coprocessor.

[0124] During the interrupt bottom-half scheduling phase, the embedded coprocessor triggers a hardware interrupt, calling the interrupt service routine (ISR) `ipc_irq_handler()`. After quickly acknowledging the interrupt, the ISR calls the unified interrupt bottom-half function `unified_tasklet_schedule()` to schedule the bottom-half task.

[0125] During the response processing phase, the scheduled bottom-half handler `ipc_tasklet_handler` reads the response data from the receive circular buffer and, through the unified response function `unified_ipc_schedule_response_msg()`, wakes up the thread blocked in the wait function `unified_ipc_msg_wait_event()`, thus completing the inter-process communication transaction.

[0126] Figure 7 shows a flowchart of a cross-platform communication method according to an embodiment of this disclosure. As shown in Figure 7, the method is applied to a cross-platform communication system, which is used to realize cross-platform communication between host drivers and coprocessors in the hardware layer under different operating systems. The cross-platform communication system includes an abstraction layer and an adaptation layer. The method includes:

[0127] In step S71, the abstraction layer provides a unified first interface for host drivers under different operating systems, wherein the first interface supports communication between the adaptation layer and the host driver, and the abstraction layer is connected to the host driver;

[0128] In step S72, the adaptation layer receives the call from the abstraction layer and maps the first interface of the abstraction layer to a second interface under a different operating system to support communication between the abstraction layer and the coprocessor.
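Steps S71 and S72 amount to an adapter arrangement: the host driver is written once against the first interface, and a per-platform adapter maps each call to a second interface. The sketch below models that idea in Python under stated assumptions. The class names, the method `unified_alloc`, and the string labels are hypothetical; `unified_tasklet_schedule` is a name from the text, but its body here is invented. The handler runs immediately for simplicity, whereas a real adapter would defer it.

```python
from abc import ABC, abstractmethod


class Adapter(ABC):
    """Adaptation layer: maps the unified first interface to OS primitives."""

    @abstractmethod
    def schedule_bottom_half(self, handler):
        """Run `handler` through the platform's deferred-work mechanism."""

    @abstractmethod
    def allocate(self, size):
        """Return a buffer from the platform's pre-created memory source."""


class LinuxAdapter(Adapter):
    def schedule_bottom_half(self, handler):
        # A real adapter would hand the handler to a tasklet or workqueue.
        return ("tasklet", handler())

    def allocate(self, size):
        # A real adapter would draw from a pre-created kernel cache.
        return ("kernel_cache", bytearray(size))


class WindowsAdapter(Adapter):
    def schedule_bottom_half(self, handler):
        # A real adapter would queue a DPC or work item.
        return ("dpc", handler())

    def allocate(self, size):
        # A real adapter would draw from a non-paged pool.
        return ("non_paged_pool", bytearray(size))


class AbstractionLayer:
    """Unified first interface seen by the host driver."""

    def __init__(self, adapter):
        self._adapter = adapter

    def unified_tasklet_schedule(self, handler):
        return self._adapter.schedule_bottom_half(handler)

    def unified_alloc(self, size):
        return self._adapter.allocate(size)


def host_driver_request(layer):
    """Host-driver code written once against the unified interface."""
    mechanism, result = layer.unified_tasklet_schedule(lambda: "handled")
    source, buf = layer.unified_alloc(16)
    return mechanism, source, result, len(buf)
```

`host_driver_request` contains no platform checks; only the adapter passed to `AbstractionLayer` differs between the two platforms.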

[0129] In one possible implementation, the cross-platform communication method of this disclosure embodiment is used for communication between various coprocessors integrated in GPUs and GPU host drivers, providing a unified framework across operating systems.

[0130] In one possible implementation, the cross-platform communication method of this disclosure embodiment can be executed by an electronic device such as a terminal device or a server. The terminal device can be a user equipment (UE), mobile device, user terminal, terminal, cellular phone, cordless phone, personal digital assistant (PDA), handheld device, computing device, in-vehicle device, wearable device, etc. The method can be implemented by a processor calling computer-readable instructions stored in memory. Alternatively, the method can be executed by a server.

[0131] In this way, a unified cross-platform asynchronous communication system is provided for the interaction between the GPU host driver and the on-chip coprocessor. It can achieve transparent operation at the source code level without having to care about the specific implementation of the underlying operating system, thus avoiding the adaptation costs and reliability risks brought about by cross-platform porting.

[0132] In one possible implementation, the method further includes: the abstraction layer receiving, through the first interface, request messages from the host driver under different operating systems; the abstraction layer notifying the coprocessor corresponding to the request message, so that the coprocessor, after completing the request message, writes response data to the buffer of the adaptation layer and triggers a hardware interrupt of the adaptation layer; the abstraction layer, in response to the hardware interrupt, reading the response data of the request message from the buffer; and the abstraction layer waking up the host driver to process the response data.

[0133] In one possible implementation, the first interface includes an entry function, and the method further includes: the abstraction layer calling the unified entry function to receive a request message from the host driver; the abstraction layer selecting a control channel corresponding to the event type of the request message; and the abstraction layer writing the request message into a buffer in the control channel of the adaptation layer.

[0134] In one possible implementation, the first interface further includes an interrupt bottom-half function, and the method further includes: the abstraction layer receiving an acknowledgment signal generated by a hardware interrupt of the adaptation layer, the acknowledgment signal being used to indicate that the hardware interrupt has completed the upper-half processing operation; the abstraction layer, based on the acknowledgment signal, calling the unified interrupt bottom-half function to trigger the task scheduler of the adaptation layer to call the bottom-half handler based on the second interface; and the abstraction layer receiving response data read by the bottom-half handler from the buffer of the adaptation layer.

[0135] In one possible implementation, calling the unified interrupt bottom-half function to trigger the task scheduler of the adaptation layer to call the bottom-half handler based on the second interface includes: calling the unified interrupt bottom-half function to trigger the task scheduler of the adaptation layer to call, based on the second interface, the bottom-half handler under different operating systems in a deferred procedure call or tasklet model.

[0136] In one possible implementation, receiving response data read by the bottom-half handler from the buffer of the adaptation layer includes: receiving response data that the bottom-half handler reads from the buffer at the location indicated by the message identifier, wherein the message identifier is a globally unique message identifier assigned by the abstraction layer to the request message in response to receiving the request message from the host driver, and the response data is written to the buffer by the coprocessor after completing the request message from the host driver.
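One way to picture the role of the message identifier is a table of outstanding requests keyed by identifier, so that responses arriving in any order are matched to the right request. The following Python sketch assumes such a table; the class `PendingTable` and its methods are hypothetical, and a per-table counter stands in for the globally unique identifier described in the text.

```python
import itertools


class PendingTable:
    """Tracks outstanding requests by message identifier."""

    def __init__(self):
        self._ids = itertools.count(1)  # monotonically increasing identifiers
        self._waiting = set()           # identifiers with no response yet
        self._done = {}                 # identifier -> response data

    def register(self):
        """Assign an identifier when a request is accepted."""
        msg_id = next(self._ids)
        self._waiting.add(msg_id)
        return msg_id

    def complete(self, msg_id, response):
        """Record a response; reject unknown or duplicate identifiers."""
        if msg_id not in self._waiting:
            return False
        self._waiting.remove(msg_id)
        self._done[msg_id] = response
        return True

    def collect(self, msg_id):
        """Hand the response to the requester, or None if not ready."""
        return self._done.pop(msg_id, None)
```

Rejecting identifiers that are not outstanding keeps a stale or duplicated response from being delivered to the wrong requester.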

[0137] In one possible implementation, the first interface further includes a response function, which is used to wake up the host driver to process response data in either an event notification mode or a polling mode.

[0138] In one possible implementation, the first interface further includes a wait function, which is used to trigger the host driver to wait for response data in either an event notification mode or a polling mode.
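The two waiting modes can be illustrated with a single helper that either blocks on an event or polls a readiness check until a deadline. This is a minimal sketch under assumptions: the function name `wait_for_response`, its parameters, and the default intervals are hypothetical and do not reflect the actual wait function of the disclosure.

```python
import threading
import time
from typing import Callable, Optional


def wait_for_response(
    is_ready: Callable[[], bool],
    event: Optional[threading.Event] = None,
    timeout: float = 1.0,
    poll_interval: float = 0.001,
) -> bool:
    """Return True if the response arrived before the timeout."""
    if event is not None:
        # Event-notification mode: block until the response side signals.
        return event.wait(timeout)
    # Polling mode: check the readiness predicate until the deadline.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_ready():
            return True
        time.sleep(poll_interval)
    # One final check so a response landing at the deadline is not missed.
    return is_ready()
```

Event mode suits callers that may sleep, while polling mode suits short waits where the caller cannot block on an event; the choice is left to the caller through the `event` argument.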

[0139] In one possible implementation, the first interface also includes a unified memory management function for allocating memory from a pre-created kernel cache or non-paged pool to build a cross-platform asynchronous execution environment.
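Allocating from a pre-created pool, rather than from a general allocator at request time, can be modeled as below. This is a hedged illustration only: the class `MessagePool`, its fixed buffer size, and the choice to zero each buffer on allocation are assumptions for the sketch and are not taken from the unified memory management function itself.

```python
class MessagePool:
    """Pre-created pool of fixed-size buffers, handed out zeroed."""

    def __init__(self, count, size):
        self._size = size
        # All buffers are created up front, before any request is served.
        self._free = [bytearray(size) for _ in range(count)]

    def alloc(self):
        if not self._free:
            return None  # pool exhausted; this model has no fallback
        buf = self._free.pop()
        buf[:] = bytes(self._size)  # zero the buffer before reuse
        return buf

    def free(self, buf):
        if len(buf) != self._size:
            raise ValueError("buffer does not belong to this pool")
        self._free.append(buf)
```

Because `alloc` never creates new buffers, its cost is bounded, which is the property that matters when allocation happens on an interrupt-related path.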

[0140] In summary, the embodiments of this disclosure construct a layered cross-platform communication system. First, the system manages inter-process communication (IPC) messages: for example, it is responsible for zero-initialized safe allocation, cache pooling, and lifecycle management of communication messages, ensuring reliable operation in interrupt contexts and other paths. Second, the system implements synchronous waiting control: for example, it provides two synchronization mechanisms, polling and event notification, supporting waiting strategies that range from microsecond-level precise timing to thread-level blocking scheduling, to suit communication scenarios with different real-time requirements. Third, the system abstracts task scheduling: it provides a unified behavior abstraction and interface encapsulation over Linux tasklets or workqueues and over Windows deferred procedure calls (DPC) / work item scheduling (WorkItem), to achieve cross-platform work item distribution. Fourth, the system performs interrupt handling and routing: it captures and responds to hardware interrupt signals from embedded coprocessors, completes the fast upper-half processing of the interrupt, and reliably schedules the corresponding bottom-half processing tasks. Fifth, the circular buffer in the system provides lock-free or fine-grained-lock data transmission channels based on the producer-consumer model, ensuring message ordering and data consistency in a concurrent environment. In addition, the system features a unified resource management module that enables unified creation, initialization, synchronization, and destruction of controlled objects (such as IPC channels, task units, and buffers), ensuring overall system resource consistency.
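The producer-consumer circular buffer mentioned in this summary can be illustrated with a single-producer, single-consumer ring in which each index is advanced by only one side. The Python below is a sketch of the indexing scheme only; the class name `SpscRing` is hypothetical, and a kernel implementation would additionally need memory barriers or atomic operations, which Python does not model.

```python
class SpscRing:
    """Single-producer/single-consumer ring; one slot is kept empty."""

    def __init__(self, slots):
        self._buf = [None] * slots
        self._head = 0  # next slot to write; advanced only by the producer
        self._tail = 0  # next slot to read; advanced only by the consumer

    def put(self, item):
        nxt = (self._head + 1) % len(self._buf)
        if nxt == self._tail:
            return False  # full: writing would collide with unread data
        self._buf[self._head] = item
        self._head = nxt  # publish the slot only after it is written
        return True

    def get(self):
        if self._tail == self._head:
            return None  # empty
        item = self._buf[self._tail]
        self._tail = (self._tail + 1) % len(self._buf)
        return item
```

Keeping one slot empty lets "full" and "empty" be told apart from the two indices alone, so neither side needs a shared element count; messages come out in the order they went in.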

[0141] As can be seen, this solution addresses the communication difficulties between GPU host drivers and embedded coprocessors caused by the heterogeneity of operating system kernels in related technologies. It provides a cross-platform asynchronous communication system with deterministic low latency characteristics. This method achieves unified communication semantics by constructing an abstraction layer 1 and an adaptation layer 2. For example, by constructing a unified interface specification, it encapsulates and shields the underlying operating system-specific interrupt bottom-half mechanisms (such as Linux's tasklet / softirq and Windows' DPC) and task scheduling models (such as Linux's workqueue and Windows' KeWorkItem), ensuring that the same IPC communication logic has consistent behavioral semantics in heterogeneous kernel environments. Furthermore, this solution provides predictable low-latency responses and stable communication bandwidth for critical tasks such as GPU power management and thermal control through optimized synchronous waiting and buffer management strategies, eliminating performance jitter introduced by platform differences. In addition, this solution enables data exchange between the GPU host driver and the coprocessor to achieve transparent source-code-level operation between Windows WDDM and Linux DRM / KMD systems without needing to concern themselves with the specific implementation of the underlying operating system, eliminating the adaptation costs and reliability risks associated with cross-platform porting.

[0142] It is understood that the various method embodiments mentioned above in this disclosure can be combined with each other to form combined embodiments without violating the principle and logic. Due to space limitations, this disclosure will not elaborate further. Those skilled in the art will understand that in the above methods of specific implementation, the specific execution order of each step should be determined by its function and possible internal logic.

[0143] In addition, this disclosure also provides cross-platform communication devices, electronic devices, computer-readable storage media, and program products, all of which can be used to implement any of the cross-platform communication methods provided in this disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding passages in the method section; they are not repeated here.

[0144] In some embodiments, the functions or modules of the apparatus provided in this disclosure can be used to perform the methods described in the above method embodiments. The specific implementation can be referred to the description of the above method embodiments, and for the sake of brevity, it will not be repeated here.

[0145] This disclosure also provides an electronic device, which includes the cross-platform communication system described above.

[0146] This disclosure also provides a non-volatile computer-readable storage medium storing a computer program thereon, which, when executed by a processor, enables an electronic device to perform the functions achievable by the cross-platform communication system described above.

[0147] This disclosure also provides a computer program product, including a computer program or a non-volatile computer-readable storage medium carrying the computer program, wherein the computer program, when executed by a processor, enables an electronic device to perform the functions achievable by the cross-platform communication system described above.

[0148] Figure 8 shows a block diagram of an electronic device 1900 according to an embodiment of the present disclosure. For example, the electronic device 1900 may be provided as a server or a terminal device. Referring to Figure 8, the electronic device 1900 includes a processing component 1922, which further includes one or more processors, and memory resources represented by memory 1932 for storing instructions, such as application programs, that can be executed by the processing component 1922. The application programs stored in memory 1932 may include one or more modules, each corresponding to a set of instructions. Furthermore, the processing component 1922 is configured to execute instructions to perform the methods described above.

[0149] Electronic device 1900 may also include a power supply component 1926 configured to perform power management of electronic device 1900, a wired or wireless network interface 1950 configured to connect electronic device 1900 to a network, and an input / output interface 1958 (I / O interface). Electronic device 1900 can operate based on an operating system stored in memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or similar.

[0150] In an exemplary embodiment, a non-volatile computer-readable storage medium is also provided, such as a memory 1932 including computer program instructions that can be executed by a processing component 1922 of an electronic device 1900 to perform the above-described method.

[0151] Computer-readable storage media can be tangible devices capable of holding and storing programs / instructions used by instruction execution devices. Computer-readable storage media can be, for example (but not limited to), electrical storage devices, magnetic storage devices, optical storage devices, electromagnetic storage devices, semiconductor storage devices, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile disc (DVD), memory sticks, floppy disks, mechanical encoding devices such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. The computer-readable storage media used herein are not to be construed as transient signals themselves, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber optic cables), or electrical signals transmitted through wires.

[0152] The computer program (or computer-readable program instructions) described herein can be downloaded from a computer-readable storage medium to various computing / processing devices, or downloaded via a network, such as the Internet, local area network, wide area network, and / or wireless network, to an external computer or external storage device. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and / or edge servers. A network adapter card or network interface in each computing / processing device receives the computer-readable program instructions from the network and forwards them to the computer-readable storage medium in the respective computing / processing device.

[0153] The computer program (or computer program instructions) used to perform the operations of this disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, etc., and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer via any type of network—including a local area network (LAN) or a wide area network (WAN)—or may be connected to an external computer (e.g., via the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized by utilizing state information from the computer-readable program instructions to implement various aspects of this disclosure.

[0154] Various aspects of this disclosure are described herein with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this disclosure. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer-readable program instructions.

[0155] These computer-readable program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine such that, when executed by the processor of the computer or other programmable data processing apparatus, they create means for implementing the functions / actions specified in one or more blocks of the flowchart and / or block diagram. These computer-readable program instructions can also be stored in a computer-readable storage medium that causes a computer, programmable data processing apparatus, and / or other device to operate in a particular manner; thus, the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing aspects of the functions / actions specified in one or more blocks of the flowchart and / or block diagram.

[0156] Computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, thereby causing the instructions executed on the computer, other programmable data processing apparatus, or other device to perform the functions / actions specified in one or more blocks of a flowchart and / or block diagram.

[0157] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of an instruction containing one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions marked in the blocks may occur in a different order than those marked in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, may be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.

[0158] The various embodiments of this disclosure have been described above. These descriptions are exemplary and not exhaustive, nor are they limited to the disclosed embodiments. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies found in the market, or to enable others skilled in the art to understand the embodiments disclosed herein.

Claims

1. A cross-platform communication system, characterized in that the cross-platform communication system is used to enable cross-platform communication between host drivers and coprocessors in the hardware layer under different operating systems, and the cross-platform communication system includes an abstraction layer and an adaptation layer; the abstraction layer connects to the host driver and provides a unified first interface for host drivers under different operating systems, the first interface supporting communication between the adaptation layer and the host driver; the first interface also includes a unified memory management function, which is used to allocate memory from a pre-created kernel cache or non-paged pool to build a cross-platform asynchronous execution environment; and the adaptation layer receives a call from the abstraction layer to map the first interface of the abstraction layer to a second interface under different operating systems, so as to support communication between the abstraction layer and the coprocessor.

2. The cross-platform communication system according to claim 1, characterized in that the abstraction layer is also used for: receiving, through the first interface, request messages from the host driver under different operating systems; notifying the coprocessor corresponding to the request message, so that after the coprocessor completes the request message, it writes response data to the buffer of the adaptation layer and triggers a hardware interrupt of the adaptation layer; in response to the hardware interrupt, reading the response data of the request message from the buffer; and waking up the host driver to process the response data.

3. The cross-platform communication system according to claim 1, characterized in that the first interface includes an entry function, and the abstraction layer is further used for: invoking the unified entry function to receive request messages from the host driver; selecting, based on the event type of the request message, a control channel corresponding to the event type for the request message; and writing the request message into the buffer of the control channel in the adaptation layer.

4. The cross-platform communication system according to claim 1, characterized in that the first interface also includes an interrupt bottom-half function, and the abstraction layer is further used for: receiving an acknowledgment signal generated by a hardware interrupt of the adaptation layer, the acknowledgment signal being used to indicate that the hardware interrupt has completed the upper-half processing operation; invoking, based on the acknowledgment signal, the unified interrupt bottom-half function to trigger the task scheduler of the adaptation layer to call the bottom-half handler based on the second interface; and receiving response data read by the bottom-half handler from the buffer of the adaptation layer.

5. The cross-platform communication system according to claim 4, characterized in that invoking the unified interrupt bottom-half function to trigger the task scheduler of the adaptation layer to call the bottom-half handler based on the second interface includes: invoking the unified interrupt bottom-half function to trigger the task scheduler of the adaptation layer to call, based on the second interface, the bottom-half handler under different operating systems in a deferred procedure call or tasklet model.

6. The cross-platform communication system according to claim 4, characterized in that receiving response data read by the bottom-half handler from the buffer of the adaptation layer includes: receiving response data that the bottom-half handler reads from the buffer at the location indicated by the message identifier, wherein the message identifier is a globally unique message identifier assigned by the abstraction layer to the request message received from the host driver, and the response data is written to the buffer by the coprocessor after completing the request message from the host driver.

7. The cross-platform communication system according to any one of claims 1 to 6, characterized in that the first interface also includes a response function, which is used to wake up the host driver to process response data in either event notification mode or polling mode.

8. The cross-platform communication system according to any one of claims 1 to 6, characterized in that the first interface also includes a wait function, which is used to trigger the host driver to wait for response data in either event notification mode or polling mode.

9. A cross-platform communication method, characterized in that the method is applied to a cross-platform communication system, which enables cross-platform communication between host drivers and coprocessors in the hardware layer under different operating systems, and the cross-platform communication system includes an abstraction layer and an adaptation layer; the method includes: the abstraction layer providing a unified first interface for host drivers under different operating systems, wherein the first interface supports communication between the adaptation layer and the host driver, the abstraction layer is connected to the host driver, and the first interface also includes a unified memory management function, which is used to allocate memory from a pre-created kernel cache or non-paged pool to build a cross-platform asynchronous execution environment; and the adaptation layer receiving the call from the abstraction layer and mapping the first interface of the abstraction layer to a second interface under different operating systems to support communication between the abstraction layer and the coprocessor.

10. An electronic device, characterized in that the electronic device includes a cross-platform communication system as described in any one of claims 1 to 8.

11. A non-volatile computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, enables an electronic device to perform the functions that the cross-platform communication system as described in any one of claims 1 to 8 can achieve.

12. A computer program product comprising a computer program, or a non-volatile computer-readable storage medium carrying the computer program, characterized in that the computer program, when executed by a processor, enables an electronic device to perform the functions that the cross-platform communication system as described in any one of claims 1 to 8 can achieve.