Hardware modules, virtualized time-sharing control methods, devices and related equipment
By introducing a timer in the hardware module to record instruction processing time, the cost and efficiency issues caused by adding "in-command switching" support to the hardware module are resolved, and the normal processing and fair use of tenant instructions are realized.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- NANJING ILUVATAR COREX TECH CO LTD (DBA ILUVATAR COREX INC NANJING)
- Filing Date
- 2022-12-09
- Publication Date
- 2026-06-30
AI Technical Summary
Adding "in-command switching" support to the hardware module results in high R&D costs, low development efficiency, and increased hardware module area overhead.
By introducing a timer into the hardware module to record how long the current tenant keeps running after the termination command is issued, until instruction processing completes, and subtracting this duration from the tenant's next time slice, instruction processing does not have to be forcibly terminated midway, which reduces the area of the hardware module and R&D costs.
It enables the normal processing of tenant commands, avoids data loss, ensures the fairness of hardware module usage, and reduces R&D costs and area overhead.
Figure CN116010025B_ABST
Description
Technical Field
[0001] This application relates to the field of virtualization technology, and more specifically, to a hardware module, a virtualization time-sharing control method, a device, and related equipment.
Background Technology
[0002] For devices that support hardware virtualization (such as GPUs (Graphics Processing Units)), virtualization is typically implemented using time-sharing technology, allowing multiple tenants (also referred to as clients in some literature) to share the same set of hardware resources. To ensure fair use of hardware resources among tenants, a time slice (e.g., 100ms) is usually assigned to each tenant. When the time slice ends, the management software or firmware issues a series of instructions to switch the tenant using the hardware resources.
[0003] In practical applications, it's possible to encounter situations where the time slice expires before the instruction has finished executing. If a tenant using hardware resources is switched directly in this case, the execution of the current instruction must be forcibly terminated, potentially leading to data loss and errors when processing subsequent instructions for that tenant.
[0004] To prevent this from happening, support for "in-command switching" is added to multiple hardware modules of the device. This allows the context to be saved immediately when the time slice ends (i.e., all context information of the current tenant is saved). When the context is later restored at run time, execution resumes from the point at which the current instruction was interrupted, thus ensuring that the instruction can continue to be executed correctly.
[0005] However, adding support for "in-command switching" to the hardware module requires significant R&D manpower and time, which will substantially increase the development cost and reduce the development efficiency of the hardware module. Furthermore, adding support for "in-command switching" to the hardware module also requires adding more registers, which will increase the area overhead of the hardware module to some extent.
Summary of the Invention
[0006] The purpose of this application is to provide a hardware module, a virtualized time-sharing control method, a device, and related equipment to solve the problem that adding support for "in-command switching" to the hardware module would significantly increase hardware development costs and reduce development efficiency, and would also increase hardware area overhead to some extent.
[0007] This application provides a hardware module supporting virtualization processing, including: a processing unit, configured to receive and process pending instructions from the current tenant, and to provide feedback to the upper-layer software with first notification information indicating that tenant switching is possible after processing the pending instructions when a termination command is received from the upper-layer software; and a timer, connected to the processing unit, configured to start timing when the termination command is received, and to provide feedback to the upper-layer software with the timing duration when a second notification information is received from the processing unit, so that the upper-layer software subtracts the timing duration from the next time slice of the current tenant; wherein the second notification information is a notification generated by the processing unit when it finishes processing the pending instructions.
[0008] In the above implementation, a timer is added to the hardware module supporting virtualization processing. This timer is connected to the processing unit within the hardware module, which processes pending instructions from the current tenant and commands issued by the upper-layer software. The processing unit is configured so that, upon receiving a termination command from the upper-layer software, it does not immediately stop processing a pending instruction that has not yet been completed. Instead, it continues processing the instruction until it is finished, then ceases service to the current tenant and sends the upper-layer software a first notification indicating that a tenant switch is possible. Correspondingly, the timer is configured to start timing when a termination command is received from the upper-layer software, and to send the timing duration to the upper-layer software upon receiving a second notification after the processing unit has completed processing the pending instructions. This allows the upper-layer software to subtract the timing duration from the current tenant's next time slice. This ensures that tenant instructions are processed normally and prevents data loss due to forced termination of instruction processing during a tenant switch, thus avoiding subsequent instruction processing errors caused by data loss.
[0009] Meanwhile, since the timer records the duration from the time the termination command is received until the processing unit finishes processing the currently processed instruction, it is essentially the duration of the current tenant's overuse. Therefore, in this embodiment, the upper-layer software can ensure the fairness of each tenant's use of the hardware module by subtracting this duration from the current tenant's next time slice.
[0010] Furthermore, the solution implemented in this application eliminates the need to terminate instruction processing during the instruction processing phase. Therefore, it eliminates the need to add "in-command switching" support to the hardware module, thus avoiding the development manpower and time costs associated with such support. This effectively reduces hardware module development costs and improves development efficiency. Of course, since "in-command switching" support is no longer required, additional registers are also unnecessary, further reducing the hardware module's area overhead.
[0011] Furthermore, the hardware module supporting virtualization processing also includes a virtualization control register, which is connected to the processing unit and the timer respectively, and is used to receive and temporarily store commands issued by the upper-layer software.
[0012] By using the virtualization control register, both the processing unit and the timer can obtain the termination command from the virtualization control register, so that the upper-layer software does not need to repeatedly issue the termination command. Furthermore, by sending the termination command to both the processing unit and the timer simultaneously through the virtualization control register, the accuracy of the timing duration obtained by the timer can be guaranteed.
[0013] Furthermore, the hardware module supporting virtualization processing also includes: an interrupt handling module, connected to the processing unit and the timer, used to notify the upper-layer software of the first notification information and the timing duration via an interrupt.
[0014] In the above implementation process, by setting up an interrupt handling module, the first notification information and the timing duration are notified to the upper-layer software via an interrupt. This reduces the waiting time of the processing unit in the direct program control method and improves the parallel operation of the system.
[0015] This application embodiment also provides a virtualization time-sharing control method, applied in a processing module equipped with upper-layer software, wherein the processing module is connected to any of the aforementioned hardware modules via an interface; the virtualization time-sharing control method includes: after the current tenant's time slice usage time reaches zero, issuing a termination command to the hardware module; upon receiving the timing duration returned by the hardware module, subtracting the timing duration from the current tenant's next round time slice to obtain the current tenant's latest next round time slice, so that when the current tenant uses the hardware module in the next round, use of the hardware module is controlled according to the latest next round time slice.
[0016] In the above implementation, since the timer records the duration from the receipt of the termination command to the processing unit completing the processing of the currently pending instruction, it essentially represents the over-utilization time of the current tenant. Therefore, in this embodiment, the upper-layer software can ensure the fairness of hardware module usage among all tenants by subtracting this duration from the current tenant's next time slice. Furthermore, the solution in this embodiment eliminates the need to end instruction processing during the instruction processing phase, thus eliminating the need to add "in-command switching" support to the hardware module. This also eliminates the manpower and time wasted in developing "in-command switching" support, effectively reducing hardware module development costs and improving efficiency. Of course, since "in-command switching" support is no longer needed, additional registers are not required in the hardware module, further reducing its area overhead.
[0017] Furthermore, the method also includes: upon receiving the first notification information returned by the hardware module, issuing a pending instruction for the next tenant to the hardware module.
[0018] In the above implementation process, when the first notification information returned by the hardware module is received, it indicates that during the current round of use of the hardware module, after the processing unit receives the termination command, there is no longer any instruction of the current tenant being processed in the processing unit. At this time, the switching of the tenant can be achieved by sending the next tenant's pending instruction to the hardware module.
[0019] Furthermore, the method also includes: if the latest next-round time slice of the current tenant is negative after subtracting the timing duration from the next-round time slice of the current tenant, then stop the current tenant's use of the hardware module in the next round, and add the latest next-round time slice of the current tenant (a negative value) to the current tenant's time slice for the round after next.
[0020] In the above implementation process, when the value obtained by subtracting the timing duration from the current tenant's next round time slice is negative, it indicates that the current tenant's overused time has exceeded the maximum time that it can use the hardware module in one cycle (i.e., one round) (i.e., the initial value of the time slice). Therefore, in the next cycle, the current tenant is skipped directly, that is, the current tenant's use of the hardware module in the next round is stopped, and the excess time is deducted from the time slice of the next round after that (i.e., the latest next round time slice), thereby effectively ensuring the fairness of each tenant's use of the hardware module.
[0021] This application embodiment also provides a virtualized time-sharing control device, applied in a processing module, wherein the processing module is connected to any of the aforementioned hardware modules via an interface; the virtualized time-sharing control device includes: a control module, configured to issue a termination command to the hardware module after the current tenant's time slice usage time reaches zero; and an update module, configured to, upon receiving the timing duration returned by the hardware module, subtract the timing duration from the current tenant's next round time slice to obtain the current tenant's latest next round time slice, so that when the current tenant uses the hardware module in the next round, the control module controls the use of the hardware module according to the latest next round time slice.
[0022] This application also provides a graphics processor having any of the aforementioned hardware modules that support virtualization processing.
[0023] This application embodiment also provides a processing module, including a processor and a memory; the processor is used to connect to any of the aforementioned hardware modules that support virtualization processing, and to execute one or more programs stored in the memory to realize the functions of the upper-layer software and execute any of the aforementioned virtualization time-sharing control methods.
[0024] This application also provides an electronic device, including any of the aforementioned hardware modules supporting virtualization processing, and the aforementioned processing modules.
[0025] This application also provides a computer-readable storage medium storing one or more programs that can be executed by one or more processors to implement any of the above-described virtualization time-sharing control methods.
Attached Figure Description
[0026] To more clearly illustrate the technical solutions of the embodiments of this application, the accompanying drawings used in the embodiments of this application will be briefly introduced below. It should be understood that the following drawings only show some embodiments of this application and should not be regarded as a limitation of the scope. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.
[0027] Figure 1 is a schematic diagram of the basic structure of a hardware module supporting virtualization processing provided in an embodiment of this application;
[0028] Figure 2 is a schematic diagram of the structure of a specific hardware module supporting virtualization processing provided in an embodiment of this application;
[0029] Figure 3 is a schematic diagram of a more specific hardware module supporting virtualization processing provided in an embodiment of this application;
[0030] Figure 4 is a flowchart of a virtualization time-sharing control method applied in a processing module, as provided in an embodiment of this application;
[0031] Figure 5 is a schematic diagram of the structure of a specific hardware module supporting virtualization processing provided in an embodiment of this application;
[0032] Figure 6 is a schematic diagram of a virtualized time-sharing control device provided in an embodiment of this application.
Detailed Implementation
[0033] The technical solutions in the embodiments of this application will now be described with reference to the accompanying drawings.
[0034] Embodiment 1:
[0035] Adding "in-command switching" support to hardware modules significantly increases hardware development costs, reduces development efficiency, and increases hardware area overhead. To address these issues, this application provides a hardware module that supports virtualization processing. As shown in Figure 1, which is a schematic diagram of the basic structure of a hardware module supporting virtualization processing provided in an embodiment of this application, the hardware module includes a processing unit and a timer.
[0036] The processing unit is used to receive and process the pending instructions of the current tenant and, when it receives a termination command from the upper-layer software, to finish processing the pending instructions and then send the upper-layer software a first notification message indicating that the tenant can be switched.
[0037] A timer, connected to the processing unit, starts timing upon receiving a termination command and feeds back the timing duration to the upper-layer software upon receiving a second notification message from the processing unit, so that the upper-layer software can subtract the timing duration from the current tenant's next time slice. The second notification message is generated by the processing unit after it has finished processing the currently pending instructions. In one optional embodiment, the second notification message can be the same as the first notification message, but this is not a limitation.
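The interaction between the processing unit and the timer described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `HardwareModuleModel`, the 1 ms tick granularity, and the string `"TENANT_SWITCH_OK"` standing in for the first notification are all hypothetical and do not appear in the application; the real behavior is implemented in hardware.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class HardwareModuleModel:
    """Toy model of the processing unit plus timer (assumption: 1 tick = 1 ms)."""

    remaining_ms: int = 0         # work left on the instruction being processed
    terminate_seen: bool = False  # True once the termination command has arrived
    timer_ms: int = 0             # time counted since the termination command

    def issue_instruction(self, cost_ms: int) -> None:
        """Upper-layer software hands a pending instruction to the processing unit."""
        self.remaining_ms = cost_ms

    def send_terminate(self) -> None:
        """Termination command: the timer starts, the instruction is NOT aborted."""
        self.terminate_seen = True
        self.timer_ms = 0

    def tick(self) -> Optional[Tuple[str, int]]:
        """Advance 1 ms; return (first notification, timing duration) when done."""
        if self.remaining_ms > 0:
            self.remaining_ms -= 1
            if self.terminate_seen:
                self.timer_ms += 1
        if self.terminate_seen and self.remaining_ms == 0:
            # The instruction has finished: this plays the role of the second
            # notification (stop the timer) and of the first notification.
            self.terminate_seen = False
            return ("TENANT_SWITCH_OK", self.timer_ms)
        return None


def run_until_switch(hw: HardwareModuleModel) -> Tuple[str, int]:
    """Tick the model until it reports that the tenant can be switched."""
    while True:
        result = hw.tick()
        if result is not None:
            return result
```

For instance, if an instruction needing 140 ms is issued, 100 ticks elapse, and the termination command is then sent, `run_until_switch` reports a timing duration of 40 ms, which is the overuse the upper-layer software would deduct.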
[0038] It is understood that in the embodiments of this application, the processing unit may be, for example, a CPU (Central Processing Unit), a GPU SP (Streaming Processor), or a GPU SM (Streaming Multiprocessor), but this is not a limitation.
[0039] It is also understood that, in the embodiments of this application, firmware, such as drivers, may be installed in the hardware module so that the processing unit can operate normally based on the installed firmware.
[0040] In the embodiments of this application, as shown in Figure 2, the hardware module may also include a virtualization control register, which is connected to the processing unit and the timer respectively, and is used to receive and temporarily store commands issued by the upper-layer software.
[0041] It is understood that the virtualization control register can be implemented using a conventional instruction register, but this is not a limitation. It is also understood that the virtualization control register can be a single register or composed of multiple registers, and this is not a limitation in the embodiments of this application. When the virtualization control register is composed of multiple registers, it can be used to receive and store commands issued by upper-layer software, such as termination commands, save commands, restore commands, and start execution commands.
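As a rough illustration of the command set listed above, the following sketch models the virtualization control register as a latch that the processing unit and the timer both read. The names `VirtCommand` and `VirtControlRegister` are hypothetical, and a single latched value is an assumption made for brevity; the application allows the register to consist of multiple registers.

```python
from enum import Enum, auto
from typing import Optional


class VirtCommand(Enum):
    """Commands named in the text; the numeric encodings are arbitrary."""

    TERMINATE = auto()
    SAVE = auto()
    RESTORE = auto()
    START_EXECUTION = auto()


class VirtControlRegister:
    """Latches the most recent command so several readers see the same value."""

    def __init__(self) -> None:
        self._latched: Optional[VirtCommand] = None

    def write(self, command: VirtCommand) -> None:
        # Written once by the upper-layer software (e.g. over the interface).
        self._latched = command

    def read(self) -> Optional[VirtCommand]:
        # Reading does not clear the latch, so the processing unit and the
        # timer can each observe the same termination command.
        return self._latched
```

Because a read leaves the latch unchanged, the upper-layer software writes the termination command once and both consumers observe it, which mirrors the benefit the application attributes to the register.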
[0042] It is also understood that, in this embodiment, the hardware module may further include a communication interface for communicating with upper-layer software, such as a PCIe (Peripheral Component Interconnect Express) interface, but this is not a limitation. The virtualization control register is connected to this communication interface to receive and store commands issued by the upper-layer software. Furthermore, the upper-layer software can also use this communication interface to issue pending instructions from the current tenant to the processing unit of the hardware module for processing.
[0043] It is also understood that, in this embodiment, although the processing unit finishes processing the currently pending instructions after receiving the termination command from the upper-layer software, and then sends back the first notification information indicating that tenant switching is possible, there is no need to add support for "in-command switching" in the hardware module. That is, there is no need to save all the context information of the currently pending instructions. However, in order to ensure that the instructions can be executed correctly after switching back to the current tenant, some necessary context information (such as the instruction pointer indicating which instruction has been executed) still needs to be saved. Therefore, in this embodiment, the hardware module may also include a memory bus to save the context information to memory, and the context information can also be restored through the memory bus.
[0044] In the embodiments of this application, as shown in Figure 3, the hardware module may also include an interrupt handling module, connected to the processing unit and the timer, for notifying the upper-layer software of the first notification information and the timing duration via an interrupt. It can be understood that the interrupt handling module can also be used to process the data format of the first notification information and the timing duration, thereby enabling the upper-layer software to accurately identify the first notification information and the timing duration.
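To make the idea of a data format that the upper-layer software can identify more concrete, the sketch below packs the first notification and the timing duration into a fixed-size payload. The layout (two little-endian unsigned 32-bit fields, milliseconds as the unit) and the function names are purely hypothetical; the application does not specify any format.

```python
import struct
from typing import Tuple

# Hypothetical layout, not taken from the application:
# field 0 = switch-ready flag, field 1 = timing duration in milliseconds.
_PAYLOAD = struct.Struct("<II")


def encode_interrupt_payload(switch_ready: bool, timing_duration_ms: int) -> bytes:
    """Pack the first notification flag and the timing duration into 8 bytes."""
    return _PAYLOAD.pack(1 if switch_ready else 0, timing_duration_ms)


def decode_interrupt_payload(payload: bytes) -> Tuple[bool, int]:
    """Recover (switch_ready, timing_duration_ms) on the upper-layer software side."""
    flag, timing_duration_ms = _PAYLOAD.unpack(payload)
    return bool(flag), timing_duration_ms
```

A fixed layout of this kind lets the receiving side read both values from one interrupt payload without any negotiation, at the cost of fixing the field widths in advance.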
[0045] It is understood that, in the embodiments of this application, the interrupt handling module can be implemented by a format conversion circuit, but this is not a limitation.
[0046] In this embodiment, the hardware module may further include an interface control module, thereby enabling the processing unit, timer, interrupt handling module, and memory bus to be connected via the interface control module. The interface control module may internally be configured with a signal gating circuit to output different data to the corresponding modules. For example, the signal gating circuit may output second notification information to the timer, first notification information and timing duration to the interrupt handling module, and context information to the memory bus. It can also be understood that the interface control module may internally be configured with a protocol conversion circuit to convert requests into the protocol required by the output module.
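The routing performed by the signal gating circuit in the example above can be pictured as a lookup table. This is a minimal software analogy, assuming hypothetical string names for the signals and modules; the actual circuit is hardware and is not described at this level in the application.

```python
# Destination of each kind of output, following the example in the text.
ROUTES = {
    "second_notification": "timer",
    "first_notification": "interrupt_handling_module",
    "timing_duration": "interrupt_handling_module",
    "context_information": "memory_bus",
}


def route(signal: str) -> str:
    """Return the module that should receive the given kind of signal."""
    try:
        return ROUTES[signal]
    except KeyError:
        raise ValueError(f"unknown signal: {signal!r}") from None
```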
[0047] It can also be understood that, in the embodiments of this application, the upper-layer software is carried in a processing module with program execution capability, such as a CPU, a microcontroller, or an MCU (Microcontroller Unit). Therefore, in terms of circuit entity, the processing module carrying the upper-layer software is connected to the aforementioned hardware module. By running the upper-layer software, the relevant instructions to be processed and commands are issued to complete the switching between tenants.
[0048] It can also be understood that the so-called switching between tenants refers to the process of switching the pending instructions that are input into the processing unit of the hardware module from pending instructions belonging to one tenant to pending instructions belonging to another tenant.
[0049] Based on the above embodiments, this application also provides a virtualization time-sharing control method applied to a processing module equipped with upper-layer software, wherein the processing module is connected to the hardware module described above. As shown in Figure 4, which is a flowchart of a virtualized time-sharing control method provided in an embodiment of this application, the method includes:
[0050] S401: After the current tenant's time slice usage time reaches zero, a termination command is sent to the hardware module.
[0051] It is understood that in the embodiments of this application, each tenant has a time slice, the initial value of each tenant's time slice is the same, and the initial value of the time slice is also the same in each round of use of the hardware module.
[0052] For example, suppose there are five tenants, A, B, C, D, and E. Each of these five tenants has a time slice, denoted as A1, B1, C1, D1, and E1 respectively. The initial values of A1, B1, C1, D1, and E1 are all equal. If the initial value of A1, B1, C1, D1, and E1 is 100ms, then in each round of hardware module usage, the initial value of A1, B1, C1, D1, and E1 will always be 100ms.
[0053] It is understood that each round described in the embodiments of this application can also be referred to as each cycle, and one cycle is the time during which all tenants use the hardware module once. For example, still using the previous example, suppose there are five tenants A, B, C, D, and E. After the five tenants A, B, C, D, and E use the hardware module once in sequence, one round of time-sharing multiplexing of the hardware module is completed. In the next round, the five tenants A, B, C, D, and E will continue to use the hardware module once in sequence.
[0054] It can also be understood that the so-called current tenant refers to the tenant to which the instruction to be processed by the current processing unit belongs, that is, the tenant currently using the hardware module.
[0055] It is also understood that, in the embodiments of this application, the initial value of the time slice can be set by the engineer according to actual needs, such as 100ms, but this is not a limitation.
[0056] S402: Upon receiving the timing duration returned by the hardware module, subtract the timing duration from the current tenant's next round time slice to obtain the current tenant's latest next round time slice, so that when the current tenant uses the hardware module in the next round, the hardware module is controlled according to the latest next round time slice.
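Step S402 amounts to a single subtraction on the software side. The following sketch assumes milliseconds as the unit and uses a hypothetical function name; it deliberately allows a negative result, since the handling of that case is a separate optional step.

```python
def latest_next_round_slice(next_round_slice_ms: int, timing_duration_ms: int) -> int:
    """Subtract the reported timing duration from the tenant's next-round slice.

    The result may be negative when the overuse exceeds one time slice.
    """
    if timing_duration_ms < 0:
        raise ValueError("timing duration cannot be negative")
    return next_round_slice_ms - timing_duration_ms
```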
[0057] It can also be understood that when the processing module receives the first notification information returned by the hardware module, it can also issue a pending instruction for the next tenant to the hardware module to realize the tenant switch. At this time, the next tenant becomes the new current tenant, and the time slice of the next tenant starts counting down. When the usage time of the next tenant's time slice reaches zero, the process shown in Figure 4 is repeated. In this way, all tenants time-share the hardware module.
[0058] For example, continuing with the previous example, assume the initial values of A1, B1, C1, D1, and E1 are 100ms, and this is the first round. During this round of hardware module usage, tenant A has already used 100ms, meaning time slice A1's usage time has reached zero. At this point, the upper-layer software sends a termination command to the hardware module and waits for its response. Referring to the previous introduction of the hardware module, assuming the hardware module's processing unit is processing a pending instruction from tenant A when it receives the termination command, the processing unit continues processing the pending instruction, and the hardware module's timer also receives the termination command and starts timing. After the processing unit finishes processing the pending instruction, it generates a first notification message for the upper-layer software and a second notification message for the timer. Upon receiving the second notification message, the timer stops timing and sends the timing duration back to the upper-layer software.
[0059] Assuming the timing duration is 40ms, after receiving this timing duration, the upper-layer software updates the value of time slice A1 corresponding to tenant A in the next round to (100ms-40ms) = 60ms. Therefore, when processing tenant A's pending instructions in the second round, after the upper-layer software starts tenant A's time slice, this time slice is only 60ms. When the 60ms usage time reaches zero, a termination command is sent to the hardware module, and the software waits for the hardware module's response.
[0060] It is also understandable that, in actual applications, the initial value of the time slice may be set too small, so that the timing duration exceeds the initial value of the time slice. In this case, one time slice is not enough to offset the timing duration. Therefore, in order to ensure fairness among tenants, in an optional implementation of this application, if the latest next-round time slice of the current tenant is negative after subtracting the timing duration from the next-round time slice of the current tenant, then the current tenant's use of the hardware module in the next round is stopped, and the latest next-round time slice of the current tenant (a negative value) is added to the current tenant's time slice for the round after next.
[0061] For example, continuing with the previous example, assume the initial values of A1, B1, C1, D1, and E1 are 100ms. In the first round, if the timing duration fed back by the hardware module for tenant A is 140ms, the upper-layer software, upon receiving this timing duration, updates the value of time slice A1 for tenant A in the next round (i.e., the second round) to (100ms - 140ms) = -40ms. Since -40ms is negative, the value of time slice A1 for tenant A in the round after next (i.e., the third round) is updated to (100ms + (-40ms)) = 60ms. Then, in the second round of time-sharing multiplexing of the hardware module, the upper-layer software does not process tenant A's pending instructions; it skips tenant A, issues the pending instructions of tenant B, and starts tenant B's time slice. In the third round, when it is tenant A's turn, tenant A's pending instructions are sent to the hardware module and time slice A1 is started. At this time, time slice A1 is only 60ms. When the 60ms usage time reaches zero, a termination command is sent to the hardware module and the upper-layer software waits for the hardware module's response.
[0062] In the example above, the timing duration (i.e., the overuse time) exceeded the initial value of one time slice but not the cumulative value of two, so tenant A was skipped for only one round. However, in actual applications, the timing duration may exceed the cumulative value of multiple initial time slices. For example, assuming the initial time slice value is 10ms and the timing duration is 55ms, the timing duration exceeds the cumulative value of five initial time slices. In this case, the tenant will be skipped for the next five rounds, and can only use the hardware module in the sixth round, with a time slice of 5ms. That is, when the timing duration exceeds the cumulative value of multiple initial time slices, a portion of it is deducted from the time slice in each round, and the tenant is skipped in every round whose time slice is still negative after the deduction. Once the time slice in a round is positive after deducting the remaining overuse, the tenant does not need to be skipped in that round.
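The round-by-round bookkeeping in this example can be traced with a short simulation. It is a sketch only: the function name `run_rounds`, the dictionary of reported timing durations keyed by `(round, tenant)`, and the choice to skip only strictly negative slices are assumptions made for illustration, and the tenants are served in a fixed order.

```python
from typing import Dict, List, Sequence, Tuple


def run_rounds(
    tenants: Sequence[str],
    initial_ms: int,
    timing_ms: Dict[Tuple[int, str], int],
    rounds: int,
) -> List[List[Tuple[str, int]]]:
    """Return, for each round, the (tenant, granted slice in ms) pairs served.

    timing_ms maps (round number starting at 1, tenant) to the timing duration
    reported by the hardware module; missing entries mean no overuse.
    """
    next_slice = {tenant: initial_ms for tenant in tenants}
    history: List[List[Tuple[str, int]]] = []
    for round_no in range(1, rounds + 1):
        served: List[Tuple[str, int]] = []
        for tenant in tenants:
            granted = next_slice[tenant]
            if granted < 0:
                # Skip the tenant this round and carry the negative value
                # forward into the following round's time slice.
                next_slice[tenant] = initial_ms + granted
                continue
            served.append((tenant, granted))
            overuse = timing_ms.get((round_no, tenant), 0)
            next_slice[tenant] = initial_ms - overuse
        history.append(served)
    return history
```

With tenants A to E, an initial slice of 100 ms, and a 140 ms timing duration reported for tenant A in round 1, the second round serves only B, C, D, and E, and the third round starts with tenant A on a 60 ms slice, matching the example.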
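The multi-round deduction can be checked numerically with a small helper. The function name is hypothetical, and the sketch skips a round only while the slice is strictly negative, which is one reading of the text; the case of a slice that lands exactly on zero is not spelled out in the application.

```python
from typing import Tuple


def rounds_until_usable(initial_ms: int, timing_duration_ms: int) -> Tuple[int, int]:
    """Return (rounds skipped, time slice in ms in the first usable round)."""
    if initial_ms <= 0:
        raise ValueError("initial time slice must be positive")
    skipped = 0
    slice_ms = initial_ms - timing_duration_ms
    while slice_ms < 0:
        # Still in debt: skip this round and apply another round's allowance.
        skipped += 1
        slice_ms += initial_ms
    return skipped, slice_ms
```

Calling it with `(100, 40)`, `(100, 140)`, and `(10, 55)` gives `(0, 60)`, `(1, 60)`, and `(5, 5)` respectively, which agrees with the three cases worked through in the text.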
[0063] Based on the hardware module supporting virtualization processing and the virtualization time-sharing control method provided in this application embodiment, a timer is added to the hardware module supporting virtualization processing. This timer is connected to a processing unit within the hardware module that processes pending instructions for the current tenant and commands issued by the upper-layer software. The processing unit is configured so that, upon receiving a termination command from the upper-layer software, it does not immediately stop processing a pending instruction that has not yet been completed. Instead, it continues processing the instruction until it is finished, then stops service to the current tenant and sends the upper-layer software a first notification indicating that tenant switching is possible. Correspondingly, the timer is configured to start timing when a termination command is received from the upper-layer software and to send the timing duration to the upper-layer software upon receiving a second notification after the processing unit has completed processing the pending instruction, so that the upper-layer software subtracts this timing duration from the current tenant's next time slice. This ensures that tenant instructions are processed normally and prevents data loss due to forced termination of instruction processing during tenant switching, thus avoiding subsequent instruction processing errors caused by data loss. Meanwhile, since the timer records the duration from the receipt of the termination command to the processing unit completing the processing of the currently pending instruction, it essentially represents the over-utilization time of the current tenant. Therefore, in this embodiment, the upper-layer software can ensure the fairness of hardware module usage among all tenants by subtracting this duration from the current tenant's next time slice.
Furthermore, the solution in this embodiment eliminates the need to end instruction processing during the instruction's execution, thus eliminating the need to add "in-command switching" support to the hardware module. This also eliminates the manpower and time wasted on developing "in-command switching" support, effectively reducing hardware module development costs and improving efficiency. Of course, since "in-command switching" support is no longer needed, additional registers are also unnecessary, further reducing hardware module area overhead.
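As an illustration of the timer behavior summarized above, the following minimal sketch models the hardware side with explicit timestamps. The class and method names (`OvertimeTimer`, `on_termination_command`, `on_second_notification`) are hypothetical, and returning 0 when no termination command is pending is an assumption, since the text does not cover that case.

```python
from typing import Optional


class OvertimeTimer:
    """Software model of the timer; timestamps are in milliseconds."""

    def __init__(self) -> None:
        self._started_at_ms: Optional[int] = None

    def on_termination_command(self, now_ms: int) -> None:
        # Timing starts when the termination command is received.
        self._started_at_ms = now_ms

    def on_second_notification(self, now_ms: int) -> int:
        # The processing unit has finished the pending instruction:
        # stop timing and report the timing duration.
        if self._started_at_ms is None:
            return 0  # assumption: nothing to report without a termination command
        duration_ms = now_ms - self._started_at_ms
        self._started_at_ms = None
        return duration_ms


timer = OvertimeTimer()
timer.on_termination_command(now_ms=100)               # 100ms time slice used up
overrun_ms = timer.on_second_notification(now_ms=140)  # instruction finishes
next_slice_ms = 100 - overrun_ms                       # upper-layer software's update
print(overrun_ms, next_slice_ms)                       # 40 60
```

Passing timestamps in, rather than reading a clock, keeps the sketch deterministic; a real timer would count hardware clock cycles.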
[0064] Embodiment 2:
[0065] Based on Embodiment 1, this embodiment uses a specific hardware module structure as an example to further illustrate this application.
[0066] As shown in Figure 5, the hardware module includes a PCIe interface, a virtualization control register, a CPU, firmware, a timer, an interface control module, an interrupt handling module, and a memory bus.
[0067] The PCIe interface is used to communicate with upper-layer software (i.e., the processing module), and the virtualization control register is connected to the PCIe interface. Through register writes, the PCIe interface places commands issued by the upper-layer software into the virtualization control register, and it also transmits the current tenant's pending instructions issued by the upper-layer software to the CPU.
[0068] Firmware is essentially a program, such as a driver, that the upper-layer software or other circuits can burn into the CPU through the PCIe interface so that the CPU functions properly. Firmware can be burned directly into the CPU without passing through the virtualization control register, so that the large program image does not crowd the register's space.
[0069] In this embodiment, as shown in Figure 5, the CPU and the timer are connected to the output of the virtualization control register. The CPU can retrieve commands by reading the register. The CPU is also connected to the PCIe interface to receive instructions from the upper-layer software and process them according to the commands read from the virtualization control register. The timer can likewise determine whether a termination command has been received by reading the register.
[0070] It is understandable that, in order to ensure that the timer will only read the termination command, the virtualization control register can include multiple registers, one of which is dedicated to storing the termination command. Thus, by simply connecting this register to the timer, it can be ensured that the timer will only read the termination command.
[0071] In this embodiment, the CPU can issue interface control commands to the interface control module to control the interface control module to transmit different information to different objects. For example, by issuing interface control commands, the interface control module can output second notification information to the timer, output first notification information and timing duration to the interrupt handling module, and output context information to the memory bus, etc. Furthermore, the interface control module can also convert the data format to the format required by the output module, so that the receiving module can effectively identify and process the received data.
[0072] The interrupt handling module connects to the CPU and timer through the interface control module, and converts the first notification information and the timing duration together as interrupt information into a format that can be recognized by the upper-layer software. Then, it reports the interrupt information to the upper-layer software through the PCIe interface.
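The data path described in the preceding paragraphs can be sketched as a small routing table. This is only an illustration: the message kinds, destination names, and the `InterruptInfo` layout are hypothetical, since the description does not specify any encoding.

```python
from dataclasses import dataclass

# Hypothetical routing performed by the interface control module:
# message kind -> destination inside the hardware module.
ROUTES = {
    "second_notification": "timer",
    "first_notification": "interrupt_handling_module",
    "timing_duration": "interrupt_handling_module",
    "context_information": "memory_bus",
}


@dataclass(frozen=True)
class InterruptInfo:
    """What the interrupt handling module reports over the PCIe interface."""
    tenant_switch_allowed: bool   # carries the first notification information
    timing_duration_ms: int       # carries the timer's reading


def build_interrupt(first_notification: bool, timing_duration_ms: int) -> InterruptInfo:
    # The first notification and the timing duration travel together.
    return InterruptInfo(first_notification, timing_duration_ms)


print(ROUTES["context_information"])   # memory_bus
print(build_interrupt(True, 40))
```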
[0073] The specific implementation of the embodiments of this application will be described below with reference to a specific example:
[0074] Suppose there are three tenants, A, B, and C, who share the hardware module shown in Figure 5 using time-sharing multiplexing. Assume that each tenant's initial time slice is 100ms per round of time-sharing and that the tenants' multiplexing order is A, B, C. Before time-sharing starts, the upper-layer software can send the relevant firmware to the hardware module's CPU via the hardware module's PCIe interface for installation. Afterward, time-sharing of the hardware module can proceed.
[0075] Starting from the first round:
[0076] The upper-layer software sends tenant A's pending instructions and start execution command to the hardware module's PCIe interface. Simultaneously, the upper-layer software begins a countdown of 100ms for tenant A's time slice. The PCIe interface writes the start execution command into the virtualization control register and transmits tenant A's pending instructions to the CPU. The CPU retrieves the start execution command by reading the register and begins processing the pending instructions.
[0077] When tenant A's time slice is used up (i.e., the 100ms countdown reaches zero), the upper-layer software sends a termination command to the hardware module's PCIe interface. The PCIe interface writes the termination command into the virtualization control register. The CPU and the timer obtain the termination command by reading the register, and the timer starts timing. The CPU continues processing the currently pending instruction and, upon completion, generates first notification information and second notification information and outputs the necessary context information. The interface control module transmits the context information to the memory bus for storage and sends the second notification information to the timer. Upon receiving the second notification information, the timer stops timing and outputs the timing duration (assumed to be 40ms). The interface control module sends the first notification information and the timing duration together, as interrupt information, to the interrupt handling module. The interrupt handling module converts the interrupt information into a format recognizable by the upper-layer software and reports it to the upper-layer software via the PCIe interface.
[0078] Upon receiving the interrupt information, the upper-layer software updates tenant A's second-round time slice to 100ms - 40ms = 60ms. Simultaneously, it sends tenant B's pending instructions and the start execution command to the hardware module's PCIe interface. The upper-layer software also begins a countdown of 100ms for tenant B's time slice. The PCIe interface writes the start execution command to the virtualization control register and transmits tenant B's pending instructions to the CPU. The CPU retrieves the start execution command by reading the register and begins processing the pending instructions.
[0079] When tenant B's time slice is used up (i.e., the 100ms countdown reaches zero), the upper-layer software sends a termination command to the hardware module's PCIe interface. The PCIe interface writes the termination command into the virtualization control register. The CPU and the timer obtain the termination command by reading the register, and the timer starts timing. The CPU continues processing the currently pending instruction and, upon completion, generates first notification information and second notification information and outputs the necessary context information. The interface control module transmits the context information to the memory bus for storage and sends the second notification information to the timer. Upon receiving the second notification information, the timer stops timing and outputs the timing duration (assumed to be 140ms). The interface control module sends the first notification information and the timing duration together, as interrupt information, to the interrupt handling module. The interrupt handling module converts the interrupt information into a format recognizable by the upper-layer software and reports it to the upper-layer software via the PCIe interface.
[0080] Upon receiving the interrupt information, the upper-layer software updates tenant B's second-round time slice to 100ms - 140ms = -40ms. Because this value is negative, it also updates tenant B's third-round time slice to 100ms - 40ms = 60ms. The upper-layer software then sends tenant C's pending instructions and the start execution command to the hardware module's PCIe interface and begins a 100ms countdown for tenant C's time slice. The PCIe interface writes the start execution command to the virtualization control register and transmits tenant C's pending instructions to the CPU. The CPU reads the register to obtain the start execution command and begins processing the pending instructions.
[0081] When tenant C's time slice is used up (i.e., the 100ms countdown reaches zero), the upper-layer software sends a termination command to the hardware module's PCIe interface. The PCIe interface writes the termination command into the virtualization control register. The CPU and the timer obtain the termination command by reading the register, and the timer starts timing. The CPU continues processing the currently pending instruction and, upon completion, generates first notification information and second notification information and outputs the necessary context information. The interface control module transmits the context information to the memory bus for storage and sends the second notification information to the timer. Upon receiving the second notification information, the timer stops timing and outputs the timing duration (assumed to be 210ms). The interface control module sends the first notification information and the timing duration together, as interrupt information, to the interrupt handling module. The interrupt handling module converts the interrupt information into a format recognizable by the upper-layer software and reports it to the upper-layer software via the PCIe interface.
[0082] After receiving the interrupt information, the upper-layer software updates tenant C's second-round time slice to 100ms - 210ms = -110ms, its third-round time slice to 100ms - 110ms = -10ms, and its fourth-round time slice to 100ms - 10ms = 90ms. The second round of multiplexing then begins.
[0083] The upper-layer software sends tenant A's pending instructions and start execution command to the hardware module's PCIe interface. Simultaneously, the upper-layer software begins a countdown of 60ms for tenant A's time slice. The PCIe interface writes the start execution command into the virtualization control register and transmits tenant A's pending instructions to the CPU. The CPU retrieves the start execution command by reading the register and begins processing the pending instructions.
[0084] When tenant A's time slice is used up (i.e., the 60ms countdown reaches zero), the upper-layer software sends a termination command to the hardware module's PCIe interface. The PCIe interface writes the termination command into the virtualization control register. The CPU and the timer obtain the termination command by reading the register, and the timer starts timing. The CPU continues processing the currently pending instruction and, upon completion, generates first notification information and second notification information and outputs the necessary context information. The interface control module transmits the context information to the memory bus for storage and sends the second notification information to the timer. Upon receiving the second notification information, the timer stops timing and outputs the timing duration (assumed to be 0ms). The interface control module sends the first notification information and the timing duration together, as interrupt information, to the interrupt handling module. The interrupt handling module converts the interrupt information into a format recognizable by the upper-layer software and reports it to the upper-layer software via the PCIe interface.
[0085] After receiving the interrupt information, the upper-layer software updates tenant A's third-round time slice to 100ms - 0ms = 100ms. Since tenant B's second-round time slice is negative (-40ms), tenant B is skipped, and the software then determines whether tenant C needs to be skipped. Because tenant C's second-round time slice is also negative (-110ms), tenant C is skipped as well, and the third round of multiplexing begins.
[0086] The upper-layer software sends tenant A's pending instructions and start execution command to the hardware module's PCIe interface. Simultaneously, the upper-layer software begins a countdown of 100ms for tenant A's time slice. The PCIe interface writes the start execution command into the virtualization control register and transmits tenant A's pending instructions to the CPU. The CPU retrieves the start execution command by reading the register and begins processing the pending instructions.
[0087] When tenant A's time slice is used up (i.e., the 100ms countdown reaches zero), the upper-layer software sends a termination command to the hardware module's PCIe interface. The PCIe interface writes the termination command into the virtualization control register. The CPU and the timer obtain the termination command by reading the register, and the timer starts timing. The CPU continues processing the currently pending instruction and, upon completion, generates first notification information and second notification information and outputs the necessary context information. The interface control module transmits the context information to the memory bus for storage and sends the second notification information to the timer. Upon receiving the second notification information, the timer stops timing and outputs the timing duration (assumed to be 20ms). The interface control module sends the first notification information and the timing duration together, as interrupt information, to the interrupt handling module. The interrupt handling module converts the interrupt information into a format recognizable by the upper-layer software and reports it to the upper-layer software via the PCIe interface.
[0088] Upon receiving the interrupt information, the upper-layer software updates tenant A's fourth-round time slice to 100ms - 20ms = 80ms. Simultaneously, it sends tenant B's pending instructions and the start execution command to the hardware module's PCIe interface. The upper-layer software also begins a 60ms countdown for tenant B's time slice. The PCIe interface writes the start execution command to the virtualization control register and transmits tenant B's pending instructions to the CPU. The CPU reads the register to obtain the start execution command and begins processing the pending instructions.
[0089] When tenant B's time slice is used up (i.e., the 60ms countdown reaches zero), the upper-layer software sends a termination command to the hardware module's PCIe interface. The PCIe interface writes the termination command into the virtualization control register. The CPU and the timer obtain the termination command by reading the register, and the timer starts timing. The CPU continues processing the currently pending instruction and, upon completion, generates first notification information and second notification information and outputs the necessary context information. The interface control module transmits the context information to the memory bus for storage and sends the second notification information to the timer. Upon receiving the second notification information, the timer stops timing and outputs the timing duration (assumed to be 10ms). The interface control module sends the first notification information and the timing duration together, as interrupt information, to the interrupt handling module. The interrupt handling module converts the interrupt information into a format recognizable by the upper-layer software and reports it to the upper-layer software via the PCIe interface.
[0090] After receiving the interrupt information, the upper-layer software updates tenant B's fourth-round time slice to 100ms - 10ms = 90ms. Meanwhile, since tenant C's third-round time slice is negative (-10ms), tenant C is skipped, and the fourth round of multiplexing begins. The multiplexing process is the same as described above and will not be repeated here. Note that tenant C's time slice is positive in the fourth round (90ms), so tenant C is no longer skipped.
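The three rounds walked through above can be replayed with a short simulation of the upper-layer software's bookkeeping. This is a minimal sketch under stated assumptions: the function `simulate` and its parameters are hypothetical, the timing durations are the assumed values from the example (40ms, 140ms, 210ms, 0ms, 20ms, 10ms), and a tenant is skipped only when its time slice is negative.

```python
def simulate(tenants, initial_ms, overruns_ms, rounds):
    """Replay time-sharing rounds and return (schedule, next_slices).

    overruns_ms maps each tenant to the timing durations reported for its
    successive uses of the hardware module (skipped rounds consume none).
    A schedule entry is (round, tenant, slice_ms), with slice_ms None for a skip.
    """
    next_slice = {t: initial_ms for t in tenants}
    pending = {t: iter(overruns_ms[t]) for t in tenants}
    schedule = []
    for rnd in range(1, rounds + 1):
        for t in tenants:
            current = next_slice[t]
            if current < 0:
                # Skipped: the negative balance is carried into the next round.
                schedule.append((rnd, t, None))
                next_slice[t] = initial_ms + current
            else:
                # Used: the reported timing duration shortens the next slice.
                schedule.append((rnd, t, current))
                next_slice[t] = initial_ms - next(pending[t])
    return schedule, next_slice


schedule, fourth_round = simulate(
    tenants=["A", "B", "C"],
    initial_ms=100,
    overruns_ms={"A": [40, 0, 20], "B": [140, 10], "C": [210]},
    rounds=3,
)
for entry in schedule:
    print(entry)
print(fourth_round)  # {'A': 80, 'B': 90, 'C': 90}
```

The printed schedule shows tenant B skipped in round 2 and tenant C skipped in rounds 2 and 3, and the fourth-round slices agree with the values derived in the text.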
[0091] Embodiment 3:
[0092] Based on the same inventive concept, this application, in addition to the hardware module supporting virtualization processing provided in Embodiment 1, also provides a virtualization time-sharing control device 600. Figure 6 shows a virtualization time-sharing control device that applies the method shown in Figure 4. It should be understood that the specific functions of device 600 are described above; to avoid repetition, detailed descriptions are omitted here. Device 600 includes at least one software function module that can be stored in memory or embedded in the operating system of device 600 in the form of software or firmware. Specifically:
[0093] As shown in Figure 6, device 600 is applied in a processing module, which is connected to the hardware module provided in Embodiment 1 via an interface (e.g., a PCIe interface). Device 600 includes a control module 601 and an update module 602, wherein:
[0094] The control module 601 is used to send a termination command to the hardware module after the current tenant's time slice usage time reaches zero;
[0095] The update module 602 is used to subtract the timing duration from the next round time slice of the current tenant when it receives the timing duration returned by the hardware module, so as to obtain the latest next round time slice of the current tenant, so that when the current tenant uses the hardware module in the next round, the control module controls the use of the hardware module according to the latest next round time slice.
[0096] In one feasible embodiment of this application, the device 600 may further include an instruction issuing module, which is used to issue a pending instruction for the next tenant to the hardware module when it receives the first notification information returned by the hardware module.
[0097] In one feasible embodiment of this application, the update module 602 is further configured to, if the latest next-round time slice of the current tenant is negative after subtracting the timing duration from the next-round time slice of the current tenant, add that (negative) latest next-round time slice to the current tenant's time slice for the round after next.
[0098] The control module 601 is further configured to, if the latest next-round time slice of the current tenant is negative after subtracting the timing duration from the next-round time slice of the current tenant, then stop the current tenant from using the hardware module in the next round.
[0099] It is understood that the above-mentioned software functional modules can be implemented by a single piece of software or by multiple different pieces of software, and no limitation is made in the embodiments of this application.
[0100] It should be understood that, for the sake of brevity, some of the content described in Embodiment 1 will not be repeated in this embodiment.
[0101] Embodiment 4:
[0102] This embodiment provides a graphics processor having the hardware module supporting virtualization processing provided in Embodiment 1.
[0103] It can be understood that a graphics processor can have multiple hardware modules supporting virtualization processing. In this case, the hardware modules can be used in parallel, each time-shared by a different tenant set. Optionally, the tenants can be divided evenly into tenant sets, and each set can then time-share a different hardware module, so that the tenants make more balanced use of the hardware modules.
[0104] For example, suppose there are four hardware modules, labeled hardware module 1, hardware module 2, hardware module 3, and hardware module 4. Assume there are tenants A1, A2, A3, A4, A5, A6, A7, and A8. Tenants A1 and A2 can simultaneously reuse hardware module 1 using time-sharing multiplexing; tenants A3 and A4 can simultaneously reuse hardware module 2 using time-sharing multiplexing; tenants A5 and A6 can simultaneously reuse hardware module 3 using time-sharing multiplexing; and tenants A7 and A8 can reuse hardware module 4 using time-sharing multiplexing. This allows for a more balanced utilization of the hardware modules among the tenants.
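The even division of tenants across hardware modules can be sketched as follows. This is an illustration only: the function name `assign_tenants` is hypothetical, and splitting the tenant list into contiguous, near-equal sets is one possible policy that the description does not prescribe.

```python
def assign_tenants(tenants, modules):
    """Split tenants into contiguous, near-equal sets, one per hardware module."""
    if not modules:
        raise ValueError("at least one hardware module is required")
    base, extra = divmod(len(tenants), len(modules))
    assignment, start = {}, 0
    for index, module in enumerate(modules):
        # The first `extra` modules take one additional tenant each.
        size = base + (1 if index < extra else 0)
        assignment[module] = tenants[start:start + size]
        start += size
    return assignment


tenants = [f"A{i}" for i in range(1, 9)]
modules = [f"hardware module {i}" for i in range(1, 5)]
for module, tenant_set in assign_tenants(tenants, modules).items():
    print(module, tenant_set)
```

With eight tenants and four modules this reproduces the pairing in the example (A1 and A2 on hardware module 1, through A7 and A8 on hardware module 4); each tenant set would then be time-shared on its own module independently.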
[0105] It is also understood that a graphics processor may have other components or circuits besides hardware modules, such as memory, but this is not a limitation.
[0106] It is also understood that an electronic component including the aforementioned graphics processor is also provided in this application embodiment.
[0107] For example, the electronic component may be, but is not limited to, a component equipped with a GPU that can be produced or manufactured independently, such as a GPU board (e.g., a graphics card).
[0108] This application also provides a processing module, which includes a processor and a memory. The processor is used to connect to the hardware module supporting virtualization processing provided in Embodiment 1, or to connect to the graphics processor provided in this embodiment (specifically, to the hardware module supporting virtualization processing in the graphics processor), or to connect to the electronic components provided in this embodiment (specifically, to the hardware module supporting virtualization processing in the electronic components).
[0109] In addition, the processor is also used to execute one or more programs stored in the memory to implement the functions of the upper-layer software in Embodiment 1 and to execute the virtualization time-sharing control method provided in Embodiment 1.
[0110] It is understood that a processor can be a processor core or processor chip, or other circuitry capable of configuring and running programs, but this is not a limitation. Similarly, memory can be RAM (Random Access Memory), ROM (Read-Only Memory), flash memory, etc., but this is not a limitation either.
[0111] In addition, the processing module may include more components. For example, the processing module may also have an internal communication bus for communication between the processor and the memory; or, for example, the processing module may have an external communication interface, such as a PCIe interface, for information transmission with hardware modules, but this is not a limitation.
[0112] This embodiment also provides an electronic device, which includes the hardware module supporting virtualization processing provided in Embodiment 1, or the graphics processor provided in this embodiment, or the electronic components provided in this embodiment, and the processing module provided in this embodiment. The processing module is connected to the hardware module supporting virtualization processing provided in Embodiment 1.
[0113] It is understood that electronic devices can be data processing devices such as servers, laptops, desktop computers, and tablets, and can be equipped with GPUs, but this is not a limitation. The processing module can be a data processing module within the electronic device that is independent of the GPU, such as the CPU. However, the processing module can also be a part of the GPU that has data processing capabilities and can run software, such as a processor core (SM) within the GPU, but this is also not a limitation.
[0114] This embodiment also provides a computer-readable storage medium, such as a floppy disk, optical disk, hard disk, flash memory, USB flash drive, SD (Secure Digital Memory Card), MMC (Multimedia Card), etc. This computer-readable storage medium stores one or more programs that implement the above steps. These one or more programs can be executed by one or more processors to implement the virtualization time-sharing control method executed by the upper-layer software in Embodiment 1 and / or Embodiment 2. Further details will not be elaborated here.
[0115] In the embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. Furthermore, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Additionally, the displayed or discussed mutual couplings, direct couplings, or communication connections may be through some communication interfaces; indirect couplings or communication connections between devices or units may be electrical, mechanical, or other forms.
[0116] Furthermore, the units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0117] Furthermore, the functional modules in the various embodiments of this application can be integrated together to form an independent part, or each module can exist independently, or two or more modules can be integrated to form an independent part.
[0118] In this document, relational terms such as first and second are used only to distinguish one entity or operation from another entity or operation, without necessarily requiring or implying any such actual relationship or order between these entities or operations.
[0119] In this document, "multiple" refers to two or more.
[0120] The above description is merely an embodiment of this application and is not intended to limit the scope of protection of this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of protection of this application.
Claims
1. A hardware module supporting virtualization processing, characterized in that it includes: a processing unit, used to receive and process the pending instructions of the current tenant and, after receiving a termination command from the upper-layer software, to feed back, once the currently pending instruction has been processed, first notification information indicating that tenant switching can be performed, wherein the upper-layer software is software carried within a processing module that has program execution capabilities; and a timer, connected to the processing unit, used to start timing when the termination command is received and to feed back the timing duration to the upper-layer software when second notification information from the processing unit is received, so that the upper-layer software subtracts the timing duration from the next time slice of the current tenant, wherein the second notification information is the notification information generated by the processing unit when it finishes processing the currently pending instruction.
2. The hardware module supporting virtualization processing as described in claim 1, characterized in that it further includes: a virtualization control register, connected to the processing unit and the timer respectively, used to receive and temporarily store commands issued by the upper-layer software.
3. The hardware module supporting virtualization processing as described in claim 1 or 2, characterized in that it further includes: an interrupt handling module, connected to the processing unit and the timer, used to notify the upper-layer software of the first notification information and the timing duration via an interrupt.
4. A virtualization time-sharing control method, characterized in that the method is applied in a processing module equipped with upper-layer software, wherein the processing module is connected to the hardware module as described in any one of claims 1-3 via an interface; the virtualization time-sharing control method includes: after the remaining time of the current tenant's time slice reaches zero, sending a termination command to the hardware module; and upon receiving the timing duration returned by the hardware module, subtracting the timing duration from the current tenant's next-round time slice to obtain the current tenant's latest next-round time slice, so that when the current tenant uses the hardware module in the next round, use of the hardware module is controlled according to the latest next-round time slice.
5. The virtualization time-sharing control method as described in claim 4, characterized in that the method further includes: upon receiving the first notification information returned by the hardware module, sending a pending instruction for the next tenant to the hardware module.
6. The virtualization time-sharing control method as described in claim 4, characterized in that the method further includes: if the latest next-round time slice of the current tenant is negative after the timing duration is subtracted from the current tenant's next-round time slice, stopping the current tenant from using the hardware module in the next round, and adding the latest next-round time slice of the current tenant to the current tenant's time slice for the round after next.
7. A virtualization time-sharing control device, characterized in that the virtualization time-sharing control device is applied in a processing module, which is connected to the hardware module as described in any one of claims 1-3 via an interface; the virtualization time-sharing control device includes: a control module, used to send a termination command to the hardware module after the remaining time of the current tenant's time slice reaches zero; and an update module, used to subtract the timing duration from the current tenant's next-round time slice when it receives the timing duration returned by the hardware module, so as to obtain the current tenant's latest next-round time slice, so that when the current tenant uses the hardware module in the next round, the control module controls the use of the hardware module according to the latest next-round time slice.
8. A graphics processor, characterized in that it has a hardware module supporting virtualization processing as described in any one of claims 1-3.
9. A processing module, characterized in that it includes a processor and a memory; the processor is configured to connect to a hardware module supporting virtualization processing as described in any one of claims 1-3, and to execute one or more programs stored in the memory to implement the functions of the upper-layer software and to execute the virtualization time-sharing control method as described in any one of claims 4-6.
10. An electronic device, characterized in that it includes the hardware module supporting virtualization processing as described in any one of claims 1-3, and the processing module as described in claim 9.
11. A computer-readable storage medium, characterized in that the computer-readable storage medium stores one or more programs, which can be executed by one or more processors to implement the virtualization time-sharing control method as described in any one of claims 4-6.