Verification method and device for computing device, electronic device, and storage medium

By employing request queues and response queues to execute multiple verification threads in parallel during GPU verification, the problem of excessively long simulation times for large-scale and multi-core chips is solved, achieving a more efficient verification process and significantly accelerating the simulation speed of GPU chips.

CN116933698B · Active · Publication Date: 2026-06-12 · SHANGHAI BIREN TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: SHANGHAI BIREN TECH CO LTD
Filing Date: 2022-04-07
Publication Date: 2026-06-12

AI Technical Summary

Technical Problem

Existing GPU verification methods take too long to simulate on large-scale and multi-core chips and lack effective multi-threaded simulation solutions, resulting in low simulation efficiency.

Method used

Multiple verification threads, including test threads, main threads, and computing device threads, are executed in parallel using request queues and response queues. They communicate with the computing unit through an interconnect module, send drive signals using the request queue, and receive response signals through the response queue. A system clock signal is generated in conjunction with a clock generator module to accelerate the verification process.

Benefits of technology

It significantly accelerates the simulation speed of GPU chips, improves verification efficiency, and shortens the development cycle.



Abstract

A verification method and device for a computing device, an electronic device and a storage medium, the computing device comprising an interconnection module and at least one computing unit, the verification method comprising: providing a verification environment for verifying the computing device, wherein the verification environment comprises a request queue, a response queue and a master bus function module; and executing a plurality of verification threads in parallel by using the request queue and the response queue, wherein the plurality of verification threads comprises a test thread, a master thread and a computing device thread, the master thread running the master bus function module to perform interface driving and sampling operations on the computing device, communicating with the at least one computing unit through the interconnection module, the test thread sending a driving signal to the computing device and receiving a response signal from the computing device, and the computing device thread running the computing device to perform operations. The method has the advantages of multi-thread parallel simulation, can accelerate the simulation speed, and significantly accelerates the chip development speed.

Description

Technical Field

[0001] Embodiments of this disclosure relate to a verification method, verification apparatus, electronic device, and computer-readable storage medium for a computing device.

Background Technology

[0002] With the rapid development of the integrated circuit industry, chip complexity has increased significantly. For example, Graphics Processing Units (GPUs) keep growing in scale, which places increasingly high demands on GPU functional verification and requires shorter iteration cycles. Currently, the Universal Verification Methodology (UVM), a verification platform development framework based on a SystemVerilog class library, is widely used in GPU verification. For example, simulation tools from vendors such as Synopsys or Cadence can be used for design verification.

Summary of the Invention

[0003] At least one embodiment of this disclosure provides a verification method for a computing device. The computing device includes an interconnect module and at least one computing unit. The verification method includes: providing a verification environment for verifying the computing device, wherein the verification environment includes a request queue, a response queue, and a main bus function module; and using the request queue and the response queue to execute multiple verification threads in parallel, wherein the multiple verification threads include a test thread, a main thread, and a computing device thread. The main thread runs the main bus function module to perform interface driving and sampling operations for the computing device and communicates with at least one computing unit through the interconnect module. The test thread sends drive signals to the computing device and receives response signals from the computing device. The computing device thread runs the computing device to perform calculations.

[0004] For example, in the verification method provided in at least one embodiment of this disclosure, multiple verification threads are executed in parallel using a request queue and a response queue, including: using the request queue and the response queue to perform data communication between the multiple verification threads.

[0005] For example, in the verification method provided in at least one embodiment of this disclosure, data communication between multiple verification threads is performed using a request queue and a response queue, including: the test thread uses the request queue to send a drive signal to the computing device through the main thread, and uses the response queue to receive a response signal from the computing device through the main thread.

[0006] For example, in a verification method provided in one embodiment of this disclosure, the verification environment further includes a clock generator module, and the verification method further includes: running the clock generator module to generate a system clock signal, and using the system clock signal to execute multiple verification threads in parallel.

[0007] For example, in a verification method provided in one embodiment of this disclosure, the verification environment further includes a system software interface module, and the test thread also runs the system software interface module to communicate with the system software to receive test cases and drivers for verification of the computing device from the system software.

[0008] For example, in a verification method provided in one embodiment of this disclosure, the test thread sends the drive signals corresponding to the test cases and drivers to the main bus function module through the request queue via the system software interface module, and feeds back the status of the computing device obtained by the main bus function module to the system software through the response queue.

[0009] For example, in a verification method provided in one embodiment of this disclosure, the verification environment further includes a slave bus function module and a memory module. The main thread also runs the slave bus function module and the memory module, receives storage access requests from the computing device, and responds to the storage access requests by accessing the memory module through the slave bus function module.

[0010] For example, in a verification method provided in one embodiment of this disclosure, the slave bus function module samples the address channel from the computing device to the memory module and drives the data channel.

[0011] For example, in a verification method provided in one embodiment of this disclosure, the master bus function module and the slave bus function module conform to the Advanced eXtensible Interface (AXI) specification or the Peripheral Component Interconnect Express (PCIe) specification.

[0012] For example, in a verification method provided in one embodiment of this disclosure, the memory module conforms to the High Bandwidth Memory (HBM) specification or the Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM) specification; the test thread also performs initialization operations on the computing device and trigger operations on at least one computing unit.

[0013] For example, in a verification method provided in one embodiment of this disclosure, the computing device further includes a command processor module, and the computing device thread running the computing device includes: running the command processor module to interact with at least one computing unit. The initialization operation includes initialization of the command processor module and initialization of at least one computing unit.

[0014] For example, in a verification method provided in one embodiment of this disclosure, the computing device further includes multiple register modules, and the test thread further configures the register modules to perform initialization operations on the computing device and trigger operations on at least one computing unit.

[0015] For example, in a verification method provided in one embodiment of this disclosure, at least one computing unit includes multiple computing units, and the computing device thread includes multiple computing device sub-threads, which are respectively used to run the multiple computing units.
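As a rough illustration of the sub-thread arrangement in paragraph [0015], the following Python sketch starts one sub-thread per computing unit and collects each unit's result. The function names and the placeholder workload (a sum of squares) are assumptions made for illustration and do not come from the disclosure. Python threads are used only to show the structure; they would not provide the CPU parallelism of the C++-based multi-threaded environment the disclosure refers to.

```python
import threading


def run_computing_units(workloads):
    """Run one sub-thread per computing unit and collect each unit's result."""
    results = [None] * len(workloads)

    def sub_thread(unit_id, workload):
        # Hypothetical stand-in for running one computing unit:
        # here the "computation" is just a sum of squares.
        results[unit_id] = sum(x * x for x in workload)

    threads = [
        threading.Thread(target=sub_thread, args=(unit_id, workload))
        for unit_id, workload in enumerate(workloads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every computing unit has finished
    return results
```

Each sub-thread writes only to its own slot in `results`, so no lock is needed in this sketch.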

[0016] For example, in a verification method provided in one embodiment of this disclosure, the computing device is a single-core graphics processing unit or a multi-core graphics processing unit, and the computing unit is a computing core of the graphics processing unit.

[0017] For example, in a verification method provided in one embodiment of this disclosure, the request queue includes a write address queue and a write data queue. The test thread using the request queue to send a drive signal to the computing device through the main thread includes: the test thread sends an address signal to the write address queue, and the main thread, in response to detecting that the write address queue is not empty, drives the address to the computing device according to the bus timing; the test thread sends a data signal to the write data queue, and the main thread, in response to detecting that the write data queue is not empty, drives the data to the computing device according to the bus timing.

[0018] For example, in a verification method provided in one embodiment of this disclosure, the main bus function module drives the address channel with the address signal, drives the data channel with the data signal, and samples the write response channel for the corresponding write.

[0019] For example, in a verification method provided in one embodiment of this disclosure, the response queue includes a write command response queue, and the write command response queue is used to receive response signals from the computing device through the main thread.
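The write path described in paragraphs [0017] to [0019] can be sketched as follows. This is a minimal Python model under stated assumptions, not the disclosed implementation: `ToyDevice`, `main_thread_poll`, and the `"OKAY"` response value are hypothetical, and real bus timing is reduced to a single polling pass.

```python
import queue


class ToyDevice:
    """Hypothetical device model that pairs addresses with data in arrival order."""

    def __init__(self):
        self.memory = {}
        self._addresses = []
        self._data = []

    def accept_address(self, addr):
        self._addresses.append(addr)

    def accept_data(self, data):
        self._data.append(data)

    def pop_response(self):
        # A write completes once both its address and its data have arrived.
        if self._addresses and self._data:
            self.memory[self._addresses.pop(0)] = self._data.pop(0)
            return "OKAY"
        return None


def main_thread_poll(write_addr_q, write_data_q, write_resp_q, device):
    """One polling pass of the main thread over the write queues."""
    if not write_addr_q.empty():
        device.accept_address(write_addr_q.get_nowait())  # drive address channel
    if not write_data_q.empty():
        device.accept_data(write_data_q.get_nowait())     # drive data channel
    response = device.pop_response()                      # sample write response
    if response is not None:
        write_resp_q.put(response)
```

The test thread would fill `write_addr_q` and `write_data_q` and later read `write_resp_q`; the address and data queues are polled independently, mirroring the two separate non-empty checks in the text.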

[0020] At least one embodiment of this disclosure also provides a verification method for multiple computing devices coupled together. Each computing device includes an interconnect module and at least one computing unit. The verification method for each computing device includes: providing a verification environment for verifying the computing device, wherein the verification environment includes a request queue, a response queue, and a main bus function module; and executing multiple verification threads in parallel using the request queue and the response queue, wherein the multiple verification threads include a test thread, a main thread, and a computing device thread. The main thread runs the main bus function module to perform interface driving and sampling operations for the computing device and communicates with at least one computing unit through the interconnect module. The test thread sends drive signals to the computing device and receives response signals from the computing device. The computing device thread runs the computing device to perform calculations.

[0021] For example, in a verification method provided in one embodiment of this disclosure, the verification environment further includes a system software interface module, and the test thread also runs the system software interface module to communicate with the system software to receive test cases and drivers for verification of each of the multiple computing devices from the system software.

[0022] At least one embodiment of this disclosure also provides a verification apparatus for a computing device. The computing device includes an interconnect module and at least one computing unit. The verification apparatus includes a verification environment unit and a thread execution unit. The verification environment unit is configured to provide a verification environment for verifying the computing device. The verification environment includes a request queue, a response queue, and a main bus function module. The thread execution unit is configured to execute multiple verification threads in parallel using the request queue and the response queue. The multiple verification threads include a test thread, a main thread, and a computing device thread. The main thread runs the main bus function module to perform interface driving and sampling operations for the computing device and communicates with at least one computing unit through the interconnect module. The test thread sends drive signals to the computing device and receives response signals from the computing device. The computing device thread runs the computing device to perform calculations.

[0023] At least one embodiment of this disclosure also provides an electronic device, including: a processor; and a memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the processor, implement the verification method provided in at least one embodiment of this disclosure.

[0024] At least one embodiment of this disclosure provides a computer-readable storage medium for non-transitory storage of computer-executable instructions, wherein the computer-executable instructions, when executed by a processor, implement the verification method provided in at least one embodiment of this disclosure.

Description of the Drawings

[0025] To more clearly illustrate the technical solutions of the embodiments of this disclosure, the accompanying drawings of the embodiments will be briefly described below. Obviously, the drawings described below only relate to some embodiments of this disclosure and are not intended to limit this disclosure.

[0026] Figure 1A shows a schematic diagram of the structure of a graphics processing unit;

[0027] Figure 1B shows a schematic flowchart of a verification method for a computing device provided in at least one embodiment of the present disclosure;

[0028] Figure 2 shows a schematic diagram of the structure of a verification environment for a computing device according to at least one embodiment of the present disclosure;

[0029] Figure 3 shows a flowchart of executing multiple verification threads in parallel for a computing device according to at least one embodiment of the present disclosure;

[0030] Figure 4 shows a schematic diagram of an example of a verification environment and verification method for a system comprising multiple computing devices, according to at least one embodiment of the present disclosure;

[0031] Figure 5 shows a flowchart of executing multiple verification threads in parallel across multiple computing devices;

[0032] Figure 6 shows a schematic block diagram of a verification apparatus for a computing device provided in at least one embodiment of the present disclosure;

[0033] Figure 7 shows a schematic block diagram of an electronic device provided in some embodiments of this disclosure;

[0034] Figure 8 shows a schematic block diagram of another electronic device provided in some embodiments of this disclosure;

[0035] Figure 9 shows a schematic diagram of a storage medium provided in some embodiments of this disclosure.

Detailed Implementation

[0036] To make the objectives, technical solutions, and advantages of the embodiments of this disclosure clearer, the technical solutions of the embodiments of this disclosure will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this disclosure. All other embodiments obtained by those skilled in the art based on the described embodiments of this disclosure without creative effort are within the scope of protection of this disclosure.

[0037] Unless otherwise defined, the technical or scientific terms used in this disclosure shall have the ordinary meaning understood by one of ordinary skill in the art to which this disclosure pertains. The terms “first,” “second,” and similar terms used in this disclosure do not indicate any order, quantity, or importance, but are merely used to distinguish different components. Similarly, the terms “an,” “a,” or “the,” and similar terms do not indicate a quantity limitation, but rather indicate the presence of at least one. The terms “including,” “comprising,” or “containing,” and similar terms mean that the element or object preceding the word encompasses the elements or objects listed following the word and their equivalents, without excluding other elements or objects. The terms “connected,” “linked,” or similar terms are not limited to physical or mechanical connections, but can include electrical connections, whether direct or indirect. The terms “upper,” “lower,” “left,” and “right,” etc., are used only to indicate relative positional relationships, and these relative positional relationships may change accordingly when the absolute position of the described objects changes.

[0038] For large-scale chips, such as ultra-large-scale GPUs and multi-core GPUs with on-chip, inter-chip, and inter-board interconnects, simulation using tools such as Synopsys VCS or Cadence simulators results in excessive simulation time and memory consumption, often requiring a week, a month, or even longer to complete the simulation of a single test case. Current simulation tools lack effective multi-threaded simulation solutions, and their acceleration effects are unsatisfactory.

[0039] At least one embodiment of this disclosure provides a verification method for a computing device, the computing device including an interconnect module and at least one computing unit. The verification method includes: providing a verification environment for verifying the computing device, wherein the verification environment includes a request queue, a response queue, and a main bus function module; and executing multiple verification threads in parallel using the request queue and the response queue, wherein the multiple verification threads include a test thread, a main thread, and a computing device thread, the main thread running the main bus function module to perform interface driving and sampling operations for the computing device, and communicating with at least one computing unit through the interconnect module, the test thread using the request queue to send drive signals to the computing device through the main thread, and using the response queue to receive response signals from the computing device through the main thread, and the computing device thread running the computing device to perform calculations.

[0040] The verification method for computing devices provided in the above embodiments of this disclosure has the advantage of multi-threaded parallel simulation, which can speed up simulation and significantly accelerate chip development. For example, it has obvious advantages in simulating large-scale multi-core GPU chips.

[0041] At least one embodiment of this disclosure also provides a verification apparatus, an electronic device, and a non-transitory computer-readable storage medium for implementing the verification method described above.

[0042] The embodiments of this disclosure will now be described in detail with reference to the accompanying drawings, but this disclosure is not limited to these specific embodiments.

[0043] Figure 1A shows a schematic diagram of an exemplary graphics processing unit 100. As shown in Figure 1A, the graphics processing unit 100 includes a command processor module 111, one or more computing units 130, etc. For example, the graphics processing unit 100 may also include a memory module (not shown), such as a high-bandwidth memory (HBM). For example, the memory module of the graphics processing unit 100 can be accessed by a device outside the graphics processing unit 100 (e.g., a CPU).

[0044] In Figure 1A, the graphics processing unit 100 is illustrated with two computing units 130 as examples, and other possible computing units are omitted. Each computing unit 130 includes a thread bundle scheduling/distribution module, multiple computing cores, a register file, a shared L1 cache, etc. The register file includes multiple registers, such as general-purpose registers or vector registers. To schedule threads performing computing tasks among the multiple computing units 130, the graphics processing unit 100 may further include a thread block scheduling module 121. The multiple computing units 130 are coupled to each other, for example, via on-chip interconnects (not shown).

[0045] The graphics processing unit 100 can be used for computational tasks such as matrix calculations and image rendering, which can be executed in parallel by multiple threads. For example, before execution, these threads are divided into multiple thread blocks in the thread block scheduling module 121, and these thread blocks are then distributed to the computing units. All threads in a thread block are typically assigned to the same computing unit for execution. In addition, each thread block can be split into thread bundles, each containing a fixed number of threads (or fewer), such as 32 threads. Multiple thread blocks can be executed in the same computing unit or in different computing units.

[0046] Within each computing unit, the thread bundle scheduling / distribution module schedules and allocates thread bundles so that multiple computing cores of that computing unit 130 can run their corresponding thread bundles. Each computing core includes an arithmetic logic unit (ALU), a floating-point unit, etc. Depending on the number of computing cores in the computing unit, multiple thread bundles within a thread block can be executed simultaneously or in a time-sharing manner. Multiple threads within each thread bundle will execute the same instructions.
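The splitting of a thread block into thread bundles can be made concrete with a short Python sketch. The function name `split_into_bundles` is an assumption for illustration; the bundle size of 32 is the example value given in the text, and the last bundle may hold fewer threads than that fixed number.

```python
def split_into_bundles(num_threads, bundle_size=32):
    """Return the sizes of the thread bundles that a thread block is split into."""
    if num_threads < 0 or bundle_size <= 0:
        raise ValueError("num_threads must be >= 0 and bundle_size must be > 0")
    full_bundles, remainder = divmod(num_threads, bundle_size)
    sizes = [bundle_size] * full_bundles
    if remainder:
        # The final bundle holds fewer threads than the fixed bundle size.
        sizes.append(remainder)
    return sizes
```

For instance, a thread block of 100 threads yields three full bundles of 32 threads and one partial bundle of 4 threads.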

[0047] During the execution of computational tasks, the computing unit 130 needs to acquire input data to be processed and also generates result data, which can be stored, for example, in the memory module (e.g., HBM) of the graphics processing unit 100. This input or result data can be transferred (input or output) via direct memory operations, in which case the computing unit 130 will generate a transfer access request for direct memory operations.

[0048] The exemplary graphics processing unit 100 shown in Figure 1A above is for illustrative purposes only. This disclosure is not limited to the structure and composition shown. For example, embodiments of this disclosure can be used for other types of graphics processing units.

[0049] Figure 1B shows a schematic flowchart of a verification method for a computing device according to at least one embodiment of this disclosure. For example, the computing device includes an interconnect module and at least one computing unit. For example, in some embodiments of this disclosure, the computing device may be a single-core CPU or a multi-core CPU, or a single-core GPU or a multi-core GPU. Correspondingly, the computing unit may be a processor core in the CPU, or a computing unit in the GPU. The interconnect module is used to implement coupling and communication between multiple components in the computing device. For example, for a GPU, to implement coupling and communication between components such as computing units and command processors (described below), the interconnect module may include a network on chip (NoC).

[0050] As shown in Figure 1B, the verification method includes the following steps S101 to S102.

[0051] Step S101: Provide a verification environment for verifying computing devices.

[0052] The aforementioned verification environment includes a request queue, a response queue, and a main bus function module.

[0053] Step S102: Execute multiple verification threads in parallel using the request queue and the response queue.

[0054] These multiple verification threads include test threads, a main thread, and computing device threads, which execute in parallel, thereby improving verification efficiency. The main thread runs the main bus function module to perform interface driving and sampling operations for the computing device, and communicates with at least one computing unit through the interconnect module. The test thread sends drive signals to the computing device and receives response signals from the computing device. The computing device thread runs the computing device to perform calculations.

[0055] For example, in some examples of the verification method described above, step S102 may include: using a request queue and a response queue for data communication between multiple verification threads. For example, in at least one example, using a request queue and a response queue for data communication between multiple verification threads includes: the test thread using the request queue to send a drive signal to the computing device via the main thread, and using the response queue to receive a response signal from the computing device via the main thread. By using the request queue and response queue for data communication, the multiple verification threads, particularly the test thread and the main thread, can be executed in parallel more efficiently, and the system's demand for storage resources can be reduced.

[0056] For example, in some embodiments of this disclosure, the computing device further includes multiple register modules. For instance, a CPU or GPU includes various registers, some for storing data, some for storing instructions, and some for recording the operating state of the computing device. For example, the multiple register modules of the verified computing device correspond to these registers.

[0057] For example, in some embodiments of this disclosure, the computing device further includes a command processor module. As described above, for a GPU, the command processor can be a processing unit within the GPU that retrieves and interprets commands generated from the CPU (Central Processing Unit). For example, the command processor module of the verified computing device corresponds to this command processor.

[0058] Figure 2 shows a schematic diagram of the structure of a verification environment for a computing device according to at least one embodiment of the present disclosure. Figure 2 shows a verification environment for a single computing device as an example.

[0059] In the embodiment illustrated in Figure 2, for example, in at least one example, the computing device to be verified is a multi-core GPU, which includes GPU core 0 to GPU core X as computing units, where X is a positive integer. At this stage, the multi-core GPU to be verified is represented, for example, in a hardware description or modeling language such as Verilog, SystemVerilog, SystemC, or C++. Similarly, the verification environment described above is also expressed, for example, in such a language.

[0060] The verification environment for verifying computing devices includes a system software interface module 201, a request queue 202, a response queue 203, a master bus function module 204, a clock generator module 205, a memory module 206, and a slave bus function module 207. For example, as shown in Figure 2, in at least one example, the computing device also includes a command processor module 208.

[0061] For example, request queue 202 and response queue 203 can be FIFO (First In First Out) queues. For example, request queue 202 stores drive signals corresponding to test cases and drivers, and response queue 203 stores the status of computing devices acquired by the main bus function module 204.

[0062] For example, in some embodiments of this disclosure, the main bus function module 204 in the verification environment can drive the address channel with the address signal, drive the data channel, and sample the write response channel (as described below with reference to Figure 3).

[0063] For example, in some embodiments of this disclosure, the system software (e.g., the UVM system) sends drive signals corresponding to test cases and drivers to the main bus function module 204 via request queue 202, and feeds back the status of the computing device obtained by the main bus function module 204 to the system software via response queue 203.

[0064] For example, in some embodiments of this disclosure, clock generator module 205 can be used to generate a system clock signal.

[0065] For example, in some embodiments of this disclosure, the system software interface module 201 is used to receive test cases and drivers for verification of computing devices from the system software, and send drive signals corresponding to the test cases and drivers to the main bus function module 204.
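One way to picture the role of the system software interface module is as a translator between test cases and queue entries. The Python sketch below is only an assumption-laden illustration: the representation of a test case as a list of `(address, value)` register writes, the dictionary layout of a drive signal, and both function names are hypothetical and not taken from the disclosure.

```python
import queue


def submit_test_case(register_writes, request_q):
    """Turn a test case into drive signals and enqueue them for the main thread."""
    count = 0
    for addr, value in register_writes:
        request_q.put({"op": "write", "addr": addr, "data": value})
        count += 1
    return count


def collect_status(response_q):
    """Drain the status entries currently waiting in the response queue."""
    statuses = []
    while True:
        try:
            statuses.append(response_q.get_nowait())
        except queue.Empty:
            return statuses
```

In this picture the system software never touches the bus directly; it only sees test cases going in and status entries coming back.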

[0066] For example, in some embodiments of this disclosure, the slave bus function module 207 can sample the address channel from the computing device to the memory module 206 and drive the data channel (as described below with reference to Figure 3).
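The slave-side behavior, sampling addresses issued by the computing device and driving data back from the memory module, can be sketched as follows. This is a toy Python model under assumptions: `MemoryModel`, `slave_bfm_respond`, the use of queues to stand in for the address and data channels, and the default value of unwritten cells are all hypothetical.

```python
import queue


class MemoryModel:
    """Hypothetical sparse memory; unwritten cells read back a default value."""

    def __init__(self, default=0):
        self._cells = {}
        self._default = default

    def write(self, addr, data):
        self._cells[addr] = data

    def read(self, addr):
        return self._cells.get(addr, self._default)


def slave_bfm_respond(read_addr_q, read_data_q, memory):
    """Serve every pending read request and return how many were served."""
    served = 0
    while not read_addr_q.empty():
        addr = read_addr_q.get_nowait()      # sample the address channel
        read_data_q.put(memory.read(addr))   # drive the data channel
        served += 1
    return served
```

Only the read direction is modeled here; a write path would additionally sample the data channel and call `memory.write`.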

[0067] For example, the master bus function module 204 and the slave bus function module 207 conform to the Advanced eXtensible Interface (AXI) specification or the Peripheral Component Interconnect Express (PCIe) specification, and the master bus function module 204 and the slave bus function module 207 are used to provide an AXI or PCIe interface.

[0068] For example, in at least one example, memory module 206 conforms to the High Bandwidth Memory (HBM) specification, the Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM) specification, etc. That is, memory module 206 corresponds to HBM memory or DDR memory, and the DDR memory can be any version, such as DDR2, DDR3, DDR4 or DDR5.

[0069] In the example shown in Figure 2, the on-chip network is, for example, the interconnect module included in the computing device, and the main bus function module 204 communicates with at least one computing unit through the on-chip network.

[0070] For example, the embodiments of this disclosure place no restrictions on the specific type or structure of the interconnect module. For example, the on-chip network may include a switch network, a tree network, a ring network, a mesh network, a torus network, or any combination thereof.

[0071] As mentioned above, during the verification phase, all of the above modules are software modules described by hardware description languages in the integrated circuit design process. When compiled and run by a computer, they can cooperate with each other to simulate the operation of a computing device, including the operation of the computing device itself and its interaction with other components (such as the verification system).

[0072] It should be noted that the verification environment shown in Figure 2 is merely an example. The modules included in the verification environment are not limited to those described above, and may include more or fewer modules. The embodiments disclosed herein do not impose any limitations on this.

[0073] For example, the verification method of the embodiments of this disclosure can be implemented using a C++-based simulation environment and a multi-CPU multi-threading mechanism. For example, the verification method can be executed using the open-source simulation tool Verilator. The smallest unit of circuit evaluation in Verilator is the timescale. If the evaluation of the internal state of the computing device and the clock-dependent test program were both completed in a single thread, the tool would produce calculation errors. The data flow can therefore be divided into three threads that run in parallel, with data communication between the verification threads achieved through the request queue and the response queue.

[0074] It should be noted that the Verilator simulation tool is merely an example for illustrative purposes. The embodiments disclosed herein are not limited to the use of the Verilator simulation tool; other simulation tools capable of parallel simulation may be used.

[0075] As previously described, the system software used for verification (e.g., the UVM system) can send drive signals corresponding to test cases and drivers to the main bus function module 204 via the request queue 202, and feed back the status of the computing device obtained by the main bus function module 204 to the system software via the reply queue 203, in order to realize the parallel execution of multiple verification threads and the parallel communication between multiple verification threads.

[0076] For example, in some embodiments of this disclosure, the verification method may further include: running clock generator module 205 to generate a system clock signal, and using the system clock signal to execute the above-mentioned multiple verification threads in parallel.

[0077] For example, in some embodiments of this disclosure, the test thread also runs the system software interface module 201 to communicate with the system software, for example, to receive test cases and drivers for verification of the computing device from the system software.

[0078] In the embodiments of this disclosure, request queue 202 and response queue 203 are used to complete data communication between multiple verification threads. In particular, for large-scale chips such as multi-core GPUs, this can significantly increase the simulation speed of the chip and accelerate chip development.

[0079] The following provides a more specific example of how to utilize request queues and response queues to execute multiple verification threads in parallel, in at least one embodiment of this disclosure.

[0080] Figure 3 shows a schematic diagram of a process for executing multiple verification threads in parallel for a computing device, according to at least one embodiment of the present disclosure. This verification process corresponds to, for example, the verification method shown in Figure 1B, and is described with reference to, for example, the verification environment and computing device shown in Figure 2.

[0081] As shown in Figure 3, in order to verify the computing device in the design process, the main thread 301 is started first, followed by the test thread 302 and the computing device thread 303, which are executed in parallel with the main thread 301.

[0082] For example, a clock generator module can also be started here to generate a system clock signal for parallel execution of the aforementioned multiple verification threads, such as main thread 301, test thread 302, and computing device thread 303.
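The start-up order and shared clock described above can be sketched as follows. This is a minimal illustration in Python, not the C++ environment the disclosure mentions. Modeling the clock generator as a `threading.Barrier` whose action advances a cycle counter is an assumption made for this example, and the names `run_clocked_threads`, `worker`, and `tick` are hypothetical.

```python
import threading


def run_clocked_threads(num_cycles=3):
    """Start main, test, and device threads and step them on a shared clock."""
    names = ["main", "test", "device"]
    log = []                      # (cycle, thread name) entries
    log_lock = threading.Lock()
    cycle = [0]                   # current clock cycle

    def tick():
        # Hypothetical clock generator: runs once per cycle, after every
        # thread has finished its work for that cycle.
        cycle[0] += 1

    clock = threading.Barrier(len(names), action=tick)

    def worker(name):
        for _ in range(num_cycles):
            with log_lock:
                log.append((cycle[0], name))   # placeholder for per-cycle work
            clock.wait()                       # block until the next clock edge

    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    # The main thread is started first, then the test and device threads.
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Calling `run_clocked_threads(3)` returns nine log entries, three per cycle. No thread can begin cycle k+1 before all three have finished cycle k, which is the lock-step behavior the barrier is meant to illustrate.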

[0083] As shown in Figure 3, test thread 302 configures multiple register modules in the computing device to perform initialization operations on the chip being verified (i.e., the computing device chip, such as a GPU chip). For example, the register modules in the computing device are used to store drive signals corresponding to test cases and drivers, as well as the state of the computing device obtained by the main bus function module.

[0084] For example, test thread 302 sends the received drive signals to the main bus function module through a request queue; the status of the computing device is fed back to test thread 302 through a reply queue, and test thread 302 then feeds back this status information to, for example, system software.

[0085] The main thread 301 runs the main bus function module, the slave bus function module, and the memory module. For example, the main bus function module can provide address channel driving, data channel driving, and write response channel sampling for the computing device. The main bus function module provides the interface between itself and the computing device, and as mentioned above, this interface can conform to an existing interface standard, such as PCIe. Similarly, the slave bus function module can provide data channel driving from the computing device to, for example, the memory module, and perform address channel sampling. The slave bus function module likewise provides the interface between itself and the computing device, which can also conform to an existing interface standard, such as PCIe.

[0086] For example, an address channel carries control messages to describe the attributes of the transmitted data and carries the corresponding address control information. For example, a data channel is used to transmit data between the computing device and the main bus function module or slave bus function module. For example, a data channel can have a corresponding address channel, carrying the address and control information required for transmission. For example, in the operation of test thread 302 writing data to the computing device through the main bus function module, the computing device informs the main bus function module that the transmission is complete through a write response channel.

[0087] The data channel driving connects the slave bus function module and the memory module. The slave bus function module responds to memory access requests received from the computing device by accessing the memory module. For example, the access operation here simulates the GPU's access to video memory. For example, the access operation can include read operations, write operations, query operations, etc.
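As a rough illustration of how a bus function module might answer memory access requests, consider the following Python sketch. The classes `MemoryModule` and `SlaveBfm`, the dictionary-based request format, and the meaning given to the "query" operation (checking whether an address has been written) are all assumptions for this example. The disclosure does not specify them.

```python
class MemoryModule:
    """Toy stand-in for the memory module (simulated video memory)."""

    def __init__(self):
        self.cells = {}

    def write(self, addr, value):
        self.cells[addr] = value

    def read(self, addr):
        # Assumption: addresses that were never written read back as 0.
        return self.cells.get(addr, 0)


class SlaveBfm:
    """Hypothetical bus function module that serves device memory requests."""

    def __init__(self, memory):
        self.memory = memory

    def handle(self, request):
        op, addr = request["op"], request["addr"]
        if op == "write":
            self.memory.write(addr, request["data"])
            return {"ok": True}
        if op == "read":
            return {"ok": True, "data": self.memory.read(addr)}
        if op == "query":
            # Assumed meaning of "query": has this address been written?
            return {"ok": True, "present": addr in self.memory.cells}
        return {"ok": False}      # unknown operation
```

For instance, after `bfm = SlaveBfm(MemoryModule())`, a write request followed by a read request to the same address returns the written value, while an unrecognized operation is rejected.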

[0088] As shown in Figure 3, computing device thread 303 can further include multiple computing device sub-threads, such as computing device sub-threads 1 to n. Each of these sub-threads runs the code corresponding to a specific computing unit; for example, one sub-thread corresponds to one computing unit, thereby simulating the operation of the multiple computing units. For instance, for a multi-core GPU, these sub-threads are used to simulate the operation of GPU cores 0 to n-1.
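The one-sub-thread-per-computing-unit arrangement can be sketched in Python as follows. The functions `simulate_core` and `run_device_thread` are hypothetical, and the "work" of each core (summing a list of integers) is only a placeholder for running the compiled model of a computing unit.

```python
import threading


def simulate_core(core_id, workload, results):
    """Hypothetical per-core model: sums the work items assigned to this core."""
    results[core_id] = sum(workload)


def run_device_thread(workloads):
    """Run one sub-thread per computing unit and collect their results."""
    results = [None] * len(workloads)
    sub_threads = [
        threading.Thread(target=simulate_core, args=(i, w, results))
        for i, w in enumerate(workloads)
    ]
    for t in sub_threads:
        t.start()
    for t in sub_threads:
        t.join()                  # wait for every simulated core to finish
    return results
```

Each sub-thread writes only to its own slot in `results`, so no lock is needed in this sketch. A real model with shared state between cores would need explicit synchronization.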

[0089] As shown in Figure 3, test thread 302 and main thread 301 communicate via request queues and response queues. For example, the request queue includes a write address queue and a write data queue; correspondingly, sending drive signals to the computing device via the main thread using the request queue may include: transmitting a first address signal to the computing device via the write address queue; and transmitting a first data signal to the computing device via the main bus using the write data queue.

[0090] For example, the test thread uses a request queue to send drive signals to the computing device through the main thread, including: the test thread sending an address signal to the write address queue, and in response to the main thread detecting that the write address queue is not empty, driving the address to the computing device according to the bus timing; the test thread sending a data signal to the write data queue, and in response to the main thread detecting that the write data queue is not empty, driving the data to the computing device according to the bus timing.

[0091] For example, the response queue includes a write command response queue, where the main thread receives response signals from the computing device and completes the communication between the test thread and the main thread through the write command response queue.
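The write path through the write address queue, write data queue, and write command response queue can be illustrated with the Python sketch below. It assumes that a blocking `queue.Queue.get()` is an acceptable model for "the main thread detecting that the queue is not empty", and that a dictionary can stand in for the computing device. The function name `run_write_transaction` and the `"OKAY"` response value are hypothetical.

```python
import queue
import threading


def run_write_transaction(writes):
    """Drive (address, data) pairs into a toy device through three queues."""
    write_addr_q = queue.Queue()   # test thread -> main thread
    write_data_q = queue.Queue()   # test thread -> main thread
    write_resp_q = queue.Queue()   # main thread -> test thread
    device_regs = {}               # stand-in for the computing device
    responses = []

    def main_thread():
        for _ in range(len(writes)):
            # get() blocks until the queue is non-empty, which models the
            # main thread detecting that a queue is not empty.
            addr = write_addr_q.get()
            data = write_data_q.get()
            device_regs[addr] = data            # "drive" address and data
            write_resp_q.put((addr, "OKAY"))    # hypothetical response code

    def test_thread():
        for addr, data in writes:
            write_addr_q.put(addr)
            write_data_q.put(data)
        for _ in range(len(writes)):
            responses.append(write_resp_q.get())

    threads = [threading.Thread(target=main_thread),
               threading.Thread(target=test_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return device_regs, responses
```

Because each queue is first-in, first-out and has a single producer, addresses and data stay paired in the order the test thread issued them, and responses return in that same order.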

[0092] For example, as shown in Figure 3, in some embodiments of this disclosure, test thread 302 also configures the register module to perform initialization operations on the computing device and trigger operations on at least one computing unit.

[0093] For example, as shown in Figure 3, in some embodiments of this disclosure, the main thread 301 also runs a slave bus function module and a memory module, receives storage access requests from the computing device, and responds to the storage access requests through the slave bus function module to access the memory module.

[0094] For example, as shown in Figure 3, in some embodiments of this disclosure, the computing device thread 303 running the computing device may include: running a command processor module to interact with at least one computing unit. For example, initialization operations may include initialization of the command processor module and initialization of at least one computing unit.

[0095] The verification methods in some embodiments of this disclosure can also be used for multiple computing devices. For example, the multiple computing devices are coupled to each other, and each computing device can be a computing device of the embodiments described above, including an interconnect module and at least one computing unit.

[0096] Figure 4 shows a schematic diagram of an example of a verification environment and verification method for a system comprising multiple computing devices, according to at least one embodiment of the present disclosure.

[0097] Since the processing power of a single computing device is limited, multiple computing devices need to be used simultaneously to process data in order to improve data processing efficiency. This has led to the need for systems that include multiple computing devices. For example, a system that includes multiple computing devices can be a GPU interconnect system or a CPU interconnect system.

[0098] As shown in Figure 4, the verification environment of each of the multiple computing devices has the structure of the verification environment for a single computing device shown in Figure 2; for details, refer to the description of Figure 2, which is not repeated here. The difference between parallel simulation of multiple computing devices and simulation of a single computing device is that the verification environment is instantiated for each of the multiple computing devices, and execution is managed by shared test cases and drivers.

[0099] For example, in some embodiments of this disclosure, each of the plurality of computing devices receives test cases and drivers for verification of the plurality of computing devices through its system software interface module.

[0100] Figure 5 illustrates the process of executing multiple verification threads in parallel across multiple computing devices.

[0101] As shown in Figure 5, for example, there are n computing devices to be verified (n is an integer greater than 1). For each computing device, the main thread is started first. Then, a system clock signal is generated by the clock generator module for executing the multiple verification threads in parallel. Next, the test thread and the computing device thread are started and executed in parallel with the main thread. Finally, the threads are terminated. Since multiple verification threads, including a test thread, a main thread, and a computing device thread, are executed in parallel for each computing device, for n computing devices, n test threads (test thread 1 to test thread n), n main threads, and n computing device threads (computing device 1 thread to computing device n thread) are executed in parallel. The operations of the test threads, the main threads, and the computing device threads, as well as the communication between the threads, have already been described in detail above with reference to Figure 3 and are not repeated here.
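One way to picture the per-device instantiation is the Python sketch below, in which each of n environments owns its queues and its three threads while all environments share one test case. The class `VerificationEnv`, the function `verify_devices`, the extra device-side queues, the `None` end-of-test marker, and the toy computation `req * 10 + device_id` are all assumptions made for illustration.

```python
import queue
import threading


class VerificationEnv:
    """Hypothetical per-device environment: own queues, own three threads."""

    def __init__(self, device_id):
        self.device_id = device_id
        self.request_q = queue.Queue()    # test thread -> main thread
        self.response_q = queue.Queue()   # main thread -> test thread
        self.device_in = queue.Queue()    # main thread -> device thread
        self.device_out = queue.Queue()   # device thread -> main thread
        self.result = []

    def _main(self):
        while True:
            req = self.request_q.get()
            self.device_in.put(req)           # forward stimulus (or end marker)
            if req is None:
                self.response_q.put(None)     # tell the test thread to stop
                return
            self.response_q.put(self.device_out.get())

    def _test(self, test_case):
        for stimulus in test_case:
            self.request_q.put(stimulus)
        self.request_q.put(None)              # end-of-test marker (assumption)
        for response in iter(self.response_q.get, None):
            self.result.append(response)

    def _device(self):
        for req in iter(self.device_in.get, None):
            # Toy computation so each device gives a distinguishable answer.
            self.device_out.put(req * 10 + self.device_id)

    def start(self, test_case):
        threads = [
            threading.Thread(target=self._main),   # main thread starts first
            threading.Thread(target=self._test, args=(test_case,)),
            threading.Thread(target=self._device),
        ]
        for t in threads:
            t.start()
        return threads


def verify_devices(n, test_case):
    """Instantiate n environments that share one test case and run them."""
    envs = [VerificationEnv(i) for i in range(1, n + 1)]
    threads = [t for env in envs for t in env.start(test_case)]
    for t in threads:
        t.join()
    return {env.device_id: env.result for env in envs}
```

With two devices and the shared test case `[1, 2]`, this sketch runs six threads in total and returns one result list per device.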

[0102] Figure 6 shows a schematic block diagram of a verification apparatus 600 for a computing device according to at least one embodiment of the present disclosure. This verification apparatus can be used to perform the verification method shown in Figure 1B.

[0103] As shown in Figure 6, the verification device 600 includes a verification environment unit 601 and a thread execution unit 602.

[0104] The verification environment unit 601 is configured to provide a verification environment for verifying computing devices, which includes a request queue, a response queue, a main bus function module, etc.

[0105] The thread execution unit 602 is configured to execute multiple verification threads in parallel using a request queue and a response queue. These multiple verification threads include a test thread, a main thread, and a computing device thread. The main thread runs the main bus function module to perform interface driving and sampling operations for the computing device and communicates with at least one computing unit through an interconnect module. The test thread sends drive signals to the computing device and receives response signals from the computing device. The computing device thread runs the computing device to perform calculations.

[0106] For example, in at least one embodiment, the thread execution unit 602 is further configured to perform data communication between multiple verification threads using a request queue and a response queue. For example, in at least one example, performing data communication between multiple verification threads using a request queue and a response queue includes: a test thread using the request queue to send a drive signal to the computing device via the main thread, and using the response queue to receive a response signal from the computing device via the main thread.

[0107] For example, in at least one embodiment, the verification environment further includes a clock generator module, and the thread execution unit 602 is further configured to run the clock generator module to generate a system clock signal and use the system clock signal to execute multiple verification threads in parallel.

[0108] For example, in at least one embodiment, the verification environment further includes a system software interface module, and the thread execution unit 602 is further configured such that the test thread also runs the system software interface module to communicate with the system software to receive test cases and drivers for verification of the computing device from the system software.

[0109] For example, in at least one embodiment, the thread execution unit 602 is further configured such that the test thread sends drive signals corresponding to test cases and drivers to the main bus function module through a request queue via the system software interface module, and feeds back the status of the computing device obtained by the main bus function module to the system software through a reply queue.

[0110] For example, in at least one embodiment, the verification environment further includes a bus function module and a memory module, and the thread execution unit 602 is further configured such that the main thread also runs the bus function module and the memory module, receives storage access requests from the computing device, and accesses the memory module by responding to the storage access requests through the bus function module.

[0111] For example, in at least one embodiment, the thread execution unit 602 is further configured such that the slave bus function module provides sampling of the slave address channel from the computing device to the memory module and driving of the slave data channel.

[0112] For example, in at least one embodiment, the thread execution unit 602 is further configured such that the test thread also performs initialization operations on the computing device and trigger operations on at least one computing unit.

[0113] For example, in at least one embodiment, the computing device further includes a command processor module, and the thread execution unit 602 is further configured to run the command processor module to interact with at least one computing unit.

[0114] For example, in at least one embodiment, the computing device further includes a plurality of register modules, and the thread execution unit 602 is further configured such that the test thread further configures the register modules to perform initialization operations on the computing device and to trigger operations on at least one computing unit.

[0115] For example, in at least one embodiment, the request queue includes a write address queue and a write data queue, and the thread execution unit 602 is further configured to use the write address queue to transmit, to the main thread, a first address signal for the computing device; and to use the write data queue to transmit, to the main bus, a first data signal for the computing device.

[0116] For example, in at least one embodiment, the response queue includes a write command response queue, and the thread execution unit 602 is further configured to use the write command response queue to receive response signals from the computing device via the main thread.

[0117] For example, the verification environment unit 601 and the thread execution unit 602 can be implemented using hardware, software, firmware, or any feasible combination thereof, without limitation by this disclosure.

[0118] The verification device 600 has the same technical effect as the verification method shown in Figure 1B, which is not described again here.

[0119] At least one embodiment of this disclosure also provides an electronic device including a processor and a memory storing computer-executable instructions that, when executed by the processor, implement the verification method provided in at least one embodiment of this disclosure.

[0120] Figure 7 is a schematic block diagram of an electronic device 700 provided in some embodiments of this disclosure. For example, as shown in Figure 7, the electronic device 700 includes a processor 710 and a memory 720. The memory 720 stores computer-executable instructions (e.g., one or more computer program modules). The processor 710 executes the computer-executable instructions, which, when executed by the processor 710, can perform one or more steps in the verification method described above. The memory 720 and the processor 710 can be interconnected via a bus system and/or other forms of connection mechanism (not shown).

[0121] For example, processor 710 can be a central processing unit (CPU), a graphics processing unit (GPU), or another form of processing unit with data processing and/or program execution capabilities. For example, the CPU can have a CISC or RISC architecture, such as x86 or ARM. Processor 710 can be a general-purpose processor or a special-purpose processor, capable of controlling other components in electronic device 700 to perform desired functions.

[0122] For example, memory 720 may include any combination of one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory may include, for example, random access memory (RAM) and/or cache memory. Non-volatile memory may include, for example, read-only memory (ROM), hard disk, erasable programmable read-only memory (EPROM), portable compact disc read-only memory (CD-ROM), USB memory, flash memory, etc. One or more computer program modules may be stored on the computer-readable storage medium, and processor 710 may run one or more computer program modules to implement various functions of electronic device 700. Various application programs, as well as various data used and/or generated by the application programs, may also be stored in the computer-readable storage medium.

[0123] It should be noted that, in the embodiments of this disclosure, for the specific functions and technical effects of the electronic device 700, refer to the description of the verification method above; they are not repeated here.

[0124] Figure 8 is a schematic block diagram of another electronic device provided in some embodiments of this disclosure. The electronic device 800 is, for example, suitable for implementing the verification method provided in the embodiments of this disclosure. The electronic device 800 may be a terminal device, etc. It should be noted that the electronic device 800 shown in Figure 8 is merely an example and does not impose any limitation on the functionality and scope of use of the embodiments disclosed herein.

[0125] As shown in Figure 8, the electronic device 800 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 810, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 820 or a program loaded from a storage device 880 into a random access memory (RAM) 830. The RAM 830 also stores various programs and data required for the operation of the electronic device 800. The processing device 810, the ROM 820, and the RAM 830 are interconnected via a bus 840. An input/output (I/O) interface 850 is also connected to the bus 840.

[0126] Typically, the following devices can be connected to I/O interface 850: input devices 860 including, for example, touchscreens, touchpads, keyboards, mice, cameras, microphones, accelerometers, gyroscopes, etc.; output devices 870 including, for example, liquid crystal displays (LCDs), speakers, vibrators, etc.; storage devices 880 including, for example, magnetic tapes, hard disks, etc.; and communication devices 890. Communication device 890 allows electronic device 800 to communicate with other electronic devices over wireless or wired connections to exchange data. Although Figure 8 shows an electronic device 800 with various devices, it should be understood that it is not required to implement or have all of the devices shown, and the electronic device 800 may alternatively implement or have more or fewer devices.

[0127] For example, according to embodiments of this disclosure, the verification method described above can be implemented as a computer software program. For instance, embodiments of this disclosure include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program including program code for performing the verification method described above. In such embodiments, the computer program can be downloaded and installed from a network via a communication device 890, or installed from a storage device 880, or installed from a ROM 820. When the computer program is executed by the processing device 810, it can implement the functions defined in the verification method provided by embodiments of this disclosure.

[0128] At least one embodiment of this disclosure provides a computer-readable storage medium for non-transitory storage of computer-executable instructions, wherein the computer-executable instructions, when executed by a processor, implement the verification method provided in at least one embodiment of this disclosure.

[0129] Figure 9 is a schematic diagram of a storage medium provided in some embodiments of this disclosure. For example, as shown in Figure 9, the storage medium 900 is used to store computer-executable instructions 910. For example, when the computer-executable instructions 910 are executed by a computer, one or more steps in the verification method described above can be performed.

[0130] The following points need to be explained:

[0131] (1) The accompanying drawings of the embodiments of this disclosure relate only to the structures involved in the embodiments of this disclosure; for other structures, refer to common designs.

[0132] (2) Where there is no conflict, the embodiments of this disclosure and the features in the embodiments can be combined with each other to obtain new embodiments.

[0133] The above description is merely a specific embodiment of this disclosure, but the scope of protection of this disclosure is not limited thereto. The scope of protection of this disclosure should be determined by the scope of protection of the claims.

Claims

1. A verification method for a computing device, the computing device comprising an interconnect module and at least one computing unit, the verification method comprising: A verification environment is provided for verifying the computing device, wherein the verification environment includes a request queue, a response queue, and a main bus function module; and Multiple verification threads are executed in parallel using the request queue and the response queue. The plurality of verification threads include a test thread, a main thread, and a computing device thread. The main thread runs the main bus function module to perform interface driving and sampling operations for the computing device and communicates with the at least one computing unit through the interconnect module. The test thread sends drive signals to the computing device and receives response signals from the computing device. The computing device thread runs the computing device to perform calculations.

2. The verification method according to claim 1, wherein, The multiple verification threads are executed in parallel using the request queue and the response queue, including: The request queue and the response queue are used for data communication between the multiple verification threads.

3. The verification method according to claim 2, wherein, Data communication between the plurality of verification threads is performed using the request queue and the response queue, including: The test thread uses the request queue to send drive signals to the computing device through the main thread, and uses the response queue to receive response signals from the computing device through the main thread.

4. The verification method according to claim 1, wherein, The verification environment also includes a clock generator module. The verification method further includes: The clock generator module is run to generate a system clock signal, which is then used to execute the multiple verification threads in parallel.

5. The verification method according to claim 1, wherein, The verification environment also includes a system software interface module. The test thread also runs the system software interface module to communicate with the system software to receive test cases and drivers for verification of the computing device from the system software.

6. The verification method according to claim 5, wherein, The test thread sends the drive signals corresponding to the test cases and drivers to the main bus function module through the request queue via the system software interface module, and feeds back the status of the computing device obtained by the main bus function module to the system software through the reply queue.

7. The verification method according to claim 1, wherein the verification environment also includes a slave bus function module and a memory module, and the main thread also runs the slave bus function module and the memory module, receives storage access requests from the computing device, and responds to the storage access requests through the slave bus function module to access the memory module.

8. The verification method according to claim 7, wherein, The slave bus function module provides sampling of the slave address channel from the computing device to the memory module and driving of the slave data channel.

9. The verification method according to claim 8, wherein the main bus function module and the slave bus function module conform to the Advanced eXtensible Interface (AXI) specification or the Peripheral Component Interconnect Express (PCIe) specification.

10. The verification method according to claim 9, wherein the memory module conforms to the High Bandwidth Memory (HBM) specification or the Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM) specification, and the test thread also performs initialization operations on the computing device and trigger operations on the at least one computing unit.

11. The verification method according to claim 10, wherein, The computing device also includes a command processor module. The computing device thread runs the computing device, including: running the command processor module to interact with the at least one computing unit. The initialization operation includes the initialization of the command processor module and the initialization of the at least one computing unit.

12. The verification method according to claim 11, wherein, The computing device also includes multiple register modules. The test thread also configures the register module to initialize the computing device and trigger the at least one computing unit.

13. The verification method according to claim 1, wherein the at least one computing unit includes multiple computing units, and the computing device thread includes multiple computing device sub-threads, which are respectively used to run the multiple computing units.

14. The verification method according to claim 1 or 13, wherein the computing device is a single-core graphics processing unit or a multi-core graphics processing unit, and the computing unit is a computing core of the graphics processing unit.

15. The verification method according to claim 3, wherein, The request queue includes a write address queue and a write data queue. The test thread uses the request queue to send drive signals to the computing device through the main thread, including: The test thread sends an address signal to the write address queue. In response to the main thread detecting that the write address queue is not empty, the main thread drives the address to the computing device according to the bus timing. The test thread sends a data signal to the write data queue. In response to the main thread detecting that the write data queue is not empty, the main thread drives the data to the computing device according to the bus timing.

16. The verification method according to claim 15, wherein, The main bus function module provides the driving for the address channel of the address signal and the driving for the data channel, as well as the sampling of the write response channel for the address channel.

17. The verification method according to claim 1 or 15, wherein the response queue includes a write command response queue, and the write command response queue is used to receive response signals from the computing device via the main thread.

18. A verification method for a plurality of computing devices coupled together, each of the computing devices including an interconnect module and at least one computing unit, the verification method for each of the computing devices comprising: A verification environment is provided for verifying the computing device, wherein the verification environment includes a request queue, a response queue, and a main bus function module; Multiple verification threads are executed in parallel using the request queue and the response queue. The plurality of verification threads include a test thread, a main thread, and a computing device thread. The main thread runs the main bus function module to perform interface driving and sampling operations for the computing device and communicates with the at least one computing unit through the interconnect module. The test thread sends drive signals to the computing device and receives response signals from the computing device. The computing device thread runs the computing device to perform calculations.

19. The verification method according to claim 18, wherein, The verification environment also includes a system software interface module. The test thread also runs the system software interface module to communicate with the system software to receive test cases and drivers for verification of each of the plurality of computing devices.

20. A verification apparatus for a computing device, the computing device comprising an interconnect module and at least one computing unit, the verification apparatus comprising a verification environment unit and a thread execution unit, wherein the verification environment unit is configured to provide a verification environment for verifying the computing device, the verification environment including a request queue, a response queue, and a main bus function module; and the thread execution unit is configured to execute multiple verification threads in parallel using the request queue and the response queue, wherein the plurality of verification threads include a test thread, a main thread, and a computing device thread, the main thread runs the main bus function module to perform interface driving and sampling operations for the computing device and communicates with the at least one computing unit through the interconnect module, the test thread sends drive signals to the computing device and receives response signals from the computing device, and the computing device thread runs the computing device to perform calculations.

21. An electronic device, comprising: a processor; and a memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the processor, implement the verification method according to any one of claims 1-19.

22. A computer-readable storage medium for non-transitory storage of computer-executable instructions, wherein the computer-executable instructions, when executed by a processor, implement the verification method according to any one of claims 1-19.