Thread synchronization methods and devices, chip simulation methods and platforms, and related equipment

By acquiring lock state record information in multi-core processors and issuing lock requests after all processor cores have completed the previous synchronization process, the problem of poor task processing efficiency caused by unreasonable lock permission allocation is solved, and more efficient multi-core collaborative processing is achieved.

CN115729627B · Active · Publication Date: 2026-06-30 · HYGON INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
HYGON INFORMATION TECH CO LTD
Filing Date
2022-11-16
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing multi-core processor concurrent processing mechanisms suffer from poor task processing efficiency, especially due to unreasonable lock permission allocation, which causes some CPU cores to be unable to acquire lock permissions, affecting the efficiency of collaborative processing.

Method used

Before issuing a lock request, the lock acquisition status record information of each processor core in the multi-core processor is obtained. The lock request is only issued when all processor cores have completed the task of accessing the shared storage device in the previous synchronization process, so as to reasonably allocate lock permissions.

Benefits of technology

By rationally allocating lock permissions, the waste of CPU instruction cycles when processor cores are waiting for lock permissions is avoided, thereby improving the efficiency and balance of task processing in multi-core processors.

✦ Generated by Eureka AI based on patent content.

[Figure CN115729627B_ABST]

Abstract

This invention provides a thread synchronization method and apparatus, a chip simulation method and platform, and related devices. The thread synchronization method, applied to a processor core in a multi-core processor, includes: acquiring lock state record information, wherein the lock state record information records the lock acquisition state of each processor core in the multi-core processor, the lock acquisition state including at least a first state and a second state; after the lock state record information indicates that the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is the second state, issuing a lock request to request lock permission to access the shared storage device of the multi-core processor. This method improves the task processing efficiency of the multi-core processor by controlling the issuance of lock requests.

Description

Technical Field

[0001] The embodiments of the present invention relate to the field of processor technology, specifically to a thread synchronization method and apparatus, a chip simulation method and platform, and related equipment.

Background Technology

[0002] In multi-core processors, concurrency mechanisms are used to define rules for the parallel processing flow of the multi-core processor. For example, a spin lock is an important concurrency mechanism. It provides lock privileges to threads on a processor core, allowing threads on processor cores that request and acquire lock privileges to access shared memory devices, while prohibiting threads on processor cores that have not acquired lock privileges from accessing shared memory devices. This effectively prevents errors in the parallel process when threads are synchronizing information.

[0003] However, existing concurrent processing mechanisms suffer from poor performance in multi-core processor task processing.

Summary of the Invention

[0004] In view of this, embodiments of the present invention provide a thread synchronization method and apparatus, a chip simulation method and platform, and related equipment to improve the task processing efficiency of multi-core processors.

[0005] To achieve the above objectives, the embodiments of the present invention provide the following technical solutions:

[0006] This invention provides a thread synchronization method applied to a processor core in a multi-core processor, comprising:

[0007] Obtain lock state record information, which records the lock acquisition state of each processor core in a multi-core processor. The lock acquisition state includes at least a first state and a second state. The first state is used to indicate that the processor core is in the process of issuing a lock request to access the shared storage device of the multi-core processor, and the second state is used to indicate that the processor core is not in the first state.

[0008] When the lock state record information indicates that each processor core in the multi-core processor is in the second state in the previous synchronization process, a lock request is issued to request lock permission to access the shared storage device of the multi-core processor.

[0009] Optionally, issuing a lock request when the lock state record information indicates that each processor core in the multi-core processor was in the second state in the previous synchronization process includes:

[0010] Based on the acquired lock state record information, confirm whether the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is the second state.

[0011] If so, issue a lock request.

[0012] Optionally, if any processor core in the multi-core processor was in the first state in the previous synchronization process, a loop process is entered. The loop process repeats the steps of acquiring the lock state record information and confirming, based on the acquired information, whether every processor core in the multi-core processor was in the second state in the previous synchronization process, until every processor core is confirmed to have been in the second state in the previous synchronization process.

[0013] Optionally, if the loop process times out, the loop process is terminated.

[0014] Optionally, the lock state record information includes multiple record sub-information, wherein different record sub-information corresponds to different types of threads.

[0015] Optionally, the lock state record information includes first record sub-information corresponding to a first type of thread. The first type of thread is used to output data to the shared storage device. In the first record sub-information, one processor core corresponds to one lock acquisition state, which corresponds to the process of that processor core outputting data to the shared storage device.

[0016] Optionally, the lock state record information includes second record sub-information corresponding to the second type of thread. The second type of thread is used to exchange thread data of different processor cores based on the shared storage device. In the second record sub-information, one processor core corresponds to multiple lock acquisition states, wherein different lock acquisition states correspond to the process of data exchange between the processor core and different processor cores.

[0017] Optionally, the lock state record information includes third record sub-information corresponding to a third type of thread. The third type of thread is used to exchange thread data with a preset system based on the shared storage device. In the third record sub-information, a thread in a processor core corresponds to a lock acquisition state, wherein the lock acquisition state corresponds to the process of the thread exchanging thread data with the preset system.
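The three kinds of record sub-information described above differ mainly in how lock acquisition states are keyed. The following Python sketch is one possible layout, not the layout used in the patent's figures: the class names, the dictionary keys, the exclusion of a core paired with itself, and the `threads_per_core` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class LockState(Enum):
    FIRST = 1   # core is requesting or using lock permission
    SECOND = 2  # core is not in the first state


@dataclass
class FirstRecord:
    """First type of thread (output): one state per processor core."""
    num_cores: int
    states: list = field(init=False)

    def __post_init__(self):
        self.states = [LockState.SECOND] * self.num_cores


@dataclass
class SecondRecord:
    """Second type of thread (core-to-core exchange): one state per
    (core, peer core) pair. Excluding core == peer is an assumption."""
    num_cores: int
    states: dict = field(init=False)

    def __post_init__(self):
        self.states = {
            (core, peer): LockState.SECOND
            for core in range(self.num_cores)
            for peer in range(self.num_cores)
            if core != peer
        }


@dataclass
class ThirdRecord:
    """Third type of thread (exchange with a preset system): one state per
    (core, thread) pair. threads_per_core is a hypothetical parameter
    mapping a core id to the number of such threads on that core."""
    threads_per_core: dict
    states: dict = field(init=False)

    def __post_init__(self):
        self.states = {
            (core, thread_id): LockState.SECOND
            for core, count in self.threads_per_core.items()
            for thread_id in range(count)
        }
```

With three cores, `FirstRecord(3)` holds three states and `SecondRecord(3)` holds six, one for each ordered pair of distinct cores.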

[0018] Optionally, the initial state of the lock state record information is configured such that the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is the second state.

[0019] Optionally, the lock state record information is recorded in lock state information tables. Either multiple lock state information tables cyclically record the lock acquisition state of each processor core in different synchronization processes, or the lock state information tables correspond one-to-one with the synchronization processes, each recording the lock acquisition state of each processor core in its corresponding synchronization process.

[0020] This invention also provides a chip simulation method, wherein the chip simulation method uses the thread synchronization method provided in this invention to execute the thread synchronization process.

[0021] Optionally, the chip simulation method is used to simulate a target test instance, which includes a first test instance and a second test instance. The first test instance is a general verification methodology test instance, and the second test instance is a compiled language test instance. During the simulation of the first test instance, the second test instance is loaded for simulation.

[0022] In the second test instance, the thread synchronization method described in the embodiments of the present invention is used to synchronize threads.

[0023] Optionally, the simulation process for the second test instance includes:

[0024] Initialize the shared information for the second test instance;

[0025] Obtain the thread task of the second test instance;

[0026] Initialize the threads of the second test instance, wherein the initialization step involves the second type of thread, and the corresponding second record sub-information is configured in the lock state record information;

[0027] Synchronize the initialization results of the threads of the second test instance, wherein the synchronization step involves the first type of thread, and the corresponding first record sub-information is configured in the lock state record information;

[0028] Execute the main function of the threads of the second test instance, wherein the execution of the main function involves the second type of thread, and the corresponding second record sub-information is configured in the lock state record information;

[0029] Output the simulation data of the second test instance, wherein the output step involves the first type of thread and the second type of thread, and the corresponding first record sub-information and second record sub-information are configured in the lock state record information.
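The pairing of simulation steps with record sub-information listed above can be summarized as a small lookup table. This Python sketch only restates that pairing. The step labels and the helper names `records_needed` and `steps_using` are illustrative and do not come from the patent.

```python
# Each simulation step of the second test instance, paired with the record
# sub-information that the text says is configured for that step.
SIMULATION_STEPS = [
    ("initialize shared information", set()),
    ("obtain thread tasks", set()),
    ("initialize threads", {"second"}),
    ("synchronize initialization results", {"first"}),
    ("execute thread main function", {"second"}),
    ("output simulation data", {"first", "second"}),
]


def records_needed(steps):
    """Union of all record sub-information used across the steps."""
    needed = set()
    for _, records in steps:
        needed |= records
    return needed


def steps_using(steps, record):
    """Names of the steps that configure the given record sub-information."""
    return [name for name, records in steps if record in records]
```

In this sequence only the first and second record sub-information appear. The third type is described elsewhere in the text but is not attached to any of these steps.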

[0030] This invention also provides a thread synchronization device, comprising:

[0031] A state information acquisition module, used to acquire lock state record information. The lock state record information records the lock acquisition state of each processor core in a multi-core processor. The lock acquisition state includes at least a first state and a second state. The first state indicates that the processor core is in the process of issuing a lock request to access the shared storage device of the multi-core processor. The second state indicates that the processor core is not in the first state.

[0032] A request sending module, used to issue a lock request when the lock state record information indicates that the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is the second state, in order to request lock permission to access the shared storage device of the multi-core processor.

[0033] This invention also provides a simulation platform, which performs chip simulation using the chip simulation method provided in this invention.

[0034] This invention also provides a processor core, which executes a thread synchronization process using the thread synchronization method provided in this invention.

[0035] This invention also provides a computer device, including: at least one memory and at least one processor; the memory stores one or more computer-executable instructions, and the processor invokes the one or more computer-executable instructions to execute the above-described thread synchronization method.

[0036] This invention also provides a storage medium that stores one or more computer-executable instructions for executing the thread synchronization method described above.

[0037] This invention provides a thread synchronization method and apparatus, a chip simulation method and platform, and related devices. The thread synchronization method, applied to a processor core in a multi-core processor, includes: acquiring lock state record information, wherein the lock state record information records the lock acquisition state of each processor core in the multi-core processor, the lock acquisition state including at least a first state and a second state, the first state indicating that the processor core is in the process of issuing a lock request to access the shared storage device of the multi-core processor, and the second state indicating that the processor core is not in the first state; after the lock state record information indicates that the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is the second state, a lock request is issued to request lock permission to access the shared storage device of the multi-core processor.

[0038] As can be seen, in this embodiment of the invention, lock state record information, which records the lock acquisition state of each processor core in the multi-core processor, is obtained before a lock request is issued. The lock request is issued only after the lock state record information indicates that every processor core was in the second state in the previous synchronization process, that is, that no processor core is still in the process of issuing a lock request to access the shared storage device of the multi-core processor. Lock permission to access the shared storage device is therefore requested only when no processor core remains in that stage of the previous synchronization process. This avoids requesting lock permission for the next synchronization process while some processor cores have not yet completed their access in the previous one. By controlling the issuance of lock requests in this way, the allocation of lock permissions in the concurrent processing mechanism becomes more reasonable, and the task processing efficiency of the multi-core processor is improved.

Attached Figure Description

[0039] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on the provided drawings without creative effort.

[0040] Figure 1 is an optional computer architecture provided by an embodiment of the present invention;

[0041] Figure 2 is an optional flowchart of thread synchronization;

[0042] Figure 3 is an optional flowchart of the thread synchronization method provided by an embodiment of the present invention;

[0043] Figure 4 is another optional flowchart of the thread synchronization method provided by an embodiment of the present invention;

[0044] Figure 5 shows the data structure of the first record sub-information provided by an embodiment of the present invention;

[0045] Figure 6 shows the data structure of the second record sub-information provided by an embodiment of the present invention;

[0046] Figure 7 shows the data structure of the third record sub-information provided by an embodiment of the present invention;

[0047] Figure 8 is a schematic diagram of an optional structure of a target test instance provided by an embodiment of the present invention;

[0048] Figure 9 is an optional simulation flowchart of the second test instance provided by an embodiment of the present invention;

[0049] Figure 10 is an optional block diagram of the thread synchronization device provided by an embodiment of the present invention.

Detailed Implementation

[0050] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0051] The processor (central processing unit, CPU) is the core of a chip system, responsible for computation and control. It is the final execution unit for information processing and program execution within the chip system. In modern computer architectures, multi-core processors consist of multiple processor cores and utilize concurrent processing mechanisms to process computational tasks in parallel.

[0052] Figure 1 illustrates one possible architecture for a computer. As shown in Figure 1, the computer may include multiple CPU cores and a shared storage device (memory). The CPU cores can communicate with each other via a bus or a network on chip (NoC). In the architecture shown in Figure 1, the CPU cores are labeled core 0 through core N. The shared storage device enables data sharing and exchange among the CPU cores.

[0053] The CPU core can be a physical CPU core or a logical CPU core, where a logical CPU core is a logical unit abstracted from a physical CPU core. It can be understood that when multiple CPU cores process tasks synchronously, the threads of multiple CPU cores execute their corresponding thread tasks in parallel, and share and exchange data based on a concurrent processing mechanism. In a multi-core processor, multiple CPU cores achieve data sharing and exchange by accessing shared storage devices.

[0054] In one optional example, when a multi-core processor needs to access a shared memory device, a spinlock mechanism can be used to ensure synchronization between threads on multiple CPU cores. A spinlock is defined as a data type `spinlock_t`, and operations on it are atomic. Initially, the value of the spinlock is set to the integer 1. When a thread on a CPU core needs to access the shared memory device, it can request lock access by calling the `get_spin_lock()` function. This function reads the value of the spinlock, decrements it by 1, and checks whether the result is 0. If the result is 0, the spinlock was available, so the thread acquires the lock and accesses the shared memory device. If the result is negative, the shared memory device is already occupied by another thread, and the current thread must wait.
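The counter behavior described above can be sketched in Python as follows. This is a minimal model and not the `spinlock_t` implementation itself: the class and method names other than `get_spin_lock` are hypothetical, and a `threading.Lock` stands in for the atomic instructions a real spinlock would use.

```python
import threading


class SpinLock:
    """Counter-based spinlock model: 1 = free, 0 = held, negative = contended."""

    def __init__(self):
        self._value = 1
        # Assumption: this guard emulates atomic read-modify-write on the
        # counter; real spinlocks rely on hardware atomic instructions.
        self._atomic = threading.Lock()

    def try_get_spin_lock(self):
        """Decrement once; the caller holds the lock only if the result is 0."""
        with self._atomic:
            self._value -= 1
            return self._value == 0

    def get_spin_lock(self):
        """Busy-wait until the lock is acquired, consuming cycles meanwhile."""
        while not self.try_get_spin_lock():
            pass

    def release_spin_lock(self):
        """Restore the counter to 1 so that the next decrement yields 0."""
        with self._atomic:
            self._value = 1
```

A failed attempt leaves the counter negative, and `release_spin_lock()` resets it to 1, so exactly one of the waiting callers succeeds on its next decrement.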

[0055] Consider the case where threads on multiple CPU cores simultaneously request access to a shared memory space, for example, three CPU cores (core 0, core 1, and core 2). Figure 2 illustrates an optional thread synchronization flow for this case, with the following steps:

[0056] Step S1: core 0, core 1, and core 2 simultaneously call get_spin_lock();

[0057] Step S2: core 0 acquires the spinlock first, accesses the shared memory space M, and then releases the lock permission, restoring the value of the spinlock to 1;

[0058] Step S3: core 2 acquires the spinlock, executes its program, and releases the lock permission on completion;

[0059] Step S4: core 1 acquires the spinlock, executes its program, and releases the lock permission on completion.

[0060] In this process, which CPU core acquires the spinlock is random. In theory, when multiple CPU cores request the spinlock simultaneously, each core's chance of acquiring it is random, and the waiting time is uniformly distributed. In one specific implementation, however, when three CPU cores simultaneously request lock access, core 0 may acquire the spinlock first and execute its program. After core 0 releases the lock, core 1 acquires the spinlock. While core 1 is executing its program, core 0 requests the spinlock again. After core 1 releases the lock, core 0 acquires the spinlock again. In this sequence, core 2 continuously fails to acquire lock access.
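The starvation pattern in this scenario can be reproduced with a small deterministic model. The Python sketch below assumes a fixed-priority arbiter, in which the lowest-numbered waiting core always wins. That assumption stands in for whatever timing bias produces the behavior on real hardware, and it is not stated in the text. The function name `simulate_ungated` is hypothetical.

```python
def simulate_ungated(num_cores=3, grants=6):
    """Return the order in which cores are granted the lock.

    Every core requests the lock at the start. A core that has released
    the lock requests it again while the next holder is still executing.
    Assumption: among waiting cores, the lowest core id always wins.
    """
    waiting = set(range(num_cores))
    previous_holder = None
    order = []
    for _ in range(grants):
        holder = min(waiting)              # biased arbitration
        waiting.remove(holder)
        order.append(holder)
        if previous_holder is not None:
            waiting.add(previous_holder)   # earlier holder requests again
        previous_holder = holder
    return order
```

Under this model the grant order alternates between core 0 and core 1, and core 2 never appears, which matches the sequence described above.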

[0061] It can be seen that the lock allocation in the concurrent processing mechanism is unreasonable. Understandably, in a multi-core collaborative processing flow, the subtasks allocated to each CPU core must be processed synchronously to maximize collaborative processing efficiency. If the lock allocation is unreasonable in the collaborative processing flow, some CPU cores will be unable to acquire the corresponding lock privileges, causing their processing progress to be significantly slower than other CPU cores. This results in an uneven thread processing progress in the multi-core collaborative processing flow, ultimately affecting the task processing efficiency of the multi-core collaborative processing flow.

[0062] Furthermore, once a CPU core begins to request lock privileges, it will remain in a waiting state until it acquires the corresponding lock privileges. During the waiting period, it will frequently call get_spin_lock() to acquire a spinlock, thereby consuming a large number of CPU instruction cycles and further reducing the task processing efficiency of a single CPU core.

[0063] Based on this, the inventors believe that unreasonable allocation of lock permissions in concurrent processing mechanisms is an important reason for the poor task processing efficiency of multi-core collaborative processing processes and the poor task processing efficiency of a single CPU core.

[0064] Based on this, embodiments of the present invention provide a thread synchronization method and apparatus, a chip simulation method and platform, and related devices. The thread synchronization method, applied to a processor core in a multi-core processor, includes: acquiring lock state record information, wherein the lock state record information records the lock acquisition state of each processor core in the multi-core processor, the lock acquisition state including at least a first state and a second state, the first state indicating that the processor core is in the process of issuing a lock request to access the shared storage device of the multi-core processor, and the second state indicating that the processor core is not in the first state; after the lock state record information indicates that the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is the second state, a lock request is issued to request lock permission to access the shared storage device of the multi-core processor.

[0065] As can be seen, in this embodiment of the invention, lock state record information, which records the lock acquisition state of each processor core in the multi-core processor, is obtained before a lock request is issued. The lock request is issued only after the lock state record information indicates that every processor core was in the second state in the previous synchronization process, that is, that no processor core is still in the process of issuing a lock request to access the shared storage device of the multi-core processor. Lock permission to access the shared storage device is therefore requested only when no processor core remains in that stage of the previous synchronization process. This avoids requesting lock permission for the next synchronization process while some processor cores have not yet completed their access in the previous one. Furthermore, by controlling the issuance of lock requests, the allocation of lock permissions in the concurrent processing mechanism becomes more reasonable, thereby improving the task processing efficiency of the multi-core processor.

[0066] The thread synchronization method provided in this embodiment of the invention is described in detail below with reference to Figure 3, a flowchart of an optional thread synchronization method. The method can be executed on the computer architecture shown in Figure 1. It is applied to a processor core in a multi-core processor when a thread of that processor core needs to access the shared storage device of the multi-core processor, and includes:

[0067] Step S100: Obtain lock state record information.

[0068] A thread is an execution flow within a processor core and represents the smallest granularity that the processor can execute. In a multi-threaded processor core, a subtask of a cooperative task executed by the core can be broken down into multiple threads that execute concurrently. Within a single processor core, one or more threads may need to access the shared storage device of the multi-core processor. When a thread requests access to the shared storage device, the lock state record information can be acquired.

[0069] The lock state record information records the lock acquisition status of each processor core in a multi-core processor. In this embodiment of the invention, the lock state record information can be recorded in a lock state information table, so that the lock state record information can be obtained by retrieving the corresponding lock state information table. It is understood that by recording the lock acquisition status of each processor core in a multi-core processor, this embodiment of the invention can determine the stage of each processor core in the thread synchronization process, and thus can control the processor core to issue lock requests based on the lock state record information.

[0070] One processor core can correspond to one lock acquisition state or multiple lock acquisition states. For example, based on the premise that only one thread can acquire lock privileges at the same time, the current state of the processor core can be recorded as the lock acquisition state of the processor core. Alternatively, multiple lock acquisition states can be set according to the number of threads in the processor core that have access to shared storage devices.

[0071] The lock acquisition state includes at least a first state and a second state. The first state indicates that the processor core is in the process of issuing a lock request to access the shared storage device of the multi-core processor. It is understood that the process of the processor core issuing a lock request to access the shared storage device of the multi-core processor is the process of the processor core requesting lock permission and using the lock permission. During this process, the processor core is in the first state, indicating that the processor core has not yet completed its task of accessing the shared storage device.

[0072] The second state is used to indicate that the processor core is not in the first state. It can indicate that the processor core has completed the task of accessing the shared storage device that it needs to perform, or, in some alternative examples, it can also indicate that the processor core has not yet processed to the point where it needs to access the shared storage device.

[0073] Step S110: When the lock state record information indicates that the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is the second state, a lock request is issued to request lock permission to access the shared storage device of the multi-core processor.

[0074] When the lock state record information indicates that every processor core in the multi-core processor was in the second state in the previous synchronization process, no processor core is still in the process of issuing a lock request to access the shared storage device in that synchronization process. If the multi-core processor has used the method of this embodiment for thread synchronization since the start of the task, and each synchronization process confirms the state of the one before it, this indication means that every processor core has completed its part of the previous synchronization process. In other words, a processor core issues a lock request to acquire lock permission to access the shared storage device only after every processor core has completed its part of the previous synchronization process.
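Steps S100 and S110 amount to a check over the previous synchronization process followed by a state update. The following Python sketch shows one way to express that under simplifying assumptions: each synchronization process has a list holding one state per core, and a core marks itself as being in the first state when it is allowed to issue its request. The function names and the list layout are hypothetical.

```python
from enum import Enum


class LockState(Enum):
    FIRST = 1   # requesting or using lock permission
    SECOND = 2  # not in the first state


def new_table(num_cores):
    """Per-core states for one synchronization process, initially SECOND."""
    return [LockState.SECOND] * num_cores


def may_issue_lock_request(previous_table):
    """Step S110 condition: every core was in the second state."""
    return all(state is LockState.SECOND for state in previous_table)


def try_begin_access(previous_table, current_table, core):
    """Check the previous process (S100/S110); on success, mark this core
    as being in the first state for the current process."""
    if not may_issue_lock_request(previous_table):
        return False
    current_table[core] = LockState.FIRST
    return True


def finish_access(current_table, core):
    """Mark the core as done with its shared storage access."""
    current_table[core] = LockState.SECOND
```

While core 0 is still marked as being in the first state for one synchronization process, `try_begin_access` returns `False` for any core attempting the next one. Once `finish_access` has been called for core 0, the same call succeeds.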

[0075] Because lock requests are issued only after every processor core has completed its part of the previous synchronization process, the allocation of lock permissions in the multi-core processor becomes more reasonable. The processor cores' progress through the collaborative processing task can be kept at the same or a similar point, so thread progress in the multi-core collaborative processing flow is more balanced, which improves the efficiency of that flow.

[0076] In addition, because a processor core issues a lock request only after every processor core has completed its part of the previous synchronization process, no processor core is continuously unable to obtain lock permission. This avoids the heavy consumption of CPU instruction cycles that occurs when a processor core requests lock permission over and over, and improves the task processing efficiency of a single CPU core.
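The balancing effect can be illustrated with a small deterministic model. The Python sketch below assumes a biased arbiter in which the lowest-numbered eligible core always wins, and adds the gate: a core may request the lock for its next round only once every core has finished the round before it. The arbiter rule and the function name `simulate_gated` are assumptions made for this example.

```python
def simulate_gated(num_cores=3, rounds=2):
    """Return the grant order when lock requests are gated on the
    completion of the previous synchronization round by all cores.

    Assumption: among eligible cores, the lowest core id always wins.
    """
    finished = [0] * num_cores   # rounds completed by each core
    order = []
    while min(finished) < rounds:
        slowest = min(finished)
        # A core that has finished r rounds wants round r + 1; the gate
        # admits it only if all cores have finished round r.
        eligible = [c for c in range(num_cores) if finished[c] == slowest]
        holder = min(eligible)
        order.append(holder)
        finished[holder] += 1
    return order
```

Even with an arbiter that always favors core 0, the gate forces the grant order 0, 1, 2, 0, 1, 2, so every core completes each round before any core starts the next.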

[0077] It should be noted that the initial state of the lock state record information can be configured such that the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is the second state, which ensures that the multi-core processor proceeds smoothly at the start of the collaborative processing task. In addition, in an optional example of the present invention, the lock state record information can use multiple lock state information tables that cyclically record the lock acquisition state of each processor core in different synchronization processes, or it can use lock state information tables that correspond one-to-one with the synchronization processes, each recording the lock acquisition state of each processor core in its corresponding synchronization process.
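The cyclic-table option and the initial configuration can be sketched together. In the Python example below, the number of tables (two), the modulo indexing by synchronization index, and all class and method names are assumptions. The text specifies only that multiple tables are reused cyclically and that the initial state reads as the second state.

```python
from enum import Enum


class LockState(Enum):
    FIRST = 1   # requesting or using lock permission
    SECOND = 2  # not in the first state


class CyclicLockStateTables:
    """A small ring of per-core state tables reused across sync processes."""

    def __init__(self, num_cores, num_tables=2):
        # Initial configuration: every entry is SECOND, so the very first
        # synchronization process sees a "completed" previous process.
        self.tables = [
            [LockState.SECOND] * num_cores for _ in range(num_tables)
        ]

    def table_for(self, sync_index):
        """Table assigned to a synchronization process (cyclic reuse)."""
        return self.tables[sync_index % len(self.tables)]

    def previous_done(self, sync_index):
        """True if every core was in the second state in the previous process."""
        previous = self.table_for(sync_index - 1)
        return all(state is LockState.SECOND for state in previous)
```

For synchronization index 0, the "previous" table is simply the last table in the ring, which still holds its initial values, so `previous_done(0)` is true without any special case.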

[0078] In an optional example, as shown in Figure 4, another optional flowchart of the thread synchronization method, step S110 may specifically include:

[0079] Step S111: Based on the acquired lock state record information, confirm whether the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is the second state;

[0080] If yes, then execute step S112 to issue a lock request; if no, it indicates that there is a processor core in the multi-core processor whose lock acquisition state in the previous synchronization process is in the first state, then enter the loop process. The loop process repeatedly executes step S100 to obtain lock state record information and step S111 to confirm, based on the obtained lock state record information, whether the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is in the second state, until the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is in the second state, and then executes step S112 to issue a lock request.

[0081] It is understandable that the above steps, as reflected in the CPU core's processing flow, can be as follows: when a thread needs to access the shared storage device of the multi-core processor, the lock state record information is continuously refreshed until the lock state record information indicates that the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is the second state.

[0082] Meanwhile, in an optional example of the present invention, the processor core may also be configured with a timeout mechanism, which terminates the loop process when the above loop process times out, so as to avoid the processor core consuming too many instruction cycles in the waiting state.
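
The loop of steps S100, S111, and S112, together with the optional timeout, can be sketched as follows. This is an illustrative model only: the callbacks `read_states` and `issue_lock_request`, the 0/1 state encoding, and the timeout parameters are hypothetical names and values, not part of the patent.

```python
import time

SECOND_STATE = 1  # assumed encoding of the second state


def wait_then_request_lock(read_states, issue_lock_request,
                           timeout_s=1.0, poll_interval_s=0.0):
    """Poll the lock state record; issue the lock request once all cores
    are in the second state, or give up when the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        states = read_states()                      # step S100
        if all(s == SECOND_STATE for s in states):  # step S111
            issue_lock_request()                    # step S112
            return True
        if time.monotonic() >= deadline:            # timeout ends the loop
            return False
        time.sleep(poll_interval_s)
```

The function returns `False` on timeout so that the caller can decide how to proceed; the patent only states that the loop process is terminated.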

[0083] It is understood that the data structure of the lock state record information may differ depending on the thread type. Accordingly, in this embodiment of the invention, the lock state record information may include multiple pieces of record sub-information, wherein different record sub-information corresponds to different types of threads. The record sub-information may, for example, include first record sub-information, second record sub-information, and third record sub-information.

[0084] In an optional example, the lock state record information includes first record sub-information corresponding to a first type of thread, which is used to output data. Referring to the data structure of the first record sub-information shown in Figure 5, the first record sub-information includes one lock acquisition state for each processor core, and that lock acquisition state corresponds to the data output process of the processor core. For example, the lock acquisition state of processor core C0 can be located in bit 0 of the data structure, and the lock acquisition state of processor core C1 can be located in bit 1. A 64-bit data structure can thus record the lock acquisition states of the 64 processor cores C0 to C63.
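
The bit layout described above (core Cn in bit n of a 64-bit word) can be illustrated with a short Python sketch. The helper names are hypothetical, and treating a set bit as "completed" is an assumption for this example.

```python
NUM_CORES = 64
ALL_DONE = (1 << NUM_CORES) - 1  # all 64 bits set


def set_core_state(word, core, done):
    """Return the word with the bit for core Cn set (done) or cleared."""
    mask = 1 << core
    return (word | mask) if done else (word & ~mask)


def core_state(word, core):
    """Return the bit recorded for core Cn (0 or 1)."""
    return (word >> core) & 1
```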

[0085] In an optional example, the lock state record information includes second record sub-information corresponding to a second type of thread, which is used to exchange thread data between different processor cores. Referring to the data structure of the second record sub-information shown in Figure 6, the second record sub-information includes multiple lock acquisition states for each processor core, and different lock acquisition states correspond to the data exchange processes between that processor core and different processor cores. For example, the lock acquisition states of processor core C0 can be recorded in row 0 of the data structure, which records the lock acquisition states of the other processor cores exchanging data with processor core C0. For instance, C1_C0 can record the lock acquisition state of processor core C1 exchanging data with processor core C0. The lock acquisition states of processor core C1 can be recorded in row 1 of the data structure, and so on.
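
One possible way to locate a Cx_Cy entry is sketched below, assuming a row-major table in which row y holds the entries addressed to processor core Cy and each entry occupies 8 bytes. The core count, the entry size, and the function name are assumptions for illustration; the actual layout of Figure 6 may differ.

```python
NUM_CORES = 4    # hypothetical core count
SLOT_BYTES = 8   # assumed: one 64-bit entry per Cx_Cy


def slot_offset(x, y):
    """Byte offset of entry Cx_Cy: row y (receiving core), column x (sending core)."""
    return (y * NUM_CORES + x) * SLOT_BYTES
```

For example, under these assumptions C1_C0 lies in row 0 at offset 8, and C0_C1 lies in row 1 at offset 32.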

[0086] In an optional example, the lock state record information includes third record sub-information corresponding to a third type of thread, which is used to exchange thread data with a preset system. Referring to the structure of the third record sub-information shown in Figure 7, one thread in a processor core corresponds to one lock acquisition state, and that lock acquisition state corresponds to the process of the thread exchanging thread data with the preset system. Figure 7 takes processor core C2 as an example: in C2SV_n, C2 represents processor core C2 and SV_n represents the nth thread, where n can be an integer such as 0, 1, or 2. C2SV_n corresponds to the lock acquisition state of the nth thread of processor core C2, and so on.

[0087] In a further example, this embodiment of the invention also provides a chip simulation method, which employs the thread synchronization method provided in the above embodiments to execute the thread synchronization process. The following description takes the simulation of a target test instance as an example.

[0088] Figure 8 illustrates an optional structure of a target test instance. The target test instance can be a co-simulation model of SystemVerilog and C/ASM, executed within a verification environment provided by the UVM (Universal Verification Methodology) framework. Specifically, the target test instance (also called a testcase) comprises two parts: a first test instance, which can be a UVM testcase used to configure the DUT (Design Under Test) and control the simulation process; and a second test instance, which can be a compiled-language testcase, for example a test program written in C/ASM, compiled into a hexadecimal file, and stored in external DRAM (Dynamic Random Access Memory). During the simulation of the first test instance, the second test instance is loaded for simulation. Specifically, it can be loaded into the CPU core by bootcode after the DUT is configured.

[0089] Figure 9 shows the simulation flowchart of an optional second test instance. The simulation flow of the second test instance, namely the part of the test instance that runs on the CPU core, includes the following steps:

[0090] Step S201: Initialize the shared information for the second test instance;

[0091] The initialization mentioned here refers to the initialization of registers, as well as the initialization of table entries and data pages. In this step, the shared information is the shared information of the processor core, that is, information that can be used by all threads in the processor core. In the specific simulation flow, step S201 is also marked as c_start.

[0092] Step S202: Obtain the thread task of the second test instance;

[0093] By acquiring the thread task of the second test instance, the processor core can process data according to that thread task. Specifically, the thread task of the second test instance can be acquired by reading the current thread ID register to obtain the thread number. In the specific simulation process, step S202 is also marked as dispatch.

[0094] Step S203: Initialize the threads of the second test instance;

[0095] Specifically, for each thread, a corresponding initialization procedure is executed. These steps may include initializing the thread-specific memory space and initializing the stack. In the specific simulation process, step S203 is also labeled tn_init.

[0096] It should be noted that in step S203, the process of exchanging data between threads needs to be executed. Accordingly, when this step is running on different processor cores, it is necessary to exchange thread data between different processor cores based on shared memory devices. Based on this, this step includes a second type of thread. In this embodiment of the invention, a corresponding second record sub-information can be configured in the lock state record information, thereby recording the lock acquisition state corresponding to the process of data exchange between processor cores and different processor cores based on the second record sub-information.

[0097] In an optional example, the second record sub-information can be as shown in Figure 6. When a thread on processor core Cx exchanges data with a thread on processor core Cy, and the thread on processor core Cx writes data to the thread on processor core Cy, the thread on processor core Cx accesses the address in the shared memory device corresponding to Cx_Cy in the second record sub-information, writes the corresponding data, and records the corresponding lock acquisition state in the second record sub-information. The thread on processor core Cy can check the lock acquisition state in Cx_Cy in the second record sub-information to determine whether any information has been written; if so, it reads back the corresponding data.
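
The write-then-check exchange through Cx_Cy can be modeled as a simple mailbox. The sketch below is a single-process illustration under stated assumptions: the `EMPTY`/`WRITTEN` flags, the clearing of the flag after a read, and the names `send` and `try_receive` are all hypothetical, and the memory-ordering requirements of real hardware are ignored.

```python
EMPTY, WRITTEN = 0, 1   # assumed state encoding for a Cx_Cy entry
NUM_CORES = 4           # hypothetical core count

# data[y][x] and status[y][x] together play the role of entry Cx_Cy.
data = [[0] * NUM_CORES for _ in range(NUM_CORES)]
status = [[EMPTY] * NUM_CORES for _ in range(NUM_CORES)]


def send(x, y, value):
    """Thread on core Cx writes a value for core Cy, then records the state."""
    data[y][x] = value
    status[y][x] = WRITTEN  # flag is set only after the data is in place


def try_receive(x, y):
    """Thread on core Cy checks Cx_Cy; returns the value, or None if nothing was written."""
    if status[y][x] != WRITTEN:
        return None
    value = data[y][x]
    status[y][x] = EMPTY    # assumption: the reader clears the entry after reading
    return value
```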

[0098] It is understood that the exchanged data can be 32 bits or 64 bits wide, or its width can be set according to the requirements of the target test instance. For a related description, please refer to the description of the second record sub-information above, which is not repeated here.

[0099] Step S204: Synchronize the initialization results of the threads in the second test instance;

[0100] After the thread initialization of the second test instance is completed, the initialization results of the threads in the second test instance can be synchronized to output the initialization results of each thread. In the specific simulation process, step S204 is also marked as init_done.

[0101] Accordingly, this step requires executing a process of outputting data to the shared storage device. Consequently, when this step runs on different processor cores, it requires outputting data to the shared storage device. Therefore, this step includes a first type of thread. In this embodiment of the invention, a corresponding first record sub-information can be configured in the lock state record information, thereby recording the lock acquisition state corresponding to the process of the processor core outputting data to the shared storage device based on the first record sub-information.

[0102] In an optional example, the first record sub-information can be as shown in Figure 5, where 0 can be configured to indicate incomplete and 1 to indicate completed. Specifically, after a thread finishes execution, a 1 can be written to the corresponding Cn. After writing, polling can be used to determine whether all threads have completed synchronization, and thus whether to execute the subsequent program. For further details, please refer to the description of the first record sub-information above, which is not repeated here.
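
The "write 1 to Cn, then poll" scheme behaves like a barrier. The following sketch models it with Python threads standing in for processor cores; the core count, the use of a `threading.Lock` in place of an atomic bit-set, and all names are assumptions for illustration only.

```python
import threading
import time

NUM_CORES = 4                    # hypothetical core count
ALL_DONE = (1 << NUM_CORES) - 1  # every Cn bit set
done_bits = 0                    # bit n plays the role of Cn
_bits_lock = threading.Lock()    # stands in for an atomic bit-set
passed = []                      # cores that got past the barrier


def mark_done(core):
    """Write 1 to the Cn bit of the given core."""
    global done_bits
    with _bits_lock:
        done_bits |= 1 << core


def all_done():
    """True once every core has written its Cn bit."""
    return done_bits == ALL_DONE


def worker(core):
    mark_done(core)              # this core has finished its step
    while not all_done():        # poll until all cores have finished
        time.sleep(0)            # yield so other threads can run
    passed.append(core)          # continue with the subsequent program


def run_barrier():
    threads = [threading.Thread(target=worker, args=(n,)) for n in range(NUM_CORES)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(passed)
```

No worker gets past the polling loop until the last worker has set its bit, which is the synchronization behavior the paragraph describes.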

[0103] Step S205: Execute the main function of the thread of the second test instance;

[0104] Specifically, the main function can be executed based on the program information in the second test instance. In the specific simulation process, step S205 is also marked as main_n.

[0105] In an optional example, during the execution of the main function of the thread in the second test instance, a process of exchanging data between threads needs to be performed. Accordingly, when this step runs on different processor cores, it is necessary to exchange thread data between different processor cores based on shared memory. Therefore, this step includes a second type of thread. In this embodiment of the invention, a corresponding second record sub-information can be configured in the lock state record information, thereby recording the lock acquisition state corresponding to the process of data exchange between processor cores and different processor cores based on the second record sub-information.

[0106] In an optional example, the second record sub-information can be as shown in Figure 6. When a thread on processor core Cx exchanges data with a thread on processor core Cy, and the thread on processor core Cx writes data to the thread on processor core Cy, the thread on processor core Cx accesses the address corresponding to Cx_Cy in the second record sub-information and writes the corresponding data. The thread on processor core Cy can check whether any information has been written to Cx_Cy in the second record sub-information; if so, it reads back the corresponding data. Here, x and y can be integers such as 0, 1, 2, and 3.

[0107] Step S206: Output the simulation data of the second test instance;

[0108] After the main function of the thread in the second test instance finishes execution, the UVM platform can be notified to end the simulation and output the corresponding simulation data. In this embodiment of the invention, the simulation data of the second test instance is output to a preset system, such as the UVM platform.

[0109] In an optional example, the simulation data of the second test instance can first be synchronized based on the shared storage device, that is, the process of outputting data to the shared storage device is executed. Based on this, this step includes a first type of thread. In this embodiment of the invention, a corresponding first record sub-information can be configured in the lock state record information, thereby recording the lock acquisition state corresponding to the process of the processor core outputting data to the shared storage device based on the first record sub-information.

[0110] In an optional example, the first record sub-information can be as shown in Figure 5, where 0 can be configured to indicate incomplete and 1 to indicate completed. Specifically, after a thread finishes execution, a 1 can be written to the corresponding Cn. After writing, polling can be used to determine whether all threads have completed synchronization, and thus whether to execute the subsequent program. For further details, please refer to the description of the first record sub-information above, which is not repeated here.

[0111] Simultaneously, this step also requires exchanging thread data with a preset system to output simulation data to the preset system. When this step runs on different processor cores, it is necessary to exchange thread data with the preset system based on shared storage devices. Therefore, this step also includes a third type of thread. In this embodiment of the invention, a corresponding third record sub-information can be configured in the lock state record information, thereby recording the lock acquisition state corresponding to the process of exchanging thread data between the processor core and the preset system based on the third record sub-information.

[0112] In an optional example, the third record sub-information can be as shown in Figure 7. When a thread on processor core C2 exchanges thread data with the preset system, the thread can write information to the corresponding C2SV_n. This information can be completion information or completion status information. When the preset system (e.g., SystemVerilog) detects that information has been written to the third record sub-information, it reads the information and, based on the status information, determines whether the current thread behaves as expected or prints a log.
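
The reporting path through C2SV_n can be sketched as follows, with a Python dictionary standing in for the third record sub-information and a function standing in for the preset system's check. The status codes `PASS` and `FAIL`, the removal of an entry once it has been read, and the function names are hypothetical choices for this example.

```python
PASS, FAIL = 1, 2   # hypothetical completion status codes
slots = {}          # key (core, thread) plays the role of entry C<core>SV_<thread>


def report(core, thread, status):
    """A thread writes its completion status to its own entry."""
    slots[(core, thread)] = status


def monitor_check(core, thread, expected=PASS):
    """Preset-system side: read the entry if written and compare with the expectation.

    Returns None when nothing has been written yet, otherwise True or False.
    """
    status = slots.pop((core, thread), None)
    if status is None:
        return None
    ok = status == expected
    if not ok:
        print(f"C{core}SV_{thread}: unexpected status {status}")  # log print
    return ok
```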

[0113] It is understood that the exchanged data can be 32 bits or 64 bits wide, or its width can be set according to the requirements of the target test instance. For a related description, please refer to the description of the second record sub-information above, which is not repeated here.

[0114] It should be noted that the first test instance can also be used for simulation end data comparison, log printing, etc. after the second test instance simulation is completed. This invention does not impose specific limitations on this.
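
For orientation, steps S201 to S206 and the thread types each step involves can be summarized in a small lookup table. The table below only restates the description above; the tuple layout and the helper name are assumptions, and the label for step S206 is left as `None` because the description does not give one.

```python
# (step, label used in the simulation flow, thread types involved)
FLOW = [
    ("S201", "c_start", ()),                # initialize shared information
    ("S202", "dispatch", ()),               # obtain the thread task
    ("S203", "tn_init", ("second",)),       # initialize threads
    ("S204", "init_done", ("first",)),      # synchronize initialization results
    ("S205", "main_n", ("second",)),        # execute the main function
    ("S206", None, ("first", "third")),     # output simulation data
]


def sub_records_needed(flow):
    """Collect the record sub-information types that the flow must configure."""
    needed = set()
    for _step, _label, thread_types in flow:
        needed.update(thread_types)
    return sorted(needed)
```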

[0115] In this embodiment of the invention, Cn can occupy 1 bit, and Cx_Cy and C2SV_n can be 32 bits, 64 bits, or set according to the requirements of the test cases. In one optional example, the CPU is 64-bit, and correspondingly, 64 bits of data can be read back per instruction cycle. When Cn occupies 1 bit and Cx_Cy and C2SV_n are 32 bits or 64 bits, all the status information of Cn can be read back in one instruction cycle; or, the read and write operations of Cx_Cy or C2SV_n can be completed in one instruction cycle, thereby reducing the number of CPU instructions executed and improving simulation efficiency.
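
The saving from packing one bit per core can be checked with a short calculation. The sketch assumes that one read returns `word_bits` bits, as in the 64-bit example above; the function name is hypothetical.

```python
def loads_needed(num_cores, bits_per_state, word_bits=64):
    """Number of word-sized reads needed to fetch the states of all cores."""
    total_bits = num_cores * bits_per_state
    return -(-total_bits // word_bits)  # ceiling division
```

Under this assumption, 64 cores with 1-bit Cn entries need a single read, whereas storing each state in its own 64-bit word would need 64 reads.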

[0116] It should be further noted that the embodiments of the present invention are not limited to x86 architecture CPU systems, but can also be applied to architectures such as ARM, MIPS, and RISC-V. Furthermore, the embodiments of the present invention are not limited to single-die CPU system verification, but can also be applied to multi-die CPU system verification.

[0117] The thread synchronization apparatus provided in the embodiments of the present invention will be described below. The apparatus described below can be considered as the functional modules required by a computer device to implement the thread synchronization method provided in the embodiments of the present invention. The apparatus described below can be referred to in correspondence with the method described above.

[0118] Figure 10 shows an optional block diagram of a thread synchronization device provided in an embodiment of the present invention. As shown in Figure 10, the device may include:

[0119] The status information acquisition module 200 is used to acquire lock status record information. The lock status record information records the lock acquisition status of each processor core in the multi-core processor. The lock acquisition status includes at least a first status and a second status. The first status is used to indicate that the processor core is in the process of issuing a lock request to access the shared storage device of the multi-core processor. The second status is used to indicate that the processor core is not in the first status.

[0120] The request sending module 210 is used to send a lock request when the lock state record information indicates that the lock acquisition state of each processor core in the multi-core processor is the second state in the previous synchronization process, in order to request the lock permission to access the shared storage device of the multi-core processor.

[0121] Optionally, the request sending module 210 is configured to issue a lock request when the lock state record information indicates that the lock acquisition state of each processor core in the multi-core processor is the second state in the previous synchronization process, including:

[0122] Based on the acquired lock state record information, confirm whether the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is the second state.

[0123] If so, issue a lock request.

[0124] Optionally, if a processor core in the multi-core processor is in the first state of lock acquisition in the previous synchronization process, the steps of acquiring lock state record information and confirming whether the lock acquisition state of each processor core in the multi-core processor is in the second state in the previous synchronization process are repeated until the lock acquisition state of each processor core in the multi-core processor is in the second state in the previous synchronization process.

[0125] Optionally, if the loop process times out, the loop process is terminated.

[0126] Optionally, the lock state record information includes multiple record sub-information, wherein different record sub-information corresponds to different types of threads.

[0127] Optionally, the lock state recording information includes first record sub-information corresponding to a first type of thread, the first type of thread being used to output data to the shared storage device, and in the first record sub-information, a processor core corresponds to a lock acquisition state, the lock acquisition state corresponding to the process of the processor core outputting data to the shared storage device.

[0128] Optionally, the lock state record information includes second record sub-information corresponding to the second type of thread. The second type of thread is used to exchange thread data of different processor cores based on the shared storage device. In the second record sub-information, one processor core corresponds to multiple lock acquisition states, wherein different lock acquisition states correspond to the process of data exchange between the processor core and different processor cores.

[0129] Optionally, the lock state record information includes third record sub-information corresponding to a third type of thread. The third type of thread is used to exchange thread data with a preset system based on the shared storage device. In the third record sub-information, a thread in a processor core corresponds to a lock acquisition state, wherein the lock acquisition state corresponds to the process of the thread exchanging thread data with the preset system.

[0130] Optionally, the initial state of the lock state recording information is configured such that the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is the second state.

[0131] Optionally, the lock status record information is recorded in a lock status information table. Multiple lock status information tables cyclically record the lock acquisition status of each processor core in different synchronization processes. Alternatively, multiple lock status information tables correspond one-to-one with each synchronization process, recording the lock status record information of each processor core in the corresponding synchronization process.

[0132] In this embodiment of the invention, a simulation platform is also provided, wherein the chip simulation platform performs chip simulation using the chip simulation method provided in this embodiment of the invention.

[0133] In this embodiment of the invention, a processor core is also provided, wherein the processor core executes the thread synchronization process using the thread synchronization method provided in this embodiment of the invention.

[0134] In this embodiment of the invention, a computer device is also provided, including: at least one memory and at least one processor; the memory stores one or more computer-executable instructions, and the processor invokes the one or more computer-executable instructions to execute the thread synchronization method provided in this embodiment of the invention.

[0135] In this embodiment of the invention, a storage medium is also provided, which stores one or more computer-executable instructions for executing the thread synchronization method provided in this embodiment of the invention.

[0136] The foregoing describes multiple embodiments of the present invention. The optional methods described in each embodiment can be combined and cross-referenced without conflict, thereby extending to a variety of possible embodiments. These can all be considered as embodiments disclosed or made public by the present invention.

[0137] While the embodiments of the present invention have been disclosed above, the present invention is not limited thereto. Any person skilled in the art can make various modifications and alterations without departing from the spirit and scope of the present invention; therefore, the scope of protection of the present invention should be determined by the scope defined in the claims.

Claims

1. A thread synchronization method, characterized in that it is applied to a processor core in a multi-core processor and comprises: acquiring lock state record information, wherein the lock state record information records the lock acquisition state of each processor core in the multi-core processor, the lock acquisition state indicates the stage of each processor core in the thread synchronization process, and the lock acquisition state includes at least a first state and a second state; the first state indicates that the processor core is in the process of issuing a lock request to access the shared storage device of the multi-core processor; the second state indicates that the processor core is not in the first state, meaning that the processor core has completed the task of accessing the shared storage device that it needs to execute, or that the processor core has not yet reached the process of accessing the shared storage device; and when the lock state record information indicates that each processor core in the multi-core processor is in the second state in the previous synchronization process, issuing a lock request to request lock permission to access the shared storage device of the multi-core processor.

2. The thread synchronization method according to claim 1, characterized in that, When the lock state record information indicates that all processor cores in the multi-core processor were in the second state in the previous synchronization process, a lock request is issued, including: Based on the acquired lock state record information, confirm whether the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is the second state. If so, issue a lock request.

3. The thread synchronization method according to claim 2, characterized in that, If any processor core in the multi-core processor is in the first state of lock acquisition in the previous synchronization process, a loop process is entered. The loop process repeatedly executes the steps of acquiring lock state record information and confirming whether the lock acquisition state of each processor core in the multi-core processor is in the second state in the previous synchronization process based on the acquired lock state record information, until the lock acquisition state of each processor core in the multi-core processor is in the second state in the previous synchronization process.

4. The thread synchronization method according to claim 3, characterized in that, If the loop process times out, the loop process ends.

5. The thread synchronization method according to claim 1, characterized in that, The lock state record information includes multiple record sub-information, where different record sub-information corresponds to different types of threads.

6. The thread synchronization method according to claim 5, characterized in that, The lock state record information includes first record sub-information corresponding to a first type of thread. The first type of thread is used to output data to the shared storage device. In the first record sub-information, a processor core corresponds to a lock acquisition state. The lock acquisition state corresponds to the process of the processor core outputting data to the shared storage device.

7. The thread synchronization method according to claim 5, characterized in that, The lock state record information includes second record sub-information corresponding to the second type of thread. The second type of thread is used to exchange thread data of different processor cores based on the shared storage device. In the second record sub-information, one processor core corresponds to multiple lock acquisition states, wherein different lock acquisition states correspond to the process of data exchange between the processor core and different processor cores.

8. The thread synchronization method according to claim 5, characterized in that, The lock state record information includes third record sub-information corresponding to the third type of thread. The third type of thread is used to exchange thread data with the preset system based on the shared storage device. In the third record sub-information, one thread in one processor core corresponds to one lock acquisition state, wherein the lock acquisition state corresponds to the process of exchanging thread data with the preset system.

9. The thread synchronization method according to claim 1, characterized in that, The initial state configuration of the lock state record information is that the lock acquisition state of each processor core in the multi-core processor in the previous synchronization process is the second state.

10. The thread synchronization method according to claim 1, characterized in that, The lock status record information is recorded in a lock status information table. Multiple lock status information tables cyclically record the lock acquisition status of each processor core in different synchronization processes. Alternatively, multiple lock status information tables correspond one-to-one with each synchronization process, recording the lock status record information of each processor core in the corresponding synchronization process.

11. A chip simulation method, characterized in that, The chip simulation method uses the thread synchronization method described in claim 1 to execute the thread synchronization process.

12. The chip simulation method according to claim 11, characterized in that, The chip simulation method is used to simulate a target test instance, which includes a first test instance and a second test instance. The first test instance is a general verification methodology test instance, and the second test instance is a compiled language test instance. During the simulation of the first test instance, the second test instance is loaded for simulation. In the second test instance, the thread synchronization method described in claim 1 is used to synchronize the threads.

13. The chip simulation method according to claim 12, characterized in that the simulation process for the second test instance includes: initializing the shared information of the second test instance; obtaining the thread task of the second test instance; initializing the threads of the second test instance, wherein the initialization includes a second type of thread, and corresponding second record sub-information is configured in the lock state record information; synchronizing the initialization results of the threads of the second test instance, wherein the synchronization includes a first type of thread, and corresponding first record sub-information is configured in the lock state record information; executing the main function of the thread of the second test instance, wherein the execution of the main function includes a second type of thread, and corresponding second record sub-information is configured in the lock state record information; and outputting the simulation data of the second test instance, wherein the output includes a first type of thread and a second type of thread, and corresponding first record sub-information and second record sub-information are configured in the lock state record information.

14. A thread synchronization device, characterized in that it comprises: a status information acquisition module, used to acquire lock state record information, wherein the lock state record information records the lock acquisition state of each processor core in a multi-core processor, the lock acquisition state indicates the stage of each processor core in the multi-core processor in the thread synchronization process, and the lock acquisition state includes at least a first state and a second state; the first state indicates that the processor core is in the process of issuing a lock request to access the shared storage device of the multi-core processor; the second state indicates that the processor core is not in the first state, meaning that the processor core has completed the task of accessing the shared storage device that it needs to execute, or that the processor core has not yet reached the process of needing to access the shared storage device; and a request sending module, used to send a lock request when the lock state record information indicates that the lock acquisition state of each processor core in the multi-core processor is the second state in the previous synchronization process, in order to request lock permission to access the shared storage device of the multi-core processor.

15. A simulation platform, characterized in that, The simulation platform uses the chip simulation method described in any one of claims 11 to 13 to perform chip simulation.

16. A processor core, characterized in that, The processor core executes the thread synchronization process using the thread synchronization method described in any one of claims 1 to 10.

17. A computer device, characterized in that it comprises: at least one memory and at least one processor; the memory stores one or more computer-executable instructions, and the processor invokes the one or more computer-executable instructions to execute the thread synchronization method as described in any one of claims 1 to 10.

18. A storage medium, characterized in that, The storage medium stores one or more computer-executable instructions, which are used to execute the thread synchronization method as described in any one of claims 1 to 10.