Method, device, electronic equipment and readable storage medium for controlling multi-core processor

By introducing an independent storage unit and a master-core monitoring mechanism into a multi-core processor, the invention addresses the low resource utilization caused by cache coherence failures and enables cache repair and task recovery on the slave core, thereby improving the processor's operating efficiency.

CN122285282A · Pending · Publication Date: 2026-06-26 · LOONGSON TECH CORP

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications (China)
Current Assignee / Owner
LOONGSON TECH CORP
Filing Date
2026-03-31
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

In multi-core processors, a cache coherence failure prevents a core from accessing cached data properly, reducing processor efficiency. Existing approaches either confine the affected core to the Uncached region permanently or have it actively jump back to the Uncached region when a failure occurs, resulting in low resource utilization and high development costs.

Method used

An independent storage unit is added to the multi-core processor. The master core monitors the cache coherence status of the slave cores in real time. When a failure is detected, it saves the slave core's execution information and sends a reset command, upon which the slave core repairs its cache and enters a sleep state. After the repair, the slave core is woken up to continue executing its original task, which prevents the cache coherence problem from escalating.

Benefits of technology

This ensures that the slave core can continue to access cached data normally after its cache is repaired, improving processor efficiency and avoiding the resource waste and additional development costs caused by cache coherence failures.

✦ Generated by Eureka AI based on patent content.

Figure CN122285282A_ABST

Abstract

This invention provides a control method for a multi-core processor. The method includes: a master core monitoring a slave core in real time and, when the slave core is in a cache coherence failure state, saving the slave core's execution information and sending a reset command to the slave core; the slave core, in response to the reset command sent by the master core, repairing its cache unit according to the standby instructions in the storage unit and entering a sleep state after the repair is completed; and the slave core, in response to a wake-up command sent by the master core, resuming execution of its original task based on the execution information in the wake-up command. This invention ensures that the slave core can access and use cached data resources normally, improving processor operating efficiency. Furthermore, this solution eliminates the need to pre-define fallback paths to the Uncached area for various potential error scenarios, resulting in lower software development costs.

Description

Technical Field

[0001] This application relates to the field of computer technology, specifically to a control method, apparatus, electronic device, and computer-readable storage medium for a multi-core processor.

Background Technology

[0002] In a multi-core processor, data consistency must be maintained between each core's private cache and shared memory. Under this mechanism, the cached data of a given core may become inconsistent with the data in shared memory, which causes a cache coherence problem during the startup or operation of that core. If the cache continues to be used at this point, the coherence problem is amplified.

[0003] In related technologies, when a cache coherence failure is detected in a slave core, the slave core is often forced to run in the Uncached region (non-cached region) permanently. However, this restricts the slave core's normal access to and use of cached data resources, reducing the processor's operating efficiency.

Summary of the Invention

[0004] The present invention aims to provide a control method, apparatus, electronic device and computer-readable storage medium for a multi-core processor, at least to solve the problems in the prior art.

[0005] To solve the above-mentioned technical problems, the present invention is implemented as follows. In a first aspect, embodiments of the present invention provide a control method for a multi-core processor, the multi-core processor further comprising a storage unit, the method comprising:
The master core monitors the slave core in real time. When the slave core is in a cache coherence failure state, the master core saves the execution information of the slave core and sends a reset command to the slave core;
The slave core, in response to the reset command sent by the master core, repairs the cache unit of the slave core according to the standby instructions in the storage unit, and enters a sleep state after the repair is completed;
The slave core, in response to the wake-up command sent by the master core, resumes the execution of the original task in the slave core based on the execution information in the wake-up command.

[0006] In a second aspect, embodiments of the present invention also provide a control device for a multi-core processor, the multi-core processor further including a storage unit, the device comprising:
A storage module, used by the master core to monitor the slave core in real time and, when the slave core is in a cache coherence failure state, to save the execution information of the slave core and send a reset command to the slave core;
A repair module, used by the slave core, in response to the reset command sent by the master core, to repair the cache unit of the slave core according to the standby instructions in the storage unit and to enter a sleep state after the repair is completed;
A wake-up module, used by the slave core, in response to the wake-up command sent by the master core, to resume the execution of the original task in the slave core based on the execution information in the wake-up command.

[0007] In a third aspect, embodiments of the present invention also provide an electronic device, which includes a processor, a memory, and a program or instructions stored in the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the method described above.

[0008] In a fourth aspect, embodiments of the present invention also provide a computer-readable storage medium on which a program or instructions are stored, wherein the program or instructions, when executed by a processor, implement the steps of the method described above.

[0009] In summary, this embodiment establishes an independent, non-cached storage unit. Under the control of the master core, a slave core experiencing a cache coherence problem jumps into the storage unit and executes the standby instructions built into it. This allows the slave core to repair its cache unit and enter a sleep state, preventing the problem from escalating as it would if the slave core continued to use the cache. Subsequently, the master core wakes up the slave core with a wake-up command and has it continue executing its original task on the repaired cache unit, preserving the slave core's basic computational functions. In this embodiment, the slave core does not run in the uncached region permanently; instead, it goes back to using the cache for task processing once the cache is repaired. This ensures the slave core's normal access to and use of cached data resources, and eliminates the need to pre-set fallback paths to the uncached region for various potential error scenarios, thus improving processor efficiency.

Brief Description of the Drawings

[0010] Figure 1 is a flowchart illustrating the steps of a control method for a multi-core processor provided in an embodiment of the present invention;
Figure 2 is an architecture diagram of a control method for a multi-core processor provided in an embodiment of the present invention;
Figure 3 is a flowchart illustrating the implementation of a control method for a multi-core processor provided in an embodiment of the present invention;
Figure 4 is a block diagram of a control device for a multi-core processor provided in an embodiment of the present invention;
Figure 5 is a schematic diagram of the hardware structure of an electronic device provided in an embodiment of the present invention.

Detailed Description

[0011] The technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.

[0012] Figure 1 is a flowchart illustrating the steps of a control method for a multi-core processor according to an embodiment of the present invention. The method is applied to a multi-core processor that includes a master core, slave cores, and a storage unit. The storage unit is a non-cached unit that stores standby instructions. As shown in Figure 1, the method may include:
Step 101: The master core monitors the slave core in real time. When the slave core is in a cache coherence failure state, the master core saves the execution information of the slave core and sends a reset command to the slave core.

[0013] Step 102: In response to the reset command sent by the master core, the slave core repairs the cache unit of the slave core according to the standby command in the storage unit, and enters a hibernation state after the repair is completed.

[0014] In this embodiment of the invention, for steps 101-102, refer to Figure 2, which illustrates an architecture for the multi-core processor control method comprising multiple nodes. The nodes are interconnected via an inter-chip bus 10, which serves as a high-speed point-to-point communication link between different nodes. Each node can communicate with other nodes through its own inter-chip bus controller.

[0015] Multiple nodes constitute a multi-processor system architecture. Each node can be a NUMA node, which is a logical and physical partition within a multiprocessor system. A NUMA node is a relatively independent computing unit, typically containing a set of processor cores that share local hardware resources. All memory in a multi-processor system is physically distributed across the nodes. When a core within a NUMA node accesses the physical memory allocated to that same node, latency is low and bandwidth is high compared with remote access. When a core within a NUMA node accesses the memory of another NUMA node, it must do so through the inter-chip bus 10.

[0016] In one embodiment, a specific processor core can be configured in the node's firmware as the master core, for example the first physical core within the node (Core0). All other cores in the node are slave cores, which keeps the master core unique. After the node powers on, the hardware logic first enables only the master core to start running, while keeping the slave cores in a reset or waiting state. This physically avoids interference between cores and guarantees the uniqueness of the master core.

[0017] For example, when the master node powers on, its internal hardware state machine starts working. The state machine enables one core (Core0) to become the master core according to fixed rules (such as selecting the first physical core) and releases it from the reset state. Other cores (Core1, Core2, etc.) are either kept in the reset state or enter a sleep state waiting for an interrupt.

[0018] Furthermore, in a multi-core processor, each core is allocated its own cache unit, and physical memory also contains shared memory used for data synchronization between the cores. The operation of each core must satisfy the cache coherence requirement: the data in each core's private cache unit must be consistent with the data in shared memory. This requirement can be maintained automatically by hardware, through state machines and buses, based on a cache coherence protocol such as MESI (Modified, Exclusive, Shared, Invalid). When a core modifies data in shared memory, the change must be synchronized through the hardware protocol or through software so that the cache units of the other cores are updated. In other words, after one core modifies a piece of data in shared memory, any other core that later reads that data must see the modified value, not the old one.

[0019] Shared memory is a storage chip independent of the multi-core processor. All cores are connected to the memory controller via an on-chip interconnect bus, and the memory controller accesses this physical memory. In other words, shared memory is configured as a space accessible to all cores; if any core writes data to a physical address in shared memory, other cores can read the latest value from shared memory. A cache unit is a storage area inside the multi-core processor; specifically, it's a copy of the data in shared memory. When a core needs to access a memory address, the hardware loads a contiguous block of memory containing that address into the cache unit. The difference between shared memory and cache units is that cache units are integrated inside the multi-core processor, very close to the cores, and have extremely fast access speeds; data in one core's cache unit is not visible to other cores. Shared memory, on the other hand, is a memory chip located outside the multi-core processor, has slower access speeds, and its data is visible to all cores.

[0020] In multi-core processor environments, there exists an uncached region. This uncached region is a specific area in physical memory that provides space for data access and bypasses the caching system. In contrast, cache units are a mechanism introduced to improve data access speed: they temporarily store frequently accessed data in smaller, faster memory. Access to the uncached region is slower than access to cache units, but it avoids the data inconsistency issues that cache units can cause. Related technologies either keep a core running in the uncached region permanently once a cache coherence failure occurs, or have the core actively jump back to the uncached region when such a failure occurs. However, keeping a slave core in the uncached region permanently restricts its normal access to and use of cached data resources, reducing processor efficiency, while actively jumping back to the uncached region requires pre-setting fallback paths for various possible error scenarios, resulting in higher software development costs.

[0021] To address the aforementioned issues, this invention first adds an independent storage unit to the multi-core processor. This storage unit can be a One-Time Programmable (OTP) memory, a type of non-volatile memory where data cannot be modified after it is written. Relevant information can be entered into the OTP before it leaves the factory; after the OTP leaves the factory, the entered data can only be read and cannot be changed.

[0022] Furthermore, in this embodiment of the invention, standby instructions can be pre-set in the storage unit. When the master core detects a cache coherence failure in a slave core, the slave core is scheduled by software to run the standby instructions in the storage unit. This allows the slave core to first repair its own private cache unit and then enter a sleep state, waiting to be woken up to continue executing the original task, thereby avoiding the serious consequences of the cache coherence failure.

[0023] The standby instructions are a short, unmodifiable boot program pre-programmed into the storage unit. Their functions include:
1. Indicating the entry point, that is, the starting location of the program in the storage unit, so that a core automatically jumps to this starting address after a reset to perform the repair operations.
2. Identifying whether the current core is the master core or a slave core.
3. If the core is a slave core, executing a reset instruction that puts the slave core into a sleep state; if the core is the master core, jumping directly to the entry point of the normal system boot code (e.g., firmware in Flash) so that the master core executes the normal boot process.

[0024] In this step, when the master core detects that a slave core is in a cache coherence failure state, it can generate a reset command and send it to the slave core, and save the slave core's execution information to the non-cached area in memory. The execution information reflects the point the slave core has reached in its current task and is stored in a data structure. It includes the slave core's current execution context, such as register data, program counter, and stack pointer.

[0025] Assuming a 64-bit multi-core processor, each of its registers occupies 8 bytes. For core 1, the non-cached region used to store its execution context data is defined as 0x9000_1000 ~ 0x9000_11FF (512 bytes). An example of storing the current register value of core 1 would be: R0 = 0x00000000_00000001; R1 = 0x00000000_00000002; ... R31 = 0x00000000_0000001F; PC = 0x8000_5678 (address of the instruction where the original task was interrupted); SP = 0x9000_2000 (current stack top); Status register = 0x00000000_00000103 (indicating interrupt enabled, kernel mode, etc.).
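
The save area described above can be sketched as follows. This is only an illustration under assumptions not stated in the text: the field order (R0..R31, then PC, SP, status register), little-endian byte order, and the helper name `pack_context` are all hypothetical, and the register values are placeholders. The sketch shows that such a context occupies 280 bytes and therefore fits in the 512-byte region.

```python
import struct

# Assumed layout: 32 general registers, then PC, SP and the status register,
# each stored as a little-endian 64-bit value (8 bytes per register).
CONTEXT_FORMAT = "<35Q"
CONTEXT_BASE = 0x9000_1000   # first byte of core 1's non-cached save area
CONTEXT_LIMIT = 0x9000_11FF  # last byte of the save area


def pack_context(regs, pc, sp, status):
    """Serialize one core's execution context into bytes (hypothetical helper)."""
    if len(regs) != 32:
        raise ValueError("expected 32 general-purpose registers")
    return struct.pack(CONTEXT_FORMAT, *regs, pc, sp, status)


# Placeholder register values; PC, SP and status are taken from the example.
blob = pack_context(list(range(32)), 0x8000_5678, 0x9000_2000, 0x103)
region_size = CONTEXT_LIMIT - CONTEXT_BASE + 1
fits = len(blob) <= region_size
```

Keeping the layout in a single format string means the saving side and the restoring side cannot drift apart on field order or width.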

[0026] The master core can send a reset command to a slave core by writing to the reset control register. For example, the master core writes a reset flag into the reset control register. When the slave core reads the reset command from the reset control register, it parses the reset flag and immediately executes the reset operation. The reset operation includes actions such as stopping the currently executing task, stopping the pipeline, and clearing the hardware cache state. After the slave core completes the reset, its program counter is forcibly loaded with the entry address of the storage unit, so that it jumps into the storage unit and executes the standby instructions.

[0027] For example, suppose a multi-core processor has 4 cores, with core 0 designated as the master core and cores 1, 2, and 3 as slave cores. The physical address of the chip reset control register is 0x1FE0_0200. Bit [1] of this register corresponds to the reset control of core 1, bit [2] corresponds to core 2, and bit [3] corresponds to core 3.

[0028] Upon detecting the exception notification from Core 1, the master core writes the value 0x02 (binary 0010) to the reset control register. This sets the reset flag of Core 1 to 1, which constitutes the reset command and triggers a hardware reset of Core 1. The hardware reset vector of Core 1 is fixed at 0x1FC0_0000, the starting address of the One-Time Programmable (OTP) storage unit in which the standby instructions are pre-programmed. Therefore, after Core 1 resets, it automatically fetches and executes the standby instructions from this address.
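
As a minimal sketch of the register write described above, assuming the bit-per-core layout from the example (bit [n] resets core n): the `FakeBus` class is a stand-in for memory-mapped I/O and `reset_mask` is a hypothetical helper, not part of the described hardware.

```python
RESET_CTRL_ADDR = 0x1FE0_0200  # reset control register address from the example


def reset_mask(core_id, master_id=0, num_cores=4):
    """Return the value that sets only the reset bit of one slave core."""
    if core_id == master_id or not 0 <= core_id < num_cores:
        raise ValueError("reset target must be a valid slave core")
    return 1 << core_id


class FakeBus:
    """Stand-in for memory-mapped I/O; real hardware would see a register write."""

    def __init__(self):
        self.regs = {}

    def write(self, addr, value):
        self.regs[addr] = value


bus = FakeBus()
bus.write(RESET_CTRL_ADDR, reset_mask(1))  # master core resets Core 1
```

Rejecting the master core's own id in `reset_mask` reflects the requirement that only slave cores are reset through this path.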

[0029] During the execution of the standby instructions, Core 1 first reads its own core identifier, which is 1 (indicating a slave core). Based on this identifier, it jumps to the code segment in the storage unit specifically prepared for slave cores (i.e., the standby instructions at physical address 0x1FC0_0100). After Core 1 repairs its own cache unit, it enters a low-power sleep state. In this state, Core 1 does not access any cache and waits to be woken up with minimal power consumption.
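
The dispatch performed by the standby instructions can be sketched as below. This is a simplified model, not the actual OTP code: the action names are hypothetical labels, and the normal boot entry is left symbolic because the text gives no address for it.

```python
OTP_BASE = 0x1FC0_0000       # reset vector / start of the OTP storage unit
SLAVE_SEGMENT = 0x1FC0_0100  # slave-core code segment from the example


def standby_entry(core_id, master_id=0):
    """Return the ordered actions a core takes after landing at OTP_BASE."""
    if core_id == master_id:
        # The master core leaves the standby code and boots normally.
        return [("jump", "normal_boot_entry")]
    # A slave core runs the slave segment: repair first, then sleep.
    return [
        ("jump", SLAVE_SEGMENT),
        ("repair_private_cache", None),
        ("enter_sleep", None),
    ]


slave_actions = standby_entry(1)
master_actions = standby_entry(0)
```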

[0030] In some embodiments, the slave core can perform the repair operations on its own private cache unit to eliminate errors caused by cache coherence failures.

[0031] In other embodiments, the master core can also perform repair operations on the cache units of the slave core to eliminate errors caused by cache inconsistency failure. After repair, the master core can notify the slave core that the repair is complete.

[0032] The aforementioned repair operations include performing a cache invalidation operation, which resets the core cache state machine to its initial state. Specific operations include: flushing cache lines in the cache unit, reinitializing the cache unit's controller, and resetting the cache unit's consistency protocol state machine.

[0033] In this embodiment of the invention, the repair of the slave core's cache unit requires some storage space. This storage space can be placed in the uncached address space, which ensures that the repair operation itself does not cause new cache coherence issues. It can be reserved in advance during system startup and is not affected by the operating system's memory management, so the repair process can be executed reliably at any stage. Alternatively, the repair process can also be executed using the One-Time Programmable (OTP) storage unit, which is likewise unaffected by interference from the slave core.

[0034] Furthermore, the repair operation aims to completely clear the inconsistency in the slave core's cache unit and restore it to a usable state. This is achieved by writing a specific trigger command (such as 0x01) to the flush control register in the slave core's cache controller, which initiates a global flush of the slave core's private cache unit. The hardware traverses all cache lines, writes lines marked as modified (dirty) back to memory so that no data is lost, and then invalidates these lines, thereby clearing the slave core's cache and ensuring all data is consistent with memory.

[0035] Reinitializing the cache controller can be achieved by writing a predefined initialization sequence (such as 0xA5A5) to the controller initialization register of the slave core's cache controller. This restores the cache controller's configuration registers, status registers, etc., to their default values, eliminating faults caused by misconfiguration or abnormal states, and returns the cache controller hardware logic to a known initial state, ready to be re-enabled.

[0036] Resetting the cache consistency protocol state machine can be achieved by writing a specific reset command (such as 0x5A5A) to the state machine reset register of the cache controller in the slave core. This restores the distributed state machine of the cache consistency protocol to its initial state, clears any possible illegal states or deadlocks, and resets the cache consistency capability of the slave core, allowing it to participate in consistency transactions normally in the future.

[0037] For example, when flushing the cache lines in a cache unit, the value 0x01 is written to the flush control register of the slave core to trigger a global flush of its private cache unit. Then, the status register is read repeatedly to check bit [0]. After several clock cycles, bit [0] of the status register becomes 0, indicating that the flush is complete.

[0038] During the reinitialization of the cache controller, the value 0xA5A5 is written to the initialization register corresponding to the cache controller, triggering the cache controller to reinitialize. By polling the status register corresponding to the cache controller, the system waits for bit [0] to change from 1 to 0 to confirm that the cache controller initialization is complete.

[0039] During the process of resetting the cache coherence protocol state machine, the value 0x5A5A is written to the reset register corresponding to the state machine to trigger the cache coherence protocol state machine reset. By polling the status register corresponding to the cache coherence protocol state machine, the system waits for bit [0] to change from 1 to 0 to confirm that the cache coherence protocol state machine reset is complete.
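
The three write-then-poll steps above follow one pattern, sketched here against a toy controller. The register names (`flush_ctrl`, `ctrl_init`, `sm_reset`), the `FakeCacheController` model, and the poll limit are assumptions for illustration; only the command values 0x01, 0xA5A5, 0x5A5A and the "bit [0] goes from 1 to 0" completion rule come from the text.

```python
class FakeCacheController:
    """Toy model: each command sets busy bit [0], which clears after a few polls."""

    def __init__(self, busy_polls=3):
        self.busy_polls = busy_polls
        self._remaining = 0
        self.log = []

    def write(self, register, value):
        self.log.append((register, value))
        self._remaining = self.busy_polls

    def read_status(self):
        status = 1 if self._remaining > 0 else 0
        if self._remaining > 0:
            self._remaining -= 1
        return status


def run_step(ctrl, register, command, max_polls=1000):
    """Write a command, then poll status bit [0] until it reads 0 or we give up."""
    ctrl.write(register, command)
    for _ in range(max_polls):
        if (ctrl.read_status() & 1) == 0:
            return True
    return False  # timed out while the controller was still busy


def repair_cache(ctrl):
    """Flush, reinitialize the controller, then reset the coherence state machine."""
    steps = [("flush_ctrl", 0x01), ("ctrl_init", 0xA5A5), ("sm_reset", 0x5A5A)]
    return all(run_step(ctrl, reg, cmd) for reg, cmd in steps)


ctrl = FakeCacheController()
repaired = repair_cache(ctrl)
```

The poll limit is a design choice of this sketch: a cache in a bad state might never clear its busy bit, so an unbounded loop would hang the slave core in the repair path.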

[0040] It should be noted that in some embodiments, the slave core can compare the first data in its private cache unit with the second data corresponding to the first data in the shared memory to obtain the comparison result. If the comparison result is inconsistent, it is determined that the slave core has a cache consistency failure problem. At this time, the slave core is in an abnormal state, and in the subsequent exception handling triggered by the abnormal state, it sends a "cache abnormality" exception notification message to the master core, so that the master core can detect that the slave core is in a cache consistency failure state.

[0041] In other embodiments, the master core can also identify slave cores in a cache coherence failure state through hardware or software means. Hardware means include: reading, from the error log register, the register value that reflects the cache coherence failure state and identifying the affected slave core from that value; and recognizing a timeout interrupt from the bus controller, which is generated when the bus controller receives no response for an extended period after a slave core initiates a coherence request. Software means include: the master core reading software error information generated by the cache coherence failure and then locating the slave core in the failure state. The error log registers are typically organized in groups, with each core having its own independent set. Each bit in the register group corresponds to one core; if a bit is set to 1, the corresponding core's error log register contains valid error information awaiting processing. A core's error log register group typically includes:
Error status register: Records the error type.

[0042] Error address register: Records the physical address that caused the error.

[0043] Error information register: Records other auxiliary information when an error occurs (such as transaction type, cache line status, etc.).

[0044] For example, a multi-core processor can have a built-in set of machine check banks to record hardware error events. When a cache coherence protocol error occurs in a slave core, the hardware sets the corresponding error type flag in the error status register (e.g., bit [3] set to 1 indicates a coherence protocol deadlock), writes the physical address accessed by the offending transaction to the error address register, and writes auxiliary information, such as the transaction type and the current state of the cache line involved, to the error information register. Error information is thus written to the slave core's error log registers automatically. For instance, bit [3] of the error status register at 0x1FE0_0340 (the register for core 2) is set to 1, indicating that core 2 has a cache coherence protocol problem. The master core reads 0x1FE0_0340, finds that bit [3] is 1, and confirms that the error type of core 2 is a cache coherence failure.
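
A minimal sketch of the master core's check, assuming one error status register per slave core. Only the core 2 address (0x1FE0_0340) and the meaning of bit [3] come from the example; the core 1 address and the dictionary standing in for hardware registers are hypothetical.

```python
COHERENCE_BIT = 3  # bit [3] of the error status register, per the example

ERR_STATUS_ADDRS = {
    1: 0x1FE0_0330,  # hypothetical address for core 1
    2: 0x1FE0_0340,  # core 2, from the example
}


def has_coherence_failure(status_value):
    """True if the coherence-error flag is set in an error status value."""
    return bool((status_value >> COHERENCE_BIT) & 1)


def find_failed_cores(read_reg, status_addrs):
    """Return ids of cores whose error status register has bit [3] set."""
    return [
        core
        for core, addr in sorted(status_addrs.items())
        if has_coherence_failure(read_reg(addr))
    ]


# Simulated register contents: core 1 is healthy, core 2 reports bit [3].
fake_regs = {0x1FE0_0330: 0x00, 0x1FE0_0340: 0x08}
failed = find_failed_cores(fake_regs.__getitem__, ERR_STATUS_ADDRS)
```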

[0045] As an example of the software means, the master core reads software error information recorded by the operating system. Suppose that when core 2 encountered a cache coherence protocol error, it wrote a soft error record for the failed consistency check to a log buffer at address 0x9000_2000. The master core can read the contents of this buffer at 0x9000_2000 to further confirm whether core 2 is a slave core that needs to be recovered. If so, a reset command is sent to core 2; otherwise, the record is ignored.

[0046] Step 103: In response to the wake-up command sent by the master core, the slave core resumes execution of the original task in the slave core based on the execution information in the wake-up command.

[0047] In this embodiment of the invention, the master core sends a reset command to the slave core, causing the slave core to repair its private cache unit and enter a sleep state. After completing the repair operation on the slave core's cache unit, the error caused by cache inconsistency failure is eliminated, and the slave core can subsequently resume using the cache unit. The master core can then send a wake-up command to the slave core. Since the master core has pre-stored the slave core's execution context data, the wake-up command controls the slave core to exit the sleep state and resumes execution from the interrupt point of the original task based on the execution context data indicated by the wake-up command. This process restores the slave core's state to the moment before the inconsistency error occurred, enabling it to seamlessly continue executing the original task and ensuring the smooth execution of tasks by the slave core. This mechanism provided by this embodiment of the invention is effective not only in the firmware stage but also reliably operates in the kernel stage.
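
The overall sequence of steps 101 to 103 can be summarized in a small simulation. All class and method names here are hypothetical, and the model compresses hardware behavior (reset, repair, sleep, wake-up) into plain method calls; it is meant only to show the order of events and what state is carried across them.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    FAILED = "coherence_failure"
    SLEEPING = "sleeping"


class SlaveCore:
    def __init__(self, core_id, pc):
        self.core_id = core_id
        self.pc = pc
        self.state = State.RUNNING
        self.cache_ok = True

    def on_reset(self):
        # Step 102: run the standby instructions, repair the cache, then sleep.
        self.pc = None
        self.cache_ok = True
        self.state = State.SLEEPING

    def on_wake(self, context):
        # Step 103: restore the saved context and resume the original task.
        self.pc = context["pc"]
        self.state = State.RUNNING


class MasterCore:
    def __init__(self):
        self.saved = {}  # stands in for the non-cached save area

    def recover(self, slave):
        if slave.state is not State.FAILED:
            return False
        self.saved[slave.core_id] = {"pc": slave.pc}  # step 101: save context
        slave.on_reset()                               # step 101: send reset
        slave.on_wake(self.saved[slave.core_id])       # wake-up after repair
        return True


slave = SlaveCore(core_id=1, pc=0x8000_5678)
slave.state, slave.cache_ok = State.FAILED, False
recovered = MasterCore().recover(slave)
```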

[0048] For example, assume the execution context of core 1 is stored at memory address 0x9000_1000, and that this execution context, which includes core 1's register data, program counter, stack pointer, etc., from before the task was interrupted, is stored as a structure in the non-cached area of memory. The wake-up command can then include the memory address 0x9000_1000. Core 1 reads the execution context from memory at this address and sets its own registers, program counter, stack pointer, etc., accordingly. For example, the context data might include the following register values of core 1: R0 = 0x00000000_00000001; R1 = 0x00000000_00000002; ... R31 = 0x00000000_0000001F; PC = 0x8000_5678 (address of the instruction where the original task was interrupted); SP = 0x9000_2000 (current stack top); Status register = 0x00000000_00000103 (indicating interrupt enabled, kernel mode, etc.).

[0049] Core 1 then writes the extracted context data, including the register values, into the corresponding registers to complete the restoration of the original task's register data.

[0050] Since memory address 0x8000_5678 is the location of the instruction at which the original task was interrupted, core 1 can continue executing the original task from the instruction at address 0x8000_5678.
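
The restore path can be sketched as the inverse of the save step. As before, the field order (R0..R31, PC, SP, status), little-endian encoding, the dictionary used as "memory", and the function names are assumptions for illustration; the register values packed here are placeholders.

```python
import struct

CONTEXT_FORMAT = "<35Q"  # assumed layout: R0..R31, PC, SP, status register


def unpack_context(blob):
    """Decode a saved execution context into named fields."""
    values = struct.unpack_from(CONTEXT_FORMAT, blob)
    return {
        "regs": list(values[:32]),
        "pc": values[32],
        "sp": values[33],
        "status": values[34],
    }


def wake(core_state, memory, context_addr):
    """Apply the context named by the wake-up command; return the resume address."""
    core_state.update(unpack_context(memory[context_addr]), sleeping=False)
    return core_state["pc"]


# Simulated non-cached memory holding core 1's saved context.
memory = {
    0x9000_1000: struct.pack(
        CONTEXT_FORMAT, *range(32), 0x8000_5678, 0x9000_2000, 0x103
    )
}
core1 = {"sleeping": True}
resume_pc = wake(core1, memory, 0x9000_1000)
```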

[0051] Optionally, the reset command includes an entry address, a first code segment size, and a first checksum; the entry address indicates the storage location of the standby instructions in the storage unit. Step 102 may specifically include:
Sub-step 1021: Load the data in the storage unit according to the entry address and the size of the first code segment.

[0052] Sub-step 1022: Calculate the second checksum based on the loaded data.

[0053] Sub-step 1023: If the second checksum matches the first checksum, execute the standby instruction in the storage unit.

[0054] In this embodiment of the invention, for sub-steps 1021-1023, the format of the reset command can be set as three fields in order: entry address, first code segment size, first checksum. The entry address can be an address pointing into the storage unit; it is a non-cached address. This entry address can also be changed to another non-cached address (a memory address, another device address, etc.). To solve the slave core's cache coherence problem, the entry address is preferably the address of the storage unit (OTP). The first code segment size specifies how much data is to be loaded from the OTP for verification, and the first checksum is the result obtained by applying a preset verification algorithm (such as an accumulation algorithm) to the correctly loaded data.

[0055] Specifically, a checksum is a value used in data processing and data communication, obtained by summing a set of data items for verification purposes. It is usually written in hexadecimal notation. The checksum represents the accumulation of the transmitted bits; when transmission ends, the receiver can use this value to determine whether all data has been received correctly, thus verifying data integrity.

[0056] Furthermore, upon receiving a reset command from the core, a security verification is first performed to ensure access security. This security verification involves loading verification data from the storage unit based on the OTP entry address and the size of the first code segment, and calculating a second checksum based on the loaded data. If the second checksum matches the first checksum in the reset command, the loaded verification data is confirmed to be correct (i.e., the loaded verification data is a standby command). Subsequently, the loaded verification data, i.e., the standby command, can be executed to allow the core to repair its private cache unit and enter a sleep state after repair. If the second checksum does not match the first checksum in the reset command, the loaded verification data is confirmed to be incorrect, and execution stops.

[0057] For example, in a multi-core processor: initially, core 0 is the master core and core 1 is the slave core. The address of the OTP memory unit is 0x1FC0_0000; the physical address of the standby instruction stored in the OTP is 0x1FC0_0200, the instruction length is 128 bytes, and the checksum pre-calculated for the standby instruction is 0xA5A5_5A5A.

[0058] The reset instruction received by Core 1 includes: 0xCAFEBABE (to identify that the instruction is a reset instruction), entry address: 0x1FC0_0200 (indicating the position of the standby instruction in OTP), first code segment size: 128 (bytes), and first checksum: 0xA5A5_5A5A.

[0059] First, Core 1 reads 128 bytes of data (i.e., standby instruction code segment A) from OTP byte by byte based on the entry address 0x1FC0_0200 and the code segment size 128, and calculates the second checksum (e.g., cumulative sum) corresponding to the 128 bytes of data in real time, obtaining the value 0xA5A5_5A5A.

[0060] The calculated second checksum is compared with the read first checksum 0xA5A5_5A5A. If they match, the checksum passes, indicating that the reset instruction matches the standby instruction in the memory unit. The jump instruction is then executed, jumping to the entry address of the reset instruction 0x1FC0_0200, and the standby instruction in the memory unit begins to be executed. After the standby instruction is executed, core 1 repairs its private cache unit and enters a hibernation state after the cache unit is repaired.
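For illustration only, sub-steps 1021-1023 can be modelled in software as shown below. This is a minimal sketch, not the patented implementation: the class names `ResetInstruction` and `Otp`, the function names, and the choice of a 32-bit wrap-around sum over little-endian 4-byte words are assumptions (the text only says "an accumulation algorithm"). The identifier 0xCAFEBABE and the addresses come from the example above.

```python
from dataclasses import dataclass

RESET_MAGIC = 0xCAFEBABE  # identifier value taken from the example above


def additive_checksum(data: bytes) -> int:
    """Sum the data as little-endian 32-bit words, wrapping at 32 bits (assumed algorithm)."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    total = 0
    for offset in range(0, len(data), 4):
        word = int.from_bytes(data[offset:offset + 4], "little")
        total = (total + word) & 0xFFFFFFFF
    return total


@dataclass(frozen=True)
class ResetInstruction:
    """Hypothetical container for the fields of the reset instruction."""
    magic: int
    entry_address: int
    segment_size: int
    checksum: int          # the "first checksum"


class Otp:
    """Toy model of the non-cached storage unit: a base address plus read-only bytes."""

    def __init__(self, base: int, contents: bytes) -> None:
        self.base = base
        self._contents = contents

    def load(self, address: int, size: int) -> bytes:
        start = address - self.base
        if start < 0 or start + size > len(self._contents):
            raise ValueError("read outside the storage unit")
        return self._contents[start:start + size]


def verify_reset(instr: ResetInstruction, otp: Otp) -> bool:
    """Return True only if the standby code may be executed."""
    if instr.magic != RESET_MAGIC:
        return False
    data = otp.load(instr.entry_address, instr.segment_size)  # sub-step 1021
    second_checksum = additive_checksum(data)                  # sub-step 1022
    return second_checksum == instr.checksum                   # sub-step 1023
```

In this model the slave core would jump to `entry_address` only when `verify_reset` returns `True`; on a mismatch it stops, as described in paragraph [0056].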

[0061] Optionally, step 102 may further include: Sub-step 1024: Perform a cache invalidation operation on the slave core and reset the slave core's cache state machine to its initial state.

[0062] In this embodiment of the invention, the slave core, in response to a standby command, can perform a repair operation on the cache unit. Specifically, this includes initiating a cache invalidation operation on the slave core and resetting the slave core's cache state machine to its initial state. The purpose of this is to clear dirty data that causes cache inconsistency failure. That is, after a cache inconsistency error occurs, the slave core's private cache may contain a large amount of dirty data (cache lines that have been modified but not written back to memory) or invalid data (cache lines inconsistent with memory). If this data is not cleaned up, it will continue to occupy cache space, causing the cache inconsistency problem to persist. Furthermore, by resetting the state machine to its initial state, it can be ensured that when the slave core is subsequently woken up, its cache unit is in an empty data state. During the slave core's sleep period, its cache unit is cleared and the cache state machine is reset, which is equivalent to the slave core being completely decoupled from the cache system, preventing the cache inconsistency problem from escalating due to the slave core continuing to use the cache unit.

[0063] Specifically, a cache invalidation operation can be triggered by issuing a command to the cache controller, which marks the slave core's private cache unit as invalid. This can be achieved by writing to a specific cache control register; the hardware then iterates through all cache lines and clears their valid bits. Completion of the cache invalidation is confirmed by reading the cache status register. Similarly, a command can be issued to the cache controller to reset the slave core's cache coherence protocol state machine to its initial state.

[0064] For example, in a multi-core processor, core 0 is the master core, and core 1 is the slave core. The cache control registers of core 1 include: o Cache invalidation trigger register: Writing 0x01 triggers an invalidation operation on all private cache units in core 1.

[0065] o Cache invalid status register: bit [0] is 1 to indicate that an invalid operation is in progress, and 0 to indicate that the operation is complete.

[0066] o State machine reset trigger register: Writing 0x02 triggers the cache coherence protocol state machine reset of core 1.

[0067] o State machine reset state register: bit [0] is 1 to indicate that the reset is in progress, and 0 to indicate that the operation is complete.

[0068] Then, writing 0x01 to the cache invalidation trigger register triggers the invalidation operation on the cache unit of core 1, and the invalidation is confirmed to be complete by checking that bit [0] of the cache invalidation status register is 0. Writing 0x02 to the state machine reset trigger register triggers the reset of the cache coherence protocol state machine of core 1, and the reset is confirmed to be complete by checking that bit [0] of the state machine reset status register is 0.
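The trigger-then-poll sequence of paragraphs [0064]-[0068] can be sketched as follows. The sketch is only a software model under stated assumptions: `FakeCacheController` is a hypothetical stand-in for the memory-mapped registers of core 1, the number of busy polls and the 8-line cache are invented for the model, and only the command values 0x01 and 0x02 and the meaning of bit [0] come from the text.

```python
INVALIDATE_CMD = 0x01   # value written to the cache invalidation trigger register
SM_RESET_CMD = 0x02     # value written to the state machine reset trigger register
BUSY_BIT = 0x1          # bit [0] of each status register: 1 = in progress, 0 = complete


class FakeCacheController:
    """Hypothetical model of core 1's cache control registers."""

    def __init__(self, busy_polls: int = 2) -> None:
        self.lines_valid = [True] * 8   # toy private cache with 8 lines
        self.state = "CORRUPT"          # coherence state machine before repair
        self._busy_polls = busy_polls
        self._invalidate_busy = 0
        self._reset_busy = 0

    def write_invalidate_trigger(self, value: int) -> None:
        if value == INVALIDATE_CMD:
            self.lines_valid = [False] * len(self.lines_valid)
            self._invalidate_busy = self._busy_polls

    def read_invalidate_status(self) -> int:
        if self._invalidate_busy:
            self._invalidate_busy -= 1
            return BUSY_BIT
        return 0

    def write_sm_reset_trigger(self, value: int) -> None:
        if value == SM_RESET_CMD:
            self.state = "INITIAL"
            self._reset_busy = self._busy_polls

    def read_sm_reset_status(self) -> int:
        if self._reset_busy:
            self._reset_busy -= 1
            return BUSY_BIT
        return 0


def wait_until_idle(read_status, max_polls: int = 1000) -> None:
    """Poll a status register until bit [0] reads 0, or give up."""
    for _ in range(max_polls):
        if (read_status() & BUSY_BIT) == 0:
            return
    raise TimeoutError("cache controller stayed busy")


def repair_cache(ctrl: FakeCacheController) -> None:
    """Invalidate the private cache, then reset the coherence state machine."""
    ctrl.write_invalidate_trigger(INVALIDATE_CMD)
    wait_until_idle(ctrl.read_invalidate_status)
    ctrl.write_sm_reset_trigger(SM_RESET_CMD)
    wait_until_idle(ctrl.read_sm_reset_status)
```

The bounded poll loop is a design choice of the sketch: real standby code would likewise need some way to avoid spinning forever if the controller never reports completion.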

[0069] Optionally, the wake-up instruction includes: the context address of the slave core, the size of the second code segment, and the third checksum; the context address is used to indicate the storage location of the context data of the slave core; step 103 may specifically include: Sub-step 1031: Load the context data according to the context address and the size of the second code segment.

[0070] Sub-step 1032: Calculate the fourth checksum based on the loaded context data.

[0071] Sub-step 1033: If the fourth checksum matches the third checksum, continue executing the original task based on the context data, so that the slave core resumes operation.

[0072] In this embodiment of the invention, regarding sub-steps 1031-1033, when a cache consistency failure occurs in the slave core, the slave core stops executing its task. The master core can store the slave core's execution context data in a non-cached area of memory, that is, in a memory segment that is not cached. After the slave core's cache unit has been repaired, the master core sends a wake-up command to the dormant slave core. The wake-up command triggers the slave core to exit the dormant state, retrieve the context data from the memory segment, and continue executing the original task from the original interruption point.

[0073] Specifically, the wake-up instruction includes: the context address of the slave core, the size of the second code segment, and the third checksum. The context address is the memory address at which the context data is stored, the size of the second code segment is the size of the context data to be loaded from memory, and the third checksum is the result of applying a preset check algorithm (such as an accumulation algorithm) to the correct context data.

[0074] When the slave core receives a wake-up command, it first performs a security check to ensure access security. The slave core loads the context data from memory based on the context address and the size of the second code segment, and then calculates a fourth checksum from the loaded context data. If the fourth checksum matches the third checksum in the wake-up command, the loaded context data is confirmed to be correct, and the context data can be parsed to obtain the interruption point. Execution of the original task can then resume from the original interruption point, ensuring that the slave core's task runs normally. Since the slave core's cache unit has been repaired, the slave core can use its cache unit while resuming the original task. If the third checksum and the fourth checksum do not match, the loaded context data is incorrect, and execution stops.

[0075] For example, in a multi-core processor, core 0 is the master core and core 1 is the slave core. Core 1 was previously reset by core 0 due to a cache coherence error, and has executed the standby instruction in OTP to complete cache repair and enter hibernation mode. Core 0 pre-stores the context data of core 1 in memory.

[0076] The storage address for the context data is: 0x9000_1000~0x9000_11FF.

[0077] Furthermore, Core 0 pre-calculates the third checksum for the context data using an accumulation algorithm, resulting in 0x8765_4321. Therefore, the wake-up instruction sent from Core 0 to Core 1 includes: 0xCAFE_BABE (to identify this instruction as a wake-up instruction), context address 0x9000_1000, second code segment size 512 bytes, and third checksum 0x8765_4321.

[0078] Core 1 responds to the wake-up command, exits sleep, and reads data byte by byte from physical memory based on the context address 0x9000_1000 and the second code segment size of 512 bytes. During the reading process, Core 1 calculates the checksum in real time (e.g., using the same cumulative sum algorithm as Core 0). After reading 512 bytes, a fourth checksum is obtained, let's assume the result is 0x8765_4321. Core 1 compares the calculated fourth checksum with the third checksum 0x8765_4321; they match, indicating that the context data corresponding to the context address has not been corrupted in memory. Core 1 resumes execution of the original task based on the context data; that is, Core 1 writes the loaded context data into its register set. Finally, Core 1 executes a jump instruction to continue executing the instructions before the interruption of the original task.
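As a rough model of sub-steps 1031-1033 on the slave side, the sketch below verifies a 512-byte context block and, only if the checksum matches, unpacks it into a register set. The context layout (program counter, stack pointer, 32 general-purpose 64-bit registers, then padding) and all function names are assumptions made for illustration; the patent does not specify a layout. The checksum is again assumed to be a 32-bit wrap-around sum over 4-byte words.

```python
import struct

CONTEXT_SIZE = 512  # second code segment size from the example above

# Hypothetical layout: pc, sp, 32 general-purpose registers, zero padding to 512 bytes.
_LAYOUT = struct.Struct("<QQ32Q240x")
assert _LAYOUT.size == CONTEXT_SIZE


def checksum32(data: bytes) -> int:
    """32-bit wrap-around sum over little-endian 4-byte words (assumed algorithm)."""
    words = struct.unpack(f"<{len(data) // 4}I", data)
    return sum(words) & 0xFFFFFFFF


def pack_context(pc: int, sp: int, gprs: list) -> bytes:
    """Serialize a context the way the master core is assumed to store it."""
    return _LAYOUT.pack(pc, sp, *gprs)


def try_resume(memory: dict, context_address: int, size: int, third_checksum: int):
    """Return the restored register set, or None if verification fails."""
    raw = memory[context_address][:size]           # sub-step 1031: load
    if len(raw) != CONTEXT_SIZE:
        return None
    fourth_checksum = checksum32(raw)              # sub-step 1032: recompute
    if fourth_checksum != third_checksum:          # sub-step 1033: compare
        return None
    pc, sp, *gprs = _LAYOUT.unpack(raw)
    return {"pc": pc, "sp": sp, "gprs": gprs}
```

A caller would treat a `None` result as "stop execution" and would otherwise jump to the restored `pc`, which in the earlier example is 0x8000_5678.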

[0079] Optionally, the method also includes: Step 104: Read the first data and the second data respectively and compare them; the first data is the data in the cache unit of the slave core, and the second data is the data in the shared memory of the multi-core processor corresponding to the first data.

[0080] Step 105: When the first data and the second data are inconsistent, determine that the slave core is in a cache consistency failure state, and send an exception notification to the master core so that the master core can send the reset command to the slave core.

[0081] In this embodiment of the invention, regarding steps 104-105, the slave core can detect a consistency failure immediately by comparing the first data in its cache with the second data in memory, thus preventing error propagation. Upon detecting a cache consistency failure, the slave core immediately sends an exception notification (such as an inter-core interrupt) to the master core, informing it of the problem. After receiving the exception notification, the master core can save the slave core's execution context and send a reset command instructing the slave core to execute the standby command in the storage unit. The exception notification itself can carry simple information, such as the slave core's identifier and the address where the inconsistency occurred, helping the master core locate the problem quickly. Furthermore, since the cache unit is private to the slave core, the slave core can directly and quickly locate the specific data item and address of the first data, which improves response speed when comparing the first and second data.

[0082] Specifically, the first data is a data item currently stored in the slave core's private cache unit. The second data is the latest value of the same data item in shared memory. The slave core can retrieve the second data via a load instruction (forcing a read from memory, bypassing the cache) and then compare it with the first data in the cache.

[0083] For example, if core 1 needs to verify the consistency of counter data, then the first data, the value of counter, is extracted from the private cache unit of core 1 and denoted as cache_val. Core 1 can directly read the cache line corresponding to address 0x8000_0000 from its L1 cache to obtain the first data = 100.

[0084] Core 1 can also read the corresponding second data in shared memory: the current value of counter, denoted as mem_val. Core 1 can execute a load instruction (e.g., readl()) to force the reading of data from physical address 0x8000_0000, bypassing all caches. This read directly accesses memory and obtains the second data = 200 (the value updated by the main core).

[0085] By comparing the first and second data, a discrepancy is found, and core 1 determines that it is in a cache coherence failure state. Core 1 immediately sends an inter-core interrupt to master core 0. Core 1 can include simple information in the interrupt message, such as its identifier and the address 0x8000_0000 where the discrepancy occurred.
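Steps 104-105 reduce to a compare-and-notify decision, which the following sketch models. Dictionaries stand in for the private cache and for shared memory read through an uncached load; the `ExceptionNotification` type and the function name are hypothetical, while the notification payload (core identifier and address) follows the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExceptionNotification:
    """Hypothetical payload of the inter-core interrupt sent to the master core."""
    core_id: int
    address: int


def check_coherence(core_id: int, address: int,
                    private_cache: dict, shared_memory: dict) -> Optional[ExceptionNotification]:
    """Return a notification if the cached value disagrees with memory, else None."""
    cache_val = private_cache[address]   # first data: value held in the private cache
    mem_val = shared_memory[address]     # second data: models an uncached read of memory
    if cache_val == mem_val:
        return None                      # consistent: nothing to report
    return ExceptionNotification(core_id=core_id, address=address)
```

With the values from the example (cached 100, memory 200 at 0x8000_0000), the function returns a notification carrying core 1 and that address, which the master core could use to locate the fault.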

[0086] Optionally, step 101 may specifically include: Sub-step 1011: Read the register value from the preset error record register.

[0087] Sub-step 1012: When the register value is a value written by the interrupt controller that reflects the cache coherency failure state, determine the slave core that has the cache coherency failure state based on the register value.

[0088] In this embodiment of the invention, for sub-steps 1011-1012, the master core can read the register value reflecting the cache coherency failure state from the error record register, and identify the slave core that is in the cache coherency failure state through the register value.

[0089] For example, a multi-core processor can have a set of machine check error log registers built in to record hardware exception events. When a cache coherence protocol error occurs in a slave core, the hardware automatically writes error information into the slave core's error log register: bit [3] of 0x1FE0_0340 (reflecting core 2) in the error status register is set to 1, indicating that core 2 has a cache coherence protocol problem. The master core reads 0x1FE0_0340 and finds that bit [3] is 1, confirming that the error type of core 2 is cache coherence failure.
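Sub-steps 1011-1012 amount to reading a status word and testing bits. In the sketch below, the register address and the pairing of bit [3] with core 2 come from the example above; the rest of the bit-to-core mapping is not given in the text, so `BIT_TO_CORE` contains only that one entry and the function names are hypothetical.

```python
ERROR_STATUS_ADDR = 0x1FE0_0340   # error status register address from the example

# Only the bit [3] -> core 2 pairing is stated in the example; other bits are unknown.
BIT_TO_CORE = {3: 2}


def read_register(registers: dict, address: int) -> int:
    """Model of an uncached register read; 'registers' stands in for hardware."""
    return registers.get(address, 0)


def faulty_cores(register_value: int, bit_to_core: dict = BIT_TO_CORE) -> list:
    """Return the slave cores whose coherence-failure bit is set."""
    return sorted(core for bit, core in bit_to_core.items()
                  if (register_value >> bit) & 1)
```

For instance, a register file in which `ERROR_STATUS_ADDR` holds `0b1000` yields `[2]`, matching the example in which the master core identifies core 2.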

[0090] Optionally, step 101 may specifically include: Sub-step 1013: Receive the exception notification sent by the slave core, and determine that the slave core is in a cache consistency failure state based on the exception notification; the exception notification is sent by the slave core when the first data and the second data are inconsistent, the first data is the data in the cache unit of the slave core, and the second data is the data in the shared memory corresponding to the first data.

[0091] For details on this step, please refer to steps 104-105 above; they will not be repeated here.

[0092] Optionally, step 101 may specifically include: Sub-step 1014: The master core reads the context data of the slave core through the debugging interface and stores the context data in a non-cached area of memory.

[0093] Sub-step 1015: The main core adds the context address corresponding to the context data, the size of the second code segment, and the third checksum to the wake-up command.

[0094] In this embodiment of the invention, for sub-steps 1014-1015, the master core can promptly save the execution information of a slave core when it detects that a slave core is in a cache coherence failure state. The execution information reflects the current execution node of the task by the slave core, and includes the current execution context of the slave core, such as register data, program counter, stack pointer, etc.

[0095] The debug interface can be a set of hardware registers that allow the master core to directly access the internal state of each slave core, including general-purpose registers, program counters, stack pointers, and status registers. Access to these registers is not cached, ensuring that the data read is real-time and accurate, thus preventing access from being affected by cache consistency issues.

[0096] Furthermore, in this embodiment of the invention, the context data is stored in a non-cached area of memory. Since the context data is crucial for the slave core to resume execution later, its integrity in memory must be guaranteed. If it were stored in a cached area, it could be corrupted or lost due to subsequent cache consistency errors. Therefore, the master core writes the context data to a non-cached physical memory area.

[0097] For example, the master core can allocate a fixed context storage area to each slave core. For instance, the starting address of the storage area of core 1 is 0x9000_1000, which serves as the context address of core 1. The size of the second code segment that holds the context data is 512 bytes (sufficient to hold all the context data).

[0098] Furthermore, to ensure that the context data is not corrupted during storage, the master core can calculate a third checksum over the entire context data to check its integrity. The checksum algorithm can be a cumulative sum, CRC, hash, etc. Taking a cumulative sum as an example: starting from the beginning address of the context data, all data is added sequentially in 4-byte units to obtain a final 32-bit value. The context address, the size of the second code segment, and the third checksum constitute the key information required for the slave core to recover. When the master core subsequently wakes up the slave core, it can send these parameters to the slave core via a wake-up command. The slave core loads these parameters and verifies the context data accordingly. If the verification passes, it resumes execution of the original task.
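On the master side, sub-step 1015 can be sketched as computing the third checksum over the saved context and packaging it with the context address and size. The `WakeUpCommand` type and `build_wake_up` function are hypothetical names; the cumulative sum follows the 4-byte-unit description above (little-endian word order is an assumption), and `zlib.crc32` from the Python standard library is shown only as one possible CRC alternative mentioned in the text.

```python
import zlib
from dataclasses import dataclass
from typing import Callable

WAKE_MAGIC = 0xCAFE_BABE  # identifier value taken from the wake-up example


def cumulative_sum32(data: bytes) -> int:
    """Add the data in 4-byte units and keep the low 32 bits."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    total = 0
    for offset in range(0, len(data), 4):
        total += int.from_bytes(data[offset:offset + 4], "little")
    return total & 0xFFFFFFFF


@dataclass(frozen=True)
class WakeUpCommand:
    """Hypothetical container for the fields of the wake-up instruction."""
    magic: int
    context_address: int
    segment_size: int      # the "second code segment size"
    checksum: int          # the "third checksum"


def build_wake_up(context: bytes, context_address: int,
                  checksum_fn: Callable[[bytes], int] = cumulative_sum32) -> WakeUpCommand:
    """Package the parameters the slave core needs to verify and restore its context."""
    return WakeUpCommand(
        magic=WAKE_MAGIC,
        context_address=context_address,
        segment_size=len(context),
        checksum=checksum_fn(context),
    )
```

Passing `checksum_fn=zlib.crc32` swaps in a CRC without changing the command format; whichever algorithm is chosen, the slave core must use the same one when it computes the fourth checksum.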

[0099] Referring further to Figure 3, which illustrates a flowchart of an implementation of the control method for a multi-core processor, the flow includes: S1. Enter the storage unit (OTP).

[0100] S2. Determine whether the current core is the master core.

[0101] If it is the master core, exit the storage unit (OTP).

[0102] If it is not the master core, proceed to S3: determine whether the slave core's cache is abnormal.

[0103] If the cache is not abnormal, the slave core waits to be woken up.

[0104] If the cache is abnormal, proceed to S4: the slave core compares the first data in its cache unit with the corresponding second data in shared memory.

[0105] S5. Determine whether the compared data are consistent.

[0106] If they are consistent, there is no cache consistency problem.

[0107] If they are inconsistent, it is determined that there is a cache consistency problem, and the process proceeds to S6: the slave core notifies the master core through an inter-core interrupt.

[0108] S7. The master core sends a reset command to the slave core.

[0109] S8. The slave core repairs its private cache unit and then enters hibernation.

[0110] S9. The slave core responds to the wake-up command from the master core and continues to execute the original task.

In summary, this embodiment establishes an independent, non-cached storage unit. Under the control of the master core, a slave core experiencing cache inconsistency issues enters the storage unit and executes its built-in standby instructions. This allows the slave core to repair its cache unit and enter a hibernation state, preventing the problem from escalating if the slave core continues to use the cache. Subsequently, the master core wakes up the slave core according to a wake-up command and controls it to continue executing its original tasks based on the repaired cache unit, ensuring the slave core's basic computational functions. In this embodiment, the slave core does not always run in the uncached region; instead, it continues to use the cache for task processing after the cache is repaired. This ensures the slave core's normal access to and use of cached data resources, and eliminates the need to pre-set fallback paths to the uncached region for various potential error scenarios, thus improving processor efficiency.
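The branching of the Figure 3 flow (S1 to S9) can be summarized by a small decision function. This is only a schematic model: the `Outcome` names and the `run_flow` signature are hypothetical, and the actions of S6 to S9 (interrupt, reset, repair, wake-up) are recorded as step labels rather than performed.

```python
from enum import Enum


class Outcome(Enum):
    EXIT_OTP = "master core exits the storage unit"
    WAIT_FOR_WAKE_UP = "slave core waits to be woken up"
    NO_COHERENCE_ISSUE = "no cache consistency problem"
    REPAIRED_AND_RESUMED = "slave core repaired, hibernated, woken up and resumed"


def run_flow(is_master: bool, cache_abnormal: bool,
             cache_val: int, mem_val: int, log=None) -> Outcome:
    """Walk the S1-S9 decisions, appending each visited step label to 'log'."""
    steps = [] if log is None else log
    steps.append("S1")                      # enter the storage unit (OTP)
    steps.append("S2")                      # is this the master core?
    if is_master:
        return Outcome.EXIT_OTP
    steps.append("S3")                      # is the slave core's cache abnormal?
    if not cache_abnormal:
        return Outcome.WAIT_FOR_WAKE_UP
    steps.append("S4")                      # compare cache data with shared memory
    steps.append("S5")                      # are they consistent?
    if cache_val == mem_val:
        return Outcome.NO_COHERENCE_ISSUE
    steps.extend(["S6", "S7", "S8", "S9"])  # notify, reset, repair + hibernate, resume
    return Outcome.REPAIRED_AND_RESUMED
```

Calling `run_flow(False, True, 100, 200, log)` with an empty list as `log` visits every step from S1 to S9, while a master core leaves the flow after S2.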

[0111] Figure 4 is a block diagram of a control device for a multi-core processor provided in an embodiment of the present invention. As shown in Figure 4, the multi-core processor further includes a storage unit, and the device includes: a storage module 201, used for the master core to monitor the slave core in real time and, when the slave core is in a cache consistency failure state, to save the execution information of the slave core and send a reset command to the slave core; a repair module 202, used for the slave core to respond to the reset command sent by the master core, repair the cache unit of the slave core according to the standby command in the storage unit, and enter the hibernation state after the repair is completed; and a wake-up module 203, used for the slave core to respond to the wake-up command sent by the master core and, based on the execution information in the wake-up command, resume execution of the original task in the slave core.

[0112] Optionally, the reset instruction includes: an entry address, a first code segment size, and a first checksum; the entry address is used to indicate the storage location of the standby instruction in the storage unit; The repair module 202 includes: The first loading submodule is used to load data from the storage unit according to the entry address and the size of the first code segment; The first calculation submodule is used to calculate the second checksum based on the loaded data; The first verification submodule is used to execute the standby instruction in the storage unit when the second checksum is consistent with the first checksum.

[0113] Optionally, the repair module 202 includes: a hibernation submodule, used to initiate a cache invalidation operation on the slave core and reset the slave core's cache state machine to its initial state.

[0114] Optionally, the wake-up instruction includes: the context address of the slave core, the size of the second code segment, and the third checksum; the context address is used to indicate the storage location of the context data of the slave core; The wake-up module 203 includes: The second loading submodule is used to load the context data according to the context address and the size of the second code segment; The second calculation submodule is used to calculate the fourth checksum based on the loaded context data; The second verification submodule is used to continue executing the original task based on the context data if the fourth checksum is consistent with the third checksum.

[0115] The device further includes: a comparison module, used to compare first data and second data, where the first data is data in the cache unit of the slave core and the second data is data in shared memory corresponding to the first data; and a sending module, used to determine that the slave core is in a cache consistency failure state when the first data and the second data are inconsistent, and to send an exception notification to the master core so that the master core can send the reset command to the slave core.

[0116] Optionally, the device further includes: a read module, used for the master core to read a register value from a preset error record register; and an identification module, used to determine, based on the register value, the slave core in the multi-core processor in which the cache coherence failure state has occurred, when the register value is a value written by the interrupt controller reflecting the cache coherence failure state.

[0117] Optionally, the storage module 201 includes: a reading submodule, used for the master core to read the context data of the slave core through the debugging interface and store the context data in memory; and an adding submodule, used for the master core to add the context address corresponding to the context data, the size of the second code segment, and the third checksum to the wake-up instruction.

[0118] In summary, this embodiment establishes an independent, non-cached storage unit. Under the control of the master core, a slave core experiencing cache inconsistency issues enters the storage unit and executes its built-in standby instructions. This allows the slave core to repair its cache unit and enter a hibernation state, preventing the problem from escalating if the slave core continues to use the cache. Subsequently, the master core wakes up the slave core according to a wake-up command and controls it to continue executing its original tasks based on the repaired cache unit, ensuring the slave core's basic computational functions. In this embodiment, the slave core does not always run in the uncached region; instead, it continues to use the cache for task processing after the cache is repaired. This ensures the slave core's normal access to and use of cached data resources, and eliminates the need to pre-set fallback paths to the uncached region for various potential error scenarios, thus improving processor efficiency.

[0119] Optionally, embodiments of the present invention also provide an electronic device, including a processor, a memory, and a program or instructions stored in the memory and executable on the processor. When the program or instructions are executed by the processor, they implement the various processes of the above-described multi-core processor control method embodiments and achieve the same technical effects. To avoid repetition, they will not be described again here.

[0120] It should be noted that the electronic devices in the embodiments of the present invention include the mobile electronic devices and non-mobile electronic devices described above.

[0121] Figure 5 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of the present invention.

[0122] The electronic device 1300 includes, but is not limited to, components such as: radio frequency unit 1301, network module 1302, audio output unit 1303, input unit 1304, sensor 1305, display unit 1306, user input unit 1307, interface unit 1308, memory 1309, and processor 1310.

[0123] Those skilled in the art will understand that the electronic device 1300 may also include a power supply (such as a battery) for supplying power to various components. The power supply may be logically connected to the processor 1310 through a power management system, thereby enabling functions such as managing charging, discharging, and power consumption through the power management system. The electronic device structure shown in Figure 5 does not constitute a limitation on the electronic device. The electronic device may include more or fewer components than shown, or combine certain components, or have different component arrangements, which will not be elaborated here.

[0124] It should be understood that, in this embodiment of the invention, the input unit 1304 may include a graphics processing unit (GPU) 13041 and a microphone 13042. The GPU 13041 processes image data of still images or videos obtained by an image capture device (such as a camera) in video capture mode or image capture mode. The display unit 1306 may include a display panel 13061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 1307 includes a touch panel 13071 and at least one of other input devices 13072. The touch panel 13071 is also called a touch screen. The touch panel 13071 may include a touch detection device and a touch controller. Other input devices 13072 may include, but are not limited to, physical keyboards, function keys (such as volume control buttons, power buttons, etc.), trackballs, mice, and joysticks, which will not be described in detail here.

[0125] The memory 1309 can be used to store software programs and various data. The memory 1309 may primarily include a first storage area for storing programs or instructions and a second storage area for storing data. The first storage area may store an operating system and the application programs or instructions required for at least one function (such as a sound playback function, an image playback function, etc.). Furthermore, the memory 1309 may include volatile memory or non-volatile memory, or both.

[0126] Processor 1310 may include one or more processing units; optionally, processor 1310 integrates an application processor and a modem processor, wherein the application processor mainly handles operations involving the operating system, user interfaces, and applications, and the modem processor, such as a baseband processor, mainly handles wireless communication signals. It is understood that the aforementioned modem processor may also not be integrated into processor 1310.

[0127] Embodiments of the present invention also provide a readable storage medium storing a program or instructions. When the program or instructions are executed by a processor, they implement the various processes of the above-described multi-core processor control method embodiments and achieve the same technical effects. To avoid repetition, they will not be described again here.

[0128] The embodiments of the present invention have been described above with reference to the accompanying drawings. However, the present invention is not limited to the specific embodiments described above. The specific embodiments described above are merely illustrative and not restrictive. Those skilled in the art can make many other forms under the guidance of the present invention without departing from the spirit and scope of the claims, and all of these forms are within the protection scope of the present invention.

Claims

1. A control method for a multi-core processor, characterized in that, The multi-core processor further includes a storage unit, and the method includes: The master core monitors the slave core in real time. When the slave core is in a cache consistency failure state, the master core saves the execution information of the slave core and sends a reset command to the slave core. The slave core responds to the reset command sent by the master core, repairs the cache unit of the slave core according to the standby command in the storage unit, and enters a hibernation state after the repair is completed; The slave core responds to the wake-up command sent by the master core and resumes the execution of the original task in the slave core based on the execution information in the wake-up command.

2. The method according to claim 1, characterized in that, The reset instruction includes: an entry address, a first code segment size, and a first checksum; the entry address is used to indicate the storage location of the standby instruction; Then, when the slave core responds to the reset command sent by the master core: Based on the entry address and the size of the first code segment, the verification data in the storage unit is loaded; Calculate the second checksum based on the verification data; If the second checksum matches the first checksum, the standby instruction in the storage unit is executed.

3. The method according to claim 1 or 2, characterized in that, The repairing of the cache unit of the slave core includes: Performing a cache invalidation operation on the slave core and resetting the cache state machine of the slave core to its initial state.

4. The method according to claim 1, characterized in that, The wake-up command includes: the context address of the slave core, the size of the second code segment, and the third checksum; the context address is used to indicate the storage location of the context data of the slave core. The step of resuming execution of the original task in the slave core based on the execution information in the wake-up command includes: The context data is loaded based on the context address and the size of the second code segment; The fourth checksum is calculated based on the context data; If the fourth checksum matches the third checksum, the original task continues to be executed based on the context data, so that the slave core resumes operation.

5. The method according to claim 1, characterized in that the method further includes: reading first data and second data and comparing them, the first data being data in the cache unit and the second data being the data in the shared memory of the multi-core processor that corresponds to the first data; and, when the first data and the second data are inconsistent, determining that the slave core is in a cache coherency failure state and sending an exception notification to the master core so that the master core sends the reset command.

6. The method according to claim 1, characterized in that the method further includes: the master core reading a register value from a preset error recording register; and, when the register value is a value written by the interrupt controller that reflects the cache coherency failure state, determining, based on the register value, the slave core in the multi-core processor that is in the cache coherency failure state.

7. The method according to claim 1, characterized in that saving the execution information of the slave core includes: the master core reading the context data of the slave core through a debugging interface and storing the context data in memory; and the master core adding the context address, the second code segment size, and the third checksum corresponding to the context data to the wake-up command.

8. A control device for a multi-core processor, characterized in that the multi-core processor further includes a storage unit, and the device includes: a storage module, used by the master core to monitor the slave core in real time and, when the slave core is in a cache coherency failure state, to save the execution information of the slave core and send a reset command to the slave core; a repair module, used by the slave core to repair the cache unit of the slave core, in response to the reset command sent by the master core, according to the standby instruction in the storage unit, and to enter a sleep state after the repair is completed; and a wake-up module, used by the slave core to respond to a wake-up command sent by the master core and, based on the execution information in the wake-up command, resume execution of the original task in the slave core.

9. An electronic device, characterized in that it includes: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions to implement the method according to any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that, when the instructions in the computer-readable storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the method according to any one of claims 1 to 7.
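
For illustration only, the checksum-guarded handshake recited in claims 2, 4, and 7 can be sketched as follows. This is a minimal software model under stated assumptions, not the claimed implementation: the claims do not name a checksum algorithm, so CRC-32 is assumed here; the storage unit and memory are modeled as plain byte strings; and the names `Command`, `build_command`, and `load_verified` are hypothetical. The claims also do not say what the slave core does when the checksums differ, so the sketch simply refuses to return the data.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


def checksum(data: bytes) -> int:
    # Assumption: CRC-32 stands in for the unspecified checksum algorithm.
    return zlib.crc32(data) & 0xFFFFFFFF


@dataclass(frozen=True)
class Command:
    """Hypothetical layout shared by the reset and wake-up commands."""
    address: int   # entry address (reset) or context address (wake-up)
    size: int      # first or second code segment size, in bytes
    checksum: int  # first or third checksum, computed by the master core


def build_command(memory: bytes, address: int, size: int) -> Command:
    """Master-core side: describe a region and attach its checksum."""
    if address < 0 or size < 0 or address + size > len(memory):
        raise ValueError("region exceeds memory bounds")
    return Command(address, size, checksum(memory[address:address + size]))


def load_verified(memory: bytes, cmd: Command) -> Optional[bytes]:
    """Slave-core side: load the region named by the command and return it
    only if the recomputed (second or fourth) checksum matches the checksum
    carried in the command; otherwise return None."""
    if cmd.address < 0 or cmd.size < 0 or cmd.address + cmd.size > len(memory):
        return None
    region = memory[cmd.address:cmd.address + cmd.size]
    return region if checksum(region) == cmd.checksum else None


# Usage: the master core describes an 8-byte region; the slave core accepts
# it while the region is intact and rejects it after a single byte changes.
storage = bytearray(64)
storage[16:24] = b"STANDBY!"
reset = build_command(bytes(storage), 16, 8)
intact = load_verified(bytes(storage), reset)
storage[17] ^= 0xFF
corrupted = load_verified(bytes(storage), reset)
```

The same pair of helpers covers both directions of the handshake in this model: the reset command points at the standby instruction in the storage unit, and the wake-up command points at the context data saved by the master core.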