Reusing fetched, flushed instructions after an instruction pipeline flush in response to a hazard in a processor to reduce instruction re-fetching

By detecting performance-degrading instructions in the processor's instruction pipeline and reusing already-fetched instructions after a flush, the cost of instruction pipeline flushing is reduced, improving the processor's instruction throughput and execution efficiency.

CN122308922A · Pending · Publication Date: 2026-06-30 · MICROSOFT TECHNOLOGY LICENSING LLC

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
MICROSOFT TECHNOLOGY LICENSING LLC
Filing Date
2021-04-25
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

In processors, the instruction pipeline is susceptible to being flushed because of structural or control flow hazards, which reduces instruction throughput. Existing techniques struggle to mitigate these hazards effectively and thus to improve performance.

Method used

By detecting performance degradation instructions (PDIs) in the instruction pipeline and reusing the already-fetched instructions after a pipeline flush, re-fetching is avoided. These instructions are captured and injected using an instruction processing circuit and a pipeline fetch refill circuit, reducing the impact of flushing.

Benefits of technology

It increases the throughput of the instruction pipeline, reduces performance degradation caused by flushing, and improves the processor's execution efficiency.

✦ Generated by Eureka AI based on patent content.

[Figure: CN122308922A_ABST]

Abstract

Embodiments of this disclosure relate to the fetching of computer program instructions to be executed in a processor. Reusing fetched, flushed instructions after an instruction pipeline flush in response to a hazard in the processor, to reduce instruction re-fetching, is disclosed. An instruction processing circuit is configured to detect, in a pre-execution stage of the instruction pipeline, fetched performance degradation instructions (PDIs) whose execution may cause a precise interrupt that results in an instruction pipeline flush. In response to detecting a PDI in the instruction pipeline, the instruction processing circuit is configured to capture, in a pipeline fetch refill circuit, the fetched PDI and/or its subsequent, newer fetched instructions that are processed in the instruction pipeline after the PDI. If later execution of the PDI causes a flush of the instruction pipeline, the instruction processing circuit can inject the fetched PDI and/or its newer instructions previously captured in the pipeline fetch refill circuit into the instruction pipeline to be processed, without such instructions needing to be re-fetched.

Description

[0001] This application is a divisional of the application entitled "Reusing fetched, flushed instructions after an instruction pipeline flush in response to a hazard in a processor to reduce instruction re-fetching", which was filed on April 25, 2021 as international application PCT/US2021/029028, entered the Chinese national phase on December 22, 2022, and has Chinese national application number 202180044701.3.

Technical Field

[0002] The technology disclosed herein relates to computer processors ("processors"), and more specifically to the fetching of computer program instructions to be executed in a processor.

Background

[0003] A microprocessor, also known as a "processor," performs computational tasks for various applications. A traditional microprocessor includes a Central Processing Unit (CPU), which comprises one or more processor cores, also called "CPU cores." The CPU executes computer program instructions ("instructions"), also known as software instructions, to perform data-based operations and generate results, i.e., produced values. The processing of each instruction in the processor is divided into a series of different stages or steps, called the instruction pipeline. This allows multiple instructions to be processed simultaneously at different stages, increasing instruction processing throughput, rather than processing each instruction sequentially and executing it completely before processing the next instruction. Instruction pipeline steps are executed in one or more instruction pipelines within the processor, each instruction pipeline consisting of multiple processing stages.

[0004] Optimal processor performance can be achieved if all pipeline stages in an instruction pipeline can process instructions concurrently. However, hazards can arise in the instruction pipeline in which an instruction cannot be executed without leading to an incorrect computation result. One example of a hazard that can cause the instruction pipeline to be flushed is a structural hazard. An example of a structural hazard is a load instruction that fails to load an entry into a load queue that may be full. If the load instruction cannot execute, a deadlock can occur in the instruction pipeline. Another example of a hazard that can cause the instruction pipeline to be flushed is a control hazard caused by the execution of a control flow instruction that results in a precise interrupt in the processor. An example of a control flow instruction that can cause a control hazard is a conditional branch instruction. A conditional branch instruction includes a predicate condition that is not fully evaluated until a later stage of the instruction pipeline, when the instruction is executed, to determine whether the instruction flow will branch. To avoid stalling the fetching of subsequent, newer instructions that follow the conditional branch instruction until the conditional branch instruction has executed, a control flow prediction circuit can be provided in the processor to speculatively predict the branch target address of the conditional branch instruction. Based on the predicted branch target address, the processor can then speculatively fetch subsequent instructions in the fetch stage of the instruction pipeline after the conditional branch instruction has been fetched.

[0005] If the branch target address actually resolved during execution matches the predicted branch target address, no stall occurs in the instruction pipeline. This is because the subsequent instructions starting at the predicted branch target address have been correctly fetched and are already present in the instruction pipeline when the conditional branch instruction reaches the execution stage. However, if the predicted and resolved branch target addresses do not match, a mispredicted branch hazard occurs in the instruction pipeline and causes a precise interrupt. As a result, the instruction pipeline is flushed of the existing, previously fetched instructions in its various stages. The fetch stage of the instruction pipeline is instructed to fetch new instructions starting from the correct, resolved branch target address. Stages in the instruction pipeline therefore remain idle until the newly fetched instructions make their way through the instruction pipeline to be processed and executed, which reduces instruction throughput performance.
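The throughput cost described above can be approximated with a textbook cycle-count model. The following is a minimal sketch, not part of the disclosure: it assumes an idealized single-issue pipeline, and the function name `total_cycles`, the pipeline depth, the flush count, and the refill penalty are all hypothetical values chosen only for illustration.

```python
def total_cycles(num_instructions: int, depth: int, flushes: int, refill_penalty: int) -> int:
    """Cycle count for an idealized single-issue pipeline.

    With no hazards, depth + num_instructions - 1 cycles are needed.
    Each flush adds refill_penalty idle cycles while the front end refills.
    """
    if num_instructions == 0:
        return 0
    return depth + num_instructions - 1 + flushes * refill_penalty


# Hypothetical numbers, for illustration only.
baseline = total_cycles(num_instructions=1000, depth=10, flushes=0, refill_penalty=8)
with_flushes = total_cycles(num_instructions=1000, depth=10, flushes=50, refill_penalty=8)
print(baseline, with_flushes)
```

In this model, any mechanism that shortens the per-flush refill penalty reduces the total cycle count in proportion to the number of flushes.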

[0006] Situations other than the processing of branch instructions can also lead to structural hazards and cause pipeline flushes. Examples include deadlocks and instructions that cannot be executed because of a lack of resources, such as available space in a queue.

Summary of the Invention

[0007] Exemplary aspects disclosed herein include reusing fetched, flushed instructions after an instruction pipeline flush in response to a hazard in a processor, to reduce instruction re-fetching. The processor includes an instruction processing circuit configured to fetch instructions into an instruction pipeline to be processed and executed in an execution stage of the instruction pipeline. An execution circuit in the instruction processing circuit is configured to generate a precise interrupt in response to encountering a hazard (e.g., a structural or control flow hazard) when executing an instruction. For example, a precise interrupt may be generated because of a mispredicted conditional branch instruction, in which case subsequent instructions that are control dependent on the conditional branch instruction have already been fetched into the instruction pipeline from the incorrect instruction flow path. In response to the precise interrupt, the instruction processing circuit is configured to flush the instruction that caused the precise interrupt and its subsequent, newer instructions in the instruction pipeline to overcome the hazard. This reduces instruction throughput in the instruction pipeline. If the flushed instructions that were already fetched can be reused after the flush, they can be injected into the instruction pipeline without being fetched again, which mitigates the throughput reduction caused by the flush.

[0008] In this regard, in exemplary aspects disclosed herein, the instruction processing circuit in the processor is configured to detect, in a pre-execution stage of the instruction pipeline, fetched instructions whose execution may cause a precise interrupt that results in a flush of the instruction pipeline. These instructions are referred to as performance degradation instructions (PDIs). For example, the instruction processing circuit may be configured to detect a PDI after it has been decoded in a decode stage of the instruction pipeline. In response to detecting a PDI in the instruction pipeline, the instruction processing circuit is configured to capture, in a pipeline fetch refill circuit, the fetched PDI (if not already present) and any subsequent, newer fetched instructions that are processed in the instruction pipeline after the PDI. Then, if execution of the PDI in the instruction pipeline results in a precise interrupt and thus a flush of the instruction pipeline (a flush event), the instruction processing circuit can determine whether the detected instructions (i.e., the PDI and/or its subsequent instructions) are present in the pipeline fetch refill circuit as previously captured. If they are, the instruction processing circuit can inject the detected instructions previously captured in the pipeline fetch refill circuit into the instruction pipeline to be processed, without their having to be fetched again. The latency of re-fetching these instructions is therefore not incurred in the instruction throughput of the instruction pipeline. The instruction processing circuit can provide a "fall-through" program counter (PC) to the fetch stage of the instruction pipeline so that the fetch stage knows where to begin fetching the instructions that must still be fetched in response to the flush event because they cannot be injected from the pipeline fetch refill circuit. The fall-through PC is the PC of the instruction that follows the last newer instruction captured in the pipeline fetch refill circuit. The instruction processing circuit can be configured to capture the fall-through PC in the pipeline fetch refill circuit in association with the captured PDI.
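The capture-and-lookup behavior described in this paragraph can be sketched as a small software table. This is only an illustrative model under stated assumptions, not the disclosed circuit: the class and field names (`FetchRefillEntry`, `FetchRefillTable`, `pdi_pc`, `fall_through_pc`), the use of the PDI's PC as the lookup key, and the sample instruction strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FetchRefillEntry:
    pdi_pc: int                                   # assumed tag: PC of the captured PDI
    instructions: List[str] = field(default_factory=list)  # PDI and newer instructions
    fall_through_pc: Optional[int] = None         # PC after the last captured instruction


class FetchRefillTable:
    """Toy software model of a pipeline fetch refill table (illustrative names)."""

    def __init__(self) -> None:
        self.entries: Dict[int, FetchRefillEntry] = {}

    def capture(self, pdi_pc: int, instructions: List[str], fall_through_pc: int) -> None:
        # Capture only if this PDI is not already present.
        if pdi_pc not in self.entries:
            self.entries[pdi_pc] = FetchRefillEntry(pdi_pc, list(instructions), fall_through_pc)

    def lookup(self, pdi_pc: int) -> Optional[FetchRefillEntry]:
        # Called on a flush event to check whether the flushed PDI was captured.
        return self.entries.get(pdi_pc)


table = FetchRefillTable()
table.capture(0x100, ["beq r1, r2, L", "add r3, r4, r5", "ld r6, 0(r3)"], fall_through_pc=0x10C)
hit = table.lookup(0x100)    # captured earlier: instructions can be injected
miss = table.lookup(0x200)   # never captured: instructions must be fetched again
```

On a hit, the stored through (fall-through) PC tells the fetch stage where to resume; on a miss, fetching restarts as it would without the table.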

[0009] In other exemplary aspects, the instruction processing circuit can also be configured to capture instruction flow path information for a PDI that has a variable instruction flow path. For example, the instruction flow path taken after a conditional branch instruction or an indirect branch instruction varies depending on how the branch behavior of the instruction is resolved. In this way, the specific instruction flow path of the captured, newer instructions that follow the captured PDI is known. In response to a flush event, once the PDI has been detected and determined to be present in the pipeline fetch refill circuit, the instruction processing circuit can also determine whether the captured, newer instructions that are control dependent on the PDI come from the correct instruction flow path and should therefore be injected into the instruction pipeline. If the captured, newer control-dependent instructions come from the correctly resolved instruction flow path of the PDI, they can be injected into the instruction pipeline, because they are known to come from the correct instruction flow path and do not need to be fetched again. If the captured, newer control-dependent instructions are determined not to come from the correct instruction flow path of the PDI, they can be ignored and the instructions fetched again. For newer instructions that are control independent of the PDI, instruction flow path information does not need to be recorded, because control-independent instructions do not depend on the instruction flow path resolved for the PDI.
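The flow-path check described above can be illustrated with a short sketch. It assumes, purely for illustration, that the recorded flow path is a single taken/not-taken flag and that each captured instruction carries a `control_dependent` marker; `CapturedInstr` and `select_for_injection` are hypothetical names, and the sketch ignores instruction ordering constraints that a real pipeline would have to respect.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CapturedInstr:
    text: str
    control_dependent: bool  # True if the instruction lies on one specific path out of the PDI


def select_for_injection(captured: List[CapturedInstr], captured_taken: bool,
                         resolved_taken: bool) -> Tuple[List[str], List[str]]:
    """Split captured instructions into (inject, discard) lists."""
    inject: List[str] = []
    discard: List[str] = []
    path_matches = captured_taken == resolved_taken
    for ins in captured:
        if not ins.control_dependent or path_matches:
            inject.append(ins.text)    # control independent, or from the correct path
        else:
            discard.append(ins.text)   # wrong-path instruction: ignore it
    return inject, discard


captured = [CapturedInstr("add r1, r2, r3", True), CapturedInstr("mul r4, r5, r6", False)]
wrong_path = select_for_injection(captured, captured_taken=True, resolved_taken=False)
right_path = select_for_injection(captured, captured_taken=True, resolved_taken=True)
```

When the recorded path matches the resolved path, everything captured is reusable; otherwise only the control-independent instruction survives.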

[0010] In this regard, in one exemplary aspect, a processor is provided. The processor includes an instruction processing circuit. The instruction processing circuit includes an instruction fetch circuit configured to fetch a plurality of instructions from program code into an instruction pipeline to be executed as a plurality of fetched instructions. The instruction processing circuit also includes an execution circuit coupled to the instruction fetch circuit. The execution circuit is configured to execute a fetched instruction among the plurality of fetched instructions in the instruction pipeline and, in response to the executed fetched instruction being a PDI that generates a hazard, to generate a pipeline flush event to flush the instruction pipeline. The processor also includes an instruction fetch reuse circuit coupled to the instruction pipeline. The instruction fetch reuse circuit is configured, in response to the pipeline flush event, to determine whether a source identifier of a fetched instruction matches a source identifier in a refill tag of a fetch refill entry, as a matching fetch refill entry, among a plurality of fetch refill entries in a pipeline fetch refill circuit. In response to the source identifier of the fetched instruction matching the source identifier in the refill tag of the fetch refill entry, the instruction fetch reuse circuit is configured to insert one or more captured instructions of the matching fetch refill entry into the instruction pipeline after the instruction fetch circuit to be processed.

[0011] In another exemplary aspect, a method of reusing fetched, flushed instructions in an instruction pipeline in a processor is provided. The method includes fetching a plurality of instructions from program code into an instruction pipeline to be executed as a plurality of fetched instructions. The method also includes executing a fetched instruction among the plurality of fetched instructions in the instruction pipeline. The method further includes generating a pipeline flush event to flush the instruction pipeline in response to the executed fetched instruction being a performance degradation instruction (PDI) that generates a hazard. In response to the pipeline flush event, the method further includes determining whether a source identifier of a detected instruction matches a source identifier in a refill tag of a fetch refill entry, as a matching fetch refill entry, among a plurality of fetch refill entries in a pipeline fetch refill circuit. In response to the source identifier of the detected instruction matching the source identifier in the refill tag of the fetch refill entry, the method further includes inserting one or more captured instructions of the matching fetch refill entry into the instruction pipeline as fetched instructions to be executed.

[0012] Those skilled in the art will appreciate the scope of this disclosure and realize additional aspects of it after reading the following detailed description of the preferred embodiments in conjunction with the accompanying drawings.

Brief Description of the Drawings

[0013] The accompanying drawings, which are incorporated in and form a part of this disclosure, illustrate several aspects of this disclosure and, together with the description, are used to explain the principles of this disclosure.

[0014] Figure 1 is a schematic diagram of an exemplary processor-based system that includes a processor with an instruction processing circuit that includes one or more instruction pipelines for processing computer instructions, wherein the instruction processing circuit is configured to reuse captured, fetched instructions (i.e., a fetched PDI and/or its captured, fetched, subsequent, newer instructions) to be processed in the instruction pipeline in response to a pipeline flush caused by execution of the captured PDI, to avoid the need to re-fetch the PDI and its newer instructions to be processed;

Figure 2A is a flowchart of an exemplary process of the instruction processing circuit in Figure 1 detecting fetched instructions in the instruction pipeline and capturing them in the pipeline fetch refill circuit;

Figure 2B is a flowchart of an exemplary process of the instruction processing circuit in Figure 1 reusing, in the instruction pipeline, fetched instructions captured in the pipeline fetch refill circuit, in response to a flush event caused by execution of a PDI;

Figure 3 is a schematic diagram of another exemplary processor-based system that includes a processor with an instruction processing circuit that includes one or more instruction pipelines for processing computer instructions, wherein the instruction processing circuit is configured to reuse captured, fetched instructions to be processed in the instruction pipeline in response to a pipeline flush caused by execution of a captured PDI, to avoid the need to re-fetch the PDI and its newer instructions to be processed;

Figure 4 is a schematic diagram of an exemplary pipeline fetch refill circuit in Figure 3, configured to store captured, fetched instructions present in the instruction pipeline and to provide a captured, fetched PDI and/or its newer fetched instructions for reuse in response to a flush event caused by execution of the PDI; and

Figure 5 is a block diagram of an exemplary processor-based system that includes a processor with an instruction processing circuit configured to reuse captured, fetched instructions in the instruction pipeline to be processed in response to a pipeline flush caused by execution of a captured PDI, to avoid the need to re-fetch the captured, fetched instructions to be processed, including but not limited to the exemplary instruction processing circuits in Figure 1 and Figure 3, and according to, but not limited to, the exemplary processes in Figure 2A and Figure 2B.

Detailed Description

[0015] Exemplary aspects disclosed herein include reusing fetched, flushed instructions after an instruction pipeline flush in response to a hazard in a processor, to reduce instruction re-fetching. The processor includes an instruction processing circuit configured to fetch instructions into an instruction pipeline to be processed and executed in an execution stage of the instruction pipeline. An execution circuit in the instruction processing circuit is configured to generate a precise interrupt in response to encountering a hazard (e.g., a structural or control flow hazard) when executing an instruction. For example, a precise interrupt may be generated because of a mispredicted conditional branch instruction, in which case subsequent instructions that are control dependent on the conditional branch instruction have already been fetched into the instruction pipeline from the incorrect instruction flow path. In response to the precise interrupt, the instruction processing circuit is configured to flush the instruction that caused the precise interrupt and its subsequent, newer instructions in the instruction pipeline to overcome the hazard. This reduces instruction throughput in the instruction pipeline. If the flushed instructions that were already fetched can be reused after the flush, they can be injected into the instruction pipeline without being fetched again, which mitigates the throughput reduction caused by the flush.

[0016] In this regard, in exemplary aspects disclosed herein, the instruction processing circuit in the processor is configured to detect, in a pre-execution stage of the instruction pipeline, fetched instructions whose execution may cause a precise interrupt that results in a flush of the instruction pipeline. These instructions are referred to as performance degradation instructions (PDIs). For example, the instruction processing circuit may be configured to detect a PDI after it has been decoded in a decode stage of the instruction pipeline. In response to detecting a PDI in the instruction pipeline, the instruction processing circuit is configured to capture, in a pipeline fetch refill circuit, the fetched PDI (if not already present) and the subsequent, newer fetched instructions that are processed in the instruction pipeline after the PDI. Then, if execution of the PDI in the instruction pipeline results in a precise interrupt and thus a flush of the instruction pipeline (a flush event), the instruction processing circuit can determine whether the detected instructions (i.e., the PDI and/or its subsequent instructions) are present in the pipeline fetch refill circuit as previously captured. If they are, the instruction processing circuit can inject the detected instructions previously captured in the pipeline fetch refill circuit into the instruction pipeline to be processed, without re-fetching them. The latency of re-fetching these instructions is therefore not incurred in the instruction throughput of the instruction pipeline. The instruction processing circuit can provide a "fall-through" program counter (PC) to the fetch stage of the instruction pipeline so that the fetch stage knows where to begin fetching the instructions that must still be fetched in response to the flush event because they cannot be injected from the pipeline fetch refill circuit. The fall-through PC is the PC of the instruction that follows the last newer instruction captured in the pipeline fetch refill circuit. The instruction processing circuit can be configured to capture the fall-through PC in the pipeline fetch refill circuit in association with the captured PDI.

[0017] In this regard, Figure 1 is a schematic diagram of an exemplary processor-based system 100 that includes a processor 102. As discussed in more detail below, the processor 102 is configured to reuse fetched instructions that were present in the instruction pipeline and subsequently flushed in response to an encountered hazard, to reduce instruction re-fetching. Before discussing the reuse of fetched instructions in response to a pipeline flush caused by an encountered hazard, other components of the processor 102 are first discussed below.

[0018] With reference to Figure 1, the processor 102 includes an instruction processing circuit 104 that includes one or more instruction pipelines I0-IN for processing computer instructions for execution. The processor 102 shown in Figure 1 is an out-of-order processor (OoP), but it could also be an in-order processor. The instruction processing circuit 104 includes an instruction fetch circuit 106 configured to fetch instructions 108 from an instruction memory 110. As an example, the instruction memory 110 may be provided in or as part of a system memory in the processor-based system 100. An instruction cache 112 may also be provided in the processor 102 to cache the instructions 108 fetched from the instruction memory 110 to reduce timing latency in the instruction fetch circuit 106. In this example, the instruction fetch circuit 106 is configured to provide the instructions 108 as fetched instructions 108F into the one or more instruction pipelines I0-IN as an instruction stream 114 in the instruction processing circuit 104 to be pre-processed before the fetched instructions 108F reach an execution circuit 116 to be executed as executed instructions 108E. The instruction pipelines I0-IN are provided across different processing circuits or stages of the instruction processing circuit 104 to pre-process and process the fetched instructions 108F in a series of steps that can be performed concurrently to increase throughput prior to execution of the fetched instructions 108F by the execution circuit 116.

[0019] A control flow prediction circuit 118 (e.g., a branch prediction circuit) can also be provided in the instruction processing circuit 104 of the processor 102 in Figure 1 to speculate or predict the outcome of a predicate of a fetched conditional control instruction 108F, such as a conditional branch instruction, that affects the instruction control flow path of the instruction stream 114 processed in the instruction pipelines I0-IN. The prediction of the control flow prediction circuit 118 can be used by the instruction fetch circuit 106 to determine the next fetched instructions 108F to fetch based on the predicted branch target address. The instruction processing circuit 104 also includes an instruction decode circuit 120 configured to decode the fetched instructions 108F fetched by the instruction fetch circuit 106 into decoded instructions 108D. The instruction type and required actions encoded in a decoded instruction 108D can also be used to determine in which instruction pipeline I0-IN the decoded instruction 108D should be placed.

[0020] In this example, the decoded instructions 108D are placed in one or more of the instruction pipelines I0-IN and are next provided to a rename circuit 122 in the instruction processing circuit 104. The rename circuit 122 is configured to determine whether any register names in the decoded instructions 108D need to be renamed to break any register dependencies that would prevent parallel or out-of-order processing. The rename circuit 122 is configured to call upon a rename access table circuit 124 to rename a logical source register operand and/or write a destination register operand of a decoded instruction 108D to available physical registers P0, P1, ..., PX in a physical register file (PRF) 126. The rename access table circuit 124 contains a plurality of register mapping entries 128(0)-128(P), each mapped to (i.e., associated with) a respective logical register R0-RP. The register mapping entries 128(0)-128(P) are each configured to store mapping information for the corresponding logical register R0-RP in the form of a pointer to a physical register P0-PX in the PRF 126. Each physical register P0-PX is configured to store a data entry 130(0)-130(X) for the source and/or destination register operand of a decoded instruction 108D.
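The mapping from logical registers to physical registers described in this paragraph can be sketched in a few lines. This is a simplified illustration, not the disclosed circuit: the class name `RenameMapTable`, the free-list allocation policy, and the register counts are assumptions, and the sketch never returns physical registers to the free list as a real design would at commit.

```python
from typing import Dict, List


class RenameMapTable:
    """Toy rename table: logical register name -> physical register index."""

    def __init__(self, logical_regs: List[str], num_physical: int) -> None:
        if num_physical < len(logical_regs):
            raise ValueError("need at least one physical register per logical register")
        # Initially logical register i maps to physical register i.
        self.mapping: Dict[str, int] = {r: i for i, r in enumerate(logical_regs)}
        self.free: List[int] = list(range(len(logical_regs), num_physical))

    def rename(self, dest: str, srcs: List[str]) -> Dict[str, object]:
        # Sources read the current mappings; the destination gets a fresh register.
        phys_srcs = [self.mapping[s] for s in srcs]
        if not self.free:
            raise RuntimeError("no free physical register (a real pipeline would stall)")
        phys_dest = self.free.pop(0)
        self.mapping[dest] = phys_dest
        return {"dest": phys_dest, "srcs": phys_srcs}


rmt = RenameMapTable(["R0", "R1", "R2"], num_physical=6)
first = rmt.rename("R1", ["R0", "R2"])    # R1 = R0 op R2
second = rmt.rename("R1", ["R1", "R0"])   # R1 = R1 op R0, reads the renamed R1
```

Because each write to R1 receives a new physical register, the second instruction depends only on the value it actually reads, which is the dependency-breaking effect renaming is meant to provide.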

[0021] The instruction processing circuit 104 in the processor 102 in Figure 1 also includes a register access circuit 132 located in the instruction pipelines I0-IN before an issue circuit 134. The register access circuit 132 is configured to access a physical register P0-PX in the PRF 126, based on a register mapping entry 128(0)-128(P) mapped to a logical register R0-RP in the rename access table circuit 124, to use as an input value for a named source register operand of a decoded instruction 108D to be executed in the execution circuit 116. The issue circuit 134 is configured to store decoded instructions 108D in reservation entries in the instruction pipelines I0-IN until all of their respective source register operands are available for consumption in execution. The issue circuit 134 issues decoded instructions 108D that are ready to be executed to the execution circuit 116. A commit circuit 136 is also provided in the instruction processing circuit 104 to commit or write back values produced by execution of the decoded instructions 108D to memory, such as the PRF 126, cache memory, or system memory.
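The issue behavior described above, in which instructions wait in reservation entries until their source operands are available, can be sketched as follows. The representation is an assumption made for illustration: each entry is a hypothetical (label, set of physical source registers) pair, and `issue_ready` is not a name from the disclosure.

```python
from typing import List, Set, Tuple

Entry = Tuple[str, Set[int]]  # (instruction label, physical source registers it reads)


def issue_ready(reservation: List[Entry], available: Set[int]) -> Tuple[List[str], List[Entry]]:
    """Split reservation entries into instructions to issue now and entries still waiting."""
    issued: List[str] = []
    waiting: List[Entry] = []
    for label, sources in reservation:
        if sources <= available:       # every source operand has been produced
            issued.append(label)
        else:
            waiting.append((label, sources))
    return issued, waiting


entries = [("add", {1, 2}), ("mul", {2, 5}), ("sub", {1})]
issued, waiting = issue_ready(entries, available={1, 2})
```

Here `mul` stays in its reservation entry because physical register 5 has not been produced yet, while `add` and `sub` issue, possibly out of program order.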

[0022] The execution circuit 116 in the instruction processing circuit 104 of the processor 102 in Figure 1 is configured to generate a precise interrupt in response to encountering a hazard (e.g., a structural hazard or control flow hazard) when executing a decoded instruction 108D. An instruction 108D that, when executed, causes or is determined to cause a hazard in the processor 102 is referred to herein as a "performance degradation instruction (PDI)". By the time the execution circuit 116 encounters the hazard from executing a PDI 108D, subsequent, newer instructions 108D have already been fetched into the instruction pipelines I0-IN and potentially decoded to be processed. In response to the precise interrupt, the instruction processing circuit 104 is configured to generate a flush event 138 to cause the instruction 108D that caused the precise interrupt and its subsequent, newer instructions 108D already fetched into the instruction pipelines I0-IN to be flushed and re-executed to overcome the hazard. The re-fetching of the PDI 108D and its newer, subsequent instructions 108D undesirably reduces throughput in the instruction processing circuit 104.

[0023] To avoid the need to re-fetch the instructions 108D that were flushed by the instruction processing circuit 104, the instruction processing circuit 104 in the example in Figure 1 includes a PDI detection circuit 140 and an instruction fetch reuse circuit 142. The PDI detection circuit 140 and the instruction fetch reuse circuit 142 can be part of the instruction processing circuit 104 or external to it. Both are coupled to the instruction pipelines I0-IN. As discussed in more detail below, the PDI detection circuit 140 is configured to detect PDIs among the fetched instructions 108D that have been fetched into the instruction pipelines I0-IN to be processed and executed. For example, the PDI detection circuit 140 can be configured to detect a PDI 108D after it has been decoded in the instruction decode circuit 120. In response to detecting a PDI 108D in the instruction pipelines I0-IN, the PDI detection circuit 140 is configured to capture detected instructions, which can be the detected PDI 108D and/or its subsequent, newer fetched instructions 108D that are processed in the instruction pipelines I0-IN after the PDI 108D, into a pipeline fetch refill circuit 144. The pipeline fetch refill circuit 144 may be a table circuit that includes a plurality of fetch refill entries 146(0)-146(R), each configured to store information about a detected PDI 108D and its subsequent, newer fetched instructions 108D. Thus, later in response to a flush event 138, the instruction fetch reuse circuit 142 can determine whether the detected instructions (i.e., the PDI 108D whose execution caused the flush event 138 and/or its newer, subsequent instructions 108D) were previously captured in a fetch refill entry 146(0)-146(R) in the pipeline fetch refill circuit 144. If so, the instruction fetch reuse circuit 142 can inject the previously captured PDI 108D and/or its newer, subsequent fetched instructions 108D from the pipeline fetch refill circuit 144 into the instruction pipelines I0-IN to be processed, without such decoded instructions 108D needing to be re-fetched.

[0024] Therefore, the latency associated with re-fetching these previously fetched instructions 108D is not incurred and does not reduce the instruction throughput of the instruction processing circuit 104. The instruction fetch reuse circuit 142 can provide the instruction fetch circuit 106 with a "fall-through" program counter (PC) 148, so that the instruction fetch circuit 106 knows where to begin fetching instructions 108 in response to the flush event 138. The fall-through PC 148 is the PC of the next instruction 108D following the last subsequent, newer instruction 108D that was previously captured in the pipeline fetch refill circuit 144 for the PDI 108D whose execution caused the flush event 138. As discussed in more detail below, the PDI detection circuit 140 is also configured to record the fall-through PC 148 in the fetch refill entry 146(0)-146(R) that is allocated to capture the detected, fetched PDI 108D and its newer, subsequent fetched instructions 108D from the instruction pipelines I0-IN. In this way, the instruction fetch circuit 106 can begin fetching new instructions 108D that will not be injected into the instruction pipelines I0-IN by the instruction fetch reuse circuit 142.

[0025] Figure 2A is a flowchart illustrating an exemplary process 200 of the instruction processing circuit 104 in Figure 1 detecting and capturing a fetched PDI 108D and fetched, newer instructions 108D from the instruction pipelines I0-IN in Figure 1. This is so that, later in response to a flush event 138, the captured PDI 108D and its fetched, newer instructions 108D can be reused by the instruction fetch reuse circuit 142 and injected into the instruction pipelines I0-IN without having to be re-fetched. The process 200 in Figure 2A is discussed below in conjunction with the processor 102 in Figure 1.

[0026] In this regard, the process 200 includes fetching a plurality of instructions 108 from program code into the instruction pipelines I0-IN to be executed, as a plurality of fetched instructions 108F (block 202 in Figure 2A). The process 200 also includes the PDI detection circuit 140 detecting whether a fetched instruction 108D in the instruction pipelines I0-IN is a PDI 108D (block 204 in Figure 2A). The PDI detection circuit 140 can detect whether a fetched instruction 108D in the instruction pipelines I0-IN is a PDI 108D in a number of ways, examples of which are discussed in more detail below. The PDI detection circuit 140 may then optionally determine whether the detected instruction 108D was previously captured in the pipeline fetch refill circuit 144, and thus whether the PDI 108D and its newer, subsequent fetched instructions 108D have already been captured.

[0027] In one example, the instruction processing circuit 104 is configured to capture the fetched PDI 108 itself in the pipeline fetch refill circuit 144 in response to the detected PDI 108D if the PDI 108D is a type of instruction that would also be flushed in response to the flush event 138 and thus would need to be re-fetched. This allows the captured PDI 108 to be reused later by the instruction fetch reuse circuit 142 as a re-fetched PDI 108, such as in response to the flush event 138, without having to be re-fetched. An example of a PDI that is flushed in response to a flush event 138, and thus re-fetched for re-execution, is a memory load instruction that encounters a deadlock. In another example, the instruction processing circuit 104 is not configured to capture the fetched PDI 108 in the pipeline fetch refill circuit 144 in response to the detected PDI 108D if the PDI 108D is a type of instruction that would not be flushed in response to the flush event 138 and thus would not need to be re-fetched. This is because such a PDI does not need to be re-executed. An example of a PDI 108 that is not flushed, and thus not re-fetched for re-execution in response to a flush event 138, is a mispredicted conditional branch instruction.
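By way of non-limiting illustration only, the decision described above can be modeled in software. The following Python sketch is not part of the disclosed circuitry; the names `PdiKind`, `should_capture_pdi_itself`, and `instructions_to_capture` are hypothetical, and only the two PDI types named in the text are modeled.

```python
from enum import Enum, auto


class PdiKind(Enum):
    """Hypothetical PDI categories used only for this illustration."""
    DEADLOCKED_LOAD = auto()      # flushed with the pipeline, so it is re-executed
    MISPREDICTED_BRANCH = auto()  # already resolved, so it is not re-executed


def should_capture_pdi_itself(kind):
    """Return True when the PDI would be flushed and need to be re-fetched."""
    return kind is PdiKind.DEADLOCKED_LOAD


def instructions_to_capture(pdi, kind, younger):
    """Build the instruction list that would be stored in a fetch refill entry."""
    captured = list(younger)
    if should_capture_pdi_itself(kind):
        # The PDI is re-executed after the flush, so keep it ahead of its successors.
        captured.insert(0, pdi)
    return captured
```

In this sketch a deadlocked load is stored ahead of its newer instructions, while a mispredicted branch contributes only its newer instructions.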

[0028] With reference back to Figure 2A, in response to the PDI detection circuit 140 detecting a fetched instruction 108D as the detected instruction, which can be the PDI 108D and/or a newer, successor instruction 108D of the PDI (block 204 in Figure 2A), the PDI detection circuit 140 determines whether a source identifier 150 (e.g., a source address or program counter (PC)) of the detected instruction 108D matches a source identifier (e.g., a source address or PC) in a refill tag 152(0)-152(R) of a fetch refill entry 146(0)-146(R) in the pipeline fetch refill circuit 144 (block 206 in Figure 2A). This determines whether the detected instruction 108D has already been captured by the PDI detection circuit 140 in the pipeline fetch refill circuit 144. In response to the source identifier 150 of the detected instruction 108D not matching a source identifier in a refill tag 152(0)-152(R) of any fetch refill entry 146(0)-146(R) (block 208 in Figure 2A), the PDI detection circuit 140 allocates an available fetch refill entry 146(0)-146(R) among the plurality of fetch refill entries 146(0)-146(R) in the pipeline fetch refill circuit 144 to capture one or more subsequent, newer instructions 108D following the detected instruction 108D for later reuse (block 210 in Figure 2A). As described above, if the detected instruction 108D is a PDI that will be re-executed in response to the flush event 138, the PDI detection circuit 140 also captures the detected PDI 108D in the available fetch refill entry 146(0)-146(R). The PDI detection circuit 140 then stores the source identifier 150 of the detected instruction 108D (i.e., the detected PDI 108D and/or its newer, successor instruction 108D) in the refill tag 152(0)-152(R) of the available fetch refill entry 146(0)-146(R) (block 212 in Figure 2A). The PDI detection circuit 140 then captures, in the allocated fetch refill entry 146(0)-146(R) in the pipeline fetch refill circuit 144, one or more subsequent, newer fetched instructions 108D that follow the detected instruction 108D in the instruction pipelines I0-IN (block 214 in Figure 2A). The detected instruction 108D and the subsequent, newer fetched instructions 108D are then processed and executed in the execution circuit 116 (block 216 in Figure 2A).
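By way of non-limiting illustration only, the tag lookup, allocation, and capture steps of process 200 can be sketched in software as follows. The class and method names are hypothetical, the table is keyed by PC as the source identifier, and the behavior when no entry is available is an assumption of the sketch rather than something the text specifies.

```python
from dataclasses import dataclass, field


@dataclass
class FetchRefillEntry:
    refill_tag: int                            # source identifier (PC) of the detected instruction
    captured: list = field(default_factory=list)


class PipelineFetchRefill:
    """Software stand-in for a table of fetch refill entries."""

    def __init__(self, num_entries):
        self.num_entries = num_entries
        self.entries = {}                      # refill tag -> FetchRefillEntry

    def capture(self, pc, younger, pdi=None):
        """Return True if a new entry was allocated and filled."""
        if pc in self.entries:                 # block 206: tag match, already captured
            return False
        if len(self.entries) >= self.num_entries:
            return False                       # assumption: skip capture when the table is full
        entry = FetchRefillEntry(refill_tag=pc)  # blocks 210 and 212: allocate and store tag
        if pdi is not None:                    # PDI that will be re-executed after the flush
            entry.captured.append(pdi)
        entry.captured.extend(younger)         # block 214: capture newer instructions
        self.entries[pc] = entry
        return True
```

A second detection of the same PC is a tag match and leaves the existing entry untouched, which mirrors the "already captured" branch of the flowchart.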

[0029] Figure 2B is a flowchart illustrating an exemplary process 220 of the instruction fetch reuse circuit 142 in Figure 1 reusing fetched instructions (which may be the PDI 108D and/or fetched successor instructions 108D), previously captured by the PDI detection circuit 140 in the pipeline fetch refill circuit 144, by injecting them into the instruction pipelines I0-IN in response to a flush event 138. As described above, if the PDI 108D that caused the flush event 138 was previously captured, the captured PDI 108D and/or the fetched, subsequent, newer instructions 108D can be obtained from the pipeline fetch refill circuit 144 and injected into the instruction pipelines I0-IN. This avoids the need to re-fetch these instructions 108D into the instruction pipelines I0-IN. The process 220 in Figure 2B is discussed below in conjunction with the processor 102 in Figure 1.

[0030] In this regard, the process 220 includes the processor 102 generating a pipeline flush event 138 to flush the instruction pipelines I0-IN in response to the execution of an instruction 108D among the plurality of instructions 108D generating a hazard as a PDI 108D (block 222 in Figure 2B). In response to the pipeline flush event 138 (block 224 in Figure 2B), the instruction fetch reuse circuit 142 determines whether the source identifier 150 of the fetched instruction 108D matches a source identifier in a refill tag 152(0)-152(R) of a fetch refill entry 146(0)-146(R), as a matching fetch refill entry 146(0)-146(R), among the plurality of fetch refill entries 146(0)-146(R) of the pipeline fetch refill circuit 144 (block 226 in Figure 2B). In response to the source identifier 150 of the fetched instruction 108D matching the source identifier in the refill tag 152(0)-152(R) of the matching fetch refill entry 146(0)-146(R), the instruction fetch reuse circuit 142 inserts one or more of the captured fetched instructions 108D from the matching fetch refill entry 146(0)-146(R) into the instruction pipelines I0-IN as reused fetched instructions 108D to be executed (block 228 in Figure 2B). The process 220 then includes executing the reused fetched instructions 108D injected into the instruction pipelines I0-IN (block 230 in Figure 2B).
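By way of non-limiting illustration only, process 220 can be sketched in software as a lookup followed by an injection. The function name `handle_flush` is hypothetical, the table is modeled as a dictionary from refill tag (PC) to captured instructions, and the pipeline is modeled as a simple list.

```python
def handle_flush(refill_table, flush_pc, pipeline):
    """Flush the pipeline, then inject captured instructions on a tag match.

    Returns True when captured instructions were reused, False when the
    instructions have to be re-fetched instead.
    """
    pipeline.clear()                       # block 222: the flush event empties the pipeline
    captured = refill_table.get(flush_pc)  # block 226: compare against the refill tags
    if captured is None:
        return False                       # no matching entry: nothing to inject
    pipeline.extend(captured)              # block 228: inject the reused instructions
    return True
```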

[0031] Different options and features can be provided in the instruction processing circuit 104 to support reusing instructions captured from the instruction pipelines in response to a pipeline flush caused by the execution of a captured PDI, thereby avoiding the need to re-fetch the PDI and its newer instructions to be processed. In this regard, Figure 3 is a schematic diagram of another exemplary processor-based system 300, which includes a processor 302 having an instruction processing circuit 304 that is similar to the instruction processing circuit 104 in Figure 1. Circuits and components common to the instruction processing circuit 104 in Figure 1 and the instruction processing circuit 304 in Figure 3 are shown with common element numbers and are not described again.

[0032] As shown in Figure 3, the instruction processing circuit 304 includes a PDI detection circuit 340 that is similar to the PDI detection circuit 140 in Figure 1. The instruction processing circuit 304 in Figure 3 also includes an instruction fetch reuse circuit 342 that is similar to the instruction fetch reuse circuit 142 in Figure 1. The PDI detection circuit 340 is configured to detect a PDI 108D among the fetched instructions 108D that have been fetched into the instruction pipelines I0-IN to be processed and executed. For example, the PDI detection circuit 340 can be configured to detect the PDI 108D after it has been decoded in the instruction decoding circuit 120. The PDI detection circuit 340 in the example of Figure 3 is coupled to the instruction pipelines I0-IN between the instruction decoding circuit 120 and the renaming circuit 122, in an in-order stage of the instruction pipelines I0-IN. This allows the PDI detection circuit 340 to receive decoded information about the decoded instructions 108D in order to detect a decoded PDI 108D. In this example, the PDI detection circuit 340 is configured to receive the decoded instructions 108D in the in-order stage of the instruction pipelines I0-IN. Thus, if a decoded instruction 108D is detected as a PDI 108D, the PDI detection circuit 340 can be configured to capture the subsequent decoded instructions 108D in the instruction pipelines I0-IN that are known to follow the detected PDI 108D in the program code from which the instruction stream 114 was fetched.

[0033] The PDI detection circuit 340 can detect whether a fetched instruction 108F or a decoded instruction 108D is a PDI in different ways. In one example, if the decoded instruction 108D is a branch instruction whose branch behavior is resolved at execution (such as a conditional branch instruction, an indirect branch instruction, or a conditional, indirect branch instruction), the PDI detection circuit 340 can be configured to use a branch predictor confidence 354 updated by the control flow prediction circuit 118. The branch predictor confidence 354 is a measure of the confidence that the branch behavior of the branch instruction can be correctly predicted. The control flow prediction circuit 118 can be configured to predict the branch behavior of the branch instruction 108D and to update the branch predictor confidence 354 based on whether the predicted branch behavior matched the branch behavior resolved by the execution circuit 116 when the instruction 108D was executed in the past. Therefore, the PDI detection circuit 340 can use the branch predictor confidence 354 to predict or determine whether the branch instruction 108D is a PDI. A branch instruction 108D with a low branch predictor confidence 354 is more likely to be mispredicted, and therefore more likely to cause a hazard that results in a flush event 138 when executed in the execution circuit 116.
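By way of non-limiting illustration only, a confidence-based test of this kind can be sketched in software with saturating counters. The class name `BranchConfidence`, the 0 to 3 counter range, the reset-on-misprediction policy, and the threshold value are all assumptions of the sketch; the text does not specify how the branch predictor confidence is encoded or updated.

```python
class BranchConfidence:
    """Per-branch saturating confidence counters (hypothetical 0..3 range)."""

    MAX = 3

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.counters = {}                 # branch PC -> confidence

    def update(self, pc, predicted_taken, resolved_taken):
        """Raise confidence on a correct prediction, reset it on a misprediction."""
        confidence = self.counters.get(pc, 0)
        if predicted_taken == resolved_taken:
            confidence = min(self.MAX, confidence + 1)
        else:
            confidence = 0                 # assumption: a misprediction clears confidence
        self.counters[pc] = confidence

    def is_pdi(self, pc):
        """Treat a branch as a PDI while its confidence is below the threshold."""
        return self.counters.get(pc, 0) < self.threshold
```

Under these assumptions a branch that has never been seen starts with zero confidence and is therefore treated as a PDI until it has been predicted correctly often enough.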

[0034] The PDI detection circuit 340 can also be configured to determine whether a memory operation instruction 108D, such as a load instruction, is a PDI. The memory operation instruction 108D involves performing a memory operation at a specified memory address, which can be a direct memory address or an indirect memory address. The execution circuit 116 can be configured to store a PDI indicator corresponding to a memory operation instruction 108D when a hazard occurs and a flush event 138 is generated during the execution of that memory operation instruction 108D. The execution circuit 116 can be configured to store the PDI indicator in a PDI indicator circuit 358 containing a plurality of PDI indicator entries 360(0)-360(I), in which the PDI indicator corresponding to the memory operation instruction can be stored. When the PDI detection circuit 340 receives a memory operation instruction 108D and is to determine whether it is a PDI, the PDI detection circuit 340 can consult the PDI indicator circuit 358 to determine whether a PDI indicator exists in the PDI indicator entries 360(0)-360(I) for the memory operation instruction 108D. The PDI detection circuit 340 can use the PDI indicator to determine whether the corresponding memory operation instruction 108D should be considered a PDI for PDI detection purposes.
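By way of non-limiting illustration only, the record-then-consult behavior described above can be sketched in software. The class name `PdiIndicatorTable`, the use of the instruction PC as the lookup key, and the first-in, first-out replacement policy are assumptions of the sketch and are not taken from the text.

```python
class PdiIndicatorTable:
    """Small table of PCs of memory operations that previously caused a flush."""

    def __init__(self, num_entries):
        self.num_entries = num_entries
        self.entries = []                  # oldest indicator first

    def record_flush(self, pc):
        """Called on the execution side when a memory operation causes a flush."""
        if pc in self.entries:
            return
        if len(self.entries) == self.num_entries:
            self.entries.pop(0)            # assumption: replace the oldest indicator
        self.entries.append(pc)

    def is_pdi(self, pc):
        """Called on the detection side for each decoded memory operation."""
        return pc in self.entries
```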

[0035] With continuing reference to Figure 3, in response to the PDI detection circuit 340 detecting a received instruction 108D in the instruction pipelines I0-IN as a PDI, the PDI detection circuit 340 is configured to capture, in the pipeline fetch refill circuit 344, the fetched PDI 108D and/or its successor, newer fetched instructions 108D that follow the fetched PDI 108D in the instruction pipelines I0-IN. As described below, this allows the instruction fetch reuse circuit 342 to obtain these fetched instructions 108D to be reused and injected into the instruction pipelines I0-IN in response to a flush event 138 generated by the later execution of the detected PDI 108D. The pipeline fetch refill circuit 344 may be a memory table circuit that includes a plurality of fetch refill entries 346(0)-346(R), each entry being configured to store information about the detected PDI 108D and its subsequent, newer fetched instructions 108D. A more detailed example of the pipeline fetch refill circuit 344 in Figure 3 is shown in Figure 4 and discussed below.

[0036] Figure 4 is a schematic diagram of an exemplary pipeline fetch refill circuit 344 in Figure 3, configured to store a captured, fetched PDI 108D and its fetched, newer instructions present in the instruction pipelines I0-IN in the processor 302 in Figure 3. The pipeline fetch refill circuit 344 includes a plurality of fetch refill entries 346(0)-346(R), each entry being configured to be allocated to store a PDI 108D detected by the PDI detection circuit 340 for later reuse by the instruction fetch reuse circuit 342. The pipeline fetch refill circuit 344 is discussed in conjunction with exemplary operation of the PDI detection circuit 340 in Figure 3.

[0037] In this regard, when the PDI detection circuit 340 detects a received decoded instruction 108D being processed in the instruction pipelines I0-IN as a PDI, as described above, the PDI detection circuit 340 can first determine whether a fetch refill entry 346(0)-346(R) in the pipeline fetch refill circuit 344 has already been allocated and is storing the PDI 108D. If so, there is no need to allocate another fetch refill entry 346(0)-346(R) for the detected PDI 108D. In this example, to make this determination, the PDI detection circuit 340 is configured to determine whether the source identifier 350 in Figure 3 of the detected instruction (i.e., the PDI 108D and/or its newer, successor instruction 108D) matches a source identifier 362(0)-362(R) in the corresponding refill tag 352(0)-352(R) of a fetch refill entry 346(0)-346(R) in the pipeline fetch refill circuit 344. The source identifier 350 of the detected instruction 108D may be the program counter (PC) of the detected instruction 108D, which uniquely identifies its location in the program code from which it was fetched into the instruction stream 114 of the instruction processing circuit 304. If the source identifier 350 of the detected instruction 108D is contained in a source identifier 362(0)-362(R) in the corresponding refill tag 352(0)-352(R) of a fetch refill entry 346(0)-346(R) in the pipeline fetch refill circuit 344, this means that the PDI 108D and/or its successor, newer instructions 108D are already stored in the fetch refill entry 346(0)-346(R) that includes the refill tag 352(0)-352(R) with the matching source identifier 362(0)-362(R). In that case, the PDI detection circuit 340 does not need to process the detected instruction 108D any further.

[0038] However, if the source identifier 350 of the detected instruction 108D is not contained in a source identifier 362(0)-362(R) in the corresponding refill tag 352(0)-352(R) of any fetch refill entry 346(0)-346(R) in the pipeline fetch refill circuit 344, the PDI detection circuit 340 is configured to process the detected PDI 108D. The PDI detection circuit 340 is configured to allocate an available fetch refill entry 346(0)-346(R) in the pipeline fetch refill circuit 344 to store the source identifier 350 of the detected instruction 108D for later identification by the instruction fetch reuse circuit 342, as discussed in more detail below. The PDI detection circuit 340 is also configured to store the source identifier 350 of the detected instruction 108D in the source identifier 362(0)-362(R) of the fetch refill entry 346(0)-346(R) allocated in the pipeline fetch refill circuit 344. If the detected PDI 108D is a branch instruction whose branch instruction flow path is predicted but not resolved until execution in the execution circuit 116, the PDI detection circuit 340 can also be configured to store the refill path 364 (e.g., taken or not taken for a conditional branch instruction) of the branch instruction 108D received from the instruction pipelines I0-IN in the corresponding allocated fetch refill entry 346(0)-346(R) in the pipeline fetch refill circuit 344. This allows the instruction fetch reuse circuit 342 to know whether the subsequent instructions 108D captured in the pipeline fetch refill circuit 344 for the branch instruction 108E that caused the flush event 138 should be reused because they are in the correct instruction flow path from the branch instruction 108E. The PDI detection circuit 340 in Figure 3 is then configured to set the valid indicator 368(0)-368(R) of the corresponding allocated fetch refill entry 346(0)-346(R) in the pipeline fetch refill circuit 344 in Figure 4 to a valid state, so that the instruction fetch reuse circuit 342 will know that a matching fetch refill entry 346(0)-346(R) can be consulted to reuse the previously fetched and captured instructions 108D in the fetch refill entry 346(0)-346(R) corresponding to the PDI 108E that caused the flush event 138.
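By way of non-limiting illustration only, the allocation sequence described above (check the tags, allocate a free entry, record the source identifier and refill path, then mark the entry valid) can be sketched in software. The field and function names are hypothetical, and the encoding of the refill path as a Boolean is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FetchRefillEntry:
    valid: bool = False
    source_id: Optional[int] = None      # PC stored in the refill tag
    refill_path: Optional[bool] = None   # True = taken, False = not taken, None = not a branch
    captured: list = field(default_factory=list)


def allocate(entries, pc, refill_path=None):
    """Return the index of the entry for pc, allocating a free entry on a miss."""
    for index, entry in enumerate(entries):
        if entry.valid and entry.source_id == pc:
            return index                 # already allocated: no further processing
    for index, entry in enumerate(entries):
        if not entry.valid:
            # Store the tag and refill path, then mark the entry valid.
            entries[index] = FetchRefillEntry(True, pc, refill_path)
            return index
    return None                          # no entry available
```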

[0039] The PDI detection circuit 340 is then configured to capture information about the subsequent, newer instructions 108D that follow the detected PDI 108D in the instruction pipelines I0-IN in the fetch refill entry 346(0)-346(R) allocated in the pipeline fetch refill circuit 344 in Figure 4, for possible later reuse. In this regard, the PDI detection circuit 340 is configured to store the received subsequent, successor, newer instructions 108D as captured instructions 372(1)-372(X) in the allocated fetch refill entry 346(0)-346(R). For example, the fetch refill entry 346(0) can be configured to store up to 'X' subsequent, successor, newer instructions 108D as captured instructions 372(0)(1)-372(0)(X). The PDI detection circuit 340 is also configured to capture metadata 374(1)-374(X) for each of the captured instructions 372(1)-372(X) following the detected PDI 108D, as information that can be used to assist in processing the captured instructions 372(1)-372(X) if they are reused and injected into the instruction pipelines I0-IN by the instruction fetch reuse circuit 342. For example, the fetch refill entry 346(0) can store up to 'X' metadata 374(0)(1)-374(0)(X) for the 'X' captured instructions 108D. The metadata 374(1)-374(X) may include information indicating whether certain instruction pipelines I0-IN can be skipped if the corresponding captured instructions 372(1)-372(X) are later reused in response to a flush event 138 caused by their corresponding PDI 108E. For example, the direction of conditional branches within the captured instructions 372(1)-372(X) can be stored as metadata 374(1)-374(X).

[0040] The PDI detection circuit 340 is also configured to store a fall-through source address (e.g., a PC) 370(0)-370(X) in the corresponding allocated fetch refill entry 346(0)-346(R). This is so that the instruction fetch reuse circuit 342 can use this information to inform the instruction fetch circuit 106 in Figure 3 where to begin fetching new instructions 108 in response to the reuse of captured instructions 372(1)-372(X) for a PDI 108D whose execution caused a flush event 138. In this regard, the PDI detection circuit 340 can be configured to store, in the corresponding fall-through source address 370(0)-370(X), the PC following the last captured instruction 372(1)-372(X) captured in the corresponding fetch refill entry 346(0)-346(R) for the detected PDI 108D. For example, the PDI detection circuit 340 can be configured to stop capturing subsequent, newer instructions 108D after the detected PDI 108D as captured instructions 372(1)-372(X) in the fetch refill entry 346(0)-346(R) for the detected PDI 108D when the PDI detection circuit 340 encounters the next PDI 108D in the instruction stream 114. As another example, the PDI detection circuit 340 can be configured to stop capturing subsequent, newer instructions 108D after the detected PDI 108D in the fetch refill entry 346(0)-346(R) for the detected PDI 108D once the pipeline fetch refill circuit 344 is full. As yet another example, the PDI detection circuit 340 can be configured to stop capturing subsequent, newer instructions 108D when the next PDI 108D is encountered or when the pipeline fetch refill circuit 344 is full, whichever occurs first.
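By way of non-limiting illustration only, the stop conditions and the resulting fall-through address can be sketched in software. The function name `capture_after_pdi` is hypothetical, the instruction stream is modeled as a list of `(pc, text)` pairs in program order, and a single `capacity` parameter stands in for the storage limit. The sketch implements the "whichever occurs first" variant.

```python
def capture_after_pdi(stream, start, capacity, is_pdi):
    """Capture instructions after stream[start] and compute the fall-through PC.

    stream   : list of (pc, text) tuples in program order
    start    : index of the detected PDI in the stream
    capacity : maximum number of instructions that can be captured
    is_pdi   : callable taking a PC and returning True for a PDI
    """
    captured = []
    index = start + 1
    # Stop at the next PDI or when the storage is full, whichever comes first.
    while (index < len(stream)
           and len(captured) < capacity
           and not is_pdi(stream[index][0])):
        captured.append(stream[index])
        index += 1
    # The fall-through address is the PC following the last captured instruction.
    fall_through = stream[index][0] if index < len(stream) else None
    return captured, fall_through
```

Fetching would then resume at the returned fall-through address, so that none of the captured instructions is fetched a second time. The sketch returns `None` when the modeled stream ends, which is an artifact of the finite list and not a behavior described in the text.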

[0041] Each fetch refill entry 346(0)-346(R) in the pipeline fetch refill circuit 344 in Figure 4 can also be configured to store a corresponding usefulness indicator 376(0)-376(X). As will be discussed in more detail below, the usefulness indicator 376(0)-376(X) is configured to store a usefulness that indicates how useful the corresponding fetch refill entry 346(0)-346(R) is. The usefulness stored in the usefulness indicator 376(0)-376(X) is a measure of how likely it is that the PDI 108D associated with the corresponding fetch refill entry 346(0)-346(R) will be used by the instruction fetch reuse circuit 342 to reuse the fetched instructions 108D captured in that fetch refill entry 346(0)-346(R). As an example, the usefulness can be a count value, and the usefulness indicators 376(0)-376(X) can be counters. The usefulness indicators 376(0)-376(X) allow a separate process to update and monitor the usefulness stored in them as a way to control the deallocation of fetch refill entries 346(0)-346(R), freeing up space for PDIs 108D detected in the future and their associated successor instructions 108D to be captured for later reuse.

[0042] With reference back to Figure 3, as described above, the instruction fetch reuse circuit 342 is configured to reuse previously captured instructions 108D from the captured instructions 372(1)-372(X) in a fetch refill entry 346(0)-346(R) of the pipeline fetch refill circuit 344 that corresponds to the executed PDI 108E whose execution caused a flush event 138. In this regard, in response to the flush event 138, the instruction fetch reuse circuit 342 is configured to determine the source identifier 378 of the previously captured detected instruction (i.e., the PDI 108D and/or its newer, successor instruction 108D). For example, the source identifier 378 of the detected instruction 108D may be the PC of the PDI 108D. The instruction fetch reuse circuit 342 can be configured to determine whether the source identifier 378 of the detected instruction 108D matches (i.e., hits) a source identifier 362(0)-362(R) in the corresponding refill tag 352(0)-352(R) of a fetch refill entry 346(0)-346(R) in the pipeline fetch refill circuit 344. If so, the instruction fetch reuse circuit 342 can be configured to access the captured instructions 372(1)-372(X) in the fetch refill entry 346(0)-346(R) whose source identifier 362(0)-362(R) matches the source identifier 378 of the detected instruction 108D, and to inject those captured instructions 372(1)-372(X) into the instruction pipelines I0-IN to be processed. In this manner, the captured instructions 372(1)-372(X) do not need to be re-fetched by the instruction fetch circuit 106. The instruction fetch reuse circuit 342 can be configured to inject the captured instructions 372(1)-372(X) after the instruction decoding circuit 120, for example into the renaming circuit 122 or into a later stage of the instruction pipelines I0-IN (such as the execution circuit 116). The instruction fetch reuse circuit 342 is also configured to provide the fall-through source address 370(0)-370(X) of the matching fetch refill entry 346(0)-346(R) to the instruction fetch circuit 106. The instruction fetch circuit 106 can then begin fetching instructions 108 from the fall-through source address 370(0)-370(X), which avoids re-fetching the same instructions 108 as the captured instructions 372(1)-372(X) that are reused and injected into the instruction pipelines I0-IN.

[0043] However, if the instruction fetch reuse circuit 342 determines that the source identifier 378 of the detected instruction 108D whose execution caused the flush event 138 does not match (i.e., misses) any source identifier 362(0)-362(R) in the corresponding refill tag 352(0)-352(R) of a fetch refill entry 346(0)-346(R) in the pipeline fetch refill circuit 344, the instruction fetch reuse circuit 342 can ignore the detected instruction 108D. The instruction fetch circuit 106 will then re-fetch the PDI 108D and/or its successor instructions 108D. The instruction fetch reuse circuit 342 may be configured to provide the instruction fetch circuit 106 with a fall-through source address 370 that is the PC of the PDI 108E, so that the instruction fetch circuit 106 re-fetches the PDI 108E and its successor instructions 108D.
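By way of non-limiting illustration only, the hit and miss cases on a flush can be sketched together in software: on a hit the captured instructions are injected and fetching resumes at the stored fall-through address, while on a miss nothing is injected and fetching resumes at the PC of the PDI itself. The function name `on_flush` and the dictionary layout are hypothetical.

```python
def on_flush(entries, pdi_pc):
    """Return (instructions to inject, PC at which fetching should resume).

    entries maps a source identifier (PC) to a dictionary with the keys
    "captured" and "fall_through"; this layout is an assumption of the sketch.
    """
    entry = entries.get(pdi_pc)
    if entry is None:
        # Miss: nothing to inject, so the PDI and its successors are re-fetched.
        return [], pdi_pc
    # Hit: reuse the captured instructions and skip past them when fetching.
    return list(entry["captured"]), entry["fall_through"]
```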

[0044] The instruction fetch reuse circuit 342 can also be configured to confirm the refill path 366(0)-366(R) recorded in the refill tag 352(0)-352(R) of the fetch refill entry 346(0)-346(R) whose source identifier 362(0)-362(R) matches that of the PDI 108D before reusing the captured instructions in that fetch refill entry 346(0)-346(R) in the pipeline fetch refill circuit 344. Thus, for example, if the PDI 108D is a branch instruction that may take different instruction flow paths depending on the resolution of its execution, the instruction fetch reuse circuit 342 can ensure that the captured instructions 372(1)-372(X) in the matching fetch refill entry 346(0)-346(R) come from the same instruction flow path as the one resolved by the execution of the PDI 108D. In this way, the captured instructions 372(1)-372(X) reused by the instruction fetch reuse circuit 342 in the instruction pipelines I0-IN will not be for an incorrect instruction flow path. If the captured instructions 372(1)-372(X) in the matching fetch refill entry 346(0)-346(R) are not for the correct instruction flow path according to the recorded refill path 366(0)-366(R), the instruction fetch reuse circuit 342 can choose not to reuse those captured instructions 372(1)-372(X) and instead allow them to be re-fetched by the instruction processing circuit 304. In this case, the instruction fetch reuse circuit 342 may be configured to provide the instruction processing circuit 304 with the source identifier 362(0)-362(R) of the executed PDI 108E that caused the flush event 138, so that the instruction processing circuit 304 re-fetches the PDI 108E and its subsequent instructions 108D.
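By way of non-limiting illustration only, the path check described above can be sketched in software as an additional condition on a tag hit. The function name `select_reuse`, the dictionary layout, and the Boolean encoding of the refill path (True for taken) are assumptions of the sketch.

```python
def select_reuse(entry, flush_pc, resolved_taken):
    """Return the captured instructions only if tag and flow path both match.

    entry is a dictionary with the keys "valid", "source_id", "refill_path",
    and "captured"; this layout is an assumption of the sketch.
    """
    if not entry["valid"] or entry["source_id"] != flush_pc:
        return None                      # tag miss: nothing to reuse
    if entry["refill_path"] != resolved_taken:
        return None                      # captured on the other flow path: re-fetch instead
    return entry["captured"]
```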

[0045] As described above, it may be desirable to provide a mechanism to deallocate fetch refill entries 346(0)-346(R) in the pipeline fetch refill circuit 344 to free up space for capturing more recently executed PDIs 108E that caused a flush event 138, and their successor instructions 108D, for potential reuse. Some fetch refill entries 346(0)-346(R) in the pipeline fetch refill circuit 344 may be allocated to PDIs 108D and/or newer successor instructions 108D that are not as useful (i.e., are less likely to be encountered again in the future) as more recently executed PDIs 108E that caused a flush event 138.

[0046] As described above, the instruction fetch reuse circuit 342 determines whether the source identifier 378 of the PDI 108D and/or its newer, successor instruction 108D is already contained in a valid fetch refill entry 346(0)-346(R) in the pipeline fetch refill circuit 344 (i.e., whether the source identifier 378 matches a source identifier 362(0)-362(R)). If so, the instruction fetch reuse circuit 342 can be configured to increase the usefulness in the corresponding usefulness indicator 376(0)-376(X) of the corresponding fetch refill entry 346(0)-346(R). For example, if the usefulness indicators 376(0)-376(X) are counters, the instruction fetch reuse circuit 342 can be configured to increment the usefulness indicator 376(0)-376(X) to indicate increased usefulness. However, if the source identifier 378 of the executed PDI 108E is not already contained in a valid fetch refill entry 346(0)-346(R), such that a new valid fetch refill entry 346(0)-346(R) needs to be allocated, the instruction fetch reuse circuit 342 can, as an example, decrease the usefulness of all usefulness indicators 376(0)-376(X) in the corresponding fetch refill entries 346(0)-346(R) equally. If the usefulness in the usefulness indicator 376(0)-376(X) of a fetch refill entry 346(0)-346(R) in the pipeline fetch refill circuit 344 falls below a set threshold usefulness, the instruction fetch reuse circuit 342 or another circuit can be configured to deallocate that fetch refill entry 346(0)-346(R) so that it can be reallocated for a new PDI 108E.
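By way of non-limiting illustration only, the counter-based usefulness policy described above can be sketched in software. The class name `UsefulnessTracker`, the initial counter value given to a newly allocated entry, and the decision to decay and deallocate before allocating are assumptions of the sketch.

```python
class UsefulnessTracker:
    """Usefulness counters keyed by the source identifier (PC) of each entry."""

    def __init__(self, threshold=1, initial=2):
        self.threshold = threshold       # entries below this value are deallocated
        self.initial = initial           # assumption: starting value for a new entry
        self.useful = {}

    def on_flush(self, pc):
        """Update the counters for a flush caused by the PDI at pc; return True on a hit."""
        if pc in self.useful:
            self.useful[pc] += 1         # hit: the entry proved useful
            return True
        # Miss: decrease every existing entry equally, then drop stale entries.
        for key in self.useful:
            self.useful[key] -= 1
        self.useful = {k: v for k, v in self.useful.items() if v >= self.threshold}
        self.useful[pc] = self.initial   # allocate an entry for the new PDI
        return False
```

With this policy an entry that keeps hitting accumulates usefulness and survives several misses, while an entry that never hits again is released after a few misses.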

[0047] Alternatively, instead of immediately decreasing the usefulness of all usefulness indicators 376(0)-376(X) in the corresponding fetch refill entries 346(0)-346(R) equally in response to a miss to the pipeline fetch refill circuit 344, a global allocation fail indicator 380 in Figure 3 can be incremented or increased if the source identifier 378 of the detected instruction 108D (i.e., the PDI 108D and/or its newer, successor instruction 108D) is not already contained in a valid fetch refill entry 346(0)-346(R). Then, once the global allocation fail indicator 380 exceeds a threshold global allocation, the usefulness of the usefulness indicator 376(0)-376(X) in each fetch refill entry 346(0)-346(R) can be decreased. This mechanism controls the rate at which the usefulness of the usefulness indicators 376(0)-376(X) in each fetch refill entry 346(0)-346(R) is decreased, so that it is not decreased on every miss event to the pipeline fetch refill circuit 344. Again, if the usefulness in the usefulness indicator 376(0)-376(X) of a fetch refill entry 346(0)-346(R) in the pipeline fetch refill circuit 344 falls below a set threshold usefulness, the instruction fetch reuse circuit 342 or another circuit can be configured to deallocate that fetch refill entry 346(0)-346(R) for reallocation to a new PDI 108D and/or its successor instructions 108D.
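By way of non-limiting illustration only, the rate-limited decay described above can be sketched in software. The class name `DecayController` is hypothetical, and resetting the global indicator after a decay step is an assumption of the sketch; the text does not state what happens to the indicator once the threshold has been exceeded.

```python
class DecayController:
    """Decay usefulness only after enough misses have accumulated."""

    def __init__(self, fail_threshold):
        self.fail_threshold = fail_threshold
        self.global_fails = 0            # stand-in for the global allocation fail indicator

    def on_miss(self, useful):
        """Record a miss; decay every counter in useful once the threshold is exceeded.

        useful maps a source identifier to its usefulness counter and is
        modified in place. Returns True when a decay step was applied.
        """
        self.global_fails += 1
        if self.global_fails <= self.fail_threshold:
            return False                 # not enough misses yet: leave the counters alone
        for key in useful:
            useful[key] -= 1
        self.global_fails = 0            # assumption: start counting misses again
        return True
```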

[0048] As another alternative, the usefulness in the useful indicators 376(0)-376(X) of the fetch refill entries 346(0)-346(R) in the pipeline fetch refill circuit 344 can be reduced for each PDI 108D processed in the instruction pipeline I0-IN. As yet another alternative, the usefulness in the useful indicators 376(0)-376(X) of the fetch refill entries 346(0)-346(R) in the pipeline fetch refill circuit 344 can be reduced for each PDI 108D detected by the PDI detection circuit 340.

[0049] Figure 5 is a block diagram of an exemplary processor-based system 500 that includes a processor 502 (e.g., a microprocessor). The processor 502 includes an instruction processing circuit 504, which includes a PDI detection circuit 505 and an instruction fetch reuse circuit 506 for detecting a PDI, capturing fetched instructions that are the PDI and / or newer instructions following the PDI, and reusing the captured instructions in response to a flush event caused by execution of the corresponding PDI. For example, the processor 502 in Figure 5 can be the processor 102 in Figure 1 or the processor 302 in Figure 3. As another example, the instruction processing circuit 504 can be the instruction processing circuit 104 in Figure 2 or the instruction processing circuit 304 in Figure 3. As another example, the PDI detection circuit 505 can be the PDI detection circuit 140 in Figure 1 or the PDI detection circuit 340 in Figure 3. As another example, the instruction fetch reuse circuit 506 can be the instruction fetch reuse circuit 142 in Figure 1 or the instruction fetch reuse circuit 342 in Figure 3.
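As a rough illustration of the detect, capture, and reuse roles attributed to these circuits, the following sketch models a refill structure keyed by the address of the PDI. It is a simplification under stated assumptions: it captures only the newer instructions (not the PDI itself), and the names `FetchReuseModel`, `RefillEntry`, `resume_addr`, and `slots_per_entry` are hypothetical. `resume_addr` stands in for the stored source address at which normal fetching continues after the captured instructions have been injected.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Instr = Tuple[int, str]  # (source address, mnemonic); illustrative encoding


@dataclass
class RefillEntry:
    captured: List[Instr] = field(default_factory=list)
    resume_addr: Optional[int] = None  # where normal fetching continues


class FetchReuseModel:
    """Capture instructions behind a PDI; replay them after a flush."""

    def __init__(self, slots_per_entry: int = 4) -> None:
        self.slots_per_entry = slots_per_entry
        self.entries: Dict[int, RefillEntry] = {}  # keyed by PDI address

    def capture(self, pdi: Instr, younger: List[Instr], next_addr: int) -> None:
        """Record instructions fetched after a detected PDI.

        next_addr is the address of the first instruction beyond `younger`.
        """
        room = self.slots_per_entry
        entry = RefillEntry(captured=list(younger[:room]))
        # If the entry filled up, fetching resumes at the first
        # uncaptured instruction; otherwise it resumes at next_addr.
        entry.resume_addr = younger[room][0] if len(younger) > room else next_addr
        self.entries[pdi[0]] = entry

    def on_flush(self, pdi_addr: int) -> Tuple[List[Instr], Optional[int]]:
        """Return (instructions to inject, address to resume fetching).

        On a miss nothing is injected and None is returned, meaning the
        fetch stage refetches as it would without this mechanism.
        """
        entry = self.entries.get(pdi_addr)
        if entry is None:
            return [], None
        return list(entry.captured), entry.resume_addr
```

On a hit, the captured instructions can be placed back into the pipeline without being refetched, and only the instructions beyond the resume address need to come from the instruction cache.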

[0050] The processor-based system 500 may be one or more circuits included in an electronic board, such as a printed circuit board (PCB), a server, a personal computer, a desktop computer, a notebook computer, a personal digital assistant (PDA), a computing board, a mobile device, or any other device, and may represent, for example, a server or a user's computer. In this example, the processor-based system 500 includes the processor 502. The processor 502 represents one or more general-purpose processing circuits, such as a microprocessor, a central processing unit, or the like. More specifically, the processor 502 may be an EDGE instruction set microprocessor, or another processor implementing an instruction set that supports explicit consumer naming for communicating produced values resulting from the execution of producer instructions. The processor 502 is configured to execute processing logic in instructions for performing the operations and steps discussed herein. In this example, the processor 502 includes an instruction cache 508 for temporary, fast-access memory storage of instructions accessible to the instruction processing circuit 504. Instructions fetched or prefetched from memory (e.g., from the system memory 510 via the system bus 512) are stored in the instruction cache 508. The instruction processing circuit 504 is configured to process the instructions fetched into the instruction cache 508 for execution.

[0051] The processor 502 and the system memory 510 are coupled to the system bus 512, which can also connect peripheral devices included in the processor-based system 500. As is well known, the processor 502 communicates with these other devices by exchanging address, control, and data information over the system bus 512. For example, the processor 502 can communicate bus transaction requests to the memory controller 514 in the system memory 510, which is an example of a slave device. Although not shown in Figure 5, multiple system buses 512 may be provided, each forming a different architecture. In this example, the memory controller 514 is configured to provide memory access requests to the memory array 516 in the system memory 510. The memory array 516 includes an array of memory bit cells for storing data. As non-limiting examples, the system memory 510 may be read-only memory (ROM), flash memory, dynamic random access memory (DRAM) such as synchronous DRAM (SDRAM), or static memory (e.g., flash memory, static random access memory (SRAM), etc.).

[0052] Other devices can be connected to the system bus 512. As shown in Figure 5, these devices may include, as examples, the system memory 510, one or more input devices 518, one or more output devices 520, a modem 522, and one or more display controllers 524. The input devices 518 may include any type of input device, including but not limited to input keys, switches, voice processors, etc. The modem 522 may be any device configured to allow data exchange with a network 526. The network 526 may be any type of network, including but not limited to a wired or wireless network, a private or public network, a local area network (LAN), a wireless local area network (WLAN), a wide area network (WAN), a Bluetooth™ network, and the Internet. The modem 522 may be configured to support any type of desired communication protocol. The processor 502 may also be configured to access the display controllers 524 via the system bus 512 to control information sent to one or more displays 528. The displays 528 may include any type of display, including but not limited to a cathode ray tube (CRT), a liquid crystal display (LCD), a plasma display, etc.

[0053] The processor-based system 500 in Figure 5 may include a set of instructions 530 to be executed by the processor 502 for any application desired in accordance with the instructions. The instructions 530 may be stored in the system memory 510, the processor 502, and / or the instruction cache 508, as examples of a non-transitory computer-readable medium 532. The instructions 530 may also reside, wholly or at least partially, in the system memory 510 and / or the processor 502 during their execution. The instructions 530 may also be transmitted or received over the network 526 via the modem 522, such that the network 526 includes the computer-readable medium 532.

[0054] Although computer-readable medium 532 is shown as a single medium in the exemplary embodiment, the term "computer-readable medium" should be understood to include a single medium or multiple media (e.g., a centralized or distributed database, and / or associated caches and servers) storing one or more instruction sets. The term "computer-readable medium" should also be considered to include any medium capable of storing, encoding, or carrying instruction sets for processing devices to execute and cause the processing devices to perform any one or more methods of the embodiments disclosed herein. Therefore, the term "computer-readable medium" should include, but is not limited to, solid-state storage, optical media, and magnetic media.

[0055] The embodiments disclosed herein include various steps. The steps of the embodiments disclosed herein may be performed by hardware components or may be embodied in machine-executable instructions, which can be used to cause a general-purpose or special-purpose processor programmed with those instructions to perform the steps. Alternatively, the steps may be performed by a combination of hardware and software.

[0056] The embodiments disclosed herein may be provided as a computer program product or software, which may include a machine-readable medium (or computer-readable medium) having instructions stored thereon that can be used to program a computer system (or other electronic device) to perform processes according to the embodiments disclosed herein. A machine-readable medium includes any mechanism for storing or transmitting information in a machine-readable (e.g., computer-readable) form. For example, machine-readable media include machine-readable storage media (e.g., ROM, random access memory ("RAM"), disk storage media, optical storage media, flash memory devices, etc.).

[0057] Unless specifically stated otherwise, and as apparent from the preceding discussion, it should be understood that throughout the description, discussions using terms such as "processing," "computing," "determining," and "displaying" refer to the actions and processes of a computer system, or a similar electronic computing device, that manipulates data represented as physical (electronic) quantities in the computer system's registers and memory and transforms them into other data similarly represented as physical quantities in the computer system's memory or registers or in other such information storage, transmission, or display devices.

[0058] The algorithms and displays presented herein are not inherently related to any particular computer or other device. Various systems can be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized devices to perform the required method steps. The required structure for a variety of such systems is apparent from the above description. Furthermore, the embodiments described herein are not described with reference to any particular programming language. It should be understood that various programming languages can be used to implement the teachings of the embodiments described herein.

[0059] Those skilled in the art will further understand that the various illustrative logic blocks, modules, circuits, and algorithms described in conjunction with the embodiments disclosed herein can be implemented as electronic hardware, as instructions stored in memory or in another computer-readable medium and executed by a processor or other processing device, or as a combination of both. As an example, components of the distributed antenna system described herein can be implemented in any circuit, hardware component, integrated circuit (IC), or IC chip. The memory disclosed herein can be of any type and size and can be configured to store any type of desired information. To clearly illustrate this interchangeability, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. How such functionality is implemented depends on the specific application, design choices, and / or design constraints imposed on the system as a whole. Those skilled in the art can implement the described functionality in different ways for each specific application, but such implementation decisions should not be construed as causing a departure from the scope of the present embodiments.

[0060] The various illustrative logic blocks, modules, and circuits described in conjunction with the embodiments disclosed herein may be implemented or performed with a processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. Furthermore, a controller may be a processor. A processor may be a microprocessor, but alternatively, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices (e.g., a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors combined with a DSP core, or any other such configuration).

[0061] The embodiments disclosed herein can be implemented in hardware and instructions stored in the hardware, and can reside in, for example, RAM, flash memory, ROM, electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), registers, hard disks, removable disks, CD-ROMs, or any other form of computer-readable medium known in the art. An exemplary storage medium is coupled to a processor such that the processor can read information from and write information to the storage medium. Alternatively, the storage medium can be integrated with the processor. The processor and storage medium can reside in an ASIC. The ASIC can reside in a remote station. Alternatively, the processor and storage medium can reside as discrete components in a remote station, base station, or server.

[0062] It should also be noted that the operational steps described in any exemplary embodiment herein are described to provide examples and discussion. The described operations may be performed in many sequences other than those shown. Furthermore, operations described in a single operational step may actually be performed in multiple different steps. Additionally, one or more operational steps discussed in the exemplary embodiments may be combined. Those skilled in the art will also understand that information and signals can be represented using any of a variety of technologies and techniques. For example, the data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.

[0063] Unless otherwise expressly stated, no method described herein should be construed as requiring its steps to be performed in a particular order. Therefore, no particular order should be inferred if a method claim does not actually describe the order in which its steps are followed, or if the claims or description do not expressly state that the steps will be restricted to a particular order.

[0064] Those skilled in the art will recognize that various modifications and variations can be made without departing from the spirit or scope of the invention. Since modifications, combinations, sub-combinations, and variations of the disclosed embodiments incorporating the spirit and essence of the invention may occur to those skilled in the art, the invention should be construed as encompassing all contents within the scope of the appended claims and their equivalents.

Claims

1. A processor, comprising: an instruction processing circuit, comprising: an instruction fetch circuit configured to fetch a plurality of instructions from program code as a plurality of fetched instructions to be executed in an instruction pipeline; and an execution circuit coupled to the instruction fetch circuit, the execution circuit configured to: execute a first fetched instruction of the plurality of fetched instructions in the instruction pipeline; and in response to a hazard generated by executing the first fetched instruction, generate a pipeline flush event to flush the instruction pipeline; a performance degradation instruction (PDI) detection circuit coupled to the instruction pipeline, the PDI detection circuit configured to: detect whether a second fetched instruction of the plurality of fetched instructions in the instruction pipeline is a PDI, wherein the PDI is an instruction determined to cause, when executed by the execution circuit, a hazard that generates a precise interrupt; and in response to detecting that the second fetched instruction is the PDI: allocate an available fetch refill entry of a plurality of fetch refill entries in a pipeline fetch refill circuit; and store a source identifier of the second fetched instruction, comprising one of the PDI and a successor instruction, in a refill tag of the available fetch refill entry; and an instruction fetch reuse circuit coupled to the instruction pipeline, the instruction fetch reuse circuit configured to, in response to the pipeline flush event: determine whether a source identifier of a third fetched instruction of the plurality of fetched instructions matches the source identifier in the refill tag of a fetch refill entry, as a matching fetch refill entry, of the plurality of fetch refill entries in the pipeline fetch refill circuit; and in response to the source identifier of the third fetched instruction matching the source identifier in the refill tag of the fetch refill entry: insert one or more captured instructions in the matching fetch refill entry into the instruction pipeline after the instruction fetch circuit to be processed.

2. The processor of claim 1, wherein the instruction processing circuit further comprises: a decode circuit coupled to the instruction fetch circuit, the decode circuit configured to decode the first fetched instruction into a first decoded instruction; wherein the execution circuit is configured to: execute the first decoded instruction in the instruction pipeline; and in response to a hazard generated by executing the first decoded instruction, generate the pipeline flush event to flush the instruction pipeline; and wherein, in response to the source identifier of the third fetched instruction matching the source identifier in the refill tag of the fetch refill entry in the pipeline fetch refill circuit, the one or more captured instructions in the matching fetch refill entry are inserted between the decode circuit and the execution circuit in the instruction pipeline to be processed.

3. The processor of claim 1, wherein the instruction processing circuit is configured to, in response to the source identifier of the third fetched instruction matching the source identifier in the refill tag of the fetch refill entry: insert the one or more captured instructions in the matching fetch refill entry into the execution circuit in the instruction pipeline to be executed.

4. The processor of claim 1, wherein the instruction fetch reuse circuit is configured to, in response to the source identifier of the third fetched instruction matching the source identifier in the refill tag of the fetch refill entry: determine whether the third fetched instruction is a PDI; determine whether an instruction flow path of the third fetched instruction matches a refill path in the refill tag of the matching fetch refill entry; and in response to the instruction flow path of the third fetched instruction matching the refill path in the refill tag of the matching fetch refill entry, and the third fetched instruction being a PDI: insert the one or more captured instructions in the matching fetch refill entry into the instruction pipeline after the instruction fetch circuit to be processed.

5. The processor of claim 1, wherein: the instruction fetch reuse circuit is further configured to, in response to the source identifier of the third fetched instruction matching the source identifier in the refill tag of the fetch refill entry: communicate a fall-through source address in the matching fetch refill entry to the instruction fetch circuit; and the instruction fetch circuit is configured to fetch instructions starting from the fall-through source address in response to receiving the fall-through source address.

6. The processor of claim 1, wherein: the instruction fetch reuse circuit is further configured to, in response to the source identifier of the third fetched instruction matching the source identifier in the refill tag of the fetch refill entry: increase a usefulness in a useful indicator in the matching fetch refill entry in the pipeline fetch refill circuit, the useful indicator indicating the usefulness of the matching fetch refill entry.

7. The processor of claim 6, wherein: the instruction fetch reuse circuit is further configured to, in response to the source identifier of the third fetched instruction not matching the source identifier in the refill tag of the fetch refill entry: reduce the usefulness in the useful indicator in each of the plurality of fetch refill entries in the pipeline fetch refill circuit.

8. The processor of claim 7, wherein: the instruction fetch reuse circuit is further configured to: determine whether the usefulness in the useful indicator of a second fetch refill entry of the plurality of fetch refill entries in the pipeline fetch refill circuit is lower than a threshold usefulness; and in response to the usefulness in the useful indicator of the second fetch refill entry being lower than the threshold usefulness, release the second fetch refill entry in the pipeline fetch refill circuit.

9. The processor of claim 6, wherein: the instruction fetch reuse circuit is further configured to: in response to the source identifier of the third fetched instruction not matching the source identifier in the refill tag of the fetch refill entry, increase a global allocation in a global allocation failure indicator for the pipeline fetch refill circuit; and in response to the global allocation in the global allocation failure indicator exceeding a threshold global allocation, reduce the usefulness in the useful indicator in each of the plurality of fetch refill entries in the pipeline fetch refill circuit.

10. The processor of claim 9, wherein the instruction fetch reuse circuit is further configured to, in response to the usefulness in the useful indicator of a second fetch refill entry of the plurality of fetch refill entries in the pipeline fetch refill circuit being lower than a threshold usefulness, release the second fetch refill entry in the pipeline fetch refill circuit.

11. The processor of claim 1, wherein the PDI detection circuit is further configured to, in response to the source identifier of the third fetched instruction not matching the source identifier in the refill tag of the fetch refill entry: allocate an available fetch refill entry of the plurality of fetch refill entries in the pipeline fetch refill circuit; and store the source identifier of the third fetched instruction in the refill tag of the available fetch refill entry.

12. The processor of claim 1, wherein the PDI detection circuit is further configured to, in response to the source identifier of the third fetched instruction not matching the source identifier in the refill tag of the fetch refill entry: capture, in the allocated available fetch refill entry in the pipeline fetch refill circuit, one or more successor instructions following the third fetched instruction in the instruction pipeline.

13. The processor of claim 12, wherein: the plurality of instructions comprises a branch instruction; the instruction fetch circuit is configured to fetch the branch instruction into the instruction pipeline as a fetched branch instruction to be executed; the instruction processing circuit is configured to determine an instruction flow path of the branch instruction; the PDI detection circuit is configured to detect whether the fetched branch instruction in the instruction pipeline is a PDI; and the PDI detection circuit is further configured to, in response to detecting that the fetched branch instruction is a PDI and in response to the source identifier of the fetched branch instruction detected as the PDI not matching the source identifier in the refill tag of the fetch refill entry: store the instruction flow path of the fetched branch instruction detected as the PDI in a refill path of the available fetch refill entry.

14. The processor of claim 12, wherein the PDI detection circuit is further configured to, in response to the source identifier of the third fetched instruction not matching the source identifier in the refill tag of the fetch refill entry: determine whether a successor instruction of the one or more successor instructions is a PDI; and in response to determining that the successor instruction is a PDI: not capture, in the allocated available fetch refill entry in the pipeline fetch refill circuit, the successor instruction of the one or more successor instructions in the instruction pipeline that is determined to be the PDI.

15. The processor of claim 14, wherein the PDI detection circuit is further configured to, in response to determining that the successor instruction is a PDI: store the source identifier of the successor instruction determined to be the PDI as a fall-through source address in the allocated available fetch refill entry in the pipeline fetch refill circuit.

16. The processor of claim 13, wherein the PDI detection circuit is further configured to, in response to the source identifier of the third fetched instruction not matching the source identifier in the refill tag of the fetch refill entry: determine whether the pipeline fetch refill circuit is full; and in response to determining that the pipeline fetch refill circuit is full: not capture, in the allocated available fetch refill entry in the pipeline fetch refill circuit, a successor instruction of the one or more successor instructions in the instruction pipeline.

17. The processor of claim 16, wherein the PDI detection circuit is further configured to: store the source identifier of the successor instruction of the one or more successor instructions that was not captured in the allocated available fetch refill entry in the pipeline fetch refill circuit as a fall-through source address in the allocated available fetch refill entry in the pipeline fetch refill circuit.

18. The processor of claim 1, wherein the PDI detection circuit is further configured to, in response to the source identifier of the third fetched instruction not matching the source identifier in the refill tag of the fetch refill entry: increase a global allocation in a global allocation failure indicator for the pipeline fetch refill circuit; and the instruction processing circuit is further configured to: determine whether the global allocation in the global allocation failure indicator exceeds a threshold global allocation; in response to the global allocation in the global allocation failure indicator exceeding the threshold global allocation, reduce a usefulness in a useful indicator in each of the plurality of fetch refill entries in the pipeline fetch refill circuit; determine whether the usefulness in the useful indicator of a second fetch refill entry of the plurality of fetch refill entries in the pipeline fetch refill circuit is lower than a threshold usefulness; and in response to the usefulness in the useful indicator of the second fetch refill entry being lower than the threshold usefulness, release the second fetch refill entry in the pipeline fetch refill circuit.

19. The processor of claim 1, wherein: the plurality of instructions comprises a branch instruction having a branch behavior; the instruction fetch circuit is configured to fetch the branch instruction into the instruction pipeline as a fetched branch instruction to be executed; the instruction processing circuit further comprises a control flow prediction circuit configured to predict the branch behavior of the branch instruction; the execution circuit is configured to execute the branch instruction to generate a resolved branch behavior of the branch instruction; the instruction processing circuit is further configured to: determine whether the resolved branch behavior of the executed branch instruction matches the predicted branch behavior of the branch instruction; and update a branch predictor confidence corresponding to the branch instruction based on whether the resolved branch behavior matches the predicted branch behavior of the branch instruction; and the PDI detection circuit is configured to detect whether the branch instruction in the instruction pipeline is a PDI based on the branch predictor confidence of the branch instruction.

20. The processor of claim 1, wherein: the plurality of instructions comprises a memory operation instruction; the instruction fetch circuit is configured to fetch the memory operation instruction into the instruction pipeline as a fetched memory operation instruction to be executed; the execution circuit is configured to execute the memory operation instruction at a memory address of the memory operation instruction; the instruction processing circuit is further configured to, in response to the execution circuit generating the pipeline flush event to flush the instruction pipeline in response to the execution of the memory operation instruction, store a PDI indicator for the memory operation instruction identifying it as a PDI; and the PDI detection circuit is configured to detect whether the memory operation instruction in the instruction pipeline is a PDI based on the PDI indicator for the memory operation instruction.

21. A method of reusing fetched, flushed instructions in an instruction pipeline in a processor, comprising: fetching a plurality of instructions from program code as a plurality of fetched instructions to be executed in the instruction pipeline; executing a first fetched instruction of the plurality of fetched instructions in the instruction pipeline; in response to a hazard generated by executing the first fetched instruction, generating a pipeline flush event to flush the instruction pipeline; detecting whether a second fetched instruction in the instruction pipeline is a performance degradation instruction (PDI), wherein the PDI is an instruction determined to cause, when executed, a hazard that generates a precise interrupt; in response to detecting that the second fetched instruction is the PDI: allocating an available fetch refill entry of a plurality of fetch refill entries in a pipeline fetch refill circuit; and storing a source identifier of the second fetched instruction, comprising one of the PDI and a successor instruction, in a refill tag of the available fetch refill entry; and in response to the pipeline flush event: determining whether a source identifier of a third fetched instruction of the plurality of fetched instructions matches the source identifier in the refill tag of a fetch refill entry, as a matching fetch refill entry, of the plurality of fetch refill entries in the pipeline fetch refill circuit; and in response to the source identifier of the third fetched instruction matching the source identifier in the refill tag of the fetch refill entry: inserting one or more captured instructions in the matching fetch refill entry into the instruction pipeline as fetched instructions to be executed.

22. The method of claim 21, further comprising, in response to the source identifier of the third fetched instruction matching the source identifier in the refill tag of the fetch refill entry: determining whether the third fetched instruction is a PDI; determining whether an instruction flow path of the third fetched instruction matches a refill path in the refill tag of the matching fetch refill entry; and in response to the instruction flow path of the third fetched instruction matching the refill path in the refill tag of the matching fetch refill entry, and the third fetched instruction being a PDI: inserting the one or more captured instructions in the matching fetch refill entry into the instruction pipeline to be processed.

23. The method of claim 21, further comprising: detecting whether the third fetched instruction in the instruction pipeline is a PDI; and in response to detecting that the third fetched instruction is a PDI: allocating a second available fetch refill entry of the plurality of fetch refill entries in the pipeline fetch refill circuit; and storing a source identifier of one or more successor instructions following the third fetched instruction detected as the PDI in the refill tag of the allocated second available fetch refill entry.