56 results about "Branch misprediction" patented technology

A branch misprediction occurs when a central processing unit (CPU) guesses incorrectly which instruction to process next. Programs contain points where execution can continue in more than one way; these are called branches, or conditional jumps. The CPU also uses a pipeline, which allows several instructions to be processed at the same time. When a conditional jump is fetched, the CPU does not yet know which instruction to execute next and insert into the pipeline. This is where branch prediction, which is aimed at speeding up execution, comes in: it guesses the next instruction and inserts that instruction into the pipeline. Guessing wrong is called a branch misprediction. When a misprediction is detected, the partially processed instructions that entered the pipeline after the branch have to be discarded, and the pipeline has to start over at the correct branch target. This slows down program execution.
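
As a rough illustration of the description above, the following sketch counts mispredictions for a single branch. It assumes a 2-bit saturating counter as the predictor, which is a common textbook scheme and not something the text above specifies; the function name and the two outcome patterns are made up for the example.

```python
def count_mispredictions(outcomes):
    """Simulate one 2-bit saturating counter on a sequence of branch outcomes.

    outcomes: iterable of booleans, True meaning the branch was taken.
    Returns the number of mispredicted branches.
    """
    counter = 2  # states 0-1 predict not taken, 2-3 predict taken (assumed start: weakly taken)
    mispredictions = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken != taken:
            mispredictions += 1  # a real pipeline would discard and refetch here
        # Move the counter toward the observed outcome, saturating at 0 and 3.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return mispredictions


# A loop branch taken 9 times and then not taken once, repeated 10 times.
loop_pattern = ([True] * 9 + [False]) * 10
# A branch that alternates between taken and not taken.
alternating = [True, False] * 50

print(count_mispredictions(loop_pattern))
print(count_mispredictions(alternating))
```

Under these assumptions the loop branch is mispredicted only at each loop exit, while the alternating branch is mispredicted on every second execution, which is why hard-to-predict branches dominate the misprediction cost.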

Branch misprediction recovery mechanism for microprocessors

A system and method for reducing branch misprediction penalty. In response to detecting a mispredicted branch instruction, circuitry within a microprocessor identifies a predetermined condition prior to retirement of the branch instruction. Upon identifying this condition, the entire corresponding pipeline is flushed prior to retirement of the branch instruction, and instruction fetch is started at a corresponding address of an oldest instruction in the pipeline immediately prior to the flushing of the pipeline. The correct outcome is stored prior to the pipeline flush. In order to distinguish the mispredicted branch from other instructions, identification information may be stored alongside the correct outcome. One example of the predetermined condition being satisfied is in response to a timer reaching a predetermined threshold value, wherein the timer begins incrementing in response to the mispredicted branch detection and resets at retirement of the mispredicted branch.
Owner:ORACLE INT CORP

System and Method for Mitigating the Impact of Branch Misprediction When Exiting Spin Loops

A computer system may recognize a busy-wait loop in program instructions at compile time and/or may recognize busy-wait looping behavior during execution of program instructions. The system may recognize that an exit condition for a busy-wait loop is specified by a conditional branch type instruction in the program instructions. In response to identifying the loop and the conditional branch type instruction that specifies its exit condition, the system may influence or override a prediction made by a dynamic branch predictor, resulting in a prediction that the exit condition will be met and that the loop will be exited regardless of any observed branch behavior for the conditional branch type instruction. The looping instructions may implement waiting for an inter-thread communication event to occur or for a lock to become available. When the exit condition is met, the loop may be exited without incurring a misprediction delay.
Owner:ORACLE INT CORP
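
The effect of overriding the predictor for a spin-loop exit branch can be sketched as follows. This is only a toy model: the 2-bit counter standing in for the dynamic predictor, the function name, and the iteration count are assumptions, not details from the abstract.

```python
def simulate_spin_loop(wait_iterations, override_exit_prediction):
    """Return (mispredictions_while_waiting, exit_mispredicted) for one spin loop.

    The exit branch counts as "taken" when the loop exits. Without the
    override, a 2-bit saturating counter predicts it from observed behavior.
    """
    counter = 0  # 0-1 predict "keep spinning", 2-3 predict "exit"
    wait_misses = 0
    exit_mispredicted = False
    outcomes = [False] * wait_iterations + [True]  # spin N times, then exit
    for exits in outcomes:
        predicted_exit = True if override_exit_prediction else counter >= 2
        if predicted_exit != exits:
            if exits:
                exit_mispredicted = True
            else:
                wait_misses += 1
        counter = min(counter + 1, 3) if exits else max(counter - 1, 0)
    return wait_misses, exit_mispredicted


print(simulate_spin_loop(5, override_exit_prediction=False))
print(simulate_spin_loop(5, override_exit_prediction=True))
```

In this model the history-based predictor gets every waiting iteration right and then mispredicts the exit, whereas the overridden prediction is wrong while the thread is still waiting but correct at the moment the loop is actually exited.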

Method and apparatus for synthesizing hardware counters from performance sampling

A system and method for performance monitoring may use data collected from a hardware event agent comprising a hardware sampling mechanism and/or one or more hardware counters to increment one or more synthesized performance counters by an amount dependent on an expression involving the collected data. Each synthesized performance counter may be configured to count events of a different type and may comprise a machine addressable storage location. The event types may include various memory references or misses, branches, branch mispredictions, or any other event of interest in performance monitoring. The hardware event agent may comprise one or more instruction counters, cycle counters, timers, or other hardware performance counters. One hardware performance counter may be used in a time-multiplexed or data-multiplexed manner to monitor events of multiple event types. The hardware sampling mechanism may return a statistical packet for sampled instructions, which may be examined to determine the event type.
Owner:ORACLE INT CORP
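
One way to picture a synthesized counter is to scale each sampled event by the sampling interval. The sketch below assumes that simple expression; the sample layout (a dict with an "events" list), the event names, and the interval are hypothetical and not taken from the abstract.

```python
from collections import Counter


def synthesize_counters(samples, sampling_interval):
    """Estimate per-event totals from sampled instructions.

    samples: list of dicts, each with an "events" list naming the event
             types observed for that sampled instruction (assumed layout).
    sampling_interval: assumed number of instructions represented by each sample.
    """
    counters = Counter()
    for sample in samples:
        for event in sample["events"]:
            # Each sampled occurrence stands in for `sampling_interval` real ones.
            counters[event] += sampling_interval
    return counters


samples = [
    {"events": ["branch", "branch_mispredict"]},
    {"events": ["branch"]},
    {"events": ["cache_miss"]},
]
totals = synthesize_counters(samples, sampling_interval=1000)
print(dict(totals))
```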

Context switching within a data processing system having a branch prediction mechanism

A branch target buffer 10 is provided which maintains its entries across context switches within a virtually addressed system. Branch mispredictions are detected for individual entries 12 within the branch target buffer 10 and those individual entries are invalidated.
Owner:ARM LTD

Return address predictor using a copy of a top of stack pointer to track a last valid return address

A processor pipeline includes a return stack buffer (RSB) and a top of stack pointer (RSB_TOS) to indicate the status of buffer entries. A copy of the current RSB_TOS (C_TOS) is associated with each branch instruction that is detected at the front end of the pipeline. When the branch instruction is a call instruction that is predicted taken, an associated return address is pushed onto the RSB and the current RSB_TOS is updated. When the branch instruction is a return instruction that is predicted taken, the return address indicated by the current RSB_TOS pointer is popped from the RSB and the current RSB_TOS is updated. When a branch is determined to have been mispredicted, the associated C_TOS is adjusted according to the type of branch misprediction and RSB_TOS is updated with the adjusted C_TOS.
Owner:INTEL CORP
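
The checkpoint-and-restore idea in this abstract can be sketched as below. The class, the buffer size, the convention that the top-of-stack pointer indexes the next free slot, and the `adjustment` parameter are all assumptions made for the illustration; the abstract does not give these details.

```python
class ReturnStackBuffer:
    """Toy return stack buffer with a restorable top-of-stack pointer."""

    def __init__(self, size=8):
        self.entries = [None] * size
        self.tos = 0  # RSB_TOS: index of the next free slot (assumed convention)

    def checkpoint(self):
        """Return a copy of the current pointer (C_TOS) to attach to a branch."""
        return self.tos

    def push(self, return_address):
        """Call predicted taken: store the return address and advance the pointer."""
        self.entries[self.tos % len(self.entries)] = return_address
        self.tos += 1

    def pop(self):
        """Return predicted taken: step the pointer back and read the address."""
        self.tos -= 1
        return self.entries[self.tos % len(self.entries)]

    def recover(self, c_tos, adjustment=0):
        """On a misprediction, restore the pointer from the adjusted C_TOS."""
        self.tos = c_tos + adjustment


rsb = ReturnStackBuffer()
rsb.push(0x100)               # call on the correct path
c_tos = rsb.checkpoint()      # copy taken when a later branch is detected
rsb.push(0x200)               # wrong-path calls after the mispredicted branch
rsb.push(0x300)
rsb.recover(c_tos)            # misprediction detected: restore the pointer
print(hex(rsb.pop()))
```

Restoring only the pointer is cheap, but in this toy model it recovers the stack correctly only when the wrong path has not overwritten entries below the checkpoint.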

Method and system for changing the executable status of an operation following a branch misprediction without refetching the operation

A method and system for changing the executable status of an operation following a branch misprediction. In one embodiment, a method may include predicting an execution path of a first conditional branch operation stored in an entry of a trace cache, and in response to predicting the execution path, if a first operation stored in the entry of the trace cache is not in the execution path according to the prediction, assigning to the first operation a non-executable status indicative that the first operation is not in the execution path. The method may further include detecting that the prediction is incorrect subsequent to assigning the non-executable status to the first operation and assigning an executable status to the first operation in response to detecting the incorrect prediction, where the executable status is indicative that the first operation is in the execution path.
Owner:ADVANCED MICRO DEVICES INC

Apparatus and method for cycle accounting in microprocessors

An apparatus and method for cycle accounting for a microprocessor are disclosed, in which a performance monitor includes a plurality of silos, a prioritizer, and a combiner. The silos receive delay reason signals from the main processor pipeline, and output staged signals. The prioritizer receives the staged signals, and outputs a plurality of prioritized signals. The combiner selectively combines various of the prioritized signals, and provides signals indicative of microprocessor performance. Each silo includes, in series, a plurality of stages, with each stage containing a single latch. The stages of the silo are synchronized with the stages of the main processor pipeline. The performance monitor operates in real-time, at the same frequency as the microprocessor, and in parallel to the main processor pipeline, and correctly accounts for buffering effects of decoupling buffers. Outputted signals include various signals indicative of microprocessor performance, for example, cache misses, branch mispredictions, and so forth, but only for those miss-events that contribute to a program's visible delay, thereby providing an accurate picture of where cycles are being wasted.
Owner:INTEL CORP

Apparatus and method for cycle accounting in microprocessors

An apparatus and method for cycle accounting for a microprocessor are disclosed, in which a performance monitor includes a plurality of silos, a prioritizer, and a combiner. The silos receive delay reason signals from the main processor pipeline, and output staged signals. The prioritizer receives the staged signals, and outputs a plurality of prioritized signals. The combiner selectively combines various of the prioritized signals, and provides signals indicative of microprocessor performance. Each silo includes, in series, a plurality of stages, with each stage containing a single latch. The stages of the silo are synchronized with the stages of the main processor pipeline. The performance monitor operates in real-time, at the same frequency as the microprocessor, and in parallel to the main processor pipeline. Outputted signals include various signals indicative of microprocessor performance, for example, cache misses, branch mispredictions, and so forth, but only for those miss-events that contribute to a program's visible delay, thereby providing an accurate picture of where cycles are being wasted.
Owner:INTEL CORP

Processor core and method for managing branch misprediction in an out-of-order processor pipeline

A processor core and method for managing branch misprediction in an out-of-order processor pipeline. In one embodiment, the pipeline of the processor core includes a front-end instruction fetch portion, a back-end instruction execution portion, and pipeline control logic. Operation of the instruction fetch portion is decoupled from operation of the instruction execution portion. Following detection of a control transfer misprediction, operation of the instruction fetch portion is halted and instructions residing in the instruction fetch portion are invalidated. When the instruction associated with the misprediction reaches a selected pipeline stage, instructions residing in the instruction execution portion of the pipeline are invalidated and the flow of instructions from the instruction fetch portion to the instruction execution portion of the processor pipeline is restarted. A mispredict instruction identification checker and instruction identification tags are used to determine if a control transfer instruction is permitted to redirect instruction fetching.
Owner:ARM FINANCE OVERSEAS LTD

Redirect recovery cache that receives branch misprediction redirects and caches instructions to be dispatched in response to the redirects

In one embodiment, a processor comprises a branch resolution unit and a redirect recovery cache. The branch resolution unit is configured to detect a mispredicted branch operation, and to transmit a redirect address for fetching instructions from a correct target of the branch operation responsive to detecting the mispredicted branch operation. The redirect recovery cache comprises a plurality of cache entries, each cache entry configured to store operations corresponding to instructions fetched in response to respective mispredicted branch operations. The redirect recovery cache is coupled to receive the redirect address and, if the redirect address is a hit in the redirect recovery cache, the redirect recovery cache is configured to supply operations from the hit cache entry to a pipeline of the processor, bypassing at least one initial pipeline stage.
Owner:MEDIATEK INC

Next fetch predictor training with hysteresis

A system and method for efficient branch prediction. A processor includes two branch predictors. A first branch predictor generates branch prediction data, such as a branch direction and a branch target address. The second branch predictor generates branch prediction data at a later time and with higher prediction accuracy. Control logic may determine whether the branch prediction data from each of the first and the second branch predictors match. If a mismatch occurs, the first predictor may be trained with the branch prediction data generated by the second branch predictor. A stored indication of hysteresis may indicate a given branch instruction exhibits a frequently alternating pattern regarding its branch direction. Such behavior may lead to consistent branch mispredictions because the training is unable to keep up with the changing branch direction. When such a condition is determined to occur, the control logic may prevent training of the first predictor.
Owner:APPLE INC
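
A minimal sketch of training with hysteresis, under assumed rules: the fast predictor keeps one direction and one hysteresis bit per branch, a mismatch sets the bit when it triggers training, a match clears it, and a mismatch that arrives while the bit is set is treated as an alternating pattern and skipped. The class name and this exact policy are illustrative assumptions, not the mechanism claimed in the patent.

```python
class NextFetchPredictor:
    """Fast first predictor trained by a slower, more accurate second predictor."""

    def __init__(self):
        self.direction = {}   # branch address -> predicted taken?
        self.hysteresis = {}  # branch address -> True if the last comparison caused training

    def predict(self, pc):
        return self.direction.get(pc, False)

    def train(self, pc, second_prediction):
        """Compare with the second predictor; return True if training happened."""
        if self.predict(pc) == second_prediction:
            self.hysteresis[pc] = False
            return False
        if self.hysteresis.get(pc, False):
            # Back-to-back mismatches: assume an alternating branch, skip training.
            return False
        self.direction[pc] = second_prediction
        self.hysteresis[pc] = True
        return True


nfp = NextFetchPredictor()
alternating = [True, False] * 3
trainings = sum(nfp.train(0x40, outcome) for outcome in alternating)
print(trainings)
```

Without the hysteresis check, every one of the six alternating outcomes in this example would retrain the first predictor; with it, training happens only twice, while a branch that changes direction once and then stays stable is still picked up on the first mismatch.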

Method and apparatus for implementing and maintaining a stack of predicate values with stack synchronization instructions in an out of order hardware software co-designed processor

Embodiments of a method and apparatus for implementing and maintaining a stack of predicate values with stack synchronization instructions. In one embodiment the apparatus is an out of order hardware/software co-designed processor including instructions to explicitly manage the predicate register stack to maintain stack consistency across branches of execution that push a variable number of predicate values onto the predicate stack. In one embodiment the stack-based predicate register implementation enables early branch calculation and early branch misprediction recovery via early renaming of predicate registers.
Owner:INTEL CORP

Method and apparatus for managing instruction flushing in a microprocessor's instruction pipeline

In one or more embodiments, a processor includes one or more circuits to flush instructions from an instruction pipeline on a selective basis responsive to detecting a branch misprediction, such that those instructions marked as being dependent on the branch instruction associated with the branch misprediction are flushed. Thus, the one or more circuits may be configured to mark instructions fetched into the processor's instruction pipeline(s) to indicate their branch prediction dependencies, directly or indirectly detect incorrect branch predictions, and directly or indirectly flush instructions in the instruction pipeline(s) that are marked as being dependent on an incorrect branch prediction.
Owner:QUALCOMM INC
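
Selective flushing based on dependency marks can be sketched as follows. The `Instruction` type, the integer branch ids, and the example pipeline contents are hypothetical; the abstract only says that instructions are marked with their branch prediction dependencies.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    name: str
    # Ids of predicted, not yet resolved branches this instruction depends on.
    depends_on: frozenset = frozenset()


def selective_flush(pipeline, mispredicted_branch_id):
    """Keep only instructions not marked as dependent on the mispredicted branch."""
    return [
        insn for insn in pipeline
        if mispredicted_branch_id not in insn.depends_on
    ]


pipeline = [
    Instruction("add"),                        # fetched before any prediction
    Instruction("load", frozenset({1})),       # fetched under prediction 1
    Instruction("mul", frozenset({1, 2})),     # fetched under predictions 1 and 2
    Instruction("store", frozenset({2})),      # depends only on prediction 2
]
survivors = selective_flush(pipeline, mispredicted_branch_id=1)
print([insn.name for insn in survivors])
```

Compared with flushing the whole pipeline, only the marked instructions are discarded here, so work that does not depend on the wrong prediction is preserved.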

Systems and methods for reducing branch misprediction penalty

In a processing system capable of single and multi-thread execution, a branch prediction unit can be configured to detect hard-to-predict branches and loop instructions. In a dual-threading (simultaneous multi-threading) configuration, one instruction queue (IQ) is used for each thread and instructions are alternately sent from each IQ to decode units. In single thread mode, the second IQ can be used to store the “not predicted path” of the hard-to-predict branch or the “fall-through” path of the loop. On mis-prediction, the mis-prediction penalty is reduced by getting the instructions from the IQ instead of the instruction cache.
Owner:NXP USA INC

Method and apparatus for managing a link return stack

In one or more embodiments, a processor includes a link return stack circuit used for storing branch return addresses, wherein a link return stack controller is configured to determine that one or more entries in the link return stack are invalid as being dependent on a mispredicted branch, and to reset the link return stack to a valid remaining entry, if any. In this manner, branch mispredictions cause dependent entries in the link return stack to be flushed from the link return stack, or otherwise invalidated, while preserving the remaining valid entries, if any, in the link return stack. In at least one embodiment, a branch information queue used for tracking predicted branches is configured to store a marker indicating whether a predicted branch has an associated entry in the link return stack, and it may store an index value identifying the specific, corresponding entry in the link return stack.
Owner:STEMPEL BRIAN MICHAEL +3

Processor with second jump execution unit for branch misprediction

A secondary jump execution unit (JEU) is incorporated in a micro-processor to operate concurrently with a primary JEU, enabling the execution of simultaneous branch operations with possible detection of multiple branch mispredicts. When branch operations are executed on both JEUs in a same instruction cycle, mispredict processing for the secondary JEU is skidded into the primary JEU's dispatch pipeline such that the branch processing for the secondary JEU occurs after processing of the branch for the primary JEU and while the primary JEU is not processing a branch. Moreover, in cases when a nuke command is also received from a reorder buffer of the processor, the branch processing for the secondary JEU is further delayed to accommodate processing of the nuke on the primary JEU. Further embodiments support the promotion of the secondary JEU to have access to the mispredict mechanisms of the primary JEU in certain circumstances.
Owner:INTEL CORP

Condition indicator for use by a conditional branch instruction

A branch prediction method and system are provided that accurately predict a branch condition early in an instruction pipeline of a data processing system. By accurately predicting the branch condition, the correct target instruction can be fetched early, thereby avoiding many of the inefficiencies associated with branch mispredictions. To accurately predict if a branch condition is satisfied, one or more pre-calculated status bits are stored along with a digital value that is read by the conditional branch instruction to determine if the branch condition is satisfied. By including such a status bit, the condition of the conditional branch instruction may be immediately determined, without waiting for the instruction to be processed by an arithmetic unit or the like in a subsequent pipeline stage.
Owner:UNISYS CORP

Recovering a subordinate strand from a branch misprediction using state information from a primary strand

Embodiments of the present invention provide a system that executes program code in a processor. The system starts by executing the program code in a normal mode using a primary strand while concurrently executing the program code ahead of the primary strand using a subordinate strand in a scout mode. Upon resolving a branch using the subordinate strand, the system records a resolution for the branch in a speculative branch resolution table. Upon subsequently encountering the branch using the primary strand, the system uses the recorded resolution from the speculative branch resolution table to predict a resolution for the branch for the primary strand. Upon determining that the resolution of the branch was mispredicted for the primary strand, the system determines that the subordinate strand mispredicted the branch. The system then recovers the subordinate strand to the branch and restarts the subordinate strand executing the program code.
Owner:ORACLE INT CORP

Pipelined microprocessor with fast non-selective correct conditional branch instruction resolution

A microprocessor includes a pipeline of stages for processing instructions and first and second types of conditional branch instruction includable by a program. The microprocessor makes a prediction of conditional branch instructions of the first type and flushes the pipeline of instructions if the prediction is subsequently determined to be incorrect, thereby incurring a branch misprediction penalty related to processing of conditional branch instructions of the first type. The microprocessor always correctly resolves conditional branch instructions of the second type without making a prediction of conditional branch instructions of the second type, thereby avoiding ever incurring a branch misprediction penalty related to processing of conditional branch instructions of the second type.
Owner:VIA TECH INC

Method and apparatus for correcting an internal call/return stack in a microprocessor that detects from multiple pipeline stages incorrect speculative update of the call/return stack

An internal call/return stack (CRS) correction apparatus in a pipelined microprocessor is disclosed. Each time the microprocessor updates the CRS in response to a call or return instruction (call/ret), the microprocessor also stores correction information into a first correction stack. The microprocessor includes two distinct stages that detect invalidating events, such as a branch misprediction or exception. Once a call/ret passes the first detecting stage, the correction information associated with that call/ret is moved from the first correction stack to a second correction stack. If an invalidating event is detected at the upper detecting stage, then only the correction information in the first stack is used to correct the CRS. However, if an invalidating event is detected at the lower detecting stage, then the correction information in both the first and second stack is used to correct the CRS.
Owner:IP FIRST

Processor using branch instruction execution cache and method of operating the same

A processor using a branch instruction execution cache and a method of operating the same are disclosed. The processor according to an example embodiment of the present invention includes a fetch unit, a branch prediction unit, an instruction queue, a decoding unit and an execution unit operating in a pipeline manner, and includes a branch instruction execution cache that stores address and decode information of a transferred instruction output from the decoding unit, and provides the stored address and at least some of pieces of the decode information to the execution unit in order to overcome branch misprediction when the execution unit determines the branch misprediction. Therefore, with the processor according to an example embodiment of the present invention, overhead of pipeline initialization can be minimized to prevent performance degradation of the processor and reduce power consumption of the processor.
Owner:ELECTRONICS & TELECOMM RES INST

Preventing update training of first predictor with mismatching second predictor for branch instructions with alternating pattern hysteresis

A system and method for efficient branch prediction. A processor includes two branch predictors. A first branch predictor generates branch prediction data, such as a branch direction and a branch target address. The second branch predictor generates branch prediction data at a later time and with higher prediction accuracy. Control logic may determine whether the branch prediction data from each of the first and the second branch predictors match. If a mismatch occurs, the first predictor may be trained with the branch prediction data generated by the second branch predictor. A stored indication of hysteresis may indicate a given branch instruction exhibits a frequently alternating pattern regarding its branch direction. Such behavior may lead to consistent branch mispredictions because the training is unable to keep up with the changing branch direction. When such a condition is determined to occur, the control logic may prevent training of the first predictor.
Owner:APPLE INC