
36 results about "Delay slot" patented technology

In computer architecture, a delay slot is an instruction slot that gets executed without the effects of a preceding instruction. The most common form is a single arbitrary instruction located immediately after a branch instruction on a RISC or DSP architecture; this instruction will execute even if the preceding branch is taken. Thus, by design, the instructions appear to execute in an illogical or incorrect order. It is typical for assemblers to automatically reorder instructions by default, hiding the awkwardness from assembly developers and compilers.
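
The behaviour described above can be illustrated with a toy interpreter. This is only a sketch under stated assumptions: the two-opcode instruction set (`emit`, `jump`), the `run` function and the sample program are hypothetical and do not correspond to any real architecture, and a jump placed inside a delay slot is out of scope.

```python
# Toy machine with a single branch delay slot (illustrative only).
def run(program, delay_slot=True):
    """Execute (op, arg) tuples and return the list of emitted values."""
    trace = []
    pc = 0
    pending_target = None  # target of a taken jump whose delay slot has not run yet
    while pc < len(program):
        op, arg = program[pc]
        next_pc = pc + 1
        if pending_target is not None:
            # The current instruction sits in the delay slot: it still
            # executes, and only afterwards does control transfer.
            next_pc = pending_target
            pending_target = None
        if op == "emit":
            trace.append(arg)
        elif op == "jump":
            if delay_slot:
                pending_target = arg   # defer the transfer by one instruction
            else:
                next_pc = arg          # transfer immediately
        pc = next_pc
    return trace


program = [
    ("emit", "a"),
    ("jump", 4),
    ("emit", "slot"),      # delay slot of the jump above
    ("emit", "skipped"),
    ("emit", "target"),
]

print(run(program))                    # delay-slot machine
print(run(program, delay_slot=False))  # machine without a delay slot
```

On the delay-slot machine the instruction after the jump runs before the target is reached; without a delay slot it is skipped.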

Thread suspension system and method using trapping instructions in delay slots

By encoding an exception triggering value in storage referenced by an instruction in the delay slot of a delayed control transfer instruction coinciding with a safe point, an efficient coordination mechanism can be provided for multi-threaded code. Because the mechanism(s) impose negligible overhead when not employed and can be engaged in response to an event (e.g., a start garbage collection event), safe points can be defined at call, return and / or backward branch points throughout mutator code to reduce the latency between the event and suspension of all threads. Though particularly advantageous for thread suspension to perform garbage collection at safe points, the techniques described herein are more generally applicable to program suspension at coordination points coinciding with calls, returns, branches or calls, returns and branches therein.
Owner:ORACLE INT CORP
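
The coordination idea in the abstract above can be sketched in software. This is a simulation under assumptions, not the patented mechanism: `PollCell`, `TRAP_VALUE`, `SafepointTrap` and `run_mutator` are hypothetical names, and the explicit comparison in `delay_slot_access` stands in for a hardware trap that on a real machine would cost nothing while the trap value is not installed.

```python
TRAP_VALUE = -1  # hypothetical "exception triggering value"


class SafepointTrap(Exception):
    """Stands in for the hardware trap raised by the delay-slot instruction."""


class PollCell:
    """Storage referenced by the instruction placed in the delay slot."""

    def __init__(self):
        self.value = 0  # benign value: the access completes normally


def delay_slot_access(cell):
    # On real hardware the access itself would trap; here it is simulated.
    if cell.value == TRAP_VALUE:
        raise SafepointTrap()
    return cell.value


def run_mutator(cell, iterations, request_gc_at=None):
    """Run loop iterations with a safe point at each backward branch."""
    done = 0
    for i in range(iterations):
        if i == request_gc_at:
            # Stands in for another thread arming the trap on a GC request.
            cell.value = TRAP_VALUE
        done += 1                      # the mutator's useful work
        try:
            delay_slot_access(cell)    # safe point in the branch delay slot
        except SafepointTrap:
            return ("suspended", done)
    return ("finished", done)


print(run_mutator(PollCell(), 5))
print(run_mutator(PollCell(), 5, request_gc_at=2))
```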

System and method for state restoration in a diagnostic module for a high-speed microprocessor

A system and method are presented for saving and restoring the state of a diagnostic module in a microprocessor. The diagnostic module contains a complex break state machine, capable of halting the microprocessor at specified breakpoints. These breakpoints are based on combinations of instruction locations and / or data values, along with previous machine states. A problem occurs with prior art diagnostic modules when the processor returns from an exception occurring during a fix-up cycle inserted to handle a data load miss associated with an instruction located in a branch delay slot (the location immediately following a conditional branch instruction). Under these circumstances, the exception handler restores the program counter to the location of the branch instruction, causing the branch to be re-executed. The prior art state machine erroneously updates its internal state a second time when the branch is re-executed. According to the system and method disclosed herein, at each state change the previous machine state is saved. Thus, when a branch instruction is re-executed, the complex break state machine of the present invention is restored to its previous state, thereby correcting the error.
Owner:AVAGO TECH WIRELESS IP SINGAPORE PTE

Method and apparatus for suppressing duplicative prefetches for branch target cache lines

A system that suppresses duplicative prefetches for branch target cache lines. During operation, the system fetches a first cache line into a fetch buffer. The system then prefetches a second cache line, which immediately follows the first cache line, into the fetch buffer. If a control transfer instruction in the first cache line has a target instruction which is located in the second cache line, the system determines if the control transfer instruction is also located at the end of the first cache line so that a corresponding delay slot for the control transfer instruction is located at the beginning of the second cache line. If so, the system suppresses a subsequent prefetch for a target cache line containing the target instruction because the target instruction is located in the second cache line which has already been prefetched.
Owner:ORACLE INT CORP
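
The suppression condition in the abstract above reduces to a small predicate. The following is a minimal sketch, assuming 64-byte cache lines and 4-byte instructions; `LINE_SIZE`, `INSN_SIZE` and the function names are illustrative and not taken from the patent.

```python
LINE_SIZE = 64  # bytes per cache line (assumed)
INSN_SIZE = 4   # bytes per instruction (assumed)


def line_of(addr):
    """Index of the cache line that contains a byte address."""
    return addr // LINE_SIZE


def suppress_target_prefetch(cti_addr, target_addr):
    """Return True when the prefetch of the target line would be duplicative.

    cti_addr is the address of the control transfer instruction in the
    first cache line; the second line is the one that follows it and is
    assumed to have been prefetched already.
    """
    second_line = line_of(cti_addr) + 1
    delay_slot_addr = cti_addr + INSN_SIZE
    # The CTI is the last instruction of its line exactly when its delay
    # slot falls at the start of the following line.
    cti_is_last_in_line = line_of(delay_slot_addr) == second_line
    target_in_second_line = line_of(target_addr) == second_line
    return cti_is_last_in_line and target_in_second_line


print(suppress_target_prefetch(60, 80))   # CTI ends line 0, target in line 1
print(suppress_target_prefetch(56, 80))   # delay slot still in line 0
print(suppress_target_prefetch(60, 200))  # target lies in a later line
```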

Thread suspension system and method using trapping instructions

By encoding an exception triggering value in storage referenced by an instruction in an otherwise unused slot (e.g., the delay slot of a delayed control transfer instruction or an unused instruction position in a VLIW-based architecture) coinciding with a safe point, an efficient coordination mechanism can be provided for multi-threaded code. Because the mechanism(s) impose negligible overhead when not employed and can be engaged in response to an event (e.g., a start garbage collection event), safe points can be defined at call, return and / or backward branch points throughout mutator code to reduce the latency between the event and suspension of all threads. Though particularly advantageous for thread suspension to perform garbage collection at safe points, the techniques described herein are more generally applicable to program suspension at coordination points coinciding with calls, returns, branches or calls, returns and branches therein.
Owner:ORACLE INT CORP

System and method for processing jump instruction of microprocessor in branch prediction way

The invention discloses a system and a method for processing a jump instruction of a microprocessor in a branch prediction way. The system comprises a coding module and a transmission module. The coding module comprises a branch predictor. After the coding module determines through precoding that an instruction to be processed is a jump instruction and determines the type of the jump instruction, the branch predictor predicts with a static prediction method when the jump instruction is of the jump execution type, or with a dynamic prediction method when it is not, and directly writes the jump instruction and its delay slot instruction into an operational queue in program order. The transmission module comprises a prediction result processor which, when the branch predictor has mispredicted the jump instruction, cancels the erroneously executed instructions and continues fetching in the correct jump direction after the jump instruction is executed and written back to the transmission module. When instructions are cancelled, the system uses different cancellation methods depending on whether the instruction is a jump execution instruction.
Owner:LOONGSON TECH CORP

Instruction branch prediction method and system

Active · CN104793921A · Solve pipeline stalls · Eliminate executive inefficiencies · Concurrent instruction execution · Parallel computing · Assembly line
The invention discloses an instruction branch prediction method and system. After the current instruction is executed, a preset number of upcoming instructions in the sequential execution direction are fetched and cached, and the first cached instruction is checked to see whether it is a jump instruction. If it is, the jump target address is calculated and the instruction at the jump target is cached. When the jump condition is met, the cached target instruction is read directly and executed; if the jump condition is not met, the second cached instruction in the sequential direction is read directly. Because the instructions that follow the jump are already fetched and cached whether or not the jump condition is met, bidirectional branch prediction is achieved through jump instruction pre-judgment, instruction prefetching, instruction caching and direct reads from the cache. Pipeline stalls caused by jump instructions are eliminated, and the loss of execution efficiency caused by inserting branch delay slots is avoided.
Owner:SHENZHEN CHIPSBANK TECH

Method and system for scheduling delay slot in very-long instruction word structure

The invention discloses a method and a system for scheduling a delay slot in a very-long instruction word structure. The method comprises the steps of locally scheduling instructions in a current basic block; after the local scheduling is finished, judging whether a residual instruction delay slot exists; if not, ending the scheduling; otherwise, putting a high-cost instruction which can be filled into the instruction delay slot into a local standby instruction cache; globally scheduling instructions in a basic block of a branch target, selecting an instruction which can be filled into the instruction delay slot and placing the instruction in a global standby instruction cache; and selecting an instruction from the local standby instruction cache and / or the global standby instruction cache and filling the instruction into the residual instruction delay slot. The system comprises a local scheduling unit, a global scheduling unit and a balanced scheduling unit. By balancing delay slot scheduling against program parallelism, and local scheduling against global scheduling, the method and the system achieve high execution efficiency of programs.
Owner:INST OF ACOUSTICS CHINESE ACAD OF SCI

Information processor having delayed branch function with storing delay slot information together with branch history information

Inactive · US7546445B2 · Reduce branch penalty · Early confirmation of number of step · Digital computer details · Next instruction address formation · Information processor · Delay slot
In correspondence with an address of a branch instruction, a branch target address Apb, a valid bit V as branch history information, and delay slot information POS on the last position of delay slot instructions are stored in a branch target buffer 241. A branch prediction circuit 23 outputs hit information H / M as to whether or not an input address Ao is coincident with the branch instruction address, the valid bit which is also a branch prediction bit, the information POS, and the branch target address Apb. When a prediction error signal ERR is inactive, the address selection circuit 22 selectively outputs the output of an incrementer 21 and the branch target address Apb, based on the hit information H / M, the delay slot information POS, and the valid bit V.
Owner:FUJITSU LTD

Method for identifying basic blocks with conditional delay slot instructions

A first tag is assigned to a branch instruction. Dependent on the type of branch instruction, a second tag is assigned to an instruction in the branch delay slot of the branch instruction. If the branch is mispredicted, the first tag is broadcast to pipeline stages that may have speculative instructions, and the first tag is compared to tags in the pipeline stages to determine which instructions to cancel. The assignment of tags for a fetch group of concurrently fetched instructions may be performed in parallel. A plurality of branch sequence numbers may be generated, and one of the plurality may be selected for each instruction responsive to the cumulative number of branch instructions preceding that instruction within the fetch group. The selection may be further responsive to whether or not the instruction is in a conditional delay slot.
Owner:AVAGO TECH INT SALES PTE LTD

Method for cancelling speculative conditional delay slot instructions

A first tag is assigned to a branch instruction. Dependent on the type of branch instruction, a second tag is assigned to an instruction in the branch delay slot of the branch instruction. The second tag may equal the first tag if the branch delay slot is unconditional for that branch, and may equal a different tag if the branch delay slot is conditional for the branch. If the branch is mispredicted, the first tag is broadcast to pipeline stages that may have speculative instructions, and the first tag is compared to tags in the pipeline stages. If the tag in a pipeline stage matches the first tag, the instruction is not cancelled. If the tag mismatches, the instruction is cancelled.
Owner:AVAGO TECH WIRELESS IP SINGAPORE PTE
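
The tag comparison in the abstract above can be sketched as follows. The abstract only says a conditional delay slot receives "a different tag"; using `branch_tag + 1`, and giving the same tag to the speculative instructions fetched after the slot, are assumptions made for this illustration. The function names and the `spec0`, `spec1` labels are hypothetical, and only instructions from the branch onward are modelled.

```python
def assign_tags(branch_tag, slot_is_conditional, n_following):
    """Tag a branch, its delay slot, and the speculative instructions after it."""
    next_tag = branch_tag + 1  # assumed choice of "different tag"
    # An unconditional delay slot executes regardless of the branch outcome,
    # so it shares the branch's tag; a conditional one does not.
    slot_tag = next_tag if slot_is_conditional else branch_tag
    tags = {"branch": branch_tag, "delay_slot": slot_tag}
    for i in range(n_following):
        tags[f"spec{i}"] = next_tag
    return tags


def survivors_after_mispredict(tags, broadcast_tag):
    """Keep instructions whose tag matches the broadcast tag; cancel the rest."""
    return [name for name, tag in tags.items() if tag == broadcast_tag]


unconditional = assign_tags(branch_tag=7, slot_is_conditional=False, n_following=2)
conditional = assign_tags(branch_tag=7, slot_is_conditional=True, n_following=2)

print(survivors_after_mispredict(unconditional, 7))
print(survivors_after_mispredict(conditional, 7))
```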

Jump source list processing method, jump source list processing device and compiler

The invention provides a jump source list processing method, a jump source list processing device and a compiler. The jump source list processing method includes acquiring the identifier of a jump target that corresponds to n jump instructions, where n is an integer greater than or equal to 2; the identifier is used as a pointer to the delay slot behind each of the n jump instructions, and the address information of the corresponding jump instructions in the code buffer is stored in the delay slots the pointer points to. Because the jump instruction address information of the jump source list is stored in the delay slots, the memory overhead of setting aside dedicated memory to hold that information in a list structure is avoided. After the address of the jump target is determined, the target addresses of the n jump instructions can be modified in the code buffer in a single traversal. Compared with the prior art, the method reduces the number of traversals needed to modify the n jump instructions according to the jump target address and improves the efficiency of modifying them.
Owner:LOONGSON TECH CORP
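
One way to read the abstract above is as a backpatching list threaded through the delay slots of the pending jumps. The sketch below follows that reading and is an interpretation, not the patented implementation: the list-based code buffer, the `Label` class, the `link` placeholder and the functions `emit_jump` and `bind` are all hypothetical.

```python
END = -1  # marks the end of the chain of pending jumps


class Label:
    """Identifier of a jump target whose address is not yet known."""

    def __init__(self):
        self.head = END  # index of the most recent pending delay slot


def emit_jump(code, label):
    """Emit a jump with an unknown target plus its delay slot."""
    code.append(["jump", None])
    # The delay slot temporarily stores the link to the previous pending
    # jump, so no separate list structure is needed.
    code.append(["link", label.head])
    label.head = len(code) - 1


def bind(code, label, target):
    """Patch every pending jump to the label in one traversal of the chain."""
    slot = label.head
    patched = 0
    while slot != END:
        previous = code[slot][1]
        code[slot - 1][1] = target   # the jump sits right before its slot
        code[slot] = ["nop", None]   # restore the delay slot to a harmless nop
        slot = previous
        patched += 1
    label.head = END
    return patched


code = []
label = Label()
emit_jump(code, label)
code.append(["add", None])
emit_jump(code, label)
target = len(code)
code.append(["ret", None])
patched = bind(code, label, target)
print(patched, code)
```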

Processing method for calling subprogram of microprocessor, and device for same

Inactive · CN102360283A · Improve performance · Flexible allocation of the number of instructions · Concurrent instruction execution · Parallel computing · Delay slot
The invention discloses a processing method for calling a subprogram of a microprocessor, and a device for the same. The method comprises the following steps of: obtaining a subprogram calling instruction which carries an unsigned immediate; extracting the unsigned immediate from the subprogram calling instruction; obtaining the total number of the instructions in a delay slot according to the unsigned immediate; and obtaining the return address of the subprogram calling instruction according to the total number of the instructions in the delay slot. The device comprises the following units: an obtaining unit for obtaining the subprogram calling instruction, an extraction unit for extracting the unsigned immediate from the subprogram calling instruction, and a calculation unit for obtaining the total number of the instructions in the delay slot according to the unsigned immediate and calculating the return address of the subprogram calling instruction according to the total number of the instructions in the delay slot. According to the method and the device for the same, performance of the processor in the aspect of processing the subprogram calling instruction is greatly improved.
Owner:INST OF ACOUSTICS CHINESE ACAD OF SCI
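
The return address calculation in the abstract above can be sketched as simple arithmetic. The encoding is assumed, not taken from the patent: a 4-bit immediate in the low bits of the instruction word, an immediate that equals the delay slot count directly, and fixed 4-byte instructions. `IMM_MASK`, `INSN_BYTES` and the function names are hypothetical.

```python
INSN_BYTES = 4   # fixed instruction width (assumed)
IMM_MASK = 0xF   # hypothetical position of the unsigned immediate


def extract_delay_slot_count(call_word):
    """Extract the unsigned immediate from a subprogram call instruction word."""
    return call_word & IMM_MASK


def return_address(call_pc, delay_slot_count):
    """Address of the first instruction after the call and its delay slots."""
    return call_pc + (1 + delay_slot_count) * INSN_BYTES


count = extract_delay_slot_count(0xABCD0003)
print(count, hex(return_address(0x1000, count)))
```

Letting the call instruction carry the count means the number of delay-slot instructions can vary from call to call, which matches the "flexible allocation of the number of instructions" benefit listed for this entry.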

Apparatus and Method for Branch Instruction Bonding

A processor is configured to identify a branch instruction immediately followed by an architectural delay slot. A single bonded instruction comprising the branch instruction immediately followed by the architectural delay slot is created. The single bonded instruction is loaded into an instruction buffer.
Owner:MIPS TECH INC

Multi-table branch prediction circuit for predicting a branch's target address based on the branch's delay slot instruction address

A first storage unit stores an address of a branching instruction and a branched address. A first detector detects whether or not an instruction of the present address has previously been branched from an output of the first storage unit. When the first detector detects previous branching of the instruction of the present address, the second storage unit stores the branched address corresponding to the address of the instruction to be executed following the branching instruction. When a second detector detects an output of a program counter as the address of the instruction to be executed following the branching instruction, the second storage unit outputs the branched address.
Owner:KK TOSHIBA

Method for sending information and method and apparatus for receiving information

The invention discloses a method for sending information and a method and device for receiving information. The device includes a channel estimation unit, a channel deviation correction unit, a demodulation and hard-decision unit, a channel acquisition unit and a time slot delay unit. In the sending method, the uplink power control information unit is set to include pilot information in the first K time slots and control information, including the power control command word, in the last L time slots, where K and L are positive integers; the sending end then sends the uplink power control information unit. In the receiving method, the receiving end performs channel estimation based on the pilot information of the previous K time slots, then performs channel correction, demodulation and hard-decision processing on the control information of the current time slot to obtain the actual control information and the channel information of the current time slot; after a delay of one time slot, channel estimation is performed according to the obtained channel information. The invention can reduce the power used to send the uplink power control information unit, thereby reducing the self-interference of the system and improving its capacity.
Owner:XFUSION DIGITAL TECH CO LTD

Array processor capable of avoiding load-use hazard stalls under dual-mode instruction set architecture

The invention belongs to the technical field of reconfigurable computing and particularly relates to an array processor under a dual-mode instruction set architecture that avoids load-use hazard stalls. It addresses the problem that existing traditional instruction set architectures must stall the pipeline when they encounter a load-use hazard. The array processor comprises an array processor body, which comprises a global controller, a processing element array, an instruction memory and a data memory; the instruction memory is connected with the global controller, the global controller is connected with the processing element array, and the processing element array is connected with the data memory. The array processor does not adopt a traditional branch delay slot design and generates neither pipeline stalls nor pipeline flushes for any existing hazard, so the hardware circuit design is greatly simplified, area is saved, power consumption is reduced, and performance is improved.
Owner:XIAN UNIV OF POSTS & TELECOMM

Effective elimination of delay slot handling from a front section of a processor pipeline

Architectural techniques and implementations that defer enforcement of certain delayed control transfer instruction (DCTI) sequencing constraints or conventions to later stages of an execution pipeline are described. In this way, complexity of a processor pipeline front-end (including fetch sequencing) can be simplified, at least in-part, by fetching instructions generally without regard to such constraints or conventions. Instead, enforcement of such sequencing constraints and / or conventions may be deferred to one or more pipeline stages associated with commitment or retirement of instructions. Higher fetch bandwidth may be achieved in some realizations when, for example, DCTI couples are encountered in an execution sequence.
Owner:ORACLE INT CORP

A dynamic binary translation method and device for a VLIW architecture

The invention discloses a dynamic binary translation method and device for a VLIW architecture. The method comprises: obtaining a basic block; checking whether the execution delay slot queue holds a delay operation left over from executing the previous basic block; if so, translating the basic block in an original mode; if not, translating the basic block in a fast mode, in which the translation delay slot queue is checked for a delay operation delayed to the current cycle; if one exists, it is translated directly into native code for the corresponding operation and removed from the queue; the instruction of the current cycle is then translated, and if it has a delay operation, that operation is written into the translation delay slot queue; after the basic block has been translated, any delay operation still left is carried over to the execution delay slot queue; and the translated native code is executed in the fast mode or the original mode. The invention can improve the performance of executing the translated program.
Owner:康烁

Instruction processing method and device

The invention discloses an instruction processing method and device. The method comprises: when a transfer instruction in a program is decoded at the decode stage of a pipeline, determining the number of delay slots corresponding to the transfer instruction; when the execute stage of the pipeline executes the transfer instruction, determining the target pipeline stage occupied by that number of delay slots in the pipeline, and flushing the instructions in the pipeline stages after the target pipeline stage; after the transfer instruction is executed, transferring to a target address; and, while the second instruction is being fetched from the target address, executing the first instruction in the execute stage in sequence, so that pipeline stages that would otherwise be unoccupied before the second instruction executes are utilized. The method and device address the waste of processor resources in the related art caused by the difficulty of fully utilizing the pipeline stages while the processor is working.
Owner:北京中科晶上科技股份有限公司

Method and apparatus for jump delay slot control in pipelined processor

A method of managing the configuration, design parameters, and functionality of an integrated circuit (IC) design using a hardware description language (HDL). Instructions can be added, subtracted, or generated by the designer interactively during the design process, and customized HDL descriptions of the IC design are generated through the use of scripts based on the user-edited instruction set and inputs. The customized HDL description can then be used as the basis for generating 'makefiles' for purposes of simulation and / or logic level synthesis. The method further affords the ability to generate an HDL model of a complete device, such as a microprocessor or DSP. A computer program implementing the aforementioned method and a hardware system for running the computer program are also disclosed.
Owner:SYNOPSYS INC