Method and system for scheduling delay slot in very-long instruction word structure

A delay slot scheduling method for very long instruction word (VLIW) structures, applied to concurrent instruction execution and machine execution devices. It addresses the problems of slowed compilation, reduced actual program efficiency, and high cost, and achieves high execution efficiency.

Inactive Publication Date: 2013-01-16
INST OF ACOUSTICS CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

In local scheduling, if the basic block contains no suitable instruction to choose from, the delay slot is filled with a no-op instruction. In global scheduling, once no suitable instruction is found in the basic block, a suitable instruction is selected from other basic blocks according to certain constraint rules and used to fill the delay slot; that is, global scheduling allows code movement across basic block boundaries. If this also fails, a no-op instruction is used. However, implementing global scheduling in the compiler is relatively costly and directly affects compilation speed.
[0005] In traditional DSP processor architecture design, usually only a local scheduling algorithm is used, or global scheduling is restricted by a trade-off against compiler efficiency out of concern for compilation speed. A local scheduling algorithm, however, has very few candidate target instructions available within a local function fragment, and when the compiler is fully optimized, the stronger logical dependencies between instructions in the fragment lead to low delay slot utilization.
When the traditional DSP processor architecture design scheme is applied to an architecture based on VLIW technology, its impact on instruction parallelism is not evaluated, and the use of delay slots may damage instruction parallelism and reduce the actual efficiency of the program.
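
The traditional fallback order described above (local candidate, then a candidate from another basic block, then a no-op) can be sketched roughly as follows. This is only an illustration: the `Instr` type, the `movable` flag, and the function names are hypothetical and do not come from the patent, and a real compiler would decide legality through dependence analysis rather than a precomputed flag.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Instr:
    name: str
    movable: bool  # hypothetical flag: True if the instruction may legally fill the slot


NOP = Instr("nop", movable=True)


def pick(candidates: Sequence[Instr]) -> Optional[Instr]:
    """Return the first instruction that may fill the delay slot, if any."""
    return next((i for i in candidates if i.movable), None)


def fill_delay_slot(local_block: Sequence[Instr],
                    other_blocks: Sequence[Sequence[Instr]],
                    use_global: bool = True) -> Instr:
    # Local scheduling: look only inside the current basic block.
    choice = pick(local_block)
    # Global scheduling: look across basic block boundaries.
    if choice is None and use_global:
        for block in other_blocks:
            choice = pick(block)
            if choice is not None:
                break
    # Last resort: fill the slot with a no-op.
    return choice if choice is not None else NOP
```

With `use_global=False` the sketch behaves like a local-only scheduler, which is the case where delay slots most often end up holding no-ops.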



Examples


Embodiment Construction

[0020] The technical solutions of the present application will be described in further detail below with reference to the drawings and embodiments.

[0021] The embodiment of the present invention proposes a delay slot scheduling method under the VLIW structure that fills instruction delay slots at the assembly level. The proposed delay slot scheduling algorithm combines a local scheduling strategy with a global scheduling strategy and, in view of the VLIW structure's requirements for instruction parallelism, adds a balanced scheduling strategy. This strategy trades off delay slot scheduling against program parallelism, and local scheduling against global scheduling, to achieve the highest possible instruction pipeline performance.

[0022] With reference to Figure 1 and Figure 2, the delay slot scheduling method and system under the VLIW structure of the embodiment of the present inven...


Abstract

The invention discloses a method and a system for scheduling delay slots in a very long instruction word structure. The method comprises the steps of: locally scheduling the instructions in the current basic block; after local scheduling is finished, judging whether any instruction delay slot remains and, if not, ending the scheduling, otherwise placing instructions that can fill the delay slot and have a high cost into a local standby instruction cache; globally scheduling the instructions in the basic block of the branch target, selecting instructions that can fill the delay slot and placing them in a global standby instruction cache; and selecting an instruction from the local standby instruction cache and/or the global standby instruction cache and filling it into the remaining delay slot. The system comprises a local scheduling unit, a global scheduling unit and a balanced scheduling unit. By balancing delay slot scheduling against program parallelism, and local scheduling against global scheduling, the method and system can achieve high execution efficiency for programs.
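
The steps in the abstract can be sketched as below, under several stated assumptions. The abstract does not say how "high cost" is measured or how the final choice between the two caches is made, so the `cost` field, the `cost_threshold` parameter, and the selection order (local cache by descending cost, then global cache) are placeholders invented for illustration, not the patented balanced scheduling strategy. The number of delay slots left after local scheduling is taken as an input.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    fillable: bool  # hypothetical: instruction may legally fill the delay slot
    cost: int       # hypothetical cost measure, e.g. latency in cycles


def fill_remaining_slots(remaining_slots: int,
                         current_block: Sequence[Candidate],
                         target_block: Sequence[Candidate],
                         cost_threshold: int = 2) -> List[str]:
    # No delay slot left after local scheduling: scheduling ends.
    if remaining_slots <= 0:
        return []
    # Local standby cache: fillable, high-cost instructions of the current block.
    local_cache = [c for c in current_block
                   if c.fillable and c.cost >= cost_threshold]
    # Global standby cache: fillable instructions of the branch target block.
    global_cache = [c for c in target_block if c.fillable]
    # Placeholder selection policy (an assumption, not from the source).
    ranked = sorted(local_cache, key=lambda c: c.cost, reverse=True) + global_cache
    return [c.name for c in ranked[:remaining_slots]]
```

If fewer candidates than slots are returned, the leftover slots would presumably be filled with no-ops, as in the traditional schemes described in the background.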

Description

Technical Field

[0001] The invention relates to instruction scheduling technology, and in particular to a delay slot scheduling method and system under a very long instruction word structure.

Background Technique

[0002] A Very Long Instruction Word (VLIW) is a very long instruction combination that joins many instructions together to increase operating speed. VLIW technology is one of the main ways to improve processor instruction-level parallelism. It uses software and hardware together to exploit processor parallelism: long instructions are assembled by the compiler rather than by the hardware dynamic scheduling strategy of a superscalar processor, which greatly reduces hardware complexity and chip power consumption.

[0003] In the design of digital signal processor (DSP) architecture, research is usually carried out in combination with VLIW technology under the reduced inst...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/38
Inventor: 朱浩, 彭楚, 王东辉, 洪缨, 侯朝焕
Owner: INST OF ACOUSTICS CHINESE ACAD OF SCI