Instruction processing method and apparatus, device, and storage medium
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING INSTITUTE OF OPEN SOURCE CHIP
- Filing Date
- 2026-02-10
- Publication Date
- 2026-06-19

Figure CN122240177A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of computer technology, and in particular to a method, apparatus, electronic device, and computer-readable storage medium for processing instructions.
Background Technology
[0002] Multi-issue processors can process multiple instructions simultaneously within a single cycle, thereby improving instruction processing efficiency.
[0003] In current multi-issue processors, the number of instructions processed per cycle is fixed at design time, and the number of read/write ports of the core register file (RegFile) is determined according to this design target. For example, a dual-issue processor can process two instructions in one cycle, which requires a register file with 4 read ports and 2 write ports, because each instruction may read up to 2 source operands (src) and write its execution result back to 1 destination operand (dest).
[0004] However, the multi-issue processors in related technologies have a fixed number of instructions to process, which limits their instruction processing capabilities and reduces their efficiency. To increase their instruction processing efficiency, the hardware specifications of the multi-issue processor need to be upgraded, which increases costs.
Summary of the Invention
[0005] This application provides an instruction processing method, apparatus, electronic device, and computer-readable storage medium to solve problems in related technologies.
[0006] In a first aspect, embodiments of this application provide an instruction processing method, the method comprising: obtaining the number of ports of a multi-issue processor; obtaining operand requirement information of an instruction, and comparing the requirement information with the number of ports to obtain a comparison result; and processing the instruction by the multi-issue processor based on the comparison result.
[0007] In a second aspect, embodiments of this application provide an instruction processing apparatus, the apparatus comprising: a first acquisition module, used to obtain the number of ports of the multi-issue processor; a second acquisition module, used to obtain the operand requirement information of the instruction and compare the requirement information with the number of ports to obtain the comparison result; and a processing module, used to process the instruction by the multi-issue processor based on the comparison result.
[0008] In a third aspect, embodiments of this application also provide an electronic device, including a processor and a memory for storing processor-executable instructions, wherein the processor is configured to execute the instructions to implement the method of the first aspect.
[0009] In a fourth aspect, embodiments of this application also provide a computer-readable storage medium, wherein instructions in the computer-readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method of the first aspect.
[0010] In this embodiment, the operand requirement information of the instructions can be obtained and compared with the number of ports of the multi-issue processor to obtain a comparison result. The comparison result reflects whether the current port resources of the multi-issue processor can meet the actual requirements of the instructions, and the multi-issue processor then processes the instructions accordingly. If the comparison result shows that the current port resources can meet the actual requirements of the instructions with capacity to spare, a larger number of instructions can be selected for processing in the same cycle. This application thereby increases the number of instructions processed by the multi-issue processor in a single cycle without changing the specifications of the processor itself, improving instruction processing efficiency at a low overall cost.
[0011] The above description is only an overview of the technical solution of this application. In order to better understand the technical means of this application and to implement it in accordance with the contents of the specification, and to make the above and other objects, features and advantages of this application more obvious and understandable, the following are specific embodiments of this application.
Attached Figure Description
[0012] To more clearly illustrate the technical solutions in the embodiments of this application, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0013] Figure 1 is a flowchart of the steps of an instruction processing method provided in an embodiment of this application;
Figure 2 is a flowchart of the specific steps of an instruction processing method provided in an embodiment of this application;
Figure 3 is a flowchart of the specific steps of another instruction processing method provided in an embodiment of this application;
Figure 4 is a schematic diagram of front-end hardware provided in an embodiment of this application;
Figure 5 is a flowchart of the specific steps of another instruction processing method provided in an embodiment of this application;
Figure 6 is a schematic diagram of a back-end commit stage provided in an embodiment of this application;
Figure 7 is a block diagram of an instruction processing apparatus provided in an embodiment of this application;
Figure 8 is a block diagram of a first electronic device provided in an embodiment of this application;
Figure 9 is a block diagram of a second electronic device according to another embodiment of this application.
Detailed Implementation
[0014] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0015] The terms "first," "second," etc., used in this application's specification are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that embodiments of this application can be implemented in orders other than those illustrated or described herein, and the objects distinguished by "first," "second," etc., are generally of the same class, not limited in number; for example, the first object can be one or more. Furthermore, the term "and / or" in the specification describes the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, and B existing alone. The character " / " generally indicates that the preceding and following related objects have an "or" relationship. In the embodiments of this application, the term "multiple" refers to two or more, and other quantifiers are similar.
[0016] Figure 1 is a flowchart of the steps of an instruction processing method provided in an embodiment of this application. As shown in Figure 1, the method may include:
Step 101: Obtain the number of ports of the multi-issue processor.
[0017] In this embodiment, a multi-issue processor refers to a processor capable of issuing multiple instructions simultaneously within a single clock cycle. For example, a dual-issue processor can issue two instructions per cycle, and a triple-issue processor can issue three instructions per cycle. This requires sufficient internal hardware resources to process multiple instructions in parallel.
[0018] The register file (RegFile) is a set of high-speed storage locations in the processor used to hold instruction operands. Since each instruction may need to read one or two source operands (src) and write its result back to a destination operand (dest), the number of read/write ports of a multi-issue processor is defined by its register file. The number of read/write ports in the register file limits the number of registers that can be read and written simultaneously within a single cycle.
[0019] Specifically, the multi-issue processor is used to process the instruction stream. Each instruction in the instruction stream is an instruction from the instruction set of the multi-issue processor, for example an addition instruction, a subtraction instruction, a write instruction, a read instruction, a jump instruction, or another instruction.
[0020] When a multi-issue processor processes an instruction, its instruction pipeline includes the following stages:
Fetch: Reading instructions from the instruction memory.
[0021] Alignment: Align the fetched instruction stream to facilitate subsequent decoding. This is especially important when instruction lengths are not fixed (e.g., a mix of 32-bit and 16-bit instructions), as instruction boundaries need to be determined.
[0022] Decoding: Parsing the meaning of instructions and determining the operation type, operands, etc.
[0023] Issue: Distributing the decoded instructions to the appropriate execution units (such as arithmetic logic units, memory access units, etc.). In a multi-issue processor, multiple instructions can be issued in one cycle.
[0024] Execute / Memory: Executes instructions, including arithmetic and logical operations. If it is a memory access instruction, it accesses the data memory.
[0025] Commit / Write-back: This stage writes the results of instruction execution back to the register file or memory and updates the processor state. The commit phase ensures that instructions are completed sequentially and is the final stage of the pipeline.
[0026] The instruction buffer (IB) is used to temporarily store the instruction stream fetched in the instruction fetch stage. Because the instruction fetch stage may be affected by factors such as whether the instruction cache hits and branch prediction, the instruction stream may not be continuous. The instruction buffer can smooth the instruction stream and reduce pipeline bubbles (i.e., situations where some stages of the pipeline have no valid instruction to execute).
[0027] The commit buffer (CB) is used to temporarily store instructions that have been executed but not yet committed, along with their results. The commit phase requires instructions to be committed in order; therefore, the commit buffer can be viewed as an ordered buffer that holds executed instructions awaiting commit.
[0028] In this step, the number of ports provided by the multi-issue processor can be obtained from its register file. These ports include read ports and write ports. The number of ports defines the instruction processing capability of the multi-issue processor in a single cycle.
[0029] For example, a dual-issue processor typically requires four read ports (up to two source operands per instruction, so four for two instructions) and two write ports (one destination operand per instruction, so two for two instructions).
[0030] Three-issue processors typically require six read ports and three write ports.
[0031] A four-issue processor typically requires eight read ports and four write ports.
[0032] ... An N-way issue processor typically requires 2N read ports and N write ports.
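The sizing rule above can be written as a short Python sketch. The function name `register_file_ports` is hypothetical, and the sketch assumes, as the text does, at most two source operands and one destination operand per instruction.

```python
def register_file_ports(issue_width: int) -> tuple[int, int]:
    """Return (read_ports, write_ports) for an N-way issue processor.

    Assumes the sizing rule stated in the text: up to 2 source operands
    and 1 destination operand per instruction.
    """
    if issue_width < 1:
        raise ValueError("issue width must be at least 1")
    return 2 * issue_width, issue_width


for n in (2, 3, 4):
    reads, writes = register_file_ports(n)
    print(f"{n}-way issue: {reads} read ports, {writes} write ports")
```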
[0033] Step 102: Obtain the instruction's operand requirement information and compare the requirement information with the number of ports to obtain the comparison result.
[0034] Step 103: Based on the comparison result, process the instruction by the multi-issue processor.
[0035] In the embodiments of this application, instructions have actual requirements for operands. For example, some instructions need to read several source operands for calculation and write back the result, while other instructions need to read several source operands for calculation but do not need to write anything back.
[0036] In related technologies, multi-issue processors fix the number of read / write ports and process a fixed number of instructions without considering the actual operand requirements of the instructions.
[0037] For example, a two-way issue processor has four read ports and two write ports. However, an instruction may actually require reading 0-2 source operands and writing back 0-1 operands. Different instructions may have different requirements for reading source operands and writing back operands. Related technologies do not consider these variations in actual requirements; they only process the issue and commit of two instructions per cycle, ensuring that the four read ports and two write ports can meet the maximum requirements of these two instructions. This results in a waste of port resources. For instance, if the first instruction needs to read one source operand and write back one operand, and the second instruction needs to read one source operand but does not need to write back, then the resources of two read ports and one write port are wasted in that cycle. Furthermore, this limits the instruction processing capacity of the two-way issue processor. To increase the instruction processing efficiency of the two-way issue processor in a single cycle, the hardware specifications of the two-way issue processor need to be upgraded (e.g., to three-way or four-way), increasing costs.
[0038] Specifically, the strategy adopted in this embodiment is to obtain the actual operand requirements of the instructions and compare these actual requirements with the number of ports of the multi-issue processor to obtain a comparison result. The comparison result reflects whether the current port resources of the multi-issue processor can meet the actual requirements of the instructions, and the multi-issue processor then processes the instructions accordingly. That is, if the comparison result shows that the current port resources can meet the actual requirements of the instructions with capacity to spare, a larger number of instructions can be selected for processing in the same cycle; if the comparison result shows that the current port resources cannot meet the actual requirements of the instructions, a smaller number of instructions can be selected for processing in the same cycle.
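As a minimal sketch of this comparison, the Python below counts the read and write ports left unused by a group of instructions. The `Requirement` type and the `compare_with_ports` function are hypothetical names introduced for illustration; the source does not specify how the comparison result is represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """Operand requirement of one decoded instruction (hypothetical type)."""
    srcs: int          # number of source operands to read: 0, 1 or 2
    writes_dest: bool  # True if the result is written back to a register


def compare_with_ports(reqs, read_ports, write_ports):
    """Return (spare_read_ports, spare_write_ports).

    A negative value means the group needs more ports than are available.
    """
    used_reads = sum(r.srcs for r in reqs)
    used_writes = sum(1 for r in reqs if r.writes_dest)
    return read_ports - used_reads, write_ports - used_writes


# A 2-way issue processor (4 read ports, 2 write ports) with two instructions:
# one reads 1 operand and writes back, the other reads 1 operand and does not.
pair = [Requirement(srcs=1, writes_dest=True), Requirement(srcs=1, writes_dest=False)]
spare_reads, spare_writes = compare_with_ports(pair, read_ports=4, write_ports=2)
print(spare_reads, spare_writes)  # 2 read ports and 1 write port remain unused
```

Spare ports in the result indicate that further instructions may be considered in the same cycle.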
[0039] For example, in each cycle, the 2-way issue processor normally decodes 2 instructions and checks how many source operands are needed for these two instructions (maximum 4).
[0040] If one instruction requires one source operand and another instruction requires two source operands, then these two instructions do not occupy all the read ports (only three are used). That is, the current port resources of the 2-way issue processor can meet the actual needs of the first two instructions and there is still room. Then the 2-way issue processor will try to decode the third instruction (an additional instruction taken from the instruction buffer).
[0041] If the third instruction requires one source operand, then this third instruction can be issued along with the first two. In this way, the original 2-way issue processor can process 3 instructions in this cycle, increasing the instruction processing efficiency of the 2-way issue processor without changing its own specifications.
[0042] In summary, in this embodiment, the operand requirement information of the instructions can be obtained and compared with the number of ports of the multi-issue processor to obtain a comparison result. The comparison result reflects whether the current port resources of the multi-issue processor can meet the actual requirements of the instructions, and the multi-issue processor then processes the instructions accordingly. If the comparison result shows that the current port resources can meet the actual requirements of the instructions with capacity to spare, a larger number of instructions can be selected for processing in the same cycle. This application thereby increases the number of instructions processed by the multi-issue processor in a single cycle without changing the specifications of the processor itself, improving instruction processing efficiency at a low overall cost.
[0043] Figure 2 is a flowchart of the specific steps of an instruction processing method provided in an embodiment of this application. As shown in Figure 2, the method may include:
Step 201: Obtain the number of ports of the multi-issue processor.
[0044] For details, please refer to step 101 above; it will not be repeated here.
[0045] Step 202: Transmit the corresponding instruction from each of the basic output channels to the corresponding basic decoder for decoding to obtain the instruction's requirement information.
[0046] The multi-issue processor includes: a basic output channel, a new output channel, a basic decoder, and a new decoder.
[0047] Step 203: If the next instruction is a target instruction, the target instruction is transmitted from the new output channel to the new decoder for decoding to obtain the requirement information of the target instruction.
[0048] The target instruction is an instruction whose required number of source operands is less than or equal to 1.
[0049] In this embodiment, regarding steps 202-203, the N-way issue processor itself has N basic output channels and N basic decoders. The N basic output channels and N basic decoders correspond one-to-one. Each basic output channel is used to input an instruction into its corresponding basic decoder for decoding. This allows the N-way issue processor to decode N instructions within one cycle. The decoder's role is to parse the meaning of the instruction, determining the operation type, operands, etc. Thus, the basic decoder can decode the instruction to obtain its requirement information, such as the number of source operands required and whether the instruction needs to write back operands.
[0050] In this embodiment, in order to improve the instruction processing efficiency of the multi-issue processor, an additional output channel can be added after the instruction buffer, and an additional decoder can be added in the decoding stage. The additional output channel is used to feed a target instruction that meets the requirements into the additional decoder for decoding, thereby obtaining the requirement information of the target instruction.
[0051] For example, if in a certain clock cycle the first instruction requires 1 src (source operand) and the second instruction requires 2 src, and the third instruction requires only 0 or 1 src, then the third instruction (i.e., a compressed instruction) can be decoded by the new decoder. Uncompressed instructions are outside the recognition range of the new decoder, which cannot recognize them.
[0052] For example, if in a certain clock cycle the first instruction requires 2 src and the second instruction requires 2 src, then the third instruction can be decoded by the new decoder if it requires no src.
[0053] For example, if in a certain clock cycle the first instruction requires 0 src and the second instruction requires 1 src, then the third instruction can be decoded by the new decoder if it requires 0 or 1 src, because the new decoder only decodes instructions that require 0 or 1 src.
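The three examples above can be checked with a small Python sketch. The function `extra_decoder_accepts` and its parameters are hypothetical; it assumes a 2-way issue processor with 4 read ports and treats the new decoder as accepting only compressed instructions that need at most one source operand.

```python
def extra_decoder_accepts(base_srcs, candidate_srcs, read_ports=4, is_compressed=True):
    """Decide whether the new decoder may take the next instruction.

    base_srcs      -- source operand counts of the instructions handled by
                      the basic decoders in this cycle
    candidate_srcs -- source operand count of the next instruction
    """
    # The new decoder only recognizes compressed instructions with 0 or 1 src.
    if not is_compressed or candidate_srcs > 1:
        return False
    # The read ports must not be oversubscribed.
    return sum(base_srcs) + candidate_srcs <= read_ports


print(extra_decoder_accepts([1, 2], 1))  # first example: accepted
print(extra_decoder_accepts([2, 2], 0))  # second example: accepted
print(extra_decoder_accepts([0, 1], 1))  # third example: accepted
print(extra_decoder_accepts([0, 1], 2))  # needs 2 src: not a target instruction
```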
[0054] Specifically, the target instruction is also known as a compressed instruction. A compressed instruction is a short-coded instruction format defined in the processor instruction set architecture. Standard instructions are typically 32 bits long, while compressed instructions are typically 16 bits long. A 16-bit compressed instruction is functionally equivalent to a 32-bit standard instruction. For example, in RISC-V, `c.addi rd, imm` (16-bit) and `addi rd, rd, imm` (32-bit) perform the same operation (adding an immediate value to a register value).
[0055] In this embodiment of the application, shortening the instruction from 32 bits to 16 bits can significantly reduce the size of the compiled binary program, save storage space, improve instruction cache efficiency, and reduce instruction fetch bandwidth pressure.
[0056] Furthermore, because compressed instructions are only 16 bits, the processor's front-end fetch and instruction buffer can physically hold more instructions per cycle (64-bit bandwidth would normally correspond to two 32-bit instructions, but could potentially correspond to four 16-bit instructions). This makes it much easier for this solution to output an extra instruction from the instruction buffer.
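To illustrate why a fixed fetch width can hold more compressed instructions, the Python sketch below splits a 64-bit fetch window into instructions. It assumes RISC-V style length encoding, in which a 16-bit parcel whose two lowest bits are `11` begins a 32-bit instruction and any other value is a 16-bit compressed instruction; the source uses RISC-V only as an example, and longer encodings are ignored here. The function name `split_fetch_window` is hypothetical.

```python
def split_fetch_window(parcels):
    """Split a fetch window into instruction lengths (in bits).

    parcels -- 16-bit values in address order (4 parcels = 64 bits).
    Assumes only 16-bit and 32-bit instructions exist.
    """
    lengths = []
    i = 0
    while i < len(parcels):
        if parcels[i] & 0b11 == 0b11:
            # 32-bit instruction: it also consumes the next parcel.
            if i + 1 >= len(parcels):
                break  # second half lies in the next fetch window
            lengths.append(32)
            i += 2
        else:
            # 16-bit compressed instruction.
            lengths.append(16)
            i += 1
    return lengths


NOP32_LOW, NOP32_HIGH = 0x0013, 0x0000  # the two halves of a 32-bit nop
C_NOP = 0x0001                          # a 16-bit compressed nop

print(split_fetch_window([NOP32_LOW, NOP32_HIGH, NOP32_LOW, NOP32_HIGH]))  # two instructions
print(split_fetch_window([C_NOP, C_NOP, C_NOP, C_NOP]))                    # four instructions
```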
[0057] Furthermore, the newly added decoder in this application is specifically designed to identify and process compressed instructions. Compressed instructions encode simple operations and are typically "lightweight instructions" (requiring only 0 or 1 source operands). Because they are lightweight, compressed instructions are better suited to being committed or issued along with regular instructions (much as a thin person can more easily squeeze into a crowded car to hitch a ride). Compressed instructions allow the front end to supply more instructions, enabling multi-issue processors to issue these instructions more efficiently and fully leverage the advantages of the compressed instruction set architecture.
[0058] Step 204: Compare the requirement information with the number of ports to obtain the comparison result.
[0059] For details, please refer to step 102 above; it will not be repeated here.
[0060] Step 205: When the comparison result shows that the number of ports can meet the requirements, while processing the first instruction to be processed among all instructions in the current cycle, select at least one additional second instruction for synchronous processing.
[0061] Step 206: When the comparison result indicates that the number of ports cannot meet the requirement information, only the first instruction to be processed among all instructions is processed in the current cycle.
[0062] The second instruction is an instruction whose required number of source operands is less than or equal to 1.
[0063] In this embodiment, for steps 205-206, the comparison result reflects whether the current port resources of the multi-issue processor can meet the actual requirements of the instructions. There are two cases. If the comparison result shows that the current port resources can meet the actual requirements of the instructions with capacity to spare, at least one additional second instruction can be selected for synchronous processing while the first instruction to be processed among all instructions is processed in the current cycle. If the comparison result shows that the current port resources cannot meet the actual requirements of the instructions, only the first instruction to be processed among all instructions can be processed in the current cycle.
[0064] For example, a 2-way issue processor has 4 read ports and 2 write ports.
[0065] In the instruction issue phase, assume that in a certain clock cycle the first instruction requires 2 src and the second instruction requires 1 src. If the third instruction requires only 0 or 1 src, the third instruction can be issued together with the first two instructions.
[0066] For example, if in a certain clock cycle the first instruction requires 2 source operands (src) and the second instruction requires 2 source operands (src), then the third instruction can be issued together with the first two instructions if it requires no source operand.
[0067] For example, in a given clock cycle, if the first instruction requires 0 reads and the second instruction requires 1 read, then the third instruction, if it requires no read or only 1 read, can be issued along with the first two instructions. Even the fourth instruction, regardless of whether it requires 1 or 2 reads, can be issued along with the first three instructions because the four read ports of the 2-way issue processor are not yet fully utilized. Issuing more than two instructions simultaneously does not exceed the limit of four read ports on a two-way issue processor, thus avoiding resource waste. In contrast, related technologies can only issue the first two instructions.
[0068] In the instruction commit phase, assume that in a given clock cycle the first two normally committed instructions both need to write back to `dest` (the destination operand), while the third instruction (already executed) does not need to write back to `dest`. The third instruction can then be committed together with the first two. Similarly, if the fourth instruction (already executed) also does not need to write back to `dest`, it can be committed together as well. This achieves 2 normally committed instructions plus 0 to N trailing committed instructions.
[0069] As another example, the first normally committed instruction does not need to write back to dest, the second committed instruction needs to write back to dest, and the third instruction (already executed) also needs to write back to dest. In this case, all three can be committed together. If the fourth instruction (already executed) does not need to write back to dest, it can be committed together as well.
[0070] As a further example, the first, second, and third instructions (already executed) do not need to write back to dest, the fourth and fifth instructions (already executed) need to write back to dest, the sixth and seventh instructions (already executed) do not need to write back to dest, and the eighth instruction (already executed) needs to write back to dest. In this case, the first seven instructions can be committed together, but the eighth instruction cannot be committed because the dest write-back channels are already full.
[0071] Committing more than two instructions simultaneously in this way does not exceed the limit of two write ports on the 2-way issue processor and avoids wasting resources. In contrast, related technologies can only commit the first two instructions.
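The commit examples above follow one rule: walk the executed instructions in program order and stop at the first one that needs a write port when none is left. A minimal Python sketch of that rule is given below. The function name `count_committable` is hypothetical, and the sketch assumes every listed instruction has already executed and is waiting in the commit buffer.

```python
def count_committable(writes_dest, write_ports=2):
    """Count how many instructions can commit together in one cycle.

    writes_dest -- one bool per executed instruction, in program order;
                   True if the instruction writes back to dest
    """
    used_ports = 0
    committed = 0
    for needs_write in writes_dest:
        if needs_write:
            if used_ports == write_ports:
                break  # in-order commit: nothing younger may pass this one
            used_ports += 1
        committed += 1
    return committed


T, F = True, False
print(count_committable([T, T, F, F]))              # two writers, two trailing non-writers
print(count_committable([F, T, T, F]))              # non-writer first, then two writers
print(count_committable([F, F, F, T, T, F, F, T]))  # eighth instruction must wait
```

For the three examples in the text, the sketch returns 4, 4, and 7 committed instructions respectively.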
[0072] In summary, in this embodiment, the operand requirement information of the instructions can be obtained and compared with the number of ports of the multi-issue processor to obtain a comparison result. The comparison result reflects whether the current port resources of the multi-issue processor can meet the actual requirements of the instructions, and the multi-issue processor then processes the instructions accordingly. If the comparison result shows that the current port resources can meet the actual requirements of the instructions with capacity to spare, a larger number of instructions can be selected for processing in the same cycle. This application thereby increases the number of instructions processed by the multi-issue processor in a single cycle without changing the specifications of the processor itself, improving instruction processing efficiency at a low overall cost.
[0073] Figure 3 is a flowchart of the specific steps of another instruction processing method provided in an embodiment of this application. As shown in Figure 3, the method may include:
Step 301: Obtain the number of ports of the multi-issue processor.
[0074] For details, please refer to step 101 above; it will not be repeated here.
[0075] Step 302: Determine the first instruction to be issued from all instructions.
[0076] The multi-issue processor includes: a basic output channel, a new output channel, a basic decoder, and a new decoder.
[0077] Step 303: Obtain the first number of source operands that each of the first instructions needs to read.
[0078] The issue width of the multi-issue processor is N, and the number of read ports in the register file of the multi-issue processor is 2N.
[0079] Step 304: Compare the first quantity with 2N to obtain the comparison result.
[0080] In this embodiment, regarding steps 302-304, the initial design of the N-way issue processor sets the number of read ports to 2N, and this number is fixed in the register file. The first instructions are the instructions that the N-way issue processor can process under its original processing capability. The number of first instructions is N, because the N-way issue processor is designed to process N instructions in one cycle. By parsing the first instructions with the decoders, the first number of source operands that the first instructions need to read can be obtained.
[0081] When comparing the requirement information with the number of ports, the first number is compared with 2N to obtain the comparison result.
[0082] Step 305: If the comparison result is that the first number is less than 2N, select at least one additional second instruction from the instruction buffer.
[0083] Step 306: Issue the second instruction together with the first instruction through the multi-issue processor.
[0084] The second instruction is an instruction whose required number of source operands is less than or equal to 1. For the selected second instructions, the sum of the second number of source operands they need to read and the first number is less than or equal to 2N.
[0085] In this embodiment, when the first number is less than 2N, the N first instructions processed by the N-way issue processor under its original capability do not fully occupy its 2N read ports, and read port resources remain. Therefore, at least one additional second instruction (a compressed instruction) can be selected from the instruction buffer and issued together with the original N first instructions. However, the sum of the second number of source operands that the selected second instructions need to read and the first number must be less than or equal to 2N. This prevents the instructions to be issued from requiring more read ports than the 2N read ports inherent to the N-way issue processor.
[0086] When the first number equals 2N, the N first instructions processed by the N-way issue processor under its original capability exactly fill its 2N read ports, and no read port resources remain. Therefore, only the N first instructions can be issued in the current cycle.
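Steps 302-306 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the claimed implementation: the function `select_second_instructions` is hypothetical, candidates are taken from the instruction buffer in order, selection stops at the first ineligible candidate, and the hypothetical parameter `max_extra` stands for the number of new output channels and new decoders available.

```python
def select_second_instructions(first_srcs, buffer_srcs, issue_width, max_extra=1):
    """Return how many second instructions can issue with the N first ones.

    first_srcs  -- source operand counts of the N first instructions
    buffer_srcs -- source operand counts of the following instructions
                   in the instruction buffer, in order
    """
    read_ports = 2 * issue_width  # 2N read ports
    used = sum(first_srcs)        # the "first number"
    if used >= read_ports:
        return 0                  # first number equals 2N: no ports remain
    extra = 0
    for srcs in buffer_srcs:
        # A second instruction reads at most 1 source operand, and the
        # running total must stay within the 2N read ports.
        if extra == max_extra or srcs > 1 or used + srcs > read_ports:
            break
        used += srcs
        extra += 1
    return extra


print(select_second_instructions([2, 1], [1], issue_width=2))  # one extra instruction
print(select_second_instructions([2, 2], [0], issue_width=2))  # none: ports already full
```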
[0087] For example, for a 2-issue processor with 4 read ports, suppose that in a certain clock cycle the first instruction requires 2 read ports and the second instruction requires 1 read port. If the third instruction requires only 0 or 1 read ports, the third instruction can be issued together with the first two instructions.
[0088] add t0, a0, a1 // First instruction: R-type, reads two source operands (a0 and a1) and stores the result in t0
addi t1, t0, -5 // Second instruction: I-type, reads one source operand (t0) and one immediate (-5) and stores the result in t1
jalr ra, t1, 0 // Third instruction: I-type, reads one source operand (t1) and one immediate (0) and stores the return address in ra
[0089] Further, refer to Figure 4, which shows a front-end hardware schematic diagram for the above example. In addition to the normal decoding of instructions from the instruction buffer, the embodiments of this application also include a new decoder with a specific instruction list; the instructions in this list have the instruction characteristics described in the above method.
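The check performed by the new decoder's instruction list can be illustrated with a small lookup table. The sketch below is only an assumption about how such a list might look: the table `SRC_COUNT`, the choice of RV32I mnemonics, and the function names are hypothetical, and the actual list used by the new decoder is not given in this application. The source-register counts themselves follow the standard RISC-V instruction formats.

```python
# Source-register counts for a few RV32I mnemonics (illustrative subset only).
SRC_COUNT = {
    "add": 2, "sub": 2, "beq": 2, "sw": 2,     # two source registers
    "addi": 1, "slli": 1, "lw": 1, "jalr": 1,  # one source register
    "lui": 0, "jal": 0,                        # no source register
}


def in_new_decoder_list(mnemonic: str) -> bool:
    """True if the instruction needs at most one source operand."""
    count = SRC_COUNT.get(mnemonic)
    # Unknown mnemonics are rejected rather than guessed.
    return count is not None and count <= 1


def eligible_mnemonics() -> list:
    """Mnemonics from the table that the new decoder could accept, sorted by name."""
    return sorted(m for m in SRC_COUNT if in_new_decoder_list(m))
```

With this table, `jalr` qualifies for the new decoder because it reads only one register, whereas `add` does not.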
[0090] As another example, for a 2-issue processor with 4 read ports, suppose that in a certain clock cycle the first instruction requires 2 source operands and the second instruction also requires 2 source operands. Even if the third instruction requires only 0 or 1 source operands, only the first two instructions are issued, because they already occupy all 4 read ports.
[0091] In summary, in this embodiment, the operand requirement information of the instructions is obtained and compared with the number of ports of the multi-issue processor to obtain a comparison result, which reflects whether the current port resources of the multi-issue processor can meet the actual requirements of the instructions. The instructions are then processed by the multi-issue processor. If the comparison result shows that the port resources meet the requirements with capacity to spare, a larger number of instructions can be selected for processing in the same cycle. This application thus increases the number of instructions the multi-issue processor processes in a single cycle without changing the specifications of the processor itself, which improves instruction processing efficiency at a low overall cost.
[0092] Figure 5 is a flowchart of the specific steps of another instruction processing method provided in an embodiment of this application. As shown in Figure 5, the method may include: Step 401: Obtain the number of ports of the multi-issue processor.
[0093] For details, please refer to step 101 above; it will not be repeated here.
[0094] Step 402: Determine the third instruction to be committed among all instructions.
[0095] Step 403: Obtain the third number of operands that the third instructions need to write back.
[0096] Here, the issue width of the multi-issue processor is M, the number of read ports of the register file of the multi-issue processor is M, and the number of write ports is M.
[0097] Step 404: Compare the third number with M to obtain the comparison result.
[0098] In this embodiment, regarding steps 402-404, the N-issue processor is designed with M write ports, and this number is fixed in the register file. The third instructions are the instructions that the N-issue processor can process at its original processing capability. There are N third instructions, because the N-issue processor is designed to process N instructions in one cycle. By parsing the third instructions with a decoder, the third number, that is, the number of operands that the third instructions need to write back, can be obtained.
[0099] When the requirement information is compared with the number of ports, the third number is compared with M to obtain the comparison result.
[0100] Step 405: If the comparison result is that the third number is less than M, select at least one additional fourth instruction that has already been executed, according to the instruction completion order.
[0101] Step 406: Commit the third instruction together with the fourth instruction through the multi-issue processor.
[0102] The sum of the fourth number of operands that the selected fourth instructions need to write back and the third number is less than or equal to M.
[0103] In this embodiment, when the third number is less than M, the N third instructions that the N-issue processor processes at its original capability do not occupy all M write ports, and write port resources remain. Therefore, at least one additional fourth instruction can be selected according to the instruction completion order and committed together with the original N third instructions. The sum of the fourth number of operands that the selected fourth instructions need to write back and the third number must be less than or equal to M, so that the write ports required by the instructions to be committed do not exceed the M write ports of the N-issue processor.
[0104] When the third number equals M, the N third instructions that the N-issue processor processes at its original capability occupy exactly the M write ports, and no write port resources remain. Therefore, only the N third instructions can be committed in the current cycle.
[0105] For example, consider a 2-issue processor with two write ports. In a given clock cycle, the first and second instructions, which are committed normally, each need to write back to `dest`; the third and fourth instructions (already executed) do not need to write back to `dest`; and the fifth instruction (already executed) needs to write back to `dest`. The first four instructions can be committed together in the same clock cycle. The fifth instruction is not included in the current commit window because the first and second instructions already occupy all the `dest` write ports. If the `dest` field of the fifth instruction were also 0, it could be included in the commit window and committed together with the others.
[0106] addi t0, a0, 5 // First instruction: needs to be written back
slli t1, t0, 2 // Second instruction: needs to be written back
beq t0, t1, skip_label // Third instruction: no need to write back
jal ra, function_call // Fourth instruction: no need to write back
add t2, t0, t1 // Fifth instruction: needs to be written back
Refer to Figure 6, which illustrates the back-end commit stage for the example above; the instructions to be committed in the example are shown in boxes. In terms of stored content, the commit buffer can be viewed as a record table. The area above the black dotted line in Figure 6 is a table header listed for explanation purposes and does not exist in the hardware. Figure 6 shows the eight entries contained in the hardware, with each row representing one entry. From left to right, the fields are valid, done, dest_addr, dest, and result, which indicate, respectively: whether the entry is valid; whether the instruction in the row has been executed; the destination address of the instruction in the RegFile; whether the instruction needs to be written back to the RegFile; and the result value of the instruction. The commit window in the related art includes only the two oldest entries, whereas the commit window using the method of this application includes the four oldest entries.
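The commit window of the example can be reproduced with a short software model of the commit buffer. This is a sketch under stated assumptions: the class `CommitEntry` mirrors the five fields named for Figure 6, while the function name `commit_window`, the Python representation, and the register indices are hypothetical. The sketch follows the worked example, in which an executed instruction that needs no write-back joins the window even when the write ports are already in use; it is not presented as the hardware implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CommitEntry:
    valid: bool     # entry holds an instruction
    done: bool      # instruction has finished executing
    dest_addr: int  # destination register index in the RegFile
    dest: bool      # True if the result must be written back
    result: int     # result value of the instruction


def commit_window(entries: List[CommitEntry], write_ports: int) -> int:
    """Return how many of the oldest entries can commit in this cycle."""
    window = 0
    used = 0
    for entry in entries:  # entries are ordered oldest first
        if not (entry.valid and entry.done):
            break  # commit stays in order: stop at the first unfinished entry
        need = 1 if entry.dest else 0
        if used + need > write_ports:
            break  # this entry would exceed the M write ports
        used += need
        window += 1
    return window


# The five instructions of the example, followed by three empty entries.
example = [
    CommitEntry(True, True, 5, True, 0),    # addi t0, a0, 5
    CommitEntry(True, True, 6, True, 0),    # slli t1, t0, 2
    CommitEntry(True, True, 0, False, 0),   # beq t0, t1, skip_label
    CommitEntry(True, True, 0, False, 0),   # jal, treated as no write-back as in the example
    CommitEntry(True, True, 7, True, 0),    # add t2, t0, t1
] + [CommitEntry(False, False, 0, False, 0) for _ in range(3)]
```

With two write ports, `commit_window(example, 2)` yields a window of the four oldest entries, matching the example; the fifth entry is left for a later cycle because both write ports are taken.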
[0107] In summary, in this embodiment, the operand requirement information of the instructions is obtained and compared with the number of ports of the multi-issue processor to obtain a comparison result, which reflects whether the current port resources of the multi-issue processor can meet the actual requirements of the instructions. The instructions are then processed by the multi-issue processor. If the comparison result shows that the port resources meet the requirements with capacity to spare, a larger number of instructions can be selected for processing in the same cycle. This application thus increases the number of instructions the multi-issue processor processes in a single cycle without changing the specifications of the processor itself, which improves instruction processing efficiency at a low overall cost.
[0108] Figure 7 is a block diagram of an instruction processing apparatus provided in an embodiment of this application. The apparatus includes: a first acquisition module 501, configured to obtain the number of ports of the multi-issue processor; a second acquisition module 502, configured to obtain the operand requirement information of the instructions and compare the requirement information with the number of ports to obtain a comparison result; and a processing module 503, configured to process the instructions through the multi-issue processor according to the comparison result.
[0109] Optionally, the multi-issue processor includes: a basic output channel, a new output channel, a basic decoder, and a new decoder. The second acquisition module 502 includes: a first decoding submodule, configured to transmit the corresponding instruction from each basic output channel to the corresponding basic decoder for decoding, so as to obtain the requirement information of the instruction; and a second decoding submodule, configured to, when the next instruction is a target instruction, transmit the target instruction to the new decoder via the new output channel, so as to obtain the requirement information of the target instruction. The target instruction is an instruction that requires at most one source operand.
[0110] Optionally, the processing module 503 includes: a first processing submodule, configured to, when the comparison result indicates that the number of ports can meet the requirement information, process in the current cycle the first instruction to be processed among all instructions and additionally select at least one second instruction to be processed together with it; and a second processing submodule, configured to process only the first instruction to be processed among all instructions in the current cycle when the comparison result indicates that the number of ports cannot meet the requirement information. The second instruction is an instruction that requires at most one source operand.
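The division of work among modules 501, 502, and 503 can be pictured with a small software stand-in. The sketch below is an assumption-laden illustration, not the apparatus itself: the class and method names are hypothetical, the comparison result is represented as the number of spare read ports, and only one candidate second instruction is considered per cycle.

```python
from typing import List


class InstructionProcessingApparatus:
    """Software stand-in for modules 501-503; all names are hypothetical."""

    def __init__(self, read_ports: int) -> None:
        self._read_ports = read_ports

    def acquire_port_count(self) -> int:
        # First acquisition module 501: number of ports of the processor.
        return self._read_ports

    def acquire_and_compare(self, src_counts: List[int]) -> int:
        # Second acquisition module 502: total operand requirement compared
        # with the port count; the result is modeled as the spare port count.
        return self.acquire_port_count() - sum(src_counts)

    def process(self, src_counts: List[int], candidate_srcs: int) -> int:
        # Processing module 503: number of instructions handled this cycle.
        spare = self.acquire_and_compare(src_counts)
        # A second instruction needs at most one source operand and must fit
        # into the spare ports; otherwise only the first instructions proceed.
        can_add = spare > 0 and candidate_srcs <= 1 and candidate_srcs <= spare
        return len(src_counts) + (1 if can_add else 0)
```

For an apparatus created with `InstructionProcessingApparatus(4)`, `process([2, 1], 1)` handles three instructions in the cycle, while `process([2, 2], 0)` handles only the two first instructions.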
[0111] Optionally, the issue width of the multi-issue processor is N, and the number of read ports of the register file of the multi-issue processor is 2N. The second acquisition module 502 includes: a first determination submodule, configured to determine the first instruction to be issued among all instructions; a first acquisition submodule, configured to obtain the first number of source operands that the first instructions need to read; and a first comparison submodule, configured to compare the first number with 2N to obtain the comparison result.
[0112] Optionally, the processing module 503 includes: a third processing submodule, configured to select at least one additional second instruction from the instruction buffer when the comparison result is that the first number is less than 2N; and a fourth processing submodule, configured to issue the second instruction together with the first instruction through the multi-issue processor. The second instruction is an instruction that requires at most one source operand. The sum of the second number of source operands that the selected second instructions need to read and the first number is less than or equal to 2N.
[0113] Optionally, the issue width of the multi-issue processor is M, the number of read ports of the register file of the multi-issue processor is M, and the number of write ports is M. The second acquisition module 502 includes: a second determination submodule, configured to determine the third instruction to be committed among all instructions; a second acquisition submodule, configured to obtain the third number of operands that the third instructions need to write back; and a second comparison submodule, configured to compare the third number with M to obtain the comparison result.
[0114] Optionally, the processing module 503 includes: a fifth processing submodule, configured to select at least one additional, already executed fourth instruction according to the instruction completion order when the comparison result is that the third number is less than M; and a sixth processing submodule, configured to commit the third instruction together with the fourth instruction through the multi-issue processor. The sum of the fourth number of operands that the selected fourth instructions need to write back and the third number is less than or equal to M.
[0115] In summary, in this embodiment, the operand requirement information of the instructions is obtained and compared with the number of ports of the multi-issue processor to obtain a comparison result, which reflects whether the current port resources of the multi-issue processor can meet the actual requirements of the instructions. The instructions are then processed by the multi-issue processor. If the comparison result shows that the port resources meet the requirements with capacity to spare, a larger number of instructions can be selected for processing in the same cycle. This application thus increases the number of instructions the multi-issue processor processes in a single cycle without changing the specifications of the processor itself, which improves instruction processing efficiency at a low overall cost.
[0116] As the device embodiment is basically similar to the method embodiment, the description is relatively simple, and relevant parts can be found in the description of the method embodiment.
[0117] The various embodiments in this specification are described in a progressive manner, with each embodiment focusing on the differences from other embodiments. The same or similar parts between the various embodiments can be referred to each other.
[0118] Regarding the apparatus in the above embodiments, the specific manner in which each module performs its operation has been described in detail in the embodiments related to the method, and will not be elaborated upon here.
[0119] This application provides an instruction processing apparatus, including a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for performing the methods described in one or more of the above embodiments.
[0120] Figure 8 is a block diagram illustrating a first electronic device 400 according to an exemplary embodiment. For example, the first electronic device 400 may be a mobile phone, computer, digital broadcasting terminal, messaging device, game console, tablet device, medical device, fitness equipment, personal digital assistant, etc.
[0121] Referring to Figure 8, the first electronic device 400 may include one or more of the following components: a processing component 402, a first memory 404, a power supply component 406, a multimedia component 408, an audio component 410, an input/output (I/O) interface 412, a sensor component 414, and a communication component 416.
[0122] Processing component 402 typically controls the overall operation of the first electronic device 400, such as operations associated with display, telephone calls, data communication, camera operation, and recording operations. Processing component 402 may include one or more processors 420 to execute instructions to perform all or part of the steps of the methods described above. Furthermore, processing component 402 may include one or more modules to facilitate interaction between processing component 402 and other components. For example, processing component 402 may include a multimedia module to facilitate interaction between multimedia component 408 and processing component 402.
[0123] The first memory 404 is used to store various types of data to support the operation of the first electronic device 400. Examples of such data include instructions for any application or method operating on the first electronic device 400, contact data, phone book data, messages, pictures, multimedia, etc. The first memory 404 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk.
[0124] Power supply component 406 provides power to various components of the first electronic device 400. Power supply component 406 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to the first electronic device 400.
[0125] Multimedia component 408 includes a screen that provides an output interface between the first electronic device 400 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touchscreen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundaries of touch or swipe actions but also the duration and pressure associated with the touch or swipe operation. In some embodiments, multimedia component 408 includes a front-facing camera and / or a rear-facing camera. When the first electronic device 400 is in an operating mode, such as a shooting mode or a multimedia mode, the front-facing camera and / or the rear-facing camera may receive external multimedia data. Each front-facing camera and rear-facing camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
[0126] Audio component 410 is used to output and / or input audio signals. For example, audio component 410 includes a microphone (MIC) used to receive external audio signals when the first electronic device 400 is in an operating mode, such as a call mode, recording mode, or voice recognition mode. The received audio signals may be further stored in the first memory 404 or transmitted via communication component 416. In some embodiments, audio component 410 also includes a speaker for outputting audio signals.
[0127] I / O interface 412 provides an interface between processing component 402 and peripheral interface modules, such as keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to, home buttons, volume buttons, power buttons, and lock buttons.
[0128] Sensor assembly 414 includes one or more sensors for providing state assessments of various aspects of the first electronic device 400. For example, sensor assembly 414 may detect the on / off state of the first electronic device 400, the relative positioning of components such as the display and keypad of the first electronic device 400, changes in position of the first electronic device 400 or a component of the first electronic device 400, the presence or absence of user contact with the first electronic device 400, the orientation or acceleration / deceleration of the first electronic device 400, and temperature changes of the first electronic device 400. Sensor assembly 414 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. Sensor assembly 414 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, sensor assembly 414 may also include an accelerometer, a gyroscope, a magnetometer, a pressure sensor, or a temperature sensor.
[0129] Communication component 416 facilitates wired or wireless communication between the first electronic device 400 and other devices. The first electronic device 400 can access wireless networks based on communication standards, such as WiFi, carrier networks (such as 2G, 3G, 4G, or 5G), or combinations thereof. In one exemplary embodiment, communication component 416 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, communication component 416 also includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
[0130] In an exemplary embodiment, the first electronic device 400 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components to implement the methods provided in the embodiments of this application.
[0131] In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as a first memory 404 including instructions that can be executed by a processor 420 of a first electronic device 400 to perform the above-described method. For example, the non-transitory storage medium may be a ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, and optical data storage device, etc.
[0132] Figure 9 is a block diagram illustrating a second electronic device 500 according to an exemplary embodiment. For example, the second electronic device 500 may be provided as a server. Referring to Figure 9, the second electronic device 500 includes a processing component 522, which further includes one or more processors, and memory resources represented by a second memory 532 for storing instructions, such as application programs, that can be executed by the processing component 522. The application programs stored in the second memory 532 may include one or more modules, each corresponding to a set of instructions. Furthermore, the processing component 522 is configured to execute instructions to perform the methods provided in the embodiments of this application.
[0133] The second electronic device 500 may also include a power supply component 526 configured to perform power management of the second electronic device 500, a wired or wireless network interface 550 configured to connect the second electronic device 500 to a network, and an input/output (I/O) interface 558. The second electronic device 500 may operate on an operating system stored in the second memory 532, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or similar.
[0134] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the methods described in the above embodiments.
[0135] Other embodiments of this application will readily occur to those skilled in the art upon consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of this application that follow the general principles of this application and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only.
[0136] It should be understood that this application is not limited to the precise structure described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope.
Claims
1. A method for processing instructions, characterized in that the method comprises: obtaining the number of ports of a multi-issue processor; obtaining operand requirement information of instructions, and comparing the requirement information with the number of ports to obtain a comparison result; and processing the instructions through the multi-issue processor according to the comparison result.
2. The method according to claim 1, characterized in that the multi-issue processor comprises: a basic output channel, a new output channel, a basic decoder, and a new decoder; and obtaining the operand requirement information of the instructions comprises: transmitting, from each basic output channel, the corresponding instruction to the corresponding basic decoder for decoding, to obtain the requirement information of the instruction; and, if the next instruction is a target instruction, transmitting the target instruction from the new output channel to the new decoder for decoding, to obtain the requirement information of the target instruction; wherein the target instruction is an instruction that requires at most one source operand.
3. The method according to claim 1, characterized in that processing the instructions through the multi-issue processor according to the comparison result comprises: when the comparison result indicates that the number of ports can meet the requirement information, processing, in the current cycle, the first instruction to be processed among all instructions and additionally selecting at least one second instruction to be processed together with it; and, when the comparison result indicates that the number of ports cannot meet the requirement information, processing only the first instruction to be processed among all instructions in the current cycle; wherein the second instruction is an instruction that requires at most one source operand.
4. The method according to claim 1, characterized in that the issue width of the multi-issue processor is N, the number of read ports of the register file of the multi-issue processor is 2N, and obtaining the operand requirement information of the instructions comprises: determining the first instruction to be issued among all instructions; and obtaining the first number of source operands that the first instructions need to read; and comparing the requirement information with the number of ports to obtain the comparison result comprises: comparing the first number with 2N to obtain the comparison result.
5. The method according to claim 4, characterized in that processing the instructions through the multi-issue processor according to the comparison result comprises: if the comparison result is that the first number is less than 2N, selecting at least one additional second instruction from the instruction buffer; and issuing the second instruction together with the first instruction through the multi-issue processor; wherein the second instruction is an instruction that requires at most one source operand, and the sum of the second number of source operands that the selected second instructions need to read and the first number is less than or equal to 2N.
6. The method according to claim 1, characterized in that the issue width of the multi-issue processor is M, the number of read ports of the register file of the multi-issue processor is M, and the number of write ports is M; obtaining the operand requirement information of the instructions comprises: determining the third instruction to be committed among all instructions; and obtaining the third number of operands that the third instructions need to write back; and comparing the requirement information with the number of ports to obtain the comparison result comprises: comparing the third number with M to obtain the comparison result.
7. The method according to claim 6, characterized in that processing the instructions through the multi-issue processor according to the comparison result comprises: if the comparison result is that the third number is less than M, selecting at least one additional, already executed fourth instruction according to the instruction completion order; and committing the third instruction together with the fourth instruction through the multi-issue processor; wherein the sum of the fourth number of operands that the selected fourth instructions need to write back and the third number is less than or equal to M.
8. An instruction processing apparatus, characterized in that the apparatus comprises: a first acquisition module, configured to obtain the number of ports of a multi-issue processor; a second acquisition module, configured to obtain operand requirement information of instructions and compare the requirement information with the number of ports to obtain a comparison result; and a processing module, configured to process the instructions through the multi-issue processor according to the comparison result.
9. An electronic device, characterized in that it comprises: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions to implement the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that, when instructions in the computer-readable storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the method according to any one of claims 1 to 7.