A method and system for outputting data streams at the front end of a pipeline
The method acquires prediction information from a virtual address, obtains instruction codes and converts them into micro-operations, and uses a prediction pipeline and cache pipelines for rapid output. This addresses the complexity of pipeline front-end design and the problem of erroneous data streams, thereby improving processor performance and accuracy.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- NAT UNIV OF DEFENSE TECH
- Filing Date
- 2022-12-26
- Publication Date
- 2026-06-30
Abstract figure: CN115878189B_ABST
Description
Technical Field
[0001] This invention relates to the field of microprocessor design technology, and in particular to a pipeline front-end data stream output method and system.
Background Technology
[0002] In modern microprocessor design, to reduce the actual complexity of the pipeline backend and improve the utilization of execution units, instructions are further broken down into one or more micro-operations for scheduling and execution. In high-performance out-of-order pipeline design, the pipeline frontend typically obtains the instruction encoding to be executed through address translation and instruction buffering. Then, the decoding unit breaks the instruction down into micro-operations, which are recorded in a reorder buffer and simultaneously scheduled and executed out of order. After all micro-operations have been executed, the reorder buffer is responsible for committing the entire instruction.
[0003] To increase processor frequency while supporting complex instruction functions, current high-performance microprocessors have pipeline stages of ten or more. This excessively long pipeline design means that when a branch jump occurs, the processor's execution unit needs many stages to re-execute a valid operation, severely impacting processor performance. Furthermore, to maintain and improve processor performance, current processors are continuously increasing their issue width, which directly leads to a corresponding increase in the width of the decoding unit. As a unit containing a large amount of combinational logic, the increase in the width of the decoding unit will result in a rapid expansion of the number of logic gates, thereby greatly increasing the area and power consumption of the processor chip.
[0004] Therefore, micro-operation caching was proposed. It allows direct lookup of the corresponding micro-operation based on the virtual address. On the one hand, micro-operation caching can reduce the cost of pipeline flushing after a branch occurs; on the other hand, it can also avoid the rapid expansion of logic resources caused by decoding units. Currently, micro-operation caching is widely used in high-performance microprocessors. However, current pipeline front-end designs that support micro-operation lookup generally suffer from problems such as design complexity, high implementation difficulty, low accuracy of micro-operation lookup queries, and coarse-grained micro-operation flow switching.
[0005] Therefore, providing a pipeline front-end data stream output method and system that can both achieve rapid output of micro-operation data streams to the pipeline back end and effectively avoid performance loss caused by erroneous data streams is a problem that urgently needs to be solved by those skilled in the art.
Summary of the Invention
[0006] The purpose of this invention is to provide a method and system for outputting data streams at the front end of a pipeline. This method is logically clear, safe, effective, reliable, and easy to operate. It can not only achieve rapid output of micro-operation data streams to the back end of the pipeline, but also effectively avoid performance loss caused by erroneous data streams.
[0007] Based on the above objectives, the technical solution provided by the present invention is as follows:
[0008] A method for outputting a pipeline front-end data stream includes the following steps:
[0009] Obtain and output the prediction information based on the virtual address;
[0010] The instruction code is obtained and output based on the virtual address and the prediction information;
[0011] The instruction code is converted into a first micro-operation and output;
[0012] A second micro-operation is output based on the virtual address and the prediction information;
[0013] The first micro-operation and the second micro-operation are determined and selected according to preset rules, and then output to the back end of the pipeline.
[0014] Preferably, the prediction information includes: branch direction information and branch position information.
[0015] Preferably, obtaining the prediction information based on the virtual address includes the following steps:
[0016] Pre-define several branches and hierarchize them;
[0017] Based on the virtual address, obtain the branch direction information and branch position information for each branch;
[0018] The branch direction information and branch position information of each branch are output respectively to obtain the instruction code and the second micro-operation.
[0019] Preferably, the step of obtaining and outputting the instruction code based on the virtual address and the prediction information includes the following steps:
[0020] Obtain the physical address based on the virtual address;
[0021] Based on the physical address index, determine whether the instruction cache has been hit;
[0022] If a hit occurs, obtain the first hit information;
[0023] Based on the first hit information and the branch direction information and branch position information of each branch, the corresponding instruction code is obtained and output.
[0024] Preferably, before outputting the second micro-operation based on the virtual address and the prediction information, the method further includes: caching the virtual address and the micro-operation cache query request.
[0025] Preferably, the step of outputting the second micro-operation based on the cached virtual address and the prediction information includes the following steps:
[0026] Based on the virtual address, determine whether the micro-operation cache has been hit;
[0027] If a hit occurs, then obtain the second hit information;
[0028] The second micro-operation is generated and output based on the second hit information, the branch direction information, and the branch position information of each branch.
[0029] Preferably, the step of determining and selecting the first micro-operation and the second micro-operation according to preset rules, and outputting them to the back end of the pipeline, includes the following steps:
[0030] Preset and initialize the stream identifier;
[0031] Based on the hit results of the instruction cache and the micro-operation cache, the stream identifier is injected.
[0032] Preset local stream identifier;
[0033] Determine whether the stream identifier and the local stream identifier are the same. If they are the same, select the final micro-operation from the first micro-operation and the second micro-operation and output it to the pipeline back end.
[0034] A pipeline front-end data stream output system includes: a prediction pipeline, an instruction cache pipeline, a micro-operation cache pipeline, a conversion module, a selection module, and an output module;
[0035] The prediction pipeline is used to obtain and output prediction information based on the virtual address;
[0036] The instruction cache pipeline is used to obtain and output instruction codes based on the virtual address and the prediction information;
[0037] The conversion module is used to convert the instruction code into a first micro-operation;
[0038] The micro-operation cache pipeline is used to output a second micro-operation based on the virtual address and the prediction information;
[0039] The selection module is used to determine and select between the first micro-operation and the second micro-operation according to preset rules, and to output the result to the back end of the pipeline.
[0040] Preferably, the prediction pipeline includes a multi-stage predictor;
[0041] Each level of the predictor is used to obtain the branch direction information and branch position information for each corresponding branch.
[0042] Preferably, the instruction cache pipeline includes: an address translation submodule, an instruction cache tag lookup submodule, and an instruction cache data lookup submodule;
[0043] The address translation submodule is used to obtain the physical address based on the virtual address;
[0044] The instruction cache tag submodule is used to determine whether the instruction cache is hit based on the physical address index, and if it is hit, to obtain the first hit information.
[0045] The instruction cache data submodule is used to obtain and output the corresponding instruction code based on the first hit information and the branch direction information and branch position information of each branch;
[0046] The micro-operation cache pipeline includes: a request cache submodule, a micro-operation cache tag lookup submodule, and a micro-operation cache data lookup submodule;
[0047] The request cache submodule is used to align the micro-operation cache pipeline with the prediction pipeline and the instruction cache pipeline;
[0048] The micro-operation cache tag lookup submodule is used to determine whether the micro-operation cache has been hit based on the cached virtual address. If it has been hit, the second hit information is obtained.
[0049] The micro-operation cache data lookup submodule is used to generate and output the second micro-operation based on the second hit information and the branch direction information and the branch position information of each branch;
[0050] The selection module includes: a setting submodule, an injection submodule, a judgment submodule, and an output submodule;
[0051] The settings submodule is used to set the stream identifier and the local stream identifier;
[0052] The injection submodule is used to inject the stream identifier into the instruction cache pipeline and / or the micro-operation cache pipeline according to the hit result of the instruction cache and the hit result of the micro-operation cache;
[0053] The judgment submodule is used to determine whether the stream identifier and the local stream identifier are the same. If they are the same, the final micro-operation is selected from the first micro-operation and the second micro-operation.
[0054] The output submodule is used to output the final micro-operation to the back end of the pipeline.
[0055] The pipeline front-end data stream output method provided by this invention obtains prediction information through a virtual address, then obtains and outputs instruction codes using the virtual address and prediction information, converting the instruction codes into a first micro-operation; a second micro-operation is directly output based on the virtual address and prediction information; and a final micro-operation is selected from the first and second micro-operations according to preset rules and output to the pipeline back-end. During operation, by aligning the first and second micro-operations and setting preset rules, the final micro-operation is determined and selected, thereby achieving rapid output to the pipeline back-end. Furthermore, because the determination and selection are based on preset rules, performance loss caused by erroneous data streams in the generated micro-operations can be avoided, improving processor efficiency and accuracy.
[0056] The present invention also provides a pipeline front-end data stream output system. This pipeline front-end data stream output system adopts the same technical concept and solves the same technical problem as the pipeline front-end data stream output method. Therefore, this pipeline front-end data stream output system should have the same beneficial effects as the pipeline front-end data stream output method, which will not be elaborated here.
Attached Figure Description
[0057] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0058] Figure 1 is a flowchart of a pipeline front-end data stream output method provided in an embodiment of the present invention;
[0059] Figure 2 is a flowchart of the method for step S1 provided in an embodiment of the present invention;
[0060] Figure 3 is a schematic diagram of the prediction pipeline provided in an embodiment of the present invention;
[0061] Figure 4 is a flowchart of the method for step S2 provided in an embodiment of the present invention;
[0062] Figure 5 is a schematic diagram of the instruction cache pipeline provided in an embodiment of the present invention;
[0063] Figure 6 is a flowchart of the method for step S4 provided in an embodiment of the present invention;
[0064] Figure 7 is a schematic diagram of the micro-operation cache pipeline provided in an embodiment of the present invention;
[0065] Figure 8 is a flowchart of the method for step S5 provided in an embodiment of the present invention;
[0066] Figure 9 is a schematic diagram of a pipeline front-end data stream output system provided in an embodiment of the present invention.
Detailed Implementation
[0067] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0068] The embodiments of this invention are written in a progressive manner.
[0069] This invention provides a method and system for outputting data streams at the front end of a pipeline. It primarily addresses the technical problem in the prior art of how to rapidly output micro-operation data streams to the back end of the pipeline while effectively avoiding the performance loss caused by erroneous data streams.
[0070] As shown in Figure 1, a method for outputting a pipeline front-end data stream includes the following steps:
[0071] S1. Obtain and output the prediction information based on the virtual address;
[0072] S2. Obtain and output the instruction code based on the virtual address and prediction information;
[0073] S3. Convert the instruction code into the first micro-operation and output it;
[0074] S4. Output the second micro-operation based on the virtual address and prediction information;
[0075] S5. Based on preset rules, determine and select the first micro-operation and the second micro-operation, and output them to the back end of the pipeline.
[0076] In step S1, the virtual address is obtained, the prediction pipeline makes a prediction based on the virtual address, obtains the prediction information, and outputs the prediction information to the instruction cache pipeline and the micro-operation cache pipeline respectively.
[0077] In step S2, the instruction cache pipeline queries the virtual address and obtains and outputs the instruction binary code based on the prediction information;
[0078] In step S3, the conversion module converts the instruction binary code into the first micro-operation and outputs it to the selection module. In this embodiment, the conversion module may be a decoding unit.
[0079] In step S4, the micro-operation cache pipeline directly generates the second micro-operation based on the virtual address and prediction information and outputs it to the selection module;
[0080] In step S5, the selection module judges and selects the micro-operations that meet the requirements from the first micro-operation and the second micro-operation according to the preset judgment rules and sends them to the back end of the pipeline.
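Steps S1 to S5 can be pictured as a small dataflow. The following Python sketch is illustrative only and is not the implementation described in this document: the dictionary-based predictor and caches, the function names, and the string form of a micro-operation are all hypothetical, and the `select` rule shown (prefer the micro-operation cache result when one exists) is merely a stand-in for the preset rules, which in this embodiment are based on stream identifiers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Prediction:
    taken: bool    # branch direction information
    position: int  # branch position information (hypothetical unit: slot index)


def predict(vaddr: int, predictor: Dict[int, Prediction]) -> Prediction:
    # S1: obtain prediction information from the virtual address.
    return predictor.get(vaddr, Prediction(taken=False, position=0))


def fetch_code(vaddr: int, pred: Prediction, icache: Dict[int, str]) -> Optional[str]:
    # S2: obtain the instruction code. `pred` is accepted to mirror the
    # interface in the text but is unused in this toy model.
    return icache.get(vaddr)


def decode(code: Optional[str]) -> Optional[List[str]]:
    # S3: convert the instruction code into the first micro-operation(s).
    return None if code is None else [f"uop:{code}"]


def lookup_uops(vaddr: int, pred: Prediction, mcache: Dict[int, List[str]]) -> Optional[List[str]]:
    # S4: obtain the second micro-operation(s) directly by virtual address.
    return mcache.get(vaddr)


def select(first: Optional[List[str]], second: Optional[List[str]]) -> Optional[List[str]]:
    # S5: stand-in rule (an assumption, not the stream-identifier rule).
    return second if second is not None else first


def front_end(vaddr, predictor, icache, mcache):
    pred = predict(vaddr, predictor)
    first = decode(fetch_code(vaddr, pred, icache))
    second = lookup_uops(vaddr, pred, mcache)
    return select(first, second)
```

For example, `front_end(0x40, {}, {0x40: "add"}, {})` goes through the decode path, while adding an entry for `0x40` to the micro-operation cache makes the same call return the cached micro-operations.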
[0081] Preferably, the prediction information includes: branch direction information and branch position information.
[0082] In practical applications, the prediction information includes branch direction information and branch position information. Once the branch direction information and branch position information are determined, the tag array lookup and the data array lookup are performed stage by stage based on them. Through these operations, the instruction code is obtained or the second micro-operation is generated directly.
[0083] As shown in Figure 2, preferably, step S1 specifically includes the following steps:
[0084] A1. Pre-define several branches and hierarchize them;
[0085] A2. Based on the virtual address, obtain the branch direction information and branch position information for each branch;
[0086] A3. Output the branch direction information and branch position information of each branch to obtain the instruction code and the second micro-operation.
[0087] In step A1, the prediction pipeline pre-defines several branches and arranges them into levels;
[0088] In step A2, the prediction pipeline obtains the branch direction information and branch position information of each branch based on the virtual address;
[0089] In step A3, the prediction pipeline outputs the branch direction information and branch position information of each branch to the instruction cache pipeline and the micro-operation cache pipeline, respectively, to obtain the instruction code and the second micro-operation.
[0090] In this embodiment, the prediction pipeline is equipped with multiple stages of predictors, each stage performing direction prediction and branch position prediction. For example, as shown in Figure 3, taking a three-stage predictor, the prediction pipeline consists of three stages: p1, p2, and p3. The first-stage branch predictor outputs the p1 branch direction and p1 branch position information; the second-stage branch predictor outputs the p2 branch direction and p2 branch position information; and the third-stage branch predictor outputs the p3 branch direction and p3 branch position information. This branch direction and position information is then output to the corresponding stages of the instruction cache pipeline and the micro-operation cache pipeline, respectively.
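As a rough illustration of the three-stage example, the sketch below models each predictor stage as a function from a virtual address to a (direction, position) pair and routes the result of stage pN to stages fN and mN. The names `BranchInfo`, `run_prediction_pipeline`, and `fan_out`, and the toy predictor functions, are hypothetical and only serve to show the routing.

```python
from typing import Callable, Dict, List, NamedTuple


class BranchInfo(NamedTuple):
    taken: bool    # branch direction information
    position: int  # branch position information


StagePredictor = Callable[[int], BranchInfo]


def run_prediction_pipeline(vaddr: int, stages: List[StagePredictor]) -> Dict[str, BranchInfo]:
    # Stage i of the prediction pipeline produces the p<i> branch information.
    return {f"p{i}": stage(vaddr) for i, stage in enumerate(stages, start=1)}


def fan_out(pred_by_stage: Dict[str, BranchInfo]) -> Dict[str, BranchInfo]:
    # Route p<i> to the matching stages f<i> (instruction cache pipeline)
    # and m<i> (micro-operation cache pipeline).
    routed = {}
    for name, info in pred_by_stage.items():
        idx = name[1:]
        routed[f"f{idx}"] = info
        routed[f"m{idx}"] = info
    return routed


# Toy predictors (assumptions): p1 predicts not taken, p2 a fixed position,
# p3 derives a position from address bits.
stages = [
    lambda a: BranchInfo(False, 0),
    lambda a: BranchInfo(True, 3),
    lambda a: BranchInfo(True, (a >> 2) & 7),
]
routed = fan_out(run_prediction_pipeline(0x1C, stages))
```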
[0091] As shown in Figure 4, preferably, step S2 specifically includes the following steps:
[0092] B1. Obtain the physical address based on the virtual address;
[0093] B2. Determine if the instruction cache has been hit based on the physical address index;
[0094] B3. If a hit occurs, obtain the first hit information;
[0095] B4. Obtain the corresponding instruction code based on the first hit information and the branch direction and branch position information of each branch, and output it.
[0096] In step B1, the instruction cache pipeline translates the virtual address into a physical address;
[0097] In step B2, the instruction cache pipeline determines whether the instruction cache has been hit based on the physical address index;
[0098] In step B3, if an instruction cache hit occurs, the instruction cache pipeline obtains the first hit information;
[0099] In step B4, the instruction cache pipeline obtains and outputs the corresponding instruction code based on the first hit information and the acquired branch direction and branch position information of each branch.
[0100] As shown in Figure 5, in this embodiment the instruction cache pipeline performs address translation, instruction cache tag lookup (i.e., ICache tag array lookup), and instruction cache data lookup (i.e., ICache data array lookup). Specifically, the address translation submodule translates the virtual address into a physical address; the instruction cache tag lookup submodule (i.e., the ICache tag array lookup submodule) uses the physical address index to determine whether the ICache is hit and obtains the hit information; finally, the instruction cache data lookup submodule (i.e., the ICache data array lookup submodule) retrieves the corresponding instruction binary code from the data array based on the hit information. The figure uses a three-stage instruction cache pipeline (stages f1, f2, and f3) as an example. To reduce unnecessary operations, the address translation submodule decides whether the subsequent tag array lookup is needed based on the p1 branch information: if the p1 branch is canceled, the request in stage f1 is not passed to stage f2; otherwise, if the p1 branch is accepted, the request in stage f1 adjusts the number of values to fetch according to the p1 branch position. Similarly, stages f2 and f3 perform the corresponding operations based on the branch information of stages p2 and p3, respectively. Furthermore, based on the result of the tag array lookup, the instruction cache pipeline passes the ICache hit information to the micro-operation cache pipeline, which in turn passes the MCache hit information to stage f2, yielding the final hit result of stage f2. If the ICache tag array lookup hits and the MCache also indicates a hit, stage f2 does not pass the query request to stage f3.
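The per-stage cancel-or-adjust behavior can be sketched as follows. This is one possible reading, not a definitive model: the `canceled` flag, the `count` field, and the rule that an accepted branch trims the fetch to `position + 1` slots are all assumptions introduced for illustration, since the text does not specify how the number of fetched values is adjusted.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class FetchRequest:
    vaddr: int
    count: int  # hypothetical: number of instruction slots still to fetch


@dataclass(frozen=True)
class StageBranchInfo:
    canceled: bool  # hypothetical flag: the branch for this stage is canceled
    position: int   # branch position, counted in slots from the block start


def advance(req: Optional[FetchRequest], info: StageBranchInfo) -> Optional[FetchRequest]:
    """Apply stage pN's branch information to the request in stage fN.

    Returns None when the request is dropped, otherwise the request that is
    passed on to the next stage.
    """
    if req is None or info.canceled:
        return None
    # Assumption: an accepted branch limits the fetch to the slots up to and
    # including the branch position.
    return replace(req, count=min(req.count, info.position + 1))


req = FetchRequest(vaddr=0x1000, count=8)
after_f3 = req
for info in (StageBranchInfo(False, 5), StageBranchInfo(False, 2), StageBranchInfo(False, 7)):
    after_f3 = advance(after_f3, info)
```

The same `advance` function could be applied to stages m1 to m3 of the micro-operation cache pipeline, which the text describes as behaving similarly.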
[0101] Preferably, before step S4, the method further includes: caching virtual addresses and micro-operation cache query requests.
[0102] In practical applications, the virtual address and the micro-operation cache query request also need to be cached before step S4. In this embodiment, the request cache submodule in the micro-operation cache pipeline caches the virtual address and the micro-operation cache query request in order to align the micro-operation cache pipeline with the prediction pipeline and the instruction cache pipeline.
[0103] As shown in Figure 6, preferably, step S4 specifically includes the following steps:
[0104] C1. Determine whether the micro-operation cache has been hit based on the virtual address;
[0105] C2. If a hit occurs, obtain the second hit information;
[0106] C3. Generate and output the second micro-operation based on the second hit information and the branch direction and branch position information of each branch.
[0107] In step C1, the micro-operation cache pipeline determines whether the micro-operation cache has been hit based on the virtual address;
[0108] In step C2, if a hit occurs, the micro-operation cache pipeline obtains the second hit information;
[0109] In step C3, the micro-operation cache pipeline generates and outputs the second micro-operation based on the acquired second hit information and the branch direction and branch position information of each branch.
[0110] As shown in Figure 7, in this embodiment the micro-operation cache pipeline performs request caching, micro-operation cache tag lookup (i.e., MCache tag array lookup), and micro-operation cache data lookup (i.e., MCache data array lookup). As illustrated, after the request cache stage aligns the micro-operation cache pipeline with the prediction pipeline and the instruction cache pipeline, the remaining two stages correspond to the MCache tag array lookup and the MCache data array lookup, respectively. Similar to the instruction cache pipeline, stages m1, m2, and m3 cancel and adjust queries based on information from stages p1 to p3 of the prediction pipeline. In addition, stage m2 interacts with the instruction cache pipeline to exchange ICache hit information and MCache hit information. If the ICache hits in stage f2 and the MCache tag array lookup also hits, the request continues into stage m3; otherwise, if only one of them hits or neither hits, the request does not flow into stage m3.
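The hit exchange between stages f2 and m2 amounts to a small decision table. The sketch below encodes it under stated assumptions: the rule for stage m3 and the both-hit rule for stage f3 follow the text, whereas letting an ICache-only hit proceed to stage f3 and stopping the f-path on an ICache miss are inferences, not statements from the text. The names `Stage2Decision` and `arbitrate_stage2` are hypothetical.

```python
from typing import NamedTuple


class Stage2Decision(NamedTuple):
    to_f3: bool  # request continues to the ICache data array lookup
    to_m3: bool  # request continues to the MCache data array lookup


def arbitrate_stage2(icache_hit: bool, mcache_hit: bool) -> Stage2Decision:
    # Stage m3 is entered only when both tag lookups hit (from the text).
    to_m3 = icache_hit and mcache_hit
    # Stage f3 is skipped when both hit (from the text). Continuing to f3 on
    # an ICache-only hit, and not continuing on an ICache miss, are assumptions.
    to_f3 = icache_hit and not mcache_hit
    return Stage2Decision(to_f3=to_f3, to_m3=to_m3)
```

With this table, at most one of the two data array lookups is performed for any request, which matches the stated goal of reducing unnecessary operations.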
[0111] As shown in Figure 8, preferably, step S5 specifically includes the following steps:
[0112] D1. Preset and initialize the stream identifier;
[0113] D2. Based on the hit results of the instruction cache and the micro-operation cache, inject the stream identifier;
[0114] D3. Preset local stream identifier;
[0115] D4. Determine if the stream identifier and the local stream identifier are the same. If they are the same, select the final micro-operation from the first micro-operation and the second micro-operation and output it to the pipeline back end.
[0116] In step D1, the stream identifier is set and initialized;
[0117] In step D2, the stream identifier is injected into the instruction cache pipeline and / or the micro-operation cache pipeline respectively, based on the hit results of the instruction cache and the micro-operation cache.
[0118] In step D3, set the local stream identifier;
[0119] In step D4, it is determined whether the stream identifier and the local stream identifier are the same according to the preset rules. If they are the same, the final micro-operation is selected from the first micro-operation and the second micro-operation and output to the pipeline back end.
[0120] In this embodiment, a two-bit stream identifier (StreamID) is set and initialized to 0. The instruction cache pipeline and the micro-operation cache pipeline inject StreamID information into the pipeline as needed:
[0121] a. Case 1: In the f2 / m2 stage, both the ICache tag array lookup and the MCache tag array lookup hit. The StreamID is then injected into the micro-operation in the m2 stage, so that the micro-operation data obtained from the MCache data array carries additional StreamID information. The injected StreamID value and the change to StreamID are as follows:
[0122] a) If the current StreamID is 2'bx1, the StreamID remains unchanged, and the injected StreamID is the current StreamID;
[0123] b) If the current StreamID is 2'bx0, the StreamID becomes 2'bx1, and the injected StreamID is also 2'bx1 (the x bit remains the same).
[0124] b. Case 2: In the f2 / m2 stage, the ICache tag array lookup hits, but the MCache tag array lookup misses. In this case, the StreamID is injected into the instruction code in the f2 stage, and the micro-operation data obtained after subsequent decoding carries additional StreamID information. The injected StreamID value and the change to StreamID are as follows:
[0125] a) If the current StreamID is 2'bx0, the StreamID remains unchanged, and the injected StreamID is the current StreamID;
[0126] b) If the current StreamID is 2'bx1, the StreamID is incremented by one (StreamID + 1), and the injected StreamID is this incremented value.
[0127] c. Case 3: In the f2 / m2 stage, both the ICache tag array lookup and the MCache tag array lookup miss. The pipeline restarts, no StreamID injection is needed, and the StreamID remains unchanged.
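The three injection cases can be expressed as a single update function on the two-bit StreamID. The sketch below is a hedged illustration: the function name `inject_stream_id` is hypothetical, wrapping from 3 back to 0 is assumed from the two-bit width, and the combination of an ICache miss with an MCache hit, which the cases above do not describe, is treated like Case 3 as an assumption.

```python
from typing import Optional, Tuple


def inject_stream_id(stream_id: int, icache_hit: bool, mcache_hit: bool) -> Tuple[int, Optional[int]]:
    """Return (updated StreamID, injected StreamID or None if nothing is injected)."""
    if icache_hit and mcache_hit:
        # Case 1: micro-operation cache stream. Force the low bit to 1;
        # the high bit (x) is kept.
        new_id = stream_id | 0b01
        return new_id, new_id
    if icache_hit:
        # Case 2: decode stream. Keep 2'bx0; advance 2'bx1 by one
        # (two-bit wrap-around is an assumption).
        new_id = (stream_id + 1) & 0b11 if stream_id & 0b01 else stream_id
        return new_id, new_id
    # Case 3 (and, by assumption, ICache miss with MCache hit): no injection.
    return stream_id, None


# Trace: decode stream, then a switch to the MCache stream, then back.
sid = 0
trace = []
for hits in [(True, False), (True, True), (True, True), (True, False), (False, False)]:
    sid, injected = inject_stream_id(sid, *hits)
    trace.append(injected)
```

Under these rules, micro-operations from the decode path always carry an even StreamID and those from the micro-operation cache path always carry an odd one, so every switch between the two sources advances the identifier by one.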
[0128] Next, a two-bit local stream identifier (LStreamID) is set and initialized to 0. The micro-operation to output is then selected, and LStreamID is updated, according to the following cases:
[0129] a. Case 1: The StreamID carried by the first micro-operation or by the second micro-operation is the same as LStreamID. In this case, the micro-operation on the path carrying that StreamID value is output.
[0130] b. Case 2: The StreamIDs carried by the first and second micro-operations both differ from LStreamID. In this case, LStreamID is incremented by one and the judgment is performed again, so that the micro-operation matching the updated LStreamID can be output.
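A minimal sketch of this selection step is shown below, assuming each micro-operation arrives tagged with the StreamID injected earlier. The `TaggedUop` type and `select_uop` function are hypothetical; returning `None` when no micro-operation matches even after the increment, and the two-bit wrap-around of LStreamID, are assumptions, since the text does not describe those situations.

```python
from typing import NamedTuple, Optional, Tuple


class TaggedUop(NamedTuple):
    stream_id: int  # StreamID injected in stage f2 or m2
    payload: str    # placeholder for the micro-operation contents


def select_uop(
    first: Optional[TaggedUop],
    second: Optional[TaggedUop],
    lstream_id: int,
) -> Tuple[Optional[TaggedUop], int]:
    """Return (selected micro-operation or None, updated LStreamID)."""
    candidates = [u for u in (first, second) if u is not None]

    def match(lsid: int) -> Optional[TaggedUop]:
        return next((u for u in candidates if u.stream_id == lsid), None)

    hit = match(lstream_id)
    if hit is not None or not candidates:
        # Case 1, or nothing to judge in this cycle.
        return hit, lstream_id
    # Case 2: neither matches, so advance LStreamID once and judge again.
    bumped = (lstream_id + 1) & 0b11
    return match(bumped), bumped
```

For example, with LStreamID at 0, a decoded micro-operation tagged 0 is output directly, whereas a micro-operation from the micro-operation cache tagged 1 is output only after LStreamID advances to 1.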
[0131] A pipeline front-end data stream output system includes: a prediction pipeline, an instruction cache pipeline, a micro-operation cache pipeline, a conversion module, a selection module, and an output module;
[0132] The prediction pipeline is used to obtain and output prediction information based on the virtual address;
[0133] The instruction cache pipeline is used to obtain and output instruction codes based on the virtual address and the prediction information;
[0134] The conversion module is used to convert the instruction code into a first micro-operation;
[0135] The micro-operation cache pipeline is used to output a second micro-operation based on the virtual address and the prediction information;
[0136] The selection module is used to determine and select between the first and second micro-operations according to preset rules, and to output the result to the back end of the pipeline.
[0137] In practical applications, the pipeline front-end data stream output system is equipped with a prediction pipeline, an instruction cache pipeline, a micro-operation cache pipeline, a conversion module, a selection module, and an output module. The prediction pipeline obtains prediction information based on the virtual address and outputs it to the instruction cache pipeline and the micro-operation cache pipeline respectively. The instruction cache pipeline obtains the instruction code based on the virtual address and the prediction information and outputs the instruction code to the conversion module. The conversion module converts the instruction code into a first micro-operation and outputs it to the selection module. The micro-operation cache pipeline generates a second micro-operation based on the virtual address and the prediction information and outputs it to the selection module. The selection module judges according to preset rules, selects the final micro-operation from the first and second micro-operations, and outputs the final micro-operation to the pipeline back end.
[0138] Preferably, the prediction pipeline includes multi-stage predictors;
[0139] Each predictor level is used to obtain the branch direction information and branch position information for each corresponding branch.
[0140] In practical applications, the prediction pipeline is equipped with multiple predictors; each predictor includes direction prediction and branch position prediction.
[0141] Preferably, the instruction cache pipeline includes: an address translation submodule, an instruction cache tag lookup submodule, and an instruction cache data lookup submodule;
[0142] The address translation submodule is used to obtain the physical address based on the virtual address.
[0143] The instruction cache tag submodule is used to determine whether the instruction cache has been hit based on the physical address index. If it has been hit, the first hit information is obtained.
[0144] The instruction cache data submodule is used to obtain and output the corresponding instruction code based on the first hit information and the branch direction and branch position information of each branch;
[0145] The micro-operation cache pipeline includes: a request cache submodule, a micro-operation cache tag lookup submodule, and a micro-operation cache data lookup submodule;
[0146] The request cache submodule is used to align the micro-operation cache pipeline with the prediction pipeline and the instruction cache pipeline;
[0147] The micro-operation cache tag lookup submodule is used to determine whether the micro-operation cache has been hit based on the cached virtual address. If it has been hit, the second hit information is obtained.
[0148] The micro-operation cache data lookup submodule is used to generate and output the second micro-operation based on the second hit information and the branch direction and branch position information of each branch;
[0149] The selection module includes: setting submodule, injection submodule, judgment submodule, and output submodule;
[0150] The settings submodule is used to set the stream identifier and the local stream identifier;
[0151] The injection submodule is used to inject the stream identifier into the instruction cache pipeline and / or the micro-operation cache pipeline based on the hit results of the instruction cache and the micro-operation cache.
[0152] The judgment submodule is used to determine whether the stream identifier and the local stream identifier are the same. If they are the same, the final micro-operation is selected from the first micro-operation and the second micro-operation.
[0153] The output submodule is used to output the final micro-operations to the back end of the pipeline.
[0154] In practical applications, the instruction cache pipeline includes an address translation submodule, an instruction cache tag lookup submodule, and an instruction cache data lookup submodule. The instruction cache pipeline performs address translation through the address translation submodule, tag lookup through the instruction cache tag lookup submodule, and data lookup through the instruction cache data lookup submodule to retrieve the corresponding instruction binary code information.
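As a non-limiting illustration, the three stages of the instruction cache pipeline described above can be sketched in software. The sketch below is only one possible reading and is not the claimed hardware: the page size, line size, direct-mapped organization, the `fill` helper, and every class, method, and parameter name are assumptions introduced for the example. The branch position is modeled as a byte offset `branch_end` within the cache line at which a predicted-taken branch ends the fetch block.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

PAGE_BITS = 12   # assumed 4 KiB pages
LINE_BITS = 6    # assumed 64-byte cache lines
NUM_SETS = 64    # assumed direct-mapped instruction cache


@dataclass
class FirstHitInfo:
    """Hypothetical form of the 'first hit information': the matching set."""
    set_index: int


class InstructionCachePipeline:
    def __init__(self, page_table: Dict[int, int]) -> None:
        self.page_table = page_table                      # virtual page -> physical page
        self.tags: List[Optional[int]] = [None] * NUM_SETS
        self.lines: List[bytes] = [b""] * NUM_SETS

    def translate(self, vaddr: int) -> int:
        """Address translation submodule: virtual address -> physical address."""
        offset = vaddr & ((1 << PAGE_BITS) - 1)
        return (self.page_table[vaddr >> PAGE_BITS] << PAGE_BITS) | offset

    def tag_lookup(self, paddr: int) -> Optional[FirstHitInfo]:
        """Tag lookup submodule: index with the physical address, compare tags."""
        line_number = paddr >> LINE_BITS
        index, tag = line_number % NUM_SETS, line_number // NUM_SETS
        return FirstHitInfo(index) if self.tags[index] == tag else None

    def data_lookup(self, hit: FirstHitInfo, paddr: int,
                    taken: bool, branch_end: int) -> bytes:
        """Data lookup submodule: read the line and cut it at a taken branch."""
        start = paddr & ((1 << LINE_BITS) - 1)
        end = branch_end if taken else (1 << LINE_BITS)
        return self.lines[hit.set_index][start:end]

    def fill(self, paddr: int, line: bytes) -> None:
        """Helper for the example only: install a line in the cache."""
        line_number = paddr >> LINE_BITS
        self.tags[line_number % NUM_SETS] = line_number // NUM_SETS
        self.lines[line_number % NUM_SETS] = line

    def fetch(self, vaddr: int, taken: bool = False,
              branch_end: int = 0) -> Optional[bytes]:
        """Run the three stages in order; return None on a cache miss."""
        paddr = self.translate(vaddr)
        hit = self.tag_lookup(paddr)
        if hit is None:
            return None
        return self.data_lookup(hit, paddr, taken, branch_end)
```

In this sketch a miss simply returns `None`; refill from lower levels of the memory hierarchy is outside the scope of the example.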
[0155] The micro-operation cache pipeline includes a request cache submodule, a micro-operation cache tag lookup submodule, and a micro-operation cache data lookup submodule. The micro-operation cache pipeline uses the request cache submodule to cache the virtual address and the micro-operation cache query request, thereby aligning the micro-operation cache pipeline with the prediction pipeline and the instruction cache pipeline. After alignment, the micro-operation cache tag lookup submodule performs tag lookup, and the micro-operation cache data lookup submodule performs data lookup to generate the second micro-operation.
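As a non-limiting illustration, the alignment performed by the request cache submodule can be modeled as a fixed-depth delay queue placed in front of the tag and data lookups. The sketch below is an assumption-laden model rather than the claimed circuit: the delay depth `ALIGN_DELAY`, the dictionary-based cache keyed directly by virtual address, the `fill` helper, and the encoding of the prediction as a `(taken, branch_index)` pair are all hypothetical. The prediction passed to `step` is taken to belong to the request that leaves the request buffer in that cycle.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple

ALIGN_DELAY = 2  # assumed number of cycles needed to align with the other pipelines


class MicroOpCachePipeline:
    def __init__(self, delay: int = ALIGN_DELAY) -> None:
        # Request cache submodule: holds virtual addresses for `delay` cycles.
        self.request_buffer: Deque[Optional[int]] = deque([None] * delay)
        # Micro-operation cache contents, keyed by virtual address (assumption).
        self.entries: Dict[int, List[str]] = {}

    def fill(self, vaddr: int, uops: List[str]) -> None:
        """Helper for the example only: install decoded micro-operations."""
        self.entries[vaddr] = list(uops)

    def step(self, new_vaddr: Optional[int],
             prediction: Optional[Tuple[bool, int]] = None) -> Optional[List[str]]:
        """Advance one cycle.

        `new_vaddr` enters the request buffer; the oldest request leaves it
        and goes through tag lookup and data lookup. Returns the second
        micro-operations on a hit, or None on a bubble or a miss.
        """
        self.request_buffer.append(new_vaddr)
        vaddr = self.request_buffer.popleft()
        if vaddr is None:
            return None                      # bubble: no request this cycle
        if vaddr not in self.entries:        # tag lookup on the virtual address
            return None                      # micro-operation cache miss
        uops = self.entries[vaddr]           # data lookup (second hit information)
        taken, branch_index = prediction if prediction else (False, 0)
        # A predicted-taken branch ends the group at the branch micro-operation.
        return uops[:branch_index + 1] if taken else list(uops)
```

With the default delay of two cycles, a request issued in one call to `step` produces its result two calls later, which is the cycle in which the matching prediction is supplied.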
[0156] The selection module includes a setting submodule, an injection submodule, a judgment submodule, and an output submodule. The selection module sets the stream identifier and the local stream identifier through the setting submodule; injects the stream identifier into the instruction cache pipeline and / or the micro-operation cache pipeline through the injection submodule; judges, through the judgment submodule, whether the stream identifier and the local stream identifier are the same, and if they are the same, selects the final micro-operation from the first micro-operation and the second micro-operation; and outputs the final micro-operation to the pipeline back end through the output submodule.
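As a non-limiting illustration, the identifier comparison performed by the selection module can be sketched as follows. Several points are assumptions made only for the example and are not stated in this passage: that the stream identifier is advanced on a front-end redirect so that results tagged with an older local stream identifier are discarded, that the micro-operation cache result is preferred over the decoded result when both are valid, and all class and method names (`SelectionModule`, `inject`, `redirect`, `select`).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaggedUops:
    """Micro-operations carrying the local stream identifier injected at issue."""
    local_stream_id: int
    uops: List[str]


class SelectionModule:
    def __init__(self) -> None:
        # Setting submodule: initialise the current stream identifier.
        self.stream_id = 0

    def inject(self, uops: List[str]) -> TaggedUops:
        """Injection submodule: tag a pipeline result with the current identifier."""
        return TaggedUops(self.stream_id, list(uops))

    def redirect(self) -> None:
        """Assumed behaviour: a redirect starts a new stream of requests."""
        self.stream_id += 1

    def select(self, first: Optional[TaggedUops],
               second: Optional[TaggedUops]) -> Optional[List[str]]:
        """Judgment and output submodules.

        Only results whose local stream identifier equals the current stream
        identifier are eligible. Preferring `second` (the micro-operation
        cache result) is an assumed preset rule.
        """
        for candidate in (second, first):
            if candidate is not None and candidate.local_stream_id == self.stream_id:
                return candidate.uops
        return None
```

Under these assumptions, a result that was issued before a redirect never reaches the back end, because its local stream identifier no longer matches the current stream identifier.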
[0157] In the embodiments provided in this application, it should be understood that the disclosed methods and systems can be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division of modules is only a logical functional division, and in actual implementation, there may be other division methods, such as: multiple modules or components can be combined, or integrated into another system, or some features can be ignored or not executed. In addition, the coupling, direct coupling, or communication connection between the various components shown or discussed can be through some interfaces, indirect coupling or communication connection of devices or modules, and can be electrical, mechanical, or other forms.
[0158] Furthermore, in the various embodiments of the present invention, each functional module can be fully integrated into a processor, or each module can be a separate device, or two or more modules can be integrated into a device; each functional module in the various embodiments of the present invention can be implemented in hardware or in the form of hardware plus software functional units.
[0159] Those skilled in the art will understand that all or part of the steps of the above method embodiments can be implemented by program instructions and related hardware. The aforementioned program instructions can be stored in a computer-readable storage medium. When the program instructions are executed, they perform the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as mobile storage devices, read-only memory (ROM), magnetic disks, or optical disks.
[0160] It should be understood that the use of terms such as "system," "device," "unit," and / or "module" in this application is merely one method of distinguishing different components, elements, parts, sections, or assemblies at different levels. However, if other terms can achieve the same purpose, they may be replaced by other expressions.
[0161] As indicated in this application and claims, unless the context clearly indicates otherwise, the words "a," "an," and / or "the" are not specifically singular and may include the plural. Generally, the terms "comprising" and "including" only indicate the inclusion of expressly identified steps and elements, which do not constitute an exclusive list, and the method or apparatus may also include other steps or elements. An element defined by the phrase "comprising an..." does not exclude the presence of other identical elements in the process, method, product, or apparatus that includes the element.
[0162] In the description of the embodiments of this application, unless otherwise stated, " / " means "or", for example, A / B can mean A or B; "and / or" in this document is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, and B existing alone. Furthermore, in the description of the embodiments of this application, "multiple" refers to two or more.
[0163] Hereinafter, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature.
[0164] If a flowchart is used in this application, it is used to illustrate the operations performed by the system according to embodiments of this application. It should be understood that the preceding or following operations are not necessarily performed in exact order. Instead, the steps can be processed in reverse order or simultaneously. Furthermore, other operations can be added to these processes, or one or more steps can be removed from them.
[0165] The foregoing has provided a detailed description of a pipeline front-end data stream output method and system provided by the present invention. The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Therefore, the present invention is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims
1. A pipeline front-end data stream output method, characterized in that it comprises the following steps: obtaining and outputting prediction information based on a virtual address; obtaining and outputting an instruction code based on the virtual address and the prediction information; converting the instruction code into a first micro-operation and outputting it; outputting a second micro-operation based on the virtual address and the prediction information; and judging and selecting between the first micro-operation and the second micro-operation according to preset rules, and outputting the result to the back end of the pipeline; wherein the step of obtaining prediction information based on the virtual address comprises: predefining several branches and arranging them hierarchically; obtaining, based on the virtual address, branch direction information and branch position information for each branch; and outputting the branch direction information and the branch position information of each branch, respectively, for use in obtaining the instruction code and the second micro-operation; the step of obtaining and outputting the instruction code based on the virtual address and the prediction information comprises: obtaining a physical address based on the virtual address; determining, by indexing with the physical address, whether the instruction cache has been hit; if a hit occurs, obtaining first hit information; and obtaining and outputting the corresponding instruction code based on the first hit information and the branch direction information and branch position information of each branch; the step of outputting the second micro-operation based on the cached virtual address and the prediction information comprises: determining, based on the virtual address, whether the micro-operation cache has been hit; if a hit occurs, obtaining second hit information; and generating and outputting the second micro-operation based on the second hit information and the branch direction information and branch position information of each branch; and the step of judging and selecting between the first micro-operation and the second micro-operation according to preset rules and outputting the result to the back end of the pipeline comprises: presetting and initializing a stream identifier; injecting a preset local stream identifier based on the hit results of the instruction cache and the micro-operation cache; and determining whether the stream identifier and the local stream identifier are the same, and if they are the same, selecting the final micro-operation from the first micro-operation and the second micro-operation and outputting it to the back end of the pipeline.
2. The pipeline front-end data stream output method as described in claim 1, characterized in that the prediction information includes: branch direction information and branch position information.
3. The pipeline front-end data stream output method as described in claim 2, characterized in that, before outputting the second micro-operation based on the virtual address and the prediction information, the method further includes: caching the virtual address and the micro-operation cache query request.
4. A pipeline front-end data stream output system, characterized in that it is used to implement the pipeline front-end data stream output method described in claim 1 and comprises: a prediction pipeline, an instruction cache pipeline, a micro-operation cache pipeline, a conversion module, and a selection module; the prediction pipeline is used to obtain and output prediction information based on the virtual address; the instruction cache pipeline is used to obtain and output an instruction code based on the virtual address and the prediction information; the conversion module is used to convert the instruction code into a first micro-operation; the micro-operation cache pipeline is used to output a second micro-operation based on the virtual address and the prediction information; and the selection module is used to judge and select between the first micro-operation and the second micro-operation according to preset rules and to output the result to the back end of the pipeline.
5. The pipeline front-end data stream output system as described in claim 4, characterized in that the prediction pipeline includes multiple stages of predictors; each stage of predictor is used to obtain the branch direction information and branch position information for each corresponding branch.