A method for automatically generating an instruction matching template based on an LLVM intermediate representation

By automatically generating instruction matching templates for LLVM intermediate representations, the problems of error-prone manual template writing and high cross-architecture adaptation costs are solved, achieving efficient and automated instruction matching template generation, thus improving compilation and development efficiency.

CN122240078A · Pending · Publication Date: 2026-06-19 · FALCON TECHNOLOGY (GUANGZHOU) CO LTD +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
FALCON TECHNOLOGY (GUANGZHOU) CO LTD
Filing Date
2026-03-25
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

In existing technologies, the instruction matching templates of the LLVM compiler are prone to errors when written manually, have high debugging costs, require significant investment in cross-architecture adaptation development, and lack flexibility in obtaining instruction matching conditions, resulting in low compilation efficiency and high development costs.

Method used

By parsing the LLVM IR instruction description file to generate a directed acyclic graph, performing legalization and optimization operations, and automatically generating instruction matching templates in TableGen format to meet the target architecture constraints, the system reduces manual intervention and errors.

Benefits of technology

It achieves efficient automatic generation of instruction matching templates, reduces human error, improves development efficiency, reduces verification costs, and adapts to different architectures with varying degrees of automation.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention discloses an automatic generation method for instruction matching templates based on LLVM intermediate representation, relating to the field of computer technology. It includes the following steps: Step 1, obtaining and parsing the LLVM IR instruction description file to generate a directed acyclic graph (DAG) of instruction operation behavior; Step 2, performing instruction behavior legalization processing on the DAG, analyzing types and operations incompatible with the target architecture in the DAG, and replacing them with operation sequences supported by the target architecture. This invention automatically generates instruction selection rules matching the target machine instruction set by analyzing the function semantic features, operand dependencies, and target architecture instruction set constraints of LLVM IR. These rules are used in the code generation stage of the LLVM compiler backend, solving the problems of low development efficiency, poor cross-hardware platform adaptability, and compiler code performance degradation caused by manual coding errors in existing technologies that involve manually writing instruction matching templates.

Description

Technical Field

[0001] This invention relates to the field of computer technology, specifically to a method for automatically generating instruction matching templates based on LLVM intermediate representation.

Background Technology

[0002] In the LLVM (Low-Level Virtual Machine) compiler framework, instruction selection is a crucial step in the compiler backend, responsible for converting the machine-independent intermediate representation (IR) into machine instructions for the target architecture. This process relies on instruction matching patterns, which describe how to map machine-independent IR to target architecture instructions using predefined rules. Currently, LLVM instruction matching patterns are primarily written manually in TableGen (a declarative language) and integrated into instruction selectors such as SelectionDAG or GlobalISel. However, existing technologies suffer from the following problems: First, manually written templates are error-prone and costly to debug. They are prone to issues such as missing operand constraints, incorrect instruction matching conditions, and incorrect output register bindings, leading to compilation failures or the generation of incorrect machine code. Furthermore, template errors typically surface only in complex code paths, so debugging must combine source code analysis with instruction behavior analysis, which makes repairs lengthy and costly.

[0003] Second, development investment and cross-architecture adaptation costs are high. Writing high-performance templates requires a thorough understanding of the target instruction set architecture and LLVM internal mechanisms (such as SelectionDAG node topology rules), resulting in a high development threshold. Furthermore, instruction optimization for different architectures (such as ARM Neon vs. x86 AVX) requires rewriting templates, and adapting to new instruction set extensions (such as the RISC-V V extension) requires repeated investment of development personnel.

[0004] Xi'an University of Electronic Science and Technology proposed a compiler for extended instruction sets in its patent application, "A compiler and compilation method for extensible instruction sets" (patent application number: 201911298413.1, patent publication number: CN111078290A). This compiler supports semantic matching when selecting instructions for newly added extended instruction sets. The method involves obtaining a compilation strategy file (.policy format) containing instruction matching conditions from a specified path, reading the matching conditions from the strategy file, setting a corresponding matching function for each matching condition, and saving it to a function table. This matching function identifies special code segments through pattern matching and replaces the special code segments with extended instructions.

[0005] To support extended instruction sets, that compiler can extract extended instruction information and register information from the instruction set description file. However, the instruction matching condition information must still be supplied through a file at a specified path. That is, the instruction matching template of the extended instruction set is not extracted and parsed from the instruction set description file, but is instead obtained from an external input file. This way of obtaining instruction matching conditions is not flexible enough, and it also increases the debugging cost and the human resources invested in development. Therefore, this invention proposes an automatic generation method for instruction matching templates based on LLVM intermediate representation to solve this problem.

Summary of the Invention

[0006] Technical problems to be solved: To address the shortcomings of existing technologies, this invention provides an automatic generation method for instruction matching templates based on LLVM intermediate representation, which solves the problems mentioned in the background section.

[0007] Technical solution: To achieve the above objectives, the present invention provides the following technical solution: an automatic generation method for instruction matching templates based on LLVM intermediate representation, comprising the following steps.

Step 1: Obtain and parse the LLVM IR instruction description file to generate a directed acyclic graph of instruction operation behavior. This step parses the LLVM IR instruction description file to provide basic data for subsequent processing. The LLVM IR instruction description file is the behavioral carrier for the corresponding new instruction and includes at least the following: the name of the new instruction, the names and number of the instruction operands, and the LLVM IR function expression of the instruction behavior. Furthermore, the function name in the file is consistent with the target instruction name, and the number and names of the function parameters correspond one-to-one with the operands of the target instruction.
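The naming contract described above (function name equals instruction name, parameters correspond one-to-one with operands) can be checked mechanically. The following is a minimal sketch, assuming a simplified single-line `define` header; `InstructionSpec` and `check_description` are hypothetical names introduced only for illustration and are not part of the described tool.

```python
import re
from dataclasses import dataclass

@dataclass
class InstructionSpec:
    """Hypothetical record of what the target instruction expects."""
    name: str
    operands: list  # operand names, in order

# Simplified assumption: 'define <ret> @name(<type> %p, ...)' fits on one line.
DEFINE_RE = re.compile(r"define\s+\S+\s+@(\w+)\s*\(([^)]*)\)")

def check_description(ir_text, spec):
    """Return a list of contract violations (an empty list means consistent)."""
    m = DEFINE_RE.search(ir_text)
    if m is None:
        return ["no function definition found"]
    func_name, raw_params = m.group(1), m.group(2)
    # Each parameter looks like 'i32 %a'; keep the name after '%'.
    params = [p.split("%")[-1].strip() for p in raw_params.split(",") if p.strip()]
    errors = []
    if func_name != spec.name:
        errors.append(f"function name {func_name!r} != instruction name {spec.name!r}")
    if len(params) != len(spec.operands):
        errors.append(f"expected {len(spec.operands)} parameters, found {len(params)}")
    elif params != spec.operands:
        errors.append(f"parameter names {params} != operand names {spec.operands}")
    return errors

ir = "define i32 @add_custom(i32 %a, i32 %b) {\n  ret i32 %a\n}"
ok = check_description(ir, InstructionSpec("add_custom", ["a", "b"]))
bad = check_description(ir, InstructionSpec("add_custom", ["a", "b", "c"]))
```

Here `ok` is an empty list, while `bad` reports the parameter-count mismatch.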

[0008] The instruction description file is parsed using LLVM IR functions to convert the instruction's behavioral logic into a directed acyclic graph (DAG) representing the instruction's operational behavior. The nodes of this DAG correspond to the instruction's operational steps, and the edges correspond to the operand dependencies.
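To make the node and edge structure concrete, the sketch below builds a small dependency graph from straight-line SSA text. It is an illustration under simplifying assumptions (one instruction per line, regular-expression parsing instead of LLVM's own IR parser); `build_dag` and `edges` are hypothetical helper names.

```python
import re

ASSIGN_RE = re.compile(r"^\s*(%\w+)\s*=\s*(.+)$")

def build_dag(body_lines):
    """Map each defined SSA value to (opcode, list of SSA operands it uses)."""
    nodes = {}
    for line in body_lines:
        line = line.split(";")[0].strip()  # drop trailing IR comments
        m = ASSIGN_RE.match(line)
        if not m:
            continue  # store/ret and blank lines define no value here
        result, rhs = m.group(1), m.group(2)
        opcode = rhs.split()[0]
        operands = re.findall(r"%\w+", rhs)
        nodes[result] = (opcode, operands)
    return nodes

def edges(nodes):
    """Dependency edges as (user, used) pairs."""
    return [(user, used) for user, (_, ops) in nodes.items() for used in ops]

body = [
    "%temp = add i32 %a, %b",
    "%overflow = icmp slt i32 %temp, %a",
    "%res = select i1 %overflow, i32 -1, i32 %temp",
]
dag = build_dag(body)
```

Each key of `dag` is a node (one operation step), and each pair returned by `edges(dag)` is an operand dependency.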

[0009] Step 2: Perform instruction behavior legalization processing on the directed acyclic graph to obtain legal LLVM IR functions. This step analyzes the types and operations in the DAG that are incompatible with the target architecture and replaces them with operation sequences supported by the target architecture. The detailed processing method is as follows: an absolute value (ABS) operation is replaced with a combination of SELECT and ICMP instructions; a minimum value (MIN) / maximum value (MAX) operation is replaced with a combination of CMP and SELECT instructions; and 64-bit integer operations that are not supported by the target architecture are broken down into sequences of 32-bit operations. The validity check is repeated until all nodes in the directed acyclic graph conform to the instruction set constraints of the target architecture.
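The replacement rules and the repeat-until-legal loop can be sketched on a toy expression tree. This is an illustrative model only: the tuple encoding, the `SUPPORTED` set, and the function names are assumptions, and the small interpreter ignores 32-bit wraparound because it exists only to check that the rewrites preserve meaning.

```python
SUPPORTED = {"add", "sub", "icmp_slt", "icmp_sgt", "select"}  # assumed target ops

def rewrite(node):
    """Rewrite unsupported ABS/MIN/MAX nodes; recurse into operands first."""
    if not isinstance(node, tuple):
        return node  # leaf: parameter name or integer constant
    op, *args = node
    args = [rewrite(a) for a in args]
    if op == "abs":
        (x,) = args
        return ("select", ("icmp_slt", x, 0), ("sub", 0, x), x)
    if op == "min":
        a, b = args
        return ("select", ("icmp_slt", a, b), a, b)
    if op == "max":
        a, b = args
        return ("select", ("icmp_sgt", a, b), a, b)
    return (op, *args)

def is_legal(node):
    """True when every operation in the tree is in the supported set."""
    if not isinstance(node, tuple):
        return True
    return node[0] in SUPPORTED and all(is_legal(a) for a in node[1:])

def legalize(node, max_rounds=8):
    """Repeat the validity check and rewriting until the tree is legal."""
    for _ in range(max_rounds):
        if is_legal(node):
            return node
        node = rewrite(node)
    raise ValueError("could not legalize expression")

def evaluate(node, env):
    """Tiny interpreter used only to compare behavior before and after."""
    if isinstance(node, int):
        return node
    if isinstance(node, str):
        return env[node]
    op, *args = node
    v = [evaluate(a, env) for a in args]
    if op == "add": return v[0] + v[1]
    if op == "sub": return v[0] - v[1]
    if op == "icmp_slt": return v[0] < v[1]
    if op == "icmp_sgt": return v[0] > v[1]
    if op == "select": return v[1] if v[0] else v[2]
    raise ValueError(op)

legal = legalize(("max", ("abs", "x"), "y"))
```

For example, `evaluate(legal, {"x": -7, "y": 3})` still yields 7, the value of max(|x|, y), although the tree now contains only compare, subtract, and select nodes.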

[0010] After completing the above processing, a valid LLVM IR function compatible with the target architecture is obtained.
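As a worked example of the 64-bit decomposition mentioned in Step 2, a 64-bit addition can be expressed with two 32-bit additions and a carry. The sketch below models this arithmetic on unsigned values; it is not taken from the described tool, and `add64_via_32` is a hypothetical name.

```python
MASK32 = 0xFFFFFFFF

def add64_via_32(a, b):
    """Add two unsigned 64-bit values using only 32-bit adds and a carry."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32
    lo = (a_lo + b_lo) & MASK32          # low 32-bit add, wraps modulo 2**32
    carry = 1 if lo < a_lo else 0        # unsigned compare detects the wrap
    hi = (a_hi + b_hi + carry) & MASK32  # high 32-bit add plus carry
    return (hi << 32) | lo

result = add64_via_32(0x00000001FFFFFFFF, 0x0000000000000001)
```

The low halves wrap to zero, the carry propagates into the high half, and `result` equals `0x0000000200000000`.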

[0011] Step 3: Input the valid LLVM IR function into the LLVM optimization pipeline to generate a simplified intermediate representation; This step simplifies the LLVM IR function through optimization operations. Specifically, the valid LLVM IR function is input into the LLVM optimization pipeline, and optimization operations are performed according to the preset optimization level (configurable as -O1, -O2, or -O3). Optimization operations include, but are not limited to, constant propagation optimization, dead code elimination optimization, loop invariant code hoisting, common subexpression elimination, and instruction merging.

[0012] These optimizations eliminate redundant computations, simplify control flow, and ultimately generate a more concise and efficient intermediate representation, reducing the complexity of subsequent template generation.
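The described method relies on LLVM's own optimization pipeline for this step. Purely to illustrate what two of the named optimizations do, the toy sketch below applies common subexpression elimination and dead code elimination to straight-line `(dest, op, args)` tuples; the data layout and function names are assumptions, not LLVM's implementation.

```python
def cse(instrs):
    """Common subexpression elimination over straight-line instructions."""
    seen, alias, out = {}, {}, []
    for dest, op, args in instrs:
        args = tuple(alias.get(a, a) for a in args)  # use canonical names
        key = (op, args)
        if key in seen:
            alias[dest] = seen[key]  # reuse the earlier identical computation
        else:
            seen[key] = dest
            out.append((dest, op, args))
    return out, alias

def dce(instrs, live_roots):
    """Dead code elimination: keep only what the live roots depend on."""
    live, kept = set(live_roots), []
    for dest, op, args in reversed(instrs):
        if dest in live:
            live.update(args)
            kept.append((dest, op, args))
    return list(reversed(kept))

program = [
    ("t1", "add", ("a", "b")),
    ("t2", "add", ("a", "b")),  # duplicate of t1
    ("t3", "mul", ("t2", "c")),
    ("t4", "sub", ("a", "c")),  # never used
]
after_cse, alias = cse(program)
root = alias.get("t3", "t3")
optimized = dce(after_cse, [root])
```

After both passes only `t1` and `t3` remain, with `t3` reading `t1` directly, which is the kind of smaller graph the later template generation has to traverse.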

[0013] Step 4: Perform secondary legalization and data flow graph checks on the simplified intermediate representation to generate a TableGen format instruction matching template; This step converts the intermediate representation into an instruction matching template for the target architecture, specifically including the following sub-operations: Secondary legalization: The intermediate representation after LLVM optimization pipeline is subjected to a second type and operation legalization. This mainly handles incompatible operations and types added during the optimization process (such as immediate value types generated by constant propagation and type compatibility issues exposed by dead code elimination), ensuring that the final DAG fully conforms to the target architecture constraints.

[0014] Data flow graph inspection: Check if the data flow graph is a single basic block (a linear execution path without branches, loops, or exception handling). Check if the termination instruction of a single basic block is a combination of "store instruction + return instruction"; If the check fails, the corresponding error type is returned (such as multiple basic block error, illegal store instruction format error, missing return instruction error).
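The checks above can be sketched as a small function that returns one of the listed error types. This is a simplified illustration working on IR text lines rather than on a real data flow graph; the label pattern, the branch opcode list, and the name `check_data_flow` are assumptions.

```python
import re

LABEL_RE = re.compile(r"^[\w.]+:\s*$")  # e.g. 'entry:' or 'if.then:'
BRANCH_OPS = ("br", "switch")           # simplified set of branching opcodes

def check_data_flow(body_lines):
    """Return None if the body passes, otherwise an error-type string."""
    lines = [l.split(";")[0].strip() for l in body_lines]
    lines = [l for l in lines if l]
    labels = [l for l in lines if LABEL_RE.match(l)]
    instrs = [l for l in lines if not LABEL_RE.match(l)]
    if len(labels) > 1 or any(i.split()[0] in BRANCH_OPS for i in instrs):
        return "multiple basic block error"
    if not instrs or not instrs[-1].startswith("ret"):
        return "missing return instruction error"
    if len(instrs) < 2 or not instrs[-2].startswith("store "):
        return "illegal store instruction format error"
    return None

good = [
    "%res = add i32 %a, %b",
    "store i32 %res, i32* @out_mem",
    "ret i32 %res",
]
status = check_data_flow(good)
```

A body that ends in `store` followed by `ret` passes (`status` is `None`); removing the store or adding a branch yields the corresponding error string.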

[0015] Operand input / output attribute analysis: Identify the type of operands, such as memory operands, register operands, and immediate operands; Distinguish operand attributes, such as input operands, output operands, and input / output operands; Mark the data type and bit width of the operand.

[0016] Root node recursive extraction: Starting from the store operation node in the data flow graph, trace the data dependencies in reverse to determine the root node; recursively analyze operand-dependent nodes upwards, converting node operations and operands into TableGen syntax; when encountering an input parameter node, a constant node, or an atomic operation node that is already supported by the target architecture, terminate the recursion and output the instruction matching condition. The instruction matching conditions are then integrated with the instruction name and parameter information to generate a complete TableGen format instruction matching template.
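The recursive extraction can be sketched as a depth-first walk that stops at input parameters. The example below is a minimal sketch that takes the root (the value operand of the store) as given and uses a hypothetical multiply-add instruction `madd_custom`; the emitted text follows the pattern notation used in this document and has not been checked against llvm-tblgen.

```python
def emit(value, nodes, types):
    """Recursively render `value` as a nested TableGen-style pattern string."""
    if value not in nodes:
        # Leaf: an input parameter ends the recursion.
        name = value.lstrip("%")
        return f"{types[value]}:${name}"
    opcode, operands = nodes[value]
    inner = ", ".join(emit(op, nodes, types) for op in operands)
    return f"({types[value]} ({opcode} {inner}))"

def make_pattern(instr_name, params, root, nodes, types):
    """Combine the matching condition with the instruction name and operands."""
    source = emit(root, nodes, types)
    outs = ", ".join(f"{types[p]}:${p.lstrip('%')}" for p in params)
    return f"def : Pat<{source}, ({instr_name} {outs})>;"

# Hypothetical instruction computing (a * b) + c
nodes = {
    "%prod": ("mul", ["%a", "%b"]),
    "%sum": ("add", ["%prod", "%c"]),
}
types = {v: "i32" for v in ["%a", "%b", "%c", "%prod", "%sum"]}
pattern = make_pattern("madd_custom", ["%a", "%b", "%c"], "%sum", nodes, types)
```

The nesting of the output mirrors the dependency chain: the `mul` node appears inside the `add` node because `%sum` depends on `%prod`.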

[0017] Step 5: Embed the instruction matching template into the instruction description code of the LLVM backend target architecture; This step involves implementing the template in practice. Specifically, the TableGen format instruction matching template is embedded as a patch into the instruction description code of the LLVM backend target architecture. After being compiled by the TableGen tool, the template is integrated into the LLVM compiler backend, enabling semantic matching and machine code generation of newly added instructions during the software compilation process.

[0018] Beneficial effects: The present invention has the following beneficial effects. The automatic generation method for instruction matching templates based on LLVM intermediate representation takes a general LLVM intermediate representation as input, optimizes it, and applies two-level legalization (initial legalization and post-optimization legalization) so that it meets the operation type restrictions of the target architecture, and then automatically generates the instruction matching template. Compared with the traditional method of manually writing instruction matching templates, this invention automatically parses LLVM IR functions and generates the templates, which greatly reduces manual intervention, avoids human error, improves development efficiency, and reduces verification costs.

[0019] Of course, any product implementing this invention does not necessarily need to achieve all of the advantages described above at the same time.

Attached Figure Description

[0020] Figure 1 is a flowchart of the automatic generation method for instruction matching templates based on LLVM intermediate representation according to the present invention;
Figure 2 is a schematic diagram illustrating the application implementation process of an instruction matching template based on LLVM intermediate representation according to the present invention;
Figure 3 is a schematic diagram of the internal processing flow of the instruction matching template generator of the present invention;
Figure 4 is a flowchart of the instruction matching template generation process in an embodiment of the present invention.

Detailed Implementation

[0021] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0022] The specific implementation steps of the automatic generation method for instruction matching templates based on LLVM intermediate representation are as follows. As shown in Figures 1-2, this embodiment takes "add_custom" (an addition instruction for 32-bit integers with an overflow flag) as an example. First, the "Prepare LLVM IR function" operation of step S101 in Figure 1 is performed; then step S102, whose detailed process is shown in Figure 2, calls the generator to execute the entire process.

Step S101: Prepare the LLVM IR function for the instruction behavior. Write the LLVM IR function (instruction description file) corresponding to the add_custom instruction. The function name is the same as the instruction name, and the parameters correspond to the instruction operands.

    ; LLVM IR behavior description function of the add_custom instruction
    @out_mem = external global i32                    ; assumed declaration of the output location
    define i32 @add_custom(i32 %a, i32 %b) {
      %temp = add nsw i32 %a, %b                      ; signed addition (nsw: no signed wrap)
      %overflow = icmp slt i32 %temp, %a              ; detect overflow
      %res = select i1 %overflow, i32 -1, i32 %temp   ; -1 if overflow occurs, otherwise the sum
      store i32 %res, i32* @out_mem                   ; write the result to memory
      ret i32 %res
    }

Step S102: Call the instruction matching template generator. Input the above LLVM IR function file into the instruction matching template generator (a tool developed based on LLVM 15.0), which performs the "validity check, optimization, and template conversion" operations.

[0023] Step S103: Generate instruction matching template. The generator outputs the TableGen format instruction matching template corresponding to add_custom.

[0024] Step S104: Embed the template into the software compilation process. The generated template is added as a code patch to RISCVInstrInfo.td in the LLVM backend (taking the RISC-V architecture as an example), and TableGen is run (llvm-tblgen -gen-instr-info -Iinclude RISCVInstrInfo.td -o RISCVInstrInfo.inc). After the software to be compiled is converted into LLVM intermediate representation by Clang, LLVM uses the template to match add_custom instructions and generate the target machine code.
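A build script could automate this embedding step along the following lines. This is a hedged sketch: the marker comment and both helper names are hypothetical, the patching is plain text appending rather than a real patch file, and only the llvm-tblgen arguments quoted in step S104 are used.

```python
def append_template(td_text, template, marker="// add_custom (generated)"):
    """Append the generated pattern once; re-running leaves the text unchanged."""
    if marker in td_text:
        return td_text
    return td_text.rstrip("\n") + "\n\n" + marker + "\n" + template + "\n"

def tblgen_command(td_file, out_file, include_dir="include"):
    """Build the argv for the llvm-tblgen invocation quoted in step S104."""
    return ["llvm-tblgen", "-gen-instr-info", f"-I{include_dir}", td_file, "-o", out_file]

td = "def ADD : Foo;\n"  # placeholder content, not real RISCVInstrInfo.td text
patched = append_template(td, "def : Pat<...>;")
cmd = tblgen_command("RISCVInstrInfo.td", "RISCVInstrInfo.inc")
```

The resulting `cmd` list can be passed to `subprocess.run` on a machine where llvm-tblgen is installed; the marker check keeps repeated runs from duplicating the pattern.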

[0025] Instruction matching template generator process: As shown in Figure 2, the specific operations of the instruction matching template generator (i.e., the internal refinement of step S102) are as follows. Step S201: Obtain and load the instruction description file. The generator reads the LLVM IR function file from step S101, loads it into the LLVM IR memory container, and verifies that the syntax conforms to the LLVM IR 15.0 specification.

[0026] Step S202: Parse LLVM IR function information and extract function information: function name add_custom, number of parameters 2 (%a / %b, both of i32 type), return value type i32.

[0027] Step S203: Check the validity of the LLVM IR function. If the add nsw operation in the function has no direct corresponding instruction in the target RISC-V architecture, it is determined to be an "illegal operation".

[0028] Step S204: Legalize the LLVM IR function by replacing the incompatible add nsw + icmp slt combination with an operation sequence supported by RISC-V: replace `add nsw i32 %a, %b` with `add i32 %a, %b`, and replace the original overflow detection logic with `sub i32 %temp, %a` followed by `icmp sgt i32 %sub_res, %b`.
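The first replacement in this step amounts to dropping the wrap flag from the instruction. A minimal text-level sketch is shown below; it assumes one instruction per line and uses a hypothetical helper name, whereas a real tool would clear the flag on the in-memory instruction. Dropping `nsw` or `nuw` only removes an assumption about overflow, so the resulting plain operation is defined for every input.

```python
import re

# Matches an arithmetic opcode followed by one or more wrap flags.
WRAP_FLAGS_RE = re.compile(r"\b(add|sub|mul|shl)((?:\s+(?:nsw|nuw))+)\b")

def strip_wrap_flags(line):
    """Drop nsw/nuw flags so the instruction becomes a plain wrapping operation."""
    return WRAP_FLAGS_RE.sub(r"\1", line)

legal = strip_wrap_flags("%temp = add nsw i32 %a, %b")
```

Lines without wrap flags, such as `ret i32 %res`, pass through unchanged.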

[0029] Step S205: Execute the LLVM optimization pipeline, configure the optimization level to -O2 (the preset optimization level can be flexibly selected as -O1, -O2 or -O3 according to actual needs), and execute the following on the legalized IR function: Constant propagation: Eliminates redundant temporary variables within functions; Dead code elimination: Removes unreferenced intermediate compute nodes; Instruction merging: Merge consecutive sub+icmp into a single-node logic.

[0030] Step S206: Execute instruction matching template generation, convert the optimized IR into a SelectionDAG data flow graph, and execute subsequent core generation logic.

[0031] Step S207: Check the validity of the template and verify that the generated TableGen template conforms to the instruction constraints of the RISC-V architecture (such as operand width of 32 bits and memory operand address alignment).

[0032] Step S208: If the template validity check fails, the tool outputs detailed error information, including the error type (such as operand type mismatch, illegal instruction format) and error location (such as the specific operation node or operand in the template), so that the developer can adjust the LLVM IR function or target architecture constraints and re-execute the process.
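The validity check and error report of steps S207 and S208 might look like the following sketch. The record layout, the allowed operand kinds, and the mapping of each failed check to an error type are assumptions made for illustration; only the error type names and the 32-bit width come from the text.

```python
from dataclasses import dataclass

@dataclass
class TemplateError:
    error_type: str  # e.g. "operand type mismatch"
    location: str    # operand or node the error refers to

ALLOWED_KINDS = {"register", "memory", "immediate"}

def check_template(operands, expected_bits=32):
    """operands: list of (name, kind, bits); returns a list of TemplateError."""
    errors = []
    for name, kind, bits in operands:
        if kind not in ALLOWED_KINDS:
            errors.append(TemplateError("illegal instruction format", name))
        elif bits != expected_bits:
            errors.append(TemplateError("operand type mismatch", name))
    return errors

report = check_template([("a", "register", 32), ("b", "register", 64)])
```

Each entry in `report` carries both the error type and the operand it refers to, which is the information a developer needs to adjust the LLVM IR function and re-run the process.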

[0033] Step S209: Output the instruction matching template. The generator writes the valid template to the file add_custom_template.td.

[0034] III. Core generation process of the instruction matching template: As shown in Figure 3, the core generation process of the instruction matching template (i.e., the refinement of step S206) performs the following operations on the data flow graph of the add_custom instruction. Step S301: Input the optimized data flow graph. Obtain the optimized SelectionDAG from step S205; its nodes include add, sub, icmp, store, and return.

[0035] Step S302: Legalize types and operations by mapping "i32 type" in the data flow graph to the GPR32 register type of RISC-V to ensure that the operation is compatible with the target architecture bit width.

[0036] Step S303: Check if it is a single basic block. If the data flow graph has no branches or loop nodes, it is determined to be a "single basic block".

[0037] Step S304: Check the termination instruction. The basic block ends with `store i32 %res, i32* @out_mem` followed by `ret i32 %res`, which meets the requirement of a "store + return combination".

[0038] Step S306: Operand input / output analysis. Identify operand types: %a / %b are register operands, and @out_mem is a memory operand. Distinguish attributes: %a / %b are input operands, and %res is an output operand. Mark bit widths: all operands are 32 bits.

[0039] Step S307: Root node analysis and extraction. Tracing back from the "value operand %res" of the store instruction, the root node is determined to be the select instruction (corresponding to the result after overflow judgment).

[0040] Steps S308-S309: Node instruction operation and dependency tracing. Convert the root node select to TableGen syntax: (select i1:$overflow, i32:-1, i32:$temp). Tracing back to the dependency node add of $temp, it is transformed into: (add i32:$a, i32:$b). Tracing back to the dependency node icmp of $overflow, it is transformed into: (icmp slt i32:$temp, i32:$a).

[0041] Steps S310-S311: Nested output template. The recursion terminates when the input parameters %a / %b (leaf nodes) are reached. The nodes are nested according to their dependencies, and the final TableGen template is output:

    def : Pat<
      (i32 (select (icmp slt (i32 (add i32:$a, i32:$b)), i32:$a),
                   i32:-1,
                   (i32 (add i32:$a, i32:$b)))),
      (add_custom i32:$a, i32:$b)
    >;

In this embodiment, the add_custom instruction matching template automatically generated through the above process can be directly embedded into the LLVM backend and compiled through TableGen, ultimately achieving automatic matching and machine code generation of the add_custom instruction in the software to be compiled.

[0042] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article, or apparatus.

[0043] The preferred embodiments of the present invention disclosed above are merely illustrative of the invention. These preferred embodiments do not exhaustively describe all details, nor do they limit the invention to the specific implementations described. Clearly, many modifications and variations can be made based on the content of this specification. This specification selects and specifically describes these embodiments to better explain the principles and practical applications of the invention, thereby enabling those skilled in the art to better understand and utilize the invention. The invention is limited only by the claims and their full scope and equivalents.

Claims

1. A method for automatically generating instruction matching templates based on LLVM intermediate representation, characterized in that it includes the following steps: Step 1: Obtain and parse the LLVM IR instruction description file to generate a directed acyclic graph of instruction operation behavior; Step 2: Perform instruction behavior legalization processing on the directed acyclic graph, analyze the types and operations in the directed acyclic graph that are not compatible with the target architecture, and replace them with operation sequences supported by the target architecture to obtain legal LLVM IR functions; Step 3: Input the legal LLVM IR function into the LLVM optimization pipeline, perform optimization operations according to the preset optimization level, and generate a simplified intermediate representation; Step 4: Perform secondary legalization and data flow graph checks on the simplified intermediate representation, analyze the operand input and output attributes, and generate instruction matching templates in TableGen format through recursive extraction from the root node; Step 5: Embed the instruction matching template as a patch into the instruction description code of the LLVM backend target architecture, compile it using the TableGen tool, and then integrate it into the LLVM compiler backend.

2. The method for automatically generating instruction matching templates based on LLVM intermediate representation according to claim 1, characterized in that, The LLVM IR instruction description file contains the name of the new instruction, the name and number of operands, and the LLVM IR function expression of the instruction behavior.

3. The method for automatically generating instruction matching templates based on LLVM intermediate representation according to claim 1, characterized in that, The function names in the LLVM IR instruction description file are consistent with the target instruction names, and the number and names of the function parameters correspond one-to-one with the operands of the target instruction.

4. The method for automatically generating instruction matching templates based on LLVM intermediate representation according to claim 1, characterized in that the legalization process for the instruction behavior includes the following steps: Step 1: For absolute value operations, replace them with a combination of SELECT and ICMP instructions; for minimum / maximum value operations, replace them with a combination of CMP and SELECT instructions; Step 2: For 64-bit integer operations that are not supported by the target architecture, break them down into multiple 32-bit operation sequences; Step 3: Repeat the validity check until all nodes in the directed acyclic graph conform to the target architecture constraints.

5. The method for automatically generating instruction matching templates based on LLVM intermediate representation according to claim 1, characterized in that, The optimization operations performed by the LLVM optimization pipeline include at least one of constant propagation, dead code elimination, loop invariant code hoisting, common subexpression elimination, and instruction merging.

6. The method for automatically generating instruction matching templates based on LLVM intermediate representation according to claim 1, characterized in that, The secondary legalization process replaces incompatible operations and types newly generated during the optimization process in the simplified intermediate representation. These incompatible operations and types include immediate value types generated by constant propagation and type compatibility issues exposed by dead code elimination.

7. The method for automatically generating instruction matching templates based on LLVM intermediate representation according to claim 1, characterized in that, The data flow graph inspection includes the following steps: Step 1: Check whether the data flow graph is a single basic block, where the single basic block is a linear execution path without branches, loops, or exception handling; Step 2: Check whether the termination instruction of the single basic block is a combination of the store instruction and the return instruction; Step 3: If the check fails, return the corresponding error type, which includes multiple basic block errors, illegal store instruction format errors, and missing return instruction errors.

8. The method for automatically generating instruction matching templates based on LLVM intermediate representation according to claim 1, characterized in that, The operand input / output attribute analysis includes the following steps: Step 1: Identify the operand type, which includes memory operands, register operands, and immediate operands; Step 2: Distinguish the operand attributes, which include input operands, output operands, and input / output operands; Step 3: Mark the data type and bit width of the operand.

9. The method for automatically generating instruction matching templates based on LLVM intermediate representation according to claim 1, characterized in that the recursive extraction of the root node includes the following steps: Step 1: Starting from the store operation node in the data flow graph, trace the data dependencies in reverse to determine the root node; Step 2: Recursively analyze operand-dependent nodes upwards, converting node operations and operands into TableGen syntax; Step 3: When encountering an input parameter node, a constant node, or an atomic operation node that is already supported by the target architecture, terminate the recursion and output the instruction matching condition; Step 4: Integrate the instruction matching conditions with the instruction name and parameter information to generate a complete TableGen format instruction matching template.