A vulnerability detection method, device and equipment for ARM architecture UEFI firmware and a medium

By developing a UEFI vulnerability analysis plugin and decompilation tool, and combining symbol recovery and vulnerability detection rule base, the accuracy problem of vulnerability detection in ARM architecture UEFI firmware was solved, achieving efficient and accurate vulnerability identification.

CN122241716A · Pending · Publication Date: 2026-06-19 · 北京大学长沙计算与数字经济研究院 +2


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: 北京大学长沙计算与数字经济研究院
Filing Date: 2026-03-23
Publication Date: 2026-06-19

Application Information

Patent Timeline

23 Mar 2026

Application

19 Jun 2026

Publication

CN122241716A

IPC: G06F21/57; G06F8/53

AI Tagging

Application Domain

Decompilation/disassembly; Platform integrity maintenance

AI Technical Summary

Technical Problem

Existing technologies cannot accurately detect vulnerabilities in ARM architecture UEFI firmware, especially due to the lack of decompilation tools and symbol information recovery methods for ARM architecture, resulting in incomplete vulnerability detection and a high false positive rate.

Method used

A UEFI vulnerability analysis plugin is developed on top of a decompilation tool. It identifies the instruction characteristics of ARM architecture UEFI firmware, recovers symbol information using a preset UEFI symbol rule library, identifies vulnerable code based on a vulnerability detection rule library, generates decompiled intermediate language code, and verifies the existence of vulnerabilities by combining taint analysis with path constraint conditions.

Benefits of technology

It achieves accurate vulnerability detection of ARM architecture UEFI firmware, reduces false alarm rate, improves detection efficiency and accuracy, and can identify real vulnerabilities in closed-source firmware.

✦ Generated by Eureka AI based on patent content.

Abstract

This application relates to a method, apparatus, device, and medium for vulnerability detection of ARM architecture UEFI firmware. During the decompilation process of ARM architecture UEFI firmware, the method identifies the instruction characteristics of the UEFI firmware, performs decompilation based on these characteristics to obtain decompiled code, recovers the symbol information of the decompiled code based on a UEFI symbol rule library, identifies vulnerable code from the assembly code containing the recovered symbol information based on a vulnerability detection rule library, generates decompiled IR code based on the decompiled code, identifies taint sources and dangerous sinks in the vulnerable code based on the decompiled intermediate language code, and determines the set of propagating taints corresponding to the taint sources. If the set includes dangerous sinks, for each dangerous sink, the method extracts path constraints from the taint source to the dangerous sink based on the decompiled IR code, and verifies whether all path constraints are satisfied using a constraint solver, thereby determining whether the dangerous sink contains a real vulnerability. This method accurately detects vulnerabilities in ARM architecture UEFI firmware.

Description

Technical Field

[0001] This application relates to the field of firmware detection technology, and in particular to a method, apparatus, device and medium for vulnerability detection of ARM architecture UEFI firmware.

Background Technology

[0002] UEFI (Unified Extensible Firmware Interface) defines the interface between the operating system and system firmware. It is responsible for initializing hardware and loading the operating system into memory during computer startup, replacing the traditional BIOS. UEFI firmware is firmware that implements the UEFI standard. Vulnerabilities in UEFI firmware can have significant impacts; therefore, the industry is actively researching how to detect UEFI firmware vulnerabilities.

[0003] Traditional methods typically perform static vulnerability detection on UEFI firmware, but these are limited to x86 / x64 architecture UEFI firmware and cannot handle UEFI firmware for the ARM architecture (a widely used RISC processor architecture). Therefore, it is necessary to propose a method that can accurately detect vulnerabilities in ARM architecture UEFI firmware.

Summary of the Invention

[0004] In response to the above technical problems, it is therefore necessary to provide a method, apparatus, computer device, computer-readable storage medium, and computer program product that can accurately detect vulnerabilities in ARM architecture UEFI firmware.

[0005] In a first aspect, this application provides a vulnerability detection method for ARM architecture UEFI firmware. The method includes:
during decompilation of the ARM architecture UEFI firmware, identifying the instruction characteristics of the firmware and performing decompilation based on the instruction characteristics to obtain decompiled code;
performing symbol information recovery on the decompiled code based on a preset UEFI symbol rule base;
identifying vulnerable code from the assembly code with recovered symbol information, based on the vulnerability detection rule library set for the ARM architecture UEFI firmware;
generating decompiled intermediate language code based on the decompiled code;
identifying taint sources and dangerous sinks in the vulnerable code based on the decompiled intermediate language code, and determining the set of propagated taints corresponding to the taint sources in the decompiled intermediate language code;
if the set of propagated taints includes a dangerous sink, then for each dangerous sink, extracting the path constraints from the taint source to the dangerous sink based on the decompiled intermediate language code, converting each path constraint into a corresponding logical expression, and feeding the logical expressions into a constraint solver to verify whether all the path constraints can be satisfied. If so, the dangerous sink is determined to have a vulnerability; otherwise, the dangerous sink is determined not to have a vulnerability.

[0006] In one embodiment, the instruction features include at least one of the value of the least significant bit of the operand register of a branch-and-exchange instruction or the value of the least significant bit of the function entry address. Performing decompilation based on the instruction features to obtain decompiled code includes: distinguishing the target mode of each code block to be decompiled based on the value of the least significant bit of the operand register of the branch-and-exchange instruction or the value of the least significant bit of the function entry address, where the target mode is either Thumb mode or ARM mode; and decompiling the code block based on the code characteristics corresponding to the target mode to obtain the corresponding decompiled code.

[0007] In one embodiment, the UEFI symbol rule base includes the GUID and semantic feature information corresponding to each symbol object; The symbol information recovery process for the decompiled code based on a preset UEFI symbol rule base includes: The GUIDs in the UEFI symbol rule base are matched with the decompiled code to identify the GUIDs in the decompiled code; For the symbol object corresponding to the identified GUID in the decompiled code, add the corresponding semantic feature information and the corresponding structure information to restore the symbol information of the symbol object.

[0008] In one embodiment, the detection rules in the vulnerability detection rule base are set based on the characteristics of ARM assembly instructions; the vulnerability detection rule base includes at least one of the following: detection rules for memory operations without boundary checks, detection rules for SMI handlers lacking permission verification, or detection rules for communication protocols with structural mismatches.

[0009] In one embodiment, generating decompiled intermediate language code based on the decompiled code includes: Based on the recovered symbol information, target assembly code segments that conform to the characteristics of embedded data are marked in the decompiled code; The embedded data in the target assembly code segment that was misidentified as assembly instruction code during the decompilation stage is corrected, and decompiled intermediate language code is generated based on the corrected decompiled code.

[0010] In one embodiment, at least one taint source is identified from the vulnerable code. Determining the set of propagated taints corresponding to the taint source in the decompiled intermediate language code includes: for each taint source, adding a taint tag to the variable corresponding to the taint source in the decompiled intermediate language code; and traversing the instructions in the decompiled intermediate language code, where for each target instruction that takes a variable carrying the taint tag as input, the taint tag is also added to the output of that instruction. In this way each propagated taint corresponding to the taint source is tracked, and the set of propagated taints is obtained; every propagated taint in the set carries the taint tag.
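The tagging and traversal described in [0010] can be sketched as a single forward pass. This is only an illustrative sketch under stated assumptions: the IR is modeled as a straight-line list of hypothetical `(dst, op, srcs)` tuples, and the names `propagate_taint`, `buf`, and `len` are invented for the example and do not come from the application.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical IR instruction: (destination variable, opcode, source variables).
IRInsn = Tuple[str, str, List[str]]


def propagate_taint(ir: Iterable[IRInsn], sources: Iterable[str]) -> Set[str]:
    """Return every variable that carries the taint tag after one forward pass.

    Assumes straight-line code in SSA form, so one pass in program order
    is enough; loops or branches would need a fixed-point iteration.
    """
    tainted = set(sources)  # the taint sources are tagged first
    for dst, _op, srcs in ir:
        # An instruction with any tagged input produces a tagged output.
        if any(src in tainted for src in srcs):
            tainted.add(dst)
    return tainted


ir = [
    ("t1", "load", ["buf"]),
    ("t2", "add", ["t1", "c"]),
    ("t3", "mov", ["k"]),
    ("len", "zext", ["t2"]),
]
taints = propagate_taint(ir, {"buf"})
```

A dangerous sink would then be flagged for further checking when one of its operands (here `len`) is in the returned set.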

[0011] In one embodiment, the step of extracting path constraints from the taint source to the dangerous sink based on the decompiled intermediate language code for each dangerous sink includes: Identify conditional jump instructions in the decompiled intermediate language code; For each of the dangerous sinks, the target branch conditions corresponding to the conditional jump instructions on the target path are collected to obtain the path constraint conditions; wherein, the target path is the entire path from the taint source to the dangerous sink; and the target branch condition is the branch condition where the condition corresponding to the conditional jump instruction is true.
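As a rough illustration of [0011], the sketch below collects the branch conditions on a path and checks whether they can all hold at once. Everything here is an assumption made for the example: the path is a list of hypothetical `("jcc", predicate)` and `("op", text)` steps over a single integer input, and the brute-force `solve` function merely stands in for a real constraint solver, which the application does not name.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Constraint = Callable[[int], bool]
Step = Tuple[str, object]


def collect_path_constraints(path: Iterable[Step]) -> List[Constraint]:
    """Keep only the conditions attached to conditional jumps on the path."""
    return [step[1] for step in path if step[0] == "jcc"]


def solve(constraints: List[Constraint], domain: Iterable[int]) -> Optional[int]:
    """Stand-in solver: return a witness value satisfying all constraints, or None."""
    for value in domain:
        if all(check(value) for check in constraints):
            return value
    return None


# Hypothetical path from a taint source (a size value) to a dangerous sink.
path = [
    ("op", "size = load DataSize"),
    ("jcc", lambda size: size > 1200),
    ("op", "ptr = add buf, 8"),
    ("jcc", lambda size: size < 4096),
]
witness = solve(collect_path_constraints(path), range(0, 1 << 16))
```

A non-`None` witness means the path is feasible, so the sink would be reported; `None` means the constraints contradict each other and the candidate is discarded as a false positive.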

[0012] In a second aspect, this application also provides a vulnerability detection device for ARM architecture UEFI firmware. The device includes:
a decompilation module, used to identify the instruction characteristics of the ARM architecture UEFI firmware during decompilation, and to perform decompilation based on the instruction characteristics to obtain decompiled code;
a symbol information recovery module, used to perform symbol information recovery on the decompiled code based on a preset UEFI symbol rule library;
a vulnerable code identification module, used to identify vulnerable code from the assembly code with recovered symbol information, based on the vulnerability detection rule library set for the ARM architecture UEFI firmware;
an intermediate language code generation module, used to generate decompiled intermediate language code based on the decompiled code;
a real vulnerability identification module, used to identify taint sources and dangerous sinks in the vulnerable code based on the decompiled intermediate language code, and to determine the set of propagated taints corresponding to the taint sources in the decompiled intermediate language code. If the set of propagated taints includes a dangerous sink, then for each dangerous sink, path constraints from the taint source to the dangerous sink are extracted based on the decompiled intermediate language code, each path constraint is converted into a corresponding logical expression, and the logical expressions are sent to a constraint solver to verify whether all path constraints can be satisfied. If so, the dangerous sink is determined to have a vulnerability; otherwise, the dangerous sink is determined not to have a vulnerability.

[0013] In a third aspect, this application also provides a computer device. The computer device includes a memory and a processor, the memory storing a computer program, and the processor executing the computer program to implement the steps of the first aspect and its embodiments described above.

[0014] In a fourth aspect, this application also provides a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the first aspect and its embodiments described above.

[0015] In a fifth aspect, this application also provides a computer program product. The computer program product includes a computer program that, when executed by a processor, implements the steps of the first aspect and its embodiments described above.

[0016] With the aforementioned vulnerability detection method, apparatus, computer device, storage medium, and computer program product for ARM architecture UEFI firmware, the instruction characteristics of the firmware are identified during decompilation, and based on these characteristics the firmware can be accurately decompiled to obtain decompiled code. Then, symbol information recovery is performed on the decompiled code based on a preset UEFI symbol rule library, and vulnerable code is identified from the assembly code with recovered symbol information based on a vulnerability detection rule library set for the ARM architecture UEFI firmware. Next, decompiled intermediate language code is generated from the decompiled code and used to analyze the initially identified vulnerable code further, so that real vulnerabilities are identified. Specifically, taint sources and dangerous sinks in the vulnerable code are identified using the decompiled intermediate language code, and the set of propagated taints corresponding to the taint sources is determined. If the set of propagated taints includes a dangerous sink, then for each dangerous sink, path constraints from the taint source to the dangerous sink are extracted based on the decompiled intermediate language code. Each path constraint is converted into a corresponding logical expression, which is fed into a constraint solver to verify whether all path constraints can be satisfied. If so, the dangerous sink is determined to have a vulnerability; otherwise, it is determined not to have one. This achieves vulnerability detection for closed-source ARM architecture UEFI firmware whose symbols have been stripped, while ensuring the accuracy of the detection.

Attached Figure Description

[0017] Figure 1 is a flowchart of a vulnerability detection method for ARM architecture UEFI firmware in one embodiment.
Figure 2 is a schematic diagram of vulnerability detection results in one embodiment.
Figure 3 is a schematic diagram of GUIDs in the UEFI symbol rule base in one embodiment.
Figure 4 is a schematic diagram of semantic recovery in one embodiment.
Figure 5 is a schematic diagram of structure information recovery in one embodiment.
Figure 6 is a schematic diagram of decompiled IR code in one embodiment.
Figure 7 is a schematic diagram illustrating the principle of a vulnerability detection method for ARM architecture UEFI firmware in one embodiment.
Figure 8 is a structural block diagram of a vulnerability detection device for ARM architecture UEFI firmware in one embodiment.
Figure 9 is an internal structural diagram of a computer device in one embodiment.

Detailed Implementation

[0018] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.

[0019] In one embodiment, as shown in Figure 1, a vulnerability detection method for ARM architecture UEFI firmware is provided, and the method is illustrated using a computer device as an example. The method includes the following steps: S11: during decompilation of the ARM architecture UEFI firmware, identify the instruction characteristics of the firmware and perform decompilation based on these characteristics to obtain decompiled code.

[0020] The ARM architecture UEFI firmware here is closed-source UEFI firmware based on the ARM architecture whose symbols have been stripped. The method in this application fills the gap in the existing technology for security analysis of closed-source ARM firmware, enabling vulnerability detection of closed-source, symbol-stripped ARM architecture UEFI firmware without relying on the firmware source code, which makes it highly practical.

[0021] It should be understood that due to the complexity of the ARM architecture, it includes two instruction sets: the Thumb mode instruction set and the ARM mode instruction set. Switching between these two instruction sets exists throughout the UEFI firmware. However, the code characteristics of the instructions in these two modes differ; for example, the instruction lengths differ between Thumb mode and ARM mode. Failure to distinguish and identify these two modes of instructions will lead to inaccurate decompilation. Therefore, during the decompilation process, instruction parsing can be performed on the ARM architecture UEFI firmware to identify instruction characteristics. For example, these instruction characteristics can include mode switching characteristics, which are used to distinguish and identify the switch between Thumb mode and ARM mode. Furthermore, decompilation can be performed based on these instruction characteristics, resulting in more accurate decompiled code.

[0022] In some embodiments, the instruction features include at least one of the value of the least significant bit of the operand register of a branch-and-exchange instruction or the value of the least significant bit of the function entry address. A branch-and-exchange instruction is used to switch between Thumb mode and ARM mode; for example, it may be a BLX instruction or a BX instruction. The value of the least significant bit of the function entry address, i.e., the LSB flag of the function entry address, is the value of bit 0 when the function address is treated as a pointer value.

[0023] Specifically, the computer device can accurately distinguish the boundaries between Thumb and ARM mode code blocks by analyzing the LSB flags of BLX / BX operands and of function entry addresses, thus avoiding disassembly errors. For example, when the branch target is Thumb code, the least significant bit (LSB) of the operand register of a BLX / BX instruction is 1; when the target is ARM code, it is 0. Similarly, the LSB of the address of a Thumb function is typically 1, while that of an ARM function is typically 0. Therefore, the computer device can determine the target mode of each code block to be decompiled (Thumb mode or ARM mode) based on the LSB of the operand register of branch-and-exchange instructions and / or the LSB of function entry addresses. The code block is then decompiled according to the code characteristics of its target mode to obtain the corresponding decompiled code.
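The LSB test in [0023] amounts to a one-bit check. The sketch below is illustrative only; the function names `classify_branch_target` and `split_code_blocks` are hypothetical and are not taken from the plugin described in this application.

```python
from typing import Dict, Iterable, List, Tuple


def classify_branch_target(value: int) -> Tuple[str, int]:
    """Split an interworking address into (mode, real instruction address).

    Bit 0 set means the target is Thumb code; the bit itself is not part
    of the address and is cleared before disassembly starts.
    """
    mode = "Thumb" if value & 1 else "ARM"
    return mode, value & ~1


def split_code_blocks(entries: Iterable[int]) -> Dict[str, List[int]]:
    """Group raw function entry values by the mode implied by their LSB."""
    blocks: Dict[str, List[int]] = {"ARM": [], "Thumb": []}
    for raw in entries:
        mode, addr = classify_branch_target(raw)
        blocks[mode].append(addr)
    return blocks


blocks = split_code_blocks([0x1001, 0x2000, 0x3005])
```

Each group can then be handed to a disassembler configured for the matching mode, so a Thumb block is never decoded with ARM instruction widths.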

[0024] In some examples, the code characteristics may include the byte alignment (or instruction length), the constant pool addressing offset, and the like. For instance, instructions must be 4-byte aligned in ARM mode and 2-byte aligned in Thumb mode. The constant pool (literal pool) is typically located through the `LDR Rd, [PC, #offset]` instruction pattern, and the effective offset differs between the two modes because the PC reads as the instruction address plus 8 in ARM mode and plus 4 in Thumb mode.
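To make the mode-dependent constant pool offset concrete, the following sketch computes the address referenced by a PC-relative `LDR`. It relies on the architectural rule that the PC reads as the instruction address plus 8 in ARM state and plus 4 (word-aligned for literal loads) in Thumb state; the function name `literal_address` is a hypothetical helper, not part of the application.

```python
def literal_address(insn_addr: int, offset: int, mode: str) -> int:
    """Return the constant pool address read by LDR Rd, [PC, #offset]."""
    if mode == "ARM":
        # In ARM state the PC reads as the current instruction address + 8.
        return insn_addr + 8 + offset
    if mode == "Thumb":
        # In Thumb state the PC reads as address + 4, word-aligned for literal loads.
        return ((insn_addr + 4) & ~3) + offset
    raise ValueError(f"unknown mode: {mode}")


arm_target = literal_address(0x1000, 0x10, "ARM")
thumb_target = literal_address(0x1002, 0x10, "Thumb")
```

Using the wrong mode for the same instruction address and offset yields a different pool address, which is one way embedded data ends up misread as instructions.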

[0025] In some embodiments, a customized plugin called UEFI_Vul_Analysis, or UEFI vulnerability analysis plugin, has been developed based on traditional decompilation tools (such as IDA Pro, Ghidra, Radare2, etc.). This plugin is specifically designed for vulnerability detection in ARM architecture UEFI firmware. Once running, this UEFI vulnerability analysis plugin can output relevant detection information to the decompilation tool's interface, including instruction identification, symbol recovery, function identification, and vulnerability detection information for the ARM architecture UEFI firmware.

[0026] It should be understood that traditional decompilation tools can only convert binary code into decompiled code without symbols, provide no additional functions or information, and often suffer from decompilation errors. In this application, by customizing a UEFI vulnerability analysis plugin for ARM architecture UEFI firmware within a decompilation tool, it is possible both to decompile ARM architecture UEFI firmware and to equip the decompilation tool with new functions such as symbol recovery and vulnerability detection. This broadens the applicability of traditional decompilation tools and improves vulnerability detection efficiency.

[0027] As shown in Figure 2, the UEFI vulnerability analysis plugin not only identified and restored the GetVariable function, but also recovered variable names such as DRIVER_SAMPLE_FORM_SET_GUID_1, aSetup_2, and aFailedToEnable_4, and reported the vulnerabilities ultimately detected. It should be understood that the specific location of the detected vulnerable code can be output. For example, the vulnerability type and the specific location of the vulnerable code are output in the area marked 201. The message "Overflow may occur here:0x3bfc10" in area 201 means that the vulnerability type is a buffer overflow and that the vulnerability may occur at location 0x3bfc10. The

Pseudocode-A

[0028] S12: perform symbol information recovery on the decompiled code based on the preset UEFI symbol rule library.

[0029] It should be understood that symbol information is typically removed when UEFI firmware is released. Therefore, in this application, a UEFI symbol rule base can be extracted in advance, using the UEFI vulnerability analysis plugin, from various publicly available general UEFI projects (such as Intel's open-source UEFI firmware and driver development framework EDK2, Microsoft's open-source UEFI project Project Mu, AMD's open-source library openSIL, etc.). This UEFI symbol rule base includes the GUID and semantic feature information corresponding to each symbol object. A symbol object can be at least one of the following: a service function, a variable, a protocol, or a structure. Semantic feature information refers to the features that characterize the semantics of the symbol object, such as function parameter features. The guids.json file shown in Figure 3 records the GUIDs of the symbol objects extracted from these public UEFI projects into the UEFI symbol rule base.

[0030] The computer device can match the GUIDs in the UEFI symbol rule base against the decompiled code to identify the GUIDs that occur in it. That is, if a GUID from the UEFI symbol rule base is found in the decompiled code, the matched bytes are that GUID. Furthermore, for the symbol object corresponding to each identified GUID, the corresponding semantic feature information and structure information are added, thereby restoring the symbol information of that symbol object. In this way, the semantics of key service functions can be recovered, yielding the assembly code with recovered symbol information.
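The GUID matching in [0030] can be pictured as a byte search. The sketch below is an assumption-laden illustration: the rule base is modeled as a plain dictionary (the layout of the guids.json file is not given in this text), the two GUIDs are well-known values from the UEFI specification, and `find_guids` is a hypothetical name. It uses the fact that a GUID is stored in a firmware image with its first three fields little-endian, which is what Python's `uuid.UUID.bytes_le` produces.

```python
import uuid
from typing import Dict, List, Tuple

# Hypothetical in-memory rule base: symbol name -> GUID text.
RULES: Dict[str, str] = {
    "EFI_GLOBAL_VARIABLE": "8BE4DF61-93CA-11D2-AA0D-00E098032B8C",
    "EFI_LOADED_IMAGE_PROTOCOL_GUID": "5B1B31A1-9562-11D2-8E3F-00A0C969723B",
}


def find_guids(image: bytes, rules: Dict[str, str]) -> List[Tuple[int, str]]:
    """Return sorted (offset, symbol name) pairs for every rule-base GUID in the image."""
    hits: List[Tuple[int, str]] = []
    for name, text in rules.items():
        needle = uuid.UUID(text).bytes_le  # in-memory layout of the 16-byte GUID
        pos = image.find(needle)
        while pos != -1:
            hits.append((pos, name))
            pos = image.find(needle, pos + 1)
    return sorted(hits)


image = b"\x00" * 8 + uuid.UUID(RULES["EFI_GLOBAL_VARIABLE"]).bytes_le + b"\xff" * 4
hits = find_guids(image, RULES)
```

Each hit gives an offset that can be renamed to the matching symbol, after which the associated semantic and structure information can be attached.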

[0031] It should be understood that the solution proposed in this application builds the UEFI symbol rule base by integrating multiple open-source UEFI projects. Compared with methods whose limited rule coverage allows them to detect only some vulnerabilities, this UEFI symbol rule base has wide coverage, thus enabling more accurate and complete symbol recovery.

[0032] As shown in Figure 4, the semantics of key service functions can be recovered through GUID matching and semantic feature information. For example, the UEFI vulnerability analysis plugin can detect and identify service functions and recover their corresponding semantics, outputting the results to UEFI_Vul_Analysis:services.

[0033] In Figure 5, “$FAAA7B02052316094C472795C6F3F283” in the code represents the GUID of a custom structure. Because the structure is custom, no name for it is available in the UEFI symbol rule base, so it is still represented by its GUID. The “MaxMode” and “Mode” fields within the structure are the recovered structure information.

[0034] S13: identify vulnerable code from the assembly code with recovered symbol information, based on a vulnerability detection rule library set for ARM architecture UEFI firmware.

[0035] Specifically, by analyzing historical UEFI CVE vulnerability data, a vulnerability pattern library, also known as a vulnerability detection rule base, can be built targeting components specific to ARM architecture UEFI. These specific components include Boot Services, Runtime Services, and SMM (System Management Mode) modules. This vulnerability detection rule base includes various detection rules set for ARM architecture UEFI firmware to detect vulnerable code.

[0036] In some embodiments, the rules can focus on key security points such as GUID verification logic and CommBuffer boundary checks. At least one of the following rules can be defined in the vulnerability detection rule base: detection rules for memory operations without boundary checks, detection rules for SMI handlers lacking permission verification, or detection rules for communication protocols with mismatched structures. These rules check for the corresponding typical vulnerability patterns, thereby identifying potential vulnerabilities such as buffer overflows, use-after-free, and privilege escalation.

[0037] The detection rules in the vulnerability detection rule base (i.e., vulnerability detection rules) are set based on the characteristics of ARM assembly instructions. This can improve the accuracy of vulnerability detection.

[0038] It should be understood that due to the characteristics of the ARM architecture instruction set, the vulnerability detection rule base includes relevant detection rules written specifically for the characteristics of ARM assembly instructions. Taking common vulnerability patterns as an example, to check and identify vulnerabilities such as "user input is passed to dangerous functions without being checked" (e.g., `system`, `strcpy`) or "buffer copy length is controlled externally," specific detection rule logic can be written for these vulnerability patterns by combining the characteristics of ARM assembly instructions (e.g., using registers `r0`~`r3` to pass parameters, using the `LDR` instruction for memory loading, using the `STR` instruction for storage, and using the `BL` instruction for function calls).

[0039] Taking the detection of a "command injection" vulnerability as an example, a detection rule can be written as follows: when a program calls the `system` function (usually implemented as `BL system` on the ARM architecture), check whether its first parameter (stored in register `r0`) comes from an external input source (such as `recv`, `read`, or `argv`). If so, further analyze whether length or content checks are performed on the data flow path from the input to register `r0`. If not, a "command injection" vulnerability is determined to exist. In this way, by taking the characteristics of the ARM instruction set into account, the detection rule can accurately identify "command injection" vulnerabilities in ARM architecture UEFI firmware.

[0040] For example, to improve accuracy, the detection rules can also consider instruction combinations specific to the ARM architecture, such as:
- loading user data from the stack using `LDR r0, [sp, #offset]`;
- calculating a structure offset using `ADD r1, r2, #8` (e.g., `argv[1]`);
- whether a conditional jump implemented via `CMP` + `BNE` / `BEQ` imposes restrictions on the input.

[0041] Therefore, corresponding detection rules are set for instruction combinations specific to the ARM architecture to avoid missed detections.
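The command-injection rule described above can be approximated on a simplified instruction stream. The sketch below is a toy model under several loudly stated assumptions: instructions are hypothetical tuples such as `("BL", "system")` or `("MOV", "r4", "r0")`, only `BL`, `MOV`, and `CMP` are modeled, the return value of an input function in `r0` is treated as attacker-controlled, and any `CMP` on a tainted register counts as a check. The name `find_unchecked_sinks` is invented for the example.

```python
from typing import List, Sequence, Tuple

INPUT_FUNCS = {"recv", "read"}   # external input functions named in the text
SINKS = {"system"}               # dangerous function named in the text
ARG_REGS = ("r0", "r1", "r2", "r3")  # argument / scratch registers clobbered by calls


def find_unchecked_sinks(insns: Sequence[Tuple[str, ...]]) -> List[int]:
    """Return indexes of sink calls whose r0 is tainted and never compared."""
    tainted, checked, findings = set(), set(), []
    for idx, (op, *args) in enumerate(insns):
        if op == "BL":
            target = args[0]
            if target in SINKS and "r0" in tainted and "r0" not in checked:
                findings.append(idx)
            for reg in ARG_REGS:  # a call clobbers r0-r3
                tainted.discard(reg)
                checked.discard(reg)
            if target in INPUT_FUNCS:  # simplification: result in r0 is external data
                tainted.add("r0")
        elif op == "MOV":
            dst, src = args
            for state in (tainted, checked):  # copy both properties with the value
                if src in state:
                    state.add(dst)
                else:
                    state.discard(dst)
        elif op == "CMP" and args[0] in tainted:
            checked.add(args[0])
    return findings


unchecked = find_unchecked_sinks(
    [("BL", "recv"), ("MOV", "r4", "r0"), ("MOV", "r0", "r4"), ("BL", "system")]
)
guarded = find_unchecked_sinks(
    [("BL", "recv"), ("CMP", "r0", "#64"), ("MOV", "r4", "r0"),
     ("MOV", "r0", "r4"), ("BL", "system")]
)
```

A real rule would also have to follow `LDR` / `STR` through memory and judge whether a comparison actually constrains the input, which this sketch deliberately leaves out.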

[0042] Once the assembly code with recovered symbol information has been obtained, vulnerable code can be identified from it based on the vulnerability detection rule base. For example, the detection rule logic can be written in a form recognizable by static analysis tools (such as Ghidra scripts, IDA Microcode analysis plugins, or custom rule engines) to automatically find vulnerable code fragments that match the corresponding vulnerability patterns.

[0043] Take the vulnerability detection results shown in Figure 2 as an example. The figure illustrates a common type of buffer overflow vulnerability in UEFI: the double GetVariable call. The GetVariable function in UEFI reads NVRAM variables, but its use carries a risk. For instance, the first call to GetVariable reads the variable aSetup_2 into a buffer with the specified DataSize of 1200. On return, however, it overwrites DataSize with the actual length of the aSetup_2 variable. The second call to GetVariable does not reinitialize DataSize; instead, it uses the overwritten DataSize to read the aSetup_2 variable again. If the overwritten DataSize exceeds the size of the destination buffer, a buffer overflow will occur. Therefore, based on the vulnerability detection rule base, the code relevant to the double GetVariable buffer overflow is identified, as indicated by code location 202, and the vulnerability type and the location of the vulnerable code are output at location 201.
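A rule for the double GetVariable pattern can be sketched as a scan over a simplified event trace. This is an illustration only: the event tuples `("assign", var)` and `("call", "GetVariable", size_var)` and the function name `find_double_getvariable` are hypothetical, and a real rule would operate on the recovered assembly or IR rather than on such a trace.

```python
from typing import Dict, List, Sequence, Tuple


def find_double_getvariable(events: Sequence[Tuple[str, ...]]) -> List[Tuple[int, int]]:
    """Return (first call index, second call index) pairs that reuse a stale DataSize.

    A DataSize variable becomes stale once GetVariable has overwritten it and
    stays stale until the program assigns it again.
    """
    stale: Dict[str, int] = {}  # DataSize variable -> index of the call that overwrote it
    findings: List[Tuple[int, int]] = []
    for idx, event in enumerate(events):
        if event[0] == "assign":
            stale.pop(event[1], None)  # reinitialization clears the stale state
        elif event[0] == "call" and event[1] == "GetVariable":
            size_var = event[2]
            if size_var in stale:
                findings.append((stale[size_var], idx))
            stale[size_var] = idx
    return findings


vulnerable = find_double_getvariable([
    ("assign", "DataSize"),
    ("call", "GetVariable", "DataSize"),
    ("call", "GetVariable", "DataSize"),
])
fixed = find_double_getvariable([
    ("assign", "DataSize"),
    ("call", "GetVariable", "DataSize"),
    ("assign", "DataSize"),
    ("call", "GetVariable", "DataSize"),
])
```

The reported pair gives both call sites, which corresponds to the kind of location information shown at 201 and 202 in Figure 2.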

[0044] S14: generate decompiled intermediate language code based on the decompiled code.

[0045] In this context, decompiled intermediate language code refers to architecture-independent intermediate language code in static single-assignment (SSA) form, hereinafter referred to as decompiled IR (Intermediate Representation) code. For example, the decompiled IR code can be Microcode IR code.

[0046] In this context, "architecture-independent" means that in the decompiled IR code, logical variables (such as var_4, var_10, etc.) replace physical registers (such as rbp, rsp) and stack addresses. These logical variables do not correspond to actual hardware registers; they are storage units abstracted at the IR level that represent temporary storage locations, so the IR is decoupled from the underlying architecture (such as x86, ARM, etc.).

[0047] In SSA-form intermediate language code, each new assignment to the same logical variable (such as register rdi) is given a unique version number (subscript), ensuring that each versioned variable is assigned only once throughout the entire program. For example, rdi#0 (input) and rdi#1 (before the call) are versions of the rdi register with different labels (#0, #1), which prevents confusion. Thus, the decompiled IR code can describe the data flow and instruction execution process clearly and intuitively.
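The version numbering can be sketched for straight-line code as follows. This is a minimal Python illustration that assumes a hypothetical `(dest, op, sources)` instruction format and ignores branches (so no phi nodes are needed); `to_ssa` is a name chosen for the example only.

```python
def to_ssa(instructions):
    """Rename straight-line (dest, op, sources) instructions into SSA form."""
    version = {}   # variable name -> latest version number

    def use(name):
        # A variable that is read before any assignment is an input: version 0.
        return f"{name}#{version.setdefault(name, 0)}"

    ssa = []
    for dest, op, sources in instructions:
        renamed = [use(src) for src in sources]   # read the current versions first
        # Every new assignment creates a fresh version of the destination.
        version[dest] = version[dest] + 1 if dest in version else 0
        ssa.append((f"{dest}#{version[dest]}", op, renamed))
    return ssa
```

With this numbering, an input register such as rdi is read as `rdi#0`, and its first reassignment produces `rdi#1`, matching the labels used above.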

[0048] Furthermore, in the decompiled IR code, the implicit steps inside a relatively long compound instruction of the original decompiled code are made explicit and broken down into separate computation steps; this can also be called explicit conditional calculation. It is equivalent to revealing the implicit logic behind the compound instruction. For example, `cmp + jle` is a compound instruction that can be broken down into four steps: "arithmetic comparison → flags → Boolean condition → conditional jump". In this way, subsequent analysis can operate directly on the explicit Boolean condition without needing to understand the complex combination logic of the flags.
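The four steps can be made concrete with a small Python model of `cmp + jle` on 32-bit values. The helper names (`cmp_flags`, `jle_condition`, `lower_cmp_jle`) are assumptions for this sketch; the flag semantics follow the usual definition that `jle` is taken when ZF is set or SF differs from OF.

```python
def cmp_flags(a, b, bits=32):
    """Model `cmp a, b`: compute a - b and return the ZF, SF and OF flags."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    diff = (a - b) & mask
    zf = diff == 0
    sf = bool(diff & sign)
    # Signed overflow: the operands differ in sign and the result's sign differs from a's.
    of = bool((a ^ b) & (a ^ diff) & sign)
    return zf, sf, of


def jle_condition(zf, sf, of):
    """Boolean condition tested by `jle`: ZF set, or SF differs from OF."""
    return zf or (sf != of)


def lower_cmp_jle(a, b, target, fallthrough):
    zf, sf, of = cmp_flags(a, b)            # steps 1 and 2: arithmetic comparison -> flags
    cond = jle_condition(zf, sf, of)        # step 3: flags -> Boolean condition (signed a <= b)
    return target if cond else fallthrough  # step 4: conditional jump
```

Once the Boolean condition is explicit, a later analysis can treat it as the single predicate "a <= b (signed)" and no longer needs to reason about individual flags.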

[0049] Furthermore, in the decompiled IR code, memory operations are described using a unified instruction or memory access model (i.e., memory abstraction). In the original decompiled code, memory access operations are hidden within complex addressing modes. In the decompiled IR code, LOAD(ram, ptr) uniformly represents reading a value from memory address ptr, PTR_ADD uniformly represents pointer arithmetic, and so on. This allows the decompiler to focus on the logic of the operations while ignoring the memory addressing details of different architectures, which makes the data flow clearer.

[0050] Figure 6 illustrates decompiled IR code. Taking the code marked 601 in Figure 6 as an example, "rdi#0", "rsi#0", "var_10", and "var_4" are all decompiled IR code. It should be noted that the example at 601 in Figure 6 is only used to illustrate decompiled IR code, and decompiled IR code is not limited to "rdi#0", "rsi#0", "var_10", and "var_4".

[0051] Given these characteristics, the decompiled IR code supports convenient, simplified, and accurate data flow analysis, such as easily tracing where a value comes from. Second, the decompiled IR code avoids confusing the "old rdi" with the "new rdi", which eliminates ambiguity and prevents analysis errors. Furthermore, the decompiled IR code in this embodiment works well with commonly used analysis tools, facilitating subsequent optimization and verification. Moreover, the decompiled IR code in this embodiment can be combined with the initially identified vulnerable code to determine the real vulnerabilities; see the descriptions of steps S15 and S16 for details, which are not elaborated here.

[0052] In some embodiments, because the initial decompilation may not be accurate enough, some embedded data with UEFI characteristics, such as GUIDs and protocol pointers, may be incorrectly identified as assembly instruction code during the initial decompilation stage. Therefore, after recovering the symbol information, the computer device can, based on the recovered symbol information, mark the target assembly code segments in the decompiled code that match the characteristics of embedded data, and correct the embedded data in those segments that was mistakenly identified as assembly instruction code during the decompilation stage. The decompiled IR code can then be generated from the corrected decompiled code, which prevents errors in IR generation and thereby improves the accuracy and reliability of subsequent vulnerability detection for closed-source UEFI firmware. In other embodiments, the decompiled IR code can be generated directly from the decompiled code obtained in the initial decompilation stage. This is not limited.

[0053] S15: identify taint sources and the dangerous sinks in the vulnerable code based on the decompiled intermediate language code, and determine the set of propagation taints corresponding to the taint sources in the decompiled intermediate language code.

[0054] It should be understood that, in the embodiments of this application, a real vulnerability refers to a vulnerability that can be exploited by external attacks. In binary programs, many "defects" do not necessarily constitute real vulnerabilities; that is, the vulnerable code identified in step S13 may not contain a real vulnerability and may have been flagged only because of defects in the code itself. Therefore, the initially identified vulnerable code can be further examined using the decompiled IR code to determine the real vulnerabilities.

[0055] In reality, an exploitable vulnerability only exists when tainted input, which is controlled by the attacker, can flow unimpeded into the dangerous sink and the execution path conditions are met. Therefore, the propagation path of the taint can be traced at the microcode level, path constraints can be extracted, and finally, a solver can be used to verify the feasibility of the vulnerability.

[0056] Specifically, the UEFI vulnerability analysis plugin can identify taint sources based on the decompiled IR code and identify dangerous sinks in the vulnerable code. In other words, the dynamic marking mechanism for taint sources and dangerous sinks enables accurate vulnerability identification.

[0057] A taint source is the origin of tainted data. For example, a taint source can include at least one of the following: a hard-coded dangerous address, an unverified communication buffer, and an external input interface (i.e., a location in the program that receives external input). For example, recv(), read(), fscanf(), argv, envp, etc. can be taint sources.

[0058] A dangerous sink is a sensitive function or operation that may be abused. For example, dangerous sinks can include dangerous operations such as memory access violations, privileged function calls, and pointer dereferencing. Examples include: `system(cmd)`, `execve(path, ...)`, `strcpy(dst, src)`, and `memcpy(..., size)` (if `size` is controllable).

[0059] In some examples, the SSA form of the IR (i.e., the decompiled IR code) is used to mark and trace register / memory dependency chains for taint analysis. That is, during taint propagation, context-aware analysis is performed based on the decompiled IR code to identify coding defects reached by the taint, tracing the propagation path of unverified external inputs (such as communication buffers) through ARM memory operation instructions (such as STR / LDR). For example, at a CommBuffer boundary checkpoint, all instructions affecting buffer size verification are identified, and it is determined whether any externally controlled input (such as an SMI trigger value) is involved. In this way, secure operations are distinguished from potential overflow vulnerabilities, invalid defects are eliminated to reduce false alarms, and ultimately high-precision static vulnerability detection is achieved for ARM architecture UEFI firmware without symbol information.

[0060] Specifically, there can be one or more taint sources, and all taint sources can form a taint source set (also known as an initial taint set). Furthermore, the UEFI vulnerability analysis plugin can add a taint label to the variable corresponding to each taint source in the taint source set in the decompiled IR code, that is, mark the corresponding variable as "tainted".

[0061] Furthermore, the UEFI vulnerability analysis plugin can traverse the instructions in the decompiled IR code. For each target instruction that takes a variable carrying a taint label as input, the plugin adds a taint label to the output of that target instruction. That is, if an instruction's input variable is tainted, then that instruction is a target instruction, and its output is also tainted, thus becoming a propagation taint. In this way, the propagation taints corresponding to each taint source can be traced to obtain the set of propagation taints.

[0062] The following examples illustrate target instructions and the addition of taint labels.

[0063] 1. The `m_mov a, b` instruction -> if `b` is marked as tainted, then `a` is marked as tainted;
2. The `m_add a, b, c` instruction -> if `b` or `c` is marked as tainted, then `a` is marked as tainted;
3. The `m_ld a, [ptr]` instruction -> if `ptr` is marked as tainted, then `a` is marked as tainted (pointer taint);
4. The `m_st [ptr], val` instruction -> if `val` is marked as tainted, then the memory `[ptr]` is marked as tainted;
5. The `m_call func, args` instruction -> if `args[i]` is marked as tainted, then the corresponding parameter is marked as tainted; if `func` returns a value marked as tainted, then the output result is marked as tainted.
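The five propagation rules above can be sketched as a single forward pass over a straight-line instruction list. The tuple encoding of instructions and the name `propagate_taint` are assumptions for this Python illustration. Two simplifications are also assumptions rather than statements of the described method: a load is treated as tainted when it reads memory previously tainted by a store, and a call's return value is treated as tainted whenever any argument is tainted.

```python
def propagate_taint(instructions, sources):
    """Forward taint propagation over a straight-line list of IR instructions.

    Each instruction is a tuple whose first element is the opcode:
      ("m_mov", a, b)   ("m_add", a, b, c)   ("m_ld", a, ptr)
      ("m_st", ptr, val)   ("m_call", ret, func, args)
    """
    tainted = set(sources)   # initial taint set
    tainted_mem = set()      # pointers whose pointed-to memory is tainted
    tainted_calls = []       # (function, index of tainted argument)
    for ins in instructions:
        op = ins[0]
        if op == "m_mov":
            _, a, b = ins
            if b in tainted:
                tainted.add(a)
        elif op == "m_add":
            _, a, b, c = ins
            if b in tainted or c in tainted:
                tainted.add(a)
        elif op == "m_ld":
            _, a, ptr = ins
            # Assumption: reading memory tainted by an earlier store also taints `a`.
            if ptr in tainted or ptr in tainted_mem:
                tainted.add(a)
        elif op == "m_st":
            _, ptr, val = ins
            if val in tainted:
                tainted_mem.add(ptr)
        elif op == "m_call":
            _, ret, func, args = ins
            hits = [i for i, arg in enumerate(args) if arg in tainted]
            tainted_calls.extend((func, i) for i in hits)
            # Assumption: the return value is tainted if any argument is tainted.
            if hits and ret is not None:
                tainted.add(ret)
    return tainted, tainted_calls
```

Because the IR is in SSA form, a variable never needs to be un-tainted: each reassignment creates a new versioned name, so a set of tainted names is sufficient state for this pass.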

[0064] It should be understood that assembly-level analysis is limited by hardware details (registers, byte-level operations) and cannot naturally express high-level logic such as argv[1] that relies on pointers / arrays. Compared with assembly-level analysis, the decompiled IR code directly restores the essence of this logic, can accurately express operations such as argv[1], can naturally handle pointer arithmetic and memory dereferencing, and allows analysis tools to track data flow (such as taint propagation) more accurately, thereby avoiding missed detections.

[0065] For example, in the following decompiled IR code, if cmd is tainted, the existence of the vulnerability is reflected more directly and accurately.

[0066]
ptr = PTR_ADD argv, 8 ; argv + 8 → address of argv[1]
cmd = LOAD ram, ptr ; read argv[1]
CALL system, cmd ; if cmd is tainted → vulnerability!

[0067] S16: if the set of propagation taints includes dangerous sinks, then for each dangerous sink, extract the path constraints from the taint source to the dangerous sink based on the decompiled intermediate language code, convert each path constraint into a corresponding logical expression, and send the logical expression to the constraint solver to verify whether all path constraints can be satisfied. If so, it is determined that the dangerous sink has a vulnerability; otherwise, it is determined that the dangerous sink does not have a vulnerability.

[0068] Specifically, if the set of propagation taints corresponding to a taint source includes a dangerous sink, contaminated data can flow from the taint source to the dangerous sink. However, even when tainted data is found to flow to the sink, it is still necessary to verify whether the path is reachable.

[0069] For example, the sample code is:
if (len < 100)
    system(user_input); // Even if `user_input` is marked as tainted, this call is executed only when `len < 100`.

[0070] Note: Even if user_input is marked as tainted, the system(user_input) instruction on this path will only be executed if the conditional branch "len<100" is followed, and only then will the vulnerability truly exist. If the conditional branch "len≥100" is followed, the system(user_input) instruction on this path will not be executed, and therefore there is no vulnerability on this path.

[0071] Therefore, for each dangerous sink, the path constraints from the taint source to the dangerous sink can be extracted based on the decompiled IR code, and it can be verified whether all path constraints can be satisfied. This determines whether the path from the taint source to the dangerous sink is reachable, and thereby whether the dangerous sink has a vulnerability.

[0072] In some embodiments, the UEFI vulnerability analysis plugin can identify conditional jump instructions in decompiled IR code. For example, JZ (jump if zero) or JNZ instructions in decompiled IR code contain explicit conditional predicates, thus allowing the identification of conditional jump instructions based on these predicates. Furthermore, path constraints can be constructed from the entry point (i.e., the taint source) to the sink. Specifically, for each sink, the target branch conditions corresponding to the conditional jump instructions on the target path are collected to obtain the path constraints. Here, the target path is the entire path from the taint source to the sink; the target branch condition is the branch condition where the condition corresponding to the conditional jump instruction is true. For example, to execute `system`, the condition `!(argc<= 2)` must be satisfied, meaning `argc>2` is a true branch condition.

[0073] Furthermore, the extracted path constraints can be converted into corresponding logical expressions. For example, the logical expression could be: (argc>2) ∧ (input_len>200).
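The collection of branch conditions and their conversion into a logical expression can be sketched as follows. In this Python illustration, the path is assumed to be a list of `((lhs, op, rhs), taken)` pairs, one per conditional jump between the taint source and the sink; `path_constraints`, `to_expression`, and the `NEGATE` table are hypothetical names for the example.

```python
# Negation table for comparison predicates.
NEGATE = {"<=": ">", ">": "<=", "<": ">=", ">=": "<", "==": "!=", "!=": "=="}


def path_constraints(path):
    """Collect the branch conditions that must hold along one source-to-sink path.

    `path` is a list of ((lhs, op, rhs), taken) pairs: the predicate of a
    conditional jump and whether the path takes the jump or falls through.
    """
    constraints = []
    for (lhs, op, rhs), taken in path:
        if not taken:
            op = NEGATE[op]   # falling through means the predicate was false
        constraints.append((lhs, op, rhs))
    return constraints


def to_expression(constraints):
    """Render the constraints as a conjunction of parenthesized comparisons."""
    return " ∧ ".join(f"({lhs} {op} {rhs})" for lhs, op, rhs in constraints)
```

For the example above, a jump on `argc <= 2` that is not taken, followed by a taken jump on `input_len > 200`, yields the expression `(argc > 2) ∧ (input_len > 200)`.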

[0074] Furthermore, the feasibility of the vulnerability is verified using a constraint solver. Specifically, the logical expression can be fed into a constraint solver (such as an SMT solver) to verify whether all path constraints can be satisfied. If they can (SAT), there exists an input that satisfies all path constraints and allows malicious data (i.e., tainted data) to be transmitted from the taint source to the dangerous sink. In other words, the path from the taint source to the dangerous sink is reachable, which indicates that the dangerous sink has a real vulnerability. If they cannot (UNSAT), the path from the taint source to the dangerous sink is unreachable, which indicates that the dangerous sink does not have a real vulnerability.
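The SAT / UNSAT decision can be illustrated without any external dependency by an exhaustive search over a small integer domain. This Python sketch is a stand-in for an SMT solver, not a replacement for one: the function name `solve`, the `(variable, op, constant)` constraint format, and the bounded domain are all assumptions, and an UNSAT answer here only holds within that domain.

```python
import operator
from itertools import product

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}


def solve(constraints, domain=range(0, 257)):
    """Return a satisfying assignment (SAT) or None (UNSAT within `domain`).

    `constraints` is a list of (variable, op, integer constant) triples
    that must all hold at the same time.
    """
    variables = sorted({var for var, _, _ in constraints})
    for values in product(domain, repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(OPS[op](env[var], const) for var, op, const in constraints):
            return env        # a concrete input that drives execution to the sink
    return None               # no input in the domain satisfies every constraint
```

A returned assignment plays the role of the witness input: it shows that the path to the dangerous sink is reachable. A real SMT solver would decide the same question symbolically over full-width machine integers.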

[0075] Based on the above analysis, for a dangerous sink (i.e., a defect) in the vulnerable code to be considered a real vulnerability, the following conditions must be met simultaneously: 1. Taint reachability: analysis based on the decompiled IR code shows that attacker input (i.e., a taint source) can propagate to the dangerous sink.

[0076] 2. Path feasibility: all branch conditions (i.e., all path constraints) from the taint source to the dangerous sink can be satisfied.

[0077] 3. Semantic validity: the use of the dangerous sink is indeed dangerous (i.e., the dangerous sink exists within the set of propagation taints of a taint source). If any of these conditions is not met, the dangerous sink is merely a code defect, not a real, exploitable vulnerability.

[0078] In the above scheme, after vulnerable code is initially detected using the vulnerability detection rule base, SSA-form intermediate language code (i.e., the decompiled IR code) is used to precisely track whether user input can reach the dangerous operation (i.e., the dangerous sink in the vulnerable code) unimpeded. By extracting and solving path constraints, the scheme ultimately determines whether the defect in the initially detected vulnerable code can actually be exploited. This achieves vulnerability detection for closed-source ARM architecture UEFI firmware that lacks symbol information.

[0079] Furthermore, compared with traditional CFG (Control Flow Graph) + assembly-level taint analysis, the proposed solution significantly reduces the false positive rate. Traditional CFG + assembly-level taint analysis is often difficult to model, and indirect jumps can easily lead to incomplete control flow graphs, resulting in higher false positive and false negative rates. The proposed solution therefore improves the accuracy of vulnerability detection and reflects advanced practice in modern binary vulnerability auditing. Moreover, this solution combines the preliminary vulnerability detection results with the decompiled IR code for further analysis of real vulnerabilities. It does not need to cover all complex paths in the firmware program; instead, it determines, based on the decompiled IR code, whether the dangerous sinks in the initially detected vulnerable code are propagation taints, and then constructs the corresponding path constraints only for the dangerous sinks that are propagation taints. This targeted detection is more convenient and accurate.

[0080] Figure 7 is a schematic diagram illustrating the principle of a vulnerability detection method for ARM architecture UEFI firmware in one embodiment. As can be seen from Figure 7, after the image file of the ARM architecture UEFI firmware is input into the UEFI vulnerability analysis plugin, the following steps can be performed. Step 1, decompilation: during decompilation, ARM architecture instructions are parsed, instruction features are extracted, and decompilation is performed more accurately based on these features to obtain decompiled code. Step 2, symbol recovery: symbol information is recovered from the decompiled code based on the UEFI symbol rule library, which is extracted from open-source UEFI projects. Step 3, vulnerability pattern matching: after the symbol information is recovered, vulnerability patterns are matched based on a vulnerability pattern library, which is extracted from historical UEFI vulnerability data. It should be understood that the vulnerability pattern library can also be called a vulnerability detection rule library, and vulnerability pattern matching can also be called vulnerability detection rule matching. A list of coding defects is generated based on the vulnerability pattern matching results. This list of coding defects is the list of identified vulnerable code.

[0081] It should be understood that after the decompiled code is obtained, in addition to step 2 (symbol recovery), step 4 (generating decompiled IR code from the decompiled code) can also be performed. Step 5, taint analysis and path constraint solving based on the decompiled IR code, can then be performed. Specifically, taint sources are identified, and the dangerous sinks in the vulnerable code recorded in the coding defect list are determined. The set of propagation taints corresponding to the taint sources is analyzed using the decompiled IR code. For each dangerous sink that is a propagation taint, it is analyzed whether all path constraints from the taint source to the dangerous sink can be satisfied. If they can, the dangerous sink is determined to be a real vulnerability; if not, it is determined to be a false vulnerability. Finally, a vulnerability report can be output, that is, the final vulnerability detection results for the ARM architecture UEFI firmware are obtained.

[0082] It should be understood that although the steps in the flowcharts of the above embodiments are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the above embodiments may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.

[0083] Based on the same inventive concept, this application also provides a vulnerability detection device for ARM architecture UEFI firmware, which implements the aforementioned vulnerability detection method for ARM architecture UEFI firmware. The solution provided by this device is similar to the implementation described in the above method. Therefore, the specific limitations of one or more vulnerability detection device embodiments for ARM architecture UEFI firmware provided below can be found in the limitations of the vulnerability detection method for ARM architecture UEFI firmware described above, and will not be repeated here.

[0084] In one embodiment, as shown in Figure 8, a vulnerability detection device for ARM architecture UEFI firmware is provided, the device comprising:
the decompilation module 802, used to identify the instruction characteristics of the ARM architecture UEFI firmware during the decompilation process and to perform decompilation based on the instruction characteristics to obtain decompiled code;
the symbol information recovery module 804, used to perform symbol information recovery processing on the decompiled code based on a preset UEFI symbol rule library;
the vulnerability code identification module 806, used to identify vulnerable code from the assembly code after recovering symbol information, based on the vulnerability detection rule library set for ARM architecture UEFI firmware;
the intermediate language code generation module 808, used to generate decompiled intermediate language code based on the decompiled code;
the real vulnerability identification module 810, used to identify taint sources and dangerous sinks in the vulnerable code based on the decompiled intermediate language code, and to determine the set of propagation taints corresponding to the taint sources in the decompiled intermediate language code; if the set of propagation taints includes dangerous sinks, then for each dangerous sink, the path constraints from the taint source to the dangerous sink are extracted based on the decompiled intermediate language code, each path constraint is converted into a corresponding logical expression, and the logical expression is sent to the constraint solver to verify whether all path constraints can be satisfied; if so, the dangerous sink is determined to have a vulnerability; otherwise, the dangerous sink is determined not to have a vulnerability.

[0085] In some embodiments, the instruction characteristics include at least one of the value of the least significant bit of the parameter register of the branch exchange instruction or the value of the least significant bit of the function entry address. The decompilation module 802 is further configured to distinguish the target mode of each code block to be decompiled based on the value of the least significant bit of the parameter register of the branch exchange instruction or the value of the least significant bit of the function entry address; the target mode is either Thumb mode or ARM mode; based on the code characteristics corresponding to the target mode, the code segment to be decompiled is decompiled to obtain the corresponding decompiled code.
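The mode distinction based on the least significant bit can be sketched in a few lines. In AArch32 interworking, bit 0 of a branch-exchange target (or of a function entry address recorded for such a branch) selects Thumb state when set and ARM state when clear. The function names `target_mode` and `decode_entry` are assumptions for this Python illustration, and the returned alignment is simply the minimum instruction alignment of each state.

```python
def target_mode(address):
    """Interworking convention: bit 0 set selects Thumb state, clear selects ARM state."""
    return "Thumb" if address & 1 else "ARM"


def decode_entry(address):
    """Return (mode, real instruction address, instruction alignment in bytes)."""
    mode = target_mode(address)
    if mode == "Thumb":
        # The mode bit is not part of the address; Thumb code is halfword aligned.
        return mode, address & ~1, 2
    return mode, address, 4   # ARM instructions are word aligned
```

A decompiler can use the returned mode to choose the instruction decoder for the code block that starts at the returned address.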

[0086] In some embodiments, the UEFI symbol rule base includes the GUID and semantic feature information corresponding to each symbol object. The symbol information recovery module 804 is further configured to match the GUID in the UEFI symbol rule base with the decompiled code to identify the GUID in the decompiled code; for the symbol object corresponding to the identified GUID in the decompiled code, add the corresponding semantic feature information and the corresponding structure information to recover the symbol information of the symbol object.
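GUID matching against a rule base can be sketched as a byte search over the firmware image. This Python illustration assumes a hypothetical one-entry rule base (`RULES`) that maps the UEFI global variable vendor GUID to a symbol name and a short semantic note; the function name `find_guids` is also an assumption. It relies on the fact that an `EFI_GUID` stores its first three fields in little-endian order, which is the layout produced by `uuid.UUID.bytes_le`.

```python
import uuid

# Hypothetical one-entry rule base: GUID -> (symbol name, semantic note).
RULES = {
    uuid.UUID("8be4df61-93ca-11d2-aa0d-00e098032b8c"):
        ("gEfiGlobalVariableGuid", "vendor GUID of UEFI global variables"),
}


def find_guids(image, rules=RULES):
    """Return sorted (offset, symbol name, note) for every known GUID found in `image`."""
    hits = []
    for guid, (name, note) in rules.items():
        pattern = guid.bytes_le   # in-memory EFI_GUID layout
        start = image.find(pattern)
        while start != -1:
            hits.append((start, name, note))
            start = image.find(pattern, start + 1)
    return sorted(hits)
```

Each hit gives an offset at which the recovered name and semantic information can be attached to the corresponding object in the decompiled code.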

[0087] In some embodiments, the detection rules in the vulnerability detection rule base are set based on the characteristics of ARM assembly instructions; the vulnerability detection rule base includes at least one of the following: detection rules for memory operations without boundary checks, detection rules for SMI handlers lacking permission verification, or detection rules for communication protocols with structural mismatches.

[0088] In some embodiments, the intermediate language code generation module 808 is further configured to mark target assembly code segments that conform to embedded data characteristics in the decompiled code based on the recovered symbol information; correct the embedded data in the target assembly code segments that were misidentified as assembly instruction code during the decompilation stage; and generate decompiled intermediate language code based on the corrected decompiled code.

[0089] In some embodiments, at least one taint source is identified from the vulnerable code; the real vulnerability identification module 810 is further configured to add a taint label to the variable corresponding to the taint source in the decompiled intermediate language code for each taint source; traverse the instructions in the decompiled intermediate language code, and add a taint label to the output result corresponding to each target instruction that uses a variable with an added taint label as input, so as to track each propagation taint corresponding to the taint source and obtain a set of propagation taints; wherein, the propagation taints have added taint labels.

[0090] In some embodiments, the real vulnerability identification module 810 is further used to identify conditional jump instructions in the decompiled intermediate language code; for each dangerous sink, it collects the target branch conditions corresponding to the conditional jump instructions on the target path to obtain path constraint conditions; wherein, the target path is the entire path from the taint source to the dangerous sink; and the target branch condition is the branch condition where the condition corresponding to the conditional jump instruction is true.

[0091] The modules in the aforementioned vulnerability detection device for ARM architecture UEFI firmware can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in the processor of a computer device in hardware form or independent of it, or stored in the memory of the computer device in software form, so that the processor can call and execute the corresponding operations of each module.

[0092] In one embodiment, a computer device is provided, which may be a terminal, and its internal structure may be as shown in Figure 9. This computer device includes a processor, memory, input / output (I / O) interfaces, and a communication interface. The processor, memory, and I / O interfaces are connected via a system bus, and the communication interface is also connected to the system bus via the I / O interfaces. The processor provides computational and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system and computer programs. The internal memory provides the environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The I / O interfaces are used for exchanging information between the processor and external devices. The communication interface is used for communicating with external terminals via a network connection. When executed by the processor, the computer program implements a vulnerability detection method for ARM architecture UEFI firmware.

[0093] Those skilled in the art will understand that the structure shown in Figure 9 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.

[0094] In one embodiment, a computer device is provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps of the embodiments of the present application.

[0095] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed by a processor, implements the steps of the embodiments of this application.

[0096] In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, implements the steps of the embodiments of this application.

[0097] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties, and the collection, use and processing of related data must comply with the relevant laws, regulations and standards of the relevant countries and regions.

[0098] Those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium. When executed, the computer program can include the processes of the embodiments described above. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to these.

[0099] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0100] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of this patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.

Claims

1. A vulnerability detection method for ARM architecture UEFI firmware, characterized in that the method comprises: during decompilation of the ARM architecture UEFI firmware, identifying instruction features of the ARM architecture UEFI firmware and performing decompilation based on the instruction features to obtain decompiled code; performing symbol information recovery on the decompiled code based on a preset UEFI symbol rule base; identifying vulnerable code in the assembly code after symbol information recovery, based on a vulnerability detection rule base set for the ARM architecture UEFI firmware; generating decompiled intermediate language code based on the decompiled code; identifying taint sources and dangerous sinks in the vulnerable code through the decompiled intermediate language code, and determining the set of propagated taints corresponding to the taint sources in the decompiled intermediate language code; and, if the set of propagated taints includes the dangerous sink, for each dangerous sink, extracting the path constraints from the taint source to the dangerous sink based on the decompiled intermediate language code, converting each path constraint into a corresponding logical expression, and feeding the logical expressions into a constraint solver to verify whether all the path constraints can be satisfied; if so, the dangerous sink is determined to have a vulnerability; otherwise, the dangerous sink is determined not to have a vulnerability.

2. The method according to claim 1, characterized in that the instruction features include at least one of: the value of the least significant bit of the operand register of a branch-and-exchange (BX) instruction, or the value of the least significant bit of a function entry address; and performing decompilation based on the instruction features to obtain decompiled code comprises: distinguishing the target mode of each code block to be decompiled based on the value of the least significant bit of the operand register of the branch-and-exchange instruction or the value of the least significant bit of the function entry address, the target mode being either Thumb mode or ARM mode; and decompiling the code block to be decompiled based on the code features corresponding to the target mode to obtain the corresponding decompiled code.
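By way of illustration only, the mode discrimination recited in claim 2 can be sketched as follows. The sketch relies on the AArch32 interworking convention that bit 0 of a BX target or function entry address selects Thumb state when set and ARM state when clear. The function names `target_mode` and `decode_address` are hypothetical and do not appear in the application.

```python
def target_mode(address: int) -> str:
    """Classify a branch target or function entry address by its lowest bit.

    Assumption: AArch32 interworking, where bit 0 set means Thumb state
    and bit 0 clear means ARM state.
    """
    return "Thumb" if address & 1 else "ARM"


def decode_address(address: int) -> int:
    """Clear bit 0 to obtain the address at which decoding actually starts."""
    return address & ~1


# Hypothetical entry points taken from a firmware image.
for entry in (0x0001_8001, 0x0001_9000):
    print(hex(decode_address(entry)), target_mode(entry))
```

A decompiler following this scheme would pick the Thumb or ARM instruction decoder per code block from the returned mode, rather than assuming a single encoding for the whole image.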

3. The method according to claim 1, characterized in that the UEFI symbol rule base includes the GUID and the semantic feature information corresponding to each symbol object; and performing symbol information recovery on the decompiled code based on the preset UEFI symbol rule base comprises: matching the GUIDs in the UEFI symbol rule base against the decompiled code to identify the GUIDs present in the decompiled code; and, for the symbol object corresponding to each GUID identified in the decompiled code, adding the corresponding semantic feature information and the corresponding structure information to restore the symbol information of the symbol object.
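By way of illustration only, the GUID matching recited in claim 3 can be sketched as a byte search over the firmware image. The sketch assumes that GUIDs are stored in the standard `EFI_GUID` memory layout (first three fields little-endian), which Python's `uuid.UUID.bytes_le` reproduces. The rule base below is a hypothetical one-entry example; the application does not disclose the format of its rule base, and a real tool would match against decompiled code rather than raw bytes.

```python
import uuid
from typing import Dict, List, Tuple

# Hypothetical rule base: GUID text -> (symbol name, associated structure).
RULE_BASE: Dict[str, Tuple[str, str]] = {
    "5b1b31a1-9562-11d2-8e3f-00a0c969723b": (
        "gEfiLoadedImageProtocolGuid",
        "EFI_LOADED_IMAGE_PROTOCOL",
    ),
}


def find_guids(image: bytes, rule_base: Dict[str, Tuple[str, str]]
               ) -> List[Tuple[int, str, str]]:
    """Return (offset, symbol name, structure name) for every GUID found."""
    hits = []
    for text, (name, struct) in rule_base.items():
        needle = uuid.UUID(text).bytes_le  # EFI_GUID in-memory layout
        pos = image.find(needle)
        while pos != -1:
            hits.append((pos, name, struct))
            pos = image.find(needle, pos + 1)
    return sorted(hits)


# Synthetic image: 4 padding bytes, one GUID, 4 more padding bytes.
SAMPLE = (b"\x00" * 4
          + uuid.UUID("5b1b31a1-9562-11d2-8e3f-00a0c969723b").bytes_le
          + b"\xff" * 4)
print(find_guids(SAMPLE, RULE_BASE))
```

Each hit gives the analysis a location to label with the symbol name and a structure type to apply to the associated interface pointer.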

4. The method according to claim 1, characterized in that the detection rules in the vulnerability detection rule base are set based on the characteristics of ARM assembly instructions; and the vulnerability detection rule base includes at least one of the following: detection rules for memory operations without boundary checks, detection rules for SMI handlers lacking permission verification, or detection rules for communication protocols with structural mismatches.
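By way of illustration only, one possible form of a "memory operation without boundary check" rule is sketched below. It is not the rule set of the application, whose contents are not disclosed. The sketch assumes AArch32 code following the AAPCS calling convention, under which the third argument (the length passed to `CopyMem`) is in `r2`, assumes that symbol recovery has already resolved the call target to the name `CopyMem`, and uses a fixed look-back window as a deliberately crude heuristic. The instruction representation and the name `unchecked_copies` are hypothetical.

```python
from typing import List, Tuple

Instruction = Tuple[str, List[str]]  # (mnemonic, operands), hypothetical form


def unchecked_copies(instrs: List[Instruction], window: int = 8,
                     length_reg: str = "r2",
                     copy_fn: str = "CopyMem") -> List[int]:
    """Return indices of calls to copy_fn with no CMP on length_reg
    among the `window` instructions preceding the call."""
    findings = []
    for i, (mnemonic, operands) in enumerate(instrs):
        if mnemonic == "bl" and operands == [copy_fn]:
            preceding = instrs[max(0, i - window):i]
            checked = any(m == "cmp" and length_reg in ops
                          for m, ops in preceding)
            if not checked:
                findings.append(i)
    return findings


UNCHECKED = [
    ("ldr", ["r2", "[r4, #8]"]),   # length read from an external buffer
    ("bl", ["CopyMem"]),
]
CHECKED = [
    ("ldr", ["r2", "[r4, #8]"]),
    ("cmp", ["r2", "#256"]),       # length compared against a bound
    ("bhi", ["fail"]),
    ("bl", ["CopyMem"]),
]
print(unchecked_copies(UNCHECKED), unchecked_copies(CHECKED))
```

A rule of this kind only produces candidates. It does not follow register renaming or checks performed in a caller, which is why the later taint analysis and constraint solving stages are used to confirm or discard each candidate.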

5. The method according to claim 1, characterized in that generating decompiled intermediate language code based on the decompiled code comprises: marking, based on the recovered symbol information, target assembly code segments in the decompiled code that conform to the characteristics of embedded data; and correcting the embedded data in the target assembly code segments that was misidentified as assembly instruction code during the decompilation stage, and generating the decompiled intermediate language code based on the corrected decompiled code.

6. The method according to claim 1, characterized in that at least one taint source is identified from the vulnerable code; and determining the set of propagated taints corresponding to the taint source in the decompiled intermediate language code comprises: for each taint source, adding a taint tag to the variable corresponding to the taint source in the decompiled intermediate language code; and traversing the instructions in the decompiled intermediate language code and, for each target instruction that takes a variable carrying the taint tag as input, adding the taint tag to the output of the target instruction, so as to track each propagated taint corresponding to the taint source and thereby obtain the set of propagated taints; wherein each propagated taint carries the taint tag.
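By way of illustration only, the taint tag propagation recited in claim 6 can be sketched as a single forward pass over a list of intermediate language instructions. The three-field instruction form and the names `Instr`, `propagate_taint`, and `PROGRAM` are hypothetical; the application does not specify the intermediate language. The sketch assumes straight-line code, so one traversal is enough.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Instr(NamedTuple):
    dest: str               # variable written by the instruction
    op: str                 # operation name (informational only)
    srcs: Tuple[str, ...]   # variables read by the instruction


def propagate_taint(instrs: Iterable[Instr],
                    sources: Iterable[str]) -> Set[str]:
    """Tag every variable computed from a tagged input."""
    tainted = set(sources)  # tag the taint source variables
    for ins in instrs:
        if any(src in tainted for src in ins.srcs):
            tainted.add(ins.dest)  # output inherits the taint tag
    return tainted


# Hypothetical IL fragment: a length read from a communication buffer
# flows into the length argument of a copy.
PROGRAM = [
    Instr("size", "load", ("comm_buffer",)),
    Instr("limit", "const", ()),
    Instr("total", "add", ("size", "limit")),
    Instr("copy_len", "mov", ("total",)),
]
print(sorted(propagate_taint(PROGRAM, {"comm_buffer"})))
```

If `copy_len` is the argument of a dangerous sink, its presence in the returned set is what triggers the path constraint check. Code with loops would need the traversal repeated until the set stops growing, and this sketch does not remove a tag when a variable is overwritten with clean data.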

7. The method according to any one of claims 1 to 6, characterized in that, for each dangerous sink, extracting the path constraints from the taint source to the dangerous sink based on the decompiled intermediate language code comprises: identifying the conditional jump instructions in the decompiled intermediate language code; and, for each dangerous sink, collecting the target branch conditions corresponding to the conditional jump instructions on the target path to obtain the path constraints; wherein the target path is the entire path from the taint source to the dangerous sink, and the target branch condition is the branch condition under which the condition of the conditional jump instruction is true.
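By way of illustration only, the collection of branch conditions along a path and the subsequent satisfiability check can be sketched as follows. The constraint solver of claim 1 is not named in the claims; here a brute-force search over a small integer range stands in for it, which is an assumption made to keep the example dependency-free. The path representation and the names `collect_path_constraints` and `find_witness` are hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Tuple, Union

Condition = Callable[[int], bool]
Step = Tuple[str, Union[str, Condition]]  # ("cjmp", condition) or (kind, text)


def collect_path_constraints(path: List[Step]) -> List[Condition]:
    """Keep the branch condition of every conditional jump on the path."""
    return [payload for kind, payload in path
            if kind == "cjmp" and callable(payload)]


def find_witness(constraints: List[Condition],
                 domain: Iterable[int] = range(0x10000)) -> Optional[int]:
    """Stand-in for a constraint solver: return a value of the tainted
    variable satisfying all constraints, or None if none exists in domain."""
    for value in domain:
        if all(check(value) for check in constraints):
            return value
    return None


# Path on which the sink is reachable with an oversized length.
FEASIBLE_PATH: List[Step] = [
    ("assign", "size = comm_buffer.size"),
    ("cjmp", lambda size: size != 0),
    ("cjmp", lambda size: size > 0x100),
    ("call", "CopyMem(dst, src, size)"),
]
# Path whose branch conditions contradict each other.
INFEASIBLE_PATH: List[Step] = [
    ("cjmp", lambda size: size <= 0x100),
    ("cjmp", lambda size: size > 0x100),
    ("call", "CopyMem(dst, src, size)"),
]
print(find_witness(collect_path_constraints(FEASIBLE_PATH)))    # 257
print(find_witness(collect_path_constraints(INFEASIBLE_PATH)))  # None
```

A witness for the first path means the sink is reachable with attacker-controlled input, so it would be reported; the second path is discarded as a false positive. A production tool would hand the logical expressions to an SMT solver instead of enumerating values.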

8. A vulnerability detection device for ARM architecture UEFI firmware, characterized in that the device comprises: a decompilation module, used to identify the instruction features of the ARM architecture UEFI firmware during decompilation of the firmware, and to perform decompilation based on the instruction features to obtain decompiled code; a symbol information recovery module, used to perform symbol information recovery on the decompiled code based on a preset UEFI symbol rule base; a vulnerable code identification module, used to identify vulnerable code in the assembly code after symbol information recovery, based on the vulnerability detection rule base set for the ARM architecture UEFI firmware; an intermediate language code generation module, used to generate decompiled intermediate language code based on the decompiled code; and a real vulnerability identification module, used to identify taint sources and dangerous sinks in the vulnerable code based on the decompiled intermediate language code, and to determine the set of propagated taints corresponding to the taint sources in the decompiled intermediate language code; wherein, if the set of propagated taints includes the dangerous sink, for each dangerous sink, the path constraints from the taint source to the dangerous sink are extracted based on the decompiled intermediate language code, each path constraint is converted into a corresponding logical expression, and the logical expressions are sent to a constraint solver to verify whether all the path constraints can be satisfied; if so, the dangerous sink is determined to have a vulnerability; otherwise, the dangerous sink is determined not to have a vulnerability.

9. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, the steps of the method according to any one of claims 1 to 7 are implemented.

10. A computer-readable storage medium storing a computer program thereon, characterized in that, when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 7 are implemented.