A large language model-based method for filtering false alarms in static code analysis
By constructing control flow graphs and data flow graphs, and combining them with large language models for structured input, the problems of high false alarm rates in static code analysis tools and heavy manual review burdens have been solved. This has enabled high-precision alarm filtering and automated processing, improving the efficiency of software development and security review.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HOHAI UNIV
- Filing Date
- 2026-03-20
- Publication Date
- 2026-06-19
AI Technical Summary
Existing static code analysis tools have high false alarm rates and impose a heavy manual review workload. Large language models applied directly lack sufficient understanding of code logic, which leads to inaccurate alarm judgments and makes high-precision automated filtering difficult to achieve.
By parsing the abstract syntax tree, extracting complete grammatical units, constructing control flow graphs and data flow graphs, forming structured input, and combining it with a large language model for reasoning, the system outputs structured results, enabling high-precision filtering of static analysis alarms.
It significantly reduces false alarm rates, improves the accuracy and automation of alarm processing, reduces manual review workload, has good scalability and adaptability, and provides decision interpretability.
Figure CN122240489A_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of software quality assurance and defect detection technology, and specifically relates to a method, based on a large language model, for filtering false alarms among static code analysis alarms.
Background Technology
[0002] As software systems grow larger and more complex, static code analysis has become an indispensable tool in software development and security assurance. Without executing the program, this technique detects potential defects, vulnerabilities, and non-standard implementations by analyzing the syntax structure, control flow, and data flow of the source code. Problems can thus be discovered and fixed before software delivery, which is crucial for ensuring code quality and security.
[0003] However, existing static analysis tools generally suffer from high false positive rates. This is due to factors such as overly broad detection rules, insufficient understanding of code context, and inadequate handling of cross-function or cross-module dependencies. To reduce false positive rates, existing technologies typically employ the following methods: first, optimizing detection rules to reduce misjudgments, but this is costly to maintain and has poor adaptability; second, relying on manual review, but this is expensive in terms of manpower and time; and third, using traditional machine learning models for auxiliary judgment, but this relies on manual feature engineering and struggles to capture complex code semantics.
[0004] In recent years, large language models have demonstrated strong capabilities in code understanding, providing new methods for improving the accuracy of static analysis results. However, in direct application, due to the lack of sufficient deep logical information such as code context, control flow, and data flow, as well as the lack of standardized structured output, the judgment results still fall short in terms of accuracy and automatability, making it difficult to fundamentally alleviate the review burden and efficiency bottleneck caused by high false positive rates.
[0005] Therefore, traditional rule-based methods, machine learning-assisted methods, and existing simple alarm judgment schemes based on large language models all struggle to meet the requirements of industrial applications in terms of reducing false alarm rates, improving judgment accuracy, and enabling automated use.
Summary of the Invention
[0006] Purpose of the Invention: To address the problems of high false positive rates, heavy manual review burden, and inaccurate alarm judgments in large language model applications due to a lack of effective understanding of deep code logic in existing static code analysis techniques, this invention proposes a false positive filtering method for static code analysis alarms based on large language models. This invention parses the abstract syntax tree and, within a preset maximum character limit, prioritizes extracting the outermost complete syntactic units (such as complete function or class definitions) containing defect line numbers, ensuring the semantic integrity of the context and avoiding model input truncation. Simultaneously, it constructs control flow graphs and data flow graphs that support cross-function execution paths and cross-function variable dependency tracing to compensate for the insufficient understanding of cross-function semantics in traditional static analysis. This invention further employs a preset structured prompt template to encapsulate source code fragments, control flow graphs, and data flow graphs into a unified structured input, enabling large language models to perform reasoning based on complete program semantic information and output structured results in JSON or key-value format for easy automatic parsing and judgment. This invention achieves high-precision filtering of static analysis alarms, thereby significantly reducing the false positive rate while improving the accuracy and automation level of alarm processing.
[0007] Compared to existing methods based on large language models, this invention not only provides textual information from the source code but also offers cross-function execution paths, cross-function data dependencies, and abstract syntactic structures to the large language model in a structured manner. This enables the model to perform accurate reasoning based on the complete program semantics, thereby significantly improving the accuracy of alarm authenticity judgment. Furthermore, this invention employs a unified structured prompt template and structured output format, without limiting the type of large language model. It can be adapted to any pre-trained model that supports structured input and output, achieving good versatility and scalability.
[0008] Technical solution: A method for filtering false alarms in static code analysis based on a large language model, comprising the following steps:
Step S1: Obtain the original alarm information.
[0009] A static code analysis tool scans the target source code and produces raw scan results containing multiple alarm records. Each alarm record includes at least the source file path, the line number where the defect is located, a Common Weakness Enumeration (CWE) identifier, and an alarm type description.
[0010] By parsing the above fields, they are organized into structured data objects—structured alarm information—so that subsequent steps can quickly locate the corresponding syntax node in the abstract syntax tree based on the location of the defect.
[0011] Step S2: Extract and build the extended context of the code.
[0012] First, based on the defect location in step S1, the abstract syntax tree of the corresponding source code file is parsed to locate the syntax node containing the defect line number, and the outermost complete syntax unit (such as a function body or class definition) to which it belongs is determined. The defect location refers to the source code file path and the line number where the defect is located.
[0013] Without exceeding the preset character limit, a source code context fragment covering the defect location is automatically extracted from the syntax unit, which preserves the semantic integrity of the context supplied to the large language model and avoids input truncation.
[0014] Furthermore, an extended context is constructed based on the abstract syntax tree, including:
(1) A control flow graph covering the scope of the syntax unit is constructed to represent the execution paths of the code. The execution paths include sequential structures, conditional branches, loops, and cross-function call paths within the same source file. By tracing function call relationships, this invention extends the control flow along cross-function execution chains and thereby captures cross-function semantics that traditional static analysis struggles to identify.
[0015] (2) Based on the definition-use relationships of code nodes, a data flow graph reflecting the dependencies between variables and data is constructed. The data flow graph supports bidirectional tracking between variable definition points and use points, extends dependencies across functions through parameters and return values, and identifies cross-function data risks such as potentially uninitialized variables and taint propagation.
[0016] By jointly modeling control flow graphs and data flow graphs, a deep semantic context covering cross-function execution and data dependencies is formed.
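The definition-use idea behind the data flow graph can be illustrated with a small sketch. It is only an illustration under simplifying assumptions: it uses Python's standard `ast` module rather than the parser named in the embodiment, it handles a single function with straight-line code (no branches, loops, or cross-function extension), and the function name `def_use_edges` is hypothetical.

```python
import ast

def def_use_edges(source):
    """Return (variable, def_line, use_line) edges for straight-line code."""
    last_def = {}  # variable name -> line of its most recent definition
    edges = []

    class Visitor(ast.NodeVisitor):
        def visit_FunctionDef(self, node):
            for arg in node.args.args:  # parameters count as definitions
                last_def[arg.arg] = node.lineno
            for stmt in node.body:
                self.visit(stmt)

        def visit_Assign(self, node):
            self.visit(node.value)  # the right-hand side is evaluated first
            for target in node.targets:
                self.visit(target)

        def visit_Name(self, node):
            if isinstance(node.ctx, ast.Store):
                last_def[node.id] = node.lineno  # new definition point
            elif node.id in last_def:
                # use point: add an edge from the reaching definition
                edges.append((node.id, last_def[node.id], node.lineno))

    Visitor().visit(ast.parse(source))
    return edges

source = (
    "def risky_func(x):\n"
    "    y = x\n"
    "    y = y + 1\n"
    "    return y\n"
)
edges = def_use_edges(source)
for var, def_line, use_line in edges:
    print(f"{var}: defined at line {def_line}, used at line {use_line}")
```

Each edge links a use of a variable to the definition that reaches it, which is the kind of dependency chain a data flow graph supplies to the model.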
[0017] To further quantify the implementation process in step S2, this invention introduces the following mathematical formulations. These formulations begin with the extraction of code snippets and gradually expand to the path representation of control flow and data flow, ensuring that the context covers the execution logic and dependencies at the defect location.
[0018] (1) Formula for obtaining the defective code fragment from the abstract syntax tree (semantically complete fragment extraction with a character limit and coverage optimization):

$$\mathrm{CodeSnippet} = \arg\max_{f \in F_{\ell},\ |f| \le L_{\max}} |f|$$

where $F_{\ell}$ is the set of function / class definitions containing the defect line number $\ell$; $|f|$ is the character length of function $f$; and $L_{\max}$ is the preset maximum number of characters (e.g., 2000). This formula selects the maximum-coverage fragment under the constraints (the defect location must lie within the function range), ensuring semantic integrity while avoiding truncation caused by exceeding the input limit of the large model.
[0019] (2) Formulas for constructing the control flow graph (CFG) (control flow path representation, supporting cross-function tracing): Basic path representation:

$$P = n_0 \to n_1 \to \cdots \to n_d, \qquad n_i \in B \cup C$$

where $n_d$ is the defect node; $P$ is an executable path; $B$ is the set of basic blocks (statement sequences); and $C$ is the set of control structures (branching / looping / calling). This basic formula defines the execution sequence within the current syntax unit, from the entry point $n_0$ to the defect node $n_d$, including control elements such as conditional branches and loops. In the formula, $n_0$ and $n_1$ denote two nodes in the program execution process, and the arrow $\to$ indicates that control flows from node $n_0$ to node $n_1$. Specifically, $n_0$ is the starting point of the path, which may be the beginning of the program or a specific statement, while $n_1$ is the next step in the path, which can be a basic block (such as a piece of code executed sequentially) or a control structure (such as a conditional statement or a loop). The formula thus describes the execution order from one step to the next, reflecting the direction of control flow and the program execution path.
[0020] Cross-function extension formula (if the current path $P$ involves function calls):

$$\mathrm{CFG}_{\mathrm{ext}}(P) = \mathrm{CFG}(P) \cup \bigcup_{g \in \mathrm{Calls}(P)} \mathrm{CFG}_{\mathrm{ext}}(g)$$

where $g$ is a called function (within the same source file scope); $\mathrm{Calls}(P)$ is the set of function calls involved in path $P$; and $\cup$ is the graph union operation (recursively merging the CFGs of sub-functions). This extended formula supports cross-function tracing: when $P$ encounters a call to $g$, $\mathrm{CFG}_{\mathrm{ext}}(g)$ is built recursively and merged into the control flow graph, so that the complete path captures the impact of the call chain on the defect (such as cross-function branch conditions).
[0021] (3) Data flow graph (DFG) construction formula (data dependencies):

$$\mathrm{DFG} = \bigcup_{v \in V} \bigl( \mathrm{def}(v) \to \mathrm{use}(v) \bigr)$$

where $v$ is a defect-related variable; $\mathrm{def}(v)$ and $\mathrm{use}(v)$ are the definition and use points of the variable; $\to$ denotes program order or control dependency (a forward dependency); and $V$ is the set of relevant variables. This formula defines the data flow chain from a definition to the use at the defect, and supports cross-function tracing: if $\mathrm{use}(v)$ occurs in a call to $g$ (the called function), the dependency is extended to the return value or parameter flow of $g$, which makes it possible to identify potentially uninitialized variables or taint propagation.
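The selection rule in formula (1) can be sketched in a few lines: among the function or class definitions that contain the defect line, pick the largest one that still fits the character limit. The sketch below is an assumption-laden illustration, not the patented implementation. It uses Python's standard `ast` module instead of a multi-language parser, and the function name `select_code_snippet` is hypothetical.

```python
import ast

def select_code_snippet(source, defect_line, max_chars=2000):
    """Return the largest function/class containing defect_line that fits max_chars."""
    best = None
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            continue
        if not (node.lineno <= defect_line <= node.end_lineno):
            continue  # constraint: the defect line must lie inside the unit
        text = ast.get_source_segment(source, node)
        # keep the candidate with the greatest coverage under the limit
        if len(text) <= max_chars and (best is None or len(text) > len(best)):
            best = text
    return best  # None when no enclosing unit fits the limit

source = (
    "class Helper:\n"
    "    def method(self, x):\n"
    "        return x.value\n"
    "\n"
    "def other():\n"
    "    pass\n"
)
print(select_code_snippet(source, defect_line=3))                # whole class
print(select_code_snippet(source, defect_line=3, max_chars=50))  # method only
```

With a generous limit the outermost unit (the class) is returned. With a tight limit the selection falls back to the inner method, and it returns `None` when nothing fits, leaving the truncation policy to the caller.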
[0022] Step S3: Construct the analysis prompt for the large language model.
[0023] The structured alarm information obtained in step S1 is combined in a structured manner with the source code context fragment, control flow graph, and data flow graph obtained in step S2 to form the analysis prompt for the large language model. Specifically, this includes:
(1) Encapsulate the basic alarm fields, source code fragment, CFG structure, and DFG structure according to the field mapping relationship;
(2) Use a preset structured prompt template to organize the information in (1) into a unified structured input (such as JSON format);
(3) Embed the analysis task description, constraints, and expected structured output format in the structured prompt.
[0024] This structured prompt ensures that the large language model can reason over the source code context, execution paths, and data dependencies together.
[0025] Step S4: Obtain the structured response of the large language model.
[0026] Input the analysis prompts generated in step S3 into the large language model and obtain the structured response generated by the model, which includes false alarm judgment results and defect classification.
[0027] Step S5: Filter and output high-confidence alarms.
[0028] Based on the structured response results described in step S4, the original alarm information is automatically filtered, and a high-confidence alarm list is finally output.
[0029] Preferably, the specific implementation of step S1 includes: a) Scanning sub-step: Call the static code analysis tool to scan the target source code and generate scan results containing one or more alarm records.
[0030] b) Extraction sub-step: From the scan results, extract the key fields of each alarm record, including the source code file path, the line number where the defect is located, the Common Weakness Enumeration (CWE) identifier, and the defect type description.
[0031] c) Parsing sub-step: Parse the extracted field information and organize it into a structured data object so that subsequent steps can quickly retrieve the corresponding source code content based on the defect location information.
[0032] Preferably, the specific implementation of step S2 includes: a) Parsing sub-step: Based on the source code file path and the line number where the defect is located obtained in step S1, call the abstract syntax tree parser that supports the target programming language to parse the source code file, thereby locating the syntax node containing the line number where the defect is located, and determining the scope of the syntax unit to which it belongs, which is usually a function body or class definition.
[0033] b) Source code extraction sub-step: Under the condition of not exceeding the preset character limit, extract the source code context fragment containing the defect location from the scope of the syntax unit, and prioritize ensuring that it constitutes a complete function or class definition.
[0034] c) Control Flow Analysis Sub-step: Construct a control flow graph within the syntactic unit. This process supports cross-function control path tracing within the same source file. Generate the corresponding control flow graph structure to represent the execution order, conditional branches, loops, and relationships between cross-function calls.
[0035] d) Data Flow Analysis Sub-step: Within the syntactic unit, the definition and usage relationships of variables, parameters, and return values related to the defective code are traced. This process supports cross-function data dependency tracing within the same source file scope. A corresponding data flow graph structure is generated to represent the definition, reference, and dependency relationships of variables and data in the context.
[0036] e) Fusion sub-step: Integrate the information obtained through source code extraction, control flow analysis, and data flow analysis to generate an extended context containing source code context fragments, control flow graph structures, and data flow graph structures, which will serve as input for subsequent steps.
[0037] Preferably, the specific implementation of step S3 includes: a) Information integration sub-step: Map and match the alarm information obtained in step S1 with the extended context obtained in step S2, and establish the association between the alarm location and the code node, control flow node and data flow node in the context segment according to the preset fields.
[0038] b) Format Encapsulation Sub-step: Based on the prompt template adapted to the large language model input, the integrated information is converted into a unified structured representation. This structured representation includes text description fields, source code fields, control flow graph fields, and data flow graph fields for describing alarms. The "integrated information" refers to the result of summarizing and merging various data from different sources, including original alarm information, code context extracted from the source code, and constructed control flow graphs (CFG) and data flow graphs (DFG). Specifically, alarm information refers to raw alarm data from static code analysis tools, containing defect locations, alarm types, etc. Code context refers to relevant code fragments extracted from the source code, typically complete syntactic units (such as function bodies or class definitions) containing defect locations. Control flow graph (CFG) is a graph representing the program execution path, describing the control flow in the program (such as branches, loops, etc.). Data flow graph (DFG) is a graph representing the definition, reference, and dependencies of variables and data in the program. This processed information forms a unified structured data set, facilitating input into the large language model for further analysis and processing.
[0039] c) Prompt generation sub-step: Embed the analysis task description and expected output format into the structured representation to form the final analysis prompt for the large language model, so as to guide the model to make accurate false alarm judgments and defect classifications based on the provided multi-source information.
[0040] Preferably, the specific implementation of step S4 includes: a) Model inference sub-step: Input the analysis prompt generated in step S3 into a large language model that supports structured text input, so that the model reasons over the source code context, the structured control flow graph representation, and the structured data flow graph representation in the prompt.
[0041] b) Result Generation Sub-step: The large language model generates a structured response according to the expected output format defined in the prompt. The structured response includes at least the false alarm judgment result of the alarm record and the corrected defect classification label.
[0042] c) Result parsing sub-step: Parse the structured response, map the judgment results and defect classifications back to the corresponding original alarm records, and generate a structured judgment dataset that can be used for subsequent automatic filtering.
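The result parsing sub-step can be sketched as a small validator that accepts a model reply only when it is well-formed JSON with the expected fields and types. This is a minimal sketch under stated assumptions: the field names follow the output format given in the embodiment's prompt template, while the function name `parse_response` and the choice to return `None` for malformed replies are illustrative and not prescribed by the method.

```python
import json

# Field names follow the output format in the prompt template; types are assumed.
EXPECTED_TYPES = {
    "is_false_positive": bool,
    "confidence": (int, float),
    "defect_class": str,
    "explanation": str,
}

def parse_response(text):
    """Parse a model reply into a verdict dict, or return None if it is malformed."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for name, expected in EXPECTED_TYPES.items():
        value = data.get(name)
        if not isinstance(value, expected):
            return None
        if expected is not bool and isinstance(value, bool):
            return None  # bool is a subclass of int, so reject it for numeric fields
    if not 0.0 <= data["confidence"] <= 1.0:
        return None
    return {name: data[name] for name in EXPECTED_TYPES}

reply = ('{"is_false_positive": true, "confidence": 0.95, '
         '"defect_class": "None", "explanation": "The use is guarded by a null check."}')
print(parse_response(reply))
print(parse_response("Sorry, I cannot answer in JSON."))
```

Rejecting malformed replies at this point keeps the later filtering step simple, because every verdict it receives is guaranteed to carry a Boolean judgment and a label.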
[0043] Preferably, the specific implementation of step S5 includes: a) Analysis and decision sub-step: Align the structured judgment dataset generated in step S4 with the original alarm records, and apply the following decision rules to each alarm record:
i. If the false alarm judgment result is true (i.e., the alarm is determined to be a false alarm), the alarm record is marked as "to be removed";
ii. If the false alarm judgment result is false (i.e., the alarm is determined to be a real defect), the alarm record is marked as "to be retained", and its defect type information may optionally be updated based on the defect classification label generated by the large language model.
[0044] b) List generation and output sub-step: Based on the application results of the aforementioned decision rules, remove all records marked as "to be removed" from the original alarm set, and integrate all records marked as "to be retained" (including their updated defect types) to finally generate and output a high-confidence alarm list.
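The two decision rules and the list generation step can be sketched as a single filtering function. The sketch assumes that each alarm is a dictionary with a hypothetical `id` key used to align it with its verdict, and that alarms without a usable verdict are kept for manual review. Both choices are assumptions made for illustration rather than requirements of the method.

```python
def filter_alerts(alerts, verdicts):
    """Apply the decision rules and return the retained alarm list.

    alerts:   list of dicts with at least 'id' and 'type_desc' (assumed layout)
    verdicts: dict mapping alarm id -> verdict dict from the model
    """
    kept = []
    for alert in alerts:
        verdict = verdicts.get(alert["id"])
        if verdict is None:
            kept.append(dict(alert))  # assumption: no verdict means keep for review
            continue
        if verdict["is_false_positive"]:
            continue  # rule i: marked "to be removed"
        retained = dict(alert)  # rule ii: marked "to be retained"
        label = verdict.get("defect_class")
        if label and label != "None":
            retained["type_desc"] = label  # optional defect-type update
        kept.append(retained)
    return kept

alerts = [
    {"id": 1, "type_desc": "Null pointer dereference suspected"},
    {"id": 2, "type_desc": "Buffer overflow suspected"},
    {"id": 3, "type_desc": "Resource leak suspected"},
]
verdicts = {
    1: {"is_false_positive": True, "defect_class": "None"},
    2: {"is_false_positive": False, "defect_class": "BufferOverflow"},
}
kept = filter_alerts(alerts, verdicts)
for alert in kept:
    print(alert)
```

The function copies each retained record instead of mutating the original alarm set, so the raw scan results remain available for auditing.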
[0045] A computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the steps of the static code analysis alarm false alarm filtering method based on a large language model as described above.
[0046] A computer-readable storage medium having a computer program / instructions stored thereon, which, when executed by a processor, implements the steps of the static code analysis alarm false alarm filtering method based on a large language model as described above.
[0047] Beneficial effects: Compared with the prior art, the present invention has the following beneficial effects: (1) It effectively reduces the false alarm rate. The defect code context, control flow graph, and data flow graph are structurally integrated and supplied to the large language model as a prompt, so the model can use both code semantics and execution logic during inference. This reduces the misjudgments that arise from surface syntax analysis alone and lowers the false alarm rate of static analysis alarms.
[0048] (2) It improves judgment accuracy and result consistency. With structured input and structured output, the large language model produces results in a stable format and with strong contextual understanding, which reduces the judgment fluctuation caused by differences in natural language expression and improves the accuracy and consistency of false alarm filtering and defect classification.
[0049] (3) It automatically corrects defect types. While filtering false alarms, the method uses the large language model to adjust the defect type of alarm records, which improves the alarm information and reduces the need for manual verification of defect types.
[0050] (4) It reduces the manual review workload. By automatically producing a high-confidence alarm list, the method reduces manual review effort and saves manpower and time, which is especially useful in continuous integration and delivery environments with large codebases.
[0051] (5) It offers good scalability and adaptability. The method can be used with a variety of static code analysis tools and is compatible with abstract syntax tree parsers, control flow analyzers, and data flow analyzers for different programming languages. The prompt template can also be adjusted to the input requirements of different large language models, which gives the method cross-language and cross-platform adaptability.
[0052] (6) It provides decision interpretability. Using structured inputs (code, CFG, DFG) and model-generated explanatory text (the explanation field), this invention provides not only false alarm judgments but also the basis for them. Developers can therefore understand the logic behind each filtering decision, which increases their trust in the system and helps them repair real defects. Traditional black-box machine learning methods struggle to match this.
Attached Figure Description
[0053] Figure 1 is a schematic flowchart of the method of the present invention; Figure 2 is a schematic diagram of the process of extracting and constructing the extended context for the large language model in an embodiment of the present invention.
Detailed Implementation
[0054] The present invention will be further illustrated below with reference to specific embodiments. It should be understood that these embodiments are for illustrative purposes only and are not intended to limit the scope of the invention. After reading the present invention, any modifications of the present invention in various equivalent forms by those skilled in the art will fall within the scope defined by the appended claims.
[0055] This embodiment provides a method for filtering false alarms in static code analysis based on a large language model. As shown in Figure 1, it includes the following steps:
Step S1: Obtain the original alarm information generated by the static code analysis tool after scanning the target source code. The alarm information should include at least the defect location and alarm type.
[0056] Step S2: Based on the defect location in Step S1, parse the abstract syntax tree of the corresponding source code file, and automatically extract the complete syntax unit covering the defect location as the source code context fragment. At the same time, construct the control flow graph and data flow graph of the scope of the defect code.
[0057] Step S3: Combine the alarm information obtained in step S1 with the code context fragments, control flow graphs and data flow graphs extracted in step S2 in a structured manner to form an information-rich analysis prompt for a large language model.
[0058] Step S4: Input the analysis prompts generated in step S3 into the large language model and obtain the structured response generated by the model, which includes false alarm judgment results and defect classification.
[0059] Step S5: Based on the structured response results in step S4, the original alarm information is automatically filtered, and a high-confidence alarm list is finally output.
[0060] In this embodiment, the specific implementation of step S1 includes: a) Scanning sub-step: Call the static code analysis tool to scan the target source code and generate scan results containing one or more alarm records.
[0061] b) Extraction sub-step: Extract key fields from each alarm record in the scan results, including the source code file path, the line number where the defect is located, the Common Weakness Enumeration (CWE) identifier, and the defect type description.
[0062] c) Parsing sub-step: Parse the extracted information and organize it into structured data objects so that subsequent steps can quickly retrieve the corresponding source code content based on the defect location information.
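The extraction and parsing sub-steps of step S1 can be sketched as a conversion from raw records to typed objects. The sketch assumes the analysis tool's output has already been loaded as a list of dictionaries. The key names mirror the `alert_info` fields used in the embodiment's prompt template, and the class name `Alert` and function name `parse_alerts` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Alert:
    file_path: str    # source code file path
    defect_line: int  # line number where the defect is located
    cwe_id: str       # Common Weakness Enumeration identifier
    type_desc: str    # defect type description

def parse_alerts(raw_records):
    """Convert raw alarm records (dicts) into Alert objects, skipping incomplete ones."""
    alerts = []
    for record in raw_records:
        try:
            alerts.append(Alert(
                file_path=str(record["file_path"]),
                defect_line=int(record["defect_line"]),
                cwe_id=str(record["cwe_id"]),
                type_desc=str(record.get("type_desc", "")),
            ))
        except (KeyError, TypeError, ValueError):
            # a record without a usable location cannot be mapped to an AST node
            continue
    return alerts

raw = [
    {"file_path": "example.py", "defect_line": "10", "cwe_id": "CWE-476",
     "type_desc": "Null pointer dereference suspected"},
    {"file_path": "broken.py"},  # missing line number: dropped
]
alerts = parse_alerts(raw)
print(alerts)
```

Normalizing the line number to an integer at this stage lets step S2 compare it directly against the line ranges of syntax nodes.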
[0063] In this embodiment, the specific implementation of step S2, as shown in Figure 2, includes: a) Parsing sub-step: Based on the source code file path and the defect line number obtained in step S1, call an abstract syntax tree parser (e.g., Tree-Sitter) to parse the source code file, locate the syntax node containing the defect line number, and determine the scope of the syntax unit to which it belongs, which is usually a function body or class definition.
[0064] b) Source code extraction sub-step: Under the condition of not exceeding the preset character limit (e.g., 2000 characters), extract the source code context fragment containing the defect location from the syntactic unit range, and prioritize retaining the complete function or class definition.
[0065] c) Control Flow Analysis Sub-step: Construct a control flow graph within a syntactic unit. This process supports cross-function control path tracing within the same source file. Generate the corresponding control flow graph structure to represent the execution order, conditional branches, loops, and relationships between cross-function calls.
[0066] d) Data Flow Analysis Sub-step: Within a syntax unit, the definition and usage relationships of variables, parameters, and return values related to the defective code are traced. This process supports cross-function data dependency tracing within the same source file. A corresponding data flow graph structure is generated to represent the definition, reference, and dependency relationships of variables and data in the context.
[0067] e) Fusion sub-step: Integrate the information obtained through source code extraction, control flow analysis, and data flow analysis to generate an extended context containing source code context fragments, control flow graph structures, and data flow graph structures, which will serve as input for subsequent steps.
[0068] The following is a preferred Python implementation pseudocode for step S2. This pseudocode is only for aiding understanding of the implementation process and is not intended to limit the actual implementation of this invention.
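[0069] (1) Pseudocode for obtaining the defective code fragment from the abstract syntax tree (extracting a semantically complete fragment)

```python
from tree_sitter_languages import get_parser  # prebuilt tree-sitter grammars

def find_node_at_line(node, defect_line):
    # Descend to the deepest node covering the 1-based line (tree-sitter rows are 0-based).
    row = defect_line - 1
    for child in node.children:
        if child.start_point[0] <= row <= child.end_point[0]:
            return find_node_at_line(child, defect_line)
    return node

def find_enclosing_function(node):
    # Walk up to the nearest enclosing function or class definition.
    while node is not None and node.type not in ("function_definition", "class_definition"):
        node = node.parent
    return node

def extract_ast_chunk(file_path, defect_line, max_chars=2000):
    parser = get_parser("python")  # target language
    with open(file_path, "rb") as f:
        source = f.read()  # bytes, because tree-sitter offsets are byte offsets
    tree = parser.parse(source)
    # Locate the defect node and its enclosing function / class.
    defect_node = find_node_at_line(tree.root_node, defect_line)
    func_node = find_enclosing_function(defect_node)
    if func_node is None:
        return None, None
    # Extract CodeSnippet (using formula 1); the limit is applied to bytes here.
    chunk_start, chunk_end = func_node.start_byte, func_node.end_byte
    if chunk_end - chunk_start > max_chars:
        # The truncation logic can be refined according to requirements.
        chunk_end = chunk_start + max_chars
    source_chunk = source[chunk_start:chunk_end].decode("utf-8", errors="replace")
    return source_chunk, func_node
```

This pseudocode uses tree-sitter to locate the syntax unit, so that the CodeSnippet covers the complete function body whenever it fits within the character limit.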
[0070] (2) Control flow graph construction pseudocode

```python
import networkx as nx  # graph library

# is_basic_block, is_control and is_call are language-specific predicates (not shown).
# resolve_callee is a hypothetical helper that returns the AST of a callee defined
# in the same source file, or None.
def build_cfg(ast_node, visiting=frozenset()):
    visiting = visiting | {ast_node.start_byte}  # guards against recursive call chains
    cfg = nx.DiGraph()  # directed graph

    def add_block(node, label, pred_block):
        block_id = (node.start_byte, node.end_byte)  # stays unique when graphs are merged
        cfg.add_node(block_id, label=label)
        if pred_block is not None:
            cfg.add_edge(pred_block, block_id)
        return block_id

    def traverse(node, pred_block=None):
        nonlocal cfg
        if is_basic_block(node):  # statement sequence
            return add_block(node, node.text, pred_block)
        if is_control(node):  # branch / loop / call (simplified representation)
            control_id = add_block(node, node.type, pred_block)
            if is_call(node):  # cross-function: recursively build the callee CFG
                callee_ast = resolve_callee(node)
                if callee_ast is not None and callee_ast.start_byte not in visiting:
                    callee_cfg = build_cfg(callee_ast, visiting)
                    cfg = nx.compose(cfg, callee_cfg)  # merge graphs
                    for entry, degree in callee_cfg.in_degree():
                        if degree == 0:  # link the call site to the callee entry
                            cfg.add_edge(control_id, entry)
            for child in node.children:
                traverse(child, control_id)
            return control_id
        for child in node.children:  # other nodes: chain children in order
            pred_block = traverse(child, pred_block)
        return pred_block

    traverse(ast_node)
    return cfg
```

(3) Pseudocode for data flow graph construction

```python
import networkx as nx

# is_def, is_use, is_call, find_predecessors and track_call_params are
# language-specific helpers (not shown).
def build_dfg(ast_node, variables):
    dfg = nx.DiGraph()

    def track_flow(node, var):
        if is_def(node, var):  # definition point
            dfg.add_node(node.start_point, type='def', var=var)
        elif is_use(node, var):  # use point
            for pred in find_predecessors(node, var):  # trace back to definitions
                dfg.add_edge(pred.start_point, node.start_point)  # dependency edge
        if is_call(node):  # cross-function data flow via parameters / return values
            dfg.add_edges_from(track_call_params(node, var))
        for child in node.children:  # visit the whole subtree
            track_flow(child, var)

    for var in variables:  # defect-related variables
        track_flow(ast_node, var)
    return dfg
```

In this embodiment, the specific implementation of step S3 includes: a) Information integration sub-step: Map and match the alarm information obtained in step S1 with the extended context obtained in step S2, and establish the association between the alarm location and the code node, control flow node and data flow node in the context segment according to the preset fields.
[0071] b) Format Encapsulation Sub-step: Based on the prompt template adapted to the large language model input, the integrated information is converted into a unified structured representation. The structured representation includes text description fields, source code fields, control flow graph fields, and data flow graph fields for describing alarms.
[0072] c) Prompt generation sub-step: Embed the task instructions and the expected output format into the structured representation to form the final analysis prompt for the large language model, guiding the model to make false alarm judgments and defect classifications based on the provided multi-source information. The analysis prompt includes a clear description of the analysis task, the original alarm information output by step S1, the extended context output by step S2, and an output schema that specifies the model's return format. The output schema defines the fields the response should contain and their data types, such as the false alarm judgment (Boolean), the confidence level (floating-point number), the predicted defect classification (string), and the explanation (string).
[0073] The following is a preferred prompt template for step S3 (JSON-like format, easy for large language models to parse):

    {
      "task": "Analyze the following static analysis alert for false positive and classify the defect.",
      "alert_info": {
        "file_path": "example.py",
        "defect_line": 10,
        "cwe_id": "CWE-476",
        "type_desc": "Null pointer dereference suspected"
      },
      "code_chunk": "def risky_func(x):\n    if x is not None and cond(x):\n        return x.method()  # Potential null deref at line 10\n    return None\n\nclass Helper:\n    def method(self):\n        pass",
      "cfg": {
        "description": "Control Flow Graph: Entry -> Block1 (if check) -> Block2 (x.method call) -> Exit. Another path from Block1 to Exit.",
        "nodes": ["Entry", "Block1 (if x is not None)", "Block2 (return x.method())", "Exit"],
        "edges": [["Entry", "Block1"], ["Block1", "Block2"], ["Block2", "Exit"], ["Block1", "Exit"]]
      },
      "dfg": {
        "description": "Data Flow Graph: Parameter 'x' is used at line 10. The use is guarded by a null check.",
        "nodes": ["def(x)", "use(x at line 10)"],
        "edges": [["def(x)", "use(x at line 10)"]]
      },
      "instructions": "Based on the provided code, Control Flow Graph (CFG), and Data Flow Graph (DFG), determine if the alert is a false positive. If it is not a false positive, classify the defect type. Provide your response in the specified JSON format.",
      "output_format": {
        "is_false_positive": "boolean",
        "confidence": "float (0.0-1.0)",
        "defect_class": "string (e.g., 'NullDeref', 'BufferOverflow', 'None')",
        "explanation": "string"
      }
    }

This template embeds multi-source information to guide the large language model in generating a structured response (such as {"is_false_positive": true, "confidence": 0.95, ...}), which facilitates parsing.
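As an illustration of how sub-steps a) to c) of step S3 might be carried out in practice, the following is a minimal sketch that assembles the template fields into a single JSON prompt. The function names `graph_to_field` and `build_analysis_prompt`, and the shape of the `alert` dictionary, are assumptions made for this example and are not specified by the embodiment.

```python
import json

# Output schema copied from the template in this embodiment.
OUTPUT_FORMAT = {
    "is_false_positive": "boolean",
    "confidence": "float (0.0-1.0)",
    "defect_class": "string (e.g., 'NullDeref', 'BufferOverflow', 'None')",
    "explanation": "string",
}


def graph_to_field(description, nodes, edges):
    """Serialize a graph as the description/nodes/edges field of the template.

    Hypothetical helper: rejects edges whose endpoints are not listed as nodes.
    """
    known = set(nodes)
    for src, dst in edges:
        if src not in known or dst not in known:
            raise ValueError(f"edge ({src!r}, {dst!r}) references an unknown node")
    return {
        "description": description,
        "nodes": list(nodes),
        "edges": [[src, dst] for src, dst in edges],
    }


def build_analysis_prompt(alert, code_chunk, cfg_field, dfg_field):
    """Combine alarm information, source context, CFG and DFG into one prompt string."""
    prompt = {
        "task": "Analyze the following static analysis alert for false positive "
                "and classify the defect.",
        "alert_info": {
            "file_path": alert["file_path"],
            "defect_line": alert["defect_line"],
            "cwe_id": alert["cwe_id"],
            "type_desc": alert["type_desc"],
        },
        "code_chunk": code_chunk,
        "cfg": cfg_field,
        "dfg": dfg_field,
        "instructions": "Based on the provided code, Control Flow Graph (CFG), and "
                        "Data Flow Graph (DFG), determine if the alert is a false "
                        "positive. If it is not a false positive, classify the "
                        "defect type. Provide your response in the specified JSON format.",
        "output_format": OUTPUT_FORMAT,
    }
    return json.dumps(prompt, indent=2, ensure_ascii=False)
```

Serializing with `json.dumps` escapes newlines and quotes in the source fragment, so the prompt stays valid JSON regardless of the code it contains.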
[0074] In this embodiment, the specific implementation of step S4 includes:
a) Model inference sub-step: The analysis prompt generated in step S3 is input into a large language model that supports structured text input, triggering the model to perform inference based on the source code context, the structured representation of the control flow graph, and the structured representation of the data flow graph contained in the prompt. This embodiment also includes retry logic (e.g., retrying up to 3 times) to handle temporary network or model interface exceptions.
[0075] b) Result generation sub-step: The large language model generates a structured response according to the expected output format defined in the prompt.
[0076] c) Result parsing sub-step: Parse the structured response, map the judgment results and defect classifications back to the corresponding original alarm records, and generate a structured judgment dataset that can be used for subsequent automatic filtering.
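The inference, retry, and parsing sub-steps of step S4 can be sketched as follows. This is only an illustrative sketch: `call_model` is a hypothetical callable standing in for whatever model interface is used, the reading of "up to 3 times" as three attempts in total is an assumption, and retrying on a malformed response (in addition to network errors) is an added assumption rather than something the embodiment states.

```python
import json
import time

# Fields and types expected in the structured response, per the output format of step S3.
REQUIRED_FIELDS = {
    "is_false_positive": bool,
    "confidence": (int, float),
    "defect_class": str,
    "explanation": str,
}


def parse_response(raw_text):
    """Parse the model's JSON response and check field presence, types and ranges."""
    data = json.loads(raw_text)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("response is not a JSON object")
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], expected_type):
            raise ValueError(f"field {name} has an unexpected type")
    confidence = data["confidence"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(confidence, bool) or not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be a number between 0.0 and 1.0")
    return data


def judge_alert(prompt, call_model, max_attempts=3, delay_seconds=1.0):
    """Call the model (hypothetical `call_model`) and retry on transient failures."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return parse_response(call_model(prompt))
        except (ConnectionError, TimeoutError, ValueError) as exc:
            last_error = exc
            if attempt < max_attempts:
                time.sleep(delay_seconds)  # brief pause before the next attempt
    raise RuntimeError(f"no valid response after {max_attempts} attempts") from last_error
```

Validating the response before it is mapped back to the alarm record keeps malformed model output from reaching the filtering rules of step S5.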
[0077] In this embodiment, the specific implementation of step S5 includes:
a) Analysis and decision sub-step: Align the structured judgment dataset generated in step S4 with the original alarm records, and apply the following decision rules to each alarm record:
i. If the false alarm judgment result is true (i.e., the alarm is determined to be a false alarm), mark the alarm record as "to be removed";
ii. If the false alarm judgment result is false (i.e., the alarm is determined to be a real defect), mark the alarm record as "to be retained"; the defect type information of the alarm record may optionally be updated based on the defect classification label generated by the large language model.
[0078] b) List generation and output sub-step: Based on the application results of the aforementioned decision rules, remove all records marked as "to be removed" from the original alarm set, and integrate all records marked as "to be retained" (including their updated defect types) to finally generate and output a high-confidence alarm list.
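The two decision rules and the list generation of step S5 can be illustrated with a short sketch. The record layout (an `id` key on each alarm and a dictionary of judgments keyed by that id) and the function name `filter_alerts` are assumptions for illustration; keeping alarms that have no judgment is likewise an assumed conservative choice, not something the embodiment prescribes.

```python
def filter_alerts(alerts, judgments):
    """Apply rules i and ii and return the retained alarm records.

    alerts:    list of dicts, each with a hypothetical unique "id" key
    judgments: dict mapping alarm id -> parsed structured response
    """
    retained = []
    for alert in alerts:
        judgment = judgments.get(alert["id"])
        if judgment is None:
            # Assumption: an alarm without a judgment is kept rather than dropped.
            retained.append(dict(alert))
            continue
        if judgment["is_false_positive"]:
            continue  # rule i: marked "to be removed"
        kept = dict(alert)  # rule ii: marked "to be retained"
        defect_class = judgment.get("defect_class")
        if defect_class and defect_class != "None":
            kept["defect_class"] = defect_class  # optional defect type update
        retained.append(kept)
    return retained
```

Copying each record with `dict(alert)` leaves the original alarm set unchanged, so the raw scan results remain available for auditing.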
[0079] Here, "high confidence alarm" has a clear determination criterion. In the context of this invention, "high confidence" refers to an alarm record that is explicitly determined by the large language model to be a non-false alarm. Specifically, when the field representing false alarm judgment (e.g., is_false_positive) in the structured response obtained from the large language model has a value of false, the alarm is identified as a high confidence alarm.
[0080] In a preferred embodiment, the determination of high confidence can also be combined with the confidence score output by the model. For example, the structured response in step S4 includes a confidence field (e.g., confidence, with a value range of 0-1). In this case, the screening criterion for high-confidence alarms can be further defined as follows: when the is_false_positive field is false and the confidence field value is higher than a preset threshold (e.g., 0.8), the alarm is finally identified as a high-confidence alarm and output to the list. This method can further filter out alarms with uncertain model judgment results, thereby ensuring the reliability of the final output list.
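The threshold-based criterion of this preferred embodiment can be sketched as below. The function name `select_high_confidence` and the separate `uncertain` list are assumptions for illustration; the embodiment only states that alarms with uncertain judgments are filtered out of the output list, and does not say what happens to them afterwards.

```python
def select_high_confidence(judged_alerts, threshold=0.8):
    """Split non-false-alarm records by the model's confidence score.

    judged_alerts: list of dicts carrying "is_false_positive" and "confidence"
    Returns (high_confidence, uncertain).
    """
    high_confidence, uncertain = [], []
    for alert in judged_alerts:
        if alert["is_false_positive"]:
            continue  # already removed by the basic decision rule
        if alert["confidence"] > threshold:  # "higher than" the preset threshold
            high_confidence.append(alert)
        else:
            uncertain.append(alert)  # assumption: kept aside, e.g. for manual review
    return high_confidence, uncertain
```

With the default threshold of 0.8, a record judged as a real defect with confidence exactly 0.8 falls into the uncertain group, because the comparison is strict.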
[0081] As demonstrated by the examples above, the method of this invention, through multi-dimensional and in-depth analysis of the code and the construction of structured prompts for interaction with a large language model, can accurately determine the authenticity of static analysis alerts. This method leverages the powerful code understanding capabilities of the large language model to provide a universal and efficient solution for alert filtering, thereby significantly reducing false positive rates and improving the efficiency of software development and security audits.
[0082] Obviously, those skilled in the art should understand that the steps of the static code analysis alarm false alarm filtering method based on a large language model according to the above embodiments of the present invention can be implemented using general-purpose computing devices. They can be centralized on a single computing device or distributed across a network of multiple computing devices. Optionally, they can be implemented using computer-executable program code, thereby storing them in a storage device for execution by the computing device. Furthermore, in some cases, the steps shown or described can be performed in a different order than presented here, or they can be fabricated as separate integrated circuit modules, or multiple modules or steps can be fabricated as a single integrated circuit module. Thus, the embodiments of the present invention are not limited to any particular hardware and software combination.
Claims
1. A method for filtering false alarms in static code analysis based on a large language model, characterized in that it includes the following steps:
Step S1: Obtain alarm information generated by the static code analysis tool, wherein the alarm information includes at least the defect location and alarm type;
Step S2: Based on the abstract syntax tree of the source code parsed at the defect location, automatically extract, within a preset character limit, the outermost complete syntax unit covering the defect location as the source code context fragment, and further, within this syntax unit:
a) construct a control flow graph that can represent execution path relationships, including cross-function calls;
b) construct a data flow graph that can represent variable definitions and reference relationships, including cross-function dependencies;
Step S3: Combine the alarm information described in step S1 with the source code context fragment extracted in step S2 and the constructed control flow graph and data flow graph in a structured manner to form an analysis prompt for the large language model;
Step S4: Input the analysis prompt generated in step S3 into any large language model that supports structured text input and structured output to obtain a structured response that includes the false alarm judgment result and the defect classification;
Step S5: Based on the structured response described in step S4, automatically filter the original alarms and output a list of high-confidence alarms.
2. The method for filtering false alarms in static code analysis based on a large language model according to claim 1, characterized in that the process of obtaining alarm information in step S1 includes:
1a) Scanning sub-step: Call the static code analysis tool to scan the target source code and generate scan results containing one or more alarm records;
1b) Extraction sub-step: From the scan results, extract the key fields of each alarm record, including the source code file path, the line number where the defect is located, the Common Weakness Enumeration (CWE) identifier, and the defect type description;
1c) Parsing sub-step: Parse the extracted field information and organize it into a structured data object so that subsequent steps can quickly retrieve the corresponding source code content based on the defect location information.
3. The method for filtering false alarms in static code analysis based on a large language model according to claim 1, characterized in that, in step S2, the process of extracting the source code context fragment based on the abstract syntax tree and constructing the control flow graph and data flow graph includes:
2a) Parsing sub-step: Based on the defect location obtained in step S1, namely the source code file path and the line number where the defect is located, call an abstract syntax tree parser that supports the target programming language to parse the source code file, locate the syntax node containing that line number, and determine the scope of the syntax unit to which it belongs, the syntax unit being a function body or a class definition;
2b) Source code extraction sub-step: Without exceeding the preset character limit, extract the source code context fragment containing the defect location from the scope of the syntax unit, giving priority to ensuring that it constitutes a complete function or class definition;
2c) Control flow analysis sub-step: Construct a control flow graph within the syntax unit, supporting the tracing of control paths across function calls within the same source file scope, and generate the corresponding control flow graph structure to represent the execution order, conditional branches, loop structures, and cross-function calls of the code;
2d) Data flow analysis sub-step: Within the syntax unit, trace the definition and use relationships of variables related to the defective code, supporting data dependency tracing across function calls within the same source file scope, and generate the corresponding data flow graph structure to represent the definition, reference, and dependency relationships of variables and data in the context fragment;
2e) Fusion sub-step: Integrate the information obtained through source code extraction, control flow analysis, and data flow analysis to generate an extended context containing the source code context fragment, the control flow graph structure, and the data flow graph structure, as input for subsequent steps.
4. The method for filtering false alarms in static code analysis based on a large language model according to claim 1, characterized in that, in step S3, the process of structurally combining the alarm information with the source code context fragment, the control flow graph, and the data flow graph includes:
3a) Information integration sub-step: Map and match the alarm information obtained in step S1 with the extended context obtained in step S2, and establish the association between the alarm location and the code nodes, control flow nodes, and data flow nodes in the context fragment;
3b) Format encapsulation sub-step: According to the preset prompt template, convert the integrated information into a unified structured representation, the structured representation including text description fields for describing alarms, source code fields, control flow graph fields, and data flow graph fields;
3c) Prompt generation sub-step: Embed the analysis task description and the expected output format into the structured representation to form the analysis prompt for the large language model, so that, after receiving the prompt, the model can make false alarm judgments and defect classifications based on the provided multi-source information.
5. The method for filtering false alarms in static code analysis based on a large language model according to claim 1, characterized in that, in step S4, the process of inputting the analysis prompt into the large language model and obtaining the structured response includes:
4a) Model inference sub-step: Input the analysis prompt generated in step S3 into a pre-trained large language model that supports structured text input, triggering the model to perform inference based on the source code context, the structured representation of the control flow graph, and the structured representation of the data flow graph in the prompt;
4b) Result generation sub-step: The large language model generates a structured response according to the expected output format defined in the prompt, the structured response including at least the false alarm judgment result of the alarm record and the defect classification label;
4c) Result parsing sub-step: Parse the structured response, map the judgment results and defect classifications back to the corresponding original alarm records, and generate a structured judgment dataset that can be used for subsequent automatic filtering.
6. The method for filtering false alarms in static code analysis based on a large language model according to claim 1, characterized in that, in step S5, the process of automatically filtering the original alarms based on the structured response includes:
5a) Analysis and decision sub-step: Align the structured judgment dataset generated in step S4 with the original alarm records, and apply the following decision rules to each alarm record:
i. If the false alarm judgment result is true, mark the alarm record as "to be removed";
ii. If the false alarm judgment result is false, mark the alarm record as "to be retained", and update the defect type information of the alarm record based on the defect classification label generated by the large language model;
5b) List generation and output sub-step: Based on the results of applying the aforementioned decision rules, remove all records marked as "to be removed" from the original alarm set, and integrate all records marked as "to be retained" to finally generate and output a high-confidence alarm list.
7. A computer device, characterized in that the computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, it implements the steps of the method for filtering false alarms in static code analysis based on a large language model as described in any one of claims 1-6.
8. A computer-readable storage medium having a computer program/instructions stored thereon, characterized in that, when the computer program/instructions are executed by a processor, the steps of the method for filtering false alarms in static code analysis based on a large language model as described in any one of claims 1-6 are implemented.