A method, apparatus, computer equipment, and storage medium for detecting code vulnerabilities.
By acquiring vulnerability and path information from multiple test data sets, selecting target test data to generate test cases, and performing detection in a simulation environment, the problem of low efficiency in traditional code vulnerability detection is solved, achieving efficient and accurate vulnerability detection and remediation.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: INSPUR SUZHOU INTELLIGENT TECH CO LTD
- Filing Date: 2024-12-20
- Publication Date: 2026-06-30
AI Technical Summary
Traditional code vulnerability detection relies on manual review, which is inefficient, highly susceptible to subjective factors, and difficult to adapt to the needs of large-scale projects.
By acquiring vulnerability and path information from multiple test data sets, target test data is selected according to preset rules, test cases are generated, and tests are conducted in a simulation testing environment. Vulnerabilities are then patched by combining a general vulnerability scoring system and network vulnerability scanning tools.
It significantly improves the accuracy, efficiency, and comprehensiveness of code vulnerability detection, reduces the impact of environmental differences and external factors, and improves the efficiency and accuracy of vulnerability remediation.
Description
Technical Field
[0001] This invention relates to the field of vulnerability detection technology, specifically to a method, apparatus, computer device, and storage medium for detecting code vulnerabilities.
Background Technology
[0002] Currently, traditional code vulnerability detection relies on manual review, resulting in low efficiency, significant susceptibility to subjective factors, and difficulty in scaling, leading to limitations and challenges when handling large-scale projects. As project size increases and development cycles shorten, traditional manual review methods are becoming increasingly inadequate for the demands of modern software development. Therefore, improving the efficiency and quality of code vulnerability detection is an urgent problem to be solved.
Summary of the Invention
[0003] In view of this, the present invention provides a method, apparatus, computer device, and storage medium for detecting code vulnerabilities, in order to solve the problem of how to improve the efficiency and quality of code vulnerability detection.
[0004] In a first aspect, the present invention provides a method for detecting code vulnerabilities, which is applied to perform vulnerability testing on code to be tested. The method includes:
[0005] Obtain vulnerability information and / or path information during the process of testing the above-mentioned code under test using multiple test data, with one test data corresponding to one vulnerability information and / or one path information;
[0006] Based on each of the above-mentioned vulnerability information and / or each of the above-mentioned path information, at least one target test data is selected from multiple test data according to a preset rule, wherein the above-mentioned vulnerability information and the above-mentioned path information corresponding to the at least one target test data conform to the above-mentioned preset rule.
[0007] At least one test case is generated based on the target test data and historical vulnerability data, wherein the historical code corresponding to the historical vulnerability data is of the same type as the code to be tested.
[0008] The code under test is tested using at least one of the above test cases to obtain test results, which include vulnerability data of the code under test.
[0009] This invention provides a method for detecting code vulnerabilities. It acquires vulnerability information and / or path information during the testing process of the code under test using multiple test data sets. Based on each vulnerability and / or path information set, it selects at least one target test data set that conforms to the preset rules from the multiple test data sets. Based on the target test data set and historical vulnerability data of the same type as the code under test, it generates at least one test case. The code under test is then tested using at least one test case, yielding test results including the vulnerability data of the target code under test. By combining multiple test data sets and historical vulnerability information, and using preset rules for filtering and test case generation, the method tests the code under test to obtain vulnerability data. This significantly improves the accuracy, efficiency, and comprehensiveness of code vulnerability detection, thereby achieving the technical effect of improving the efficiency and quality of code vulnerability detection.
[0010] In one optional implementation, the above-mentioned selection of at least one target test data from multiple test data based on each of the aforementioned vulnerability information, each of the aforementioned path information, and preset rules includes:
[0011] Based on each of the above vulnerability information and / or each of the above path information, determine the fitness parameter for each of the above test data. The fitness parameter includes at least one of the following parameters: test speed parameter, test richness parameter, and test depth parameter.
[0012] Based on each of the above fitness parameters and preset weights, determine the fitness value of each of the above test data, with one fitness value corresponding to one test data;
[0013] Compare each of the above fitness values with the fitness threshold, and count all target fitness values that are greater than the above fitness threshold. Based on the relationship between each target fitness value and the test data, determine at least one of the above target test data.
[0014] The method provided in this embodiment analyzes the fitness of test data based on vulnerability information, path information, and preset rules, enabling more accurate selection of suitable test data. Quantitative evaluation of test data using fitness parameters efficiently filters out the most representative target test data. Optimization is performed based on the fitness parameters of the test data, combined with preset weights, ensuring that the specific needs of different testing objectives are met. This avoids interference from redundant test data, thereby improving testing efficiency and quality.
[0015] In one optional implementation, the vulnerability information includes: vulnerability trigger time and number of vulnerability categories; the path information includes: test path length and test path depth.
[0016] Based on each of the aforementioned vulnerability information and / or each of the aforementioned path information, the fitness parameters for each of the aforementioned test data are determined, including:
[0017] Select the minimum vulnerability trigger time from the vulnerability trigger times of the above multiple test data; and determine the test speed parameter for each of the above test data based on each vulnerability trigger time and the minimum vulnerability trigger time.
[0018] Alternatively, based on the number of vulnerability categories in each of the above test data, determine the total number of vulnerability categories in the multiple test data; and based on the number of each of the above vulnerability categories and the total number of the above vulnerability categories, determine the test richness parameter for each of the above test data.
[0019] Alternatively, select the minimum test path length and minimum test path depth from the test path length and test path depth of the above multiple test data; and determine the test depth parameter for each of the above test data based on each of the above test path lengths and each of the above test path depths, as well as the minimum test path length and the minimum test path depth.
[0020] The method provided in this embodiment selects the minimum vulnerability trigger time from multiple test data sets and determines the test speed parameter of the test data based on the difference between the vulnerability trigger time of each test data set and the minimum vulnerability trigger time. This ensures accurate evaluation of test speed, allowing test data with shorter vulnerability trigger times to be prioritized for subsequent vulnerability detection. By comparing the number of vulnerability categories for each test data set with the total number of vulnerability categories for all test data sets, the test richness parameter of the test data can be effectively determined. This ensures broader coverage of vulnerability types and avoids missing potential vulnerabilities. By selecting the minimum test path length and test path depth and comparing them with the path length and depth of each test data set, the test depth parameter of the test data can be accurately evaluated. This helps identify test data that can be used to deeply explore vulnerabilities, ensuring the depth and comprehensiveness of vulnerability detection.
[0021] In one optional implementation, after testing the code under test using at least one test case and obtaining the test results, the method further includes:
[0022] Obtain the test cases corresponding to the above vulnerability data;
[0023] Construct a simulation test environment for the code to be tested.
[0024] In the above simulation test environment, the above test cases are used to test the above code under test, and at least one verified vulnerability is obtained;
[0025] Based on preset classification rules, the at least one verified vulnerability is classified to obtain the type of each verified vulnerability.
[0026] The method provided in this embodiment, by testing the code under test using test cases in a simulated testing environment, can accurately simulate vulnerability triggering conditions in a real-world operating environment. This ensures the accuracy of vulnerability verification. Compared to traditional vulnerability testing directly in a real-world environment, it ensures that vulnerabilities are reproduced more comprehensively and realistically, reducing testing errors caused by environmental differences or external factors. After constructing the simulated testing environment, more controllable testing of the code under test is possible. This ensures that vulnerability verification is not affected by unforeseen factors in the real-world operating environment. This controllability makes vulnerability detection more reliable and avoids omissions caused by uncontrollable factors.
[0027] In one optional implementation, after testing the code under test using the test cases in the above-described simulation testing environment and obtaining at least one verified vulnerability, the method further includes:
[0028] Use a general vulnerability scoring system to determine the priority of each vulnerability verified above;
[0029] Based on the priority of each of the above-verified vulnerabilities, at least one target vulnerability is identified, wherein the priority of the target vulnerability is greater than a preset priority threshold.
[0030] Using network vulnerability scanning tools, generate at least one remediation solution for each of the above-mentioned target vulnerabilities;
[0031] Send at least one of the aforementioned target vulnerabilities and at least one corresponding remediation solution to the management console.
[0032] The method provided in this embodiment utilizes a general vulnerability scoring system to objectively determine the priority of each verification vulnerability. Based on the vulnerability priority, the most critical target vulnerability requiring remediation is selected from multiple verification vulnerabilities. Setting a priority threshold allows for flexible filtering of target vulnerabilities that meet the remediation criteria, avoiding interference from low-priority vulnerabilities and making vulnerability remediation more efficient and targeted. By automatically generating remediation plans using network vulnerability scanning tools, the efficiency and accuracy of vulnerability remediation can be significantly improved. Network vulnerability scanning tools can recommend specific remediation measures based on the type and characteristics of the vulnerability. This automated approach reduces the risk of human error, ensures more accurate remediation plans, and generates processing plans in a short time, avoiding human oversight and delays.
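The priority filtering described above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: it assumes that the priority of a verified vulnerability is taken directly from its CVSS base score (which ranges from 0.0 to 10.0), and the names `VerifiedVulnerability`, `select_target_vulnerabilities`, and the threshold value 7.0 are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VerifiedVulnerability:
    """Hypothetical record for one verified vulnerability."""
    identifier: str
    cvss_base_score: float  # assumed priority: CVSS base score, 0.0 to 10.0


def select_target_vulnerabilities(
    vulnerabilities: List[VerifiedVulnerability],
    priority_threshold: float = 7.0,  # hypothetical preset priority threshold
) -> List[VerifiedVulnerability]:
    """Keep vulnerabilities whose priority is strictly greater than the
    threshold, ordered from highest to lowest priority."""
    chosen = [v for v in vulnerabilities if v.cvss_base_score > priority_threshold]
    return sorted(chosen, key=lambda v: v.cvss_base_score, reverse=True)


verified = [
    VerifiedVulnerability("vuln-a", 5.3),
    VerifiedVulnerability("vuln-b", 9.8),
    VerifiedVulnerability("vuln-c", 7.5),
]
targets = select_target_vulnerabilities(verified)
target_ids = [v.identifier for v in targets]
```

In this sketch only the target vulnerabilities would be forwarded, together with their remediation solutions, to the management console; generation of the remediation solutions themselves is outside the scope of the example.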
[0033] In one optional implementation, the above-mentioned generation of at least one test case based on target test data and historical vulnerability data includes:
[0034] Obtain the behavior logs corresponding to the above historical codes. The behavior logs include information on multiple users' historical operations.
[0035] Determine the target user's historical operation information from the above-mentioned multiple user historical operation information;
[0036] Based on the correspondence between the target user's historical operation information and path information, the target path information corresponding to the target user's historical operation information is determined.
[0037] Based on preset security rules, the target code region is determined from the aforementioned historical code.
[0038] Based on the aforementioned target path information, the aforementioned target code region, and the aforementioned at least one target test data, generate the aforementioned at least one test case.
[0039] The method provided in this embodiment, by combining the target user's historical operation information, path information, and historical code behavior logs, can more accurately determine the focus of testing. By analyzing the target user's historical behavior, potential vulnerabilities can be identified. Through in-depth mining of historical vulnerability data and user operation logs, potential vulnerability areas can be intelligently identified, rather than relying solely on static code review. By combining path information with historical operations, the generated test cases can more accurately simulate vulnerability scenarios that may be encountered in real-world environments. Through historical code behavior logs and user operation information, testing strategies can be dynamically adjusted, and paths can be selected for verification. This makes the testing process not merely a static coverage test, but an adaptive adjustment based on actual conditions, thereby improving the efficiency and accuracy of vulnerability detection.
[0040] In one alternative implementation, before generating at least one test case based on the target test data and historical vulnerability data, the method further includes:
[0041] Collect multiple code samples containing known vulnerabilities from vulnerability databases and assemble them into a code sample set;
[0042] Based on the preset classification rules, the code samples in the above code sample set are classified to obtain the classified code sample set.
[0043] Based on generative adversarial networks, the code sample set for the above classification is expanded to obtain an expanded code sample set;
[0044] Based on the type of the code to be tested, the historical code is searched from the expanded code sample set, and the historical vulnerability data is determined.
[0045] The method provided in this embodiment obtains code samples from multiple known vulnerabilities from a vulnerability database to form a code sample set, ensuring that test case generation is based on more comprehensive and abundant historical vulnerability data. This provides a more sufficient source of vulnerability information for subsequent test case generation, guaranteeing the breadth and depth of testing. By classifying the code sample set, a categorized code sample set can be obtained, enabling more accurate differentiation of different types of vulnerabilities, and thus designing more targeted test cases for different categories of vulnerabilities. This can be based on dimensions such as vulnerability type, code structure, and usage environment, allowing test cases to focus on the specific characteristics of potential vulnerabilities and improving detection efficiency. Based on the type of code to be tested, historical code is searched from the expanded code sample set, and historical vulnerability data is determined. This effectively matches historical vulnerability data with the actual characteristics of the code to be tested, further improving the relevance of test cases.
[0046] In a second aspect, the present invention provides a code vulnerability detection device, which is used to perform vulnerability testing on code to be tested. The device includes:
[0047] The acquisition module is used to acquire vulnerability information and / or path information during the process of testing the above-mentioned code under test using multiple test data. One test data corresponds to one vulnerability information and / or one path information.
[0048] The selection module is used to select at least one target test data from multiple test data according to preset rules based on each of the above-mentioned vulnerability information and / or each of the above-mentioned path information, wherein the above-mentioned vulnerability information and the above-mentioned path information corresponding to the at least one target test data conform to the above-mentioned preset rules.
[0049] The generation module is used to generate at least one test case based on the target test data and historical vulnerability data, wherein the historical code corresponding to the historical vulnerability data is of the same type as the code to be tested.
[0050] The testing module is used to test the code under test using at least one of the above test cases and obtain test results, including vulnerability data of the code under test.
[0051] This invention provides a code vulnerability detection device that selects the minimum vulnerability trigger time from multiple test data sets and determines the test speed parameter of the test data based on the difference between the vulnerability trigger time of each test data set and the minimum vulnerability trigger time. This ensures accurate evaluation of test speed, allowing for the priority selection of test data with shorter vulnerability trigger times for subsequent vulnerability detection. By comparing the number of vulnerability categories in each test data set with the total number of vulnerability categories across all test data sets, the test richness parameter of the test data can be effectively determined. This ensures broader coverage of vulnerability types and avoids missing potential vulnerabilities. By selecting the minimum test path length and test path depth and comparing them with the path length and depth of each test data set, the test depth parameter of the test data can be accurately evaluated. This helps identify test data that can be used to deeply explore vulnerabilities, ensuring the depth and comprehensiveness of vulnerability detection.
[0052] In a third aspect, the present invention provides a computer device, including: a memory and a processor, the memory and the processor being communicatively connected to each other, the memory storing computer instructions, and the processor executing the computer instructions to perform the code vulnerability detection method described in the first aspect or any corresponding embodiment thereof.
[0053] In a fourth aspect, the present invention provides a computer-readable storage medium storing computer instructions for causing a computer to execute the code vulnerability detection method described in the first aspect or any corresponding embodiment thereof.
[0054] In a fifth aspect, the present invention provides a computer program product, including computer instructions, which are used to cause a computer to execute the code vulnerability detection method described in the first aspect or any corresponding embodiment thereof.
Attached Figure Description
[0055] To more clearly illustrate the specific embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the specific embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.
[0056] Figure 1 is a first flowchart of a code vulnerability detection method according to an embodiment of the present invention;
[0057] Figure 2 is a second flowchart of a code vulnerability detection method according to an embodiment of the present invention;
[0058] Figure 3 is a schematic diagram of a code vulnerability detection device according to an embodiment of the present invention;
[0059] Figure 4 is a schematic diagram of the hardware structure of a computer device according to an embodiment of the present invention.
Detailed Implementation
[0060] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0061] According to an embodiment of the present invention, a method for detecting code vulnerabilities is provided. It should be noted that the steps shown in the flowchart in the accompanying drawings can be executed in a computer system such as a set of computer-executable instructions. Furthermore, although a logical order is shown in the flowchart, in some cases, the steps shown or described may be executed in a different order than that shown here.
[0062] This embodiment provides a method for detecting code vulnerabilities, which can be used on computers. Figure 1 is a first flowchart of a code vulnerability detection method according to an embodiment of the present invention. As shown in Figure 1, the method is applied to perform vulnerability testing on the code to be tested, and the method includes:
[0063] Step S101: Obtain vulnerability information and / or path information during the testing of the code under test using multiple test data.
[0064] Each test data corresponds to one piece of vulnerability information and / or one piece of path information. Test data can be randomly generated by a fuzzing tool. The code under test can be a software project, module, or codebase. Vulnerability information can include all vulnerabilities discovered during testing, such as buffer overflows, null pointer dereferences, and resource leaks, or it can be vulnerability pattern information. Path information can be the program's execution path during testing, such as the execution flow under different conditional branches, or a sequence of nodes.
[0065] One way to obtain vulnerability information and / or path information is to test the code under test multiple times, collecting vulnerability and path information for each test. Each test data corresponds to specific vulnerability and path information; vulnerability information refers to code problems discovered during execution, while path information refers to the program's execution path during the test.
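The collection of one record per test data can be sketched as follows. This is an illustration under stated assumptions: the record type `RunRecord`, the helper `collect_records`, and the stand-in function `fake_run` (which plays the role of an instrumented run of the code under test) are all hypothetical names, not part of the described method.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple


@dataclass
class RunRecord:
    """Hypothetical record: what was observed for one piece of test data."""
    test_data: bytes
    vulnerabilities: List[str] = field(default_factory=list)  # problems found
    path: List[str] = field(default_factory=list)             # visited nodes


def collect_records(
    inputs: Iterable[bytes],
    run_once: Callable[[bytes], Tuple[List[str], List[str]]],
) -> List[RunRecord]:
    """Run the code under test once per input and keep one record each."""
    records = []
    for data in inputs:
        vulnerabilities, path = run_once(data)
        records.append(RunRecord(data, list(vulnerabilities), list(path)))
    return records


def fake_run(data: bytes) -> Tuple[List[str], List[str]]:
    """Toy stand-in for an instrumented execution of the code under test."""
    path, vulnerabilities = ["entry"], []
    if len(data) > 4:
        path.append("long_input_branch")
        vulnerabilities.append("buffer_overflow")
    path.append("exit")
    return vulnerabilities, path


records = collect_records([b"ab", b"abcdefgh"], fake_run)
```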
[0066] Step S102: Based on each vulnerability information and / or each path information, select at least one target test data from multiple test data according to preset rules.
[0067] The vulnerability information and path information corresponding to the at least one target test data conform to the preset rules. These preset rules may include criteria such as vulnerability severity, path complexity, or vulnerability type.
[0068] One way to select at least one target test data is to select test data that conforms to preset rules as target test data based on the vulnerability pattern in the vulnerability information and the node sequence in the path information.
[0069] Step S103: Generate at least one test case based on the target test data and historical vulnerability data.
[0070] Historical vulnerability data corresponds to historical code of the same type as the code under test. Test cases can include input data that triggers the vulnerability, a sequence of operations, and expected results. Historical vulnerability data can include vulnerabilities previously discovered in similar code, or vulnerabilities that have similar structure, logic, or functionality to the code under test.
[0071] One method for generating test cases is to generate test cases with corresponding input data, operation sequences, and expected results based on target test data and historical vulnerability data.
[0072] Step S104: Use at least one test case to test the code under test and obtain the test results.
[0073] The test results include vulnerability data for the code under test.
[0074] One way to test code under test is to input the input data from the test cases into the code under test according to the operation sequence, compare the output results with the expected results, and determine the vulnerability data of the code under test as the test result.
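A minimal sketch of this execution step is given below. It assumes, for illustration only, that the code under test is exposed as a single callable, that the operation sequence is a list of inputs applied in order, and that either an unhandled exception or a mismatch with the expected result is recorded as vulnerability data. The names `Case`, `run_case`, and `toy_target` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Case:
    """Hypothetical test case: ordered inputs plus the expected final result."""
    name: str
    operations: List[Any]
    expected: Any


def run_case(case: Case, target: Callable[[Any], Any]) -> Dict[str, Any]:
    """Feed the inputs to the target in order and compare the last output
    with the expected result."""
    actual = None
    try:
        for operation in case.operations:
            actual = target(operation)
    except Exception as exc:  # a crash is treated as a finding in this sketch
        return {"case": case.name, "vulnerable": True,
                "detail": f"crash: {type(exc).__name__}"}
    if actual != case.expected:
        return {"case": case.name, "vulnerable": True,
                "detail": f"expected {case.expected!r}, got {actual!r}"}
    return {"case": case.name, "vulnerable": False, "detail": "ok"}


def toy_target(text: str) -> int:
    """Toy code under test: fails on non-numeric input."""
    return int(text) * 2


results = [
    run_case(Case("numeric", ["1", "2"], 4), toy_target),
    run_case(Case("non_numeric", ["x"], 0), toy_target),
]
```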
[0075] This invention provides a method for detecting code vulnerabilities. It acquires vulnerability information and / or path information during the testing process of the code under test using multiple test data sets. Based on each vulnerability and / or path information set, it selects at least one target test data set that conforms to the preset rules from the multiple test data sets. Based on the target test data set and historical vulnerability data of the same type as the code under test, it generates at least one test case. The code under test is then tested using at least one test case, yielding test results including the vulnerability data of the target code under test. By combining multiple test data sets and historical vulnerability information, and using preset rules for filtering and test case generation, the method tests the code under test to obtain vulnerability data. This significantly improves the accuracy, efficiency, and comprehensiveness of code vulnerability detection, thereby achieving the technical effect of improving the efficiency and quality of code vulnerability detection.
[0076] In one optional implementation, in order to further select the target test data, as shown in Figure 2, which is a second flowchart of a code vulnerability detection method according to an embodiment of the present invention, step S102 includes:
[0077] Step S201: Based on each of the above-mentioned vulnerability information and / or each of the above-mentioned path information, determine the fitness parameter for each of the above-mentioned test data.
[0078] The fitness parameters mentioned above include at least one of the following: test speed, test richness, and test depth. Test speed measures the time or computational resources required to execute the test data. Faster test data typically executes efficiently and is suitable for quickly verifying program stability. Test richness measures the number or types of code paths the test data can reach during execution. Test data covering more code paths means it can more comprehensively detect potential vulnerabilities. Test depth measures the test data's ability to test deeper logic within the code. It relates to whether the test data can delve into more complex logic, branches, or boundary conditions. Deeper test data helps identify deeper potential vulnerabilities.
[0079] One way to determine fitness parameters is to determine test speed and test richness based on the relative value of each test data's vulnerability information within the overall test data, and to determine test depth based on the relative value of its path information within the overall test data.
[0080] Step S202: Determine the fitness value of each of the above-mentioned test data according to each of the fitness parameters and the preset weights.
[0081] One test data point corresponds to one fitness value. The preset weights can be normalized values that are manually set in advance according to the detection requirements or detection scenario.
[0082] One way to determine the fitness value is to sum the products of the test speed parameter, test richness parameter, test depth parameter and their corresponding preset weights, and use the summation result as the fitness value.
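The weighted sum described above can be sketched as follows. The parameter names and the weight values are assumptions chosen for illustration; the text only states that the weights may be normalized values set in advance.

```python
from typing import Dict


def fitness_value(parameters: Dict[str, float], weights: Dict[str, float]) -> float:
    """Sum of each fitness parameter multiplied by its preset weight."""
    return sum(weights[name] * parameters[name] for name in weights)


# Hypothetical normalized weights (they sum to 1) and example parameters.
weights = {"speed": 0.5, "richness": 0.3, "depth": 0.2}
parameters = {"speed": 1.0, "richness": 0.5, "depth": 0.5}

value = fitness_value(parameters, weights)  # approximately 0.75
```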
[0083] Step S203: Compare each of the above fitness values with the fitness threshold, and identify all target fitness values that are greater than the fitness threshold. Based on the correspondence between each target fitness value and the test data, determine at least one target test data.
[0084] The fitness threshold is a pre-set value used to determine which test data has sufficient testing value.
[0085] One way to determine target test data is to compare each fitness value with a fitness threshold, filter out fitness values that are greater than the fitness threshold, determine their corresponding test data, and use the corresponding test data as target test data.
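The threshold comparison can be sketched as follows. The identifiers `select_targets` and `fitness_by_data`, and the threshold value 0.6, are hypothetical; the comparison is strict because the text requires fitness values greater than the threshold.

```python
from typing import Dict, List


def select_targets(fitness_by_data: Dict[str, float], threshold: float) -> List[str]:
    """Return the test data whose fitness value is strictly greater than
    the fitness threshold, preserving the original order."""
    return [data for data, value in fitness_by_data.items() if value > threshold]


fitness_by_data = {"t1": 0.82, "t2": 0.40, "t3": 0.60}
targets = select_targets(fitness_by_data, threshold=0.6)
```

Here `t3` is not selected, since its fitness value equals the threshold rather than exceeding it.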
[0086] The method provided in this embodiment analyzes the fitness of test data based on vulnerability information, path information, and preset rules, enabling more accurate selection of suitable test data. Quantitative evaluation of test data using fitness parameters efficiently filters out the most representative target test data. Optimization is performed based on the fitness parameters of the test data, combined with preset weights, ensuring that the specific needs of different testing objectives are met. This avoids interference from redundant test data, thereby improving testing efficiency and quality.
[0087] In one optional implementation, to further determine the fitness parameters of the test data, vulnerability information includes: vulnerability trigger time and number of vulnerability categories; path information includes: test path length and test path depth.
[0088] Vulnerability trigger time refers to the time required for a vulnerability to be triggered during testing. A shorter time indicates that the vulnerability is more likely to be triggered. Number of vulnerability categories refers to the number of different vulnerability types involved in the execution of test data. Test data that can trigger more types of vulnerabilities indicates broader coverage. Test path length refers to the "length" of the path traversed by the test data during testing, i.e., the number of lines of code involved in the code execution. A longer path length means a larger range of code covered by the test data. Test path depth refers to the depth into the code path, especially whether the test data can access more complex logic, branches, or nested structures. A greater path depth may help uncover deeper potential problems.
[0089] Step S201 includes:
[0090] Select the minimum vulnerability trigger time from multiple test data; and determine the test speed parameter for each test data based on the vulnerability trigger time and the minimum vulnerability trigger time.
[0091] Alternatively, determine the total number of vulnerability categories for multiple test datasets based on the number of vulnerability categories for each test dataset; and determine the test richness parameter for each test dataset based on the number of vulnerability categories for each data set and the total number of vulnerability categories.
[0092] Alternatively, select the minimum test path length and minimum test path depth from the test path lengths and test path depths of multiple test data; and determine the test depth parameter for each test data based on each test path length and each test path depth, as well as the minimum test path length and minimum test path depth.
[0093] In this embodiment, to determine the test speed parameter, the minimum vulnerability trigger time is first selected from the vulnerability trigger times of the multiple test data, i.e., the test data that triggers a vulnerability earliest is identified. For each test data, the ratio between its vulnerability trigger time and the minimum vulnerability trigger time is then calculated and used as the test speed parameter.
[0094] The test richness parameter can be calculated by first determining the total number of vulnerability categories across multiple test datasets, i.e., the number of all different vulnerability types triggered in all test datasets. For each test dataset, the ratio of its vulnerability category count to the total number of vulnerability categories is calculated and used as the test richness parameter.
[0095] To determine the test depth parameter, the minimum path length is selected from the path lengths of all test data, representing the simplest path during execution, and the minimum path depth is selected from the path depths of all test data, representing the shallowest path during execution. For each test data, the sum of its path length and path depth and the sum of the minimum path length and minimum path depth are calculated. The ratio of these two sums is used as the test depth parameter.
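The three ratios described in this embodiment can be sketched as follows. The text does not fix the direction of each ratio, so this sketch assumes orientations in which a larger value is better: minimum time divided by own time for speed, and own sum divided by minimum sum for depth. It also assumes that the total number of vulnerability categories is the number of distinct categories across all test data. `RunRecord` and `fitness_parameters` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """Observations for one test data item (illustrative structure)."""
    trigger_time: float      # time until a vulnerability was triggered, > 0
    categories: set[str]     # vulnerability categories triggered
    path_length: int         # lines of code executed
    path_depth: int          # depth reached in branches / nesting


def fitness_parameters(records: dict[str, RunRecord]) -> dict[str, tuple[float, float, float]]:
    """Return (speed, richness, depth) for each test data item."""
    min_time = min(r.trigger_time for r in records.values())
    all_categories = set().union(*(r.categories for r in records.values()))
    min_len = min(r.path_length for r in records.values())
    min_depth = min(r.path_depth for r in records.values())

    result = {}
    for rid, r in records.items():
        # Assumed orientation: the fastest test data scores 1.0.
        speed = min_time / r.trigger_time
        # Share of all observed categories that this test data triggered.
        richness = len(r.categories) / len(all_categories) if all_categories else 0.0
        # Assumed orientation: the shortest, shallowest path scores 1.0.
        depth = (r.path_length + r.path_depth) / (min_len + min_depth)
        result[rid] = (speed, richness, depth)
    return result


records = {
    "a": RunRecord(2.0, {"overflow", "sqli"}, path_length=40, path_depth=4),
    "b": RunRecord(4.0, {"sqli"}, path_length=20, path_depth=2),
}
params = fitness_parameters(records)
```

In this example, test data "a" triggers a vulnerability twice as fast, covers both observed categories, and traverses a path twice as long and deep as "b", so it scores higher on all three parameters.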
[0096] The method provided in this embodiment selects the minimum vulnerability trigger time from multiple test data and determines the test speed parameter of each test data by comparing its vulnerability trigger time with the minimum vulnerability trigger time. This ensures accurate evaluation of test speed, allowing test data with shorter vulnerability trigger times to be prioritized for subsequent vulnerability detection. By comparing the number of vulnerability categories for each test data with the total number of vulnerability categories for all test data, the test richness parameter of the test data can be effectively determined. This ensures broader coverage of vulnerability types and avoids missing potential vulnerabilities. By selecting the minimum test path length and minimum test path depth and comparing them with the path length and depth of each test data, the test depth parameter of the test data can be accurately evaluated. This helps identify test data that can be used to deeply explore vulnerabilities, ensuring the depth and comprehensiveness of vulnerability detection.
[0097] In an optional implementation, to further reproduce the test results, after step S104, the following step is further included:
[0098] Obtain test cases corresponding to the aforementioned vulnerability data; construct a simulation test environment for the code to be tested; test the code to be tested using the aforementioned test cases in the aforementioned simulation test environment to obtain at least one verified vulnerability; classify the at least one verified vulnerability based on preset classification rules to obtain the type of the at least one verified vulnerability.
[0099] In this embodiment, relevant test cases are extracted based on previous test results. The simulation environment simulates real system operating conditions, including hardware, operating system, library dependencies, network conditions, etc. After setting up the simulation test environment, the acquired test cases, i.e., the input data related to the previously identified vulnerabilities, are used to test the code under test. The test cases are run in the simulation environment to verify whether the code under test will trigger the corresponding vulnerability. The test results will confirm whether the vulnerability is reproduced in the simulation environment. Even in different environments, it will determine whether the vulnerability still exists and whether it can be repeatedly triggered.
[0100] The method provided in this embodiment, by testing the code under test using test cases in a simulated testing environment, can accurately simulate vulnerability triggering conditions in a real-world operating environment. This ensures the accuracy of vulnerability verification. Compared to traditional vulnerability testing directly in a real-world environment, it ensures that vulnerabilities are reproduced more comprehensively and realistically, reducing testing errors caused by environmental differences or external factors. After constructing the simulated testing environment, more controllable testing of the code under test is possible. This ensures that vulnerability verification is not affected by unforeseen factors in the real-world operating environment. This controllability makes vulnerability detection more reliable and avoids omissions caused by uncontrollable factors.
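A minimal sketch of the replay and classification steps is given below. It makes several assumptions that are not part of the embodiment: a simulated environment is reduced to a configuration dictionary, a raised exception stands in for a triggered vulnerability, a vulnerability counts as verified only if it is reproduced on every replay, and the preset classification rules are a hypothetical mapping from exception type to vulnerability type.

```python
from typing import Any, Callable

# Hypothetical preset classification rules: exception type -> vulnerability type.
CLASSIFICATION_RULES: dict[type, str] = {
    IndexError: "out-of-bounds access",
    ZeroDivisionError: "unchecked arithmetic",
}


def verify(code_under_test: Callable[[Any, dict], Any],
           test_cases: dict[str, Any],
           environments: list[dict],
           repeats: int = 3) -> dict[str, str]:
    """Replay each test case in every simulated environment.

    A case is reported as a verified vulnerability only if it fails on
    every replay; the result maps the case id to its classified type.
    """
    verified: dict[str, str] = {}
    for case_id, payload in test_cases.items():
        failures: list[Exception] = []
        runs = 0
        for env in environments:
            for _ in range(repeats):
                runs += 1
                try:
                    code_under_test(payload, env)
                except Exception as exc:  # a crash stands in for a triggered vulnerability
                    failures.append(exc)
        if runs and len(failures) == runs:
            verified[case_id] = CLASSIFICATION_RULES.get(type(failures[0]), "unclassified")
    return verified


def read_item(index: int, env: dict) -> int:
    """Toy code under test: reads a buffer without a bounds check."""
    buffer = list(range(env["buffer_size"]))
    return buffer[index]


envs = [{"buffer_size": 4}, {"buffer_size": 8}]
report = verify(read_item, {"case-1": 10, "case-2": 2}, envs)
```

Here "case-1" fails in both simulated environments on every replay and is classified, while "case-2" never fails and is not reported.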
[0101] In one optional implementation, to further patch the vulnerability, after testing the code under test using the above-described test cases in the above-described simulation test environment and obtaining at least one verified vulnerability, the process includes:
[0102] Using a general vulnerability scoring system, determine the priority of each of the above-verified vulnerabilities; based on the priority of each of the above-verified vulnerabilities, determine at least one target vulnerability, wherein the priority of the target vulnerability is greater than a preset priority threshold; using a network vulnerability scanning tool, generate at least one remediation solution corresponding to each of the above at least one target vulnerability; send the above at least one target vulnerability and the corresponding at least one remediation solution to the management terminal.
[0103] In this embodiment, after at least one verified vulnerability is identified, the system scores each verified vulnerability using the Common Vulnerability Scoring System (CVSS). CVSS is a standardized vulnerability assessment system that considers multiple factors (such as vulnerability severity, ease of exploitation, and scope of impact) to assign a score, which is used here as the priority. A priority threshold is preset, and target vulnerabilities are those whose priority is higher than the preset threshold. Remediation plan generation typically relies on network vulnerability scanning tools, which can automatically detect vulnerabilities and provide suggested remediation measures based on vulnerability type. Network vulnerability scanning tools generate remediation plans based on the following factors: vulnerability type (e.g., buffer overflow, SQL injection); affected system or software version; and vulnerability resolution methods (e.g., code repair, configuration changes, patch installation). After the remediation plan is generated, each target vulnerability and its corresponding remediation plan are sent to the management terminal.
[0104] The method provided in this embodiment utilizes a general vulnerability scoring system to objectively determine the priority of each verified vulnerability. Based on the vulnerability priority, the most critical target vulnerabilities requiring remediation are selected from multiple verified vulnerabilities. Setting a priority threshold allows for flexible filtering of target vulnerabilities that meet the remediation criteria, avoiding interference from low-priority vulnerabilities and making vulnerability remediation more efficient and targeted. By automatically generating remediation plans using network vulnerability scanning tools, the efficiency and accuracy of vulnerability remediation can be significantly improved. Network vulnerability scanning tools can recommend specific remediation measures based on the type and characteristics of the vulnerability. This automated approach reduces the risk of human error, ensures more accurate remediation plans, and generates processing plans in a short time, avoiding human oversight and delays.
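The prioritization and filtering steps can be sketched as follows, assuming each verified vulnerability already carries a CVSS base score (0.0 to 10.0). The lookup table `REMEDIATION_BY_TYPE` is a hypothetical stand-in for the output of a network vulnerability scanning tool, and all names and scores are illustrative.

```python
from dataclasses import dataclass


@dataclass
class VerifiedVulnerability:
    vuln_id: str
    vuln_type: str
    cvss_score: float  # CVSS base score, used here as the priority


# Hypothetical lookup standing in for a scanning tool's remediation advice.
REMEDIATION_BY_TYPE = {
    "sql injection": "use parameterized queries",
    "buffer overflow": "add bounds checks and apply the vendor patch",
}


def plan_remediation(vulns: list[VerifiedVulnerability],
                     threshold: float) -> list[tuple[str, str]]:
    """Keep vulnerabilities above the threshold, highest priority first,
    and pair each with a remediation suggestion."""
    targets = sorted((v for v in vulns if v.cvss_score > threshold),
                     key=lambda v: v.cvss_score, reverse=True)
    return [(v.vuln_id, REMEDIATION_BY_TYPE.get(v.vuln_type, "manual review required"))
            for v in targets]


found = [
    VerifiedVulnerability("V2", "buffer overflow", 7.5),
    VerifiedVulnerability("V3", "information leak", 3.1),
    VerifiedVulnerability("V1", "sql injection", 9.8),
]
plan = plan_remediation(found, threshold=7.0)
```

The resulting list is what would be sent to the management terminal in this sketch; "V3" falls below the threshold and is omitted.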
[0105] In an optional implementation, to further accurately generate test cases, step S103 includes:
[0106] Obtain the behavior logs corresponding to the aforementioned historical code, which include historical operation information of multiple users; determine the historical operation information of the target user from the aforementioned historical operation information of multiple users; determine the target path information corresponding to the aforementioned historical operation information of the target user based on the correspondence between the aforementioned historical operation information of the target user and the path information; determine the target code region from the aforementioned historical code based on preset security rules; generate the aforementioned at least one test case based on the aforementioned target path information, the aforementioned target code region, and the aforementioned at least one target test data.
[0107] In this embodiment, the behavior log is log data that records user operation information. It includes the user's historical operations in the system, such as clicks, input, page browsing, API calls, etc. Each operation can be accompanied by specific timestamps, operation content, user identifiers, and other information. The target user's historical operation information can be information related to the target path information. Pre-set security rules can be security standards or inspection rules set in advance according to security requirements and vulnerability protection requirements. When generating test cases, the following information can be combined: Target path information: the user's historical operation path, representing the user's operation flow. Target code area: the part of the code that needs to be tested in detail, which may be code related to the user's operation path. Target test data: test data prepared for executing test cases, which can be simulated user input, operation data, or preset test conditions, etc.
[0108] The method provided in this embodiment, by combining the target user's historical operation information, path information, and historical code behavior logs, can more accurately determine the focus of testing. By analyzing the target user's historical behavior, potential vulnerabilities can be identified. Through in-depth mining of historical vulnerability data and user operation logs, potential vulnerability areas can be intelligently identified, rather than relying solely on static code review. By combining path information with historical operations, the generated test cases can more accurately simulate vulnerability scenarios that may be encountered in real-world environments. Through historical code behavior logs and user operation information, testing strategies can be dynamically adjusted, and paths can be selected for verification. This makes the testing process not merely a static coverage test, but an adaptive adjustment based on actual conditions, thereby improving the efficiency and accuracy of vulnerability detection.
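The generation flow of this embodiment can be sketched as below. The sketch assumes that the behavior log is a list of (user, operation) entries, that the correspondence between operations and paths is a dictionary, that the preset security rules are substrings to look for in the source of each code region, and that test cases are formed as every combination of target path, target code region, and target test data. All names and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    user_id: str
    operation: str


@dataclass
class GeneratedCase:
    path: str
    code_region: str
    test_input: str


def generate_cases(log: list[LogEntry],
                   target_user: str,
                   op_to_path: dict[str, str],
                   code_regions: dict[str, str],
                   security_rules: list[str],
                   target_inputs: list[str]) -> list[GeneratedCase]:
    # Historical operation information of the target user.
    ops = [e.operation for e in log if e.user_id == target_user]
    # Target path information, de-duplicated while preserving order.
    paths = list(dict.fromkeys(op_to_path[op] for op in ops if op in op_to_path))
    # Target code regions: regions whose source matches any security rule.
    regions = [name for name, src in code_regions.items()
               if any(rule in src for rule in security_rules)]
    # Assumed combination rule: one case per (path, region, input) triple.
    return [GeneratedCase(p, r, t)
            for p in paths for r in regions for t in target_inputs]


log = [LogEntry("u1", "login"), LogEntry("u2", "search"),
       LogEntry("u1", "upload"), LogEntry("u1", "login")]
cases = generate_cases(
    log,
    target_user="u1",
    op_to_path={"login": "/auth/login", "upload": "/files/upload"},
    code_regions={
        "auth.check": "query = 'SELECT * FROM users WHERE name=' + name",
        "files.save": "open(path, 'wb')",
        "ui.render": "return template",
    },
    security_rules=["SELECT", "open("],
    target_inputs=["' OR 1=1 --"],
)
```

A real implementation would likely restrict code regions to those reachable from each path rather than taking the full product, which is a design choice the embodiment leaves open.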
[0109] In an optional implementation, to further accurately determine historical vulnerability data, the method further includes the following step before step S103:
[0110] Multiple code samples with known vulnerabilities are obtained from a vulnerability database to form a code sample set; the code samples in the above code sample set are classified according to a preset classification rule to obtain a classified code sample set; the above classified code sample set is expanded based on a generative adversarial network to obtain an expanded code sample set; according to the type of the code to be tested, the above historical code is searched from the above expanded code sample set, and the above historical vulnerability data is determined.
[0111] In this embodiment, by collecting multiple code samples with known vulnerabilities into a set, preparation can be made for subsequent vulnerability analysis and expansion. After classification, the code sample set will be divided into multiple subsets, each containing vulnerable code of similar type or similar characteristics. This classification helps to focus on different types of vulnerabilities and conduct in-depth analysis for different scenarios. A generative adversarial network is then used to expand the classified code sample set, yielding the expanded code sample set. Based on the type of code to be tested, historical code with similar structure, logic, or function to the code to be tested can be searched from the expanded code sample set. Through the search process, historical code with similar vulnerability characteristics can ultimately be found from the expanded code samples, thereby identifying the vulnerabilities present in this historical code.
[0112] The method provided in this embodiment obtains code samples from multiple known vulnerabilities from a vulnerability database to form a code sample set, ensuring that test case generation is based on more comprehensive and abundant historical vulnerability data. This provides a more sufficient source of vulnerability information for subsequent test case generation, guaranteeing the breadth and depth of testing. By classifying the code sample set, a categorized code sample set can be obtained, enabling more accurate differentiation of different types of vulnerabilities, and thus designing more targeted test cases for different categories of vulnerabilities. This can be based on dimensions such as vulnerability type, code structure, and usage environment, allowing test cases to focus on the specific characteristics of potential vulnerabilities and improving detection efficiency. Based on the type of code to be tested, historical code is searched from the expanded code sample set, and historical vulnerability data is determined. This effectively matches historical vulnerability data with the actual characteristics of the code to be tested, further improving the relevance of test cases.
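The classify, expand, and search steps can be sketched as a small pipeline. Training a generative adversarial network is outside the scope of a short example, so the `generator` argument below is a placeholder callable that stands in for a trained generator; the toy generator used in the example only appends a comment and is not a GAN. `CodeSample` and the function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CodeSample:
    code_type: str   # e.g. language or component kind of the sample
    vuln_type: str   # known vulnerability type
    source: str


def classify(samples: list[CodeSample]) -> dict[str, list[CodeSample]]:
    """Group samples by vulnerability type (one possible classification rule)."""
    groups: dict[str, list[CodeSample]] = defaultdict(list)
    for s in samples:
        groups[s.vuln_type].append(s)
    return dict(groups)


def expand(groups: dict[str, list[CodeSample]],
           generator: Callable[[CodeSample], list[CodeSample]]) -> dict[str, list[CodeSample]]:
    """Add generated variants to each class; `generator` stands in for a trained GAN."""
    return {k: v + [g for s in v for g in generator(s)] for k, v in groups.items()}


def search(groups: dict[str, list[CodeSample]], code_type: str) -> list[CodeSample]:
    """Retrieve historical code of the same type as the code to be tested."""
    return [s for v in groups.values() for s in v if s.code_type == code_type]


def toy_generator(s: CodeSample) -> list[CodeSample]:
    """Placeholder variant generator (not a GAN)."""
    return [CodeSample(s.code_type, s.vuln_type, s.source + "  /* variant */")]


samples = [
    CodeSample("c", "buffer overflow", "strcpy(dst, src);"),
    CodeSample("python", "sql injection", "cur.execute(q + name)"),
    CodeSample("c", "format string", "printf(user);"),
]
expanded = expand(classify(samples), toy_generator)
history = search(expanded, "c")
```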
[0113] In an optional implementation, to further accurately detect code vulnerabilities, after step S104, the method further includes: performing static and dynamic analysis on the code under test to obtain the control flow graph (CFG) and data flow graph (DFG) of the code under test, as well as the execution path information and function call stack generated at runtime.
[0114] Based on static and dynamic analysis information, high-risk areas in the code under test are identified, especially code segments that may lead to security vulnerabilities, such as improper input validation, buffer overflows, and database injection.
[0115] Based on the vulnerability types and locations in historical vulnerability data, and combined with the potential vulnerability areas in the code under test, the paths and conditions that test cases should cover are determined, thereby generating a set of test cases with efficient vulnerability detection capabilities, ensuring that the code paths most likely to contain vulnerabilities can be reached during the testing process.
[0116] By simulating and verifying the generated test cases, we ensure that each test case can effectively discover potential vulnerabilities in actual testing, optimize the coverage and vulnerability discovery rate of test cases, and improve the overall vulnerability detection quality.
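As a simplified illustration of the static-analysis step, the sketch below uses Python's standard `ast` module to flag calls that a preset rule set treats as high risk. It does not build a control flow graph or data flow graph and performs no dynamic analysis; the rule set `RISKY_CALLS` and the function names are hypothetical.

```python
import ast

# Hypothetical rule set: call names treated as high risk.
RISKY_CALLS = {"eval", "exec", "system"}


def call_name(node: ast.Call) -> str:
    """Return the called function's short name, or '' if it cannot be resolved."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return ""


def high_risk_lines(source: str) -> list[tuple[int, str]]:
    """Return (line number, call name) for every risky call found in the source."""
    tree = ast.parse(source)
    hits = [(n.lineno, call_name(n)) for n in ast.walk(tree)
            if isinstance(n, ast.Call) and call_name(n) in RISKY_CALLS]
    return sorted(hits)


sample = '''import os

def run(cmd, expr):
    os.system(cmd)
    return eval(expr)
'''
findings = high_risk_lines(sample)
```

The flagged line numbers could then be intersected with the locations recorded in historical vulnerability data to decide which paths the generated test cases should cover.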
[0117] Furthermore, embodiments of the present invention also provide a code vulnerability detection device. Figure 3 is a schematic diagram of a code vulnerability detection device according to an embodiment of the present invention. As shown in Figure 3, this detection device is used to perform vulnerability testing on the code under test. The device includes:
[0118] The acquisition module 301 is used to acquire vulnerability information and / or path information during the process of testing the above-mentioned code under test using multiple test data. One test data corresponds to one vulnerability information and / or one path information.
[0119] The selection module 302 is used to select at least one target test data from multiple test data according to preset rules based on each of the above-mentioned vulnerability information and / or each of the above-mentioned path information, wherein the vulnerability information and path information corresponding to the at least one target test data conform to the above-mentioned preset rules.
[0120] The generation module 303 is used to generate at least one test case based on the target test data and historical vulnerability data, wherein the historical code corresponding to the historical vulnerability data is of the same type as the code to be tested.
[0121] The test module 304 is used to test the code under test using at least one of the above test cases and obtain test results, including vulnerability data of the code under test.
[0122] This invention provides a code vulnerability detection device that selects the minimum vulnerability trigger time from multiple test data and determines the test speed parameter of each test data by comparing its vulnerability trigger time with the minimum vulnerability trigger time. This ensures accurate evaluation of test speed, allowing test data with shorter vulnerability trigger times to be prioritized for subsequent vulnerability detection. By comparing the number of vulnerability categories for each test data with the total number of vulnerability categories across all test data, the test richness parameter of the test data can be effectively determined. This ensures broader coverage of vulnerability types and avoids missing potential vulnerabilities. By selecting the minimum test path length and minimum test path depth and comparing them with the path length and depth of each test data, the test depth parameter of the test data can be accurately evaluated. This helps identify test data that can be used to deeply explore vulnerabilities, ensuring the depth and comprehensiveness of vulnerability detection.
[0123] Further functional descriptions of the above modules and units are the same as those in the corresponding embodiments described above, and will not be repeated here.
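The relationship between the four modules can be sketched as a simple pipeline in which each module is a callable and the output of one feeds the next. The class name `DetectionDevice`, the attribute names, and the toy callables are hypothetical and only illustrate the data flow from module 301 through module 304.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DetectionDevice:
    acquire: Callable[[list], dict]    # acquisition module 301
    select: Callable[[dict], list]     # selection module 302
    generate: Callable[[list], list]   # generation module 303
    test: Callable[[list], list]       # test module 304

    def run(self, test_data: list) -> list:
        """Chain the modules: acquire -> select -> generate -> test."""
        info = self.acquire(test_data)
        targets = self.select(info)
        cases = self.generate(targets)
        return self.test(cases)


# Toy callables that only demonstrate the data flow, not real detection logic.
device = DetectionDevice(
    acquire=lambda data: {d: len(d) for d in data},
    select=lambda info: [d for d, score in info.items() if score > 3],
    generate=lambda targets: [t.upper() for t in targets],
    test=lambda cases: [c for c in cases if "X" in c],
)
result = device.run(["abc", "wxyz", "hello"])
```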
[0124] In this embodiment, the code vulnerability detection device is presented in the form of functional units. Here, a unit refers to an application-specific integrated circuit (ASIC), a processor and memory that execute one or more software or firmware programs, and / or other devices that can provide the above functions.
[0125] This invention also provides a computer device having the code vulnerability detection device shown in Figure 3.
[0126] Referring to Figure 4, Figure 4 is a schematic diagram of the structure of a computer device provided in an optional embodiment of the present invention. As shown in Figure 4, the computer device includes one or more processors 10, memory 20, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The components communicate with each other via different buses and can be mounted on a common motherboard or otherwise installed as needed. The processors can process instructions executed within the computer device, including instructions stored in or on the memory to display graphical information of a GUI on external input / output devices (such as display devices coupled to the interfaces). In some alternative implementations, multiple processors and / or multiple buses can be used with multiple memories and multiple memory modules, if desired. Similarly, multiple computer devices can be connected, each providing some of the necessary operations (e.g., as a server array, a group of blade servers, or a multiprocessor system). Figure 4 takes one processor 10 as an example.
[0127] Processor 10 may be a central processing unit, a network processor, or a combination thereof. Processor 10 may further include an integrated circuit. The integrated circuit may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The programmable logic device may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
[0128] The memory 20 stores instructions executable by at least one processor 10 to cause the at least one processor 10 to perform the method shown in the above embodiment.
[0129] The memory 20 may include a program storage area and a data storage area. The program storage area may store the operating system and applications required for at least one function; the data storage area may store data created based on the use of the computer device. Furthermore, the memory 20 may include high-speed random access memory and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid-state storage device. In some alternative embodiments, the memory 20 may optionally include memory remotely located relative to the processor 10, and these remote memories may be connected to the computer device via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
[0130] The memory 20 may include volatile memory, such as random access memory; the memory may also include non-volatile memory, such as flash memory, hard disk or solid-state drive; the memory 20 may also include a combination of the above types of memory.
[0131] The computer device also includes a communication interface 30 for communicating with other devices or communication networks.
[0132] This invention also provides a computer-readable storage medium. The methods described above according to embodiments of the invention can be implemented in hardware or firmware, or implemented as computer code that can be recorded on a storage medium, or implemented as computer code downloaded via a network and originally stored on a remote storage medium or a non-transitory machine-readable storage medium and then stored on a local storage medium. Thus, the methods described herein can be processed by software stored on a storage medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware. The storage medium can be a magnetic disk, optical disk, read-only memory, random access memory, flash memory, hard disk, or solid-state drive, etc.; further, the storage medium can also include combinations of the above types of memory. It is understood that computers, processors, microprocessor controllers, or programmable hardware include storage components capable of storing or receiving software or computer code, which, when accessed and executed by the computer, processor, or hardware, implements the methods shown in the above embodiments.
[0133] A portion of this invention can be applied as a computer program product, such as computer program instructions, which, when executed by a computer, can invoke or provide the methods and / or technical solutions according to the invention through the operation of the computer. Those skilled in the art will understand that the forms in which computer program instructions exist in a computer-readable medium include, but are not limited to, source files, executable files, installation package files, etc. Correspondingly, the ways in which computer program instructions are executed by a computer include, but are not limited to: the computer directly executing the instructions, or the computer compiling the instructions and then executing the corresponding compiled program, or the computer reading and executing the instructions, or the computer reading and installing the instructions and then executing the corresponding installed program. Here, the computer-readable medium can be any available computer-readable storage medium or communication medium accessible to a computer.
[0134] Although embodiments of the invention have been described in conjunction with the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations all fall within the scope defined by the appended claims.
Claims
1. A method for detecting code vulnerabilities, characterized in that, The method is applied to perform vulnerability testing on the code to be tested, and the method includes: Obtain vulnerability information and / or path information during the process of testing the code under test using multiple test data, with one test data corresponding to one vulnerability information and / or one path information; Based on each vulnerability information and / or each path information, determine the fitness parameter for each test data, wherein the fitness parameter includes at least one of the following parameters: test speed parameter, test richness parameter, and test depth parameter; determine the fitness value for each test data according to each fitness parameter and a preset weight, with one fitness value corresponding to one test data; compare each fitness value with a fitness threshold, and count all target fitness values greater than the fitness threshold; determine at least one target test data according to the relationship between each target fitness value and the test data; The system retrieves behavior logs of historical code corresponding to historical vulnerability data, the behavior logs including historical operation information of multiple users; identifies target user historical operation information from the multiple user historical operation information; determines target path information corresponding to the target user historical operation information based on the correspondence between the target user historical operation information and path information; identifies target code regions from the historical code based on preset security rules; and generates at least one test case based on the target path information, the target code regions, and at least one target test data, wherein the historical code and the code to be tested are of the same type. 
The code under test is tested using at least one test case to obtain test results, which include vulnerability data of the code under test.
2. The method according to claim 1, characterized in that, The vulnerability information includes: vulnerability trigger time and number of vulnerability categories; the path information includes: test path length and test path depth. The step of determining the fitness parameter for each piece of test data based on each vulnerability information and / or each path information includes: Select the minimum vulnerability trigger time from the multiple vulnerability trigger times of the test data; and determine the test speed parameter for each of the test data based on each vulnerability trigger time and the minimum vulnerability trigger time. Alternatively, the total number of vulnerability categories for the multiple test data can be determined based on the number of vulnerability categories for each test data; and the test richness parameter for each test data can be determined based on the number of vulnerability categories for each test data and the total number of vulnerability categories. Alternatively, the minimum test path length and minimum test path depth can be selected from the test path length and test path depth of the plurality of test data; and the test depth parameter of each test data can be determined based on each test path length and each test path depth, as well as the minimum test path length and the minimum test path depth.
3. The method according to claim 1 or 2, characterized in that, After testing the code under test using the at least one test case and obtaining the test results, the method further includes: Obtain the test cases corresponding to the vulnerability data; Construct a simulation test environment for the code to be tested; In the simulation testing environment, the test cases are used to test the code under test, and at least one verified vulnerability is obtained; Based on preset classification rules, the at least one verified vulnerability is classified to obtain the type of the at least one verified vulnerability.
4. The method according to claim 3, characterized in that, In the simulation testing environment, after testing the code under test using the test cases and obtaining at least one verified vulnerability, the process includes: The priority of each verified vulnerability is determined using a general vulnerability scoring system. Based on the priority of each verified vulnerability, at least one target vulnerability is identified, wherein the priority of the target vulnerability is greater than a preset priority threshold. Using network vulnerability scanning tools, at least one remediation solution is generated for each of the at least one target vulnerability; Send the at least one target vulnerability and the corresponding at least one remediation solution to the management terminal.
5. The method according to claim 1 or 2, characterized in that, Before generating at least one test case based on the target test data and historical vulnerability data, the method further includes: Obtain multiple code samples containing known vulnerabilities from a vulnerability database and assemble them into a code sample set; Based on preset classification rules, the code samples in the code sample set are classified to obtain a classified code sample set; Based on a generative adversarial network, the classified code sample set is expanded to obtain an expanded code sample set; Based on the type of the code to be tested, the historical code is searched from the expanded code sample set, and the historical vulnerability data is determined.
6. A code vulnerability detection device, characterized in that, The device is used to perform vulnerability testing on the code to be tested, and the device includes: The acquisition module is used to acquire vulnerability information and / or path information during the process of testing the code under test using multiple test data, with one test data corresponding to one vulnerability information and / or one path information. A selection module is used to determine a fitness parameter for each test data based on each vulnerability information and / or each path information. The fitness parameter includes at least one of the following parameters: test speed parameter, test richness parameter, and test depth parameter. Based on each fitness parameter and a preset weight, a fitness value is determined for each test data, with one fitness value corresponding to one test data. Each fitness value is compared with a fitness threshold, and all target fitness values greater than the fitness threshold are counted. Based on the relationship between each target fitness value and the test data, at least one target test data is determined. 
A generation module is used to obtain behavior logs of historical code corresponding to historical vulnerability data, the behavior logs including historical operation information of multiple users; determine the historical operation information of a target user from the historical operation information of multiple users; determine the target path information corresponding to the historical operation information of the target user according to the correspondence between the historical operation information of the target user and path information; determine the target code region from the historical code based on preset security rules; and generate at least one test case according to the target path information, the target code region, and at least one target test data, wherein the historical code is of the same type as the code to be tested. The testing module is used to test the code under test using at least one test case and obtain test results, the test results including vulnerability data of the code under test.
7. A computer device, characterized in that it includes: a memory and a processor that are communicatively connected to each other, wherein the memory stores computer instructions, and the processor executes the computer instructions to perform the code vulnerability detection method according to any one of claims 1 to 5.
8. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores computer instructions for causing a computer to execute the code vulnerability detection method according to any one of claims 1 to 5.