Vulnerability detection method and apparatus, electronic device, and storage medium

By employing a multi-layered vulnerability detection method, combined with multi-channel information acquisition and model training, the problem of inaccurate identification of novel high-value vulnerabilities in existing technologies has been solved, achieving efficient identification and classification of high-value vulnerabilities.

WO2026123483A1 · PCT designated stage · Publication Date: 2026-06-18 · SHANGHAI DOUXIANG INFORMATION TECHNOLOGY CO LTD +1

Patent Information

Authority / Receiving Office: WO
Patent Type: Application
Current Assignee / Owner: SHANGHAI DOUXIANG INFORMATION TECHNOLOGY CO LTD
Filing Date: 2025-03-10
Publication Date: 2026-06-18

Abstract

The present application relates to the technical field of vulnerability detection, and discloses a vulnerability detection method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring a vulnerability under detection and vulnerability information corresponding to the vulnerability under detection; using a preset high-value vulnerability model to determine, on the basis of the vulnerability information, whether the vulnerability under detection is a high-value vulnerability; if the vulnerability under detection is not a high-value vulnerability, using a preset anomaly detection model to identify the vulnerability under detection, and determining whether the vulnerability under detection is a rare vulnerability; if the vulnerability under detection is a rare vulnerability, using a preset clustering model to analyze whether the vulnerability under detection belongs to a high-value vulnerability cluster in the clustering model; if the vulnerability under detection belongs to the high-value vulnerability cluster, outputting the vulnerability under detection as a high-value vulnerability; and if the vulnerability under detection does not belong to the high-value vulnerability cluster, outputting the vulnerability under detection as a common vulnerability. In this way, whether a vulnerability is a novel high-value vulnerability can be accurately determined, thereby identifying a novel high-value vulnerability that has been incorrectly determined as a low-value vulnerability.

Description

Vulnerability detection method and apparatus, electronic device, and storage medium

[0001] Cross-reference to related applications

[0002] This application claims priority to Chinese Patent Application No. 202411802176.9, filed on December 9, 2024, entitled “Method and Apparatus for Vulnerability Detection, Electronic Device, Storage Medium”, the entire contents of which are incorporated herein by reference.

Technical Field

[0003] This application relates to the field of vulnerability detection technology, and in particular to a vulnerability detection method and apparatus, electronic device, and storage medium.

Background Technology

[0004] High-value vulnerabilities typically refer to those that pose a significant threat to systems, data, or users and can be exploited by malicious attackers to launch serious attacks. These vulnerabilities are often difficult to exploit and highly harmful. Once successfully exploited, they can lead to serious consequences such as operational disruptions, data breaches, and financial losses. Therefore, accurately identifying high-value vulnerabilities is extremely important. Related technologies often pre-train a model to identify whether a vulnerability is high-value. However, because the training data for such a model often cannot fully cover the diverse scenarios in real-world operating environments, relying solely on the model can cause some high-value vulnerabilities to be misidentified as low-value vulnerabilities. This is especially true when new vulnerabilities emerge: the training data may contain no comparable samples, making it difficult for the model to identify new high-value vulnerabilities.

[0005] It should be noted that the information disclosed in the background section above is only used to enhance the understanding of the background of this application, and therefore may include information that does not constitute prior art known to those skilled in the art.

Summary of the Invention

[0006] To provide a basic understanding of some aspects of the disclosed embodiments, a brief summary is given below. This summary is not intended as an extensive overview, nor is it intended to identify key or important components or to delineate the scope of protection of these embodiments; it serves as a prelude to the detailed description that follows.

[0007] This application provides a method, apparatus, electronic device, and storage medium for vulnerability detection to identify novel high-value vulnerabilities.

[0008] This application provides a vulnerability detection method, comprising: acquiring a vulnerability to be detected and corresponding vulnerability information; using a preset high-value vulnerability model to determine whether the vulnerability to be detected belongs to a high-value vulnerability based on the vulnerability information; if the vulnerability to be detected belongs to the high-value vulnerability, then outputting the vulnerability to be detected as a high-value vulnerability; if the vulnerability to be detected does not belong to the high-value vulnerability, then using a preset anomaly detection model to identify the vulnerability to be detected and determine whether the vulnerability to be detected belongs to a rare vulnerability; if it does not belong to a rare vulnerability, then outputting the vulnerability to be detected as a common vulnerability; if the vulnerability to be detected belongs to a rare vulnerability, then using a preset clustering model to analyze whether the vulnerability to be detected belongs to a high-value vulnerability cluster in the clustering model; if it belongs to the high-value vulnerability cluster, then outputting the vulnerability to be detected as a high-value vulnerability; if it does not belong to the high-value vulnerability cluster, then outputting the vulnerability to be detected as a common vulnerability.

[0009] In the above implementation, by further identifying low-value vulnerabilities identified by the high-value vulnerability model using the anomaly detection model, vulnerabilities that differ from ordinary low-value vulnerabilities can be identified. In this case, the vulnerability may be a novel one. By determining whether the vulnerability to be detected belongs to a high-value vulnerability cluster, it is possible to accurately determine whether the vulnerability is a novel high-value vulnerability, thereby identifying novel high-value vulnerabilities that have been mistakenly identified as low-value vulnerabilities.

[0010] Furthermore, obtaining vulnerability information corresponding to the vulnerability to be detected includes: obtaining a preset feature directory, the preset feature directory including multiple vulnerability feature items; parsing the vulnerability to be detected to obtain first feature information corresponding to the vulnerability feature item; extracting second feature information corresponding to the vulnerability feature item from the white-hat profile of the submitter who provided the vulnerability to be detected; extracting third feature information corresponding to the vulnerability feature item from a preset third-party platform; the first feature information, the second feature information, and the third feature information are used as the vulnerability information.

[0011] In the above implementation, feature information matching the vulnerability feature items in the preset feature directory is obtained by parsing the vulnerability to be detected, and is also extracted from the white-hat profile of the vulnerability submitter and from the preset third-party platform. In this way, more comprehensive vulnerability information can be obtained through multiple channels.

[0012] Furthermore, before determining whether the vulnerability to be detected belongs to a high-value vulnerability based on the vulnerability information using a preset high-value vulnerability model, the process includes: determining whether the vulnerability information exists in a preset whitelist; the whitelist stores vulnerability information that belongs to high-value vulnerabilities; if the vulnerability information exists in the whitelist, then output that the vulnerability information belongs to a high-value vulnerability; if the vulnerability information does not exist in the whitelist, then determine whether the vulnerability to be detected belongs to a high-value vulnerability based on the vulnerability information using the preset high-value vulnerability model.

[0013] In the above implementation, by determining whether the vulnerability information of the vulnerability to be detected has certain characteristics of being a high-value vulnerability, it is possible to easily and accurately identify whether the vulnerability to be detected is a high-value vulnerability, thereby reducing the workload of subsequently using the high-value vulnerability model to identify high-value vulnerabilities.

[0014] Furthermore, before determining whether the vulnerability to be detected belongs to a high-value vulnerability based on the vulnerability information using a preset high-value vulnerability model, the process includes: determining whether the vulnerability information exists in a preset blacklist, where the blacklist stores vulnerability information belonging to low-value vulnerabilities; the exploit value of the low-value vulnerability is lower than that of the high-value vulnerability; if the vulnerability information exists in the blacklist, then the vulnerability information is output as a low-value vulnerability; if the vulnerability information does not exist in the blacklist, then the preset high-value vulnerability model is used to determine whether the vulnerability to be detected belongs to a high-value vulnerability based on the vulnerability information.

[0015] In the above implementation, by determining whether the vulnerability information of the vulnerability to be detected has certain characteristics of being a low-value vulnerability, it is possible to easily and accurately identify whether the vulnerability to be detected is a low-value vulnerability, thereby reducing the workload of subsequently using a high-value vulnerability model to identify high-value vulnerabilities.

[0016] Furthermore, the high-value vulnerability model is obtained by: acquiring sample vulnerability information labeled with whether it is a high-value vulnerability; training a preset random forest model using the labeled sample vulnerability information; and updating the parameters (hyperparameters) of the random forest model using a preset grid search algorithm until the recognition accuracy of the random forest model reaches a set threshold, thereby obtaining the high-value vulnerability model.

[0017] Considering the relatively small number of high-value vulnerability samples, this application uses a random forest model for training. Since the random forest model constructs multiple decision trees and takes the average prediction result, it can reduce the risk of overfitting that may be caused by a single decision tree and improve the generalization ability of the model. By training the random forest model, the trained high-value vulnerability model can more accurately identify high-value vulnerabilities.

[0018] Furthermore, the clustering model includes multiple high-value vulnerability clusters; analyzing whether the vulnerability to be detected belongs to a high-value vulnerability cluster in the clustering model using a preset clustering model includes: determining whether a high-value vulnerability cluster exists among clusters whose distance to the vulnerability to be detected is less than a preset distance; if it exists, then the vulnerability to be detected belongs to that high-value vulnerability cluster; if it does not exist, then the vulnerability to be detected does not belong to a high-value vulnerability cluster.

[0019] In the above implementation, if the distance between the vulnerability to be detected and a certain high-value vulnerability cluster is small, it indicates that the characteristics of the vulnerability to be detected are more similar to those of the high-value vulnerability cluster. Therefore, the vulnerability to be detected is very likely to belong to the same category of vulnerabilities as the high-value vulnerabilities corresponding to the high-value vulnerability cluster. In this way, determining that the vulnerability to be detected belongs to the high-value vulnerability cluster makes it easier to determine the actual vulnerability type of the vulnerability to be detected.

[0020] Furthermore, the clustering model is obtained through the following methods: acquiring sample vulnerability information labeled with high-value vulnerabilities and low-value vulnerabilities; using a preset clustering algorithm to cluster the sample vulnerability information labeled with high-value vulnerabilities and low-value vulnerabilities to obtain candidate clustering models; and adjusting the parameters of the clustering algorithm until the silhouette coefficient of the candidate clustering model reaches a preset coefficient, and then determining the candidate clustering model as the clustering model.

[0021] In the above implementation, by clustering sample vulnerability information, vulnerabilities with similar characteristics can be grouped into a cluster, which helps to discover common features between vulnerabilities and the correlation between vulnerabilities, thereby facilitating the determination of the vulnerability type of new vulnerabilities when they exist.

[0022] This application provides an electronic device, including a processor and a memory, wherein the memory stores computer-executable instructions that can be executed by the processor, and the processor executes the computer-executable instructions to implement the above-described vulnerability detection method.

[0023] This application provides a storage medium storing computer-executable instructions. When these computer-executable instructions are invoked and executed by a processor, they enable the processor to implement the aforementioned vulnerability detection method.

[0024] The above general description and the description below are exemplary and illustrative only and are not intended to limit this application.

Brief Description of the Drawings

[0025] One or more embodiments are illustrated by way of example with reference to the accompanying drawings. These illustrations and drawings do not constitute a limitation on the embodiments. Elements having the same reference numerals in the drawings are considered similar elements. The drawings do not constitute a limitation of scale, and wherein:

[0026] Figure 1 is a schematic diagram of a vulnerability detection method provided in an embodiment of this application;

[0027] Figure 2 is a schematic diagram of a vulnerability detection device provided in an embodiment of this application;

[0028] Figure 3 is a schematic diagram of an electronic device provided in an embodiment of this application.

[0029] Reference numerals: Acquisition Module 1; First High-Value Identification Module 2; Rare Vulnerability Identification Module 3; Second High-Value Identification Module 4; Bus 5; Processor 6; Memory 7; Communication Interface 8.

Detailed Description

[0030] To provide a more detailed understanding of the features and technical content of the embodiments of this application, the implementation of the embodiments of this application will be described in detail below with reference to the accompanying drawings. The accompanying drawings are for illustrative purposes only and are not intended to limit the embodiments of this application. In the following technical description, for ease of explanation, several details are used to provide a full understanding of the disclosed embodiments. However, one or more embodiments may still be implemented without these details. In other cases, well-known structures and devices may be simplified in their depiction to simplify the drawings.

[0031] The terms "first," "second," etc., in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate for the embodiments of this application described herein. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion.

[0032] Unless otherwise stated, the term "multiple" means two or more.

[0033] The term "correspondence" can refer to an association or binding relationship. The correspondence between A and B means that there is an association or binding relationship between A and B.

[0034] Example 1

[0035] This application provides a vulnerability detection method. Referring to Figure 1, which is a flowchart illustrating the vulnerability detection method provided in this application, the method includes:

[0036] Step S101: Obtain the vulnerability to be detected and the vulnerability information corresponding to the vulnerability to be detected.

[0037] In some embodiments, obtaining vulnerability information corresponding to a vulnerability to be detected includes: obtaining a preset feature directory, which includes multiple vulnerability feature items; parsing the vulnerability to be detected to obtain first feature information corresponding to the vulnerability feature items; extracting second feature information corresponding to the vulnerability feature items from the white-hat profile of the submitter who provided the vulnerability to be detected; extracting third feature information corresponding to the vulnerability feature items from a preset third-party platform; and using the first feature information, second feature information, and third feature information as vulnerability information.

[0038] In the above embodiments, the first feature information can be the characteristics of the vulnerability itself, such as: the company to which the vulnerability belongs, the domain name to which the vulnerability belongs, the component used, the vulnerability submitter, the vulnerability type, the vulnerability level, the vulnerability submission time, and the location where the vulnerability occurs. The location where the vulnerability occurs can be, for example, an app, a mini-program, a website, hardware, or Linux (an operating system kernel).

[0039] In the above embodiments, the second feature information includes, for example: the number of vulnerabilities submitted by the vulnerability submitter, the percentage of valid vulnerabilities among the submitter's historical submissions, the percentage of high-risk vulnerabilities submitted by the vulnerability submitter, the percentage of high-value vulnerabilities submitted by the vulnerability submitter, the percentage of each type of vulnerability submitted by the vulnerability submitter, and the percentage of each type of vulnerability submitted by the vulnerability submitter that was ignored.

[0040] In the above embodiments, the third feature information includes, for example, the Baidu Index of the company to which the vulnerability to be detected belongs, the number of GitHub (a software project hosting platform) stars of the component involved in the vulnerability to be detected, and the number of downloads of that component on GitHub.

[0041] The examples of the first, second, and third feature information given above are merely illustrative; the relevant content can be obtained according to the vulnerability feature items set in the preset feature directory. For instance, assuming the vulnerability feature item only contains the Baidu Index of the company to which the vulnerability belongs, then the third feature information can simply obtain the Baidu Index of the company to which the vulnerability belongs. Conversely, assuming the vulnerability feature item contains both the Baidu Index of the company to which the vulnerability belongs and the download volume of the component of the vulnerability on GitHub, then the third feature information needs to obtain both the Baidu Index of the company to which the vulnerability belongs and the download volume of the component of the vulnerability on GitHub.
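The directory-driven acquisition described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: only `Submit_num` and `submit_effect_ratio` are names taken from this application, while the other feature item names, the helper functions, and the sample values are hypothetical.

```python
from typing import Any, Dict, List, Mapping

# Hypothetical feature directory. Only Submit_num and submit_effect_ratio are
# named in the source; the other item names are invented for illustration.
FEATURE_DIRECTORY: List[str] = [
    "vuln_type",
    "vuln_level",
    "Submit_num",
    "submit_effect_ratio",
    "baidu_index",
]


def select_items(directory: List[str], source: Mapping[str, Any]) -> Dict[str, Any]:
    """Keep only the entries of `source` that the feature directory asks for."""
    return {item: source[item] for item in directory if item in source}


def collect_vulnerability_info(
    directory: List[str],
    parsed_report: Mapping[str, Any],
    white_hat_profile: Mapping[str, Any],
    third_party_data: Mapping[str, Any],
) -> Dict[str, Any]:
    """Merge the three channels into one vulnerability information record."""
    first = select_items(directory, parsed_report)       # the vulnerability itself
    second = select_items(directory, white_hat_profile)  # the submitter's profile
    third = select_items(directory, third_party_data)    # third-party platforms
    return {**first, **second, **third}


# Sample values are made up; "title" and "github_downloads" are not in the
# directory, so they are dropped.
info = collect_vulnerability_info(
    FEATURE_DIRECTORY,
    parsed_report={"vuln_type": "sql_injection", "vuln_level": "high", "title": "ignored"},
    white_hat_profile={"Submit_num": 42, "submit_effect_ratio": 0.8},
    third_party_data={"baidu_index": 1200, "github_downloads": 5000},
)
```

Because the directory alone decides which items are kept, adding the GitHub download count to the directory would be enough to make the third channel contribute it as well.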

[0042] For example, as shown in Table 1, which is an example table of a preset feature directory, the feature directory includes multiple vulnerability feature items, such as: Submit_num, submit_effect_ratio, etc. The vulnerability feature item "Submit_num" represents the number of vulnerabilities submitted by white-hat hackers. The vulnerability feature item "submit_effect_ratio" represents the percentage of valid vulnerabilities submitted historically by white-hat hackers.

[0043] Table 1

[0044] Optionally, engineers can select features that can influence whether a vulnerability to be detected is a high-value vulnerability as vulnerability feature items in the feature catalog.

[0045] Step S102: Using a preset high-value vulnerability model, determine whether the vulnerability to be detected is a high-value vulnerability based on the vulnerability information. If the vulnerability to be detected is a high-value vulnerability, proceed to step S103; if the vulnerability to be detected is not a high-value vulnerability, proceed to step S104.

[0046] In some embodiments, step S102 may include: inputting vulnerability information into a preset high-value vulnerability model to obtain whether the vulnerability to be detected corresponding to the vulnerability information is a high-value vulnerability.

[0047] For example, vulnerability information that matches the vulnerability feature items in Table 1 is constructed into a 24-dimensional feature vector. The 24-dimensional feature vector is then input into a preset high-value vulnerability model to determine whether the vulnerability to be detected corresponding to the vulnerability information is a high-value vulnerability.

[0048] In some embodiments, a high-value vulnerability model is obtained by: acquiring sample vulnerability information labeled with whether it is a high-value vulnerability; training a preset random forest model using the labeled sample vulnerability information; and updating the parameters (hyperparameters) of the random forest model using a preset grid search algorithm until the recognition accuracy of the random forest model reaches a set threshold, thereby obtaining a high-value vulnerability model.
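As a rough illustration of this training procedure, the sketch below uses scikit-learn's `RandomForestClassifier` and `GridSearchCV`. The library choice, the parameter grid, the threshold of 0.9, and the synthetic 24-dimensional data are all assumptions; the application does not name a library or specific values. The sketch also runs a single grid search and rejects the result if the threshold is missed, whereas the text describes adjusting until the threshold is reached.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV


def train_high_value_model(X, y, accuracy_threshold=0.9):
    """Grid-search random forest hyperparameters and accept the best model
    only if its cross-validated accuracy reaches the threshold."""
    param_grid = {"n_estimators": [20, 50], "max_depth": [3, None]}  # assumed grid
    search = GridSearchCV(
        RandomForestClassifier(random_state=0),
        param_grid,
        scoring="accuracy",
        cv=3,
    )
    search.fit(X, y)
    if search.best_score_ < accuracy_threshold:
        raise ValueError(
            f"best accuracy {search.best_score_:.3f} is below the threshold"
        )
    return search.best_estimator_, search.best_score_


# Synthetic stand-in for labeled 24-dimensional sample vulnerability information.
rng = np.random.default_rng(0)
X_low = rng.normal(loc=0.0, scale=0.5, size=(60, 24))
X_high = rng.normal(loc=3.0, scale=0.5, size=(60, 24))
X = np.vstack([X_low, X_high])
y = np.array([0] * 60 + [1] * 60)  # 1 = high-value, 0 = not high-value

model, cv_accuracy = train_high_value_model(X, y)

# Inference as in step S102: one feature vector in, one 0/1 decision out.
prediction = model.predict(np.full((1, 24), 3.0))[0]
```

In practice the grid would be widened or the labeled sample set enlarged whenever the threshold is not met, which is one way to read the "until the recognition accuracy reaches a set threshold" condition.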

[0049] In the above embodiments, sample vulnerability information labeled with whether it is a high-value vulnerability can be obtained in the following way: Sample vulnerability information is obtained from a vulnerability crowdsourcing platform; the sample vulnerability information is sent to a preset first labeling platform; and the sample vulnerability information labeled with whether it is a high-value vulnerability is received from the first labeling platform. The first labeling platform labels the sample vulnerability information with whether it is a high-value vulnerability in response to user instructions. For example, high-value vulnerabilities are labeled as 1, and non-high-value vulnerabilities are labeled as 0. Vulnerability crowdsourcing platforms include, for example, WooYun Crowdsourcing and Vulnerability Box.

[0050] Optionally, the content of the sample vulnerability information can correspond to the vulnerability feature items in the feature catalog.

[0051] Optionally, engineers can set thresholds based on accuracy requirements.

[0052] In some embodiments, before determining whether the vulnerability to be detected is a high-value vulnerability based on the vulnerability information using the preset high-value vulnerability model, the process includes: determining whether the vulnerability information exists in a preset whitelist; if the vulnerability information exists in the whitelist, outputting that the vulnerability to be detected is a high-value vulnerability; if the vulnerability information does not exist in the whitelist, determining whether the vulnerability to be detected is a high-value vulnerability based on the vulnerability information using the preset high-value vulnerability model.

[0053] Optionally, the whitelist can store vulnerability information that belongs to high-value vulnerabilities, for example, the information that a vulnerability is classified as high-risk. In this case, if the vulnerability information of the vulnerability to be detected contains some or all of the vulnerability information stored in the whitelist, then the vulnerability information is considered to exist in the whitelist.

[0054] Optionally, the whitelist can store vulnerability information rules that belong to high-value vulnerabilities. In this case, if vulnerability information meets one of the vulnerability information rules in the whitelist, then the vulnerability information is considered to exist in the whitelist.

[0055] For example, as shown in Table 2, which is an example table of rules in the whitelist, a rule could be: "belongs to a high-risk vulnerability." For example, if the vulnerability to be detected is determined to be a high-risk vulnerability based on its vulnerability information, then the vulnerability to be detected is a high-value vulnerability. Specifically, the vulnerability level in the vulnerability information can be used to directly determine whether the vulnerability to be detected is a high-risk vulnerability.

[0056] Table 2

[0057] In some embodiments, before determining whether the vulnerability to be detected is a high-value vulnerability based on the vulnerability information using the preset high-value vulnerability model, the process includes: determining whether the vulnerability information exists in a preset blacklist; if the vulnerability information exists in the blacklist, outputting that the vulnerability to be detected is a low-value vulnerability; if the vulnerability information does not exist in the blacklist, determining whether the vulnerability to be detected is a high-value vulnerability based on the vulnerability information using the preset high-value vulnerability model.

[0058] Among them, the exploit value of low-value vulnerabilities is lower than that of high-value vulnerabilities.

[0059] Optionally, the blacklist stores vulnerability information that belongs to low-value vulnerabilities, for example, a ranking of 2 million on Alexa (a website that publishes global website rankings).

[0060] Optionally, the blacklist stores vulnerability information rules that fall under the category of low-value vulnerabilities.

[0061] For example, Table 3 is an example table of rules in the blacklist. As shown in Table 3, a rule is, for example, "Alexa ranking outside the top 1 million." In that case, the Alexa ranking is obtained from the vulnerability information of the vulnerability to be detected; if the ranking is outside the top 1 million, then the vulnerability to be detected is a low-value vulnerability.

[0062] Table 3

[0063] Optionally, before using a pre-defined high-value vulnerability model to determine whether a vulnerability to be detected is a high-value vulnerability based on the vulnerability information, a whitelist and a blacklist can be used together to filter the vulnerability information.
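The combined whitelist and blacklist pre-filter can be sketched with rules expressed as predicates. The two rules mirror the examples given for Table 2 and Table 3; the field names `vuln_level` and `alexa_rank`, the value `"high"`, and the order of evaluation (whitelist first) are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional

Rule = Callable[[Dict], bool]


def is_high_risk(info: Dict) -> bool:
    """Whitelist rule: the vulnerability level marks a high-risk vulnerability."""
    return info.get("vuln_level") == "high"


def is_outside_top_million(info: Dict) -> bool:
    """Blacklist rule: the Alexa ranking is outside the top 1 million."""
    rank = info.get("alexa_rank")
    return rank is not None and rank > 1_000_000


WHITELIST_RULES: List[Rule] = [is_high_risk]
BLACKLIST_RULES: List[Rule] = [is_outside_top_million]


def prefilter(info: Dict) -> Optional[str]:
    """Return a verdict if a list decides the case, else None."""
    if any(rule(info) for rule in WHITELIST_RULES):
        return "high-value"
    if any(rule(info) for rule in BLACKLIST_RULES):
        return "low-value"
    return None  # undecided: hand over to the high-value vulnerability model


verdict_high = prefilter({"vuln_level": "high", "alexa_rank": 2_000_000})
verdict_low = prefilter({"vuln_level": "medium", "alexa_rank": 2_000_000})
verdict_open = prefilter({"vuln_level": "medium", "alexa_rank": 500})
```

Only the undecided cases reach the model, which is how the lists reduce its workload.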

[0064] Step S103: Output the vulnerability to be detected as a high-value vulnerability.

[0065] Step S104: Use a preset anomaly detection model to identify the vulnerability to be detected and determine whether it is a rare vulnerability. If the vulnerability to be detected is not a rare vulnerability, proceed to step S105; if the vulnerability to be detected is a rare vulnerability, proceed to step S106.

[0066] In some embodiments, the vulnerability information of the vulnerability to be detected is input into a preset anomaly detection model to obtain whether the vulnerability to be detected corresponding to the vulnerability information is a rare vulnerability.

[0067] In some embodiments, the anomaly detection model can be obtained by inputting sample vulnerability information labeled with whether it is a rare vulnerability into a preset isolation forest model for training.

[0068] In some embodiments, sample vulnerability information labeled with whether it belongs to rare vulnerabilities can be obtained by: obtaining sample vulnerability information from a vulnerability crowdsourcing platform, sending the sample vulnerability information to a preset second labeling platform, and obtaining sample vulnerability information labeled with whether it belongs to rare vulnerabilities from the second labeling platform. The second labeling platform labels the sample vulnerability information with whether it belongs to rare vulnerabilities in response to user instructions.
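A minimal sketch of the rare-vulnerability check in step S104, using scikit-learn's `IsolationForest`, is given below. An isolation forest is fitted without labels, so this sketch assumes the rare/non-rare labels are used only to set the `contamination` parameter (the expected share of rare samples); that use of the labels, the library, and the synthetic data are assumptions rather than details from this application.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic stand-in for sample vulnerability feature vectors.
rng = np.random.default_rng(0)
common_samples = rng.normal(loc=0.0, scale=1.0, size=(200, 4))
rare_samples = np.array(
    [
        [10.0, 10.0, 10.0, 10.0],
        [-10.0, -10.0, -10.0, -10.0],
        [10.0, -10.0, 10.0, -10.0],
        [-10.0, 10.0, -10.0, 10.0],
    ]
)
X_train = np.vstack([common_samples, rare_samples])
is_rare_label = np.array([0] * len(common_samples) + [1] * len(rare_samples))

# Assumption: the labels only tell the model what fraction of samples is rare.
contamination = float(is_rare_label.mean())

detector = IsolationForest(
    n_estimators=100, contamination=contamination, random_state=0
)
detector.fit(X_train)


def is_rare_vulnerability(feature_vector) -> bool:
    """IsolationForest.predict returns -1 for outliers and 1 for inliers."""
    sample = np.asarray(feature_vector, dtype=float).reshape(1, -1)
    return bool(detector.predict(sample)[0] == -1)


rare_result = is_rare_vulnerability([10.0, 10.0, 10.0, 10.0])
common_result = is_rare_vulnerability([0.0, 0.0, 0.0, 0.0])
```

The labels could also serve to evaluate the fitted detector against a held-out set, which this sketch leaves out.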

[0069] Step S105: Output the vulnerability to be detected as a common vulnerability.

[0070] Step S106: Analyze whether the vulnerability to be detected belongs to a high-value vulnerability cluster in the clustering model using a preset clustering model. If it belongs to a high-value vulnerability cluster, proceed to step S107; otherwise, proceed to step S108.

[0071] In some embodiments, the clustering model is obtained by: acquiring sample vulnerability information labeled with high-value vulnerabilities and low-value vulnerabilities; using a preset clustering algorithm to cluster the sample vulnerability information labeled with high-value vulnerabilities and low-value vulnerabilities to obtain candidate clustering models; and adjusting the parameters of the clustering algorithm until the silhouette coefficient of the candidate clustering model reaches the preset coefficient, and then determining the candidate clustering model as the clustering model.

[0072] In the above embodiments, clustering algorithms include, for example, hierarchical clustering, K-Means clustering, and DBSCAN clustering (density-based clustering).

[0073] For example, assuming the clustering algorithm is a hierarchical clustering algorithm, adjusting the parameters of the clustering algorithm can involve adjusting the metric, linkage, and n_clusters (the number of clusters generated). The metric can be Euclidean distance, Manhattan distance, cosine similarity, precomputed, etc. The linkage can be ward, complete, average, single, etc. Here, ward minimizes the variance of the clusters to be merged; complete uses the furthest neighbor, taking the distance between the farthest neighbors as the inter-cluster distance; average uses the average distance of all distances between clusters as the inter-cluster distance; and single uses the nearest neighbor, taking the distance between the closest neighbors as the inter-cluster distance.

[0074] For example, vulnerability information matching the vulnerability feature items in Table 1 above is normalized. After normalization, a hierarchical clustering algorithm is used for clustering, and the metric, linkage, and n_clusters parameters of the hierarchical clustering algorithm are adjusted until the silhouette coefficient reaches a preset coefficient, thus obtaining a clustering model. The preset coefficient can be 1. Here, the silhouette coefficient is used to evaluate the tightness and separation of the clusters. After clustering, the silhouette coefficient equals 1 when each cluster contains only high-value vulnerabilities or only low-value vulnerabilities. The more often high-value and low-value vulnerabilities fall into the same cluster, the lower the silhouette coefficient. Engineers can set the preset coefficient according to actual accuracy requirements.
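The tuning loop described above can be sketched with scikit-learn's `AgglomerativeClustering`, `silhouette_score`, and `MinMaxScaler`. This is an illustrative sketch under several assumptions: the library itself, min-max scaling as the normalization, the candidate grid, a preset coefficient of 0.8, and synthetic data. The metric is left at its Euclidean default here because the keyword for it has changed between scikit-learn releases and because `ward` linkage only supports Euclidean distance. Note also that `silhouette_score` is computed from feature-space distances only and does not look at the high-value or low-value labels.

```python
from itertools import product

import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import MinMaxScaler


def fit_clustering_model(X, preset_coefficient):
    """Try linkage / n_clusters candidates until the silhouette coefficient
    reaches the preset coefficient; otherwise return the best candidate seen."""
    X_norm = MinMaxScaler().fit_transform(X)  # normalization step (assumed min-max)
    linkages = ("ward", "complete", "average", "single")
    cluster_counts = (2, 3, 4)
    best = None
    for linkage, n_clusters in product(linkages, cluster_counts):
        labels = AgglomerativeClustering(
            n_clusters=n_clusters, linkage=linkage
        ).fit_predict(X_norm)
        score = silhouette_score(X_norm, labels)
        candidate = {
            "linkage": linkage,
            "n_clusters": n_clusters,
            "silhouette": score,
            "labels": labels,
        }
        if best is None or score > best["silhouette"]:
            best = candidate
        if score >= preset_coefficient:
            return candidate
    return best


# Synthetic stand-in: three tight, well-separated groups of feature vectors.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
X = np.vstack([rng.normal(loc=c, scale=0.1, size=(20, 2)) for c in centers])

result = fit_clustering_model(X, preset_coefficient=0.8)
```

Deciding which of the resulting clusters count as high-value vulnerability clusters would then rely on the labels of their member samples; how that is done is not specified here.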

[0075] In some embodiments, analyzing whether a vulnerability to be detected belongs to a high-value vulnerability cluster in the clustering model using a preset clustering model includes: determining whether a high-value vulnerability cluster exists among clusters whose distance to the vulnerability to be detected is less than a preset distance; if it exists, determining that the vulnerability to be detected belongs to the high-value vulnerability cluster; if it does not exist, determining that the vulnerability to be detected does not belong to the high-value vulnerability cluster.

[0076] In the above embodiments, a first feature vector representing vulnerability information of the vulnerability to be detected and a second feature vector representing a high-value vulnerability cluster can be obtained; for each second feature vector: the distance between the second feature vector and the first feature vector is calculated, and the distance is used as the distance between the high-value vulnerability cluster corresponding to the second feature vector and the vulnerability to be detected.
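As a non-limiting illustration, the distance comparison of paragraphs [0075] and [0076] can be sketched as follows. The sketch assumes Euclidean distance and assumes each cluster is represented by a single vector (for example, its centroid) paired with a flag marking it as high-value; the function name and this data layout are hypothetical.

```python
import math

def distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def belongs_to_high_value_cluster(first_vector, clusters, preset_distance):
    """Return True if a high-value cluster lies within the preset distance.

    clusters is a list of (second_vector, is_high_value) pairs, where
    second_vector is the assumed representative vector of one cluster.
    """
    for second_vector, is_high_value in clusters:
        if is_high_value and distance(first_vector, second_vector) < preset_distance:
            return True
    return False

# Hypothetical cluster representatives: one high-value, one low-value.
clusters = [((1.0, 1.0), True), ((8.0, 8.0), False)]
```

With these hypothetical values, a vulnerability at `(1.5, 1.0)` lies within a preset distance of `1.0` of the high-value cluster, whereas one at `(8.0, 7.5)` is close only to the low-value cluster and is therefore not assigned to a high-value cluster.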

[0077] Step S107: Output the vulnerability to be detected as a high-value vulnerability.

[0078] Step S108: Output the vulnerability to be detected as a common vulnerability.

[0079] For example, the process involves acquiring the vulnerability to be detected and its corresponding vulnerability information. A whitelist is used to assess the vulnerability information and determine whether the vulnerability is a high-value vulnerability. If the vulnerability is not a high-value vulnerability, a blacklist is used to assess the vulnerability information and determine whether it is a low-value vulnerability. If the vulnerability is not a low-value vulnerability, a preset high-value vulnerability model is used to determine, based on the vulnerability information, whether it is a high-value vulnerability. If the vulnerability is not a high-value vulnerability, a preset anomaly detection model is used to identify the vulnerability and determine whether it is a rare vulnerability. If the vulnerability is rare, a clustering model including multiple high-value vulnerability clusters is acquired to determine whether a high-value vulnerability cluster exists among the clusters whose distance to the vulnerability is less than a preset distance. If such a cluster exists, the vulnerability is determined to belong to that high-value vulnerability cluster. Because the whitelist cannot comprehensively record information about high-value vulnerabilities, and the blacklist cannot comprehensively record information about low-value vulnerabilities, filtering with both the whitelist and the blacklist together with the high-value vulnerability model can identify high-value vulnerabilities that would otherwise go unidentified. Subsequently, by using the anomaly detection model and the clustering model, new types of high-value vulnerabilities can be further screened out, thereby improving the accuracy of identifying high-value vulnerabilities.
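As a non-limiting illustration, the order of the checks in the example above can be sketched as follows. The three models are passed in as plain callables, and the vulnerability information is assumed to be a hashable key so that it can be looked up in the whitelist and the blacklist; the function name, the label strings, and this interface are hypothetical.

```python
HIGH, LOW, COMMON = "high-value", "low-value", "common"

def classify(info, whitelist, blacklist, is_high_value, is_rare, near_high_value_cluster):
    """Apply the checks in the order given in the example and return a label.

    is_high_value, is_rare and near_high_value_cluster stand in for the
    high-value vulnerability model, the anomaly detection model and the
    clustering-model distance check respectively (assumed interfaces).
    """
    if info in whitelist:
        return HIGH
    if info in blacklist:
        return LOW
    if is_high_value(info):
        return HIGH
    if not is_rare(info):
        return COMMON
    return HIGH if near_high_value_cluster(info) else COMMON

# Hypothetical stand-ins for the lists and the three models.
whitelist = {"known-rce"}
blacklist = {"self-xss"}
model = lambda info: info == "auth-bypass"
anomaly = lambda info: info.startswith("novel")
cluster_check = lambda info: info == "novel-chain"
```

Under these stand-ins, `"novel-chain"` is missed by the whitelist, the blacklist, and the high-value vulnerability model, but is flagged as rare and then matched to a high-value vulnerability cluster, so it is output as a high-value vulnerability.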

[0080] Example 2

[0081] Based on the same inventive concept, this application provides a vulnerability detection device, as shown in FIG. 2. The vulnerability detection device includes: an acquisition module 1, a first high-value determination module 2, a rare vulnerability determination module 3, and a second high-value determination module 4. The acquisition module 1 is used to acquire the vulnerability to be detected and its corresponding vulnerability information. The first high-value determination module 2 is used to determine, using a preset high-value vulnerability model and based on the vulnerability information, whether the vulnerability to be detected is a high-value vulnerability; if the vulnerability to be detected is a high-value vulnerability, it outputs that the vulnerability to be detected is a high-value vulnerability. The rare vulnerability determination module 3 is used, if the vulnerability to be detected is not a high-value vulnerability, to identify the vulnerability to be detected using a preset anomaly detection model and determine whether the vulnerability to be detected is a rare vulnerability; if it is not a rare vulnerability, it outputs that the vulnerability to be detected is a common vulnerability. The second high-value determination module 4 is used, if the vulnerability to be detected is a rare vulnerability, to analyze using a preset clustering model whether the vulnerability to be detected belongs to a high-value vulnerability cluster in the clustering model; if it belongs to a high-value vulnerability cluster, it outputs that the vulnerability to be detected is a high-value vulnerability; if it does not belong to a high-value vulnerability cluster, it outputs that the vulnerability to be detected is a common vulnerability.

[0082] In some embodiments, the acquisition module 1 is used to acquire vulnerability information corresponding to the vulnerability to be detected in the following ways: acquiring a preset feature directory, which includes multiple vulnerability feature items; parsing the vulnerability to be detected to acquire first feature information corresponding to the vulnerability feature item; extracting second feature information corresponding to the vulnerability feature item from the white-hat profile of the submitter who provided the vulnerability to be detected; extracting third feature information corresponding to the vulnerability feature item from a preset third-party platform; and using the first feature information, second feature information, and third feature information as vulnerability information.
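As a non-limiting illustration, the assembly of vulnerability information from the three sources can be sketched as follows. Each source is assumed to be already available as a dictionary keyed by feature item; the feature item names (`vulnerability_type`, `submitter_rank`, `cvss_score`) and the function names are hypothetical and do not come from this application.

```python
def collect(feature_directory, source):
    """Keep only the entries of source that match a vulnerability feature item."""
    return {item: source[item] for item in feature_directory if item in source}

def build_vulnerability_info(feature_directory, parsed_vulnerability,
                             white_hat_profile, third_party_record):
    """Combine first, second and third feature information into one record."""
    return {
        "first": collect(feature_directory, parsed_vulnerability),
        "second": collect(feature_directory, white_hat_profile),
        "third": collect(feature_directory, third_party_record),
    }

# Hypothetical feature directory and source data.
feature_directory = ["vulnerability_type", "submitter_rank", "cvss_score"]
parsed = {"vulnerability_type": "sql-injection", "title": "ignored field"}
profile = {"submitter_rank": 12}
third_party = {"cvss_score": 9.1}

info = build_vulnerability_info(feature_directory, parsed, profile, third_party)
```

Fields that are not listed in the feature directory, such as `title` here, are dropped, so the resulting record contains only the preset vulnerability feature items.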

[0083] In some embodiments, the first high-value determination module 2 is further configured to, before using the preset high-value vulnerability model to determine, based on the vulnerability information, whether the vulnerability to be detected is a high-value vulnerability: determine whether the vulnerability information exists in a preset whitelist, the whitelist storing vulnerability information that belongs to high-value vulnerabilities; if the vulnerability information exists in the whitelist, output that the vulnerability information belongs to a high-value vulnerability; if the vulnerability information does not exist in the whitelist, use the preset high-value vulnerability model to determine, based on the vulnerability information, whether the vulnerability to be detected is a high-value vulnerability.

[0084] In some embodiments, the first high-value determination module 2 is further configured to, before using the preset high-value vulnerability model to determine, based on the vulnerability information, whether the vulnerability to be detected is a high-value vulnerability: determine whether the vulnerability information exists in a preset blacklist, the blacklist storing vulnerability information that belongs to low-value vulnerabilities, the exploit value of a low-value vulnerability being lower than that of a high-value vulnerability; if the vulnerability information exists in the blacklist, output that the vulnerability information belongs to a low-value vulnerability; if the vulnerability information does not exist in the blacklist, use the preset high-value vulnerability model to determine, based on the vulnerability information, whether the vulnerability to be detected is a high-value vulnerability.

[0085] In some embodiments, the vulnerability detection apparatus further includes: a model training module, configured to acquire sample vulnerability information labeled as high-value vulnerabilities; train a preset random forest model using the sample vulnerability information labeled as high-value vulnerabilities; and update the network parameters of the random forest model using a preset grid search algorithm until the recognition accuracy of the random forest model reaches a set threshold, thereby obtaining a high-value vulnerability model.
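As a non-limiting illustration, a grid search that stops once the recognition accuracy reaches a set threshold can be sketched as follows. The sketch does not train a random forest: `toy_score` is a stand-in for training the model with the given parameters and measuring its accuracy, and the parameter names `n_estimators` and `max_depth` are assumed examples of random forest parameters, not values taken from this application.

```python
from itertools import product

def grid_search(param_grid, train_and_score, threshold):
    """Try every parameter combination until the score reaches the threshold.

    train_and_score is an assumed callable that trains a model with the
    given parameters and returns its recognition accuracy.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_score(params)
        if score > best_score:
            best_params, best_score = params, score
        if best_score >= threshold:
            break  # accuracy requirement met; stop searching
    return best_params, best_score

def toy_score(params):
    """Stand-in accuracy function; a real one would train and evaluate a model."""
    return 0.6 + 0.2 * (params["max_depth"] == 8) + 0.15 * (params["n_estimators"] == 100)

grid = {"max_depth": [4, 8], "n_estimators": [50, 100]}
best_params, best_score = grid_search(grid, toy_score, threshold=0.9)
```

If no combination reaches the threshold, the function returns the best combination found, leaving the caller to decide whether to widen the grid.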

[0086] In some embodiments, the second high-value determination module 4 is used to analyze whether the vulnerability to be detected belongs to a high-value vulnerability cluster in the clustering model by using a preset clustering model in the following manner: determining whether there is a high-value vulnerability cluster among the clusters that are less than a preset distance from the vulnerability to be detected; if there is, determining that the vulnerability to be detected belongs to the high-value vulnerability cluster; if there is no such cluster, determining that the vulnerability to be detected does not belong to the high-value vulnerability cluster.

[0087] In some embodiments, the vulnerability detection apparatus further includes: a clustering module, configured to acquire sample vulnerability information labeled with high-value vulnerabilities and low-value vulnerabilities; cluster the sample vulnerability information labeled with high-value vulnerabilities and low-value vulnerabilities using a preset clustering algorithm to obtain candidate clustering models; and adjust the parameters of the clustering algorithm until the silhouette coefficient of the candidate clustering model reaches a preset coefficient, and determine the candidate clustering model as the clustering model.

[0088] It is understood that the implementations described in Example 1 also apply to Example 2 where they do not conflict. For the sake of brevity, they are not repeated here.

[0089] Example 3

[0090] Referring to Figure 3, this embodiment of the application provides an electronic device, including a processor 6 and a memory 7. Optionally, the device may further include a communication interface 8 and a bus 5. The processor 6, communication interface 8, and memory 7 can communicate with each other via the bus 5. The communication interface 8 can be used for information transmission. The processor 6 can invoke logical instructions in the memory 7 to execute the aforementioned vulnerability detection method.

[0091] Furthermore, the logical instructions in the aforementioned memory 7 can be implemented as software functional units and, when sold or used as independent products, can be stored in a computer-readable storage medium.

[0092] The memory 7, as a computer-readable storage medium, can be used to store software programs and computer-executable programs, such as program instructions / modules corresponding to the methods in the embodiments of this application. The processor 6 executes the program instructions / modules stored in the memory 7 to perform functional applications and data processing, that is, to implement the vulnerability detection method in the above embodiments.

[0093] The memory 7 may include a program storage area and a data storage area. The program storage area may store the operating system and applications required for at least one function; the data storage area may store data created based on the use of the terminal device. Furthermore, the memory 7 may include high-speed random access memory and may also include non-volatile memory.

[0094] The electronic device may be, for example, a computer or a server.

[0095] This application provides a storage medium storing computer-executable instructions configured to perform the aforementioned vulnerability detection method.

[0096] This application provides a computer program product, which includes a computer program stored on a storage medium. The computer program includes program instructions, which, when executed by a computer, cause the computer to perform the aforementioned vulnerability detection method.

[0097] The aforementioned computer-readable storage medium may be a transient computer-readable storage medium or a non-transitory computer-readable storage medium.

[0098] The technical solutions of the embodiments of this application can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes one or more instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the method described in the embodiments of this application. The storage medium can be a non-transitory storage medium, including various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory, random access memory, magnetic disks, or optical disks; it can also be a transient storage medium.

[0099] In the embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division of units is only a logical functional division, and there may be other division methods in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.

[0100] The above descriptions are merely embodiments of this application and are not intended to limit the scope of protection of this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of protection of this application. Furthermore, the above embodiments can be combined with each other to form new embodiments without conflict.

Claims

1. A vulnerability detection method, characterized in that it comprises: obtaining a vulnerability to be detected and vulnerability information corresponding to the vulnerability to be detected; using a preset high-value vulnerability model to determine, based on the vulnerability information, whether the vulnerability to be detected is a high-value vulnerability; if the vulnerability to be detected is a high-value vulnerability, outputting that the vulnerability to be detected is a high-value vulnerability; if the vulnerability to be detected is not a high-value vulnerability, identifying the vulnerability to be detected using a preset anomaly detection model to determine whether the vulnerability to be detected is a rare vulnerability; if it is not a rare vulnerability, outputting the vulnerability to be detected as a common vulnerability; if the vulnerability to be detected is a rare vulnerability, using a preset clustering model to analyze whether the vulnerability to be detected belongs to a high-value vulnerability cluster in the clustering model; if it belongs to the high-value vulnerability cluster, outputting that the vulnerability to be detected is a high-value vulnerability; and if it does not belong to the high-value vulnerability cluster, outputting the vulnerability to be detected as a common vulnerability.

2. The method according to claim 1, characterized in that obtaining the vulnerability information corresponding to the vulnerability to be detected comprises: obtaining a preset feature directory, which includes multiple vulnerability feature items; parsing the vulnerability to be detected to obtain first feature information corresponding to the vulnerability feature items; extracting second feature information corresponding to the vulnerability feature items from the white-hat profile of the submitter who provided the vulnerability to be detected; extracting third feature information corresponding to the vulnerability feature items from a preset third-party platform; and using the first feature information, the second feature information, and the third feature information as the vulnerability information.

3. The method according to claim 1, characterized in that, before using the preset high-value vulnerability model to determine, based on the vulnerability information, whether the vulnerability to be detected is a high-value vulnerability, the method comprises: determining whether the vulnerability information exists in a preset whitelist, the whitelist storing vulnerability information that belongs to high-value vulnerabilities; if the vulnerability information exists in the whitelist, outputting that the vulnerability information belongs to a high-value vulnerability; and if the vulnerability information does not exist in the whitelist, using the preset high-value vulnerability model to determine, based on the vulnerability information, whether the vulnerability to be detected is a high-value vulnerability.

4. The method according to claim 1, characterized in that, before using the preset high-value vulnerability model to determine, based on the vulnerability information, whether the vulnerability to be detected is a high-value vulnerability, the method comprises: determining whether the vulnerability information exists in a preset blacklist, the blacklist storing vulnerability information that belongs to low-value vulnerabilities, the exploit value of a low-value vulnerability being lower than that of a high-value vulnerability; if the vulnerability information exists in the blacklist, outputting that the vulnerability information belongs to a low-value vulnerability; and if the vulnerability information does not exist in the blacklist, using the preset high-value vulnerability model to determine, based on the vulnerability information, whether the vulnerability to be detected is a high-value vulnerability.

5. The method according to claim 1, characterized in that the high-value vulnerability model is obtained by: obtaining sample vulnerability information labeled with whether it is a high-value vulnerability; training a preset random forest model using the labeled sample vulnerability information; and updating the network parameters of the random forest model using a preset grid search algorithm until the recognition accuracy of the random forest model reaches a set threshold, thereby obtaining the high-value vulnerability model.

6. The method according to any one of claims 1 to 5, characterized in that the clustering model includes multiple high-value vulnerability clusters, and using the preset clustering model to analyze whether the vulnerability to be detected belongs to a high-value vulnerability cluster in the clustering model comprises: determining whether a high-value vulnerability cluster exists among the clusters whose distance to the vulnerability to be detected is less than a preset distance; if it exists, determining that the vulnerability to be detected belongs to the high-value vulnerability cluster; and if it does not exist, determining that the vulnerability to be detected does not belong to the high-value vulnerability cluster.

7. The method according to any one of claims 1 to 5, characterized in that the clustering model is obtained by: obtaining sample vulnerability information labeled with high-value vulnerabilities and low-value vulnerabilities; clustering the sample vulnerability information labeled with high-value vulnerabilities and low-value vulnerabilities using a preset clustering algorithm to obtain candidate clustering models; and adjusting the parameters of the clustering algorithm until the silhouette coefficient of the candidate clustering model reaches a preset coefficient, and determining the candidate clustering model as the clustering model.

8. A vulnerability detection device, characterized in that it comprises: an acquisition module, used to acquire a vulnerability to be detected and vulnerability information corresponding to the vulnerability to be detected; a first high-value determination module, used to determine, using a preset high-value vulnerability model and based on the vulnerability information, whether the vulnerability to be detected is a high-value vulnerability, and, if the vulnerability to be detected is a high-value vulnerability, to output that the vulnerability to be detected is a high-value vulnerability; a rare vulnerability determination module, used, if the vulnerability to be detected is not a high-value vulnerability, to identify the vulnerability to be detected using a preset anomaly detection model and determine whether the vulnerability to be detected is a rare vulnerability, and, if it is not a rare vulnerability, to output that the vulnerability to be detected is a common vulnerability; and a second high-value determination module, used, if the vulnerability to be detected is a rare vulnerability, to analyze using a preset clustering model whether the vulnerability to be detected belongs to a high-value vulnerability cluster in the clustering model, to output that the vulnerability to be detected is a high-value vulnerability if it belongs to the high-value vulnerability cluster, and to output that the vulnerability to be detected is a common vulnerability if it does not belong to the high-value vulnerability cluster.

9. An electronic device, characterized in that the electronic device includes a processor and a memory, the memory storing computer-executable instructions that can be executed by the processor, and the processor executing the computer-executable instructions to implement the vulnerability detection method according to any one of claims 1 to 7.

10. A storage medium, characterized in that the storage medium stores computer-executable instructions which, when invoked and executed by a processor, cause the processor to implement the vulnerability detection method according to any one of claims 1 to 7.