Mining evaluation method based on big data vulnerability mining
By employing big data vulnerability mining and assessment methods, combined with data preprocessing, feature extraction, and the Apriori algorithm, the severity of vulnerabilities is identified and assessed. This addresses the problem of inaccurate vulnerability identification in existing technologies and improves the security and stability of the system.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- INFORMATION & COMMNUNICATION BRANCH STATE GRID JIANGXI ELECTRIC POWER CO
- Filing Date
- 2023-10-14
- Publication Date
- 2026-06-12
AI Technical Summary
Existing technologies struggle to accurately extract features related to vulnerabilities, posing a risk of missing or incorrectly extracting features. This results in serious false positives and false negatives during vulnerability identification and assessment, compromising system security and stability.
The vulnerability mining and assessment method based on big data includes steps such as data preprocessing, feature extraction, data mining algorithm analysis, vulnerability identification and assessment, visualization, and remediation verification. Combined with the Apriori algorithm and risk model, the severity and potential impact of vulnerabilities are identified and assessed, and the remediation effect is verified through a security coefficient.
It effectively identifies potential vulnerabilities in the system, improves the accuracy of vulnerability remediation and the security and stability of the system, and ensures the security assessment and verification of the system after vulnerability remediation.
Figure CN117421735B_ABST (abstract drawing; image not reproduced)
Description
Technical Field
[0001] This invention relates to the field of vulnerability mining and evaluation technology, and more specifically to a mining and evaluation method based on big data vulnerability mining.
Background Technology
[0002] Security vulnerabilities are weaknesses in a system or application that can be exploited by attackers. Vulnerabilities can lead to security incidents such as data breaches, system crashes, and malicious intrusions, seriously endangering an organization's business operations and user privacy. Data mining technology is the process of discovering hidden patterns, information, and correlations in large-scale datasets. In the security field, data mining technology can be used to discover abnormal behavior, identify potential threats, and discover vulnerabilities.
[0003] The vulnerability mining and assessment method based on big data refers to the use of data mining technology in a big data environment to discover and assess security vulnerabilities in systems, applications, or networks. The goal of big data vulnerability mining and assessment is to discover potential vulnerabilities in a timely manner so as to fix them and improve system security.
[0004] The existing technology has the following shortcomings:
[0005] 1. Existing technologies may have difficulty accurately extracting features related to vulnerabilities, especially for complex and covert vulnerability patterns, and there is a risk of missing or incorrectly extracting features;
[0006] 2. Existing technologies may produce false positives and false negatives during vulnerability identification and assessment, which can lead to inaccurate assessments of the severity and potential impact of vulnerabilities. In addition, after vulnerability identification and remediation there is no security assessment and verification process for the system, so the effectiveness of the remediation cannot be evaluated and the stability and security of system operation cannot be guaranteed.
Summary of the Invention
[0007] The purpose of this invention is to provide a mining and evaluation method based on big data vulnerability mining to address the shortcomings of the prior art.
[0008] To achieve the above objectives, the present invention provides the following technical solution: a mining and evaluation method based on big data vulnerability mining, wherein the evaluation method includes the following steps:
[0009] S1: Collect and acquire data related to the system, application, or network, and preprocess the data;
[0010] S2: Extract vulnerability-related features from the collected data;
[0011] S3: Analyze and mine vulnerability patterns in the data using data mining algorithms to identify potential vulnerabilities, and determine the severity and potential impact of the vulnerabilities by evaluating and analyzing the mining results;
[0012] S4: Visualize the mining results and generate corresponding reports;
[0013] S5: Based on the recommendations in the report, fix the identified vulnerabilities, and then verify and test them.
[0014] S6: Regularly monitor the security status of the system, detect new vulnerabilities and threats, and improve the vulnerability mining and assessment method based on assessment results and actual security incidents.
[0015] Preferably, in step S3, identifying potential vulnerabilities includes the following steps:
[0016] S3.1: Transform the data into the transaction set format required by the Apriori algorithm;
[0017] S3.2: Set the minimum support threshold for the Apriori algorithm to filter infrequent itemsets;
[0018] S3.3: Run the Apriori algorithm to generate frequent itemsets;
[0019] S3.4: Generate association rules based on frequent itemsets and filter out the association rules that meet the requirements;
[0020] S3.5: Analyze the association rules obtained from the mining and filter out the association rules related to the vulnerability;
[0021] S3.6: Understand the characteristics of the vulnerability based on the itemsets and rule attributes in the association rules.
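Steps S3.1 to S3.3 can be illustrated with a minimal, pure-Python sketch of level-wise Apriori. This is not the patent's implementation: the function name `apriori`, the event labels, and the 0.5 support threshold are all hypothetical, chosen only to show how infrequent itemsets are filtered.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support >= min_support."""
    tx = [frozenset(t) for t in transactions]
    n = len(tx)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in tx if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in tx for item in t}
    current = {frozenset([i]) for i in items}
    current = {c for c in current if support(c) >= min_support}

    frequent = {}
    k = 1
    while current:
        for c in current:
            frequent[c] = support(c)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical transaction set: one set of observed event labels per host.
transactions = [
    {"open_port_23", "weak_password", "failed_login_burst"},
    {"open_port_23", "weak_password"},
    {"outdated_tls", "failed_login_burst"},
    {"open_port_23", "weak_password", "outdated_tls"},
]
frequent = apriori(transactions, min_support=0.5)
```

With this toy data, the only multi-item frequent itemset is the pair of `open_port_23` and `weak_password`; raising `min_support` would shrink the result further.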
[0022] Preferably, in step S3, determining the severity and potential impact of the vulnerability by evaluating and analyzing the mining results includes the following steps:
[0023] S3.7: Organize the vulnerability results obtained from the discovery, and define indicators for assessing the severity and potential impact of vulnerabilities based on actual needs and security standards;
[0024] S3.8: Based on defined metrics, use a risk model to assess each vulnerability and determine its severity and potential impact;
[0025] S3.9: Conduct risk analysis on the vulnerabilities identified in the assessment, and evaluate the correlation and mutual impact between different vulnerabilities.
[0026] Preferably, in step S3.8, assessing each vulnerability using a risk model includes the following steps:
[0027] S3.8.1: Define a risk model suitable for the organization based on actual needs and safety standards;
[0028] S3.8.2: Based on the risk model, determine the assessment metrics used to evaluate the severity and potential impact of vulnerabilities;
[0029] S3.8.3: Collect data related to each vulnerability and calculate the values of assessment metrics to evaluate the severity of the vulnerability.
[0030] Preferably, in step S5, the verification and testing after repair includes the following steps:
[0031] Obtain the frequency of vulnerability scans, frequency of abnormal access, and dispersion index of security incident response time in the system;
[0032] The security coefficient $aq_x$ is obtained by comprehensively calculating the vulnerability scan occurrence frequency, the abnormal access frequency, and the security incident response time dispersion index. The calculation expression is:
[0033] [expression not reproduced in the source text]
[0034] In the formula, $sj_l$ is the security incident response time dispersion index, $ld_p$ is the vulnerability scan occurrence frequency, and $yc_p$ is the abnormal access frequency; α, β, and γ are the proportional coefficients of the security incident response time dispersion index, the vulnerability scan occurrence frequency, and the abnormal access frequency, respectively, and α, β, and γ are all greater than 0.
[0035] After the security coefficient $aq_x$ is obtained, it is compared with the security threshold. If $aq_x$ ≥ the security threshold, the system's security after vulnerability remediation is assessed as good; if $aq_x$ < the security threshold, the system's security after vulnerability remediation is assessed as poor.
[0036] Preferably, the security incident response time dispersion index $sj_l$ is calculated as follows:
[0037] The standard deviation of security incident response time, bzd, is calculated using the following expression:
[0038] [expression not reproduced in the source text]
[0039] In the formula, i = {1, 2, 3, ..., n}, where n is the total number of security incident response time data points and is a positive integer; $SJ_i$ is the response time of the i-th security incident; and $\overline{SJ}$ is the average response time of security incidents.
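The expression for bzd does not survive in this text. Assuming it is the usual population standard deviation over the n response times (an assumption; the original may instead divide by n − 1), and writing the mean as $\overline{SJ}$ (a symbol introduced here for illustration), it would read:

```latex
\overline{SJ} = \frac{1}{n}\sum_{i=1}^{n} SJ_i,
\qquad
bzd = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(SJ_i - \overline{SJ}\right)^{2}}
```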
[0040] If the average response time of security incidents $\overline{SJ}$ ≤ the time threshold, and the standard deviation of the security incident response time bzd ≤ the standard threshold, $sj_l$ = 1.0;
[0041] If the average response time of security incidents $\overline{SJ}$ ≤ the time threshold, and the standard deviation of the security incident response time bzd > the standard threshold, $sj_l$ = 0.8;
[0042] If the average response time of security incidents $\overline{SJ}$ > the time threshold, and the standard deviation of the security incident response time bzd > the standard threshold, $sj_l$ = 0.6;
[0043] If the average response time of security incidents $\overline{SJ}$ > the time threshold, and the standard deviation of the security incident response time bzd ≤ the standard threshold, $sj_l$ = 0.4.
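The four cases can be sketched as a small lookup. This sketch assumes that the cases partition on whether the mean exceeds the time threshold and whether bzd exceeds the standard threshold, with the values 1.0, 0.8, 0.6, and 0.4 assigned in the order listed, and that bzd is the population standard deviation. The function name and the sample numbers are hypothetical.

```python
from statistics import mean, pstdev


def dispersion_index(response_times, time_threshold, std_threshold):
    """Map the mean and standard deviation of response times to sj_l."""
    avg = mean(response_times)
    bzd = pstdev(response_times)  # population standard deviation (assumption)
    if avg <= time_threshold:
        return 1.0 if bzd <= std_threshold else 0.8
    # Mean above the time threshold: values follow the order given in the text.
    return 0.6 if bzd > std_threshold else 0.4


# Hypothetical response times in minutes, with thresholds of 15 and 1.
fast_and_steady = dispersion_index([10, 10, 10], time_threshold=15, std_threshold=1)
fast_but_erratic = dispersion_index([5, 15], time_threshold=15, std_threshold=1)
```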
[0044] Preferably, the logic for obtaining the frequency of vulnerability scans is as follows:
[0045] Use vulnerability scanning tools to regularly scan the system, automatically discover and report vulnerabilities in the system, including the number, severity and frequency of vulnerabilities, and obtain the frequency of vulnerability occurrence by analyzing vulnerability scanning reports.
[0046] Preferably, the logic for obtaining the abnormal access frequency is as follows:
[0047] Analyze the system's security event logs, including abnormal login attempts, abnormal access, and unauthorized access records, and statistically analyze the number and frequency of abnormal access events in the logs.
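As a rough illustration of the log statistics described above, the following sketch counts abnormal events by type and derives a per-hour frequency. The event labels, the `(timestamp, event_type)` record shape, and the 24-hour observation window are assumptions made for the example, not details given in the source.

```python
from collections import Counter

# Hypothetical labels for the abnormal event types named in the text.
ABNORMAL_TYPES = {"abnormal_login", "abnormal_access", "unauthorized_access"}


def abnormal_access_frequency(events, window_hours):
    """Return (count per abnormal type, abnormal events per hour)."""
    counts = Counter(kind for _, kind in events if kind in ABNORMAL_TYPES)
    total = sum(counts.values())
    return counts, total / window_hours


# Hypothetical log extract covering a 24-hour window.
events = [
    ("2023-10-01T08:00:00", "abnormal_login"),
    ("2023-10-01T08:05:00", "normal_login"),
    ("2023-10-01T13:30:00", "unauthorized_access"),
    ("2023-10-01T22:10:00", "abnormal_login"),
]
counts, per_hour = abnormal_access_frequency(events, window_hours=24)
```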
[0048] The technical effects and advantages provided by the present invention in the above technical solution are as follows:
[0049] 1. This invention extracts vulnerability-related features from collected data and analyzes and mines vulnerability patterns in the data using data mining algorithms to identify potential vulnerabilities. By evaluating and analyzing the mining results, the severity and potential impact of the vulnerabilities are determined, and the mining results are visualized to help security teams or managers better understand and analyze the vulnerability situation. Based on the recommendations in the report, the discovered vulnerabilities are repaired. After repair, verification and testing are required to ensure that the vulnerabilities are successfully repaired. The system is then subjected to security assessment and verification again. This assessment method not only effectively identifies potential vulnerabilities in the system, but also greatly improves the stability and security of system operation by conducting security assessment and verification again after the vulnerabilities are repaired.
[0050] 2. This invention obtains the security coefficient by comprehensively calculating the frequency of vulnerability scans, the frequency of abnormal access, and the dispersion index of security event response time in the system, thereby effectively improving data processing efficiency.
Description of the Attached Figure
[0051] To illustrate the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below illustrate only some embodiments of this invention; those skilled in the art can obtain other drawings based on them.
[0052] Figure 1 is a flowchart of the method of the present invention.
Detailed Implementation
[0053] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0054] Example 1: Referring to Figure 1, the mining and evaluation method based on big data vulnerability mining described in this embodiment includes the following steps:
[0055] A. Data Collection and Preparation: First, collect and acquire a large amount of data related to the system, application, or network. This data may include system logs, network traffic, configuration files, etc. Then, clean, transform, and preprocess the data to ensure its quality and consistency, including the following steps:
[0056] Identify the sources from which data needs to be collected and acquired, which may include system logs, network traffic records, configuration files, databases, etc. Use appropriate methods and tools to collect and extract data from these sources to obtain relevant information. This may involve techniques such as web scraping, log collection, and database querying. Clean the collected raw data to remove duplicates, missing data, noisy data, and inconsistent data. Various techniques can be used in the cleaning process, such as data normalization, deduplication, outlier detection and handling. Transform the cleaned data into a processable format and perform necessary data integration, including data format conversion, data normalization, and data integration, to ensure the consistency and usability of the data in subsequent analysis. Preprocess the transformed and integrated data to improve the effectiveness of subsequent data mining and analysis. Preprocessing steps may include feature selection, feature scaling, dimensionality reduction, and outlier handling.
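A minimal sketch of the cleaning and normalization steps, assuming records arrive as flat dictionaries. The field names `src` and `bytes` and both helper names are hypothetical, and min-max scaling is used here as just one of the normalization options the paragraph allows.

```python
def clean_records(records, required_fields):
    """Drop records with missing required fields, then drop exact duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        if any(rec.get(f) in (None, "") for f in required_fields):
            continue  # missing data
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue  # duplicate record
        seen.add(key)
        cleaned.append(rec)
    return cleaned


def min_max_scale(values):
    """Rescale numeric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw traffic records.
raw = [
    {"src": "10.0.0.1", "bytes": 100},
    {"src": "10.0.0.1", "bytes": 100},  # duplicate
    {"src": "", "bytes": 50},           # missing source address
    {"src": "10.0.0.2", "bytes": 300},
]
cleaned = clean_records(raw, required_fields=["src", "bytes"])
scaled = min_max_scale([r["bytes"] for r in cleaned])
```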
[0057] B. Vulnerability Feature Extraction: In this step, vulnerability-related features are extracted from the collected data. Vulnerability features may include abnormal network traffic patterns, abnormal system log events, configuration errors, etc. These features help identify potential vulnerabilities and include the following steps:
[0058] Feature Definition and Selection: Based on the characteristics and definitions of vulnerabilities, relevant features for vulnerability identification are determined. For example, abnormal network traffic patterns may involve sudden changes in traffic volume, the use of unconventional ports, etc. Appropriate technologies and algorithms are used to extract vulnerability-related features from the raw data. This may involve data processing, data analysis, feature engineering, and other technical means to identify and extract features with potential vulnerability relevance from the data. As needed, new features are constructed to better describe vulnerability patterns. Feature construction may include feature combination, feature transformation, feature derivation, and other operations to enhance vulnerability identification capabilities. The extracted features are selected and filtered to improve the accuracy and efficiency of vulnerability identification. Feature selection techniques can be used to eliminate irrelevant or redundant features, thereby extracting the most informative feature subset. The extracted features are appropriately represented and encoded for subsequent data mining and analysis. This may include standardization, normalization, discretization, and other operations on the features to ensure feature consistency and comparability.
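The two example features mentioned above, sudden changes in traffic volume and use of unconventional ports, could be derived roughly as follows. The port allow-list, the spike ratio of 3.0, and the flow record layout are illustrative assumptions rather than values from the source.

```python
# Hypothetical allow-list of conventional ports.
COMMON_PORTS = {22, 53, 80, 443}


def extract_features(flows, spike_ratio=3.0):
    """Derive simple vulnerability-related features from time-ordered flows."""
    volumes = [f["bytes"] for f in flows]
    unusual_ports = sorted({f["port"] for f in flows if f["port"] not in COMMON_PORTS})
    # A spike is a flow at least spike_ratio times larger than the previous one.
    spikes = sum(
        1
        for prev, cur in zip(volumes, volumes[1:])
        if prev > 0 and cur / prev >= spike_ratio
    )
    return {
        "unusual_ports": unusual_ports,
        "unusual_port_count": len(unusual_ports),
        "traffic_spikes": spikes,
    }


# Hypothetical flow records in time order.
flows = [
    {"port": 443, "bytes": 100},
    {"port": 443, "bytes": 120},
    {"port": 4444, "bytes": 900},
    {"port": 80, "bytes": 200},
]
features = extract_features(flows)
```

Features of this kind can then be discretized into item labels (for example `unusual_port` or `traffic_spike`) before association rule mining.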
[0059] C. Data Mining Algorithm Selection and Application: Select appropriate data mining algorithms to analyze and mine vulnerability patterns in data. Commonly used data mining algorithms include clustering, classification, association rule mining, etc. Choose the appropriate algorithm for vulnerability mining according to the actual situation.
[0060] Selecting the most suitable algorithm for vulnerability mining requires understanding common data mining algorithms, including clustering, classification, and association rule mining, together with their working principles, applicable scenarios, and characteristics. The goals and requirements of vulnerability mining, such as identifying different types of vulnerabilities and finding vulnerability patterns, should be clarified first, and the necessary data mining techniques and corresponding algorithms determined from them. The collected data should be prepared and preprocessed to ensure data quality and consistency. At the same time, vulnerability-related features should be selected and extracted according to the mining objectives, with feature selection used to reduce feature dimensionality and improve algorithm efficiency. A suitable data mining algorithm is then selected based on the mining objectives, data characteristics, and requirements. Commonly used algorithms include clustering algorithms (such as K-means and DBSCAN), classification algorithms (such as decision trees and support vector machines), and association rule mining algorithms (such as Apriori and FP-growth). The candidate algorithms are compared and evaluated on factors such as accuracy, efficiency, interpretability, and scalability. The selected algorithms are then applied to the dataset for vulnerability mining and, based on the actual situation, tuned and optimized to improve the accuracy and reliability of the mining results; adjusting algorithm parameters, adopting specific heuristic rules, or using ensemble learning methods can all improve the effectiveness of vulnerability mining. Finally, the mining results are analyzed and interpreted to understand the patterns, characteristics, and associations of the discovered vulnerabilities. This can include techniques such as visualization analysis, statistical analysis, and rule interpretation for further vulnerability assessment and processing.
[0061] D. Vulnerability Identification and Assessment: Analyze the data using selected data mining algorithms to identify potential vulnerabilities. By evaluating and analyzing the mining results, determine the severity and potential impact of the vulnerabilities. Assessment models or rules can be used to quantify the risk level of vulnerabilities.
[0062] E. Results Visualization and Reporting: Visualize the findings to help security teams or managers better understand and analyze vulnerabilities, generate corresponding reports, and detail the discovered vulnerabilities, risk assessments, and recommended remediation measures, including the following steps:
[0063] Based on the data mining results and the needs of the target audience, design appropriate visualization methods and layouts, and select suitable charts, graphs, and visualization interfaces to display vulnerability information in an intuitive and easy-to-understand way. Analyze and transform the mining results to convert the data into the format required for visualization; as needed, summarize, aggregate, filter, or sort the data to prepare the dataset for visualization. Use appropriate visualization tools and technologies to present the transformed data as charts, graphs, or visualization interfaces. Commonly used visualization tools include Tableau, Power BI, matplotlib, and D3.js, and can be selected based on needs. Based on the design and the data transformation results, create the corresponding visualizations, including bar charts, line charts, pie charts, heatmaps, and scatter plots, to display information such as vulnerability types, vulnerability severity, and risk distribution. Based on the visualizations and data mining results, generate the corresponding reports. These reports should detail the discovered vulnerabilities, including their type, number, severity, and potential impact, and should also include risk assessments and recommended remediation measures to guide the security team or management in taking appropriate actions. The generated reports should be explained and discussed to help the security team or management better understand the mining results and vulnerability situation: explain the metrics, charts, and conclusions in the report, answer relevant questions, and provide necessary context and background information. 
The reports should be shared with the security team or management for communication and discussion to ensure that the report clearly conveys the vulnerability situation and related risks, thereby facilitating decision-making and vulnerability remediation.
[0064] F. Remediation and Verification: Based on the recommendations in the report, remediate the identified vulnerabilities. After remediation, verification and testing are required to ensure that the vulnerabilities have been successfully remediated. Finally, conduct a security assessment and verification of the system again.
[0065] G. Monitoring and Continuous Improvement: Regularly monitor the system's security status, detect new vulnerabilities and threats, and continuously improve and optimize vulnerability discovery and assessment methods based on assessment results and actual security incidents to enhance the accuracy and efficiency of vulnerability discovery. This includes the following steps:
[0066] Security monitoring and vulnerability intelligence gathering: Establish a system security monitoring mechanism that regularly collects and monitors data sources such as system logs, network traffic, and security events, so that the system's security status is known promptly and new vulnerabilities and threats are discovered. At the same time, subscribe to and collect vulnerability intelligence and security bulletins to obtain the latest vulnerability and threat information. Based on the security monitoring data and the collected vulnerability intelligence, perform vulnerability identification and assessment on the system: using the vulnerability mining and assessment methods described above, identify new vulnerabilities and potential threats, and conduct the corresponding risk assessments and vulnerability impact analyses. Based on the assessment results and actual security events, analyze the effectiveness and shortcomings of the vulnerability mining and assessment methods, and improve the accuracy and efficiency of vulnerability mining by optimizing data collection, feature extraction, and algorithm selection. This may require trying new technologies, algorithms, or tools, or adjusting the parameters and configurations of existing methods. Test and validate the improved vulnerability mining and assessment methods, using real or simulated datasets to evaluate their performance and to verify whether they identify vulnerabilities more accurately, reduce false positive rates, and improve the efficiency of vulnerability mining. Maintain contact with the security community and professional organizations, and participate in security seminars, training sessions, and knowledge-sharing activities, in order to keep knowledge of vulnerability mining techniques, the latest security threats, and attack techniques up to date and to apply it to improving the vulnerability mining and assessment methods. Regularly review the effectiveness of the methods and the implementation of improvement measures; through this regular evaluation and review, identify potential problems and opportunities for improvement, and further optimize the quality and effectiveness of the vulnerability mining and assessment methods.
[0067] This application extracts vulnerability-related features from collected data and analyzes and mines vulnerability patterns in the data using data mining algorithms to identify potential vulnerabilities. By evaluating and analyzing the mining results, the severity and potential impact of the vulnerabilities are determined, and the mining results are visualized to help security teams or managers better understand and analyze the vulnerability situation. Based on the recommendations in the report, the discovered vulnerabilities are patched. After patching, verification and testing are required to ensure that the vulnerabilities are successfully patched, and the system is then subjected to another security assessment and verification. This assessment method not only effectively identifies potential vulnerabilities in the system, but also, by conducting another security assessment and verification after patching the vulnerabilities, greatly improves the stability and security of the system operation.
[0068] Example 2: Vulnerability Identification and Assessment: Data is analyzed using selected data mining algorithms to identify potential vulnerabilities. The severity and potential impact of the vulnerabilities are determined by evaluating and analyzing the mining results.
[0069] Wherein:
[0070] Analyzing data using selected data mining algorithms to identify potential vulnerabilities includes the following steps:
[0071] 1) Data conversion to transaction set: Convert the data into the transaction set format required by the Apriori algorithm, that is, a data structure in which each record is represented as a set of items (an itemset).
[0072] 2) Set minimum support: Set the minimum support threshold for the Apriori algorithm according to actual needs. Support refers to the frequency of an itemset in the dataset. Setting an appropriate minimum support can filter out infrequent itemsets.
[0073] 3) Generate frequent itemsets: Run the Apriori algorithm to generate frequent itemsets. The Apriori algorithm uses a layer-by-layer search method, starting from a single item and gradually generating larger itemsets until it is no longer possible to generate frequent itemsets.
[0074] 4) Generate association rules: Based on frequent itemsets, generate association rules. Association rules are derived from frequent itemsets and include antecedents and consequents. Based on the set minimum confidence threshold, select association rules that meet the requirements.
[0075] 5) Interpretation and screening of mining results: Interpret and analyze the association rules obtained from mining, and screen out association rules related to vulnerabilities. Usually, focus on association rules that contain vulnerability features or vulnerability-related attributes.
[0076] 6) Vulnerability identification and result interpretation: Based on the mining results, potential vulnerabilities are identified, and the characteristics, relationships and potential impacts of the vulnerabilities are understood according to the itemsets and rule attributes in the association rules.
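Steps 4) and 5) can be sketched as follows: rules are derived from one frequent itemset, kept only if they meet a minimum confidence, and then screened against a watch list of vulnerability-related items. The helper names, the item labels, the watch list, and the 0.8 confidence threshold are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions containing every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def rules_from_itemset(itemset, transactions, min_confidence):
    """Return (antecedent, consequent, confidence) rules for one frequent itemset."""
    itemset = frozenset(itemset)
    whole = support(itemset, transactions)
    rules = []
    for r in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), r):
            a = frozenset(antecedent)
            # Subsets of a frequent itemset are frequent, so support(a) > 0.
            confidence = whole / support(a, transactions)
            if confidence >= min_confidence:
                rules.append((a, itemset - a, confidence))
    return rules


def vulnerability_rules(rules, vulnerability_items):
    """Keep rules that mention at least one vulnerability-related item."""
    return [r for r in rules if (r[0] | r[1]) & vulnerability_items]


# Hypothetical transactions and a frequent itemset found in them.
transactions = [
    frozenset({"open_port_23", "weak_password", "failed_login_burst"}),
    frozenset({"open_port_23", "weak_password"}),
    frozenset({"outdated_tls", "failed_login_burst"}),
    frozenset({"open_port_23", "weak_password", "outdated_tls"}),
]
rules = rules_from_itemset({"open_port_23", "weak_password"}, transactions, 0.8)
flagged = vulnerability_rules(rules, vulnerability_items={"weak_password"})
```

Here confidence is the support of the whole itemset divided by the support of the antecedent, which is the standard definition for association rules.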
[0077] The assessment and analysis of the findings determine the severity and potential impact of the vulnerabilities, including the following steps:
[0078] 1) Interpretation and organization of mining results: Interpretation and organization of the vulnerability results obtained from mining, understanding the characteristics, attributes and related information of each vulnerability, and classifying and categorizing the mining results according to the type and severity of the vulnerability;
[0079] 2) Definition of vulnerability assessment metrics: Based on actual needs and security standards, define metrics used to assess the severity and potential impact of vulnerabilities. These metrics may include the risk level, severity, and potential scope of impact of the vulnerability.
[0080] 3) Vulnerability assessment: Based on defined metrics, each vulnerability is assessed to determine its severity and potential impact. Risk models and other methods can be used for assessment.
[0081] 3.1) Risk Model Definition: Based on actual needs and security standards, define a risk model suitable for the organization. The risk model should include all factors required for vulnerability assessment, such as vulnerability type, attack path, possible scope of impact, and vulnerability exploitation difficulty.
[0082] 3.2) Determination of vulnerability assessment indicators: Based on the risk model, determine the assessment indicators used to assess the severity and potential impact of vulnerabilities. These indicators may include the degree of harm of the vulnerability, the scope of potential impact, possible vulnerability exploitation methods, and the difficulty of vulnerability remediation.
[0083] 3.3) Determining the weight of indicators: Based on the design of the risk model, determine the weight of each assessment indicator. The weight reflects the importance of different indicators in vulnerability assessment and can be determined based on expert opinions, security standards or organizational needs.
[0084] 3.4) Data collection and metric calculation: Collect data related to each vulnerability to calculate the value of the evaluation metric. This may involve the collection and analysis of data such as vulnerability characteristics, system configuration, and network topology.
[0085] 3.5) Vulnerability assessment and scoring: Using the defined assessment metrics and weights, assess and score each vulnerability, and calculate the overall score of the vulnerability based on the values and weights of the metrics;
[0086] 3.6) Risk Level Classification: Based on the comprehensive score of the vulnerability, the vulnerability is classified into different risk levels, such as high risk, medium risk, low risk, etc. The threshold value of different risk levels is determined according to the organization's needs and standards.
[0087] 4) Risk analysis and correlation analysis: Conduct risk analysis on the vulnerabilities identified in the assessment, evaluate the correlation and mutual influence between different vulnerabilities, understand the relationship between vulnerabilities, and more comprehensively assess the potential impact of vulnerabilities;
[0088] 5) Data visualization and reporting: Visualize the vulnerability assessment results and generate corresponding reports. Use charts, graphs and other methods to present information such as the severity and scope of the vulnerability so that the security team or managers can better understand and analyze the vulnerability situation.
[0089] 6) Risk Prioritization and Decision-Making: Based on the assessment results and the severity of the vulnerabilities, prioritize the vulnerabilities to determine those that should be addressed first. Based on the vulnerability assessment results, make risk decisions and formulate corresponding remediation plans and strategies.
[0090] 7) Continuous monitoring and updating: Regularly monitor and update vulnerability assessment results to reflect the actual vulnerability situation and potential impact of the system. Adjust the vulnerability assessment and analysis methods and indicators in a timely manner based on new vulnerability discovery results, security incidents and risk changes.
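The scoring in steps 3.3) to 3.6) can be sketched as a weighted average followed by a threshold lookup. The metric names, the 0 to 10 scale, the weights, and the level boundaries of 7.0 and 4.0 are assumptions for illustration; the source leaves all of them to the organization.

```python
def overall_score(metrics, weights):
    """Weighted average of the assessment metric values."""
    total_weight = sum(weights.values())
    return sum(metrics[name] * w for name, w in weights.items()) / total_weight


def risk_level(score, high=7.0, medium=4.0):
    """Classify a comprehensive score into a risk level."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


# Hypothetical weights (relative importance) and metric values on a 0-10 scale.
weights = {"harm": 4, "impact_scope": 3, "exploitability": 2, "remediation_difficulty": 1}
metrics = {"harm": 9, "impact_scope": 8, "exploitability": 6, "remediation_difficulty": 5}

score = overall_score(metrics, weights)
level = risk_level(score)
```

Dividing by the total weight means the weights need not sum to one, so expert-assigned importance values can be used directly.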
[0091] Remediation and Verification: Based on the recommendations in the report, remediate the identified vulnerabilities. After remediation, verification and testing are required to ensure that the vulnerabilities have been successfully remediated. Finally, conduct a security assessment and verification of the system again.
[0092] Wherein:
[0093] The security assessment and verification of the system includes the following steps:
[0094] 1) Obtain the frequency of vulnerability scans, frequency of abnormal access, and dispersion index of security incident response time in the system;
[0095] 2) The security coefficient $aq_x$ is obtained by comprehensively calculating the frequency of vulnerability scans, the frequency of abnormal access, and the dispersion index of security incident response time. The calculation expression is:
[0096]
[0097] In the formula, $sj_l$ is the dispersion index of security incident response time, $ld_p$ is the frequency of vulnerability scans, and $yc_p$ is the abnormal access frequency. α, β, and γ are the weighting coefficients of the security incident response time dispersion index, the vulnerability scan occurrence frequency, and the abnormal access frequency, respectively, and α, β, and γ are all greater than 0.
[0098] After the security coefficient $aq_x$ is obtained, it is compared with the security threshold. If $aq_x$ ≥ the security threshold, the system's security after vulnerability remediation is assessed as good; if $aq_x$ < the security threshold, the system's security after vulnerability remediation is assessed as poor.
[0099] The security incident response time dispersion index $sj_l$ is calculated as follows:
[0100] 1) Obtain the standard deviation of security incident response time, bzd, calculated using the following expression:
[0101]
[0102] In the formula, i = {1, 2, 3, ..., n}, where n is the total number of security incident response time data points and is a positive integer; $SJ_i$ is the response time of the i-th security incident; and the mean term is the average security incident response time.
[0103] 2) If the average security incident response time ≤ the time threshold and the standard deviation of security incident response time bzd ≤ the standard threshold, $sj_l$ = 1.0;
[0104] 3) If the average security incident response time ≤ the time threshold and the standard deviation of security incident response time bzd > the standard threshold, $sj_l$ = 0.8;
[0105] 4) If the average security incident response time > the time threshold and the standard deviation of security incident response time bzd > the standard threshold, $sj_l$ = 0.6;
[0106] 5) If the average security incident response time > the time threshold and the standard deviation of security incident response time bzd ≤ the standard threshold, $sj_l$ = 0.4.
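The four cases above can be sketched as a small function. This is an illustration under stated assumptions: the standard deviation is computed in its population form (dividing by n), which the text does not confirm, and the function names and threshold parameters are hypothetical.

```python
import math
from typing import Sequence


def response_time_std(times: Sequence[float]) -> float:
    """Standard deviation bzd of the response times.

    Assumption: population form (divide by n); the specification's exact
    expression is not reproduced in the text.
    """
    n = len(times)
    if n == 0:
        raise ValueError("at least one response time is required")
    mean = sum(times) / n
    return math.sqrt(sum((t - mean) ** 2 for t in times) / n)


def dispersion_index(
    times: Sequence[float], time_threshold: float, std_threshold: float
) -> float:
    """Map (average, bzd) to the index values listed in cases 2) to 5)."""
    bzd = response_time_std(times)  # also rejects empty input
    mean = sum(times) / len(times)
    if mean <= time_threshold:
        return 1.0 if bzd <= std_threshold else 0.8
    return 0.6 if bzd > std_threshold else 0.4
```

For example, response times of 5 and 15 with a time threshold of 15 and a standard threshold of 2 give an average of 10 and a standard deviation of 5, which falls under case 3) and yields 0.8.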
[0107] The logic for obtaining the frequency of vulnerability scans is as follows:
[0108] The system is scanned regularly using professional vulnerability scanning tools (such as Nessus, OpenVAS, Nexpose, etc.). These tools automatically discover and report vulnerabilities in the system and provide relevant data and statistics, including the number, severity, and frequency of vulnerabilities. The frequency of vulnerability occurrence is obtained by analyzing the vulnerability scan reports, either by counting the number of scans in which each vulnerability was discovered or by counting the number of times each vulnerability occurred within a given time frame.
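One way to count how many scans each vulnerability appeared in is sketched below. It assumes the scan reports have already been parsed into one set of vulnerability identifiers per scan; the parsing step, the identifier strings, and the name `occurrence_frequency` are hypothetical and do not correspond to any scanner's export format.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def occurrence_frequency(scans: Iterable[Set[str]]) -> Dict[str, float]:
    """Fraction of scans in which each vulnerability ID was reported."""
    scan_list = list(scans)
    if not scan_list:
        return {}
    # Each scan is a set, so a vulnerability counts at most once per scan.
    counts = Counter(vuln_id for scan in scan_list for vuln_id in scan)
    return {vuln_id: hits / len(scan_list) for vuln_id, hits in counts.items()}
```

Dividing by the number of scans normalizes the count so that results from periods with different scan schedules remain comparable.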
[0109] The logic for obtaining the abnormal access frequency is as follows:
[0110] The system's security event logs, including records of abnormal login attempts, abnormal access, and unauthorized access, are analyzed, and the number and frequency of abnormal access events in the logs are counted to characterize the abnormal access the system faces. The logs of the IDS / IPS system, which detects and records abnormal behaviors and attack activities on the network, are analyzed in the same way to determine the frequency of abnormal access.
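The log-based count can be sketched as events per hour within a time window. This assumes the log entries have already been parsed into (timestamp, event type) pairs; the event-type labels and the function name are hypothetical, not fields of any specific log or IDS / IPS format.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple

# Hypothetical labels for the three record types named in the text.
ABNORMAL_TYPES = {"failed_login", "abnormal_access", "unauthorized_access"}


def abnormal_access_frequency(
    events: Iterable[Tuple[datetime, str]], start: datetime, end: datetime
) -> float:
    """Abnormal events per hour within the half-open window [start, end)."""
    if end <= start:
        raise ValueError("end must be later than start")
    hours = (end - start) / timedelta(hours=1)
    count = sum(
        1 for ts, kind in events if start <= ts < end and kind in ABNORMAL_TYPES
    )
    return count / hours
```

The per-hour unit is a choice made for this sketch; any fixed window works as long as the same unit is used when the value is compared across periods.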
[0111] This application obtains a security coefficient by comprehensively calculating the frequency of vulnerability scans, the frequency of abnormal access, and the dispersion index of security event response time in the system, thereby effectively improving data processing efficiency.
[0112] The above formulas are all dimensionless calculations. The formulas are derived from software simulations based on a large amount of collected data so as to approximate real-world conditions as closely as possible. The preset parameters in the formulas are set by those skilled in the art according to the actual situation.
[0113] In the description of this specification, references to terms such as "an embodiment," "example," "specific example," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. In this specification, illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.
[0114] The preferred embodiments of the present invention disclosed above are merely illustrative of the invention. These preferred embodiments do not exhaustively describe all details, nor do they limit the invention to any specific implementation. Clearly, many modifications and variations can be made based on the content of this specification. This specification selects and specifically describes these embodiments to better explain the principles and practical applications of the invention, thereby enabling those skilled in the art to better understand and utilize the invention. The invention is limited only by the claims and their full scope and equivalents.
Claims
1. A mining evaluation method based on big data vulnerability mining, characterized in that: The evaluation method includes the following steps: S1: Collect and acquire data related to the system, application, or network, and preprocess the data; S2: Extract vulnerability-related features from the collected data; S3: Identify potential vulnerabilities by analyzing and mining vulnerability patterns in the data using data mining algorithms, and determine the severity and potential impact of the vulnerabilities by evaluating and analyzing the mining results; S4: Visualize the mining results and generate corresponding reports; S5: Based on the recommendations in the report, patch the discovered vulnerabilities, and then verify and test them. The verification and testing after patching includes the following steps: obtaining the vulnerability scan occurrence frequency, abnormal access frequency, and security incident response time dispersion index in the system; and comprehensively calculating the security coefficient $aq_x$ by combining the vulnerability scan occurrence frequency, abnormal access frequency, and security incident response time dispersion index. In the calculation expression, $sj_l$ is the security incident response time dispersion index, $ld_p$ is the vulnerability scan occurrence frequency, $yc_p$ is the abnormal access frequency, and α, β, γ are the weighting coefficients for the security incident response time dispersion index, vulnerability scan occurrence frequency, and abnormal access frequency, respectively, with α, β, γ all greater than 0. After the security coefficient $aq_x$ is obtained, it is compared with the security threshold: if $aq_x$ ≥ the security threshold, the system's security after vulnerability patching is assessed as good; if $aq_x$ < the security threshold, the system's security after vulnerability patching is assessed as poor. The security incident response time dispersion index $sj_l$ is calculated as follows: obtain the standard deviation of security incident response time bzd, where, in its calculation expression, i = {1, 2, 3, ..., n}, n is the total number of security incident response time data points and is a positive integer, $SJ_i$ is the response time of the i-th security incident, and the mean term is the average security incident response time; if the average security incident response time ≤ the time threshold and bzd ≤ the standard threshold, $sj_l$ = 1.0; if the average security incident response time ≤ the time threshold and bzd > the standard threshold, $sj_l$ = 0.8; if the average security incident response time > the time threshold and bzd > the standard threshold, $sj_l$ = 0.6; if the average security incident response time > the time threshold and bzd ≤ the standard threshold, $sj_l$ = 0.4; S6: Regularly monitor the security status of the system, detect new vulnerabilities and threats, and base decisions on assessment results and actual security incidents.
2. The mining and evaluation method based on big data vulnerability mining according to claim 1, characterized in that: In step S3, identifying potential vulnerabilities includes the following steps: S3.1: Transform the data into the transaction set format required by the Apriori algorithm; S3.2: Set the minimum support threshold for the Apriori algorithm to filter infrequent itemsets; S3.3: Run the Apriori algorithm to generate frequent itemsets; S3.4: Generate association rules based on frequent itemsets and filter out the association rules that meet the requirements; S3.5: Analyze the association rules obtained from the mining and filter out the association rules related to the vulnerability; S3.6: Understand the characteristics of the vulnerability based on the itemsets and rule attributes in the association rules.
3. The mining and evaluation method based on big data vulnerability mining according to claim 2, characterized in that: In step S3, the severity and potential impact of the vulnerability are determined by evaluating and analyzing the mining results, including the following steps: S3.7: Organize the vulnerability results obtained from the discovery, and define indicators for assessing the severity and potential impact of vulnerabilities based on actual needs and security standards; S3.8: Based on defined metrics, use a risk model to assess each vulnerability and determine its severity and potential impact; S3.9: Conduct risk analysis on the vulnerabilities identified in the assessment, and evaluate the correlation and mutual impact between different vulnerabilities.
4. The mining and evaluation method based on big data vulnerability mining according to claim 3, characterized in that: Step S3.8, assessing each vulnerability using the risk model, includes the following steps: S3.8.1: Define a risk model suitable for the organization based on actual needs and safety standards; S3.8.2: Based on the risk model, determine the assessment metrics used to evaluate the severity and potential impact of vulnerabilities; S3.8.3: Collect data related to each vulnerability and calculate the values of assessment metrics to evaluate the severity of the vulnerability.
5. The mining and evaluation method based on big data vulnerability mining according to claim 1, characterized in that: The logic for obtaining the frequency of vulnerability scans is as follows: Use vulnerability scanning tools to regularly scan the system, automatically discover and report vulnerabilities in the system, including the number, severity and frequency of vulnerabilities, and obtain the frequency of vulnerability occurrence by analyzing vulnerability scanning reports.
6. The mining and evaluation method based on big data vulnerability mining according to claim 5, characterized in that: The logic for obtaining the abnormal access frequency is as follows: Analyze the system's security event logs, including abnormal login attempts, abnormal access, and unauthorized access records, and statistically analyze the number and frequency of abnormal access events in the logs.