Feature selection method and system for network security data

A feature selection method and system in the field of network security data, addressing the problem that redundant features are left unprocessed.

Status: Inactive
Publication Date: 2016-12-21
XINJIANG UNIVERSITY
Cites: 3; Cited by: 18

AI Technical Summary

Problems solved by technology

The invention combines the advantages of Filter-style and Wrapper-style feature selection and uses a secondary division method in the machine learning process to address the problems of high-dimensional small samples and SNP pathogenic combination modes in SNP data feature selection, improving analysis efficiency and accuracy. However, although Relief can calculate the weight of each feature and the SVM-RFE algorithm can then compare those weights so that irrelevant attributes are removed, redundant features are not processed.

Embodiment Construction

[0060] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary only for explaining the present invention and should not be construed as limiting the present invention.

[0061] Referring to Figure 1, the feature selection method for network security data provided by the application comprises the following steps:

[0062] Step S110: constructing a KDDCUP99 data set, and processing the data set to obtain a high-dimensional vector group;

[0063] It can be understood that in feature selection, the selection of data sets is the first step in researching and evaluating algorithms, and the accuracy of data sets will directly determine the evaluation results of various algorithms. The KDDCUP99 dataset provided i...
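As a rough illustration of Step S110, the sketch below turns a few records into a normalized numeric vector group. The text visible here does not say how the data set is processed, so the integer encoding of symbolic fields and the min-max scaling are assumptions, and the function names `encode_symbolic` and `min_max_normalize` are hypothetical. The toy rows use only four fields (duration, protocol type, service, source bytes); real KDDCUP99 records have 41 features.

```python
from typing import Dict, List, Sequence, Union

Value = Union[str, int, float]


def encode_symbolic(records: Sequence[Sequence[Value]],
                    symbolic_cols: Sequence[int]) -> List[List[float]]:
    """Replace each distinct string in the symbolic columns with an integer code.

    Assumption: codes are assigned in order of first appearance.
    """
    codebooks: Dict[int, Dict[Value, int]] = {c: {} for c in symbolic_cols}
    encoded: List[List[float]] = []
    for row in records:
        out: List[float] = []
        for j, value in enumerate(row):
            if j in codebooks:
                book = codebooks[j]
                out.append(float(book.setdefault(value, len(book))))
            else:
                out.append(float(value))
        encoded.append(out)
    return encoded


def min_max_normalize(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale every feature column to [0, 1]; constant columns become 0.0."""
    n_features = len(vectors[0])
    lows = [min(r[j] for r in vectors) for j in range(n_features)]
    highs = [max(r[j] for r in vectors) for j in range(n_features)]
    return [
        [
            (r[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
            for j in range(n_features)
        ]
        for r in vectors
    ]


# Toy records (hypothetical values): duration, protocol_type, service, src_bytes
records = [
    [0, "tcp", "http", 181],
    [2, "udp", "domain_u", 45],
    [0, "icmp", "ecr_i", 1032],
]
vectors = min_max_normalize(encode_symbolic(records, symbolic_cols=[1, 2]))
```

Integer codes impose an artificial ordering on symbolic values; one-hot encoding is a common alternative when that ordering would distort distance-based weighting.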

Abstract

The invention proposes a feature selection method and system for network security data. The method comprises the steps of: performing data normalization on a KDDCUP99 data set; performing Re-ReliefF dimension reduction on the resulting vector set; removing data that is irrelevant or of relatively low relevance to form a candidate feature set; and using an improved Re-ReliefF algorithm to obtain the feature with minimum relevance to the candidate feature set. To handle the redundant features present in the data, the method and system remove redundant data following the MRMR (Maximum Relevance Minimum Redundancy) idea, thereby improving the efficiency of the classifier.
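The two stages named in the abstract (filter out low-relevance features, then remove redundant ones) can be sketched as follows. This is not the patent's improved Re-ReliefF algorithm, whose details are not given in this text. It is a simplified MRMR-inspired filter under stated assumptions: relevance weights are supplied by the caller, redundancy is measured by absolute Pearson correlation, and the names `select_features`, `min_relevance`, and `max_redundancy` are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length columns; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_features(columns: Sequence[Sequence[float]],
                    relevance: Sequence[float],
                    min_relevance: float,
                    max_redundancy: float) -> List[int]:
    """Return indices of features that are relevant and mutually non-redundant."""
    # Stage 1: keep only features whose relevance weight reaches the threshold.
    candidates = [j for j, w in enumerate(relevance) if w >= min_relevance]
    # Stage 2: visit candidates from most to least relevant and keep a feature
    # only if it is not strongly correlated with any feature already kept.
    candidates.sort(key=lambda j: relevance[j], reverse=True)
    selected: List[int] = []
    for j in candidates:
        if all(abs(pearson(columns[j], columns[k])) < max_redundancy
               for k in selected):
            selected.append(j)
    return selected


# Toy data (hypothetical): one list per feature column.
columns = [
    [1, 2, 3, 4, 5],    # feature 0
    [2, 4, 6, 8, 10],   # feature 1: a scaled copy of feature 0 (redundant)
    [5, 1, 4, 2, 3],    # feature 2: weakly correlated with feature 0
    [1, 1, 1, 1, 1],    # feature 3: constant
]
relevance = [0.9, 0.8, 0.6, 0.05]  # assumed weights, e.g. from a Relief-style scorer
chosen = select_features(columns, relevance, min_relevance=0.1, max_redundancy=0.95)
```

In this toy run feature 3 is dropped for low relevance and feature 1 is dropped as redundant with feature 0, which leaves features 0 and 2.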

Description

Technical field

[0001] The invention relates to the technical field of network data security processing, and in particular to a feature selection method and system for network security data.

Background technique

[0002] Feature selection for high-dimensional, small-sample data is one of the research hotspots in the field of data mining. This type of data generally has a huge data volume, a high feature dimension, and a small number of samples. Commonly used data analysis methods exhibit sample bias, so their efficiency and accuracy on high-dimensional small-sample data are low.

[0003] The ReliefF algorithm has the advantages of high evaluation efficiency, no restrictions on data types, and the ability to remove irrelevant features. Its disadvantage is that its design does not consider the correlation between features, so it cannot remove redundant features. The algorithm will give all and Features with hi...
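The limitation described in [0003] can be seen in a small sketch. The code below is the basic Relief scheme (two classes, one nearest hit and one nearest miss per sample, every sample visited once), not ReliefF, which extends the idea to k neighbors and multiple classes. The function name `relief_weights` and the toy data are hypothetical, and features are assumed to be already scaled to [0, 1].

```python
from typing import List, Sequence


def relief_weights(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Basic Relief weights; requires at least two samples in each of two classes."""
    n, d = len(X), len(X[0])
    weights = [0.0] * d

    def dist(a: Sequence[float], b: Sequence[float]) -> float:
        # Manhattan distance over all features.
        return sum(abs(p - q) for p, q in zip(a, b))

    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            # Reward separation from the nearest miss, penalize distance to the nearest hit.
            weights[j] += (abs(X[i][j] - X[miss][j]) - abs(X[i][j] - X[hit][j])) / n
    return weights


# Toy data (hypothetical): feature 1 is an exact copy of feature 0, feature 2 is noise.
X = [
    [0.0, 0.0, 0.3],
    [0.1, 0.1, 0.9],
    [0.9, 0.9, 0.2],
    [1.0, 1.0, 0.8],
]
y = [0, 0, 1, 1]
w = relief_weights(X, y)
```

The noisy feature receives a negative weight and would be filtered out, but the duplicated feature receives exactly the same weight as the original. A weight threshold alone therefore cannot tell a redundant feature from a useful one, which is why a separate redundancy step is needed.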

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/2465
Inventor: 努尔布力, 王浩, 黄春虎
Owner: XINJIANG UNIVERSITY