A Bayesian weighting method based on CFS_KL

A Bayesian weighting method that combines KL divergence with attribute (feature) selection, applied in the field of machine learning. It addresses the limited classification performance of naive Bayes, whose feature-independence assumption is unrealistic because real data are rarely that independent. By relaxing the independence requirement, it improves classification accuracy.

Active Publication Date: 2022-02-18
XI AN JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

However, learning the optimal Bayesian classifier, like learning a Bayesian network, is an NP-hard problem, so many researchers favor the naive Bayes classifier instead. Naive Bayes rests on a simple but unrealistic assumption: the features of the training data are independent of each other. This strong condition is rarely met in practice. Even when the features are logically independent of each other, they are often not so independent in the actual data, which greatly limits the classification performance of naive Bayes.
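
For reference, the independence assumption and the attribute-weighted relaxation commonly used to soften it can be written as below. The notation (class c, attribute values a_i, per-attribute weights w_i, number of attributes n) is introduced here for illustration and is not taken from the patent text. Setting every w_i to 1 recovers plain naive Bayes.

```latex
\hat{c}_{\mathrm{NB}} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(a_i \mid c),
\qquad
\hat{c}_{\mathrm{WNB}} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(a_i \mid c)^{w_i}
```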



Examples


Embodiment Construction

[0059] The CFS_KL-based Bayesian weighting method of the present invention comprises the following steps:

[0060] S1. In the data collection stage, parse the nmap fingerprint library to obtain training data, and simulate test data;

[0061] Analyze the operating system identification rules in the nmap fingerprint library. nmap sends 16 probe packets, each of which produces a corresponding response sequence, and each response sequence corresponds to a set of flag bits. For every operating system known to nmap, the fingerprint library records the fingerprint information contained in that system's response packets to the 16 probe packets. The fingerprint name in the fingerprint library is therefore used as the label data of the model, and the flag bits of the response sequences under that fingerprint name form the training data. The following is a fingerprint from the nmap fingerprint library:

[0062] Fingerprint Li...
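
The mapping from a fingerprint entry to a (label, features) pair can be sketched as follows. This is a minimal sketch, assuming the nmap-os-db text layout in which a `Fingerprint <name>` line is followed by test lines of the form `TEST(KEY=VALUE%KEY=VALUE)`. The `SAMPLE` entry, the function name `parse_fingerprint`, and the `TEST.KEY` feature naming are hypothetical and are not the entry truncated above or the patent's own preprocessing.

```python
import re

# Hypothetical sample entry in nmap-os-db style; not a real fingerprint.
SAMPLE = """Fingerprint ExampleOS 1.0
Class ExampleVendor | ExampleOS | 1.X | general purpose
SEQ(SP=C5-CF%GCD=1-6%TI=Z)
T1(R=Y%DF=Y%W=16A0)
"""

# A test line looks like NAME(key=value%key=value...).
TEST_LINE = re.compile(r"^(\w+)\((.*)\)$")


def parse_fingerprint(text):
    """Return (label, features) for one fingerprint entry.

    label    -> the fingerprint name (used as the class label)
    features -> dict mapping "TEST.KEY" to the raw value string
    """
    label = None
    features = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Fingerprint "):
            label = line[len("Fingerprint "):]
            continue
        match = TEST_LINE.match(line)
        if not match:
            continue  # skip Class lines, comments, and blanks
        test_name, body = match.groups()
        for pair in body.split("%"):
            if "=" in pair:
                key, value = pair.split("=", 1)
                features[f"{test_name}.{key}"] = value
    return label, features


label, features = parse_fingerprint(SAMPLE)
print(label)
print(sorted(features))
```

Values such as `C5-CF` are ranges in the fingerprint library; how they are expanded or discretized into training vectors is left open here because the source does not describe it at this point.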


Abstract

The invention discloses a novel Bayesian weighting method based on CFS_KL. The fingerprint name in the fingerprint library is used as the label data of the model, and the flag bits of the response sequences under the fingerprint name constitute the training data. The training data is encapsulated and preprocessed. KL divergence is used to calculate the degree of association between each attribute and the class, which serves as the weight of that attribute. A feature selection method is used to select 42 dimensions, and the dimensions selected by CFS are used to modify the weights calculated by the KL divergence. The weighted Bayesian algorithm is then used for training. Through the encapsulation operation, each vector is input into the trained fingerprint model, the maximum posterior probability of each flow is calculated with the CFS_KL weighted Bayesian algorithm, and the simulated data test is completed. Real traffic is collected by sending packets to the target network segment and is input into the fingerprint model to predict the result, and the test accuracy on the real traffic is calculated. The invention relaxes the Bayesian algorithm's requirement of feature independence and improves its recognition accuracy.
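
The weighting and classification steps in the abstract can be sketched roughly as follows. The source does not give the exact formulas, so every choice here is an assumption: `kl_weights` uses one common KL-based weighting (the expected KL divergence between the class distribution given an attribute value and the class prior), `adjust_with_cfs` and its `boost` factor are a purely hypothetical stand-in for "modifying" the weights with the CFS-selected dimensions, and `predict` uses Laplace smoothing. All function names are illustrative.

```python
import math
from collections import Counter, defaultdict


def kl_weights(X, y):
    """Assumed weighting: w_i = sum_v P(a_i=v) * KL(P(C | a_i=v) || P(C))."""
    n = len(y)
    prior = {c: k / n for c, k in Counter(y).items()}
    weights = []
    for i in range(len(X[0])):
        by_value = defaultdict(Counter)  # attribute value -> class counts
        for row, c in zip(X, y):
            by_value[row[i]][c] += 1
        w = 0.0
        for counts in by_value.values():
            n_v = sum(counts.values())
            kl = sum((k / n_v) * math.log((k / n_v) / prior[c])
                     for c, k in counts.items())
            w += (n_v / n) * kl
        weights.append(w)
    return weights


def adjust_with_cfs(weights, selected, boost=2.0):
    """Hypothetical adjustment: scale up weights of CFS-selected attributes."""
    return [w * boost if i in selected else w for i, w in enumerate(weights)]


def train(X, y):
    """Collect the counts needed for naive Bayes probability estimates."""
    class_counts = Counter(y)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    domains = defaultdict(set)           # attribute index -> seen values
    for row, c in zip(X, y):
        for i, v in enumerate(row):
            value_counts[(i, c)][v] += 1
            domains[i].add(v)
    return class_counts, value_counts, domains


def predict(model, weights, row):
    """Return argmax_c of log P(c) + sum_i w_i * log P(a_i | c)."""
    class_counts, value_counts, domains = model
    n = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / n)
        for i, v in enumerate(row):
            # Laplace-smoothed conditional probability (an assumption).
            p = (value_counts[(i, c)][v] + 1) / (n_c + len(domains[i]))
            score += weights[i] * math.log(p)
        if score > best_score:
            best, best_score = c, score
    return best


# Toy data: attribute 0 determines the class, attribute 1 is uninformative.
X = [("Y", "A"), ("Y", "B"), ("N", "A"), ("N", "B")]
y = ["linux", "linux", "windows", "windows"]
weights = adjust_with_cfs(kl_weights(X, y), selected={0})
print(predict(train(X, y), weights, ("Y", "B")))
```

On the toy data, the informative attribute receives a positive weight and the uninformative one a weight of zero, so the second attribute contributes nothing to the score. In a real pipeline the `selected` set would come from a CFS implementation, which is omitted here.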

Description

Technical field

[0001] The invention belongs to the technical field of machine learning, and in particular relates to a Bayesian weighting method based on CFS_KL (correlation-based feature selection_Kullback-Leibler).

Background

[0002] As one of the top ten classic machine learning algorithms, the Bayesian algorithm is applied in many fields and has shown very good results in all of them, for example in judging whether an email is spam based on its title and content. However, learning the optimal Bayesian classifier, like learning a Bayesian network, is an NP-hard problem, so many researchers favor the naive Bayes classifier instead. Naive Bayes rests on a simple but unrealistic assumption: the features of the training data are independent of each other. This strong condition is rarely met in practice. Even when the features are logically independent of each other, in the actual da...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06K9/62
CPC: G06F18/24155, G06F18/214
Inventor: 桂小林, 安迪
Owner: XI AN JIAOTONG UNIV