Out-of-distribution network flow data detection method based on calculated likelihood ratio

A network traffic data detection technology, applied in neural learning methods, biological neural network models, advanced technologies, etc., that addresses the problems of high false alarm rates, non-unique distance measures, and thresholds that are difficult to set.

Pending Publication Date: 2022-08-02
HARBIN INST OF TECH

AI Technical Summary

Problems solved by technology

Feature extraction and the resulting features are largely arbitrary: there is no standard, and the features differ widely across data types, so the calculated distance is not unique.
At the same time, different features are measured on different scales, which makes calculating distance or similarity subjective and difficult.
In addition, it is not easy to set the threshold for judging whether data is out-of-distribution. A threshold that is too large or too small easily causes a high false positive rate.

Examples

Specific Embodiment 2

[0111] The model is trained and tested according to the method of Embodiment 1. The original model is trained on the Moore dataset, a public traffic dataset collected by researchers at a Cambridge University laboratory. It contains 12 types of traffic, such as email, malicious traffic, and database traffic. In step 3, the perturbed data is generated by adding Gaussian white noise to the original Moore dataset, and the generated perturbed data is then used to train a perturbed model. The test data is a mixture of the Moore dataset and self-collected traffic data. The self-collected dataset contains the same types of traffic as the Moore dataset, but because traffic forms and network protocols have since been updated, it differs from the traffic in the Moore dataset even though the types match. Self-collected traffic is out-of-distribution data, so the pu...
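The perturbation step described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the noise standard deviation `sigma`, the fixed seed, the function names, and the toy flows are all assumptions, since the excerpt does not state the noise parameters or the data layout.

```python
import random

def add_gaussian_noise(sequence, sigma, rng):
    """Return a copy of `sequence` with zero-mean Gaussian noise added to each value."""
    return [value + rng.gauss(0.0, sigma) for value in sequence]

def perturb_dataset(flows, sigma=5.0, seed=0):
    """Perturb every flow's feature sequence; labels are left unchanged.

    `sigma` and `seed` are illustrative assumptions, not values from the source.
    """
    rng = random.Random(seed)
    return [(add_gaussian_noise(seq, sigma, rng), label) for seq, label in flows]

# Hypothetical toy flows: (packet length sequence, traffic class label).
original = [
    ([60.0, 1500.0, 1500.0, 52.0], "email"),
    ([74.0, 74.0, 66.0, 583.0], "database"),
]
perturbed = perturb_dataset(original, sigma=5.0, seed=0)
```

The perturbed copies keep the labels and sequence lengths of the originals, so the same training code could in principle be reused for the perturbed model.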

Abstract

The invention discloses an out-of-distribution network flow data detection method based on a calculated likelihood ratio, and belongs to the field of network flow data detection. The objective of the invention is to improve the accuracy and confidence of network flow data identification. The method comprises the following steps: (1) extracting network traffic features: the original traffic, a pcap capture, is divided into data flows according to the five-tuple; for each flow, the packet length sequence is extracted and the packet inter-arrival time sequence is calculated, and the sequences are stored in a CSV file as the original training data for model training; (2) training an original classification model on the original training data using a long short-term memory (LSTM) network, a deep learning algorithm; (3) generating perturbed data by adding Gaussian white noise; (4) training a perturbed model on the perturbed data; and (5) calculating a likelihood ratio and judging whether data is out-of-distribution. The invention achieves high accuracy and confidence in network traffic data identification.
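The feature extraction step in the abstract can be sketched roughly as below. To stay self-contained, the sketch assumes packets have already been parsed from the pcap into dictionaries; the field names, the unidirectional five-tuple key, the CSV column layout, and the toy packets are all hypothetical and not taken from the source.

```python
import csv
import io
from collections import defaultdict

def split_flows(packets):
    """Group packets by five-tuple, keeping (timestamp, length) pairs in time order."""
    flows = defaultdict(list)
    for pkt in sorted(packets, key=lambda p: p["ts"]):
        key = (pkt["src_ip"], pkt["src_port"], pkt["dst_ip"], pkt["dst_port"], pkt["proto"])
        flows[key].append((pkt["ts"], pkt["length"]))
    return flows

def flow_features(records):
    """Return the packet length sequence and the inter-arrival time sequence of one flow."""
    lengths = [length for _, length in records]
    times = [ts for ts, _ in records]
    # Differences between consecutive timestamps, rounded to microseconds.
    intervals = [round(later - earlier, 6) for earlier, later in zip(times, times[1:])]
    return lengths, intervals

def write_csv(flows, out):
    """Write one CSV row per flow (assumed layout: key, lengths, intervals)."""
    writer = csv.writer(out)
    writer.writerow(["five_tuple", "packet_lengths", "inter_arrival_times"])
    for key, records in flows.items():
        lengths, intervals = flow_features(records)
        writer.writerow(["|".join(map(str, key)),
                         " ".join(map(str, lengths)),
                         " ".join(map(str, intervals))])

# Hypothetical pre-parsed packets from two flows.
flow_a = {"src_ip": "10.0.0.1", "src_port": 50000, "dst_ip": "10.0.0.2", "dst_port": 25, "proto": "TCP"}
flow_b = {"src_ip": "10.0.0.3", "src_port": 50001, "dst_ip": "10.0.0.4", "dst_port": 3306, "proto": "TCP"}
packets = [
    {**flow_a, "ts": 0.00, "length": 60},
    {**flow_a, "ts": 0.25, "length": 1500},
    {**flow_b, "ts": 0.10, "length": 74},
    {**flow_a, "ts": 0.75, "length": 52},
]

flows = split_flows(packets)
buffer = io.StringIO()
write_csv(flows, buffer)
```

Writing to an in-memory buffer keeps the example free of file-system side effects; an open file object could be passed to `write_csv` in the same way.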

Description

Technical field

[0001] The invention belongs to the field of network traffic data detection, and in particular relates to a method for detecting out-of-distribution network traffic data based on a calculated likelihood ratio.

Background

[0002] With the growth of private network protocols, the types of network traffic are increasing and becoming more similar to one another. Many of today's network security problems depend on the identification and detection of network traffic. Most traditional identification and detection technologies use machine learning or deep learning algorithms to train classification models.

[0003] Out-of-distribution data is defined as follows. Suppose there is a data set S consisting of data (X, Y), where X represents the extracted feature set and Y represents the data label set. If there is a sample s(x, y) where y does not belong to Y, then the sample s is called out-of-distribution data. ...
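Given this definition, the likelihood-ratio decision named in the title can be illustrated with a toy sketch. The excerpt does not give the exact formula, so everything here is an assumption: a simple Gaussian density stands in for the LSTM models, the ratio is taken as the log-likelihood under the original model minus the log-likelihood under the perturbed model, low values are treated as out-of-distribution, and the threshold of 0 and the sample values are hypothetical.

```python
import math
import statistics

class GaussianModel:
    """Stand-in density model (NOT the LSTM from the source): one Gaussian over feature values."""

    def __init__(self, samples):
        self.mean = statistics.fmean(samples)
        self.std = statistics.pstdev(samples)

    def log_likelihood(self, sequence):
        """Sum of per-value Gaussian log-densities, treating values as independent."""
        var = self.std ** 2
        return sum(-0.5 * math.log(2 * math.pi * var) - (x - self.mean) ** 2 / (2 * var)
                   for x in sequence)

def log_likelihood_ratio(sequence, original_model, perturbed_model):
    """Assumed form: log p_original(x) - log p_perturbed(x)."""
    return original_model.log_likelihood(sequence) - perturbed_model.log_likelihood(sequence)

def is_out_of_distribution(sequence, original_model, perturbed_model, threshold=0.0):
    """Flag a sample when its ratio falls below the (assumed) threshold."""
    return log_likelihood_ratio(sequence, original_model, perturbed_model) < threshold

# Hypothetical training values: the second list imitates noise-perturbed copies (wider spread).
original_model = GaussianModel([98.0, 99.0, 100.0, 101.0, 102.0])
perturbed_model = GaussianModel([94.0, 97.0, 100.0, 103.0, 106.0])

in_sample = [100.0, 101.0, 99.0]     # close to the training values
out_sample = [110.0, 112.0, 108.0]   # far from the training values

in_flag = is_out_of_distribution(in_sample, original_model, perturbed_model)
out_flag = is_out_of_distribution(out_sample, original_model, perturbed_model)
```

In this toy setting the original model is sharper than the perturbed one, so samples near the training data get a positive ratio and distant samples a negative one. With real sequence models the threshold would need to be chosen empirically.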

Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L47/2441, G06N3/04, G06N3/08
CPC: H04L47/2441, G06N3/08, G06N3/048, G06N3/044, Y02D30/50
Inventor: 余翔湛, 刘立坤, 史建焘, 叶麟, 张晓慧, 葛蒙蒙, 苗钧重, 刘凡, 韦贤葵, 李精卫, 石开宇, 王久金, 冯帅, 赵跃, 宋赟祖, 郭明昊, 车佳臻
Owner: HARBIN INST OF TECH