Method for separating unknown multi-protocol mixed data frames into single protocol data frames

A technique for handling mixed data and protocol data, applicable to electrical components, transmission systems, and similar fields. It addresses the difficulty of distinguishing different protocols, the low accuracy of protocol frame cluster evaluation, and the difficulty of calculating the number K of mixed protocols.

Active Publication Date: 2015-07-08
UNIV OF ELECTRONICS SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to overcome the deficiencies of the prior art by providing a method for separating unknown multi-protocol mixed data frames into single-protocol data frames. It addresses the difficulty of calculating an approximate value for the number K of mixed protocol types, the difficulty of distinguishing different protocols, the low accuracy of protocol frame cluster evaluation, and the difficulty of obtaining intuitive and effective results.

Examples

Embodiment 1

[0083] Embodiment 1, calculation of the number of protocol types K: 27 protocols are taken from Tcpdump, with 100 data frames per protocol (all frames are taken when fewer than 100 are available); the first 68 bytes of each data frame are kept; the resulting frames are mixed and used as input. Variable settings: liminal is set to 95 and low_liminal is set to 10.

[0084] uniterate is varied from 50 to 99 and the corresponding K value is recorded. The following is an abbreviated experimental result for liminal=95, low_liminal=10, uniterate=99:

[0085] The maximum frame length is: 68;

[0086] Total number of frames: 2509;

[0087] Number of column stats: 68;

[0088] The number of sets in the candidate result set: 62;

[0089] The number of collections in the result set: 27;

[0090] Bytes: 00; Occurrences: 2379; Frequency: 0.9481865; Occurrences: Not shown;

[0091] Bytes: 10; Occurrences: 1172; Frequency: 0.46711838; Occurrences: Not shown;

[0092] Bytes: 7b; Occurrences: 700; Frequency: 0.2789956; Lines O...
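
The frequencies in the listing above are consistent with occurrences divided by the total number of frames (for example, 2379 / 2509 ≈ 0.9481865). The following is a minimal sketch of such a statistic, not the patent's procedure for computing K. The function name `byte_frequencies` and the choice to count each byte value at most once per frame are assumptions; the source does not state whether occurrences are counted per frame or per column.

```python
from collections import Counter


def byte_frequencies(frames):
    """Map each byte value (as two hex digits) to (occurrences, frequency).

    Assumption: an "occurrence" is one frame containing the byte value at
    least once, and frequency = occurrences / total number of frames.
    """
    total = len(frames)
    counts = Counter()
    for frame in frames:
        counts.update(set(frame))  # count each byte value once per frame
    return {f"{value:02x}": (n, n / total) for value, n in counts.items()}


# Tiny hypothetical input: four frames given as raw bytes.
frames = [
    bytes.fromhex("00107b"),
    bytes.fromhex("0010ff"),
    bytes.fromhex("00aa"),
    bytes.fromhex("7b7b"),
]
stats = byte_frequencies(frames)
for byte_hex, (n, freq) in sorted(stats.items(), key=lambda kv: -kv[1][0]):
    print(f"Bytes: {byte_hex}; Occurrences: {n}; Frequency: {freq:.4f}")
```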

Embodiment 2

[0122] Embodiment 2, k-means clustering experiment:

[0123] Data input:

[0124] For the 27 protocols in Tcpdump, take 100 data frames per protocol (all frames when fewer than 100 are available) and keep the first 68 bytes of each data frame. Mix the resulting frames and append the protocol type to each data frame as a label, so that Weka's Classes to clusters evaluation function can be used to evaluate the clustering result.
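
The data preparation described above can be sketched as follows. This is an illustrative sketch only: the function name `build_mixed_dataset`, the use of the first 100 frames of each protocol (rather than a random sample), and the fixed shuffle seed are assumptions not stated in the source.

```python
import random


def build_mixed_dataset(frames_by_protocol, per_protocol=100, prefix_len=68, seed=0):
    """Return a shuffled list of (hex_prefix, protocol_label) pairs.

    At most `per_protocol` frames are taken from each protocol, and each
    frame is truncated to its first `prefix_len` bytes.
    """
    mixed = []
    for protocol, frames in frames_by_protocol.items():
        for frame in frames[:per_protocol]:  # all frames if fewer than the limit
            mixed.append((frame[:prefix_len].hex(), protocol))
    random.Random(seed).shuffle(mixed)  # mix the protocols together
    return mixed


# Hypothetical capture: protocol "a" has 150 frames, protocol "b" only 30.
capture = {
    "a": [bytes(80)] * 150,
    "b": [bytes(40)] * 30,
}
dataset = build_mixed_dataset(capture)
print(len(dataset))
```

The label in each pair is kept only for evaluation; the clustering itself should see the hex prefix alone.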

[0125] Steps:

[0126] 1. Open the ARFF file with Weka.

[0127] 2. Use the StringToWordVector filter to process the text attribute. Set the WordCount parameter of StringToWordVector to false, and use the default settings for the other parameters. After processing, each byte value is an attribute, giving 256 attributes in total. Each attribute takes the value 1 (the byte value is present in the frame) or 0 (it is absent).

[0128] 3. Select the SimpleKMeans clustering algorithm in Weka for clustering, select Classes to clusters evaluati...
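
The representation produced in step 2 can be illustrated outside Weka. The sketch below is not Weka code. It shows, under the assumption that each frame is available as a hex string, how a frame maps to 256 binary attributes. The function name `byte_presence_vector` is hypothetical.

```python
def byte_presence_vector(frame_hex):
    """Return 256 attributes: 1 if the byte value occurs in the frame, else 0."""
    present = set(bytes.fromhex(frame_hex))
    return [1 if value in present else 0 for value in range(256)]


# The frame 00 10 7b 00 contains three distinct byte values.
vec = byte_presence_vector("00107b00")
print(len(vec), sum(vec))
```

Because only presence is recorded, byte position and repetition count are discarded. This matches the 0/1 attribute values described in step 2.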

Embodiment 3

[0131] Embodiment 3, clustering effect evaluation experiment:

[0132] Two experiments are designed: one uses 2000 single-protocol data frames as input, and the other uses 2500 multi-protocol mixed data frames as input. The resulting entropy values are then compared and analyzed to judge whether a cluster is good or bad.

[0133] (1) The entropy of each column for the 2000 single-protocol data frames is calculated as follows:

[0134]

Column  Entropy     Column  Entropy     Column  Entropy
1       1.73797     15      0           29      2.923939
2       2.579031    16      0           30      3.635007
3       3.253605    17      0           31      4.842482
4       3.443339    18      0           32      5.652463
5       3.573282    19      0           33      0.677264
6       3.781037    20      0           34      2.003118
7       0.739385    21      0           35      3.112292
8       2.533421    22      1.30097     3...
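
A per-column entropy of this kind can be computed as in the sketch below. It uses the standard Shannon entropy over the byte values in each column; the base-2 logarithm, the function name `column_entropies`, and the handling of frames shorter than the column index are assumptions, since the source excerpt does not give the formula.

```python
import math
from collections import Counter


def column_entropies(frames, width):
    """Shannon entropy (in bits) of the byte values in each column."""
    entropies = []
    for col in range(width):
        # Assumption: frames too short to reach this column are skipped.
        values = [frame[col] for frame in frames if col < len(frame)]
        counts = Counter(values)
        total = len(values)
        entropy = sum((c / total) * math.log2(total / c) for c in counts.values())
        entropies.append(entropy)
    return entropies


# Column 0 is constant; column 1 is split evenly between two values.
frames = [b"\x01\xaa", b"\x01\xbb", b"\x01\xaa", b"\x01\xbb"]
result = column_entropies(frames, width=2)
print(result)
```

Under this definition, a column whose byte is identical in every frame has entropy 0, as in columns 15 to 21 of the table above.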

Abstract

The invention discloses a method for separating unknown multi-protocol mixed data frames into single-protocol data frames. The method comprises the following steps: first, binary data is converted into hexadecimal format, and mixed unknown-protocol data frames with n rows and m columns are input; second, an approximate value K of the number of protocol types in the input data frames is calculated; third, clustering is carried out using the K-means algorithm with the specified K value, and n class clusters are obtained; fourth, the quality of each class cluster is assessed using an entropy-based cluster evaluation algorithm; fifth, class clusters with good clustering results are placed in a result set, and the fingerprint information of each such type is extracted and stored in a fingerprint database. The method addresses the difficulty of calculating an approximate value for the number K of mixed protocol types, the difficulty of distinguishing different protocols, the low accuracy of protocol frame cluster evaluation, and the difficulty of obtaining intuitive and effective results.
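
The first step of the abstract, converting binary frames to hexadecimal and arranging them as n rows and m columns, might look like the sketch below. The function name `frames_to_hex_matrix`, truncation to m bytes, and padding short frames with an empty cell are assumptions; the abstract does not say how frames of unequal length are handled.

```python
PAD = ""  # hypothetical placeholder for columns beyond the end of a short frame


def frames_to_hex_matrix(frames, m):
    """Return an n x m matrix of two-digit hex strings, one row per frame."""
    matrix = []
    for frame in frames:
        cells = [f"{value:02x}" for value in frame[:m]]  # keep the first m bytes
        cells += [PAD] * (m - len(cells))                # pad short frames
        matrix.append(cells)
    return matrix


matrix = frames_to_hex_matrix([b"\x00\x10\x7b\xff", b"\x45\x00"], m=3)
for row in matrix:
    print(row)
```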

Description

Technical field

[0001] The invention relates to a method for separating unknown multi-protocol mixed data frames into single-protocol data frames.

Background art

[0002] With the development of science and technology and the advance of computer technology, networks are becoming more and more complex. The security of the information network has become a core part of the national informatization strategy. In certain network environments, the threat of secrets being stolen through special means is becoming increasingly severe. Data stolen in this way is usually sent over wireless communication, and most of the data used in such communication is unknown multi-protocol mixed data. The subsequent assessment of information security is therefore very important.

[0003] However, current methods for separating unknown multi-protocol mixed data frames into single-protocol data frames face three difficulties. It is difficult to calculate...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L29/06
Inventor: 张凤荔, 周洪川, 刘渊, 郝玉洁, 张俊娇
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA