A Sequence Alignment Based Binary Unknown Protocol Message Format Division Method

A sequence-alignment technique for protocol messages, applied in digital transmission systems, data exchange networks, electrical components, and similar fields. It addresses the high time complexity of existing schemes, their poor fit for binary protocols, and the lack of a clear basis and method for format division, with the effect of lower time complexity and accurate, efficient format inference.

Active Publication Date: 2021-02-26
SOUTHEAST UNIV

AI Technical Summary

Problems solved by technology

However, existing schemes for inferring the message format of private protocols have several defects.
The PI project performs hierarchical clustering with the unsupervised UPGMA method, which has high time complexity.
Discoverer reduces the time complexity by constructing message attribute sequences, but it splits message samples on delimiters common to text protocols, which is not suitable for binary protocols.
[0004] Existing sequence-alignment-based division methods for binary protocols still mostly build on the PI project. They focus on improving the conventional Needleman-Wunsch and Smith-Waterman algorithms themselves, for example the matrix construction steps, but basically retain the comparison process of the classic PI project. They therefore still suffer from high time complexity and lack a clear basis and method for format division.

Examples

Embodiment

[0031] As shown in Figure 1, this embodiment provides a sequence-alignment-based method for dividing the message format of an unknown binary protocol. The input is a network traffic set G whose format is to be divided (G has been preprocessed and contains messages of a single protocol only). Each element g in G is a character sequence in the unknown protocol format, and the length of the longest character sequence is Len.

[0032] The processing steps are as follows:

[0033] 1. Initialize the comparison result record sequence Seq[n], where n = 1, 2, ..., 2*Len indexes a position in the protocol sequence.

[0034] After initialization, every position of Seq[n] has the value 0. 2*Len is the maximum length of the comparison result record sequence Seq; in actual processing, Seq may be shorter than 2*Len. In the division of the protocol format, it is necessary to consider mining and retaining the diversity features in the sequence dat...
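
The initialization in step 1 can be sketched in a few lines of Python. This is a minimal illustration, assuming each message g in G is held as a Python `bytes` object; the function name `init_result_sequence` and the sample messages are hypothetical and do not come from the patent text. Note that the patent indexes Seq from 1, while the Python list below is indexed from 0.

```python
def init_result_sequence(messages):
    """Return a zero-filled record sequence Seq of length 2 * Len.

    `messages` plays the role of the preprocessed set G; Len is the
    length of its longest message. (Hypothetical helper for illustration.)
    """
    if not messages:
        raise ValueError("G must contain at least one message")
    max_len = max(len(m) for m in messages)  # Len in the patent text
    return [0] * (2 * max_len)               # every position starts at 0


# Made-up sample messages standing in for a single-protocol traffic set G.
G = [bytes.fromhex("0a01ff"), bytes.fromhex("0a02ff00aa")]
Seq = init_result_sequence(G)
```

Here the longest message has 5 bytes, so Seq is allocated with 10 positions, which is the upper bound described in [0034].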

Abstract

The invention discloses a sequence-alignment-based method for dividing the message format of an unknown binary protocol. The main steps are: preprocessing to obtain a set of protocol sequences of a single type; setting up the result sequence; performing global sequence alignment and local sequence alignment on pairs of protocol sequences; merging the global alignment results; recording the local alignment results as similarity; integrating the alignment results into the result sequence; and dividing the message format according to the result sequence. Compared with schemes based on methods such as hierarchical clustering, the invention has lower algorithmic time complexity. It also mitigates the field position slippage that existing schemes cause by inserting too many gaps during sequence alignment, and offers better accuracy, efficiency, and practicality.
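
For orientation, the two pairwise alignment primitives named in the abstract can be sketched with the classical Needleman-Wunsch (global) and Smith-Waterman (local) algorithms. This is a textbook sketch, not the patent's own procedure: the scoring values (match = 1, mismatch = -1, gap = -1), the function names, and the use of `None` to mark a gap are all assumptions made for illustration, and the merging and integration steps of the invention are not shown.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of byte sequences a and b.

    Returns (score, aligned_a, aligned_b); gaps are marked with None.
    Scoring parameters are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append(None); i -= 1
        else:
            out_a.append(None); out_b.append(b[j - 1]); j -= 1
    out_a.reverse()
    out_b.reverse()
    return score[len(a)][len(b)], out_a, out_b


def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best local alignment score between byte sequences a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment never lets a cell drop below zero.
            cur[j] = max(0, prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


# Made-up messages: the second one lacks the middle byte of the first.
score, row_a, row_b = needleman_wunsch(b"\x0a\x01\xff", b"\x0a\xff")
local = smith_waterman_score(b"\x00\x0a\x01\xff", b"\x0a\x01\x22")
```

In the global alignment, the gap inserted into the shorter message keeps the trailing `0xff` bytes in the same column. Inserting too many such gaps is what shifts field positions in the existing schemes the abstract refers to.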

Description

Technical field

[0001] The invention belongs to the technical field of network protocol analysis, and in particular relates to a sequence-alignment-based method for dividing the message format of an unknown binary protocol.

Background

[0002] In 1967, R. A. Scantlebury and K. A. Bartlett of England's National Physical Laboratory first used the English word "protocol" to describe the process of data communication in a memo. Today, standardization organizations, network communication solution providers, and network operators have all formulated public protocols. As the name implies, the specifications of such protocols are public and their data formats are known. Examples include the Hypertext Transfer Protocol, most commonly used when mobile apps interact with a backend, and the Dynamic Host Configuration Protocol, used by home routers when configuring addresses. At the same time, for the purpose of commercial intere...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): H04L29/06, H04L12/26
CPC: H04L43/18, H04L69/03, H04L69/06
Inventor: 秦中元, 陆凯
Owner: SOUTHEAST UNIV