An online traffic identification method based on an incremental clustering algorithm

A traffic identification technology based on incremental clustering, applied in the network field. It addresses the high manpower and material cost, the large time and space overhead, and the inability to identify encrypted traffic found in existing methods, and achieves a high recognition rate and good real-time performance.

Inactive Publication Date: 2018-12-21
HARBIN ENG UNIV

AI Technical Summary

Problems solved by technology

The identification method based on the port number is accurate only for protocols that use well-known and registered ports; the identification method based on behavioral feature matching has relatively large time and space overhead, and its identification performance is insufficient; the traffic identification method based on deep packet inspection takes a lot of manpower and material re



Embodiment Construction

[0028] The present invention is further described below with reference to the accompanying drawings:

[0029] This traffic identification method is divided into an offline training phase and an online identification phase. A data flow is defined by its 5-tuple (source and destination port, source and destination IP address, protocol type), and the first 5 data packets of each flow are taken as a sub-flow for early identification of online data flows, which meets the real-time and high-speed requirements of online traffic identification. At the same time, four attributes (data packet size, average packet interval time, average packet length, and server port number) are used to carry out two-stage feature extraction.
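The flow definition and sub-flow features in [0029] can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `Packet` record, the function names, the direction-independent flow key, and the rule that the first packet's destination port is the server port are all assumptions made for illustration. The two-stage extraction itself is not detailed in this excerpt, so the sketch simply computes the four attributes in one pass.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean

SUBFLOW_LEN = 5  # first 5 packets of each flow, as stated in [0029]


@dataclass(frozen=True)
class Packet:  # hypothetical record, not defined in the source
    ts: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str
    size: int


def flow_key(p):
    """5-tuple key; endpoints are sorted so both directions share one flow (assumption)."""
    a, b = (p.src_ip, p.src_port), (p.dst_ip, p.dst_port)
    return (min(a, b), max(a, b), p.proto)


def subflow_features(packets):
    """Group packets by flow and compute the four attributes on each sub-flow."""
    flows = defaultdict(list)
    for p in sorted(packets, key=lambda pkt: pkt.ts):
        pkts = flows[flow_key(p)]
        if len(pkts) < SUBFLOW_LEN:
            pkts.append(p)

    features = {}
    for key, pkts in flows.items():
        if len(pkts) < SUBFLOW_LEN:
            continue  # sub-flow not complete yet
        gaps = [b.ts - a.ts for a, b in zip(pkts, pkts[1:])]
        features[key] = {
            "sizes": [p.size for p in pkts],         # packet size
            "mean_gap": mean(gaps),                  # average interval time
            "mean_len": mean(p.size for p in pkts),  # average packet length
            "server_port": pkts[0].dst_port,         # assumption: client sends first
        }
    return features
```

Flows with fewer than 5 packets are skipped here; a real online system would also need a timeout to evict flows that never reach 5 packets.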

[0030] The offline training phase is described in detail below with reference to Figure 2 and Figure 3:

[0031] 1. First, preprocess the collected network traffic, including fragmentation and reassembly of IP layer data packets, filtering out-of-order retran...


Abstract

The invention belongs to the network technical field, in particular to an online traffic identification method based on an incremental clustering algorithm. The method includes: an offline recognition stage and an online recognition stage, wherein in the offline recognition stage, a semi-supervised learning flow algorithm based on an improved K-means algorithm is used to perform preliminary clustering and mapping work on the prepared training data sets, and the data sets which are preliminarily classified are obtained; in the online recognition stage, based on the completed clustering and mapping data sets formed in the offline identification phase, incremental clustering is used to determine the network application type of the newly added data streams online, so as to achieve the purpose of traffic identification. According to the method, based on machine learning technology, by constructing a suitable recognition model to learn the prepared data, the online traffic can be incrementally clustered in real time, and the preliminary semi-supervised classification can be carried out by combining the prepared training set, which can realize the online recognition of network traffic, and has good real-time performance and high recognition rate.
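The two stages in the abstract can be sketched roughly as follows. This excerpt does not describe the patent's improved K-means, so the sketch substitutes plain Lloyd-style K-means, maps each cluster to an application type by majority vote over a few labeled samples, and uses a running-mean centroid update as a stand-in for incremental clustering. All names (`train`, `OnlineIdentifier`, `nearest`) and the `"unknown"` fallback label are hypothetical.

```python
import math
from collections import Counter


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


class OnlineIdentifier:
    """Online stage: label a new flow, then fold it into its cluster."""

    def __init__(self, centroids, labels, counts):
        self.centroids = [list(c) for c in centroids]
        self.labels = list(labels)
        self.counts = list(counts)

    def identify(self, point):
        i = nearest(point, self.centroids)
        self.counts[i] += 1
        n = self.counts[i]
        # Running-mean update: the centroid stays the mean of all points seen.
        self.centroids[i] = [c + (x - c) / n for c, x in zip(self.centroids[i], point)]
        return self.labels[i]


def train(points, labeled, init_centroids, iters=20):
    """Offline stage: plain K-means, then map clusters to labels by majority vote."""
    centroids = [list(c) for c in init_centroids]
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(g) for col in zip(*g)]

    counts = [0] * len(centroids)
    for p in points:
        counts[nearest(p, centroids)] += 1

    votes = [Counter() for _ in centroids]
    for point, label in labeled:
        votes[nearest(point, centroids)][label] += 1
    labels = [v.most_common(1)[0][0] if v else "unknown" for v in votes]
    return OnlineIdentifier(centroids, labels, counts)
```

A typical use would call `train` once on feature vectors from the training set, then call `identify` on each new sub-flow's feature vector. Features on very different scales (ports versus inter-arrival times) would likely need normalization first, which is omitted here.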

Description

Technical field

[0001] The invention belongs to the field of network technology, and in particular relates to an online traffic identification method based on an incremental clustering algorithm.

Background technique

[0002] With the rapid development of the Internet, the network environment has become more and more complex, and new types of applications and services have become increasingly diverse. Real-time and accurate identification of network traffic is of great significance to network management and traffic. There are currently four traffic identification methods: port number-based traffic identification methods, behavioral feature-based traffic identification methods, deep packet inspection-based traffic identification methods, and machine learning-based traffic identification methods. The identification method based on the port number is only accurate for network protocol traffic identification using commonly used ports and registered ports; the identification meth...


Application Information

IPC(8): H04L12/26, G06K9/62
CPC: H04L43/026, H04L43/028, H04L43/062, G06F18/23213
Inventor: 苘大鹏, 杨武, 王巍, 玄世昌, 吕继光, 甘志雄
Owner: HARBIN ENG UNIV