Online VoIP flow identification method based on C4.5 decision tree

An identification method based on decision tree technology, applied in the field of traffic identification. It addresses three problems in prior work: the real-time performance of the system was not studied, identification from a fixed number of data packets is unstable, and true online identification was not achieved.

Active Publication Date: 2016-09-28
GUILIN UNIV OF ELECTRONIC TECH

AI Technical Summary

Problems solved by technology

However, the prior literature does not study the real-time performance of the system, identification based on a fixed number of data packets is unstable, and the system does not achieve true online identification.
Document 4, "Di Mauro M, Longo M. Skype traffic detection: A decision theory-based tool[C]//Security Technology (ICCST), 2014 International Carnahan Conference on. IEEE, 2014: 1-6", proposes calling the TShark module to capture packets online and using the machine learning algorithms in the WEKA tool to build a classifier for online identification of VoIP traffic. Its accuracy is low, however, reaching only 83%, and TShark is stated to work only under Linux.

Examples

Embodiment 1

[0162] Parsing PCAP files

[0163] The data stream files, whether captured with Wireshark or downloaded, are all in PCAP format, whereas the present invention requires a training set in CSV format. A format conversion is therefore needed, as follows:

[0164] Parse the packet headers in the PCAP file: the 14-byte data link layer header, the 20-byte IP header, and the 20-byte TCP header or 8-byte UDP header;

[0165] Write a program to extract features from these headers, such as the five-tuple {source IP address, destination IP address, source port, destination port, transport protocol}, the packet length, and the inter-packet time interval;

[0166] Assemble the UDP flows according to the Cisco rule (the packet interval within a UDP flow does not exceed 30 s);

[0167] Save the assembled dataset in CSV format.
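
The conversion steps in [0164] to [0167] can be sketched roughly as follows. This is a minimal illustration in Python, not the patent's implementation: it assumes Ethernet/IPv4 frames whose raw bytes and timestamps have already been read from the PCAP file, treats a flow as one direction of a five-tuple, and writes a few illustrative CSV columns. The function names (`parse_udp_packet`, `assemble_flows`, `flows_to_csv`) and the column set are hypothetical and are not the 12 features used by the invention.

```python
import csv
import io
import socket
import struct

FLOW_TIMEOUT_S = 30.0  # maximum packet gap inside one UDP flow, as stated in [0166]


def parse_udp_packet(frame: bytes):
    """Return (five_tuple, ip_total_length) for an Ethernet/IPv4/UDP frame, else None."""
    if len(frame) < 14 + 20 + 8:          # Ethernet + minimal IPv4 + UDP headers
        return None
    (ether_type,) = struct.unpack("!H", frame[12:14])
    if ether_type != 0x0800:              # not IPv4
        return None
    ihl = (frame[14] & 0x0F) * 4          # IP header length; 20 bytes without options
    (total_len,) = struct.unpack("!H", frame[16:18])
    if frame[23] != 17:                   # IP protocol field: 17 means UDP
        return None
    src_ip = socket.inet_ntoa(frame[26:30])
    dst_ip = socket.inet_ntoa(frame[30:34])
    l4 = 14 + ihl                         # offset of the UDP header
    if len(frame) < l4 + 8:
        return None
    src_port, dst_port = struct.unpack("!HH", frame[l4:l4 + 4])
    return (src_ip, dst_ip, src_port, dst_port, "UDP"), total_len


def assemble_flows(packets):
    """Group (timestamp, five_tuple, length) records, sorted by time, into flows."""
    active = {}  # five_tuple -> flow currently open for that key
    flows = []
    for ts, key, length in packets:
        flow = active.get(key)
        if flow is None or ts - flow["last_ts"] > FLOW_TIMEOUT_S:
            # First packet for this key, or the gap exceeded 30 s: open a new flow.
            flow = {"key": key, "first_ts": ts, "last_ts": ts, "lengths": [], "gaps": []}
            active[key] = flow
            flows.append(flow)
        else:
            flow["gaps"].append(ts - flow["last_ts"])
        flow["lengths"].append(length)
        flow["last_ts"] = ts
    return flows


def flows_to_csv(flows) -> str:
    """Serialize flows to CSV text with illustrative (hypothetical) columns."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["src_ip", "dst_ip", "src_port", "dst_port", "proto",
                     "packets", "bytes", "duration_s"])
    for f in flows:
        writer.writerow([*f["key"], len(f["lengths"]), sum(f["lengths"]),
                         round(f["last_ts"] - f["first_ts"], 6)])
    return out.getvalue()
```

Keeping the per-flow lists of lengths and gaps makes it straightforward to derive further statistics (means, variances, and so on) before writing the CSV file.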

[0168] Twelve relevant flow statistical features are selected as the feature set from which the C4.5 decision tree algorithm learns, ...

Abstract

The invention discloses an online VoIP flow identification method based on a C4.5 decision tree. The method screens the key characteristics of voice flows to obtain an optimal feature subset and builds a classifier with the C4.5 decision tree algorithm, which improves online identification precision. It also proposes, for the first time, a JPcap capture-and-detect mechanism: a sniffer written with the JPcap library captures data packets in real time while the feature values of each data flow are computed in a distributed manner, and, in combination with a threshold time, the VoIP flows in the network are identified dynamically, which improves real-time performance. The identification results show an offline identification precision of up to 99%, an online identification precision of up to 92%, and an identification time of only 0.57 s. The method overcomes the disadvantages of the prior art and achieves high-precision, real-time online VoIP flow identification.
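
As background on the classifier named in the abstract: C4.5 chooses its splits by gain ratio, that is, information gain divided by the split information of the partition. The following is a simplified Python sketch of that criterion for a single continuous feature with a binary threshold split. It is not the patent's code; the function names, the use of midpoints as candidate thresholds, and the feature in the usage note (mean packet length) are assumptions made for illustration, and pruning and missing-value handling are omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` versus `value > threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:             # degenerate split carries no information
        return 0.0
    n = len(labels)
    w_left, w_right = len(left) / n, len(right) / n
    info_gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return info_gain / split_info


def best_threshold(values, labels):
    """Pick the midpoint between adjacent distinct values with the highest gain ratio."""
    distinct = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(midpoints, key=lambda t: gain_ratio(values, labels, t), default=None)
```

For example, with hypothetical mean packet lengths `[60, 62, 64, 500, 900, 1200]` labeled `voip` for the first three flows and `other` for the rest, `best_threshold` selects the midpoint between 64 and 500, which separates the two classes completely.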

Description

Technical field

[0001] The invention relates to the technical field of traffic identification, in particular to a method for online identification of VoIP traffic based on a C4.5 decision tree.

Background

[0002] VoIP (Voice over Internet Protocol) continues to increase its share of voice communication services because of its low service cost and easy deployment. While its development brings opportunities, it also poses great challenges to secure network operation. In recent years, VoIP applications operating illegally and covertly have produced a growing number of advertising and fraudulent calls, interfering with and harming people's daily lives. Controlling VoIP services is therefore particularly important, and high-precision online identification of VoIP traffic has become an urgent problem in practical applications.

[0003] Current research focuses on offline identification, mainly using machine learning algorithms to build classifi...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L12/26, H04L12/851
CPC: H04L43/026, H04L43/028, H04L43/18, H04L47/2483
Inventor: 刘建明, 唐霞, 李龙, 陈振舜, 张致远
Owner: GUILIN UNIV OF ELECTRONIC TECH