Mobile application traffic identification method and system based on machine learning

This mobile application and machine learning technology is applied in the field of traffic identification. It addresses three problems: traditional traffic identification methods are not suited to identifying and processing mobile application traffic, they cannot effectively meet practical application requirements, and they cannot identify the specific application that traffic comes from. Its effects are improved information richness, improved classification and recognition ability, and a reduced misclassification rate.

Pending Publication Date: 2022-05-13
CHONGQING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

First, almost all mobile communication is carried over HTTP/HTTPS, so traditional port-based methods can only identify mobile traffic as Web traffic and cannot identify which specific application the traffic comes from.
Second, traditional DPI (Deep Packet Inspection) methods identify traffic by inspecting the payload of each data packet. To protect user privacy, many applications now use encrypted protocols for data transmission, so DPI cannot effectively meet practical application needs.
For these reasons, traditional traffic identification methods are not suitable for identifying and processing mobile application traffic.

Examples

Embodiment 1

[0054] Connect a mobile phone with no applications installed to a hotspot hosted on the PC. When collecting the traffic of a target application, install and run only that application's apk file, and disable background execution in the phone's system so that no other program runs in the background. Open Wireshark on the PC to capture the data packets sent by the mobile device. The amount of traffic collected for each application is determined by the amount of feature data generated from it afterwards: the collection goal is that, once the collected traffic has been processed, it yields about 3000 feature records. The collected data set records the detailed information of every data packet. After collection, the application traffic is saved locally in pcap format.

[0055] Use Wireshark to process the locally stored pcap file, filtering out erroneous and retransmitted data packets. Then use the tshark com...

Embodiment 2

[0065] As shown in Figure 2, the mobile application traffic identification system used in the present invention includes a traffic monitoring module, a traffic processing module, a traffic display module, a feature extraction module, a feature display module, an application identification module, and a result display module;

[0066] Traffic monitoring module: deploys the traffic monitoring tool Wireshark to capture the application traffic sent by the mobile phone, and automatically saves the capture locally after every 1000 data packets;

[0067] Traffic processing module: for the traffic to be detected, stored locally in pcap format, uses the tshark options "-T fields -e frame.time_delta -e frame.len -e ip.src -e ip.dst -e tcp.srcport -e tcp.dstport" to analyze the pcap file, obtaining for each data packet the interval since the previous packet, the packet size, the source IP address, the destination IP address, the source port, and the destination port, and redirects the output to a CSV file;

...
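
The field extraction described in paragraph [0067] can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the helper names `build_tshark_args` and `parse_rows` are hypothetical, and the `-r` (read from a file) and `-E separator=,` (comma-separated output) options are assumptions added so that tshark reads a stored pcap and prints CSV. Only the six `-e` fields come from the text.

```python
import csv
import io

# The six fields named in the tshark command of paragraph [0067].
FIELDS = [
    "frame.time_delta", "frame.len", "ip.src",
    "ip.dst", "tcp.srcport", "tcp.dstport",
]


def build_tshark_args(pcap_path):
    """Build the tshark argument list for one stored pcap file.

    Assumption: -r and -E separator=, are not quoted in the text; they are
    added here so tshark reads a file and prints comma-separated values.
    """
    args = ["tshark", "-r", pcap_path, "-T", "fields"]
    for field in FIELDS:
        args += ["-e", field]
    args += ["-E", "separator=,"]
    return args


def parse_rows(csv_text):
    """Turn tshark's comma-separated output into a list of typed records."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        # Packets without an IP/TCP layer leave fields empty; skip them.
        if len(record) != len(FIELDS) or "" in record:
            continue
        rows.append({
            "time_delta": float(record[0]),
            "length": int(record[1]),
            "src_ip": record[2],
            "dst_ip": record[3],
            "src_port": int(record[4]),
            "dst_port": int(record[5]),
        })
    return rows


# Hypothetical sample output: one TCP packet and one packet with empty fields.
sample = "0.000000000,74,192.168.137.2,203.0.113.10,50000,443\n0.010000000,60,,,,\n"
print(build_tshark_args("capture.pcap"))
print(parse_rows(sample))
```

Assuming tshark is installed, the argument list could be passed to `subprocess.run(args, capture_output=True, text=True)` and the resulting `stdout` handed to `parse_rows`. Skipping rows with empty fields is a design choice of this sketch; the text does not say how non-TCP packets are handled.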

Abstract

The invention relates to a mobile application traffic identification method and system based on machine learning, and belongs to the field of traffic identification. The method comprises a traffic acquisition stage, a traffic processing stage, a feature extraction stage, a traffic labeling stage, a traffic balancing stage and a model training stage. The system comprises a traffic monitoring module, a traffic processing module, a traffic display module, a feature extraction module, a feature display module, an application identification module and a result display module. The method provides a multi-feature fusion feature extraction scheme, which improves information richness, optimizes the model training effect and improves classification accuracy. It also designs a model training mode that combines the SMOTE + ENN sample balancing algorithm with the random forest algorithm, which reduces the misclassification rate of minority-class samples and improves the classification and recognition capability of the classifier.
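
The sample balancing idea named in the abstract can be illustrated with a small pure-Python sketch. It rests on several assumptions: the function names are hypothetical, the ENN step uses a simple majority vote among the k nearest neighbours (one common variant), the toy feature vectors are invented for illustration, and random forest training is omitted. The patent does not state which libraries or parameters it uses.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(X, y, minority, n_new, k=2, seed=0):
    """Oversample: add n_new synthetic points for the minority class.

    Each synthetic point lies on the segment between a minority sample and
    one of its k nearest minority neighbours. Needs at least two minority
    samples.
    """
    rng = random.Random(seed)
    pts = [x for x, label in zip(X, y) if label == minority]
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(pts)
        neighbours = sorted(
            (p for p in pts if p is not base), key=lambda p: sq_dist(base, p)
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return X + synthetic, y + [minority] * n_new


def enn(X, y, k=3):
    """Clean: drop samples whose label loses the vote of their k nearest neighbours."""
    keep_X, keep_y = [], []
    for i, (x, label) in enumerate(zip(X, y)):
        nearest = sorted(
            (j for j in range(len(X)) if j != i), key=lambda j: sq_dist(x, X[j])
        )[:k]
        top_label = Counter(y[j] for j in nearest).most_common(1)[0][0]
        if top_label == label:
            keep_X.append(x)
            keep_y.append(label)
    return keep_X, keep_y


# Invented toy data: six samples of class 0, three of minority class 1.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.5], [0.2, 0.8],
     [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
y = [0] * 6 + [1] * 3

X_bal, y_bal = smote(X, y, minority=1, n_new=3)
X_clean, y_clean = enn(X_bal, y_bal)
print(Counter(y_bal), Counter(y_clean))
```

In practice one would more likely rely on existing implementations, for example `SMOTEENN` from imbalanced-learn followed by `RandomForestClassifier` from scikit-learn; whether the inventors did so is not stated in the text.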

Description

Technical field

[0001] The invention belongs to the field of traffic identification and relates to a machine learning-based mobile application traffic identification method.

Background

[0002] The particular characteristics of mobile application traffic pose great challenges to traditional traffic identification methods. First, almost all mobile communication is carried over HTTP/HTTPS, so traditional port-based methods can only identify mobile traffic as Web traffic and cannot identify which specific application the traffic comes from. Second, traditional DPI (Deep Packet Inspection) methods identify traffic by inspecting the payload of each data packet. To protect user privacy, many applications now use encrypted protocols for data transmission, so DPI cannot effectively meet the needs of practical applications. Based on the above reasons, traditional traffic id...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): H04L47/125, H04L47/2441, H04L47/2483, H04L69/22, G06N3/00, G06N20/00
CPC: H04L47/2441, H04L47/2483, H04L47/125, H04L69/22, G06N20/00, G06N3/006
Inventors: 陈龙, 汤婷婷, 韩世凯
Owner: CHONGQING UNIV OF POSTS & TELECOMM