A Sample Automatic Calibration Method for Encrypted Traffic Identification

An automatic sample calibration technique for encrypted traffic identification, applied in the field of network security and user privacy. It addresses three problems: captured traffic cannot be matched to the corresponding network behavior, classifiers trained on idealized samples are inapplicable, and the start and end time points of user access cannot be determined. The result is more accurate identification of user behavior.

Active Publication Date: 2020-04-28
XI AN JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

(2) In real network scenarios, only mixed traffic data can be obtained. How to split it into the request data of different websites for use as training samples is a basic problem in classification learning.
However, in a real network environment, a classifier trained on such highly idealized traffic samples is not applicable. Traffic captured at the egress does not reveal the start and end time points of user access, and the captured traffic mixes together the requests of one user, or even of several users, to multiple websites. It is therefore not possible to capture all the traffic of an entire session and match it to the corresponding network behavior.


Examples


Embodiment

[0031] Step 1: Given a pcap file of traffic data captured continuously for n days, parse it into a data packet sequence in the format of , and sort the sequence by timestamp in ascending order. Given the communication log generated on the proxy server side, the format of each record is . Because a port of the same IP is not reused within two hours, the communication log is required to be written as one file every two hours, that is, a new log file starts at every even hour (0:00, 2:00, 4:00, 6:00, 8:00, and so on). For example, the communication log from 18:00 to 20:00 on 2018/4/20 is recorded in the file 2018-04-20.18:00.
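The sorting and log-file naming described in Step 1 can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the packet record layout is not given in the text, so each packet is assumed here to be a tuple whose first field is a numeric timestamp, and the helper names `sort_packets` and `log_file_name` are hypothetical.

```python
from datetime import datetime


def sort_packets(packets):
    """Return the packets ordered by ascending timestamp.

    Assumption: each packet is a tuple whose first field is the timestamp.
    """
    return sorted(packets, key=lambda pkt: pkt[0])


def log_file_name(moment: datetime) -> str:
    """Name of the two-hour log file whose window contains `moment`.

    Windows start at even hours (0:00, 2:00, ...), and the file is named
    after the start of its window, e.g. 2018-04-20.18:00.
    """
    even_hour = moment.hour - (moment.hour % 2)
    window_start = moment.replace(hour=even_hour, minute=0, second=0, microsecond=0)
    return window_start.strftime("%Y-%m-%d.%H:%M")


# Hypothetical packets: (timestamp, label)
packets = [(1524222005.2, "c"), (1524222001.0, "a"), (1524222003.5, "b")]
ordered = sort_packets(packets)

# A log record written at 19:30:15 belongs to the 18:00-20:00 file.
name = log_file_name(datetime(2018, 4, 20, 19, 30, 15))
print([label for _, label in ordered], name)
```

Naming each file after the start of its window means that the file covering any given moment can be computed directly from the moment itself, without scanning a directory listing.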

[0032] Step 2: Record the largest and smallest timestamps in the sequence of packets as ts0 and ts1, convert them to the format [Year-Month-Day.Hour:Minute:Second], and record the results as t0 and t1. Calculate the time that is less than and closest to t0, and the time that is less than and closest to t1, in the [Year-Month-Day. Even-number...
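The timestamp conversion and the rounding down to an even hour in Step 2 might look like the sketch below. Several points are assumptions, since the source text is truncated: timestamps are taken to be Unix epoch seconds interpreted in UTC, a time that falls exactly on an even hour is mapped to itself, and the function names `format_timestamp` and `floor_to_even_hour` are hypothetical.

```python
from datetime import datetime, timezone

STAMP_FORMAT = "%Y-%m-%d.%H:%M:%S"  # [Year-Month-Day.Hour:Minute:Second]


def format_timestamp(ts: float) -> str:
    """Convert epoch seconds to the bracketed format (UTC is an assumption)."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime(STAMP_FORMAT)


def floor_to_even_hour(stamp: str) -> str:
    """Round a formatted time down to the closest even hour.

    Assumption: a time exactly on an even hour maps to that same hour.
    """
    moment = datetime.strptime(stamp, STAMP_FORMAT)
    even_hour = moment.hour - (moment.hour % 2)
    floored = moment.replace(hour=even_hour, minute=0, second=0)
    return floored.strftime("%Y-%m-%d.%H:%M")


# Hypothetical packet timestamps (epoch seconds) from a sorted capture.
timestamps = [1524182400, 1524200000, 1524252615]
first = format_timestamp(min(timestamps))
last = format_timestamp(max(timestamps))
print(first, floor_to_even_hour(first))
print(last, floor_to_even_hour(last))
```

The two floored values bound the range of two-hour log files that overlap the capture.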


Abstract

The invention discloses an automatic sample calibration method for encrypted traffic identification and proposes a traffic splitting method based on TCP characteristics. Starting from the separation of different application programs, the traffic is split into a plurality of different samples. The traffic data is analyzed and split with the help of log information, so as to establish the correspondence between network behavior and traffic data, that is, to calibrate (label) the traffic data for classification learning. The method makes full use of knowledge of the TCP transport-layer protocol and of the proxy server's log information, and can be applied to the identification of encrypted traffic in real scenarios.
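As a rough illustration of splitting mixed traffic by a TCP characteristic, the sketch below groups packets into bidirectional connections by their endpoint pair. The abstract does not say which TCP characteristics the method uses, so the choice of the (IP, port) endpoint pair, the packet tuple layout, and the names `flow_key` and `split_by_flow` are all assumptions made for this example.

```python
from collections import defaultdict


def flow_key(src_ip, src_port, dst_ip, dst_port):
    """Direction-independent key: both directions of a connection match."""
    a = (src_ip, src_port)
    b = (dst_ip, dst_port)
    return (a, b) if a <= b else (b, a)


def split_by_flow(packets):
    """Group packet timestamps by connection.

    Assumption: each packet is (timestamp, src_ip, src_port, dst_ip, dst_port).
    """
    flows = defaultdict(list)
    for ts, src_ip, src_port, dst_ip, dst_port in packets:
        flows[flow_key(src_ip, src_port, dst_ip, dst_port)].append(ts)
    return dict(flows)


# Hypothetical mixed capture: two connections interleaved in time.
packets = [
    (1.0, "10.0.0.2", 50001, "93.184.216.34", 443),
    (1.1, "10.0.0.2", 50002, "151.101.1.69", 443),
    (1.2, "93.184.216.34", 443, "10.0.0.2", 50001),
    (1.3, "151.101.1.69", 443, "10.0.0.2", 50002),
]
flows = split_by_flow(packets)
for key, stamps in flows.items():
    print(key, stamps)
```

Each resulting group could then be matched against proxy log records to attach a label, which is the calibration step the abstract describes.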

Description

Technical field

[0001] The invention belongs to the field of network security and user privacy, and in particular relates to an automatic sample calibration method for encrypted traffic identification.

Background

[0002] In recent years, with the rapid development of the Internet, the network has become closely integrated into production and daily life, and network security has become an issue that cannot be ignored. Public awareness of network security has gradually increased, and more and more users and enterprises have begun to pay attention to the protection and secure transmission of information. Network behavior recognition technology based on encrypted traffic can be used to implement network security supervision, especially the supervision of illegal business and harmful information, such as human trafficking, prostitution and gambling, and arms trading. Encrypted traffic recognition (Website Fingerprinting, WF) is a technology that classifies ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): H04L29/06, H04L29/08, H04L12/851, H04L9/32
CPC: H04L9/3297, H04L47/2483, H04L63/1408, H04L63/1425, H04L67/56
Inventor: 马小博, 师马玮, 焦洪山, 安冰玉, 赵延康, 李剑锋, 彭嘉豪
Owner: XI AN JIAOTONG UNIV