HTTP traffic feature recognition and extraction method based on machine learning

A technology combining traffic features and machine learning, applied in digital transmission systems, electrical components, transmission systems, etc. It addresses problems such as features being extracted by mistake, pollution of the feature library, and high labor costs, and achieves the effects of reducing labor investment, improving feature accuracy, and reducing the probability of extracting dirty data.

Active Publication Date: 2020-03-17
南京烽火星空通信发展有限公司

AI Technical Summary

Problems solved by technology

[0003] Although some feature recognition and extraction products have been launched on the market, these products have certain shortcomings. Extraction based on feature formats such as regular expressions and state machines tends to extract features by mistake, pollute the feature library, and make th

Embodiment Construction

[0043] The present invention will be further described below in conjunction with the accompanying drawings and embodiments.

[0044] As shown in Figure 1, the present invention provides a machine learning-based HTTP traffic feature recognition and extraction method, specifically comprising:

[0045] 1. HTTP traffic identification and collection

[0046] The HTTP traffic identification and collection module is the data source for the feature detection and rule generation modules. The traffic it provides to the backend must be both diverse and effective: diversity means covering as many types of HTTP traffic as possible, and effectiveness means filtering out meaningless packets to reduce the processing load on the backend. The specific process is as follows:

[0047] 1.1 HTTP traffic identification

[0048] (1) Input traffic is not restricted by port, and full traffic sampling is performed;

[0049] (2) TCP session reassembly is performed to restore the session context;

[0050] (3) Detect the application-layer payload (...
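The three identification steps above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the detection criterion in step (3) is truncated in the source, so matching an HTTP/1.x request line or status line at the start of the reassembled stream is an assumption made here. The names `reassemble`, `looks_like_http`, and the `(flow_key, seq, payload)` segment tuple are hypothetical, and the reassembly ignores sequence wrap-around, overlapping segments, and gaps.

```python
import re
from collections import defaultdict

# Assumed signatures: an HTTP/1.x request line or status line at stream start.
REQUEST_LINE = re.compile(
    rb"^(?:GET|POST|PUT|DELETE|HEAD|OPTIONS|PATCH|TRACE|CONNECT) \S+ HTTP/1\.[01]\r\n"
)
STATUS_LINE = re.compile(rb"^HTTP/1\.[01] \d{3} ")


def reassemble(segments):
    """Rebuild one byte stream per flow from (flow_key, seq, payload) tuples.

    Segments are ordered by sequence number and retransmitted duplicates
    (same seq) are dropped. No port filtering is applied, matching step (1).
    """
    flows = defaultdict(dict)
    for flow_key, seq, payload in segments:
        if payload:  # skip pure ACKs and other empty segments
            flows[flow_key].setdefault(seq, payload)
    return {
        key: b"".join(data for _, data in sorted(parts.items()))
        for key, parts in flows.items()
    }


def looks_like_http(stream):
    """Classify a reassembled stream by its payload, not by its port."""
    return bool(REQUEST_LINE.match(stream) or STATUS_LINE.match(stream))


if __name__ == "__main__":
    web = ("10.0.0.1", 40000, "10.0.0.2", 8081)   # HTTP on a non-standard port
    other = ("10.0.0.1", 40001, "10.0.0.2", 80)   # non-HTTP bytes on port 80
    segments = [
        (web, 1016, b"Host: example.test\r\n\r\n"),  # arrives out of order
        (web, 1000, b"GET / HTTP/1.1\r\n"),
        (web, 1000, b"GET / HTTP/1.1\r\n"),          # retransmission
        (other, 1, b"\x16\x03\x01\x02\x00"),
    ]
    for key, stream in reassemble(segments).items():
        print(key, looks_like_http(stream))
```

Because classification looks only at the payload, HTTP on port 8081 is kept while non-HTTP bytes on port 80 are discarded, which is the point of leaving the input port unrestricted.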

Abstract

The invention discloses a machine learning-based method for recognizing and extracting HTTP (Hypertext Transfer Protocol) traffic features. The method comprises the following steps: step 1, HTTP traffic identification and collection; step 2, feature detection and rule generation; and step 3, HTTP traffic feature extraction. Compared with the regular-expression-based feature extraction currently on the market, the method improves feature accuracy and reduces the probability that regular expressions extract dirty data by mistake. Compared with manual feature labeling, it reduces labor cost and the response time to new features. In addition, because feature/rule generation is separated from feature extraction, a dedicated extraction engine can be designed, which improves feature extraction efficiency.

Description

Technical field

[0001] The invention relates to a method for identifying and extracting HTTP traffic features based on machine learning.

Background technique

[0002] In today's Internet society, networks carry a large amount of HTTP traffic, and that traffic contains a large amount of valuable data. Collecting this data and integrating it into a knowledge base helps with understanding information in a timely manner, responding to events, and making decisions. At present, there are many methods for parsing HTTP data and extracting effective features, such as extraction based on regular expressions, extraction based on feature matching, and machine learning-based feature identification.

[0003] Although some feature recognition and extraction products have been launched on the market, these products have certain shortcomings. Extraction based on feature formats such as regular expressions and state machines tends to extract features by mistake, pollute the fe...

Application Information

IPC(8): H04L12/851, H04L29/06, H04L29/08
CPC: H04L47/2483, H04L67/02, H04L69/04, H04L69/06, H04L69/22
Inventor: 祝远鉴, 王懿, 韩震, 汪洋
Owner 南京烽火星空通信发展有限公司