Traffic identification method based on bag of word (BOW) model and statistic features

A traffic identification technique based on statistical features, applied in transmission systems, digital transmission systems, electrical components, etc., that simplifies the feature extraction process and overcomes privacy and efficiency issues

Active Publication Date: 2012-07-11
上海深杳智能科技有限公司 +1

AI Technical Summary

Problems solved by technology

But again, the technology still hasn't solved the above problems.


Embodiment Construction

[0041] The embodiments of the present invention are described in detail below. This embodiment is implemented on the premise of the technical solution of the present invention and gives a detailed implementation and specific operating procedure, but the protection scope of the present invention is not limited to the following embodiments.

[0042] As shown in Figure 1, traffic data from the real network is stored in a data object collection device (specifically a router, a switch, or a server; in short, one of the core devices of the network). Assume that the device has collected N data flow objects, and that other techniques (such as DPI or manual identification) are used to label the network traffic category (such as WEB, P2P, or VOIP) to which each data flow object belongs. These flow objects then become the training set data objects for machine learning. Afterwards, the vector representations of these training set data objects are ...


Abstract

The invention discloses a traffic identification method based on a bag of words (BOW) model and statistical features. The method pairs the BOW model with a feature extraction method and trains on features of collected network traffic to obtain a feature vector for each network category. For new network traffic, the method likewise extracts traffic features, uses the BOW model to obtain the corresponding feature vector, and compares it in turn with the previously established feature vector of each network category; the category whose feature vector matches best becomes the category tag of the new traffic. The BOW method combines an unsupervised k-means clustering method with a supervised k-nearest neighbor method, which makes it better suited to multi-category classification. Because the BOW model is insensitive to spatial position, the extracted features do not need to be arranged in time-series order, which makes processing convenient.
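The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the per-packet features (packet size, inter-arrival time), the toy flows, the codebook size `k`, the use of a mean histogram as each category's feature vector, and squared Euclidean distance as the "matching degree" are all hypothetical choices made for the example. For brevity, the supervised step is reduced to picking the single nearest category vector rather than a full k-nearest neighbor vote.

```python
import random
from collections import defaultdict

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (unsupervised step): returns k centers, the 'words'."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Recompute each center as its cluster mean; keep it if the cluster is empty.
        centers = [
            tuple(sum(c) / len(groups[i]) for c in zip(*groups[i]))
            if groups[i] else centers[i]
            for i in range(k)
        ]
    return centers

def bow_vector(flow, codebook):
    """Normalized histogram of codewords; packet order inside the flow is ignored."""
    hist = [0.0] * len(codebook)
    for pkt in flow:
        hist[nearest(pkt, codebook)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def train(flows, labels, k):
    """Build the codebook, then one mean BOW vector per traffic category."""
    codebook = kmeans([pkt for flow in flows for pkt in flow], k)
    per_class = defaultdict(list)
    for flow, label in zip(flows, labels):
        per_class[label].append(bow_vector(flow, codebook))
    class_vectors = {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in per_class.items()
    }
    return codebook, class_vectors

def classify(flow, codebook, class_vectors):
    """Tag a new flow with the category whose vector matches best."""
    vec = bow_vector(flow, codebook)
    return min(class_vectors, key=lambda lbl: sq_dist(vec, class_vectors[lbl]))

# Hypothetical labeled flows: each packet is (size in bytes, inter-arrival in s).
flows = [
    [(1400.0, 0.01), (1300.0, 0.02), (1450.0, 0.01)],
    [(1350.0, 0.015), (1420.0, 0.01)],
    [(160.0, 0.02), (172.0, 0.02), (165.0, 0.02)],
    [(158.0, 0.02), (170.0, 0.021)],
]
labels = ["WEB", "WEB", "VOIP", "VOIP"]

codebook, class_vectors = train(flows, labels, k=2)
print(classify([(1380.0, 0.012), (1410.0, 0.01)], codebook, class_vectors))
```

Because `bow_vector` only counts how often each codeword occurs, shuffling the packets of a flow leaves its vector unchanged, which mirrors the abstract's point that features need not be arranged in time-series order.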

Description

Technical field

[0001] The invention relates to a method for identifying network data streams, in particular one that uses the BoW (Bag of Words) machine learning model together with a proposed feature extraction method for predictive modeling.

Background technique

[0002] At the end of the 1990s and the beginning of this century, a wave of experiments and attempts at Internet traffic classification emerged, including revolutionary technological innovations. One of the main driving forces of scientific and technological research is practical application demand. Throughout the development of the Internet, traffic identification has played an extremely important role, mainly in the following aspects:

[0003] ● Internet service providers (ISPs) need to know which applications their users are using, or to track the development trend of applications, so as to achieve various business goals. Such as dynamically allocating network resources for users of ...


Application Information

Patent Type & Authority Applications(China)
IPC(8): H04L12/26, H04L12/24
Inventor 陈凯张寅周曲周异杨小康
Owner 上海深杳智能科技有限公司