Application layer protocol characteristic extracting method based on Hadoop

Application layer protocol feature extraction technology, applied in the fields of digital transmission systems, electrical components, transmission systems, etc.; it addresses the lack of effective methods for feature extraction and achieves the effect of improving limitations and

Inactive Publication Date: 2015-07-15
UNIV OF ELECTRONICS SCI & TECH OF CHINA
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

At present, detection based on application layer feature fields has become the mainstream method of application layer protocol identification, but effective methods for feature extraction are still lacking. Protocol features are mainly extracted by manually analyzing the specification document of an application layer protocol.


Examples


Embodiment 1

[0064] Traffic packets are captured from the network. The selected packets cover two types of HTTP protocol and two versions of the OICQ protocol, with three types per OICQ version. To illustrate the process more clearly, only 53 packets are selected in this embodiment, and packets that are too long have been trimmed to some extent. The minimum support is set to a=0.1, so the minimum support count is n=53×0.1≈5. Each packet is labeled with an ID that increases from 1, and the preprocessed data has the form:

[0065] 1_0230370081bac10000007616cbf90594f97a4a60f9087309f1129a98c046b400fe8b831e1efa64607866eca88782e64872f73bf1075d583f2c18e98dbd8f4992802

[0066] 2_0230370081310000000a787c52eebc39ba2941cf14b9e735f56de72aa4ebcd01474a741cf14b9e735f56de72aa4ebcd01474a728ae5e9e06d8719f726f6518c9019c237d89e047022fd5e7174215af4b4067fa42c5e189b13a6403

[0067] ...

[0068] 53_02303700583275000000aa2...
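
The record layout and the threshold arithmetic above can be sketched in Python. This is a minimal illustration, not the patent's implementation: it assumes each preprocessed record is `<id>_<hex payload>` as shown, and that the minimum support count is rounded to the nearest integer (which matches 53×0.1≈5 here). The function names are hypothetical.

```python
import math
from typing import Tuple

def min_support_count(num_packets: int, min_support: float) -> int:
    # Assumption: the product is rounded to the nearest integer (half up).
    return math.floor(num_packets * min_support + 0.5)

def parse_record(line: str) -> Tuple[int, bytes]:
    # Split "<id>_<hex payload>" into an integer ID and the raw payload bytes.
    packet_id, hex_payload = line.split("_", 1)
    return int(packet_id), bytes.fromhex(hex_payload)

def format_record(packet_id: int, payload: bytes) -> str:
    # Inverse of parse_record: ID, underscore, lowercase hex payload.
    return f"{packet_id}_{payload.hex()}"

# Usage: the threshold from this embodiment, and a record shortened to the
# first five payload bytes of record 1 for brevity.
print(min_support_count(53, 0.1))
print(parse_record("1_0230370081"))
```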

Embodiment 2

[0112] Traffic packets are captured from the network, and the FTP protocol packets are selected: 13.9 MB in size, 44345 packets in total. In this embodiment the minimum support is set to a=0.02, so the minimum support count is n=44345×0.02≈887. The preprocessed data looks like this:

[0113] 25674_3232362d46696c65207375636365737366756c6c79207472616e736665727265640d0a32323620302e303138207365636f6e6473202c20312e3230204d627974657320706572207365636f6e640d0a

[0114] 25780_3232302d53747564656e74656e204e6574205477656e7465687474703a2f2f7777772e736e742e757477656e74652e6e6c2f200d0a3232302d74686520556e6976657273697479206f66205477656e7465687474703a2f2f7777772e757477656e74652e6e6c2f20200d0a3232302d0d0a3232302d546869732073797374656d206d6179206265207573656420323420686f7572732061206461792c20

[0115] 43888_323530204469726563746f7279207375636365737366756c6c79206368616e6765642e0d0a

[0116] 43872_3235302d0d0a
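
The payloads above are hex-encoded bytes, and decoding them shows ordinary FTP reply lines. The short Python sketch below decodes two of the records listed; the helper name `decode_payload` is hypothetical, and ASCII decoding is an assumption that holds for these FTP samples but not for binary protocols such as OICQ.

```python
def decode_payload(record: str) -> str:
    # Drop the "<id>_" prefix, then decode the hex payload as ASCII text.
    _, hex_payload = record.split("_", 1)
    return bytes.fromhex(hex_payload).decode("ascii")

rec_43888 = ("43888_323530204469726563746f7279207375636365737366756c6c79"
             "206368616e6765642e0d0a")
rec_43872 = "43872_3235302d0d0a"

print(repr(decode_payload(rec_43888)))  # '250 Directory successfully changed.\r\n'
print(repr(decode_payload(rec_43872)))  # '250-\r\n'
```

Both replies begin with the FTP status code 250, which is the kind of recurring field a frequent-item search would be expected to surface.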

[0117] The processing process of alternative it...


Abstract

The invention discloses a Hadoop-based method for extracting application layer protocol features. The MapReduce model of the Hadoop platform is used to scan the packets of a target application layer protocol. Frequent items are selected from the candidate items according to a minimum support count; higher-order candidate items are then formed by combination and screened in the same way until the longest frequent items are found. Offsets are used to select mutually non-overlapping frequent items from all frequent items, and these are combined in sequence as feature fields to form feature strings. According to the minimum support count, the final feature strings that reflect the characteristics of the target application layer protocol are selected from these feature strings, which completes the extraction. The method needs to scan the application layer protocol data only once, can accurately extract the features of an application layer protocol, and reduces both the limitations of manual feature extraction over massive protocol data and the subjectivity of feature determination.
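
The screening steps in the abstract can be illustrated with a small single-process Python sketch. It is not the patent's MapReduce procedure and it rescans the data at each order, unlike the single scan the abstract describes. It also assumes, hypothetically, that an item is a run of bytes at a fixed offset in the payload, that higher-order candidates are formed by extending a frequent run by one byte, and that overlaps are resolved greedily in favor of longer runs. All function names are illustrative.

```python
from collections import Counter

def frequent_items(payloads, min_count):
    """Return {(offset, run): support} for every frequent fixed-offset byte run."""
    # Order 1: count each (offset, single byte) pair across all packets.
    counts = Counter()
    for p in payloads:
        for off in range(len(p)):
            counts[(off, p[off:off + 1])] += 1
    level = {k: v for k, v in counts.items() if v >= min_count}
    frequent = dict(level)
    length = 1
    # Higher orders: extend each frequent run by one byte, then re-count.
    while level:
        length += 1
        counts = Counter()
        for p in payloads:
            for off in range(len(p) - length + 1):
                if (off, p[off:off + length - 1]) in level:
                    counts[(off, p[off:off + length])] += 1
        level = {k: v for k, v in counts.items() if v >= min_count}
        frequent.update(level)
    return frequent

def non_overlapping(frequent):
    """Greedily keep the longest runs whose byte ranges do not overlap."""
    chosen = []
    ordered = sorted(frequent, key=lambda item: (-len(item[1]), item[0], item[1]))
    for off, run in ordered:
        end = off + len(run)
        if all(end <= o or o + len(r) <= off for o, r in chosen):
            chosen.append((off, run))
    return sorted(chosen)  # ordered by offset: the feature field sequence

# Usage: the first five payload bytes of the three Embodiment 1 records shown.
payloads = [bytes.fromhex(h) for h in ("0230370081", "0230370081", "0230370058")]
fields = non_overlapping(frequent_items(payloads, min_count=3))
print([(off, run.hex()) for off, run in fields])  # [(0, '02303700')]
```

With all three packets required to agree, only the shared four-byte prefix survives; lowering `min_count` to 2 would let the five-byte run shared by the first two packets win instead.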

Description

Technical field

[0001] The invention belongs to the technical field of application layer protocol identification, and more specifically relates to a Hadoop-based application layer protocol feature extraction method.

Background

[0002] With the rapid development of the Internet and the continuing advance of broadband technology, new demands have emerged on the Internet. As a result, the forms and types of application layer protocols are more complex than before, and traditional protocol traffic accounts for a shrinking share of total traffic, while new application protocols such as P2P, streaming media, and online games keep emerging. Correctly identifying these complex protocols is therefore a problem that protocol identification algorithms must now solve. The main methods for identifying protocols include: port-based identification, payload-based identification, measurement-based identifi...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): H04L29/06, H04L29/08, H04L12/70
Inventor: 孙健, 陈小英, 徐杰, 隆克平, 张毅, 李乾坤, 王晓丽, 梁雪芬, 姚洪泽, 陈旭
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA