HTTP protocol information extraction method and device

An HTTP protocol information extraction technology, applied in the field of data analysis, that addresses slow and inefficient extraction and makes extraction simple.

Status: Inactive. Publication date: 2016-11-09
XIAMEN MEIYA PICO INFORMATION

AI Technical Summary

Problems solved by technology

[0005] The main purpose of the present invention is to provide a method and device for extracting HTTP protocol information, so as to solve the technical problem in the prior art that obtaining valid information from HTTP protocol content is slow and inefficient.

Examples

Embodiment 1

[0024] Embodiment 1 of the present invention provides a method for extracting HTTP protocol information and mainly describes the extraction process. Referring to Figure 1, the method may include the following steps:

[0025] Step S102: Load the extraction rules for HTTP protocol information extraction and store them in memory.

[0026] When extracting HTTP protocol information, the extraction rules are first loaded and stored in memory. Reflecting the characteristics of the HTTP protocol, the extraction rules comprise multiple rules, each matching hosts and URLs in a different situation.
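The idea of holding multiple host and URL rules in memory can be illustrated with a minimal Python sketch. This is not the patent's implementation: the names `UrlRule` and `RuleStore`, the use of regular expressions for URL matching, and the example hosts are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class UrlRule:
    """One URL-level rule (hypothetical shape, not taken from the patent)."""
    pattern: "re.Pattern"   # assumed: a regex tested against the URL
    fields: List[str]       # assumed: names of the values to extract


@dataclass
class RuleStore:
    """In-memory rule table keyed by host, so lookup does not scan every rule."""
    by_host: Dict[str, List[UrlRule]] = field(default_factory=dict)

    def add(self, host: str, rule: UrlRule) -> None:
        self.by_host.setdefault(host.lower(), []).append(rule)

    def match(self, host: str, url: str) -> Optional[UrlRule]:
        """Return the first rule whose host and URL pattern both match, else None."""
        for rule in self.by_host.get(host.lower(), []):
            if rule.pattern.search(url):
                return rule
        return None


store = RuleStore()
store.add("example.com", UrlRule(re.compile(r"^/search"), ["q"]))
store.add("example.com", UrlRule(re.compile(r"^/login"), ["user"]))
```

Keying the table by host keeps the per-record check cheap: only the rules registered for that host are tested against the URL.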

[0027] Step S104: Obtain the host and URL of one piece of data from the data to be analyzed.

[0028] The data to be analyzed may be big data. During processing, HTTP protocol information is extracted from it one piece of data at a time. In this step, for each piece of data, the host and URL in the data are obta...
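As an illustration of obtaining the host and URL from one piece of data, here is a minimal Python sketch. It assumes each piece of data is the raw text of an HTTP/1.x request, which the text does not specify; the function name `host_and_url` and the sample record are hypothetical.

```python
from typing import Optional, Tuple


def host_and_url(raw_request: str) -> Optional[Tuple[str, str]]:
    """Return (host, url) from a raw HTTP/1.x request, or None if either is missing."""
    lines = raw_request.replace("\r\n", "\n").split("\n")
    parts = lines[0].split()          # request line: METHOD URL VERSION
    if len(parts) < 2:
        return None
    url = parts[1]
    for line in lines[1:]:
        if not line:                  # a blank line ends the header section
            break
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            return value.strip().lower(), url
    return None


# Hypothetical sample record.
record = "GET /search?q=patent HTTP/1.1\r\nHost: Example.com\r\nAccept: */*\r\n\r\n"
print(host_and_url(record))
```

Records that lack a request line or a Host header yield `None`, so a caller can skip them before any rule matching is attempted.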

Embodiment 2

[0035] This embodiment is a further preferred method for extracting HTTP protocol information, building on Embodiment 1. Referring to Figure 2, the method may include the following steps:

[0036] Step S202: Load the extraction rules for HTTP protocol information extraction.

[0037] Preferably, the extraction rules are written as an XML configuration file and are loaded as follows:

[0038] Use SAXReader to read the XML configuration file; traverse the host tags to construct HostInfo entity objects; traverse the urlinfo tags under each host tag to construct UrlInfo entity objects, verifying the validity of the protocol subcategory code and the custom class; traverse the urlinfo tag under the getinfo tag to construct GetInfo entity objects, verifying the validity of the pType and srcData attributes and the custom class; traverse the todata tags under the getinfo tag to construct Todata entity objects, verifying the validity of keystring and custom cla...
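The loading steps can be sketched in Python with the standard library's `xml.etree.ElementTree` standing in for SAXReader (which belongs to the Java dom4j library). Everything about the file layout here is an assumption: the text names the tags host, urlinfo, getinfo and todata and the attributes pType, srcData and keystring, but the nesting host > urlinfo > getinfo > todata, the attributes `name`, `pattern`, `code` and `field`, and the allowed pType values are hypothetical. Custom-class validation is omitted.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List

# Hypothetical configuration; the real schema is not given in the text.
CONFIG = """
<rules>
  <host name="example.com">
    <urlinfo pattern="/search" code="1001">
      <getinfo pType="query" srcData="url">
        <todata keystring="q" field="keyword"/>
      </getinfo>
    </urlinfo>
  </host>
</rules>
"""

ALLOWED_PTYPES = {"query", "body"}  # assumed set of valid pType values


@dataclass
class Todata:
    keystring: str
    target: str


@dataclass
class GetInfo:
    p_type: str
    src_data: str
    todata: List[Todata] = field(default_factory=list)


@dataclass
class UrlInfo:
    pattern: str
    code: str
    getinfo: List[GetInfo] = field(default_factory=list)


@dataclass
class HostInfo:
    name: str
    urlinfo: List[UrlInfo] = field(default_factory=list)


def required(elem: ET.Element, attr: str) -> str:
    """Validity check: the attribute must be present and non-empty."""
    value = elem.get(attr)
    if not value:
        raise ValueError(f"<{elem.tag}> is missing required attribute {attr!r}")
    return value


def load_rules(xml_text: str) -> List[HostInfo]:
    """Walk the tag hierarchy and build entity objects, validating as we go."""
    hosts = []
    for h in ET.fromstring(xml_text).iter("host"):
        host = HostInfo(required(h, "name"))
        for u in h.findall("urlinfo"):
            url = UrlInfo(required(u, "pattern"), required(u, "code"))
            for g in u.findall("getinfo"):
                p_type = required(g, "pType")
                if p_type not in ALLOWED_PTYPES:
                    raise ValueError(f"unsupported pType {p_type!r}")
                get = GetInfo(p_type, required(g, "srcData"))
                for t in g.findall("todata"):
                    get.todata.append(
                        Todata(required(t, "keystring"), required(t, "field")))
                url.getinfo.append(get)
            host.urlinfo.append(url)
        hosts.append(host)
    return hosts


rules = load_rules(CONFIG)
```

Validating while loading means a malformed rule file fails once at startup rather than during per-record matching.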

Embodiment 3

[0112] Corresponding to the method for extracting HTTP protocol information provided in Embodiment 1, this embodiment of the present invention also provides a device for extracting HTTP protocol information. Referring to Figure 3, the device may include a rule loader and a rule parser.

[0113] The rule loader loads the extraction rules for HTTP protocol information extraction and stores them in memory. The specific process is as described in Embodiment 2 and is not repeated here. To accommodate the diversity and complexity of HTTP protocol content, a personalized extraction interface can also be implemented according to specific requirements, enabling custom extraction. The rule parser obtains the host and URL of one piece of data from the data to be analyzed, judges whether the obtained host and URL match the extraction rules, and ...
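The split between a rule loader and a rule parser, with a pluggable extraction interface, might look like the following Python sketch. It is an illustration under assumptions, not the patented device: the class names `RuleLoader` and `RuleParser`, the `Extractor` callable type, prefix-based URL matching, and the example rules are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple
from urllib.parse import parse_qs, urlsplit

# Assumed extraction interface: a callable from URL to extracted fields.
Extractor = Callable[[str], Dict[str, str]]


def query_extractor(keys: List[str]) -> Extractor:
    """Default extractor: copy the named query parameters out of the URL."""
    def extract(url: str) -> Dict[str, str]:
        query = parse_qs(urlsplit(url).query)
        return {k: query[k][0] for k in keys if k in query}
    return extract


class RuleLoader:
    """Keeps loaded rules in memory as (host, url_prefix) -> extractor."""
    def __init__(self) -> None:
        self.rules: Dict[Tuple[str, str], Extractor] = {}

    def load(self, host: str, url_prefix: str, extractor: Extractor) -> None:
        self.rules[(host.lower(), url_prefix)] = extractor


class RuleParser:
    """Matches a record's host and URL against the loaded rules; extracts on a hit."""
    def __init__(self, loader: RuleLoader) -> None:
        self.loader = loader

    def parse(self, host: str, url: str) -> Optional[Dict[str, str]]:
        path = urlsplit(url).path
        for (rule_host, prefix), extractor in self.loader.rules.items():
            if rule_host == host.lower() and path.startswith(prefix):
                return extractor(url)
        return None  # no rule matched, so nothing is extracted


loader = RuleLoader()
loader.load("example.com", "/search", query_extractor(["q"]))
# A custom extractor plugged in where the default query-parameter logic does not fit.
loader.load("example.com", "/item/",
            lambda url: {"item_id": urlsplit(url).path.rsplit("/", 1)[-1]})
parser = RuleParser(loader)
```

Because an extractor is just a callable, a custom one can be registered for a particular host and URL without changing the loader or the parser.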


Abstract

The invention discloses an HTTP protocol information extraction method and device. The method comprises the following steps: loading an extraction rule for extracting HTTP protocol information and storing it in memory; obtaining the host and URL of one piece of data from the data to be analyzed; judging whether the obtained host and URL match the extraction rule; and, when they match, extracting the HTTP protocol information according to the extraction rule. The method and device enable rapid and efficient analysis and information extraction for HTTP protocols under big data.

Description

Technical field

[0001] The present invention relates to the technical field of data analysis, and in particular to a method and device for extracting HTTP protocol information.

Background

[0002] With the rapid development of the Internet, the era of big data is arriving. With the rise of new data sources such as social data, enterprise content, and transaction and application data, the limitations of traditional data sources have been broken, and enterprises increasingly need effective information to ensure its authenticity and security.

[0003] Today the amount of data is very large, the types of data protocols are rapidly increasing, and protocol content is rapidly updated, so protocol analysis is markedly more complex. The extraction of HTTP protocol information faces great challenges. At present, in the big data environment, there are many types of HTTP protocols with complex relationships. The traditional analysi...

Application Information

IPC(8) G06F17/30
CPC G06F16/84
Inventor 朱海勇, 鄢小征, 栾江霞, 周成祖
Owner XIAMEN MEIYA PICO INFORMATION