Data processing system and method for network flow monitoring

A data processing technology applied to network traffic monitoring. It addresses the problem of processing and analyzing large volumes of data, gives the monitoring system an efficient data organization, enhances data security, and avoids the risk of sensitive data leakage.

Pending Publication Date: 2022-07-22
INFORMATION & COMM BRANCH OF STATE GRID JIANGSU ELECTRIC POWER

AI-Extracted Technical Summary

Problems solved by technology

[0003] Purpose of the invention: The purpose of the invention is to provide a data processing system and method for network traffic monitoring, so as to solve ...


Abstract

The invention discloses a data processing system and method for network traffic monitoring. The system comprises a data acquisition layer, a data processing layer, and a data output and display layer. The data acquisition layer acquires network traffic monitoring data; the data processing layer mainly comprises an offline computing module and a real-time computing module; and the data output and display layer displays and analyzes the results of data computation. The multi-level layered data processing architecture gives the network traffic monitoring system an efficient data organization; processing service data with high real-time requirements as it arrives helps users find and solve problems promptly; and storing network traffic data in a security-isolated intranet enhances data security and avoids the risk of sensitive data leakage.

Application Domain

Transmission

Technology Topic

Internet traffic; Monitoring data

Example Embodiment

[0027] The technical solutions of the present invention will be further described below with reference to the accompanying drawings.
[0028] A data processing system for network traffic monitoring, comprising a data acquisition layer, a data processing layer and a data output and display layer, specifically:
[0029] Data acquisition layer: collects network traffic monitoring data, which comes from two sources. The first is the interaction information between the user's intranet clients and servers, collected by the Kelai probe. This data arrives through Kafka in JSON format and mainly includes the fields client_ip, server_ip, client_port, server_port and flow_end_time, among others, which support both offline statistics and real-time computation. After cleaning and conversion, the Kafka data is written to S3 and ClickHouse for storage. ClickHouse retains only the most recent seven days of data, for queries and ad hoc computation; the data in S3 is stored in Spark + Delta Lake format, which makes later offline computation with Spark SQL convenient. The second source is relational data stored in MySQL, which mainly includes attribute information for intranet IPs. The MySQL data is extracted and periodically updated by a Spark program or a DataX program run through the task scheduling system.
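The cleaning and conversion step is not detailed in the description. A minimal sketch of what it might look like for a single Kafka message is shown below, assuming the five fields named above are mandatory, that the server port field is spelled server_port, and that ports may arrive as either strings or numbers. The function name clean_flow_record and the drop-on-error policy are illustrative assumptions, not part of the patent.

```python
import json
from typing import Optional

# Field names come from the description; the full record schema is not given,
# so any additional fields are passed through untouched.
REQUIRED_FIELDS = ("client_ip", "server_ip", "client_port", "server_port", "flow_end_time")


def clean_flow_record(raw: str) -> Optional[dict]:
    """Parse one JSON message and return a normalized record, or None if unusable."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return None  # malformed message: drop it (assumed policy)
    if not isinstance(record, dict):
        return None
    if any(record.get(name) in (None, "") for name in REQUIRED_FIELDS):
        return None  # incomplete record: drop it (assumed policy)
    try:
        # Normalize ports to integers, whatever type the producer sent.
        record["client_port"] = int(record["client_port"])
        record["server_port"] = int(record["server_port"])
    except (TypeError, ValueError):
        return None
    for port in (record["client_port"], record["server_port"]):
        if not 0 <= port <= 65535:
            return None  # outside the valid TCP/UDP port range
    return record
```

In the system described, records that pass this kind of check would then be written to S3 and ClickHouse; the writers themselves are omitted here.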
[0030] Data processing layer: data processing is the core of big data services. The data processing layer mainly includes two modules: offline computing and real-time computing. Offline computing is mainly used for large-volume data processing where the processing time is long and the real-time requirements are low, such as computing the total daily traffic of each city. Real-time computing is mainly used where real-time requirements are high, such as per-second statistics on access traffic, abnormal ports, abnormal IPs, and high-frequency access. Real-time computing generates a large amount of result data. To address the efficiency of the data display module, the system combines Spark Streaming computation with ClickHouse storage to provide efficient computing and query functions to external consumers.
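The per-second high-frequency access statistic can be illustrated in plain Python. This is only a sketch of the aggregation logic applied to one batch of records; the system described runs it with Spark Streaming, which is not shown. It assumes flow_end_time is an epoch timestamp in seconds (the description does not state the format), and the function name and threshold parameter are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def high_frequency_clients(records: Iterable[dict], threshold: int) -> Dict[Tuple[str, int], int]:
    """Return {(client_ip, second): count} for clients with more than `threshold` flows in one second."""
    counts: Counter = Counter()
    for record in records:
        second = int(record["flow_end_time"])  # assumed to be epoch seconds
        counts[(record["client_ip"], second)] += 1
    # Keep only the (client, second) pairs that exceed the threshold.
    return {key: n for key, n in counts.items() if n > threshold}
```

For example, a batch holding three flows from 10.0.0.1 and one from 10.0.0.2 in the same second, checked with a threshold of 2, would report only 10.0.0.1.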
[0031] Data output and display layer: displays and analyzes the results of data computation. Because ClickHouse has limited concurrency and does not support OLTP operations, while a relational database has limited capacity, the system stores result data in ClickHouse and MySQL together. Result data that is small in volume, or that takes a long time to query but returns the same result across repeated queries, is placed in MySQL for the back-end system to call. When MySQL cannot meet the performance requirements, the result data is written to ClickHouse. The two databases work together to serve programs such as the large-screen display and the monitoring system.
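The storage rule above can be read as a small routing decision. The sketch below is one possible interpretation, not the patent's implementation: the ResultSet type, the choose_store function, and both row thresholds are hypothetical, since the description gives no concrete limits.

```python
from dataclasses import dataclass

# Hypothetical limits; the description does not give concrete numbers.
SMALL_RESULT_ROWS = 10_000
MYSQL_CAPACITY_ROWS = 1_000_000


@dataclass
class ResultSet:
    name: str
    row_count: int
    slow_to_query: bool          # the query that produces it takes a long time
    stable_across_queries: bool  # repeated queries return the same rows


def choose_store(result: ResultSet) -> str:
    """Pick a target store for one result set: "mysql" or "clickhouse"."""
    if result.row_count <= SMALL_RESULT_ROWS:
        return "mysql"  # small results are served by the back-end from MySQL
    cacheable = result.slow_to_query and result.stable_across_queries
    if cacheable and result.row_count <= MYSQL_CAPACITY_ROWS:
        return "mysql"  # expensive but stable results are precomputed into MySQL
    return "clickhouse"  # everything MySQL cannot serve efficiently
```

Keeping the rule in one function makes the thresholds easy to tune once real query latencies are known.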
[0032] A data processing method for network traffic monitoring, comprising the following steps:
[0033] (1) Collect the interaction data between the user's intranet clients and servers through the Kelai probe, and pass the collected data into the Kafka message queue component as JSON strings;
[0034] (2) Use the data integration component DataX to obtain information such as intranet IP attributes from the relational database;
[0035] (3) For offline computing scenarios, clean and convert the data obtained from the Kafka component and DataX, and write it to the big data stores S3 and ClickHouse; the big data computing component Spark then processes the data held in S3 and ClickHouse;
[0036] (4) For real-time computing scenarios, use the real-time streaming component Spark Streaming to process the real-time data obtained from the Kafka component;
[0037] (5) Export the offline computation results through the Spark data sink (sparksink) and insert them into the ClickHouse and MySQL databases;
[0038] (6) Write the real-time computation results directly into the ClickHouse and MySQL databases;
[0039] (7) Output and display the offline and real-time computation results through the traffic monitoring system, the large-screen display, and other channels.
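As an illustration of the offline path in steps (2), (3) and (5), the sketch below joins flow records with intranet IP attributes and totals traffic per city per day, the example statistic named in the description. It uses plain Python in place of Spark SQL, and it makes several assumptions that are not in the source: each flow carries a bytes field, each IP attribute row carries a city field, and flow_end_time is an epoch timestamp in seconds interpreted in UTC.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable, Tuple


def daily_traffic_by_city(
    flows: Iterable[dict],
    ip_attributes: Dict[str, dict],
) -> Dict[Tuple[str, str], int]:
    """Return {(day, city): total_bytes}, joining flows to IP attributes on client_ip."""
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for flow in flows:
        # Left join: flows from IPs with no attribute row fall under "unknown".
        city = ip_attributes.get(flow["client_ip"], {}).get("city", "unknown")
        day = datetime.fromtimestamp(flow["flow_end_time"], tz=timezone.utc).date().isoformat()
        totals[(day, city)] += flow["bytes"]  # "bytes" is a hypothetical field
    return dict(totals)
```

The resulting rows are small and stable once a day has closed, so under the storage rule described earlier they would be natural candidates for MySQL.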
[0040] A computer storage medium on which a computer program is stored; when executed by a processor, the computer program implements the above-mentioned data processing system for network traffic monitoring.
[0041] A computer device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the computer program, it implements the above-mentioned data processing system for network traffic monitoring.
