Spark-based big data weblog acquisition, analysis and early warning method and system

A log collection, analysis and early warning technology, applied in the computer field, that addresses problems such as large-scale real-time data processing and achieves improved operational efficiency, high throughput, and reduced work intensity.

Inactive Publication Date: 2020-01-14
SHANGHAI BAOSIGHT SOFTWARE CO LTD

AI Technical Summary

Problems solved by technology

[0004] This patent document uses Hadoop/Hive for log data collection and query analysis, and can only process offline data, and



Examples


Example Embodiment

[0131] Example 1:

[0132] 1. Instrumentation of the raw log files

[0133] Server log information, such as CPU, memory, network card traffic, and disk read and write speed, is collected by Linux shell scripts. Application operation information, including the liveness and response time of the search, resource, and order modules, is collected by Java programs and stored in the Tomcat application logs. External interface access information is captured by instrumentation embedded in the application, which combines real-time emission of events with separate storage in log files.
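As an illustration of what one collected record might look like, the following is a minimal sketch that serializes a sample as a single JSON log line. The source collects these values with Linux shell scripts and Java programs; Python is used here only for illustration. The function names and the field names (`ts`, `host`, `kind`, `values`, `cpu_pct`, `response_ms`) are assumptions, not part of the patent.

```python
import json
import time


def format_log_line(host, kind, values, timestamp=None):
    """Serialize one collected sample as a single-line JSON record.

    host, kind and the keys inside values are hypothetical names chosen
    for this sketch; the source does not specify a log format.
    """
    record = {
        "ts": timestamp if timestamp is not None else time.time(),
        "host": host,
        "kind": kind,        # e.g. "server_metrics" or "module_status"
        "values": values,    # e.g. {"cpu_pct": 42.5} or {"response_ms": 120}
    }
    # sort_keys keeps the line stable, which makes the log easier to diff
    return json.dumps(record, sort_keys=True)


def parse_log_line(line):
    """Parse a line produced by format_log_line back into a dict."""
    return json.loads(line)
```

One record per line keeps the files easy for a collection agent to tail and forward, since each line can be handled independently.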

[0134] 2. Log collection and storage

[0135] Each server deploys and runs a Flume Agent, which collects and transmits log files, saves them in the distributed file system HDFS, and pushes real-time data to the Kafka messaging cluster. Spark Streaming then consumes the messages in Kafka and performs calculations. Spark mainly processes offline data in HDFS and performs business logic processin...
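The two data paths described above, an archive for offline processing and a message stream for real-time processing, can be sketched with in-memory stand-ins. This does not use the Flume, Kafka, HDFS, or Spark APIs: the list, the deque, and all class, function, and field names are assumptions chosen purely to illustrate the data flow.

```python
from collections import defaultdict, deque


class LogFanout:
    """Stand-in for a collection agent: every record goes to both sinks."""

    def __init__(self):
        self.archive = []      # stand-in for files in the distributed file system
        self.stream = deque()  # stand-in for a topic in the messaging cluster

    def collect(self, record):
        self.archive.append(record)
        self.stream.append(record)


def consume_micro_batch(stream, max_records):
    """Drain up to max_records from the stream, like one streaming micro-batch."""
    batch = []
    while stream and len(batch) < max_records:
        batch.append(stream.popleft())
    return batch


def mean_response_by_module(records):
    """Offline-style aggregation: mean response time per application module."""
    totals = defaultdict(lambda: [0.0, 0])  # module -> [sum, count]
    for record in records:
        entry = totals[record["module"]]
        entry[0] += record["response_ms"]
        entry[1] += 1
    return {module: total / count for module, (total, count) in totals.items()}
```

In this sketch the streaming consumer removes what it reads while the archive keeps every record, so the offline aggregation can always be recomputed over the full history.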


Abstract

The invention provides a Spark-based big data network log acquisition, analysis and early warning method and system. The system comprises: a raw log file acquisition module (101) for acquiring raw log files; a log acquisition module (102) for acquiring log data from the raw log files; a log storage module (103) for storing the log data; a data logic processing module (104) for performing log data analysis and parallel computation to obtain log analysis results; an analysis result storage module (105) for storing the log analysis results; and a visual display and early warning module (106) for reading the log analysis results, displaying them, and/or sending operation and maintenance early warning prompts. The method combines real-time processing with offline processing, so that website information can be analyzed and early warnings issued conveniently and quickly. This avoids spending a large amount of time locating fault points, improves operational efficiency, and helps ensure the stability and security of the website.
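The role of the visual display and early warning module (106) can be illustrated with a minimal threshold check over stored analysis results. The abstract does not state how warnings are triggered, so the threshold rule, the function name, and the metric names below are assumptions made for this sketch.

```python
def early_warnings(analysis_results, thresholds):
    """Return a warning message for each metric above its threshold.

    analysis_results and thresholds both map a metric name to a number.
    Metrics without a configured threshold are ignored. The rule itself
    (value strictly greater than the limit) is an assumption.
    """
    warnings = []
    for metric, value in sorted(analysis_results.items()):
        limit = thresholds.get(metric)
        if limit is not None and value > limit:
            warnings.append(f"WARNING: {metric}={value} exceeds threshold {limit}")
    return warnings
```

Operations staff could then route the returned messages to whatever notification channel they use, which matches the abstract's point that a warning should prompt manual intervention.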

Description

Technical field

[0001] The present invention relates to the field of computer technology, in particular to a method and system for Spark-based big data network log collection, analysis and early warning.

Background

[0002] At present, the existing patented scheme only proposes the analysis of offline data based on Hadoop/Hive technology; it does not describe how to collect and analyze online data or how to issue early warnings. In practice, Internet companies encounter two modes when collecting and analyzing network logs: real-time processing of data streams and batch processing of offline data. Because the two modes are mixed, the log system must cope with massive volumes of network logs, support both real-time and offline processing, and offer high throughput and high fault tolerance. In an emergency, it should issue an early warning and allow manual maintenance and intervention.

[0003] For example, patent docu...


Application Information

IPC(8): H04L12/24, H04L29/08
CPC: H04L41/069, H04L67/10, H04L67/1097
Inventor 易可可汪潮王威
Owner SHANGHAI BAOSIGHT SOFTWARE CO LTD