Hadoop-based mass log data processing method

A data processing and log technology applied in the fields of electronic digital data processing, special data processing applications, and structured data retrieval. It addresses problems such as the lack of cross-node log queries, infrastructure that can no longer cope with log volume, and hardware upgrades that do not fundamentally solve the growth problem.

Inactive Publication Date: 2017-05-24
CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY
Cites: 1; Cited by: 93

AI Technical Summary

Problems solved by technology

As the number of users grows, the total volume of logs increases exponentially, and the traditional approach of upgrading hardware cannot fundamentally solve the problem.
[0003] Existing log processing methods are centralized and cannot collect logs from multiple application systems. In addition, they cannot support cross-node log queries.
[0004] At the same time, log data is not only huge in volume but also mostly unstructured. It is difficult for traditional relational databases to record such data by adding information entries, and the earlier infrastructure is gradually becoming inadequate.


Examples

Embodiment Construction

[0018] As shown in Figure 1, the implementation steps of the Hadoop-based massive log data processing method in this embodiment include:

[0019] 1) Establish a Hadoop distributed cluster platform on which the HDFS distributed file system, the Redis non-relational database, a MySQL database, the HBase distributed database, the Kafka distributed message cache system, a REST interface and the Storm stream computing framework are installed; connect the target application system server whose massive logs are to be processed to the Hadoop distributed cluster platform, and pre-deploy on the target application system server a SHELL script that regularly uploads the server software access log files;

[0020] 2) For the server software access logs and business logs in the target application system server: regularly upload and push the server software access log files to the Hadoop distributed cluster platform through the SHELL script, and the Hadoop distributed cluster platform w...
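
The periodic upload described in step 2 could look roughly like the sketch below. The patent uses a SHELL script; this Python version is only an illustration. The function names, the `*.log` file pattern and the age-based selection rule are assumptions that do not come from the patent. The sketch only builds the `hdfs dfs -put` command line; running it requires a Hadoop client on the application server.

```python
import time
from pathlib import Path


def select_ready_logs(log_dir, min_age_seconds, now=None):
    """Return *.log files not modified for at least min_age_seconds.

    Assumption: a file that has stopped changing has been rotated and is
    safe to upload. The patent does not describe the selection rule.
    """
    now = time.time() if now is None else now
    ready = []
    for path in sorted(Path(log_dir).glob("*.log")):
        if now - path.stat().st_mtime >= min_age_seconds:
            ready.append(path)
    return ready


def build_put_command(local_path, hdfs_dir):
    """Build the `hdfs dfs -put` argument list for one file.

    The -f flag overwrites an existing file of the same name in HDFS.
    """
    return ["hdfs", "dfs", "-put", "-f", str(local_path), hdfs_dir]
```

A scheduler such as cron could call `select_ready_logs` every few minutes and pass each command from `build_put_command` to `subprocess.run`.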

Abstract

The invention discloses a Hadoop-based mass log data processing method. After a Hadoop distributed cluster platform is established, the server software access logs and business logs in a target application system server are processed by category: log data collection, log data cleaning (for the server software access logs only), log data analysis, log data storage, and data export to an HBase distributed database or a MySQL database are performed in sequence. User queries are then answered from the HBase distributed database or the MySQL database in the Hadoop distributed cluster platform: when a query request is received, the real-time business statistics in the HBase distributed database or the statistics in the MySQL database are queried, and the query result is displayed. With this method, log data can be collected from all application systems and centralized for query, statistics and analysis, which improves the speed and efficiency of application system log processing.
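
As a rough illustration of the cleaning and analysis stages applied to server software access logs, the sketch below parses lines, drops malformed entries and static-resource requests, and counts requests per path. It rests on several assumptions: that the access logs follow the Common Log Format, that static files should be filtered out, and that a per-path hit count is a useful statistic. None of these details are stated in the patent, and all names are hypothetical. A real deployment would run equivalent logic on the cluster rather than in a single process.

```python
import re
from collections import Counter

# Assumption: access logs follow the Common Log Format used by many web
# servers; the patent does not name a format.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)

# Assumption: requests for these static resources are noise for statistics.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def clean(lines):
    """Yield parsed records, dropping malformed lines and static-file hits."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # malformed line
        record = match.groupdict()
        if record["path"].lower().endswith(STATIC_SUFFIXES):
            continue  # static resource
        yield record


def hits_per_path(records):
    """Count requests per path, a stand-in for the analysis step."""
    return Counter(record["path"] for record in records)
```

The resulting counts are the kind of statistic that could then be exported to the MySQL or HBase store for later queries.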

Description

Technical field

[0001] The invention relates to the field of computer data processing, in particular to a Hadoop-based method for processing massive log data.

Background

[0002] Existing log processing methods generally rely on EMC, IBM and ORACLE storage devices and database storage. When the amount of data reaches the upper limit of the log system, the vendor introduces new software and hardware for users to upgrade to. However, as the number of users grows, the total volume of logs increases exponentially, and the traditional approach of upgrading hardware cannot fundamentally solve the problem.

[0003] Existing log processing methods are centralized and cannot collect logs from multiple application systems. In addition, they cannot support cross-node log queries.

[0004] At the same time, log data is not only huge in volume but also mostly unstructured. It is difficult for traditional ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/27, G06F16/1815, G06F16/182
Inventor: 文勇军, 黄浩, 唐立军, 周庆华
Owner: CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY