A Hadoop-based network security log k-means clustering analysis system and method

A cluster analysis and network security technology, applied in database management systems, structured data retrieval, instruments, etc. It addresses problems such as the poor scalability of data warehouses and the inability to mine the intrinsic value of massive heterogeneous data, and it improves computing power, query and analysis efficiency, and the mining of potential value.

Status: Inactive
Publication Date: 2018-10-30
NORTHWEST UNIV
5 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

[0008] To overcome the above deficiencies in the prior art, the purpose of the present invention is to provide a Hadoop-based network security log k-means clustering analysis system and method. While making rational use of the traditional data warehouse that has already been built, it integrates a big data platform to establish a unified data storage and data processing architecture. This overcomes the shortcomings of traditional data warehouses, which scale poorly, are only good at processing structured data, and cannot mine the intrinsic value of massive heterogeneous data.

Examples

Embodiment

[0099] First, build a Hadoop distributed cluster environment consisting of 5 PCs: one master server and four slave servers. Configure Hadoop on each machine, then install and configure Sqoop, Hive, and MySQL on the NameNode. This embodiment uses the log records of all security devices in Shaanxi Li'an Electric Supermarket; the file size is 16 GB. The logs are updated daily as required, and the query results are tallied as part of the update process.

[0100] This method achieves fast statistical queries through Hive. Its advantages are a low learning cost and the ability to run simple MapReduce statistics quickly through SQL-like statements, with no need to develop a dedicated MapReduce application, which makes it well suited to statistical analysis in data warehouses. Using partitions speeds up queries over data shards and improves query efficiency. The k-means algorithm is implemented through MapReduce to evaluate the securit...
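The effect of partitioning described above can be illustrated with a minimal Python sketch. It is not Hive itself and not the patent's implementation: it only mimics the idea that a partitioned table keeps each partition's data separately (Hive stores each partition in its own HDFS directory), so a query restricted to one partition scans only that partition's records. The class name `PartitionedLog`, the partition key `day`, and the record field `level` are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class PartitionedLog:
    """Toy stand-in for a date-partitioned log table (hypothetical)."""

    def __init__(self):
        # One bucket of records per partition key (here: a day string).
        self.partitions = defaultdict(list)

    def insert(self, day, record):
        """Append a record to the partition for the given day."""
        self.partitions[day].append(record)

    def count_by_level(self, day):
        """Count records per severity level, scanning one partition only.

        Returns (counts, scanned), where scanned is the number of
        records that had to be read to answer the query.
        """
        counts = defaultdict(int)
        scanned = 0
        for record in self.partitions.get(day, []):
            scanned += 1
            counts[record["level"]] += 1  # "level" is an assumed field
        return dict(counts), scanned


# Usage: three records over two days; the query reads only one day.
log = PartitionedLog()
log.insert("2018-01-01", {"level": "high"})
log.insert("2018-01-01", {"level": "low"})
log.insert("2018-01-02", {"level": "high"})
counts, scanned = log.count_by_level("2018-01-01")
```

In this sketch the query for one day reads two of the three stored records; without partitioning, every record would have to be checked against the date filter.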

Abstract

A Hadoop-based network security log k-means clustering analysis system and method, including a log data acquisition subsystem, a log data hybrid-mechanism storage management subsystem, and a log data analysis subsystem. In the data storage layer, a hybrid storage mechanism in which Hadoop and a traditional data warehouse collaborate stores the log data, and the data access layer provides an interface for Hive operations. The data storage layer and computing layer receive instructions from the Hive engine and carry out efficient query and analysis of the data through HDFS and MapReduce. When log data is mined and analyzed, MapReduce is used to implement the k-means algorithm for cluster mining and analysis. The collaborative architecture of Hadoop and a traditional data warehouse makes up for the shortcomings of traditional data warehouses in processing and storing massive data, while making full use of the resources of the existing traditional data warehouse. Cluster analysis with the MapReduce-based k-means algorithm enables timely security level assessment and early warning for log data.
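The MapReduce formulation of k-means mentioned above can be sketched in plain Python. This is a single-process illustration of the general technique, not the patent's code and not a Hadoop job: the map phase assigns each record to its nearest centroid, and the reduce phase averages the records of each cluster to produce new centroids. The function names, the numeric feature tuples, and the stopping tolerance are assumptions made for this example; the source does not say which log features are clustered.

```python
import math
from collections import defaultdict


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))


def map_phase(points, centroids):
    """Map: emit (cluster_index, point) for every input record."""
    for point in points:
        yield nearest(point, centroids), point


def reduce_phase(pairs, centroids):
    """Reduce: average the points of each cluster into a new centroid."""
    groups = defaultdict(list)
    for index, point in pairs:
        groups[index].append(point)
    updated = list(centroids)  # a cluster with no points keeps its centroid
    for index, members in groups.items():
        updated[index] = tuple(sum(axis) / len(members)
                               for axis in zip(*members))
    return updated


def kmeans(points, centroids, max_iter=20, tol=1e-6):
    """Repeat map and reduce until the centroids stop moving."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        shift = max(math.dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift < tol:
            break
    return centroids


# Usage with made-up two-dimensional feature vectors.
features = [(0, 0), (0, 2), (10, 10), (10, 12)]
final_centroids = kmeans(features, [(0, 0), (10, 10)])
```

On a real cluster each iteration would be one MapReduce job, with the current centroids distributed to every mapper; the loop in `kmeans` plays the role of the driver that resubmits the job until convergence.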

Description

Technical field

[0001] The invention belongs to the technical field of computer information processing, and in particular relates to a Hadoop-based network security log k-means clustering analysis system and method.

Background technique

[0002] With the explosion of data and the sharp increase in the amount of information, enterprises' existing traditional data warehouses can no longer keep up with the growth rate of data. Traditional data warehouses are usually built on high-performance all-in-one machines, which are costly and scale poorly, and they are only good at processing structured data. This limits their ability to mine intrinsic value when faced with massive heterogeneous data, and it is the biggest difference between Hadoop and traditional data processing methods. We need to make reasonable use of the enterprise's existing traditional data warehouse, and at the same time integrate the exist...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
CPC: G06F16/182, G06F16/2457, G06F16/25
Inventor: 高岭, 苏蓉, 高妮, 王帆, 杨建锋, 雷艳婷, 申元
Owner NORTHWEST UNIV