Abnormal behavior discovery method and system based on big data machine learning

The technology applies machine learning and big data techniques to the field of data security. It addresses problems such as high cost, misjudgment, and the inability to identify abnormal behaviors and users in real time, thereby improving judgment accuracy, preventing misjudgment, and saving labor and time costs.

Active Publication Date: 2017-05-31
北京明朝万达科技股份有限公司

AI Technical Summary

Problems solved by technology

[0010] (1) There is only a single data source: only logs are analyzed and processed.
[0011] (2) Unable to determine abnormal behavio

Examples

Embodiment Construction

[0054] Glossary:

[0055] Hadoop: A distributed system infrastructure whose core components are HDFS and MapReduce. HDFS provides storage for massive data sets, and MapReduce provides computation over them.

[0056] Spark: A general-purpose parallel computing framework similar to Hadoop MapReduce. Unlike MapReduce, Spark can keep the intermediate results of a job in memory, which makes computation faster and better suited to iterative algorithms such as those used in data mining and machine learning.

[0057] Lambda Architecture: A real-time big data processing architecture proposed by Nathan Marz. It combines offline (batch) computing with real-time computing, follows architectural principles such as immutability, read-write separation, and complexity isolation, and can integrate big data components such as Hadoop and Spark.
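
The combination of offline and real-time computing in the Lambda Architecture can be illustrated with a small in-memory sketch. It is only an illustration under stated assumptions: the class name `LambdaCounter`, its methods, and the per-user event counting task are hypothetical and do not come from the patent, and a real deployment would place the batch layer on a platform such as Hadoop or Spark rather than in a Python list.

```python
from collections import Counter


class LambdaCounter:
    """Toy Lambda-style pipeline that counts events per user (hypothetical example)."""

    def __init__(self):
        self.master = []              # append-only master data set (immutability)
        self.batch_view = Counter()   # precomputed by the offline (batch) layer
        self.speed_view = Counter()   # covers only events newer than the last batch run

    def ingest(self, user):
        # Writes go to the master data set and to the real-time view.
        self.master.append(user)
        self.speed_view[user] += 1

    def run_batch(self):
        # The batch layer recomputes its view from the full master data set,
        # after which the real-time view can be discarded.
        self.batch_view = Counter(self.master)
        self.speed_view = Counter()

    def query(self, user):
        # Reads merge the batch view with the real-time view.
        return self.batch_view[user] + self.speed_view[user]
```

Reads never touch the master data set directly, which is one way to picture the read-write separation mentioned above.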

[0058] Sqoop: big data component, used for data transfer between big data platform and traditional relational databas...

Abstract

The invention discloses an abnormal behavior discovery method and system based on big data machine learning. The method comprises the following steps: preprocessing the original security log data; extracting feature data from the preprocessing result; clustering the feature data to determine an abnormal behavior library and a normal behavior library; acquiring new behavior sample data from the security log data, comparing it with the normal and abnormal behavior libraries to determine whether the new behavior is normal or abnormal, and updating the corresponding library with the new sample; and repeating the previous step until the two libraries contain enough normal and abnormal behavior samples, then training a random forest model on the samples in both libraries and using the trained model to judge abnormal behavior. The scheme solves the problem of having too few labeled samples in the initial stage, improves judgment accuracy, and effectively prevents misjudgment.
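
The bootstrapping steps in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation. The abstract does not name the clustering algorithm or the comparison rule, so k-means with farthest-first initialization, the rule that the largest cluster is treated as normal, the nearest-neighbor comparison, all function names, and the `min_samples` threshold are assumptions. The random forest training step is omitted.

```python
import math


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans(points, k, iters=10):
    """Plain k-means; deterministic farthest-first initialization (assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        # Keep the old centroid if a cluster ended up empty.
        centroids = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
    return groups


def bootstrap_libraries(features, k=2):
    """Cluster unlabeled features into initial normal/abnormal libraries.

    Assumption: the largest cluster is normal behavior, the rest abnormal.
    """
    groups = sorted(kmeans(features, k), key=len, reverse=True)
    normal = list(groups[0])
    abnormal = [p for g in groups[1:] for p in g]
    return normal, abnormal


def classify_and_update(sample, normal, abnormal):
    """Label a new sample by its nearest library entry, then store it there."""
    d_normal = min(dist(sample, p) for p in normal)
    d_abnormal = min(dist(sample, p) for p in abnormal)
    is_abnormal = d_abnormal < d_normal
    (abnormal if is_abnormal else normal).append(sample)
    return "abnormal" if is_abnormal else "normal"


def ready_for_training(normal, abnormal, min_samples=100):
    """Hypothetical check for 'enough' samples before training a classifier."""
    return len(normal) >= min_samples and len(abnormal) >= min_samples
```

Once `ready_for_training` returns true, the two libraries would serve as labeled training data for the random forest model described in the abstract.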

Description

Technical field

[0001] The invention relates to the field of data security, in particular to a method and system for discovering abnormal behaviors based on big data machine learning.

Background

[0002] Traditional network security and data security technologies, such as the various software and hardware firewalls, generally adopt a "fence-style" protection strategy that manually imposes many restrictions on networks and application systems. Every data access action must be filtered through all of the preset rules, which not only degrades the user experience but also increases the operating load on the system. In addition, in existing security software, generating a built-in rule generally requires multiple stages, such as vulnerability discovery, attack simulation, packet analysis, feature extraction, and rule generation. Because attack methods are constantly updated, this rule generation process must be repeated again and again, which consumes substantial labor. ...

Application Information

IPC(8): G06F21/55, G06F21/57, H04L29/06, H04L12/26, G06K9/62
CPC: H04L43/16, H04L63/1416, H04L63/1425, G06F21/552, G06F21/577, G06F18/23213, G06F18/24147
Inventor: 李学进, 王志海, 魏力, 喻波, 何晋昊, 蒲鹏飞
Owner 北京明朝万达科技股份有限公司