DGA domain name detection method based on random forest

A random forest-based domain name detection technique, applied in the field of network security. It addresses the high false negative or false positive rates, poor generalization, and high cost of existing methods, and achieves high operating efficiency, good generalization performance, and low resource consumption.

Active Publication Date: 2016-05-11
STATE GRID CORP OF CHINA +3

AI Technical Summary

Problems solved by technology

[0006] The above two types of methods have the following limitations: 1. Among the methods based on structural features, the two existing patents start from a similarity measure and derive a threshold from sample pairs to decide whether the domain name under test is a counterfeit domain name or an unknown Trojan-hosting website.
The above method uses a relatively simple similarity measure, the features considered are relatively few, the sett...



Examples


Embodiment Construction

[0046] The present invention will be further described below in conjunction with the accompanying drawings. The following examples are only used to illustrate the technical solutions of the present invention more clearly, but not to limit the protection scope of the present invention.

[0047] As shown in Figure 1, the DGA domain name detection method based on random forest includes the following steps:

[0048] Step 1: build a knowledge base, which includes a blacklist and whitelist sample library and a word dictionary.

[0049] The blacklist refers to malicious domain names obtained through open source channels, such as: malicious URLs published by Security Alliance Website Exposure Platform, malicious URL database published by Kingsoft Web Shield, MalwareDomainList, MalwareDomains, PhishTank, hpHosts, and CyberCrimeTracker malicious domain names list.

[0050] The whitelist refers to legitimate domain names obtained through open source channels, such as Alexa website...
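Step 1 can be illustrated with a minimal Python sketch. The names `KnowledgeBase`, `normalize`, `build_knowledge_base`, and `labeled_samples` are hypothetical, and so are the normalization rules and the choice to drop whitelist entries that also appear in the blacklist; the source does not specify how the sample library is stored or cleaned.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Hypothetical container for the Step 1 knowledge base."""
    blacklist: set = field(default_factory=set)   # known malicious domains
    whitelist: set = field(default_factory=set)   # known legitimate domains
    words: set = field(default_factory=set)       # word dictionary


def normalize(domain: str) -> str:
    """Lowercase the name and strip whitespace and a trailing dot."""
    return domain.strip().lower().rstrip(".")


def build_knowledge_base(black_lines, white_lines, word_lines) -> KnowledgeBase:
    """Build the sample library and dictionary from iterables of text lines."""
    kb = KnowledgeBase()
    kb.blacklist = {normalize(d) for d in black_lines if d.strip()}
    # Assumption: a domain listed on both sides is treated as malicious.
    kb.whitelist = {normalize(d) for d in white_lines if d.strip()} - kb.blacklist
    kb.words = {w.strip().lower() for w in word_lines if w.strip()}
    return kb


def labeled_samples(kb: KnowledgeBase):
    """Return (domain, label) pairs: 1 = malicious, 0 = legitimate."""
    return ([(d, 1) for d in sorted(kb.blacklist)]
            + [(d, 0) for d in sorted(kb.whitelist)])


kb = build_knowledge_base(
    ["Evil.example.", "xkqzv.example"],
    ["good.example", "evil.example"],
    ["good", "Mail"],
)
print(labeled_samples(kb))
```

In practice the three inputs would be read from the downloaded list files; plain lists are used here so the sketch stays self-contained.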


Abstract

The invention discloses a random forest-based DGA domain name detection method comprising the following steps: constructing a knowledge base that comprises a blacklist and whitelist sample library and a word dictionary; setting a domain name feature template; using the domain names in the blacklist and whitelist as a training set and filtering out noise; training the random forest model and storing it offline; obtaining a domain name to be detected; loading the optimal random forest model; and feeding the domain name to be detected to the model to obtain the prediction result. The method does not rely on DNS data obtained online, so it can complete DGA domain name detection quickly and can also supply predictions to other malicious domain name detection methods. In addition, because it is based on the random forest algorithm, the method is robust to noise interference, consumes few resources, runs efficiently, and generalizes well.
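The flow described above (feature template, training, offline storage, loading, prediction) might look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the five features in `features` are hypothetical and are not the patent's feature template, the forest is a deliberately small pure-Python version (bootstrap samples, a random feature subset per split, Gini impurity, majority vote) rather than the authors' implementation, `pickle` stands in for whatever offline storage is used, and the sample domains are invented.

```python
import math
import pickle
import random
from collections import Counter


def features(domain, words):
    """Hypothetical feature template, computed on the leftmost label."""
    label = domain.lower().split(".")[0]
    n = max(len(label), 1)
    entropy = -sum(c / n * math.log2(c / n) for c in Counter(label).values())
    digits = sum(ch.isdigit() for ch in label) / n
    vowels = sum(ch in "aeiou" for ch in label) / n
    longest_word = max((len(w) for w in words if w in label), default=0) / n
    return [float(len(label)), entropy, digits, vowels, longest_word]


def gini(ys):
    """Gini impurity of a non-empty list of 0/1 labels."""
    p = sum(ys) / len(ys)
    return 2 * p * (1 - p)


def build_tree(X, y, rng, depth=0, max_depth=4):
    """Grow one tree; each split looks at a random subset of the features."""
    majority = int(2 * sum(y) >= len(y))
    if depth == max_depth or len(set(y)) == 1:
        return majority                      # leaf: majority class
    k = max(1, int(math.sqrt(len(X[0]))))
    best = None
    for f in rng.sample(range(len(X[0])), k):
        # Dropping the largest value keeps both sides of the split non-empty.
        for t in sorted({row[f] for row in X})[:-1]:
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            score = len(left) * gini(left) + len(right) * gini(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:                         # no usable split was found
        return majority
    _, f, t = best
    lo = [i for i, row in enumerate(X) if row[f] <= t]
    hi = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            build_tree([X[i] for i in lo], [y[i] for i in lo], rng, depth + 1, max_depth),
            build_tree([X[i] for i in hi], [y[i] for i in hi], rng, depth + 1, max_depth))


def predict_tree(node, x):
    """Walk from the root to a leaf and return its class."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] <= t else right
    return node


def train_forest(X, y, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample of the training set."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest


def predict(forest, x):
    """Majority vote over all trees: 1 = suspected DGA, 0 = legitimate."""
    votes = sum(predict_tree(tree, x) for tree in forest)
    return int(2 * votes > len(forest))


words = {"mail", "shop", "news"}
samples = [("mailbox.example", 0), ("newsshop.example", 0),
           ("xq7z9kf2vb.example", 1), ("p0w3rt8qzx.example", 1)]
X = [features(d, words) for d, _ in samples]
y = [label for _, label in samples]

blob = pickle.dumps(train_forest(X, y))      # offline storage of the model
model = pickle.loads(blob)                   # load the model at detection time
print(predict(model, features("k2j8xq0zpl.example", words)))
```

Because every feature is computed from the domain string and the word dictionary alone, detection needs no online DNS data, which matches the property claimed in the abstract. A production system would more likely use an established random forest library and a far larger training set.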

Description

Technical field

[0001] The invention relates to a random forest-based DGA domain name detection method, which belongs to the field of network security.

Background art

[0002] Malicious domain names are website domain names that spread worms, viruses, and Trojan horses, or that are used for illegal activities such as fraud and pornography. As Domain-Flux and Fast-Flux technologies are adopted more and more widely by hackers, network attacks become more concealed, malicious activity becomes harder to trace, and security risks become more persistent. Among them, domain names generated by domain generation algorithms (Domain Generation Algorithm, DGA) are widely used in botnets (Botnet). In a network composed of a large number of hosts (Bot) infected by bot programs, the attacker (BotMaster) can manipulate the Bots through the control server to launch various types of network attacks, such as distributed denial of service (DDoS), spam (Spam), phishing (Phishing), click fraud (ClickFraud) and st...
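To make the idea of a domain generation algorithm concrete, the following is a minimal toy sketch in Python. It does not reproduce any real malware family; the function name `toy_dga`, the seed string, the hash-based construction, and the `.example` suffix are all assumptions chosen for illustration. The point it shows is that a bot and its controller can compute the same list of pseudo-random names independently from a shared seed and the current date.

```python
import hashlib
from datetime import date


def toy_dga(seed: str, day: date, count: int = 5, length: int = 12,
            tld: str = ".example"):
    """Derive `count` pseudo-random domain names from a seed and a date."""
    domains = []
    for i in range(count):
        material = f"{seed}|{day.isoformat()}|{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map each hex digit (0-15) to a letter in the range a-p.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:length])
        domains.append(label + tld)
    return domains


for name in toy_dga("shared-secret", date(2016, 5, 11)):
    print(name)
```

Names produced this way tend to look random and rarely contain dictionary words, which is the kind of character-level regularity a detection method based on domain name features can exploit.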

Application Information

IPC(8): H04L29/06, H04L29/12
CPC: H04L63/1441, H04L61/4511
Inventor: 王红凯, 张旭东, 杨维永, 马志程, 廖鹏, 黄益彬, 于晓文, 张丹, 夏威, 宋文杰
Owner: STATE GRID CORP OF CHINA