Network security blog classification method and system based on feature extraction

A network security and feature extraction technology, applied in the field of blog classification, that addresses the low semantic distinction between blog types and the resulting failure to identify them, and achieves reliable classification results.

Active Publication Date: 2018-12-21
CENT SOUTH UNIV
4 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

In our classification of network security blogs, the final categories are blogs related to IOC and blogs not related to IOC. However, some blogs not related to IOC also describe cyber threats without analyzing the behavioral characteristics of those threats. The titles of such blogs are not semantically well distinguished from those of IOC-related blogs, and existing methods cannot accurately identify them.



Examples


Embodiment Construction

[0075] The present invention will be further described below in conjunction with examples.

[0076] As shown in Figure 1, the present invention discloses a method for classifying network security blogs based on feature extraction, which specifically includes the following steps:

[0077] Step 1: Use web crawler technology to crawl network security blogs from security websites.

[0078] In this embodiment, the security blog website Malwarebytes is taken as an example. Its blog list page is https://blog.malwarebytes.com/page/1, where the final 1 is the list page number. We traverse from the first blog list page until an empty page is reached, using XPath to locate all blog links on each list page. We then visit each blog link and use XPath to locate the title, release time, and text of the blog. Finally, {link, title, release time, text} is stored in the database as one blog entry. It should be understood that the crawl...
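The crawl loop described in [0078] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the XPath expressions, the `crawl_blogs` function, and the injected `fetch` callable are all hypothetical, since the source does not give the site's markup or the storage layer. It uses the standard library's `xml.etree.ElementTree`, which only handles well-formed markup and a limited XPath subset; a real crawler would more likely use a forgiving HTML parser.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

# Hypothetical XPath expressions: the real page layout is not given in the source.
LINK_XPATH = ".//a[@class='post-link']"
TITLE_XPATH = ".//h1"
TIME_XPATH = ".//time"
TEXT_XPATH = ".//div[@class='content']"


def _text(root: ET.Element, xpath: str) -> str:
    """Return the concatenated text of the first node matching xpath, or ''."""
    node = root.find(xpath)
    return "".join(node.itertext()).strip() if node is not None else ""


def crawl_blogs(fetch: Callable[[str], str],
                list_url_template: str) -> List[Dict[str, str]]:
    """Walk list pages 1, 2, ... until one contains no blog links.

    fetch maps a URL to its markup (assumed well-formed here), so the
    network layer can be swapped out or faked.
    """
    entries: List[Dict[str, str]] = []
    page = 1
    while True:
        root = ET.fromstring(fetch(list_url_template.format(page=page)))
        links = [a.get("href") for a in root.findall(LINK_XPATH)]
        if not links:  # empty list page: stop traversing
            break
        for link in links:
            post = ET.fromstring(fetch(link))
            entries.append({
                "link": link,
                "title": _text(post, TITLE_XPATH),
                "release_time": _text(post, TIME_XPATH),
                "text": _text(post, TEXT_XPATH),
            })
        page += 1
    return entries
```

Each returned dictionary corresponds to one {link, title, release time, text} entry; writing the entries to a database is left out because the source does not specify one.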


Abstract

The invention discloses a network security blog classification method and system based on feature extraction. The method comprises: calculating the non-dictionary word density of each blog; calculating the malicious tendency of each blog; counting the high-frequency words common to all blogs; calculating the term frequency-inverse document frequency of each high-frequency word in each blog; training a preset classification model on each blog's non-dictionary word density, malicious tendency, and the term frequency-inverse document frequency of each high-frequency word, together with a code indicating whether the blog is related or unrelated to IOC, to obtain a blog classifier; and obtaining the non-dictionary word density, malicious tendency, and high-frequency word term frequency-inverse document frequency of a blog to be classified and inputting them into the trained blog classifier to obtain an output value indicating whether that blog is related or unrelated to IOC. With this method, IOC-related and IOC-unrelated blogs among network security technology blogs can be accurately classified.
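The feature vector named in the abstract might be assembled roughly as below. This is a sketch under stated assumptions: the excerpt does not define the two density-style features, so here non-dictionary word density is taken to be the fraction of tokens absent from a dictionary, and malicious tendency the fraction of tokens found in a hypothetical malicious-word list. High-frequency words are taken as the `top_k` most frequent tokens across the corpus, and the weighting is the standard tf × log(N/df) form. All function names are illustrative.

```python
import math
import re
from collections import Counter
from typing import List, Set, Tuple

TOKEN_RE = re.compile(r"[A-Za-z0-9_.\-]+")


def tokenize(text: str) -> List[str]:
    """Lowercase tokens, keeping dots and hyphens inside names like evil.exe."""
    tokens = (t.strip(".-").lower() for t in TOKEN_RE.findall(text))
    return [t for t in tokens if t]


def non_dictionary_density(tokens: List[str], dictionary: Set[str]) -> float:
    # Assumed definition: share of tokens that are not dictionary words.
    return sum(t not in dictionary for t in tokens) / len(tokens) if tokens else 0.0


def malicious_tendency(tokens: List[str], malicious_words: Set[str]) -> float:
    # Assumed definition: share of tokens in a hypothetical malicious-word list.
    return sum(t in malicious_words for t in tokens) / len(tokens) if tokens else 0.0


def high_frequency_words(docs: List[List[str]], top_k: int) -> List[str]:
    counts = Counter(t for doc in docs for t in doc)
    return [word for word, _ in counts.most_common(top_k)]


def tf_idf(tokens: List[str], word: str, docs: List[List[str]]) -> float:
    """Standard TF-IDF: term frequency times log(N / document frequency)."""
    df = sum(word in doc for doc in docs)
    if not tokens or df == 0:
        return 0.0
    return (tokens.count(word) / len(tokens)) * math.log(len(docs) / df)


def feature_vectors(texts: List[str], dictionary: Set[str],
                    malicious_words: Set[str],
                    top_k: int = 3) -> Tuple[List[str], List[List[float]]]:
    """Return the high-frequency vocabulary and one feature row per blog."""
    docs = [tokenize(t) for t in texts]
    vocab = high_frequency_words(docs, top_k)
    rows = [[non_dictionary_density(d, dictionary),
             malicious_tendency(d, malicious_words)]
            + [tf_idf(d, w, docs) for w in vocab]
            for d in docs]
    return vocab, rows
```

The resulting rows, paired with 0/1 labels for IOC-unrelated and IOC-related blogs, could then be passed to whichever classification model is chosen; the abstract only calls it a preset model, so none is assumed here.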

Description

Technical field

[0001] The invention belongs to the field of blog classification, and in particular relates to a network security blog classification method and system based on feature extraction.

Background art

[0002] In recent years, cyber threats have expanded in scope and frequency. Many companies have suffered huge losses from cyber attacks, and dealing with complex and changeable cyber threats has become a focus of attention for companies. After analyzing cyber threats, many cyber security experts publish the resulting threat intelligence in blogs. Such blogs contain a large number of indicators of compromise (IOCs), such as malicious URLs and Trojan virus names. These IOCs represent the behavioral characteristics of network threats and play an important role in detecting and defending against network attacks. However, there are still many blogs related to news and security product promotion...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 王建新, 宁翔凯, 李冬, 王伟平, 鲁鸣鸣
Owner: CENT SOUTH UNIV