A spam feature selection method and its detection method

A feature selection method and spam detection technology, applied in special data processing applications, instruments, electronic digital data processing, etc. It addresses the problems of long running time, poor results, and high computational complexity, with the effects of reduced feature dimensionality and wide applicability.

Active Publication Date: 2017-06-16
大庆乐此信息技术有限责任公司

AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to provide a spam feature selection method and a corresponding detection method that solve the problems of existing feature selection and spam detection methods: high computational complexity, long running time, and difficulty achieving good results in practical applications.

Embodiment Construction

[0058] Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The following examples illustrate the present invention but are not intended to limit its scope.

[0059] The present invention proposes a spam feature selection method which, as shown in Figure 1, includes the following steps:

[0060] S101: extract the features of the email using a byte-level N-grams method. Specifically, segment the email's byte stream into byte sequences of a preset length to obtain the hash dictionary of the email, then compare the preset samples against the hash dictionary to obtain the feature set corresponding to the hash dictionary;
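A minimal sketch of the byte-level N-grams extraction in step S101 may help make it concrete. It is not the patent's implementation: it assumes an overlapping sliding window, a window length of 4 bytes, a Python `Counter` standing in for the "hash dictionary", and binary presence features. The names `byte_ngrams`, `feature_vector`, and `vocabulary` are hypothetical.

```python
from collections import Counter


def byte_ngrams(message: bytes, n: int = 4) -> Counter:
    """Count every overlapping n-byte window in the raw message.

    The Counter plays the role of the per-email hash dictionary
    (assumption: the patent does not spell out the data structure here).
    """
    return Counter(message[i:i + n] for i in range(len(message) - n + 1))


def feature_vector(message: bytes, vocabulary: list, n: int = 4) -> list:
    """Compare preset n-grams (vocabulary) against the email's dictionary.

    Returns 1 where a preset n-gram occurs in the email, else 0.
    Binary presence is an assumption; counts could be used instead.
    """
    grams = byte_ngrams(message, n)
    return [1 if gram in grams else 0 for gram in vocabulary]


if __name__ == "__main__":
    email = b"Subject: cheap pills\r\n\r\nBuy cheap pills now"
    preset = [b"chea", b"pill", b"meet"]  # hypothetical preset samples
    print(feature_vector(email, preset))
```

Working on raw bytes rather than words avoids tokenization and character-set handling, which is one common reason byte N-grams are used for email.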

[0061] Comparing the features of the preset sample with the hash dictionary to obtain the feature set corresponding to the hash dictionary specifically proceeds as follows: whe...

Abstract

The invention relates to a spam feature selection method and a spam detection method. The feature selection method includes the following steps: features of the emails are extracted using a byte-level N-grams method; the features are ranked by their relevance to a preset email type to generate an initial feature subset; redundant features in the initial feature subset are deleted using a Markov blanket approximation algorithm to obtain candidate feature subsets; the candidate feature subsets are used for prediction by an online logistic regression classifier and evaluated according to the prediction results, so that the optimal feature subset is selected. Spam is then detected by the online logistic regression classifier using the selected optimal feature subset. With this feature selection method and detection method, the calculations for spam feature selection and detection are simple, time complexity is low, and spam detection accuracy is greatly improved.
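The ranking, redundancy removal, and classification steps in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patent's algorithm: relevance is measured with symmetric uncertainty, redundancy uses the approximate Markov blanket test known from FCBF-style filters, and the classifier is plain stochastic-gradient logistic regression over sets of active features. The names `select_features` and `OnlineLogisticRegression` and the learning rate are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """Normalized mutual information in [0, 1] (assumed relevance measure)."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def select_features(columns, labels):
    """columns maps feature name -> list of values, one per email."""
    relevance = {name: symmetric_uncertainty(col, labels)
                 for name, col in columns.items()}
    # Rank relevant features, most relevant first (the initial subset).
    ranked = sorted((n for n in columns if relevance[n] > 0),
                    key=relevance.get, reverse=True)
    kept = []
    for name in ranked:
        # A kept feature is an approximate Markov blanket for `name` if it
        # tells us at least as much about `name` as `name` does about the label.
        redundant = any(
            symmetric_uncertainty(columns[k], columns[name]) >= relevance[name]
            for k in kept)
        if not redundant:
            kept.append(name)
    return kept


class OnlineLogisticRegression:
    """Logistic regression updated one email at a time."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = {}
        self.bias = 0.0

    def predict_proba(self, active):
        """Probability of spam given the set of features present."""
        z = self.bias + sum(self.weights.get(f, 0.0) for f in active)
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, active, label):
        """One gradient step on the log-loss; label is 1 (spam) or 0 (ham)."""
        error = label - self.predict_proba(active)
        self.bias += self.learning_rate * error
        for f in active:
            self.weights[f] = self.weights.get(f, 0.0) + self.learning_rate * error
```

To evaluate a candidate subset in this sketch, one would restrict each email's active features to that subset, run `update` over a training stream, and compare prediction accuracy across subsets; how the patent scores the prediction results is not stated in this excerpt.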

Description

Technical field

[0001] The invention relates to the technical field of computer network security, in particular to a spam feature selection method and a detection method thereof.

Background

[0002] With the rapid development of the Internet, e-mail has become a new type of information transmission tool, widely used in many fields because it is cheap, convenient, and fast. However, this widespread use has also brought negative effects. Large volumes of spam flood people's mailboxes, which not only disrupts normal use but also damages the image of operators. Many spam filtering systems have emerged in response, but they face problems such as large data volumes and low operating efficiency.

[0003] Among traditional spam filtering methods, many machine learning methods, including Flexible Bayes, decision trees, SVM, and Boosting, have been applied to spam filtering. From the current research results, ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/00, G06Q10/10
Inventor: 孙广路, 何勇军, 刘广明
Owner: 大庆乐此信息技术有限责任公司