Method for detecting image-based spam email using an improved Gaussian mixture model classifier

A Gaussian mixture model technique, applied in the fields of instruments, computer parts, and character and pattern recognition, that addresses the large amount of computation and high algorithmic time complexity of earlier approaches, saving program running time and space while improving precision and recall.

Inactive Publication Date: 2011-07-20
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

Spam images are distinguished by calculating a threshold value. Although this method uses statistical knowledge to calculate more accurately,

Examples

Embodiment Construction

[0037] The method is divided into the following main steps:

[0038] 1. Training based on the sample set

[0039] Step 1) Label the image data set to be trained, dividing it into spam images and normal images;

[0040] Step 2) Use the Speeded-Up Robust Features (SURF) algorithm to extract the local invariant feature descriptors of each spam image and each normal image;

[0041] Step 3) Fit a Gaussian mixture model to the local invariant feature descriptors of each image, using the expectation-maximization method to estimate its weights, means and covariance matrices, which together form the image's Gaussian mixture feature vector;

[0042] Step 4) Improve the mean clustering algorithm so that it can cluster these special Gaussian mixture feature vectors, which involves choosing the distance calculation method and the criterion function;

[0043] Step 5) Use cross-entropy as the distance calculation method between Gaussian mixture dist...
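Steps 3 and 5 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes diagonal covariances rather than full covariance matrices, uses synthetic 2-D points in place of real SURF descriptors (step 2 is not shown), and estimates the cross-entropy by Monte Carlo sampling, since this excerpt does not say how the cross-entropy is computed. All function names (`fit_gmm_em`, `cross_entropy`, `make_descriptors`) are hypothetical.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def component_logs(x, gmm):
    """log(weight) + log density for every (weight, mean, var) component."""
    return [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]


def gmm_log_pdf(x, gmm):
    """Log density of the whole mixture, via log-sum-exp for stability."""
    logs = component_logs(x, gmm)
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs))


def fit_gmm_em(points, k, iters=30, seed=0, min_var=1e-3):
    """Step 3 (simplified): estimate weights, means, variances with EM."""
    rng = random.Random(seed)
    dim, n = len(points[0]), len(points)
    gmm = [(1.0 / k, list(p), [1.0] * dim) for p in rng.sample(points, k)]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in points:
            logs = component_logs(x, gmm)
            top = max(logs)
            ex = [math.exp(l - top) for l in logs]
            total = sum(ex)
            resp.append([e / total for e in ex])
        # M-step: re-estimate each component from its weighted points.
        new_gmm = []
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12  # guard against empty components
            mean = [sum(r[j] * x[d] for r, x in zip(resp, points)) / nj
                    for d in range(dim)]
            var = [max(min_var, sum(r[j] * (x[d] - mean[d]) ** 2
                                    for r, x in zip(resp, points)) / nj)
                   for d in range(dim)]
            new_gmm.append((nj / n, mean, var))
        gmm = new_gmm
    return gmm


def sample_gmm(gmm, n, rng):
    """Draw n points from the mixture."""
    weights = [w for w, _, _ in gmm]
    out = []
    for _ in range(n):
        _, mean, var = rng.choices(gmm, weights=weights)[0]
        out.append([rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, var)])
    return out


def cross_entropy(p, q, n=2000, seed=1):
    """Step 5 (assumed form): Monte Carlo estimate of -E_{x~p}[log q(x)]."""
    rng = random.Random(seed)
    return -sum(gmm_log_pdf(x, q) for x in sample_gmm(p, n, rng)) / n


def make_descriptors(centers, n_per_center, rng):
    """Synthetic stand-in for the descriptors of one image."""
    return [[rng.gauss(c, 1.0) for c in center]
            for center in centers for _ in range(n_per_center)]


rng = random.Random(42)
image_a = make_descriptors([(0, 0), (5, 5)], 100, rng)
image_b = make_descriptors([(0, 0), (5, 5)], 100, rng)    # similar to image_a
image_c = make_descriptors([(-8, 8), (-8, -8)], 100, rng)  # very different
gmm_a, gmm_b, gmm_c = (fit_gmm_em(img, k=2) for img in (image_a, image_b, image_c))
print("H(a, b) =", cross_entropy(gmm_a, gmm_b))
print("H(a, c) =", cross_entropy(gmm_a, gmm_c))
```

In this toy setup the cross-entropy from image A to the similar image B comes out smaller than that from A to the dissimilar image C, which is the property a distance measure for clustering needs. Cross-entropy is not symmetric in general, so a clustering algorithm built on it has to fix the direction in which it is evaluated.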


Abstract

The invention discloses a method for detecting spam email using an improved Gaussian mixture model classifier. The method extracts the invariant region features of the spam information in an image with the Speeded-Up Robust Features (SURF) algorithm, fits a Gaussian mixture model to those features, and estimates the weights, means and covariance matrices with the expectation-maximization method. Specifically, the method comprises the following steps: labeling the images of the data set to be detected and dividing them into spam images and normal images; extracting the local invariant feature vectors of the whole data set with SURF; fitting a density function to the local invariant features with the Gaussian mixture model to obtain the means and covariance matrices of all images; improving the mean clustering algorithm so that it is suitable for clustering the special feature vectors obtained in the previous step, taking cross-entropy as the measure of similarity between Gaussian mixture distributions, and thereby realizing a mean clustering algorithm based on the Gaussian mixture model; and building a classifier with the mean clustering algorithm based on the Gaussian mixture model.
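For reference, the two quantities the abstract relies on can be written in their standard textbook form. The symbols below ($K$, $w_k$, $\mu_k$, $\Sigma_k$, $p$, $q$) are assumptions chosen for illustration; the patent's own notation and any modifications it makes are not visible in this excerpt.

```latex
% Gaussian mixture density fitted to the local invariant features x of one image
p(x) = \sum_{k=1}^{K} w_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad w_k \ge 0, \quad \sum_{k=1}^{K} w_k = 1

% Cross-entropy of mixture q relative to mixture p,
% used as the similarity measure between two Gaussian mixture distributions
H(p, q) = -\int p(x) \, \log q(x) \, \mathrm{d}x
```

The integral generally has no closed form when $p$ and $q$ are both mixtures, so in practice it would have to be approximated, for example by sampling from $p$.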

Description

Technical field

[0001] The present invention uses the Gaussian mixture model from statistics to fit a density function to the local invariant features of images. It proposes a mean clustering algorithm and a classification model based on the Gaussian mixture model to detect image spam. It mainly addresses the low detection efficiency and low recall of existing image spam detection techniques, and belongs to the field of data mining and machine learning.

Background art

[0002] E-mail has become an important way for people to communicate on the Internet, but because of the huge commercial, economic and political interests involved, the volume of spam has grown rapidly. Early image-based spam embedded spam information such as advertisements into an image in the form of text. Hrishikesh et al. used mined text and color features to classify mail [1]. In 2006, Fumera et al. proposed an OCR (Optical Character Rec...


Application Information

IPC(8): G06K9/62
Inventor: 张卫丰, 王慕妮, 张迎周, 周国强, 许碧欢, 陆柳敏
Owner: NANJING UNIV OF POSTS & TELECOMM