Method for detecting image-based spam by utilizing image local invariant feature

A technique based on local invariant features of pictures, applied in computer components, instruments, and character and pattern recognition. It addresses the heavy computation and high algorithmic time complexity that make existing methods impractical, saves program running time and space, and improves precision and recall.

Inactive Publication Date: 2010-09-01
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

In 2008, Mehta et al. detected large volumes of spam generated from templates. By exploiting the repeated similarity among such images, their SVM classifier reached 98% accuracy. They also proposed an algorithm that clusters images with a Gaussian mixture model (GMM) [6]: each picture is reduced to 100×100 pixels, the texture, shape, and color features of each pixel are extracted, a GMM is trained for each picture, a similarity distance is calculated to cluster the pictures, and spam pictures are identified by applying a threshold. Although this method is accurate, its computational cost and time complexity are too high for practical use.


Examples


Embodiment Construction

[0029] Image spam is detected from the local invariant features of pictures. VC++ 6.0 is the development tool, and image features are processed with the OpenCV 1.0 open-source library. The detailed steps are as follows:

[0030] 1. Training phase: collect spam pictures and normal pictures to form a training set.

[0031] Step 1) Label the pictures of the training data set: denote the spam pictures (image spam) as I_i and the normal pictures (image ham) as J_i, where i = 1, 2, ..., N;

[0032] Step 2) Use the SURF (Speeded-Up Robust Features) algorithm to extract the local invariant feature descriptors of each picture in I_i and J_i, where each descriptor is an L-dimensional vector (L = 64);

[0033] Step 3) Use the k-means ("mean value") clustering algorithm to cluster the 64-dimensional local invariant feature descriptors of the spam pictures and normal pictures in the training set, obtaining 200 cluster centers. Using the 200 cluster c...
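The clustering in Step 3 can be illustrated with a minimal sketch. It assumes plain Lloyd's k-means, which the source does not spell out, and it runs on synthetic 64-dimensional vectors rather than real SURF descriptors. The helper names (`kmeans`, `nearest_center`), the deterministic initialization, and the toy value k = 2 are illustrative assumptions; the source uses 200 cluster centers.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_center(point, centers):
    """Index of the cluster center closest to `point`."""
    return min(range(len(centers)),
               key=lambda c: squared_distance(point, centers[c]))

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; returns the list of k cluster centers."""
    n, dim = len(points), len(points[0])
    # Assumption: evenly spaced samples as initial centers, for reproducibility.
    centers = [list(points[i * n // k]) for i in range(k)]
    for _ in range(iterations):
        # Assignment step: group every descriptor with its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest_center(p, centers)].append(p)
        # Update step: move each center to the mean of its group.
        for c, members in enumerate(groups):
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(m[d] for m in members) / len(members)
                              for d in range(dim)]
    return centers

# Synthetic stand-ins for 64-dimensional descriptors (not real image data).
rng = random.Random(0)
L = 64
blob_a = [[0.2 + rng.gauss(0, 0.05) for _ in range(L)] for _ in range(30)]
blob_b = [[0.8 + rng.gauss(0, 0.05) for _ in range(L)] for _ in range(30)]
descriptors = blob_a + blob_b

centers = kmeans(descriptors, k=2)  # toy k; the source uses 200 centers
```

Once the centers are known, `nearest_center(descriptor, centers)` maps any new descriptor to the index of its closest center.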



Abstract

The invention relates to a method for detecting image-based spam using local invariant features of images. The SURF (Speeded-Up Robust Features) algorithm extracts invariant region features of the spam information in an image, from which a feature vector of the image is generated. The parameters of a Gaussian mixture model are then estimated by maximum likelihood, yielding a classifier based on the Gaussian mixture model. Experiments show that the method improves the recall rate for spam and saves computation time and space. The implementation comprises three modules: image feature extraction, Gaussian mixture model parameter estimation, and image-based spam detection.
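As a rough illustration of the maximum-likelihood estimation described in the abstract, the sketch below fits a diagonal-covariance Gaussian mixture model and uses it to label a feature vector. Several points are assumptions not stated in the source: the likelihood is maximized with the standard expectation-maximization (EM) iteration, one mixture is fitted per class, and a vector receives the label of the mixture with the higher log-likelihood. The 2-D training vectors are synthetic, and all function names are hypothetical.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def gmm_log_likelihood(x, weights, means, variances):
    """log p(x) under the mixture."""
    return log_sum_exp([math.log(w) + log_gauss_diag(x, m, v)
                        for w, m, v in zip(weights, means, variances)])

def fit_gmm(data, k, iterations=50, min_var=1e-3):
    """EM for a diagonal GMM; returns (weights, means, variances)."""
    n, dim = len(data), len(data[0])
    # Assumption: deterministic init from evenly spaced samples.
    means = [list(data[i * n // k]) for i in range(k)]
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            logs = [math.log(weights[j]) + log_gauss_diag(x, means[j], variances[j])
                    for j in range(k)]
            norm = log_sum_exp(logs)
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12  # guard against empty components
            weights[j] = nj / n
            means[j] = [sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                        for d in range(dim)]
            variances[j] = [max(min_var,
                                sum(r[j] * (x[d] - means[j][d]) ** 2
                                    for r, x in zip(resp, data)) / nj)
                            for d in range(dim)]
    return weights, means, variances

# Synthetic 2-D "feature vectors" (illustrative only, not real image features).
rng = random.Random(1)

def blob(cx, cy, count=20, sigma=0.3):
    return [[cx + rng.gauss(0, sigma), cy + rng.gauss(0, sigma)]
            for _ in range(count)]

spam_gmm = fit_gmm(blob(0, 0) + blob(4, 4), k=2)
ham_gmm = fit_gmm(blob(10, 10) + blob(14, 10), k=2)

def classify(x):
    """Assumed decision rule: pick the class whose mixture explains x better."""
    spam_score = gmm_log_likelihood(x, *spam_gmm)
    ham_score = gmm_log_likelihood(x, *ham_gmm)
    return "spam" if spam_score > ham_score else "ham"
```

Diagonal covariances keep the sketch short; whether the patented method uses full or diagonal covariances is not stated in the text above.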

Description

Technical field

[0001] The present invention is a scheme that uses the local invariant features of spam pictures to train a Gaussian mixture model for detecting image-based spam. It mainly addresses the low detection efficiency and low recall rate of current techniques for image-based spam, and belongs to the field of data mining and machine learning.

Background

[0002] E-mail has become an important way for people to communicate on the Internet, but driven by large commercial, economic, and political interests, the volume of spam has grown dramatically. The image spam that first became prevalent embedded spam content such as advertisements into images as text. Hrishikesh et al. used mined text and color features to classify e-mails [1]. In 2006, Fumera et al. proposed using OCR (Optical Character Recognition) to detect the text in image spam, which has a better detection effect than other filtering sys...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/66
Inventors: 张卫丰, 杨波, 周国强, 张迎周, 陆柳敏, 许碧娣, 王慕妮, 王宗辉, 韩蕊, 陆柳青
Owner: NANJING UNIV OF POSTS & TELECOMM