Identification method and system for junk mail

A spam identification method, applied in the field of communications, that addresses the performance degradation of standard recognizers on camouflaged spam and achieves more stable identification performance.

Active Publication Date: 2013-03-20
CHONGQING UNIV

AI Technical Summary

Problems solved by technology

[0003] A set of experiments was carried out on the TREC 2007 spam corpus. The results show that as the number of modified keywords in a spam message increases, the performance of standard recognition algorithms, such as the support vector machine, degrades rapidly.
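The kind of keyword modification measured in this experiment can be modeled on a binary word-presence vector. The sketch below is illustrative only: the function name `camouflage`, the index lists, and the "delete first, then add" order are assumptions, not details taken from the patent. It flips at most k features, removing spam-typical words and then adding words typical of legitimate mail.

```python
from typing import List, Sequence, Tuple


def camouflage(
    x: Sequence[int],
    spam_idx: Sequence[int],
    legit_idx: Sequence[int],
    k: int,
) -> Tuple[List[int], int]:
    """Return a modified copy of binary vector x and the number of changes.

    spam_idx:  positions of words assumed to be typical of spam
    legit_idx: positions of words assumed to be typical of legitimate mail
    k:         maximum number of keywords the attacker may modify
    """
    y = list(x)  # work on a copy so the original vector is untouched
    changed = 0

    # Delete spam-typical keywords that are present.
    for i in spam_idx:
        if changed == k:
            break
        if y[i] == 1:
            y[i] = 0
            changed += 1

    # Add legitimate-mail keywords that are absent.
    for i in legit_idx:
        if changed == k:
            break
        if y[i] == 0:
            y[i] = 1
            changed += 1

    return y, changed
```

Sweeping k from 0 upward and re-scoring the modified vectors with a fixed classifier is one way to reproduce the shape of such an experiment; the patent text shown here does not specify the exact protocol.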

Examples

Embodiment Construction

[0041] To make the purpose, technical solution, and advantages of the present invention clearer, specific embodiments of the spam identification method and system of the present invention are described in further detail below with reference to the accompanying drawings.

[0042] Referring to Figure 1, the flowchart of a preferred embodiment of the present invention, the method comprises the following steps:

[0043] Step S101: set the recognizer parameters.

[0044] Step S102: convert the email into a vector.

[0045] Step S103: use the recognizer to identify the mail.

[0046] Step S104: output the recognition result.
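The four steps can be laid out as a small pipeline. This is a minimal sketch, not the patented recognizer: the text shown here does not say which parameters are set or how the recognizer scores a mail, so the linear score, the `RecognizerParams` fields, and all function names below are placeholder assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class RecognizerParams:
    # Hypothetical parameters; the source does not list the real ones.
    weights: List[float]
    bias: float = 0.0
    threshold: float = 0.0


def set_recognizer_parameters(
    weights: Sequence[float], bias: float = 0.0, threshold: float = 0.0
) -> RecognizerParams:
    """Step S101: set the recognizer parameters."""
    return RecognizerParams(list(weights), bias, threshold)


def email_to_vector(text: str, dictionary: Sequence[str]) -> List[float]:
    """Step S102: convert the email into a vector (binary presence, assumed)."""
    words = set(text.lower().split())
    return [1.0 if w in words else 0.0 for w in dictionary]


def recognize(params: RecognizerParams, x: Sequence[float]) -> bool:
    """Step S103: placeholder linear scorer standing in for the recognizer."""
    score = sum(w * f for w, f in zip(params.weights, x)) + params.bias
    return score > params.threshold


def output_result(is_spam: bool) -> str:
    """Step S104: output the recognition result."""
    return "spam" if is_spam else "legitimate"


def identify(text: str, dictionary: Sequence[str], params: RecognizerParams) -> str:
    """Run S102 to S104 for one mail, using parameters set in S101."""
    return output_result(recognize(params, email_to_vector(text, dictionary)))
```

A possible call, with a made-up three-word dictionary: `identify("Agenda for the meeting", ["free", "winner", "meeting"], set_recognizer_parameters([1.0, 1.0, -1.5]))`.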

[0047] In the present invention, a mail document is expressed as a vector x = [f_1, ..., f_d]. First, all the words that appear in the mail database are collected and compiled into a general dictionary. The words are then sorted by their frequency in spam, and d+ words are selected into the representation dictionary; similarly, sort according to ...
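The part of this construction that survives in the text can be sketched as follows. Several choices here are assumptions: whitespace tokenization, "frequency" read as the number of spam mails containing a word, alphabetical tie-breaking, and binary features f_i. The second selection step is truncated in the source and is deliberately not reproduced; all function names are illustrative.

```python
from collections import Counter
from typing import List, Sequence


def tokenize(text: str) -> List[str]:
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def build_general_dictionary(mails: Sequence[str]) -> List[str]:
    """Collect every word that appears in the mail database."""
    return sorted({w for text in mails for w in tokenize(text)})


def top_spam_words(spam_mails: Sequence[str], d_plus: int) -> List[str]:
    """Rank words by how many spam mails contain them and keep d_plus of them."""
    freq: Counter = Counter()
    for text in spam_mails:
        freq.update(set(tokenize(text)))  # count each word once per mail
    # Sort by descending frequency, then alphabetically to break ties.
    ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:d_plus]]


def to_vector(text: str, representation_dictionary: Sequence[str]) -> List[int]:
    """Express a mail as x = [f_1, ..., f_d] with binary presence features."""
    present = set(tokenize(text))
    return [1 if w in present else 0 for w in representation_dictionary]
```

With the three spam mails "free prize now", "free money now", and "win free prize", `top_spam_words(..., 3)` keeps "free", "now", and "prize", and a mail is then encoded against that list.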

Abstract

The invention discloses an identification method and system for junk mail. The method and system overcome a deficiency of the prior art: as more keywords are modified in junk mail, the performance of standard identification algorithms, such as the support vector machine, degrades rapidly. The method comprises the steps of setting the parameters of a recognizer, converting the mail into a vector, identifying the mail with the recognizer, and outputting the identification result. The method and system provide intelligent junk mail identification that resists camouflage attacks and achieve more stable identification performance.

Description

Technical field

[0001] The invention relates to the technical field of communications, and in particular to a spam identification method and system.

Background art

[0002] At present, spam often uses camouflage to evade spam recognition software. Attackers deliberately add keywords that often appear in legitimate emails, or delete keywords that often appear in spam, to disguise the spam. Usually, so as not to damage the content that makes the spam valuable to its sender, the attacker can modify only part of the message.

[0003] A set of experiments was conducted on the TREC 2007 spam corpus. The results show that as the number of modified keywords in spam increases, the performance of standard recognition algorithms, such as the support vector machine, degrades rapidly.

Contents of the invention

[0004] The technical problem to be solved by the present invention is to provide a spam identification method and system.

[0005] In order to solve ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L29/06, H04L12/58
Inventors: 周喜川, 严超, 胡盛东, 甘平, 黄智勇, 张玲
Owner: CHONGQING UNIV