Content-based method for real-time filtering of large-scale spam SMS

A spam SMS filtering technology, applied in telephone communication, unauthorized/fraudulent call prevention, electrical components, and related fields. It addresses the unsatisfactory results, added business processing load, and high misjudgment rate of existing approaches, with the aim of improving filtering accuracy and filtering speed and reducing misjudgments.

Inactive Publication Date: 2008-09-03
ZHEJIANG UNIV
0 Cites, 82 Cited by

AI Technical Summary

Problems solved by technology

Traditional real-time spam SMS filtering relies mainly on keyword matching, which has two main disadvantages:
(1) Low efficiency. Each text message must be matched against dozens or even hundreds of keywords, so throughput is low; when message traffic is very large, this matching adds substantially to the business processing load.
(2) High misjudgment rate.
Although ASA has been applied to some extent in China, its effect has been limited, for the following reasons:
(1) ASA relies on exact keyword matching, and spam senders use various evasions, such as visually similar characters, similar-sounding characters, and inserted separators, to bypass monitoring and keep sending.
(2) ASA intercepts spam SMS according to sending-frequency and sending-volume thresholds (that is, the number of messages that may be sent within a given period) without regard to message content. In practice the appropriate threshold depends on the specific content and is difficult to determine.
(3) Although ASA has some learning capability, that capability does not fit Chinese language habits, so its results are not ideal.
Systems based on these traditional filtering techniques therefore cannot meet the accuracy, real-time, and efficiency requirements of a commercial large-scale spam filtering system.
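
The separator evasion described above can be illustrated with a minimal sketch. The keyword list and sample messages are invented for illustration and do not come from the patent; the point is only that a literal substring check misses a message once separators are inserted.

```python
# Hypothetical keyword list; the source does not give one.
KEYWORDS = ["free prize", "winning number"]


def keyword_filter(message: str) -> bool:
    """Return True if any keyword occurs verbatim in the message."""
    text = message.lower()
    # One substring scan per keyword, so cost grows with the keyword count.
    return any(keyword in text for keyword in KEYWORDS)


plain = "Claim your FREE PRIZE today"
evasive = "Claim your F.R.E.E P-R-I-Z-E today"

print(keyword_filter(plain))    # exact matching catches the plain message
print(keyword_filter(evasive))  # inserted separators defeat exact matching
```

Every message pays for a scan per keyword, which is the efficiency concern, and the second message passes unflagged, which is the evasion concern.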



Examples


Embodiment Construction

[0028] The main principles of the present invention are as follows:

[0029] 1) Pre-filter with two modules, a whitelist and a blacklist. A short message first enters the whitelist module; if it matches the whitelist, it is released directly. Otherwise it enters the blacklist module; if it matches the blacklist, the calling number is rejected. If it matches neither list, the message proceeds to the next step.

[0030] 2) When a malicious mass-SMS event occurs, traditional monitoring solutions usually cannot respond quickly enough to intercept the spam, so such events often cause mobile communication operators large losses in a short period. The frequency-based filtering module is designed to address this mass sending of malicious SMS. Its core idea is to model the sending characteristics of all valid users who are online in real time, recording the number of SMS sent by each user within a sliding time win...
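
The two principles above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names, the keying of both lists on the calling number, the 60-second window, the limit of 3 messages, and the "suspicious" outcome for senders over the limit are all hypothetical, since the source text is truncated before it gives those details.

```python
from collections import defaultdict, deque

# Illustrative values only; the source does not state the window or the limit.
WINDOW_SECONDS = 60.0
MAX_PER_WINDOW = 3


class PreFilter:
    """Whitelist check first, then blacklist, following principle 1)."""

    def __init__(self, whitelist, blacklist):
        self.whitelist = set(whitelist)
        self.blacklist = set(blacklist)

    def check(self, sender):
        if sender in self.whitelist:
            return "release"
        if sender in self.blacklist:
            return "reject"
        return "next"  # hand the message to the following stage


class FrequencyFilter:
    """Counts each sender's messages inside a sliding time window."""

    def __init__(self, window=WINDOW_SECONDS, limit=MAX_PER_WINDOW):
        self.window = window
        self.limit = limit
        self.sent = defaultdict(deque)  # sender -> timestamps still in window

    def check(self, sender, now):
        stamps = self.sent[sender]
        # Drop timestamps that have slid out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        stamps.append(now)
        # Assumed outcome: flag the sender once the count exceeds the limit.
        return "suspicious" if len(stamps) > self.limit else "next"


def screen(sender, now, pre, freq):
    """Run the pre-filter, then the frequency filter if still undecided."""
    verdict = pre.check(sender)
    if verdict != "next":
        return verdict
    return freq.check(sender, now)


pre = PreFilter(whitelist={"10086"}, blacklist={"13900000000"})
freq = FrequencyFilter()
results = [screen("13800000001", float(t), pre, freq) for t in range(5)]
print(results)
```

A deque per sender keeps the window update cheap: old timestamps are removed from the left as they expire, so each message is appended once and removed at most once.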



Abstract

The invention discloses a content-based real-time filtering method for large-scale spam short messages, comprising the following steps: 1, pre-filtering using a blacklist and a whitelist; 2, online filtering using a frequency-based filtering module; 3, fast filtering of message content using a method that hashes twice; 4, preprocessing the text of suspicious messages and converting it into a vector representation; 5, judging suspicious messages using a combination of a Naive Bayesian classifier and a support vector classifier. The invention can greatly improve the filtering speed for spam messages and reduce the misjudgment rate of conventional keyword filtering; it can address malicious mass sending of spam messages within a short time; and, by analyzing message content semantically, it can avoid mistaking ordinary messages for spam, reducing misjudgments and improving the filtering accuracy of the whole system.
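
The ordering of the five steps can be pictured as a cascade in which each stage either decides or passes the message on. The sketch below is only an illustration of that ordering: every rule inside the stages is an invented placeholder (the phone numbers, the count limit of 100, the single SHA-256 digest lookup standing in for the hashing step, and the one-word rule standing in for vectorization plus the Naive Bayesian and support vector classifiers), and the verdict names are assumptions.

```python
import hashlib
from typing import Callable, List, Optional, Tuple

# A stage returns a verdict string, or None to pass the message on.
Stage = Callable[[dict], Optional[str]]


def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical store of digests of previously seen spam texts.
KNOWN_SPAM = {digest("You won a prize, call now")}


def prefilter(msg):      # step 1: whitelist, then blacklist
    if msg["sender"] in {"10086"}:
        return "deliver"
    if msg["sender"] in {"13900000000"}:
        return "reject"
    return None


def frequency(msg):      # step 2: frequency-based online filtering
    return "reject" if msg.get("recent_count", 0) > 100 else None


def content_hash(msg):   # step 3: placeholder for the hashing-based check
    return "reject" if digest(msg["text"]) in KNOWN_SPAM else None


def classify(msg):       # steps 4-5: placeholder for vectorize + classify
    words = msg["text"].lower().split()
    return "reject" if "prize" in words else "deliver"


STAGES: List[Tuple[str, Stage]] = [
    ("prefilter", prefilter),
    ("frequency", frequency),
    ("content_hash", content_hash),
    ("classify", classify),
]


def run_cascade(msg: dict) -> Tuple[str, str]:
    """Return (deciding stage, verdict); earlier stages run first."""
    for name, stage in STAGES:
        verdict = stage(msg)
        if verdict is not None:
            return name, verdict
    return "none", "deliver"


print(run_cascade({"sender": "10086", "text": "Your bill is ready"}))
print(run_cascade({"sender": "13800000003", "text": "Lunch at noon?"}))
```

Placing the cheap list and count checks ahead of the content analysis means most messages are decided before the more expensive classification stage runs, which is one plausible reading of the speed benefit the abstract describes.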

Description

Technical field

[0001] The invention relates to a spam short message filtering method, in particular to a content-based, large-scale, real-time spam short message filtering method.

Background technique

[0002] With the rapid development of mobile communication technology and the continuing rise in mobile phone penetration, text messaging has become an important means of communication because it is short, fast, simple, and cheap, and it is increasingly popular. While the short message service brings convenience to users, it has also brought a flood of spam messages, SMS fraud, and SMS rumors, which have had many negative effects on people's work and life and have even become a major social nuisance. According to statistics, a provincial-level mobile communication operator suffers a direct loss of nearly 10 million yuan each year due to spam text ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04Q7/32, H04Q7/22, H04M1/663, H04M1/66
Inventors: 徐从富, 陆冠中
Owner: ZHEJIANG UNIV