Junk short message filter method

A spam SMS filtering method, applied in wireless communication, special data processing applications, instruments, etc. It addresses problems such as a high false-positive rate (legitimate messages wrongly filtered), the inability to respond flexibly to changes in spam SMS, and the inability to effectively identify and filter spam SMS, and it achieves fast identification and filtering.

Active Publication Date: 2011-02-16
BEIJING FEINNO COMM TECH

AI Technical Summary

Problems solved by technology

[0007] (1) The false-positive rate is relatively high. Whether or not word segmentation is performed, matching-based spam SMS filtering wrongly blocks many legitimate messages.
[0008] (2) The method cannot respond flexibly to changes in spam messages.
Building a training corpus of sufficient scale is costly, and the corpus must be updated continuously; otherwise it cannot keep up with the pace at which spam SMS changes.
[0017] (2) Spam SMS filtering efficiency is low, making the method unsuitable for applications with strict real-time requirements.
This method not only increases the computing load on the client, but also tends to filter out messages from strangers.
[0028] In summary, prior-art spam SMS filtering methods filter based either only on the content of spam SMS or only on the way spam SMS propagates, and therefore cannot effectively identify and filter spam SMS.

Examples

Embodiment Construction

[0056] Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The following examples illustrate the present invention but are not intended to limit its scope.

[0057] The core idea of the present invention is to perform Chinese word segmentation on input or stored short messages, delete the words irrelevant to the main text of the message, and calculate a file fingerprint of the text that remains. If the number of times the message's file fingerprint appears in the cache exceeds a preset threshold, the message is judged to be spam; otherwise it is a normal message.
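
The flow described in [0057] can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation of the invention: the source does not specify the word segmenter, the list of irrelevant words, the hash function, or the threshold value. Here `segment` (a lowercase whitespace split standing in for Chinese word segmentation), `IRRELEVANT_WORDS`, the use of SHA-1, and `THRESHOLD = 3` are all hypothetical choices.

```python
import hashlib
from collections import Counter

# Hypothetical list of words irrelevant to the main text of a message.
IRRELEVANT_WORDS = {"hi", "hello", "dear", "friend", "sir"}
THRESHOLD = 3  # assumed value; the source only says "preset threshold"


def segment(message):
    # Stand-in for Chinese word segmentation: lowercase whitespace split.
    return message.lower().split()


def fingerprint(message):
    # Delete irrelevant words, then hash the words that remain.
    kept = [w for w in segment(message) if w not in IRRELEVANT_WORDS]
    return hashlib.sha1(" ".join(kept).encode("utf-8")).hexdigest()


class FingerprintFilter:
    def __init__(self, threshold=THRESHOLD):
        self.threshold = threshold
        self.cache = Counter()  # fingerprint -> number of times seen

    def is_spam(self, message):
        fp = fingerprint(message)
        self.cache[fp] += 1
        # Spam once the fingerprint count exceeds the threshold.
        return self.cache[fp] > self.threshold
```

With a threshold of 2, three messages that differ only in their greeting share one fingerprint, so the third is flagged while an unrelated message is not. A real deployment would also need some policy for expiring cache entries, which the excerpt above does not describe.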

[0058] Figure 1 is a flow chart of the basic junk message filtering method of the embodiment of the present invention. As shown in Figure 1, the method comprises:

[0059] Step A, delete word...

Abstract

The invention discloses a junk short message (SM) filtering method comprising the following steps: step 10, deleting from the SM the words unrelated to its main text; step 20, calculating the file fingerprint of the SM after those words have been deleted; and step 30, if the number of occurrences of that file fingerprint exceeds a first preset threshold, judging the SM to be a junk SM. The method is based on file fingerprints and takes into account both the textual similarity of SMs and the way junk SMs propagate. Its advantages are that junk SMs are identified and filtered quickly, making it suitable for applications with strict real-time requirements, and that it is unaffected by changes to nonessential text in junk SMs, so it can cope effectively with their continuous variation.

Description

Technical field

[0001] The invention relates to the technical field of text information processing, and in particular to a method for filtering spam short messages.

Background

[0002] Short messages are one of the most frequently used means of exchanging information. At the same time, spam text messages have become increasingly widespread. Statistics show that about 30% of the huge volume of short messages is spam. For ordinary users, spam text messages seriously interfere with daily life; for operators, they occupy a large amount of traffic capacity and reduce the efficiency of information transmission.

[0003] Spam messages mainly consist of advertising, pornographic content, false prize notifications, fraud, and pranks, among which pornographic content and false prize notifications are the most common.

[0004] In the prior art, the most common filtering methods include:

[0005] 1. Spam SMS...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04W4/14, H04W12/12, G06F17/27, H04W12/128
Inventors: 牟小峰, 陈鹏
Owner: BEIJING FEINNO COMM TECH