Method and system for filtering and classifying short messages

A classification method, SMS technology

Inactive Publication Date: 2010-07-21
北京炎黄新星网络科技有限公司
Cites: 3; Cited by: 52

AI Technical Summary

Problems solved by technology

Moreover, the currently commonly used short message filtering function is to completely filter the overall



Examples


Example Embodiment

[0020] The steps of the method for filtering and secondary classification of short messages provided by the present invention are as follows:

[0021] Step 1: preprocess the text of the short message (keyword processing, blacklist and whitelist processing).

[0022] Before word segmentation, the content of the short message must be preprocessed, which includes deletion, standardization, and marking. Preprocessing aids semantic segmentation, improves the accuracy of word segmentation, marks some important characteristics of spam messages, and lays the foundation for subsequent analysis.

[0023] First, delete or mark the invalid parts of the message content to reduce interference and improve the efficiency of subsequent processing.

[0024] Next, normalize the content of the short message, for example by converting full-width digits and symbols into standard half-width forms, and identify certain special variations in the content of the short message, such...
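The full-width to half-width conversion mentioned in [0024] can be illustrated with a minimal sketch. The function name `to_halfwidth` is hypothetical and does not come from the patent. The sketch assumes that only the full-width ASCII variants (U+FF01 to U+FF5E) and the ideographic space (U+3000) need mapping, and it leaves all other characters, including Chinese characters, untouched.

```python
def to_halfwidth(text: str) -> str:
    """Map full-width ASCII variants and the ideographic space to half-width.

    Hypothetical helper for illustration; the patent does not specify
    how the conversion is implemented.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:
            # Ideographic (full-width) space becomes an ordinary space.
            out.append(" ")
        elif 0xFF01 <= code <= 0xFF5E:
            # Full-width '!' .. '~' sit at a fixed offset of 0xFEE0
            # from their ASCII counterparts.
            out.append(chr(code - 0xFEE0))
        else:
            # Everything else (e.g. Chinese characters) is kept as-is.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(to_halfwidth("电话：１３８００００００００"))
```

Normalizing digits this way means that a phone number or price written in full-width characters matches the same keyword and regular-expression rules as its half-width form.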


Abstract

The invention provides a spam short message filtering method that builds on traditional short message filtering by using both sending-volume characteristics and message content characteristics, combining Chinese-character regular expressions with an improved Bayesian algorithm. It improves the identification accuracy for spam short messages while reducing the false positive rate and the false negative rate, and it classifies the identified spam short messages a second time to support personalized user settings. The method comprises the following steps: (1) preprocessing the short message text; (2) matching on sending volume, that is, matching the sent content against the quantity sent; (3) performing word segmentation using Chinese-character regular expressions together with a dictionary and part-of-speech method; (4) classifying with a spam short message classifier, which calculates a probability with the improved Bayesian algorithm and identifies spam and non-spam short messages using message feature rules defined by Chinese-character regular expressions; and (5) classifying the identified spam short messages with a message-type classifier and processing them accordingly.
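As a rough illustration of the probability calculation in step (4), the sketch below implements a standard multinomial naive Bayes classifier with Laplace smoothing. It is not the "improved" Bayesian algorithm of the invention, whose details are not given in this excerpt. The class name, the labels `"spam"` and `"ham"`, and the sample tokens are all assumptions, and the input is assumed to be already segmented into words.

```python
import math
from collections import Counter


class NaiveBayesSmsClassifier:
    """Standard multinomial naive Bayes (illustrative, not the patent's variant)."""

    LABELS = ("spam", "ham")

    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.token_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()
        self.vocab = set()

    def train(self, tokens, label):
        """Record one pre-segmented message under the given label."""
        self.doc_counts[label] += 1
        self.token_counts[label].update(tokens)
        self.vocab.update(tokens)

    def _log_score(self, tokens, label):
        # Smoothed log prior, so an unseen label never yields log(0).
        total_docs = sum(self.doc_counts.values())
        score = math.log((self.doc_counts[label] + 1) / (total_docs + len(self.LABELS)))
        # Smoothed log likelihood of each token under this label.
        total_tokens = sum(self.token_counts[label].values())
        denom = total_tokens + self.alpha * max(len(self.vocab), 1)
        for tok in tokens:
            score += math.log((self.token_counts[label][tok] + self.alpha) / denom)
        return score

    def classify(self, tokens):
        """Return the label with the higher posterior log score."""
        return max(self.LABELS, key=lambda label: self._log_score(tokens, label))


if __name__ == "__main__":
    clf = NaiveBayesSmsClassifier()
    clf.train(["免费", "中奖", "点击"], "spam")
    clf.train(["发票", "代开", "免费"], "spam")
    clf.train(["明天", "开会", "时间"], "ham")
    clf.train(["晚上", "吃饭", "时间"], "ham")
    print(clf.classify(["免费", "中奖"]))
    print(clf.classify(["明天", "时间"]))
```

In the method described above, this probability score would be combined with feature rules defined by Chinese-character regular expressions; that combination is omitted here because the excerpt does not describe it.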

Description

Technical field:

[0001] The invention is used for intercepting spam short messages, and in particular relates to a method and a system for filtering and secondary classification of short messages in the short message center of a telecommunication operator.

Background technique:

[0002] Mobile text messages have become a very important form of communication for Chinese people. However, while enjoying this convenience at our fingertips, we face harassment by "spam text messages" at any time. Spam text messages are not only a nuisance; more seriously, they have become a tool for some criminals to disseminate illegal and criminal information.

[0003] At present, the commonly used SMS filtering methods and mechanisms mainly include filtering based on keywords, filtering based on content, and filtering based on SMS sending volume and sender analysis. Most of the filtering methods follow the general spam processing methods, such ...


Application Information

IPC(8): H04W4/14, G06F17/30
Inventor 柳呈文
Owner 北京炎黄新星网络科技有限公司