Short message clustering method and apparatus

A short message clustering method and apparatus in the field of special-purpose electronic digital data processing. It addresses the clustering errors that arise when keyword weights are not considered, and aims to improve clustering efficiency.

Pending Publication Date: 2018-12-07
FUJIAN NEWLAND SOFTWARE ENGINEERING CO LTD

AI Technical Summary

Problems solved by technology

If keyword weights are ignored and messages are matched only by the number of identical words, clustering often makes mistakes.
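The problem statement above can be illustrated with a small sketch. It compares a plain shared-word count with a weighted overlap. The example messages, the `weights` table, and both function names are hypothetical and chosen only for illustration; the page does not disclose the weighting scheme the invention actually uses.

```python
def shared_count(a, b):
    """Number of distinct words the two token lists have in common."""
    return len(set(a) & set(b))


def weighted_overlap(a, b, weights, default=1.0):
    """Sum of the weights of the shared words (weights are assumed, not from the patent)."""
    return sum(weights.get(word, default) for word in set(a) & set(b))


# Hypothetical tokenized messages.
a = ["your", "bank", "account", "was", "debited"]
b = ["your", "phone", "account", "was", "suspended"]
c = ["bank", "debited", "card", "today", "notice"]

# Hypothetical weights: keywords count more than function words.
weights = {"bank": 3.0, "debited": 3.0, "your": 0.2, "was": 0.2}

# Counting alone ranks b closer to a, because they share more words.
print(shared_count(a, b), shared_count(a, c))
# With weights, c ranks closer to a, because the shared words are keywords.
print(weighted_overlap(a, b, weights), weighted_overlap(a, c, weights))
```

Under these assumed weights, the two measures disagree about which message is closer to `a`, which is the kind of mistake the problem statement describes.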


Embodiment Construction

[0047] The present invention will be further described below in conjunction with the accompanying drawings.

[0048] Figure 1 shows an embodiment of a text structure-based short message clustering method. As shown in Figure 1, the method may include the following steps.

[0049] S1: Perform word segmentation preprocessing on the short messages in the short message set; order the words obtained by segmenting each short message according to the original sentence structure of that message, so that each short message becomes an ordered sequence of words.

[0050] S2: For any two short messages that have been preprocessed by word segmentation, calculate a similarity based on the short message structure, and construct a similarity matrix.

[0051] S3: Clustering process, including: comparing each value in the similarity matrix with a preset similarity threshold; when the similarity value is greater than the similarity threshold, the two short messages ar...
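Steps S1 to S3 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `segment` is a regex stand-in for a real word segmenter, `similarity` uses the order-aware ratio from Python's `difflib.SequenceMatcher` in place of the undisclosed structure-based similarity, and merging pairs with union-find is an assumption about how the pairwise decisions in S3 are combined into categories. All function names are hypothetical.

```python
import re
from difflib import SequenceMatcher


def segment(message):
    """S1 stand-in: split a message into tokens, keeping original word order."""
    return re.findall(r"\w+", message.lower())


def similarity(a, b):
    """Assumed order-aware similarity in [0, 1] between two token sequences."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def similarity_matrix(token_lists):
    """S2: symmetric matrix of pairwise similarities."""
    n = len(token_lists)
    sim = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim[i][j] = sim[j][i] = similarity(token_lists[i], token_lists[j])
    return sim


def cluster(sim, threshold):
    """S3: put two messages in the same category when similarity > threshold."""
    n = len(sim)
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] > threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


messages = [
    "Your account ending 1234 was debited 50 yuan",
    "Your account ending 5678 was debited 120 yuan",
    "Big sale this weekend visit our store",
]
sim = similarity_matrix([segment(m) for m in messages])
print(cluster(sim, threshold=0.6))  # lists of message indices per category
```

The two debit notifications share most of their word sequence and land in one category, while the advertisement stays on its own. The threshold value 0.6 is arbitrary here; the source only says the threshold is preset.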


Abstract

The invention relates to the field of classification, in particular to a short message clustering method based on text structure. The method comprises the following steps: S1, performing word segmentation preprocessing on the short messages in a short message set; S2, for any two short messages preprocessed by word segmentation, calculating a similarity based on the short message structure and constructing a similarity matrix; S3, carrying out a clustering process, comprising: comparing each value in the similarity matrix with a preset similarity threshold; when the similarity value is greater than the similarity threshold, classifying the two short messages corresponding to that similarity value into the same category, and finally obtaining a short message clustering result.

Description

Technical field

[0001] The invention relates to the field of data classification, in particular to a short message clustering method and device.

Background

[0002] In today's information age, smartphones are widespread and social applications of all kinds are popular. As a result, our mobile phones are bound to accounts for banks, electricity, daily services, online shopping websites, and so on, and notification text messages have become a part of people's lives. On the one hand, they let people follow their account information more transparently. On the other hand, the numerous marketing advertisements mixed in with notification text messages trouble mobile phone users. SMS clustering can group notification messages sensibly, which makes it easier for users to manage them and mine valuable information, and effectively relieves this annoyance.

[0003] Text messages differ from long texts. They have unique characteristics, such as fewe...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27, G06K9/62
CPC: G06F40/284, G06F18/23, G06F18/22
Inventor: 赵东见王雷居燕峰李福
Owner: FUJIAN NEWLAND SOFTWARE ENGINEERING CO LTD