Method for comment text classified extraction based on evolution clustering

A text classification and extraction method, applied to the classification and extraction of comment text based on evolutionary clustering. It addresses the weak sensitivity and stability with respect to abnormal data, the low clustering accuracy, and the complex calculations of existing methods, with the effect of reduced complexity, high clustering accuracy, and high sensitivity to abnormal data.

Inactive Publication Date: 2018-07-10
GUANGDONG KINGPOINT DATA SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

In practice, however, the characteristics and complexity of comment data mean that traditional clustering methods suffer from data sparseness, show weak sensitivity and stability with respect to abnormal data, are computationally complex, and have low clustering accuracy.


Examples


Embodiment Construction

[0028] The above and other technical features and advantages of the present invention will be described in more detail below in conjunction with the accompanying drawings.

[0029] Figure 1 is a schematic flow chart of the comment text classification and extraction method based on evolutionary clustering provided by the present invention. The method includes the following steps:

[0030] Step S1: collect comment samples, perform word segmentation on comment content, and remove stop words, that is, data preprocessing.

[0031] When segmenting comments and removing stop words, a word segmentation tool is used together with several stop-word lists merged into one. In particular, the stop-word list with the fastest vocabulary update cycle should be included.
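As a minimal sketch of step S1, the following shows how several stop-word lists might be merged and applied after tokenization. The function names `merge_stop_words` and `preprocess`, the sample stop words, and the regex tokenizer are all assumptions made for illustration; the patent does not name a specific segmentation tool, and for Chinese comments a dedicated word segmenter would be passed in as `tokenize` instead.

```python
import re


def merge_stop_words(*stop_lists):
    """Merge several stop-word lists into one normalized set."""
    merged = set()
    for words in stop_lists:
        merged.update(w.strip().lower() for w in words if w.strip())
    return merged


def preprocess(comment, stop_words, tokenize=None):
    """Segment a comment into tokens and drop stop words.

    `tokenize` is a hypothetical hook: any callable mapping a string to a
    list of tokens. The default regex splitter only suits space-delimited text.
    """
    if tokenize is None:
        tokenize = lambda text: re.findall(r"\w+", text.lower())
    return [tok for tok in tokenize(comment) if tok not in stop_words]


# Two illustrative stop-word lists (hypothetical contents).
stop_words = merge_stop_words(["the", "a", "is"], ["is", "very", "and"])
tokens = preprocess("The screen is very bright and the battery is great", stop_words)
print(tokens)
```

Keeping the tokenizer as a parameter lets the same filtering step be reused when the segmentation tool changes.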

[0032] Step S2: Process the text features and remove low-relevance or irrelevant feature items. The χ² statistical method is used to process the comment text. The formula for the χ² statistic is:

[0033] ...
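A minimal sketch of χ² feature scoring, assuming the widely used term/category contingency form χ²(t, c) = N (AD − BC)² / ((A + C)(B + D)(A + B)(C + D)); the patent's exact formula may differ. Here A counts comments in category c that contain term t, B counts comments outside c that contain t, C counts comments in c without t, D counts comments outside c without t, and N = A + B + C + D. The category labels, the sample data, and the function name `chi_square` are assumptions for illustration.

```python
def chi_square(term, category, docs):
    """Score how strongly `term` is associated with `category`.

    docs: list of (set_of_tokens, label) pairs.
    """
    a = b = c = d = 0
    for tokens, label in docs:
        has_term = term in tokens
        in_category = label == category
        if has_term and in_category:
            a += 1
        elif has_term:
            b += 1
        elif in_category:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    # A zero denominator means the term or the category never varies.
    if denominator == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical labeled comments after step S1.
docs = [
    ({"great", "battery"}, "pos"),
    ({"great", "screen"}, "pos"),
    ({"poor", "battery"}, "neg"),
    ({"poor", "screen"}, "neg"),
]
print(chi_square("great", "pos", docs))
print(chi_square("battery", "pos", docs))
```

Terms whose score falls below a chosen threshold would be treated as low-relevance feature items and dropped.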


Abstract

The invention provides a method for comment text classification and extraction based on evolutionary clustering. The method comprises: step S1, collecting comment samples, performing word segmentation on the comment content, and removing stop words; step S2, processing the text features and removing feature items of low relevance or no relevance; step S3, assigning different weights to the text feature items according to a text emotion vector space model; step S4, clustering the text features with a k-medoids evolutionary clustering algorithm; step S5, counting the clustering results in each time period to draw a conclusion. Compared with the prior art, the method addresses the data sparsity problem that text features may face and reduces computational complexity. It has high sensitivity to abnormal data, good stability, and relatively high clustering precision.
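Steps S4 and S5 can be illustrated with a minimal sketch of plain k-medoids followed by a per-period count. This is not the patent's evolutionary variant: the way clusters are carried across time periods is not described in this text and is not modeled here. The function names, the deterministic initialization with the first k points, and the one-dimensional sample data are all assumptions for illustration.

```python
from collections import Counter


def assign(points, medoids, distance):
    """Map each medoid index to the indices of the points nearest to it."""
    clusters = {m: [] for m in medoids}
    for i, p in enumerate(points):
        nearest = min(medoids, key=lambda m: distance(p, points[m]))
        clusters[nearest].append(i)
    return clusters


def k_medoids(points, k, distance, max_iter=100):
    """Alternate between assignment and medoid update until stable."""
    medoids = list(range(k))  # assumption: start from the first k points
    for _ in range(max_iter):
        clusters = assign(points, medoids, distance)
        new_medoids = []
        for m, members in clusters.items():
            if not members:
                new_medoids.append(m)
                continue
            # The new medoid minimizes total distance to its cluster members.
            best = min(
                members,
                key=lambda c: sum(distance(points[c], points[j]) for j in members),
            )
            new_medoids.append(best)
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    return assign(points, medoids, distance)


def count_by_period(periods, clusters):
    """Count how many points of each time period fall in each cluster."""
    counts = Counter()
    for medoid, members in clusters.items():
        for i in members:
            counts[(periods[i], medoid)] += 1
    return counts


# Hypothetical one-dimensional feature values and their time periods.
points = [1, 2, 4, 20, 22, 25]
periods = ["w1", "w1", "w2", "w1", "w2", "w2"]
clusters = k_medoids(points, 2, lambda a, b: abs(a - b))
counts = count_by_period(periods, clusters)
print(clusters)
print(counts)
```

Because medoids are actual data points, a single outlying comment shifts a cluster's representative less than it would shift a mean, which is the usual reason for preferring k-medoids when abnormal data is a concern.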

Description

Technical field

[0001] The invention relates to the technical field of text classification and extraction, in particular to a comment text classification and extraction method based on evolutionary clustering.

Background technique

[0002] With the rapid development of Internet technology, public opinion media and platforms have become the places where hot events are generated and disseminated. Every day, a large number of netizens take part in discussions and generate a large amount of comment data. Quickly obtaining the emotional distribution of netizens and the evolution of their views from these data would greatly help in making targeted marketing strategies.

[0003] The traditional clustering method is an unsupervised learning method, which is mainly used to process static data sets. However, in real situations, due to the characteristics and complexity of comment data, traditional clustering methods face the problem of data "sparseness", and the sensitivity and st...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/35
Inventor: 侯大勇, 李青海, 简宋全, 邹立斌
Owner: GUANGDONG KINGPOINT DATA SCI & TECH CO LTD