On-line classroom discussion short text real-time grouping method and system based on text clustering

A text clustering and grouping method, applied in the computer field, that overcomes low clustering accuracy and improves effectiveness.

Active Publication Date: 2018-03-30
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

Through text preprocessing, keyword mining, rough clustering of quasi-frequent itemsets combined with TF-IDF to calculate the text distance between clusters, iteratively update the centroid, investi



Examples


Embodiment Construction

[0056] Traditional clustering methods perform poorly on short texts from online classroom discussions, because the text features are sparse and each word contributes little semantic information. To address this low accuracy, an embodiment of the present invention provides an instant grouping method for online classroom discussion short texts. The method mines frequent itemsets, filters the quasi-frequent itemsets, determines the initial clusters by rough clustering on semantic similarity, adaptively determines the number of clusters based on survey statistics, and calculates the distance between texts within each cluster based on TF-IDF while iteratively updating the centroid. This effectively improves the accuracy of the K-means algorithm in short text clustering and brings the clustering results closer to actual needs.
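The TF-IDF distance and iterative centroid update described in [0056] can be illustrated with a minimal sketch. It is not the patented implementation: the smoothed IDF formula, the cosine distance, the function names, and the sample texts are all assumptions made for illustration, and the number of clusters is simply taken from the seed groups rather than from survey statistics.

```python
import math
from collections import Counter, defaultdict

def tfidf_vectors(docs):
    """Map each tokenised text to a sparse {word: tf-idf} dict (smoothed IDF, an assumption)."""
    n = len(docs)
    df = Counter(w for words in docs for w in set(words))
    vectors = []
    for words in docs:
        tf = Counter(words)
        length = max(len(words), 1)  # guard against empty texts
        vectors.append({w: (c / length) * (math.log((1 + n) / (1 + df[w])) + 1.0)
                        for w, c in tf.items()})
    return vectors

def cosine_distance(a, b):
    """1 - cosine similarity between two sparse vectors."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)

def centroid(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = defaultdict(float)
    for vec in vectors:
        for w, x in vec.items():
            total[w] += x
    return {w: x / len(vectors) for w, x in total.items()}

def kmeans_from_seeds(vectors, seeds, max_iter=20):
    """K-means where the initial centroids come from rough seed groups (lists of text indices)."""
    centroids = [centroid([vectors[i] for i in group]) for group in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centroids)),
                          key=lambda k: cosine_distance(vec, centroids[k]))
                      for vec in vectors]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [vectors[i] for i, lab in enumerate(labels) if lab == k]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[k] = centroid(members)
    return labels

# Hypothetical, already-preprocessed discussion texts.
docs = [
    ["sort", "list", "python"],
    ["python", "list", "sort", "slow"],
    ["exam", "date", "course"],
    ["course", "exam", "date", "moved"],
    ["python", "sort", "fast"],
]
seeds = [[0], [2]]  # one rough cluster per topic, assumed to come from the coarse step
labels = kmeans_from_seeds(tfidf_vectors(docs), seeds)
print(labels)
```

Seeding the centroids from the rough clusters, instead of choosing them at random, is the part of the sketch that mirrors the idea in the text: the coarse grouping fixes both K and the starting points, and the TF-IDF iteration only refines them.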

[0057] As shown in Figure 1, a method for instant grouping of online classroom discussion short texts provided by an embodiment of the pre...


Abstract

The invention discloses a real-time grouping method and system for online classroom discussion short texts based on text clustering. The method comprises the steps of: performing word segmentation and stop-word removal on the text data; obtaining the keywords of all text items, counting them, and storing them in a keyword table keyTable; mining frequent item sets on the preprocessed text set, filtering out all sub-item quasi-frequent item sets, and performing coarse clustering using a quasi-frequent item set similarity calculation rule defined with the keyword table; mapping the points closest to the center of each cluster to the text set, calculating the TF-IDF values of the text word sets in all clusters, and iterating the centroid to the optimum according to distance; and pushing the resulting K clusters in real time as groups. Combining the keyword table with the quasi-frequent item set similarity calculation rule effectively improves the clustering accuracy for online discussion short texts. The quasi-frequent item set filtering strategy effectively improves clustering efficiency and speeds up the clustering method. The text content discussed in an online classroom is automatically classified into multiple themes and grouped according to those themes.
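The first steps of the abstract (preprocessing, building keyTable, mining frequent item sets, and filtering sub-item sets) can be sketched as follows. This is a rough illustration under stated assumptions, not the patented procedure: the stop-word list, the whitespace tokenizer, the brute-force miner, the `min_support` threshold, and the reading of "filtering sub-item sets" as keeping only maximal item sets are all hypothetical, and the quasi-frequent similarity rule itself is not reproduced because the source does not spell it out.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stop-word list; a real system would use a full list and a proper word segmenter.
STOP_WORDS = {"the", "a", "is", "of", "to", "and", "in", "how", "do", "i"}

def preprocess(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]

def build_key_table(docs):
    """Count in how many texts each keyword occurs (a stand-in for keyTable)."""
    key_table = Counter()
    for words in docs:
        key_table.update(set(words))
    return key_table

def frequent_itemsets(docs, min_support, max_size=3):
    """Brute-force enumeration of word sets that occur in at least min_support texts."""
    counts = Counter()
    for words in docs:
        unique = sorted(set(words))
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    return {items: c for items, c in counts.items() if c >= min_support}

def drop_sub_itemsets(itemsets):
    """Keep only item sets that are not strictly contained in another frequent item set."""
    return {items: c for items, c in itemsets.items()
            if not any(items < other for other in itemsets)}

# Hypothetical classroom discussion messages.
texts = [
    "how do I sort a list in python",
    "python list sort is slow",
    "the exam date of the course",
    "course exam date moved",
]
docs = [preprocess(t) for t in texts]
key_table = build_key_table(docs)
maximal = drop_sub_itemsets(frequent_itemsets(docs, min_support=2))
for items, support in maximal.items():
    print(sorted(items), support)
```

Each surviving item set can then act as the label of one rough cluster, with every text assigned to the item set it matches best. Brute-force enumeration is only practical because discussion messages are short; a longer corpus would call for Apriori or FP-growth.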

Description

Technical field

[0001] The present invention relates to the field of computer technology, in particular to a text clustering-based instant grouping method and system for online classroom discussion short texts.

Background technique

[0002] Online cloud classroom platforms that integrate the Internet with traditional educational resources have emerged in recent years, and major universities and educational institutions have set up cloud classroom platforms one after another. A cloud classroom gives users an instant, interactive online classroom, and is very popular among online learners because of its efficiency, convenience, and immediacy. In the interactive part, real-time grouping of online classroom discussion content makes the discussion clearer and effectively improves the reading efficiency of online learners. Data mining methods are often used for this grouping.

[0003] In the prior art, a common method for grouping unmarked...


Application Information

IPC(8): G06F17/30; G06F17/27
CPC: G06F16/3335; G06F16/3346; G06F16/35; G06F40/284
Inventor: 陆以勤, 夏儒斐, 黄国洪
Owner SOUTH CHINA UNIV OF TECH