A text clustering integration method and system based on a three-layer weighted model

A text clustering integration technique, applicable to text database clustering/classification, unstructured text data retrieval, character and pattern recognition, and related fields, that addresses the problems of high computational complexity and unacceptable computational cost.

Active Publication Date: 2022-06-03
YANCHENG INST OF TECH +1

AI Technical Summary

Problems solved by technology

[0007] The disadvantage of the existing technology is that there is still no unified framework for weighting the different research objects (points, clusters, partitions), so the impact of each weighting method on the final clustering result cannot be explored systematically.
In addition, although fine-tuning the CA matrix can yield better clustering results, its computational complexity is high, and the computational cost is unacceptable for large-scale data.
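
To make the cost argument concrete, the following minimal sketch builds a CA matrix for a handful of documents. It assumes that "CA" denotes the co-association matrix commonly used in cluster-ensemble work, where entry (i, j) is the fraction of base partitions that place points i and j in the same cluster; the excerpt does not define the term, and the function name and sample partitions are hypothetical.

```python
def co_association(partitions):
    """Return the n x n co-association matrix for a list of label lists.

    Assumption: CA[i][j] is the fraction of base partitions in which
    points i and j share a cluster (not defined in the source excerpt).
    """
    n = len(partitions[0])
    m = len(partitions)
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    # Normalize co-occurrence counts by the number of partitions.
    return [[c / m for c in row] for row in counts]


# Three hypothetical base partitions of five documents.
partitions = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
ca = co_association(partitions)
```

The matrix has one row and one column per document, so its storage grows with the square of the number of documents. That quadratic growth is the scaling concern raised in paragraph [0007].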

Method used



Examples


Embodiment Construction

[0088] The preferred embodiments of the present invention will be described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described herein are only used to illustrate and explain the present invention, but not to limit the present invention.

[0089] An embodiment of the present invention provides a text clustering integration method based on a three-layer weighted model which, as shown in Figure 1, includes:

[0090] Step S1: obtaining a text set;

[0091] Step S2: preprocessing the text set;

[0092] Step S3: performing clustering integration on the preprocessing results based on the k-means algorithm;

[0093] Step S4: constructing a three-layer weighted model to optimize the clustering integration result;

[0094] Step S5: evaluating the optimized clustering integration result.
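
Steps S1 to S3 and S5 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented procedure: the whitespace tokenizer, the normalized term-count vectors, the use of different random seeds to obtain several k-means base partitions, and the purity score used for evaluation are all choices made here for brevity, and every function name is hypothetical. Step S4 (the three-layer weighted model) and the final consensus step are not shown because the excerpt does not specify them.

```python
import math
import random
from collections import Counter


def preprocess(texts):
    """S2 (assumed form): whitespace tokens -> L2-normalized term-count vectors."""
    docs = [t.lower().split() for t in texts]
    vocab = sorted({w for d in docs for w in d})
    vectors = []
    for d in docs:
        counts = Counter(d)
        v = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors


def kmeans(vectors, k, seed, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per vector."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, v in enumerate(vectors):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centers[c])),
            )
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def base_partitions(vectors, k, m):
    """S3 (first half, assumed): m k-means runs with different seeds."""
    return [kmeans(vectors, k, seed) for seed in range(m)]


def purity(labels, truth):
    """S5 (assumed metric): share of documents in their cluster's majority class."""
    hit = 0
    for c in set(labels):
        members = [t for lab, t in zip(labels, truth) if lab == c]
        hit += Counter(members).most_common(1)[0][1]
    return hit / len(labels)


# S1: a toy text set with hypothetical ground-truth topics.
texts = [
    "cat dog pet", "dog pet cat animal", "stock market trade",
    "market trade price stock", "pet cat", "price stock",
]
truth = [0, 0, 1, 1, 0, 1]
vectors = preprocess(texts)
parts = base_partitions(vectors, k=2, m=3)
scores = [purity(p, truth) for p in parts]
```

Each entry of `parts` is one base partition; an ensemble method would combine them into a single consensus clustering before the evaluation in S5.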

[0095] The working principle and beneficial effects of the above technical solutions are as follows:

[0096] Obtain the text set that needs to be c...


Abstract

The present invention provides a text clustering integration method and system based on a three-layer weighted model. The method includes: step S1: obtaining a text set; step S2: preprocessing the text set; step S3: performing clustering integration on the preprocessing results based on the k-means algorithm; step S4: constructing a three-layer weighted model to optimize the clustering integration result; step S5: evaluating the optimized clustering integration result. The three-layer weighted model is a unified framework for weighting at three levels (points, clusters, and partitions); it explores the impact of each weighting method on the final clustering result and fills a gap in the existing technology. In addition, the three-layer weighting of points, clusters, and partitions is realized in turn by fine-tuning the hypergraph adjacency matrix H. The matrix being processed is significantly smaller than in methods based on the CA matrix, so the computational complexity is lower and the efficiency advantage is clear when dealing with large-scale data sets.
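
The size argument in the abstract can be illustrated with a small sketch. It assumes the usual cluster-ensemble construction in which H has one row per point and one 0/1 column per cluster of each base partition. The multiplicative weighting shown is only one plausible reading of "fine-tuning H"; the abstract does not give the actual weighting formulas, so `apply_weights` and all weight values below are hypothetical.

```python
def hypergraph_matrix(partitions):
    """Build H: one row per point, one 0/1 column per cluster of each partition."""
    n = len(partitions[0])
    columns = []  # (partition index, cluster label) for each column of H
    for p, labels in enumerate(partitions):
        for c in sorted(set(labels)):
            columns.append((p, c))
    h = [[1.0 if partitions[p][i] == c else 0.0 for (p, c) in columns]
         for i in range(n)]
    return h, columns


def apply_weights(h, columns, point_w, cluster_w, partition_w):
    """Hypothetical weighting: scale H[i][j] by point, cluster and partition weights."""
    return [[h[i][j] * point_w[i] * cluster_w[j] * partition_w[columns[j][0]]
             for j in range(len(columns))]
            for i in range(len(h))]


# Three hypothetical base partitions of five documents.
partitions = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
h, columns = hypergraph_matrix(partitions)

# Down-weight the first partition; leave points and clusters unweighted.
weighted = apply_weights(
    h, columns,
    point_w=[1.0] * len(h),
    cluster_w=[1.0] * len(columns),
    partition_w=[0.5, 1.0, 1.0],
)

# Entry counts at a larger, hypothetical scale:
# 100,000 documents, 10 base partitions of 20 clusters each.
n_docs, n_parts, k = 100_000, 10, 20
h_entries = n_docs * n_parts * k   # 20,000,000 entries in H
ca_entries = n_docs * n_docs       # 10,000,000,000 entries in an n x n matrix
```

On the five-document toy data H is not smaller than an n x n matrix; the advantage appears only when the number of documents far exceeds the total number of clusters across all base partitions, as in the entry counts at the end of the block.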

Description

Technical field

[0001] The invention relates to the technical field of text clustering integration, in particular to a text clustering integration method and system based on a three-layer weighted model.

Background technique

[0002] At present, cluster analysis is one of the hot topics in machine learning research. It has been widely used in data compression, information retrieval, speech recognition, character recognition, image segmentation and text clustering, and is receiving more and more attention in biology, geology, geography, anomalous data detection and other fields. Cluster analysis is one of the methods of multivariate statistical analysis, and it is also an important branch of unsupervised pattern classification in statistical pattern recognition. Its goal is to group similar samples into one class as far as possible, while dissimilar samples should fall into different classes. Traditional clustering methods emerge in an endless stream; however, none of them can successfully identify clusters of di...

Claims


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35, G06K9/62
CPC: Y02D10/00
Inventor: 徐森, 李娜, 徐秀芳, 花小朋, 皋军, 安晶, 蔡娜, 陈思博
Owner: YANCHENG INST OF TECH