Deep semi-supervised text clustering method based on pairwise constraint and cluster guidance

A semi-supervised clustering technology, applied in the field of data processing, that addresses poor clustering performance and robustness and achieves strengthened guided learning, improved robustness, and improved accuracy.

Pending Publication Date: 2022-03-25
CENT SOUTH UNIV

AI Technical Summary

Problems solved by technology

[0006] In view of this, the embodiments of the present disclosure provide a deep semi-supervised text clustering method based on pairwise constraints and cluster guidance, which at least partially solves the problems of poor clustering performance and poor robustness in the prior art.

Method used


Examples


Embodiment Construction

[0043] Embodiments of the present disclosure will be described in detail below in conjunction with the accompanying drawings.

[0044] Embodiments of the present disclosure are described below through specific examples, and those skilled in the art can readily understand other advantages and effects of the present disclosure from the contents of this specification. The described embodiments are only some of the embodiments of the present disclosure, not all of them. The present disclosure can also be implemented or applied through other specific implementations, and the details in this specification can be modified or changed for different viewpoints and applications without departing from the spirit of the present disclosure. It should be noted that, where no conflict arises, the following embodiments and the features in them can be combined with each other. Based on the embodiments in the present disclosure, a...


Abstract

The embodiment of the invention provides a deep semi-supervised text clustering method based on pairwise constraints and cluster guidance, which belongs to the technical field of data processing. The method comprises the following steps: preprocessing and vectorizing the target text data to obtain a multi-dimensional vector; learning hidden-layer features of the target text data from the multi-dimensional vector, and inputting the hidden-layer features into a preset algorithm for clustering to obtain initial cluster centers; calculating the clustering loss; generating the pairwise constraint loss using cross entropy; calculating the cluster distribution loss over all labeled clusters and all unlabeled clusters; and calculating a joint loss function from the reconstruction error, the clustering loss, the pairwise constraint loss, and the cluster distribution loss, then iterating on the joint loss function and obtaining the clustering result once a preset condition is reached. With this scheme, the supervision information in the labels is fully mined, guided learning between labeled and unlabeled clusters is strengthened, the robustness of the deep semi-supervised clustering model is improved, and the text clustering accuracy is improved as well.
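The last two loss-related steps of the abstract can be illustrated with a minimal sketch. The abstract states only that the pairwise constraint loss is generated with cross entropy and that four terms are combined into a joint loss; it does not give the formulas. The version below therefore rests on assumptions: the probability that two samples share a cluster is taken as the dot product of their soft assignments, the joint loss is taken as a weighted sum, and the names `pairwise_constraint_loss`, `joint_loss`, `alpha`, `beta`, and `gamma` are hypothetical and do not come from the patent.

```python
import math

EPS = 1e-12  # guards against log(0)


def pairwise_constraint_loss(q, must_link, cannot_link):
    """Cross-entropy loss over constrained pairs (assumed formulation).

    q: list of soft cluster assignments, one row per sample, each row sums to 1.
    must_link / cannot_link: lists of (i, j) index pairs.
    """
    total, count = 0.0, 0
    for pairs, same in ((must_link, True), (cannot_link, False)):
        for i, j in pairs:
            # Assumed probability that samples i and j fall in the same cluster.
            p_same = sum(a * b for a, b in zip(q[i], q[j]))
            # Binary cross entropy with target 1 (must-link) or 0 (cannot-link).
            p = p_same if same else 1.0 - p_same
            total += -math.log(max(p, EPS))
            count += 1
    return total / count if count else 0.0


def joint_loss(recon, cluster, pairwise, cluster_dist,
               alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted sum of the four terms named in the abstract.

    The weights alpha, beta, gamma are hypothetical hyperparameters.
    """
    return recon + alpha * cluster + beta * pairwise + gamma * cluster_dist


# Toy example: three samples, two clusters.
q = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
l_pair = pairwise_constraint_loss(q, must_link=[(0, 1)], cannot_link=[(0, 2)])
l_total = joint_loss(recon=0.5, cluster=0.3, pairwise=l_pair, cluster_dist=0.2)
```

Under this formulation, a satisfied constraint contributes a loss close to zero and a violated one contributes a large penalty, which is how the labeled pairs steer the cluster assignments during training.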

Description

Technical field

[0001] The embodiments of the present disclosure relate to the technical field of data processing, and in particular to a deep semi-supervised text clustering method based on pairwise constraints and cluster guidance.

Background technique

[0002] At present, with the information overload caused by the continuous expansion of social networks, how to organize text sensibly and make efficient use of the valuable information in it has become a focus of the data mining field. Using clustering technology to integrate and partition text can assist in-depth mining, analysis, and prediction over massive texts, and is also an important prerequisite for discovering and extracting key knowledge. It has important applications in areas such as entity classification, sentiment analysis, intelligent recommendation, and automatic question answering.

[0003] Existing clustering methods such as K-means are prone to cause the "cu...

Claims


Application Information

IPC(8): G06F16/35, G06K9/62
CPC: G06F16/35, G06F18/2155, G06F18/23213, G06F18/22
Inventor: 王雅琳, 邹江枫, 王凯, 郭静宇, 袁小锋, 隋庆开, 彭渝彬, 桂卫华
Owner CENT SOUTH UNIV