
Deep semi-supervised text clustering method, device and medium combined with user intention

A text clustering technology guided by user intent, applied in the fields of text processing and information extraction. It addresses problems such as neglected user intent, weak intent supervision, and the inability to accurately express a user's clustering intent, and is stated to have both theoretical significance and practical value.

Pending Publication Date: 2022-06-24
GUIZHOU UNIV

AI Technical Summary

Problems solved by technology

However, these methods have the following deficiencies. First, text representations differ: in practical applications, different user clustering intentions should place emphasis on different aspects of the text representation. Second, intent supervision is weak: the supervision information can guide the structural division of only a small number of text samples and cannot accurately express the user's overall clustering intention. Finally, the user's intention is ignored: for the same batch of data samples, it is impossible to obtain different clustering results that match the user's intention according to the specific application scenario and downstream task requirements.



Examples


Embodiment

[0042] Example: as shown in Figures 1 to 3, a deep semi-supervised text clustering method combined with user intent includes the following steps. Step 1: construct an intent information matrix. Step 2: perform vector mapping on the text and extract features from the text vectors through a neural network. Step 3: use the intent information matrix to optimize the encoder and obtain a better feature representation. Step 4: use KL-divergence-assisted optimization to obtain the initial clustering result. Step 5: build an optimization function and use the intent information to guide the clustering direction of the clusters.
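Step 4 can be illustrated with a minimal sketch. The excerpt does not give the patent's formulas, so the code below assumes the widely used Deep Embedded Clustering formulation: a Student's t soft assignment q, a sharpened target distribution p, and the loss KL(p || q). The names soft_assign, target_distribution, and kl_divergence, and the toy embeddings, are hypothetical and are not taken from the patent.

```python
import numpy as np

def soft_assign(Z, centers, alpha=1.0):
    """Student's t similarity between embeddings Z (n x d) and centers (k x d).

    Assumption: DEC-style soft assignment; the patent excerpt does not
    specify the kernel it uses.
    """
    d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpen q and normalise by soft cluster frequency (assumed, as in DEC)."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) summed over all samples and clusters."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Toy encoder outputs: two well-separated groups in a 2-D feature space.
Z = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])

q = soft_assign(Z, centers)
p = target_distribution(q)
loss = kl_divergence(p, q)   # the quantity a training loop would minimise
labels = q.argmax(axis=1)    # initial hard clustering result
```

In a full pipeline, the loss would be backpropagated through the encoder and the cluster centers. This sketch only evaluates it once on fixed embeddings.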

[0043] In step 1, the association relationships between data points are mined from the pairwise constraint information given by the user to construct an intent information matrix R of size n*n, where n is the size of the dataset. Given that X is the original text data sample, the value of each point x_ij in the matrix represents t...
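As a rough illustration of step 1, the sketch below builds an n*n matrix from user-supplied pairwise constraints. The source text is truncated before it defines the entry values, so the encoding here (+1 for must-link, -1 for cannot-link, 0 for unknown) is an assumption. So is the reading of "mining association relationships" as propagating must-link constraints transitively. The function name build_intent_matrix is hypothetical.

```python
import numpy as np

def build_intent_matrix(n, must_link, cannot_link):
    """Build a symmetric n x n constraint matrix.

    Assumed encoding (not given in the source): +1 must-link,
    -1 cannot-link, 0 no information. Conflicting constraints are
    not detected; cannot-link entries simply overwrite.
    """
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge must-link pairs so that linked points share one root.
    for i, j in must_link:
        parent[find(i)] = find(j)

    roots = np.array([find(i) for i in range(n)])
    R = np.zeros((n, n), dtype=int)
    R[roots[:, None] == roots[None, :]] = 1  # same must-link group

    # A cannot-link between two points separates their whole groups.
    for i, j in cannot_link:
        a = roots == roots[i]
        b = roots == roots[j]
        R[np.ix_(a, b)] = -1
        R[np.ix_(b, a)] = -1
    return R

R = build_intent_matrix(5, must_link=[(0, 1), (1, 2)], cannot_link=[(2, 3)])
```

Here points 0, 1, and 2 form one must-link group, so the single cannot-link (2, 3) also marks pairs (0, 3) and (1, 3) as cannot-link, while point 4 stays unconstrained.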



Abstract

The invention provides a deep semi-supervised text clustering method, device and medium combined with user intention. The method comprises the following steps: 1, constructing an intention information matrix; 2, performing vector mapping on the text and extracting features from the text vectors through a neural network; 3, optimizing an encoder with the intention information matrix to obtain a better feature representation; 4, performing KL-divergence-assisted optimization to obtain an initial clustering result; and 5, constructing an optimization function and using the intention information to guide the clustering direction of the clusters. Given pairwise constraint supervision information, the method mines intention information with a deep neural network, fuses it into the feature representation, and uses it to supervise the clustering process. This addresses three problems of semi-supervised text clustering: differing text representations, insufficient supervision strength, and neglect of user intention. As a result, the accuracy of the clustering result is improved and the result is better suited to downstream tasks, making the method suitable for wider application.

Description

Technical field

[0001] The invention belongs to the technical fields of information extraction and text processing, and in particular relates to a deep semi-supervised text clustering method, device and medium combined with user intention.

Background

[0002] With the advent of the information age, large-scale data is presented to people in the form of text. Text clustering, which groups similar text documents into the same class, is one of the most important algorithms in the field of data mining. Traditional unsupervised text clustering divides clusters according to the similarity between documents and does not require any data attributes when dividing. With the diversification of application scenarios and the differentiated development of downstream tasks, different users have different clustering intentions for the same batch of data, and users need to guide the clustering results according to their intentions. For example, for the same batch of...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F40/216, G06F40/284, G06F40/30, G06K9/62
CPC: G06F16/35, G06F40/216, G06F40/284, G06F40/30, G06F18/2321
Inventor: 黄瑞章, 李静楠, 秦永彬, 陈艳平
Owner: GUIZHOU UNIV