
Small sample text classification method and system based on semi-supervised learning

A semi-supervised learning and text classification technique, applied in the field of semi-supervised text classification. It addresses the high training cost of existing methods, their time-consuming and labor-intensive manual labeling, and their inability to train on small or partially labeled data sets. It achieves flexible application and saves the cost of manual labeling.

Pending Publication Date: 2022-02-11
GUANGDONG UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0004] However, existing text classification methods based on deep learning require collecting sufficient training data and labeling it manually, which is time-consuming and labor-intensive.
Moreover, classifying text data in some fields requires collecting specialized data sets, and it is difficult to label all of the data.
[0005] In the training method and text classification method of a text classification model disclosed in the prior art, the text samples of each of multiple tasks are input into the corresponding private feature extractor and a public feature extractor, and the private feature extractors and classifiers for the different tasks are trained simultaneously to obtain the trained text classification model. However, this method cannot be trained when the amount of data is small and the data labels are incomplete. It requires collecting a large amount of training data and labeling it manually, so the training cost is high.



Examples


Embodiment 1

[0045] A few-shot text classification method based on semi-supervised learning, as shown in Figure 1, includes the following steps:

[0046] S1. Obtain the text to be classified;

[0047] S2. Input the text to be classified into the pre-trained lookup table, and map the text to be classified into a text representation through the lookup table;

[0048] S3. Input the text representation into the multi-layer perceptron to obtain the text label, and use the text label as the text classification result to complete the classification of the small sample text.
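Steps S1 to S3 can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and does not come from the source: the toy vocabulary and vector values in LOOKUP_TABLE, the mean pooling of token vectors into one text representation, the zero vector for unknown tokens, and the hand-picked weights of the two-layer perceptron. In the described method the lookup table is pre-trained and the perceptron is learned from data.

```python
import math

# Hypothetical toy lookup table: token -> 3-dimensional vector (made-up values).
LOOKUP_TABLE = {
    "good": [0.9, 0.1, 0.0],
    "bad": [0.0, 0.2, 0.9],
    "movie": [0.3, 0.3, 0.3],
}
UNKNOWN = [0.0, 0.0, 0.0]  # assumption: unseen tokens map to a zero vector


def text_representation(text):
    """S2: map text to a fixed-size vector by averaging token vectors (assumed pooling)."""
    tokens = text.lower().split()
    if not tokens:
        return list(UNKNOWN)
    vectors = [LOOKUP_TABLE.get(token, UNKNOWN) for token in tokens]
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(len(UNKNOWN))]


def dense(x, weights, bias):
    """One fully connected layer: weights is a list of rows, one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    peak = max(x)
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


# Hand-picked weights for illustration only; a real perceptron would be trained.
W1 = [[1.0, 0.0, -1.0], [-1.0, 0.0, 1.0]]
B1 = [0.0, 0.0]
W2 = [[1.0, 0.0], [0.0, 1.0]]
B2 = [0.0, 0.0]
LABELS = ["positive", "negative"]


def classify(text):
    """S1 to S3: text -> lookup-table representation -> perceptron -> label."""
    representation = text_representation(text)
    hidden = relu(dense(representation, W1, B1))
    probabilities = softmax(dense(hidden, W2, B2))
    return LABELS[probabilities.index(max(probabilities))]
```

Calling classify("good movie") runs the three steps end to end on one input string. Swapping in a trained lookup table and trained weights would not change the control flow.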

[0049] In this embodiment, a lookup table is used to obtain the text representation of the text to be classified, and the multi-layer perceptron then obtains the text label from that representation as the text classification result. The present invention is used for classifying text data with a small amount of data and incomplete data labels. It is not necessary to label a large amount of text data, which saves the cost of manu...

Embodiment 2

[0051] A few-shot text classification method based on semi-supervised learning, as shown in Figure 1, includes the following steps:

[0052] S1. Obtain the text to be classified;

[0053] S2. Input the text to be classified into the pre-trained lookup table, and map the text to be classified into a text representation through the lookup table;

[0054] S3. Input the text representation into the multi-layer perceptron to obtain the text label, and use the text label as the text classification result to complete the classification of the small sample text.

[0055] The lookup table described in step S2 is a trained lookup table, obtained by training an initial lookup table. The trained lookup table is obtained as follows: construct an initial lookup table, train the initial lookup table through a variational autoencoder, and save the lookup table after training.
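The source text available here does not give the details of the variational autoencoder, so the following is only a sketch of the generic variational autoencoder objective, not the patent's architecture. It shows the reparameterized sampling of a latent vector and a loss made of a reconstruction term plus a KL term. The squared-error reconstruction term, the diagonal Gaussian posterior, and all function names are assumptions. How the lookup table enters the model (for example, as the embedding layer whose vectors are updated while this loss is minimized) is likewise an assumption.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), where sigma = exp(0.5 * log_var)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence from a diagonal Gaussian N(mu, sigma^2) to N(0, I)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))


def squared_error(x, x_hat):
    """Assumed reconstruction term: sum of squared differences."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def vae_loss(x, x_hat, mu, log_var):
    """Generic VAE training loss: reconstruction error plus KL regularizer."""
    return squared_error(x, x_hat) + kl_to_standard_normal(mu, log_var)


# Usage: one latent sample and one loss value for made-up numbers.
rng = random.Random(0)
z = reparameterize([0.5, -0.5], [0.0, 0.0], rng)
loss = vae_loss([1.0, 2.0], [0.8, 2.1], [0.5, -0.5], [0.0, 0.0])
```

The reconstruction term needs no labels, which is what lets such an objective use unlabeled text. Once training finishes, only the lookup table has to be saved for use in step S2.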

[0056] The variational autoencoder includes: an encoder...

Embodiment 3

[0079] A few-shot text classification system based on semi-supervised learning, as shown in Figure 3, includes: a classification text acquisition module, a lookup table execution module, and a multi-layer perceptron execution module.

[0080] The classification text acquisition module obtains the text to be classified and inputs it into the pre-trained lookup table execution module. The lookup table execution module uses the lookup table to map the text to be classified into a text representation and inputs the text representation into the multi-layer perceptron execution module. The multi-layer perceptron execution module uses the multi-layer perceptron to obtain a text label from the text representation, and the text label is used as the text classification result, completing the classification of the small-sample text.
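The data flow between the three modules can be sketched as a small composition of classes. The class names, method names, and the use of injected callables are hypothetical choices for illustration. The source names the modules and the order in which data passes between them, but not their interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ClassificationTextAcquisitionModule:
    source: Callable[[], str]  # hypothetical: any function that returns the text

    def get_text(self) -> str:
        return self.source()


@dataclass
class LookupTableExecutionModule:
    encode: Callable[[str], List[float]]  # wraps the pre-trained lookup table

    def run(self, text: str) -> List[float]:
        return self.encode(text)


@dataclass
class MLPExecutionModule:
    predict: Callable[[List[float]], str]  # wraps the multi-layer perceptron

    def run(self, representation: List[float]) -> str:
        return self.predict(representation)


@dataclass
class FewShotTextClassificationSystem:
    acquisition: ClassificationTextAcquisitionModule
    lookup: LookupTableExecutionModule
    mlp: MLPExecutionModule

    def classify(self) -> str:
        # Acquisition -> lookup table -> multi-layer perceptron, as in [0080].
        text = self.acquisition.get_text()
        representation = self.lookup.run(text)
        return self.mlp.run(representation)
```

Passing the lookup table and the perceptron in as callables keeps the wiring independent of how either one was trained, so stub functions can stand in for them when testing the flow.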

[0081] The system also includes a lookup table generation module, which constructs an initial lookup table, trains the initial lookup table through a va...



Abstract

The invention provides a small-sample text classification method and system based on semi-supervised learning, and relates to the field of semi-supervised text classification. The method comprises the following steps: S1, obtaining a text to be classified; S2, inputting the text to be classified into a pre-trained lookup table, which maps the text to be classified into a text representation; S3, inputting the text representation into a multi-layer perceptron to obtain a text label, which is used as the text classification result. The text representation of the text to be classified is obtained through the lookup table, and the multi-layer perceptron then obtains the text label from the text representation as the text classification result, so a good classification result can still be obtained when classifying text data with a small data volume and incomplete labels. Manual labeling of a large amount of text data is not needed, manual labeling cost is saved, and the method can be flexibly applied to scenarios with different data volumes and labeling conditions.

Description

Technical field

[0001] The present invention relates to the field of semi-supervised text classification, and more specifically to a small-sample text classification method and system based on semi-supervised learning.

Background

[0002] With the development of technology, the amount of data on the Internet has increased exponentially. In the face of massive amounts of text, intelligent processing technology can save computing resources and improve processing efficiency. Text classification is a basic technology of information retrieval and mining, and plays a vital role in managing text data.

[0003] In recent years, text classification has gradually shifted from shallow learning models to deep learning models. Compared with shallow-learning-based methods, deep learning methods avoid manually designing rules and features, and automatically provide semantically meaningful representations for text mining. Therefore, most text classification research work...


Application Information

IPC(8): G06F40/289, G06F16/35, G06K9/62, G06N3/04, G06N3/08
CPC: G06F40/289, G06F16/35, G06N3/08, G06N3/048, G06N3/045, G06F18/241, Y02D10/00
Inventor: 张伟文, 翁茂彬, 叶海明
Owner: GUANGDONG UNIV OF TECH