Text classification method and device and electronic equipment

A text classification technology, applied in the field of data processing, addresses two problems of existing annotation approaches: low annotation accuracy, which degrades model classification accuracy, and heavy consumption of human and financial resources. It reduces that consumption while achieving high classification accuracy.

Active Publication Date: 2021-08-27
NETEASE (HANGZHOU) NETWORK CO LTD
7 Cites, 3 Cited by

AI Technical Summary

Problems solved by technology

Of the three existing ways of labeling unlabeled text, the latter two produce annotated text with low annotation accuracy, which degrades the classification accuracy of the model. The first method can ensure the accuracy of the annotated text, but it consumes a lot of manpower and financial resources.



Examples


Embodiment Construction

[0032] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below in conjunction with the drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. The components of the embodiments generally described and illustrated in the figures herein may be arranged and designed in a variety of different configurations.

[0033] Accordingly, the following detailed description of the embodiments of the invention provided in the accompanying drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art wi...


Abstract

The invention provides a text classification method, a device, and electronic equipment. The method inputs a to-be-classified text into a trained text classification model and obtains the text category of that text. The model is trained as follows: a plurality of text categories, and an attribute rule for each category, are determined from text data whose statistical frequency is higher than a preset threshold and/or text data whose semantic similarity meets a preset condition; a plurality of sample texts are labeled based on the determined categories and attribute rules; and an initial model is trained on the labeled sample texts to obtain the text classification model. In this approach, the text categories and their attribute rules are summarized manually from a small amount of selected, representative unannotated text data, and texts are then annotated automatically according to the summarized rules, which yields annotated text with relatively high labeling accuracy. The text classification model trained on this annotated text therefore has relatively high classification accuracy.
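The training flow described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the rule patterns, the function names, and the choice of a small naive Bayes classifier as a stand-in for the unspecified initial model. The sketch also covers only the frequency-based selection, not the semantic-similarity condition.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical attribute rules (category -> pattern). In the abstract these are
# summarized manually from representative texts; the patterns below are invented.
RULES = {
    "greeting": re.compile(r"\b(hello|hi|good morning)\b"),
    "farewell": re.compile(r"\b(bye|goodbye|see you)\b"),
}

def frequent_texts(corpus, min_count):
    """Return normalized texts that occur more than min_count times."""
    counts = Counter(t.strip().lower() for t in corpus)
    return [t for t, c in counts.items() if c > min_count]

def label_with_rules(texts, rules):
    """Label each text that matches exactly one rule; skip the rest."""
    labeled = []
    for text in texts:
        hits = [cat for cat, pat in rules.items() if pat.search(text.lower())]
        if len(hits) == 1:
            labeled.append((text, hits[0]))
    return labeled

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def train_naive_bayes(labeled):
    """Count classes and words; a stand-in for training the initial model."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, cat in labeled:
        class_counts[cat] += 1
        for tok in tokenize(text):
            word_counts[cat][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab

def classify(model, text):
    """Return the category with the highest Laplace-smoothed log score."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_cat, best_score = None, -math.inf
    for cat, n in class_counts.items():
        score = math.log(n / total)
        denom = sum(word_counts[cat].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[cat][tok] + 1) / denom)
        if score > best_score:
            best_cat, best_score = cat, score
    return best_cat

if __name__ == "__main__":
    samples = ["hello friend", "hi there", "goodbye friend", "see you soon"]
    model = train_naive_bayes(label_with_rules(samples, RULES))
    print(classify(model, "hello there"))
```

Texts that match no rule, or more than one, are left unlabeled in this sketch so that ambiguous cases do not add noise to the training set; that policy is a design choice of the example, not something the abstract specifies.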

Description

Technical field

[0001] The present invention relates to the technical field of data processing, in particular to a text classification method, device, and electronic equipment.

Background technique

[0002] The task of text classification usually refers to inferring, from a set of existing text categories, the category to which a text belongs. Common text classification tasks include sentiment classification and detection of sensitive information in text. In practical applications, texts are mostly classified with neural-network-based text classification models. For the model to reach a given classification accuracy, it must be trained on a data set containing a large number of labeled texts, where each labeled text carries a category label.

[0003] In related technologies, in order to obtain a large amount of marked text, three ways of marking unmarked text are provided. The fir...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/216, G06F40/30, G06F40/44, G06N3/08, G06F16/35
CPC: G06F40/216, G06F40/30, G06F40/44, G06F16/35, G06N3/08, Y02D10/00
Inventor: 汪硕芃, 张林箭, 宋有伟, 张聪, 范长杰, 胡志鹏
Owner: NETEASE (HANGZHOU) NETWORK CO LTD