A text classification method and device

A text classification technology, applied in the field of data processing, addresses problems of manual classification such as the time and manpower it consumes and the subjective differences that make its results unreliable. It achieves strong global search capability, good clustering results, and satisfactory time efficiency and accuracy.

Active Publication Date: 2022-02-15
NANJING TECH UNIV
AI Technical Summary

Problems solved by technology

However, manually judging and classifying text has many shortcomings in practice. First, handling a large number of documents takes a lot of time and manpower. Second, subjective differences between the people doing the classification lead to unreliable and poor classification results.



Embodiment Construction

[0055] To enable those skilled in the art to better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application are described clearly and completely below in conjunction with the drawings in those embodiments. The described implementations are only some of the implementations of the present application, not all of them. All other implementations obtained by persons of ordinary skill in the art on the basis of the implementations in this application, without creative effort, shall fall within the scope of protection of this application.

[0056] The present application provides a text classification method, the method comprising:

[0057] performing preprocessing operations on the text in the training corpus to obtain a complete set of initial features;

[0058] performing feature selection on the complete set of initial features to form a new complete set of features, a...

Abstract

The present invention provides a text classification method and device. The method includes: preprocessing the text in the training corpus to obtain a complete set of initial features; performing feature selection on the complete set of initial features to form a new complete set of features, and constructing a feature vector space model based on the new complete set of features, the feature vector space model including a preset number of feature items; clustering the feature vector space model to obtain k clusters and the center vector of each of the k clusters; calculating the similarity between the feature items in each cluster and the center vector of the corresponding cluster; and, for each cluster, selecting the f feature items with the highest similarity in the cluster, the resulting f×k feature items serving as the final feature items for text representation. The technical solution provided by the invention can improve the accuracy and efficiency of text classification.
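The clustering and selection steps in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each feature item is represented as a row of a term-by-document count matrix, uses plain Lloyd's k-means as a stand-in for the clustering step (this excerpt does not specify the clustering algorithm), and uses cosine similarity as the similarity measure. The function names `kmeans`, `cosine`, and `select_features` and the toy data are hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in). Returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Initialise the centers with k distinct rows of X.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break  # converged: labels are consistent with centers
        centers = new_centers
    return labels, centers


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def select_features(X, terms, k, f, seed=0):
    """Cluster the feature items, then keep the f items per cluster that are
    most similar to their cluster's center vector (up to f*k items in total)."""
    labels, centers = kmeans(X, k, seed=seed)
    selected = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        sims = np.array([cosine(X[i], centers[j]) for i in idx])
        top = idx[np.argsort(-sims)[:f]]
        selected.extend(terms[i] for i in top)
    return selected


# Toy data (hypothetical): rows are feature items, columns are documents.
terms = ["match", "team", "goal", "stock", "bond", "market"]
X = np.array([
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [3, 3, 0, 0],
    [0, 0, 3, 2],
    [0, 0, 2, 3],
    [0, 0, 3, 3],
], dtype=float)

final_features = select_features(X, terms, k=2, f=2)
print(final_features)
```

With k = 2 and f = 2 the sketch keeps four feature items, two from each cluster. A cluster with fewer than f members contributes only the items it has, so the total can fall below f×k on real data.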

Description

Technical field

[0001] The invention relates to the technical field of data processing, and in particular to a text classification method and device.

Background

[0002] With the continuous reform and innovation of the "Internet +" model, all walks of life have become increasingly aware of the value of network information and data. More and more information and data are obtained through the Internet, and the volume is growing faster and faster, yet this information and data are generally not directly usable by users. It has therefore become very important to classify this huge amount of text content according to certain rules, so that document content can be managed effectively and used rationally. Text classification is indispensable in text processing and is an important research method in text mining. By judging the information contained in the text, identifying the general direction guided by the text content, and classifying it into ap...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35, G06K9/62
CPC: G06F16/35, G06F18/23213, G06F18/2411
Inventor: 梁雪春, 陈谌, 权义萍
Owner: NANJING TECH UNIV