Text classification method and device

A text classification technology, applied in the field of data processing, that addresses the subjectivity, unreliability, and time and labor cost of manual classification. It achieves strong global search ability, good clustering performance, and satisfactory time efficiency and accuracy.

Active Publication Date: 2018-08-03
NANJING TECH UNIV
Cites: 3; Cited by: 19

AI Technical Summary

Problems solved by technology

However, manual text classification has many shortcomings in practice. First, handling a large number of documents takes a great deal of time and manpower. Second, the results differ subjectively from one reviewer to another, which makes the classification unreliable and of poor quality.

Examples

Embodiment Construction

[0055] To enable those skilled in the art to better understand the technical solutions of the present application, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by persons of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.

[0056] The present application provides a text classification method, the method comprising:

[0057] preprocessing the texts in the training corpus to obtain an initial feature set;

[0058] performing feature selection on the initial feature set to form a new feature set, a...
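
The first two steps can be illustrated with a minimal Python sketch. The source does not say how preprocessing is done or which feature-selection measure is used, so everything here is an assumption made for illustration: the stop-word list, the regex tokenizer, the document-frequency threshold `min_df`, and the function names are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the source does not specify one.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to"}

def preprocess(text):
    """Lowercase, tokenize on letters, and drop stop words (assumed steps)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def initial_feature_set(corpus):
    """Collect every distinct token that survives preprocessing."""
    return sorted({tok for doc in corpus for tok in preprocess(doc)})

def select_features(corpus, min_df=2):
    """Keep terms that occur in at least `min_df` documents.

    Document frequency is a stand-in criterion; the source does not say
    which feature-selection measure the method actually uses.
    """
    df = Counter()
    for doc in corpus:
        df.update(set(preprocess(doc)))  # count each term once per document
    return sorted(term for term, count in df.items() if count >= min_df)

corpus = [
    "The match ended in a draw",
    "The team won the match",
    "Stocks fell as the market closed",
    "The market rallied and stocks rose",
]
initial = initial_feature_set(corpus)
selected = select_features(corpus, min_df=2)
print(len(initial), selected)
```

In this toy corpus the initial feature set shrinks to the few terms shared by more than one document, which is the kind of reduction the feature-selection step is meant to provide before the vector space model is built.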

Abstract

The invention provides a text classification method and device. The method comprises: preprocessing the texts in a training corpus to obtain an initial feature set; performing feature selection on the initial feature set to form a new feature set, and constructing feature vector space models based on the new feature set, wherein the feature vector space models comprise a preset number of feature items; clustering the feature vector space models to obtain k center vectors of k clusters; and calculating the similarity between the feature items in each cluster and the center vector of the corresponding cluster, selecting from each cluster the f feature items that rank highest in similarity, and using the resulting f*k feature items as the final feature items for text representation. According to this technical scheme, the accuracy and efficiency of text classification can be improved.
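
The clustering and selection steps in the abstract can be sketched as follows. This is only an illustration under stated assumptions: plain k-means with cosine similarity and a fixed initialization stands in for whatever clustering procedure the patent actually uses, and the feature vectors, the function names, and the values of `k` and `f` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(vectors, k, iters=20):
    """Plain k-means; the first k vectors serve as initial centers (assumed)."""
    centers = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(iters):
        # Assign each vector to its most similar center.
        new_labels = [max(range(k), key=lambda c: cosine(v, centers[c]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments are stable, so the centers are final
        labels = new_labels
        # Recompute each center; keep the old one if a cluster is empty.
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centers[c] = mean(members)
    return centers, labels

def select_final_features(items, k, f):
    """Pick the f items closest to each of the k centers (f*k in total)."""
    names = list(items)
    vectors = [items[n] for n in names]
    centers, labels = kmeans(vectors, k)
    final = []
    for c in range(k):
        scored = [(cosine(v, centers[c]), n)
                  for n, v, l in zip(names, vectors, labels) if l == c]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        final.extend(n for _, n in scored[:f])
    return final

# Hypothetical feature items, each described by its weight in four documents.
items = {
    "match":  [1.0, 1.0, 0.0, 0.0],
    "market": [0.0, 0.0, 1.0, 1.0],
    "team":   [0.0, 1.0, 0.0, 0.0],
    "goal":   [1.0, 0.9, 0.1, 0.0],
    "stocks": [0.0, 0.1, 1.0, 0.9],
    "bond":   [0.0, 0.0, 0.0, 1.0],
}
final = select_final_features(items, k=2, f=2)
print(final)
```

With k = 2 and f = 2, the sketch keeps four of the six items: the two most representative of each cluster. A cluster with fewer than f members would contribute fewer items, a case the abstract does not address.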

Description

Technical field

[0001] The invention relates to the technical field of data processing, in particular to a text classification method and device.

Background

[0002] With the continuous reform and innovation of the "Internet +" model, all walks of life have gradually become more aware of the value of network information and data. More and more information and data are obtained through the Internet, and the volume is growing ever faster. However, this information and data generally cannot be used directly by users. It therefore becomes very important to classify this huge amount of text content according to certain rules, so that document content can be managed effectively and used rationally. Text classification is indispensable in text processing and is an important research method in text mining. By judging the information contained in the text, identifying the general direction guided by the text content, and classifying it into ap...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06K9/62
CPC: G06F16/35, G06F18/23213, G06F18/2411
Inventor: 梁雪春, 陈谌, 权义萍
Owner: NANJING TECH UNIV