Text fine classification method

A text fine classification technique, applied in special data processing applications, instruments, electrical digital data processing, etc., that addresses the low accuracy of fine classification of short documents and improves classification accuracy.

Inactive Publication Date: 2015-08-05
WUHAN SHUWEI TECH
4 Cites, 14 Cited by

AI Technical Summary

Problems solved by technology

[0005] In view of the above defects or improvement needs of the prior art, the present invention provides a text fine classification method that addresses the low accuracy of fine classification of short documents and significantly improves fine classification accuracy.

Embodiment Construction

[0035] To make the object, technical solution and advantages of the present invention clearer, the present invention is described in further detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it. In addition, the technical features involved in the various embodiments described below can be combined with each other as long as they do not conflict.

[0036] As shown in Figure 1, the text fine classification method provided by the present invention includes a two-level classifier construction step, a word vector preprocessing step, a sensitive dictionary construction step, a text fine classification step, and an evaluation and feedback step, specifically:

[0037] (1) Two-level classifier construction steps:

[0038] As shown in Figure 2, ...
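The description of the two-level classifier is truncated here, so the following is only a minimal sketch of one way a two-level cascade with an independent dictionary per level could be organised. Everything in it is an assumption made for illustration: the class name `TwoLevelClassifier`, the coarse and fine labels, the sample tokens, and the simple dictionary-overlap score, which stands in for the KNN step the patent actually names.

```python
def overlap(tokens, dictionary):
    """Count how many tokens appear in a class's term dictionary."""
    return sum(1 for t in tokens if t in dictionary)


class TwoLevelClassifier:
    """Hypothetical cascade: a coarse level routes to a fine level.

    Each level keeps its own term dictionary, mirroring the idea that
    every classifier level has an independent sensitive dictionary.
    """

    def __init__(self):
        self.coarse_dicts = {}  # coarse label -> set of terms
        self.fine_dicts = {}    # coarse label -> {fine label -> set of terms}

    def fit(self, samples):
        """samples: iterable of (tokens, coarse_label, fine_label)."""
        for tokens, coarse, fine in samples:
            self.coarse_dicts.setdefault(coarse, set()).update(tokens)
            fine_level = self.fine_dicts.setdefault(coarse, {})
            fine_level.setdefault(fine, set()).update(tokens)
        return self

    def predict(self, tokens):
        # Level 1: pick the coarse class whose dictionary overlaps most.
        coarse = max(sorted(self.coarse_dicts),
                     key=lambda c: overlap(tokens, self.coarse_dicts[c]))
        # Level 2: pick a fine class using only that coarse class's dictionaries.
        fine_level = self.fine_dicts[coarse]
        fine = max(sorted(fine_level),
                   key=lambda f: overlap(tokens, fine_level[f]))
        return coarse, fine


# Illustrative, already-tokenised samples (not taken from the patent).
SAMPLES = [
    (["bicycle", "stolen", "station"], "property", "theft"),
    (["wallet", "stolen", "bus"], "property", "theft"),
    (["shop", "robbed", "armed"], "property", "robbery"),
    (["car", "crash", "highway"], "traffic", "collision"),
    (["driver", "drunk", "highway"], "traffic", "drunk_driving"),
]

clf = TwoLevelClassifier().fit(SAMPLES)
print(clf.predict(["phone", "stolen", "bus"]))
print(clf.predict(["drunk", "driver"]))
```

Routing through the coarse level first means the fine level only has to separate closely related categories, which is where short documents tend to be hardest to tell apart.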


Abstract

The invention provides a text fine classification method, which belongs to the technical field of computer natural language processing or pattern recognition and addresses the low accuracy of existing text classification methods when dividing short documents into fine categories. In the method, two levels of classifiers are first built from existing training samples, each level having its own independent sensitive dictionary. Word segmentation, special stop word removal and synonym network mapping are then performed on the classifiers' training samples to preprocess the word vectors. Next, features are selected according to the differences in importance among the word vectors, and the sensitive dictionary of each classifier is built. A KNN (k-Nearest Neighbor) algorithm is then used to compute the fine classification result of a target document. Finally, the classification result is evaluated and fed back, and the sensitive dictionary is dynamically optimized to further improve classification accuracy. Experiments show that applying the method to short documents noticeably improves fine classification accuracy.
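The pipeline in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the stop word list, the synonym map, the training samples, whitespace segmentation (real Chinese text would need a proper word segmenter), and the feature score (spread of a term's document frequency across classes) are all hypothetical stand-ins, because the source does not give these details.

```python
import math
from collections import Counter

# Hypothetical resources; the source does not list its stop words or synonym network.
STOP_WORDS = {"the", "a", "at", "in", "was", "his"}
SYNONYMS = {"stolen": "theft", "stole": "theft", "robbed": "robbery"}


def preprocess(text):
    """Segment on whitespace, drop stop words, map synonyms to one canonical term."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    return [SYNONYMS.get(t, t) for t in tokens if t and t not in STOP_WORDS]


def build_sensitive_dictionary(samples, top_n=20):
    """Keep the terms whose document frequency differs most between classes.

    The spread score is an assumed stand-in for the patent's importance measure.
    """
    per_class = {}
    for text, label in samples:
        per_class.setdefault(label, Counter()).update(set(preprocess(text)))
    class_sizes = Counter(label for _, label in samples)
    vocab = set().union(*per_class.values())

    def spread(term):
        rates = [per_class[c][term] / class_sizes[c] for c in per_class]
        return max(rates) - min(rates)

    return sorted(vocab, key=lambda t: (-spread(t), t))[:top_n]


def vectorize(text, dictionary):
    """Represent a text as term counts over the sensitive dictionary."""
    counts = Counter(preprocess(text))
    return [counts[t] for t in dictionary]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_classify(text, samples, dictionary, k=3):
    """Majority vote among the k training samples most similar to the text."""
    target = vectorize(text, dictionary)
    scored = sorted(
        ((cosine(target, vectorize(s, dictionary)), label) for s, label in samples),
        key=lambda pair: -pair[0],
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Illustrative short documents (not taken from the patent).
TRAIN = [
    ("Bicycle stolen at the station", "theft"),
    ("Someone stole a phone in the market", "theft"),
    ("Wallet stolen in the bus", "theft"),
    ("Man robbed at knifepoint", "robbery"),
    ("Shop robbed by armed men", "robbery"),
    ("Woman robbed of her bag by armed man", "robbery"),
]

dictionary = build_sensitive_dictionary(TRAIN)
print(knn_classify("Phone stolen in the market", TRAIN, dictionary, k=3))
```

The evaluation and feedback step is not shown. In a sketch like this it could be approximated by rebuilding the dictionary after misclassified documents are corrected and added to the training samples.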

Description

Technical Field

[0001] The invention belongs to the technical field of computer natural language processing or pattern recognition, and in particular relates to a text fine classification method that can improve the fine classification accuracy of short documents.

Background Art

[0002] Text classification generally includes text representation, classifier selection and training, and evaluation and feedback of the classification results. Text representation can be further divided into steps such as text preprocessing, indexing and statistics, and feature extraction.

[0003] Traditional text classification methods usually classify long documents with obvious differences between categories, such as web content classification (sports, news, finance, military, etc.). However, in some specific fields that involve short documents, such as automatic classification of police reports for public security and sentiment analysis of Weibo posts, the gap between categories is ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 郑胜, 张胜, 邹复好, 蒋丹, 夏明, 周可
Owner: WUHAN SHUWEI TECH