Imbalanced text classification method introducing keyword features
A text classification and keyword technology applied to text database clustering/classification, unstructured text data retrieval, semantic analysis, and related fields. It addresses problems that existing methods cannot solve, such as the diversity of washing samples and the underfitting of sparse categories, and thereby addresses category imbalance.
Embodiment
[0107] The invention provides an imbalanced text classification method that introduces keyword features. First, a hierarchical classification system comprising 32 leaf categories is defined for the military news field. Next, the keywords of each category are extracted using normalized pointwise mutual information and improved information gain. Finally, the keyword features are fused with neural network semantic features for training. Through these steps, the present invention can effectively address text classification when categories are imbalanced. As shown in Figure 1, the method includes the following steps:
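As an aside on the keyword-extraction stage named in this overview, the sketch below ranks words per category with normalized pointwise mutual information (NPMI). It uses the standard textbook definition, NPMI(w, c) = PMI(w, c) / (-log p(w, c)), estimated from document counts; this is an assumption, since the excerpt does not give the patent's exact formula, and the "improved information gain" score is not shown because it is not described here. The function name `npmi_keywords`, the `top_k` parameter, and the toy documents are hypothetical.

```python
import math
from collections import Counter


def npmi_keywords(docs, labels, top_k=3):
    """Rank words in each category by NPMI(word, category).

    docs   : list of token lists, one per document
    labels : list of category names, aligned with docs
    Probabilities are estimated from document frequencies (assumption).
    """
    n_docs = len(docs)
    cat_count = Counter(labels)   # documents per category
    word_count = Counter()        # documents containing each word
    joint_count = Counter()       # documents of category c containing word w

    for tokens, label in zip(docs, labels):
        for word in set(tokens):  # count each word once per document
            word_count[word] += 1
            joint_count[(word, label)] += 1

    scores = {cat: [] for cat in cat_count}
    for (word, cat), n_wc in joint_count.items():
        p_wc = n_wc / n_docs
        p_w = word_count[word] / n_docs
        p_c = cat_count[cat] / n_docs
        pmi = math.log(p_wc / (p_w * p_c))
        # NPMI divides PMI by -log p(w, c); guard the p(w, c) == 1 case.
        npmi = 1.0 if p_wc == 1.0 else pmi / -math.log(p_wc)
        scores[cat].append((word, npmi))

    # Highest NPMI first; ties broken alphabetically for determinism.
    return {
        cat: sorted(pairs, key=lambda pair: (-pair[1], pair[0]))[:top_k]
        for cat, pairs in scores.items()
    }


# Hypothetical toy corpus, for illustration only.
docs = [
    ["tank", "exercise"],
    ["tank", "drill"],
    ["treaty", "talks"],
    ["treaty", "exercise"],
]
labels = ["military", "military", "diplomacy", "diplomacy"]

keywords = npmi_keywords(docs, labels)
military_words = [word for word, _ in keywords["military"]]
```

Because NPMI is bounded in [-1, 1], scores stay comparable between frequent and sparse categories, which is one plausible reason to prefer it over raw PMI in an imbalanced setting.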
[0108] Step 1 includes:
[0109] Step 1-1: Define a hierarchical classification system and describe the hierarchical relationship between categories, for example "collaboration-verbal-expression of willingness-substantial cooperation", where labels at different levels are separated by "-". Provide text-level classification functions for news in the fields of politics, military,...
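The "-"-separated label convention described in Step 1-1 can be illustrated with a minimal sketch that splits a full label into its levels and assembles leaf labels into a nested category tree. The helper names (`split_levels`, `ancestor_paths`, `build_tree`) are hypothetical, the second leaf label is invented for illustration, and the sketch assumes that no level name itself contains "-".

```python
def split_levels(label, sep="-"):
    """Return the per-level labels of a full hierarchical label."""
    return label.split(sep)


def ancestor_paths(label, sep="-"):
    """Return the label path at every depth, from the root to the leaf."""
    parts = label.split(sep)
    return [sep.join(parts[:depth]) for depth in range(1, len(parts) + 1)]


def build_tree(leaf_labels, sep="-"):
    """Build a nested dict in which each key is a level label."""
    tree = {}
    for leaf in leaf_labels:
        node = tree
        for part in leaf.split(sep):
            node = node.setdefault(part, {})
    return tree


# The first label is the example from the text; the second is hypothetical.
leaf_labels = [
    "collaboration-verbal-expression of willingness-substantial cooperation",
    "collaboration-verbal-expression of willingness-diplomatic cooperation",
]

levels = split_levels(leaf_labels[0])
paths = ancestor_paths(leaf_labels[0])
tree = build_tree(leaf_labels)
```

Keeping the full path as the leaf label means a flat classifier over leaf categories can still recover every parent category by splitting on the separator.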


