Text Classification Method and Electronic Device
A text classification method and word segmentation technique, applicable to text database clustering/classification, unstructured text data retrieval, and electronic digital data processing. It addresses the limited coverage and accuracy, low performance ceiling, and time-consuming, labor-intensive nature of existing approaches.
Embodiment Construction
[0053] In the course of implementing this application, the inventors found that existing text classification methods rely on a set of sample data (such as sample texts), also known as the training sample set. Each item in the sample set carries a label, so the correspondence between each item and its category is known. When unlabeled data is input, each feature of the new data is compared with the corresponding feature of the data in the sample set, and the classification labels of the most similar items (nearest neighbors) in the sample set are extracted. Generally, only the k most similar items in the sample set are considered, where k is usually an integer not greater than 20. Finally, the category that occurs most often among these k items is selected as the category of the new data.
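The nearest-neighbor procedure described in [0053] can be sketched roughly as follows. This is a minimal illustration, not the applicant's implementation: the function name `knn_classify`, the use of numeric feature vectors, the Euclidean distance, and the toy data are all assumptions, since the text does not say how features are extracted or how similarity is measured.

```python
from collections import Counter
import math


def knn_classify(query, samples, labels, k=3):
    """Return the most frequent label among the k samples nearest to query.

    Hypothetical helper: feature vectors and Euclidean distance are
    assumptions made for illustration only.
    """
    if not samples or len(samples) != len(labels):
        raise ValueError("samples and labels must be non-empty and equal in length")
    k = min(k, len(samples))
    # Compare the new item with every labeled item in the sample set.
    distances = [(math.dist(query, s), label) for s, label in zip(samples, labels)]
    # Keep only the k most similar items (smallest distance).
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The category that occurs most often among the k neighbors wins.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy labeled sample set (made-up feature vectors and categories).
samples = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9)]
labels = ["sports", "sports", "sports", "finance", "finance"]

result = knn_classify((1.1, 1.0), samples, labels, k=3)
print(result)
```

In practice the feature vectors would come from some text representation (for example term weights computed after word segmentation), and k would be tuned on held-out data.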
[0054] However, when the above method is applied to a sample set with imbalanced class sizes, the prediction devia...


