Text Classification Feature Screening Method Based on Feature Distribution Information
A feature screening method for text classification, applicable to special-purpose data processing applications, instruments, and electrical digital data processing. It addresses the problem of poor classification accuracy and achieves improved efficiency, high classification accuracy, and fast convergence.
Embodiment Construction
[0036] The specific steps of the method of the invention are as follows:
[0037] 1. Concepts related to the present invention.
[0038] Tf*idf (term frequency-inverse document frequency): a statistical method for evaluating how important a word is to a document within a document set or corpus. A word's importance increases in proportion to the number of times it appears in the document, but decreases with how frequently it appears across the corpus.
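As an illustration of the definition above, the following is a minimal sketch of one common Tf*idf variant, not necessarily the weighting used by the invention. It assumes that term frequency is the count normalized by document length and that the inverse document frequency is log(N / df); the function name `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus of three tokenized documents.
docs = [
    ["stock", "market", "stock", "price"],
    ["market", "football", "match"],
    ["football", "goal", "match", "match"],
]
weights = tf_idf(docs)
```

In this toy corpus, "stock" appears twice in the first document and nowhere else, so it receives a higher weight there than "market", which also occurs in the second document. Under this variant, a word that occurs in every document receives a weight of zero.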
[0039] Intra-class distribution: the distribution of a feature word across the documents of a given class. If the word is spread evenly over the documents of that class, its intra-class dispersion in that class is low; conversely, if it is concentrated in a few documents and absent from the others, its intra-class dispersion in that class is high.
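The notion of intra-class dispersion can be made concrete with a small sketch. The text shown here does not give the invention's formula, so the measure below is an assumption chosen only for illustration: the coefficient of variation (population standard deviation divided by the mean) of the word's relative frequency across the documents of one class. The function name `intra_class_dispersion` and the sample documents are hypothetical.

```python
import statistics

def intra_class_dispersion(term, class_docs):
    """Illustrative dispersion of `term` over the documents of one class.

    Assumed measure (not taken from the source): coefficient of variation
    of the term's relative frequency per document.
    """
    # Relative frequency of the term in each document of the class.
    freqs = [doc.count(term) / len(doc) if doc else 0.0 for doc in class_docs]
    mean = statistics.mean(freqs)
    if mean == 0:
        return 0.0  # the term never occurs in this class
    return statistics.pstdev(freqs) / mean

# Evenly distributed: "a" makes up half of every document.
even_docs = [["a", "b"], ["a", "c"], ["a", "d"]]
# Concentrated: "a" occurs only in the first document.
concentrated_docs = [["a", "a"], ["b", "c"], ["d", "e"]]

even_score = intra_class_dispersion("a", even_docs)
concentrated_score = intra_class_dispersion("a", concentrated_docs)
```

With this measure, the evenly distributed word scores zero and the concentrated word scores well above it, which matches the low and high dispersion cases described above.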
[0040] Inter-class distribution: refers to the distribution of ...