Text classification feature screening method based on feature distribution information
A feature screening and text classification technology, applied in special data processing applications, instruments, electrical and digital data processing, etc.
Embodiment Construction
[0036] The specific steps of the method of the invention are as follows:
[0037] 1. Concepts related to the present invention.
[0038] TF-IDF (term frequency-inverse document frequency): a statistical measure used to evaluate how important a word is to a document within a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document, but decreases as its frequency across the corpus increases.
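As an illustration of the definition above, the following is a minimal sketch of one common textbook TF-IDF variant. The exact weighting used by the invention is not given in this excerpt, so the formula here (term count divided by document length, multiplied by log(N / document frequency)) is an assumption, and the function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus of three tokenized documents.
docs = [
    ["stock", "market", "stock", "price"],
    ["football", "match", "price"],
    ["stock", "football", "price"],
]
weights = tf_idf(docs)
```

In this sketch a word that occurs in every document, such as "price", receives a weight of zero, which matches the idea that corpus-wide frequency reduces a word's importance.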
[0039] Intra-class distribution: the distribution of a feature word across the documents of a given class. If the word is evenly distributed over the documents of that class, its intra-class dispersion in that class is low; conversely, if it is concentrated in a few documents and does not appear in the others, its intra-class dispersion in that class is high.
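The notion of low versus high intra-class dispersion can be made concrete with a small sketch. The dispersion formula of the invention is not shown in this excerpt; the code below substitutes the coefficient of variation of a word's per-document counts as a stand-in measure, which is an assumption. The function name `intra_class_dispersion` and the sample documents are hypothetical.

```python
import math

def intra_class_dispersion(term, class_docs):
    """Coefficient of variation of `term` counts over the documents of one class.

    Stand-in measure (an assumption, not the patented formula):
    0.0 means perfectly even spread, larger values mean more concentration.
    """
    counts = [doc.count(term) for doc in class_docs]
    mean = sum(counts) / len(counts)
    if mean == 0:
        return 0.0  # term never occurs in this class; report 0 by convention
    variance = sum((c - mean) ** 2 for c in counts) / len(counts)
    return math.sqrt(variance) / mean

# "tax" appears once in every document of the class: evenly distributed.
even = [["tax", "law"], ["tax", "court"], ["tax", "judge"]]
# "tax" appears three times in one document and nowhere else: concentrated.
skewed = [["tax", "tax", "tax"], ["law", "court"], ["judge", "law"]]

even_score = intra_class_dispersion("tax", even)
skewed_score = intra_class_dispersion("tax", skewed)
```

Under this stand-in measure the evenly distributed word scores lower than the concentrated one, which mirrors the low and high dispersion cases described above.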
[0040] Inter-class distribution: refers ...