New method and device for feature vector weighting in text classification
A feature vector weighting technology for text classification, applied in the field of computer science, that addresses the problem of low classifier accuracy.
Embodiment Construction
[0068] As shown in Figure 1, the method of feature vector weighting for text classification comprises the following specific steps:
[0069] Step S1: collection of the training corpus and the test corpus. First, the training corpus is downloaded from the Internet for six fields (consumer information, entertainment and games, finance and economics, news, personal communication, and sports); the "garbage" in the webpage text is removed, and word segmentation and part-of-speech tagging are performed, yielding a training corpus of 30.87 million words in total. Second, the test corpus is downloaded from the Internet according to the same principle and sorted, yielding a total of 1119 test texts. Word segmentation is performed after the corpus is collected.
[0070] Step S2,
[0071] 1) Compile the total vocabulary of each category and remove the words whose frequency is below 0.0001%, because words that occur too infrequently in a class are of little importance to that class.
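The vocabulary filtering in item 1) can be illustrated with a minimal Python sketch. It assumes that "frequency" means a word's relative frequency within the category (its count divided by the category's total token count) and that each text has already been segmented into a list of words; the name `build_category_vocab` and the input layout are hypothetical and are not taken from the patent.

```python
from collections import Counter

# 0.0001% written as a fraction. Treating "frequency" as relative frequency
# within one category is an assumption, not something the source states.
MIN_RELATIVE_FREQ = 0.0001 / 100


def build_category_vocab(docs_by_category, min_rel_freq=MIN_RELATIVE_FREQ):
    """Return {category: {word: count}}, dropping words below the threshold.

    docs_by_category maps a category name to a list of documents, where each
    document is a list of already-segmented words (hypothetical layout).
    """
    vocab = {}
    for category, docs in docs_by_category.items():
        # Count every word occurrence across all documents of the category.
        counts = Counter(word for doc in docs for word in doc)
        total = sum(counts.values())
        # Keep only words whose share of the category's tokens is not
        # below the threshold.
        vocab[category] = {
            word: count
            for word, count in counts.items()
            if total > 0 and count / total >= min_rel_freq
        }
    return vocab


if __name__ == "__main__":
    toy = {
        "sports": [["goal", "goal", "match"], ["goal", "referee"]],
        "news": [["election"]],
    }
    # A deliberately large threshold so the effect is visible on toy data.
    print(build_category_vocab(toy, min_rel_freq=0.25))
```

With a corpus on the scale described in step S1, the default threshold would remove only words that account for less than one token in a million within a category; the toy call uses a much larger threshold purely for demonstration.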
[0072]...