Text classification method based on chi square statistics and SMO algorithm
A text classification method in the field of text technology, applicable to text database clustering/classification, computation, and unstructured text data retrieval. It addresses the problem of texts containing many features and much noise, and improves classification accuracy and efficiency.
Embodiment Construction
[0034] The present invention will be further described below in conjunction with the accompanying drawings and specific examples.
[0035] As shown in Figure 1, the text classification method of the present invention, based on chi-square statistics and the SMO algorithm, comprises the following steps:
[0036] (1) Collect Internet texts and divide them into training texts and test texts: collect texts from the Internet and label their classes; texts that have been class-labeled serve as training texts, and texts that have not been class-labeled are the texts to be classified, which serve as test texts;
[0037] (2) Preprocess the training texts to obtain the training text vocabulary. As shown in Figure 2, the steps are as follows:
[0038] a) Open the training document and segment each training text into words;
[0039] b) For each word in the training text, determine whether it is a Chinese character, letter, or number; if so, c...
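The character check in step b) can be illustrated with a minimal Python sketch. It is not taken from the patent. The function name `is_allowed_token` is hypothetical, "Chinese character" is assumed to mean the CJK Unified Ideographs range U+4E00 to U+9FFF, and letters and numbers are assumed to be ASCII. The input is assumed to be already segmented, since the source does not name a segmenter. Because the source text is cut off after "if so", the sketch only reports the result of the check and does not decide what happens to the word afterwards.

```python
import re

# Assumption: a "Chinese character" is a code point in U+4E00..U+9FFF,
# and letters/numbers are restricted to ASCII. Neither is stated in the source.
_ALLOWED_CHAR = re.compile(r"[\u4e00-\u9fffA-Za-z0-9]")


def is_allowed_token(token: str) -> bool:
    """Return True if the token is non-empty and every character is a
    Chinese character, a letter, or a digit (hypothetical helper)."""
    return bool(token) and all(_ALLOWED_CHAR.fullmatch(ch) for ch in token)


if __name__ == "__main__":
    # Tokens as they might come out of a word segmenter (illustrative input).
    tokens = ["文本", "SMO", "2024", "，", "a-b"]
    for tok in tokens:
        print(repr(tok), is_allowed_token(tok))
```

A token that mixes allowed and disallowed characters, such as `a-b`, fails the check here; whether the patented method treats such words the same way is not recoverable from the truncated text.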