Short text classification method based on CHI and classified association rule algorithm
A short text classification method, applicable to unstructured text data retrieval, text database clustering/classification, and computing, that addresses the problems of heavy manual intervention, hard-to-determine thresholds, and low algorithm flexibility and program controllability.
Embodiment
[0070] News headline classification method based on CHI and classification association rule algorithm.
[0071] The data set contains news headlines and body texts in 5 categories (entertainment, finance, sports, IT, women), 30,000 items in total, of which 20,000 news headlines are training data and 10,000 news headlines are test data. The body texts of the 20,000 training items are used as long texts for constructing the feature expansion knowledge base.
[0072] Category frequent factor:
[0073] As can be seen from Figure 6, if a unified minimum support threshold is set for frequent word set mining, the number of frequent word sets varies greatly between categories. In the figure, the unified minimum support threshold is 800. A total of 1025 frequent word sets were mined from the five categories, and the finance category alone accounts for 1022 of them, or 99.7%. The category skew of the frequent word sets is therefore severe...
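The skew described in [0073] can be reproduced on a toy scale. The following is a minimal sketch, not the patent's implementation: it counts word sets by brute-force enumeration rather than a real frequent-itemset miner, and the function names (`frequent_word_sets`, `count_per_category`, `share`), the two-category toy corpus, and the threshold of 3 are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def frequent_word_sets(docs, min_support, max_size=2):
    """Return word sets (1..max_size words) that occur in at least
    min_support documents. Brute-force counting, for illustration only."""
    counts = Counter()
    for words in docs:
        unique = sorted(set(words))  # each document counts a word set once
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[combo] += 1
    return {ws: c for ws, c in counts.items() if c >= min_support}


def count_per_category(corpus, min_support):
    """Apply one unified support threshold to every category."""
    return {cat: len(frequent_word_sets(docs, min_support))
            for cat, docs in corpus.items()}


def share(counts, category):
    """Fraction of all frequent word sets that belong to one category."""
    total = sum(counts.values())
    return counts[category] / total if total else 0.0


# Hypothetical toy corpus: one category reuses the same words heavily,
# the other does not.
corpus = {
    "finance": [["stock", "market", "rises"],
                ["stock", "market", "falls"],
                ["stock", "market", "closes"]],
    "sports": [["team", "wins", "final"],
               ["coach", "resigns"],
               ["player", "injured"]],
}

counts = count_per_category(corpus, min_support=3)
finance_share = share(counts, "finance")

# The proportion reported in [0073]: 1022 of 1025 frequent word sets.
reported_share = round(1022 / 1025 * 100, 1)
```

With a single threshold, only the category whose headlines repeat the same vocabulary produces any frequent word sets, which is the same kind of imbalance as the 1022-of-1025 figure above.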