Feature selection and weight calculation method for imbalanced text sets
A feature selection and weight calculation technology, applied in computing, unstructured text data retrieval, text database clustering/classification, etc.
Embodiment Construction
[0043] Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. As shown in Figure 1, the method proposed by the present invention is carried out through the following steps in sequence:
[0044] Step 1: Perform text preprocessing on the unbalanced text set to extract words containing semantic information.
[0045] Step 1.1: Use Chinese lexical processing software to perform word segmentation and part-of-speech tagging on the text set.
[0046] In the experiments, word segmentation is performed with the Chinese lexical analysis system ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System).
[0047] Step 1.2: After word segmentation, filter out stop words such as modal particles, prepositions, and adverbs.
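The filtering in Step 1.2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the segmenter from Step 1.1 has already produced (word, part-of-speech) pairs, and the tag prefixes `y`, `p`, `d`, the `STOP_WORDS` set, and the function name `filter_tokens` are all hypothetical choices made for this example.

```python
from typing import Iterable, List, Tuple

# Assumed tag prefixes (hypothetical, in the style of common Chinese tag sets):
# "y" = modal particle, "p" = preposition, "d" = adverb.
STOP_POS_PREFIXES = ("y", "p", "d")

# Hypothetical stop-word list; in practice this would be loaded from a file.
STOP_WORDS = {"的", "了", "是"}


def filter_tokens(tagged: Iterable[Tuple[str, str]]) -> List[str]:
    """Keep words that are neither listed stop words nor tagged with a stop POS."""
    kept = []
    for word, pos in tagged:
        if pos.startswith(STOP_POS_PREFIXES):
            continue  # dropped by part-of-speech tag
        if word in STOP_WORDS:
            continue  # dropped by the explicit stop-word list
        kept.append(word)
    return kept


# Example input: tokens assumed to be already segmented and tagged.
sample = [("文本", "n"), ("的", "u"), ("分类", "vn"), ("很", "d"),
          ("在", "p"), ("重要", "a"), ("吧", "y")]
print(filter_tokens(sample))
```

Combining a tag-based filter with an explicit word list is one possible design; the tag filter handles whole word classes, while the list catches individual high-frequency words that carry other tags.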
[0048] A large number of stop words in a text introduces noise that interferes with its useful information. After...