A Lao text topic classification method
A text topic classification technique, applied in the fields of natural language processing and machine learning, that addresses problems such as ignored information and misinterpreted text. It avoids the zero-probability problem and improves classification accuracy.
Examples
Embodiment 1
[0020] Embodiment 1: As shown in Figures 1-2, a Lao text topic classification method comprises the following steps. Step 1: use web crawler technology to crawl Lao text in five categories: economy, politics, education, tourism, and general. Store the crawled texts in five corresponding folders, each named after its category to facilitate subsequent retrieval and processing. Then preprocess the crawled articles to remove noise words that are irrelevant to classification, thereby building a corpus. Further, the noise words can be defined to include emoticons, numbers, spaces, and stop words. Emoticons, numbers, and spaces are removed with regular expressions, and stop words are removed with a stop-word table (any word that appears in the table is removed). When removing the irrelevant noise words, the regular expression used is as follows: u"^[\u0000-\u10ff...
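The preprocessing in Step 1 can be sketched roughly as follows. This is an illustrative sketch only, not the patent's implementation: the regular expression in the source is truncated, so the patterns below are assumptions that simply keep characters from the Unicode Lao block (U+0E80 to U+0EFF) and drop Lao digits. The names `strip_noise`, `remove_stop_words`, and `save_document` are hypothetical, and the tokens are assumed to come from a separate Lao word segmenter that is not shown here.

```python
import re
from pathlib import Path

# The five categories named in Step 1; each becomes a folder name.
CATEGORIES = ("economy", "politics", "education", "tourism", "general")

# Assumption: anything outside the Unicode Lao block (U+0E80-U+0EFF) is noise.
# This covers emoticons, ASCII digits, spaces, and Latin text.
NON_LAO = re.compile(r"[^\u0E80-\u0EFF]+")
# Lao digits (U+0ED0-U+0ED9) sit inside the Lao block, so drop them separately.
LAO_DIGITS = re.compile(r"[\u0ED0-\u0ED9]+")


def strip_noise(text):
    """Remove emoticons, numbers, spaces, and other non-Lao characters."""
    return LAO_DIGITS.sub("", NON_LAO.sub("", text))


def remove_stop_words(tokens, stop_words):
    """Clean each token, then drop empty tokens and stop-word table entries."""
    cleaned = (strip_noise(token) for token in tokens)
    return [token for token in cleaned if token and token not in stop_words]


def save_document(root, category, name, tokens):
    """Write one cleaned document into the folder named after its category."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    folder = Path(root) / category
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / name
    # Tokens are joined with spaces so later steps can split them again.
    path.write_text(" ".join(tokens), encoding="utf-8")
    return path
```

In this sketch the stop-word table is a plain Python set, so a membership check is a constant-time lookup. A real corpus would load the table from a file and obtain `tokens` from whatever segmentation step the pipeline uses.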