Webpage text classification system based on maximum interval criterion
A web page text classification technology based on the maximum interval criterion, applicable to text database clustering/classification, unstructured text data retrieval, network data retrieval, and related fields, with the aim of improving performance, applicability, and accuracy.
Embodiment Construction
[0035] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments, so that those skilled in the art can better understand the present invention and implement it, but the examples given are not intended to limit the present invention.
[0036] As shown in Figure 1, the web page text classification system based on the maximum interval criterion in the preferred embodiment of the present invention includes the following modules:
[0037] The text preprocessing module is used for preprocessing the original text data and extracting the text data;
[0038] The preprocessing comprises:
[0039] Text segmentation: The text is split into words using a word segmentation algorithm chosen according to its language.
[0040] Text cleaning: Based on the domain and task of the text corpus, remove characters, numbers, and text that may interfere with text analysis; and, using the standard stop word list, remove s...
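The segmentation and cleaning steps described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `segment`, `clean`, and `preprocess`, the tiny `STOP_WORDS` set, and the per-character fallback for Chinese text are all assumptions made for illustration. The source refers only to "different word segmentation algorithms" and a "standard stop word list" without naming either.

```python
import re

# Hypothetical stop word list for illustration only; the source mentions a
# "standard stop word list" but does not say which one.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "to", "in"}


def segment(text, language="en"):
    """Split text into tokens using a language-dependent rule."""
    if language == "zh":
        # Assumption: fall back to one token per CJK character. A real system
        # would plug in a dedicated Chinese word segmenter here.
        return re.findall(r"[\u4e00-\u9fff]|[A-Za-z]+|\d+", text)
    # Default: lowercase, then keep runs of letters or digits as tokens,
    # which also discards punctuation.
    return re.findall(r"[A-Za-z]+|\d+", text.lower())


def clean(tokens, stop_words=STOP_WORDS):
    """Drop purely numeric tokens and stop words."""
    return [t for t in tokens if not t.isdigit() and t not in stop_words]


def preprocess(text, language="en"):
    """Segment the raw text, then clean the resulting tokens."""
    return clean(segment(text, language))


if __name__ == "__main__":
    print(preprocess("The price of 3 apples is 10 dollars!"))
```

Choosing the segmentation rule by a `language` argument mirrors the idea of selecting a word segmentation algorithm per language; which characters and numbers actually count as interference would depend on the corpus domain and task.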


